解析MYSQL BINLOG 二进制格式(3)--QUERY_EVENT
原创:转载请说明出处谢谢!
上接
http://blog.itpub.net/7728585/viewspace-2133188/ 解析MYSQL BINLOG 二进制格式(1)--准备工作
http://blog.itpub.net/7728585/viewspace-2133189/ 解析MYSQL BINLOG 二进制格式(2)--FORMAT_DESCRIPTION_EVENT
class:Query_log_event
event:QUERY_EVENT
event_code:02
参考源:
1、log_event.h 中关于class Query_log_event的解释
2、log_event.cc
3、internals-en.epub
文档和源码解释都为如果发生改变数据库的语句都会在这部分显示
Query_log_event is created for each query that modifies the
database, unless the query is logged row-based.
但是需要改考虑这里的修改,如果按照DDL和DML分,
DDL:我们知道BINLOG只是记录了语句就在这部分显示
DML:DML会记录这部分因为这里考虑为ROW-FORMAT格式的没有语句他更改记录在
Table_map_log_event/TABLE_MAP_EVENT typecode=19
Write_rows_log_event/WRITE_ROW_EVENT typecode=30
Update_rows_log_event/UPDATE_ROW_EVENT typecode=31
Delele_rows_log_event/DELETE_ROW_EVENT typecode=32
事件中了。但是他会记录一个BEGIN如:
#170209 7:12:19 server id 93157 end_log_pos 585 CRC32 0x3e6c10f8 Query thread_id=2 exec_time=1 error_code=0
SET TIMESTAMP=1486595539/*!*/;
BEGIN
/*!*/;
及这里的
00000240 42 45 47 49 4e f8 10 6c 3e d3 a5 9b 58 13 e5 6b |BEGIN..l>...X..k|
1、fixed data part
4 bytes:一个用于指定线程id的值源码中叫做slave_proxy_id,主要用于
创建临时表的避免重复
4 bytes:语句的执行时间,单位秒
1 bytes:执行语句的默认数据库名字长度,源码解释为
The length of the name of the currently selected database
2 bytes:执行语句的错误码,如果是0就表示没有错误,主要用于一些无事物引擎比如MyISAM引擎一个语句
如insert select 执行了一部分的情况,当slave进行复制的时候检查这个错误码如果不相同则
停止复制,所有的错误码写到了mysqld_error.h 中可以自行参考
源码解释为:
Error code generated by the master. If the master fails, the
slave will fail with the same error code, except for the error
codes ER_DB_CREATE_EXISTS == 1007 and ER_DB_DROP_EXISTS == 1008.
2 bytes:variable data part中status_vars block的长度,详见variable data part
源码解释为:
The length of the status_vars block of the Body, in bytes
2、variable data part
var-size:status_vars block 可能存在多个叫做status variable的键值对中,每个
status variable包含一个1 bytes的variable值余下的具体值,关于各种的
具体值附在最后为源码注释截取,当然internals-en.epub也有一部分下面是
internals-en.epub中的部分:
Q_FLAGS_CODE=0:4 bytes,只在MYSQL 5.0写入,不考虑
Q_SQL_MODE_CODE=1:8 bytes,他是一个每一位代表SQL_MODE中的一个值,参考
最后源码的解释
Q_CATALOG_CODE=2:只在MYSQL 5.0.0到5.0.3使用不考虑
Q_AUTO_INCREMENT=3:2 bytes非负整数用于表示参数auto_increment_increment
和auto_increment_offset,这个只会在auto_increment大于
1的时候出现
Q_CHARSET_CODE=4:6 bytes用于表示character_set_client,collation_connection
和collation_server参数(totally 2+2+2=6 bytes)参考最后源码
解释
Q_TIME_ZONE_CODE=5:用于描述time zone信息
Q_CATALOG_NZ_CODE=6:用于描述catalog name,长度占用一个字节,随后这个值为std
Q_LC_TIME_NAMES_CODE=7:2 bytes 非负整数,只有当lc_time_names不设置为en_US的时候使用
Q_CHARSET_DATABASE_CODE=8:2 bytes 非负整数为collation_database系统变量,5.7源码解释
说这部分新版本不一定使用。
当然这里还有很多,详细参考后面给出的源码解释和internals-en.epub相关部分
var-size:数据库名字,以0X00结尾,既然0X00结尾那么其长度为数据库名字实际长度+1
源码描述为:
db_len+1 The currently selected database, as a null-terminated string.
var-size:这就是真正的语句,文档中说这个域的长度不固定,当然固定因为他是具体的语句
但是他的长度可以通过下面的计算方式得出:
总长度(event_header中)-event_header长度(v4为19)-fixed data part(13)-
status_vars block的长度 - 默认数据库名字长度 来得到因为这些定值
实际解析:
我使用了语句 create table testctas1 as select * from testcast,当然我是5.7关闭了GTID
不然这个语句会报错的
这样会发现这个语句在binlog中转变为2部分
1、create table DDL部分
2、DML部分
DML部分我们先不考虑,下面是CREATE TABLE DDL部分的解析
binlog的Query_log_event 二进制部分
-
00000100 d3 a5 9b 58 02 e5 6b 01 00 bd 00 00 00 | ..X..k......|
-
00000110 c0 01 00 00 00 00 02 00 00 00 01 00 00 00 04 00 |................|
-
00000120 00 1a 00 00 00 00 00 00 01 00 00 20 40 00 00 00 |........... @...|
-
00000130 00 06 03 73 74 64 04 21 00 21 00 21 00 74 65 73 |...std.!.!.!.tes|
-
00000140 74 00 43 52 45 41 54 45 20 54 41 42 4c 45 20 60 |t.CREATE TABLE `|
-
00000150 74 65 73 74 63 74 61 73 31 60 20 28 0a 20 20 60 |testctas1` (. `|
-
00000160 69 64 31 60 20 69 6e 74 28 31 31 29 20 44 45 46 |id1` int(11) DEF|
-
00000170 41 55 4c 54 20 4e 55 4c 4c 2c 0a 20 20 60 69 64 |AULT NULL,. `id|
-
00000180 32 60 20 69 6e 74 28 31 31 29 20 44 45 46 41 55 |2` int(11) DEFAU|
-
00000190 4c 54 20 4e 55 4c 4c 2c 0a 20 20 60 6e 61 6d 65 |LT NULL,. `name|
-
000001a0 60 20 76 61 72 63 68 61 72 28 32 30 29 20 44 45 |` varchar(20) DE|
-
000001b0 46 41 55 4c 54 20 4e 55 4c 4c 0a 29 66 a6 84 c9 |FAULT NULL.)f...|
mysqlbinlog解析如下:
# at 259
#170209 7:12:19 server id 93157 end_log_pos 448 CRC32 0xc984a666 Query thread_id=2 exec_time=1 error_code=0
use `test`/*!*/;
SET TIMESTAMP=1486595539/*!*/;
SET @@session.pseudo_thread_id=2/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1075838976/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!/C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
CREATE TABLE `testctas1` (
`id1` int(11) DEFAULT NULL,
`id2` int(11) DEFAULT NULL,
`name` varchar(20) DEFAULT NULL
)
/*!*/;
--event header:
这部分不想过多解析,如果不懂看前面的文章,因为都是一样的简单解析
d3 a5 9b 58: timestamp,小端显示
02: event_type为02
e5 6b 01 00:service_id,小端显示0X016be5及十进制93157
mysql> show variables like '%server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id | 93157 |
+---------------+-------+
bd 00 00 00:event长度
c0 01 00 00:下一个event位置
00 00:flags
--fixed data part
02 00 00 00:slave_proxy_id小端显示0X00000002 这个和mysqlbinlog 解析的 thread_id=2以及SET @@session.pseudo_thread_id=2/*!*/;一致
01 00 00 00:执行时间小端显示0X00000001 这个和mysqlbinlog 解析的exec_time=1
04:默认数据库长度我的数据库名字为test当然长度也就是0X04也就是4
00 00:执行错误码,我这里没有错误全是0X00
1a 00:status_vars block及0X001a及26,如果仔细数一下后面的status_vars block的大小确实是26,其实我就是按照这个分割开的。
--variable data part
00 00 00 00 00 01 00 00 20 40 00 00 00 00 06 03
73 74 64 04 21 00 21 00 21 00 :
这部分是非常重要的status_vars block需要一个键值对一个解释
-00 00 00 00 00:为Q_FLAGS2_CODE,
文档上说它只在MYSQL 5.0中写入
键为00 值为00 00 00 00
-01 00 00 20 40 00 00 00 00:
为Q_SQL_MODE_CODE,
键为01 值为00 00 20 40 00 00 00 00 小端显示
为0X40200000则换算一下为1075838976
可以看到这个和MYSQLBINLOG解析的
SET @@session.sql_mode=1075838976/*!*/;
可以看到一致,关于这个也可以看看最后的
关于源码的Q_SQL_MODE_CODE解释
-06 03 73 74 64:为Q_CATALOG_NZ_CODE,
键值06 值为03 73 74 64,0X03为长度,0X73 0X74 0X64就是
std的ASCII
-04 21 00 21 00 21 00:为Q_CHARSET_CODE
键值04 值为21 00 21 00 21 00,这个都是0X0021 就是33
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
可以看到和MYSQLBINLOG解析一致
74 65 73 74 00:这部分是数据库的实际名字以0X00结尾0X74 0X65 0X73 0X74就是test的ASCII值
和MYSQLBINLOG解析的use `test`/*!*/; 一致
43 52 45 41 54 45 20 54 41 42 4c 45 20 60
74 65 73 74 63 74 61 73 31 60 20 28 0a 20 20 60
69 64 31 60 20 69 6e 74 28 31 31 29 20 44 45 46
41 55 4c 54 20 4e 55 4c 4c 2c 0a 20 20 60 69 64
32 60 20 69 6e 74 28 31 31 29 20 44 45 46 41 55
4c 54 20 4e 55 4c 4c 2c 0a 20 20 60 6e 61 6d 65
60 20 76 61 72 63 68 61 72 28 32 30 29 20 44 45
46 41 55 4c 54 20 4e 55 4c 4c 0a 29 :
这部分就是实际的语句的文本值里面全是ASCII二进制显示而已
-
00000140 74 00 43 52 45 41 54 45 20 54 41 42 4c 45 20 60 |t.CREATE TABLE `|
-
00000150 74 65 73 74 63 74 61 73 31 60 20 28 0a 20 20 60 |testctas1` (. `|
-
00000160 69 64 31 60 20 69 6e 74 28 31 31 29 20 44 45 46 |id1` int(11) DEF|
-
00000170 41 55 4c 54 20 4e 55 4c 4c 2c 0a 20 20 60 69 64 |AULT NULL,. `id|
-
00000180 32 60 20 69 6e 74 28 31 31 29 20 44 45 46 41 55 |2` int(11) DEFAU|
-
00000190 4c 54 20 4e 55 4c 4c 2c 0a 20 20 60 6e 61 6d 65 |LT NULL,. `name|
-
000001a0 60 20 76 61 72 63 68 61 72 28 32 30 29 20 44 45 |` varchar(20) DE|
-
000001b0 46 41 55 4c 54 20 4e 55 4c 4c 0a 29 66 a6 84 c9 |FAULT NULL.)f...|
这里也能看到。
66 a6 84 c9: CRC32校验码
后记:
Query_log_event 是binlog中的关键的EVENT,DDL记录了语句并且记录了很多关于语句的参数环境信息可以看看
use `test`/*!*/;
SET TIMESTAMP=1486595539/*!*/;
SET @@session.pseudo_thread_id=2/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1075838976/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!/C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
mysqlbinlog解析出来的这些大部分是不是都在本文提到过?
关于ROW_format的DML的Query_log_event虽然没有具体语句但是它确实存在,记录语句为BEGIN,这个在以后的文章中会在解析。
-
@anchor Table_query_log_event_status_vars
-
<table>
-
<caption>Status variables for Query_log_event</caption>
-
-
<tr>
-
<th>Status variable</th>
-
<th>1 byte identifier</th>
-
<th>Format</th>
-
<th>Description</th>
-
</tr>
-
-
<tr>
-
<td>flags2</td>
-
<td>Q_FLAGS2_CODE == 0</td>
-
<td>4 byte bitfield</td>
-
<td>The flags in @c thd->options, binary AND-ed with @c
-
OPTIONS_WRITTEN_TO_BIN_LOG. The @c thd->options bitfield contains
-
options for "SELECT". @c OPTIONS_WRITTEN identifies those options
-
that need to be written to the binlog (not all do). Specifically,
-
@c OPTIONS_WRITTEN_TO_BIN_LOG equals (@c OPTION_AUTO_IS_NULL | @c
-
OPTION_NO_FOREIGN_KEY_CHECKS | @c OPTION_RELAXED_UNIQUE_CHECKS |
-
@c OPTION_NOT_AUTOCOMMIT), or 0x0c084000 in hex.
-
-
These flags correspond to the SQL variables SQL_AUTO_IS_NULL,
-
FOREIGN_KEY_CHECKS, UNIQUE_CHECKS, and AUTOCOMMIT, documented in
-
the "SET Syntax" section of the MySQL Manual.
-
-
This field is always written to the binlog in version >= 5.0, and
-
never written in version < 5.0.
-
</td>
-
</tr>
-
-
<tr>
-
<td>sql_mode</td>
-
<td>Q_SQL_MODE_CODE == 1</td>
-
<td>8 byte bitfield</td>
-
<td>The @c sql_mode variable. See the section "SQL Modes" in the
-
MySQL manual, and see sql_priv.h for a list of the possible
-
flags. Currently (2007-10-04), the following flags are available:
-
<pre>
-
MODE_REAL_AS_FLOAT==0x1
-
MODE_PIPES_AS_CONCAT==0x2
-
MODE_ANSI_QUOTES==0x4
-
MODE_IGNORE_SPACE==0x8
-
MODE_NOT_USED==0x10
-
MODE_ONLY_FULL_GROUP_BY==0x20
-
MODE_NO_UNSIGNED_SUBTRACTION==0x40
-
MODE_NO_DIR_IN_CREATE==0x80
-
MODE_POSTGRESQL==0x100
-
MODE_ORACLE==0x200
-
MODE_MSSQL==0x400
-
MODE_DB2==0x800
-
MODE_MAXDB==0x1000
-
MODE_NO_KEY_OPTIONS==0x2000
-
MODE_NO_TABLE_OPTIONS==0x4000
-
MODE_NO_FIELD_OPTIONS==0x8000
-
MODE_MYSQL323==0x10000
-
MODE_MYSQL323==0x20000
-
MODE_MYSQL40==0x40000
-
MODE_ANSI==0x80000
-
MODE_NO_AUTO_VALUE_ON_ZERO==0x100000
-
MODE_NO_BACKSLASH_ESCAPES==0x200000
-
MODE_STRICT_TRANS_TABLES==0x400000
-
MODE_STRICT_ALL_TABLES==0x800000
-
MODE_NO_ZERO_IN_DATE==0x1000000
-
MODE_NO_ZERO_DATE==0x2000000
-
MODE_INVALID_DATES==0x4000000
-
MODE_ERROR_FOR_DIVISION_BY_ZERO==0x8000000
-
MODE_TRADITIONAL==0x10000000
-
MODE_NO_AUTO_CREATE_USER==0x20000000
-
MODE_HIGH_NOT_PRECEDENCE==0x40000000
-
MODE_PAD_CHAR_TO_FULL_LENGTH==0x80000000
-
</pre>
-
All these flags are replicated from the server. However, all
-
flags except @c MODE_NO_DIR_IN_CREATE are honored by the slave;
-
the slave always preserves its old value of @c
-
MODE_NO_DIR_IN_CREATE. For a rationale, see comment in
-
@c Query_log_event::do_apply_event in @c log_event.cc.
-
-
This field is always written to the binlog.
-
</td>
-
</tr>
-
-
<tr>
-
<td>catalog</td>
-
<td>Q_CATALOG_NZ_CODE == 6</td>
-
<td>Variable-length string: the length in bytes (1 byte) followed
-
by the characters (at most 255 bytes)
-
</td>
-
<td>Stores the client's current catalog. Every database belongs
-
to a catalog, the same way that every table belongs to a
-
database. Currently, there is only one catalog, "std".
-
-
This field is written if the length of the catalog is > 0;
-
otherwise it is not written.
-
-
-
-
-
auto_increment
-
Q_AUTO_INCREMENT == 3
-
two 2 byte unsigned integers, totally 2+2=4 bytes
-
-
The two variables auto_increment_increment and
-
auto_increment_offset, in that order. For more information, see
-
"System variables" in the MySQL manual.
-
-
This field is written if auto_increment > 1. Otherwise, it is not
-
written.
-
-
-
-
-
charset
-
Q_CHARSET_CODE == 4
-
three 2 byte unsigned integers, totally 2+2+2=6 bytes
-
The three variables character_set_client,
-
collation_connection, and collation_server, in that order.
-
character_set_client is a code identifying the character set and
-
collation used by the client to encode the query.
-
collation_connection identifies the character set and collation
-
that the master converts the query to when it receives it; this is
-
useful when comparing literal strings. collation_server is the
-
default character set and collation used when a new database is
-
created.
-
-
See also "Connection Character Sets and Collations" in the MySQL
-
5.1 manual.
-
-
All three variables are codes identifying a (character set,
-
collation) pair. To see which codes map to which pairs, run the
-
query "SELECT id, character_set_name, collation_name FROM
-
COLLATIONS".
-
-
Cf. Q_CHARSET_DATABASE_CODE below.
-
-
This field is always written.
-
-
-
-
-
time_zone
-
Q_TIME_ZONE_CODE == 5
-
Variable-length string: the length in bytes (1 byte) followed
-
by the characters (at most 255 bytes).
-
The time_zone of the master.
-
-
See also "System Variables" and "MySQL Server Time Zone Support"
-
in the MySQL manual.
-
-
This field is written if the length of the time zone string is >
-
0; otherwise, it is not written.
-
-
-
-
-
lc_time_names_number
-
Q_LC_TIME_NAMES_CODE == 7
-
2 byte integer
-
A code identifying a table of month and day names. The
-
mapping from codes to languages is defined in @c sql_locale.cc.
-
-
This field is written if it is not 0, i.e., if the locale is not
-
en_US.
-
-
-
-
-
charset_database_number
-
Q_CHARSET_DATABASE_CODE == 8
-
2 byte integer
-
-
The value of the collation_database system variable (in the
-
source code stored in @c thd->variables.collation_database), which
-
holds the code for a (character set, collation) pair as described
-
above (see Q_CHARSET_CODE).
-
-
collation_database was used in old versions (???WHEN). Its value
-
was loaded when issuing a "use db" query and could be changed by
-
issuing a "SET collation_database=xxx" query. It used to affect
-
the "LOAD DATA INFILE" and "CREATE TABLE" commands.
-
-
In newer versions, "CREATE TABLE" has been changed to take the
-
character set from the database of the created table, rather than
-
the character set of the current database. This makes a
-
difference when creating a table in another database than the
-
current one. "LOAD DATA INFILE" has not yet changed to do this,
-
but there are plans to eventually do it, and to make
-
collation_database read-only.
-
-
This field is written if it is not 0.
-
-
-
-
table_map_for_update
-
Q_TABLE_MAP_FOR_UPDATE_CODE == 9
-
8 byte integer
-
-
The value of the table map that is to be updated by the
-
multi-table update query statement. Every bit of this variable
-
represents a table, and is set to 1 if the corresponding table is
-
to be updated by this statement.
-
-
The value of this variable is set when executing a multi-table update
-
statement and used by slave to apply filter rules without opening
-
all the tables on slave. This is required because some tables may
-
not exist on slave because of the filter rules.
-
-
-
-
-
@subsection Query_log_event_notes_on_previous_versions Notes on Previous Versions
-
-
* Status vars were introduced in version 5.0. To read earlier
-
versions correctly, check the length of the Post-Header.
-
-
* The status variable Q_CATALOG_CODE == 2 existed in MySQL 5.0.x,
-
where 0<=x<=3. It was identical to Q_CATALOG_CODE, except that the
-
string had a trailing '/0'. The '/0' was removed in 5.0.4 since it
-
was redundant (the string length is stored before the string). The
-
Q_CATALOG_CODE will never be written by a new master, but can still
-
be understood by a new slave.
-
-
* See Q_CHARSET_DATABASE_CODE in the table above.
-
-
* When adding new status vars, please don't forget to update the
-
MAX_SIZE_LOG_EVENT_STATUS, and update function code_name
-
-
*/
-
class Query_log_event: public Log_event
-
{
-
LEX_STRING user;
-
LEX_STRING host;
-
protected:
-
Log_event::Byte* data_buf;
-
public:
-
const char* query;
-
const char* catalog;
-
const char* db;
-
/*
-
If we already know the length of the query string
-
we pass it with q_len, so we would not have to call strlen()
-
otherwise, set it to 0, in which case, we compute it with strlen()
-
*/
-
uint32 q_len;
-
uint32 db_len;
-
uint16 error_code;
-
ulong thread_id;
-
/*
-
For events created by Query_log_event::do_apply_event (and
-
Load_log_event::do_apply_event()) we need the *original* thread
-
id, to be able to log the event with the original (=master's)
-
thread id (fix for BUG#1686).
-
*/
-
ulong slave_proxy_id;
-
-
/*
-
Binlog format 3 and 4 start to differ (as far as class members are
-
concerned) from here.
-
*/
-
-
uint catalog_len; // <= 255 char; 0 means uninited
-
-
/*
-
We want to be able to store a variable number of N-bit status vars:
-
(generally N=32; but N=64 for SQL_MODE) a user may want to log the number
-
of affected rows (for debugging) while another does not want to lose 4
-
bytes in this.
-
The storage on disk is the following:
-
status_vars_len is part of the post-header,
-
status_vars are in the variable-length part, after the post-header, before
-
the db & query.
-
status_vars on disk is a sequence of pairs (code, value) where 'code' means
-
'sql_mode', 'affected' etc. Sometimes 'value' must be a short string, so
-
its first byte is its length. For now the order of status vars is:
-
flags2 - sql_mode - catalog - autoinc - charset
-
We should add the same thing to Load_log_event, but in fact
-
LOAD DATA INFILE is going to be logged with a new type of event (logging of
-
the plain text query), so Load_log_event would be frozen, so no need. The
-
new way of logging LOAD DATA INFILE would use a derived class of
-
Query_log_event, so automatically benefit from the work already done for
-
status variables in Query_log_event.
-
*/
-
uint16 status_vars_len;
-
-
/*
-
'flags2' is a second set of flags (on top of those in Log_event), for
-
session variables. These are thd->options which is & against a mask
-
(OPTIONS_WRITTEN_TO_BIN_LOG).
-
flags2_inited helps make a difference between flags2==0 (3.23 or 4.x
-
master, we don't know flags2, so use the slave server