Introduction
The fastest way to analyze MySQL performance is through the slow query log. It records SQL statements that run longer than long_query_time, and it can also record SQL that uses no index (log_queries_not_using_indexes). SQL without an index is usually slow, but not always, which is why that parameter is OFF by default.
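As a hedged illustration (variable names as in the MySQL 8.0 manual; log_slow_extra, used later in this post, requires MySQL 8.0.14 or later), these switches can be turned on at runtime:

```sql
-- Record statements slower than long_query_time (seconds) in the slow log
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;
-- Optionally also log SQL that uses no index (OFF by default)
SET GLOBAL log_queries_not_using_indexes = ON;
-- Record the extra per-statement fields described below
SET GLOBAL log_slow_extra = ON;
```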
Tools such as pt-query-digest or mysqldumpslow are usually used to analyze the slow log, but they have limitations. For example, if I want to see only the SQL that returned more than 200 rows, neither tool has an option for that. It can be done with awk, but that is cumbersome and not particularly efficient.
Viewing the slow log
The slow log records quite a lot of information: execution time, the executing account, lock time, rows returned, and so on.
Field | Description |
---|---|
Query_time | The statement execution time in seconds. |
Lock_time | The time to acquire locks in seconds. |
Rows_sent | The number of rows sent to the client. |
Rows_examined | The number of rows examined by the server layer (not counting any processing internal to storage engines). |
If the log_slow_extra parameter is enabled, even more information is recorded:
Field | Description |
---|---|
Thread_id | The statement thread identifier. |
Errno | The statement error number, or 0 if no error occurred. |
Killed | If the statement was terminated, the error number indicating why, or 0 if the statement terminated normally. |
Bytes_received | The Bytes_received value for the statement. |
Bytes_sent | The Bytes_sent value for the statement. |
Read_first | The Handler_read_first value for the statement. |
Read_last | The Handler_read_last value for the statement. |
Read_key | The Handler_read_key value for the statement. |
Read_next | The Handler_read_next value for the statement. |
Read_prev | The Handler_read_prev value for the statement. |
Read_rnd | The Handler_read_rnd value for the statement. |
Read_rnd_next | The Handler_read_rnd_next value for the statement. |
Sort_merge_passes | The Sort_merge_passes value for the statement. |
Sort_range_count | The Sort_range value for the statement. |
Sort_rows | The Sort_rows value for the statement. |
Sort_scan_count | The Sort_scan value for the statement. |
Created_tmp_disk_tables | The Created_tmp_disk_tables value for the statement. |
Created_tmp_tables | The Created_tmp_tables value for the statement. |
Start | The statement execution start time. |
End | The statement execution end time. |
For example, consider the following slow-log entry:
```
# Time: 2024-05-28T06:17:48.704559Z
# User@Host: root[root] @ localhost [127.0.0.1]  Id:  3224
# Query_time: 0.000057 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1 Thread_id: 3224 Errno: 0 Killed: 0 Bytes_received: 0 Bytes_sent: 67 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-05-28T06:17:48.704502Z End: 2024-05-28T06:17:48.704559Z
SET timestamp=1716877068;
select USER();
```
# Time: 2024-05-28T06:17:48.704559Z is when this entry was written, which can be treated as the time the SQL finished. The trailing Z marks it as UTC with no time zone applied, so we have to add 8 hours ourselves (for UTC+8). If the time zone is included, the line looks like # Time: 2024-05-28T11:04:32.649161+08:00 instead.
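The conversion can be sketched in a few lines of Python (parse_slowlog_time is a hypothetical helper name, mirroring what the script in the appendix does):

```python
from datetime import datetime, timedelta

def parse_slowlog_time(ts: str) -> datetime:
    """Parse a slow-log '# Time:' value; a trailing 'Z' means UTC, so add
    8 hours to get UTC+8 local time."""
    if ts.endswith('Z'):
        return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ") + timedelta(hours=8)
    # Already local time with an explicit offset, e.g. ...+08:00
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z").replace(tzinfo=None)

print(parse_slowlog_time("2024-05-28T06:17:48.704559Z"))  # 2024-05-28 14:17:48.704559
```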
# User@Host: root[root] @ localhost [127.0.0.1] Id: 3224 means the SQL was executed by the root@127.0.0.1 account, and the connection id is 3224 (the id shown by show processlist).
# Query_time: 0.000057 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1 Thread_id: 3224 Errno: 0 Killed: 0 Bytes_received: 0 Bytes_sent: 67 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-05-28T06:17:48.704502Z End: 2024-05-28T06:17:48.704559Z
I have log_slow_extra enabled, so Thread_id and the fields after it are recorded. Note that End here matches the # Time value above: it is the time the SQL finished (not necessarily normally; a statement interrupted with Ctrl-C is logged as well).
SET timestamp=1716877068; and select USER(); are the statements that were executed, although some of them may be issued automatically by client tools.
There are also some special entries; for example, when a connection disconnects, the log records # administrator command: Quit; (the protocol command COM_QUIT, 0x01, which indicates the client closed the connection).
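Parsing the # Query_time: header line is just key/value extraction; here is a minimal sketch of the regex approach the appended script relies on:

```python
import re
from decimal import Decimal

# Matches "Key: numeric-value" pairs such as "Query_time: 0.000057" or "Rows_sent: 1"
PATTERN_KV = re.compile(r"(\w+):\s+([0-9.]+)")

line = ("# Query_time: 0.000057 Lock_time: 0.000000 "
        "Rows_sent: 1 Rows_examined: 1")
kv = {k: Decimal(v) for k, v in PATTERN_KV.findall(line)}
print(kv["Query_time"], kv["Rows_sent"])  # 0.000057 1
```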
Analyzing the slow log with a custom script
Now that we know the format, a single slow-log entry is simply everything between two # Time: markers. We collect each entry and then analyze it according to our needs. The requirement this time is simple: read the file, match strings, and print the execution count plus the average, maximum and minimum of each metric. I'll implement it in Python 3; it's straightforward, so let's go right to running it.
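The grouping step described above can be sketched as a small generator (iter_slowlog_entries is a hypothetical name, not part of the appended script, which does the same thing inline):

```python
def iter_slowlog_entries(lines):
    """Yield each slow-log entry as the list of lines between '# Time:' markers."""
    entry = []
    for line in lines:
        # A new '# Time:' marker closes the previous entry, if any
        if line.startswith("# Time: ") and entry:
            yield entry
            entry = []
        entry.append(line)
    if entry:  # flush the final entry at EOF
        yield entry
```

Usage would be `for entry in iter_slowlog_entries(open(path)): ...`, aggregating statistics per entry.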
```shell
python3 slowlog_analyze.py /data/mysql_3314/mysqllog/dblogs/slow3314.log
```
Of course we could add filter rules, for example only showing SQL executed more than 1000 times, or with Rows_examined > 200. I won't demonstrate that here.
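If you do want such rules, a possible filter predicate might look like this (keep and its thresholds are hypothetical, not part of the script below):

```python
def keep(stats, min_count=1000, min_rows_examined=200):
    """Hypothetical filter: keep a statement only if it ran often enough,
    or examined many rows at least once."""
    return (len(stats["Query_time"]) > min_count
            or max(stats["Rows_examined"]) > min_rows_examined)

example = {"Query_time": [0.01] * 5, "Rows_examined": [250, 10]}
print(keep(example))  # True: Rows_examined exceeded 200 once
```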
The SQL here is not normalized or otherwise post-processed, so the script is very fast:
Tool | Time (seconds) |
---|---|
slowlog_analyze.py | 1.3 |
pt-query-digest | 4.3 |
mysqldumpslow | 31.8 |
As you can see, it is well ahead of the other tools. With filter rules and SQL normalization added it would slow down a little, but not noticeably; what matters most is that it meets our requirements.
Appendix: the script
```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# write by ddcw @https://github.com/ddcw
# Analyze the MySQL slow log.
import sys
import os
import datetime
import re
from decimal import Decimal

if len(sys.argv) <= 1:
    sys.stdout.write("USAGE: python3 " + str(sys.argv[0]) + " /PATH/slow3314.log\n")
    sys.exit(1)
filename = sys.argv[1]

SLOWLOG_DICT = {}  # no need to hash the SQL; use the SQL text itself as the key
ADDHOURS = datetime.timedelta(hours=8)
PATTERN_UH = re.compile(r"\[([^\[\]]+)\]")      # user/host parts inside [...]
PATTERN_KV = re.compile(r"(\w+):\s+([0-9.]+)")  # "Key: value" pairs

with open(filename, 'r') as f:
    log_time = None       # entry timestamp; roughly the statement end time
    user_host = None      # account that executed the slow SQL
    Query_time = None
    Lock_time = None
    Rows_sent = None
    Rows_examined = None  # rows examined by the server layer; can exceed the
                          # table row count. For "select * from db1.t1 limit 10"
                          # it is at most 10
    extra_info = None     # Thread_id Errno Bytes_received ...
    sql = ""
    NOT_BREAK_FLAG = True
    while NOT_BREAK_FLAG:
        data = f.readline()
        if data == '':
            NOT_BREAK_FLAG = False
        # "# Time:" starts a new entry (EOF flushes the last one)
        if data[:8] == "# Time: " or not NOT_BREAK_FLAG:
            if log_time is not None:  # record the previous entry
                if sql in SLOWLOG_DICT and SLOWLOG_DICT[sql]['user_host'] == user_host:
                    SLOWLOG_DICT[sql]['log_time'].append(log_time)
                    SLOWLOG_DICT[sql]['Query_time'].append(Query_time)
                    SLOWLOG_DICT[sql]['Lock_time'].append(Lock_time)
                    SLOWLOG_DICT[sql]['Rows_sent'].append(Rows_sent)
                    SLOWLOG_DICT[sql]['Rows_examined'].append(Rows_examined)
                    SLOWLOG_DICT[sql]['extra_info'].append(extra_info)
                else:
                    SLOWLOG_DICT[sql] = {}
                    SLOWLOG_DICT[sql]['user_host'] = user_host
                    SLOWLOG_DICT[sql]['log_time'] = [log_time]
                    SLOWLOG_DICT[sql]['Query_time'] = [Query_time]
                    SLOWLOG_DICT[sql]['Lock_time'] = [Lock_time]
                    SLOWLOG_DICT[sql]['Rows_sent'] = [Rows_sent]
                    SLOWLOG_DICT[sql]['Rows_examined'] = [Rows_examined]
                    SLOWLOG_DICT[sql]['extra_info'] = [extra_info]
            if data != "":
                log_time = data.split()[-1]
                if log_time[-1] == 'Z':  # UTC: add the time-zone offset ourselves
                    log_time = datetime.datetime.strptime(log_time, "%Y-%m-%dT%H:%M:%S.%fZ") + ADDHOURS
                else:
                    log_time = datetime.datetime.strptime(log_time, "%Y-%m-%dT%H:%M:%S.%f+08:00")
                # log_time = log_time.timestamp()  # convert to float if needed
                sql = ""
        # "# User@Host:" line
        elif data[:13] == "# User@Host: ":
            user_host = "@".join(PATTERN_UH.findall(data))
        # "# Query_time:" line
        elif data[:14] == "# Query_time: ":
            kv = {key: Decimal(value) for key, value in PATTERN_KV.findall(data)}
            Query_time = kv['Query_time']
            Lock_time = kv['Lock_time']
            Rows_sent = kv['Rows_sent']
            Rows_examined = kv['Rows_examined']
            # extra_info TODO
        elif data[:14] == "SET timestamp=":
            continue
        # "Time ..." header written when mysqld starts
        elif data[:4] == "Time":
            log_time = None
        else:  # anything left is the SQL itself
            sql += data

# print the aggregated slow-log information
for sql in SLOWLOG_DICT:
    # filter rules (placeholder: never triggers; adjust the condition as needed)
    if len(SLOWLOG_DICT[sql]['Query_time']) < 0:
        continue
    sys.stdout.write("\n")
    sys.stdout.write(f"{''.ljust(20)}{'TOTAL'.ljust(20)}{'MIN'.ljust(20)}{'MAX'.ljust(20)}{'AVG'.ljust(20)}\n")
    for k in ['Query_time', 'Lock_time', 'Rows_sent', 'Rows_examined']:
        _total = sum(SLOWLOG_DICT[sql][k])
        _min = min(SLOWLOG_DICT[sql][k])
        _max = max(SLOWLOG_DICT[sql][k])
        _avg = round(_total / len(SLOWLOG_DICT[sql][k]), 6)
        sys.stdout.write(f"{k.ljust(20)}{str(_total).ljust(20)}{str(_min).ljust(20)}{str(_max).ljust(20)}{str(_avg).ljust(20)}\n")
    sys.stdout.write(f"EXECUTE_COUNT: {len(SLOWLOG_DICT[sql]['Query_time'])}\n{sql}\n")
```