Tips for Handling Large Database Log Files

How do you analyze a large database log file?

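During database maintenance we often rely on database logs to troubleshoot problems, and sometimes the log file is very large: a historical MySQL slowlog may have reached the TB range, or a MongoDB log may have grown to several hundred GB. In such cases we usually handle the log with one of the methods below.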

Methods for Handling Large Logs

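When a log file is very large, opening it with vim is not a good idea: it takes a long time to open, and it may even exhaust the server's memory.

These are the usual approaches: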

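1. Use the head or tail command to view the beginning or end of the log.

The head command returns the first several lines of the log;

the tail command returns the last several lines of the log.

Both commands are simple to use: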

head -n number xxx > a.txt

or

tail -n number xxx > a.txt

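Here number is the number of lines to keep and xxx is the log file. With the output redirected to a small file like this, you can analyze just the first or last few lines of the log.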
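As a quick illustration of the two commands above, here is a minimal sketch that builds a small stand-in log and extracts its first and last lines. The file names (sample.log, first.txt, last.txt) and the temporary directory are assumptions made for the demo, not part of the original setup.

```shell
# Build a small stand-in for a huge log (hypothetical file names).
workdir=$(mktemp -d)
log="$workdir/sample.log"
seq 1 1000 | sed 's/^/line /' > "$log"

# Save the first 3 and the last 3 lines into their own small files.
head -n 3 "$log" > "$workdir/first.txt"
tail -n 3 "$log" > "$workdir/last.txt"

# The small files can now be opened safely in any editor.
cat "$workdir/first.txt"
cat "$workdir/last.txt"
```

On a real log, only the requested lines end up in the output file, so the result stays small no matter how big the source is.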

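Everyone knows this method, but it has a drawback: it only shows the beginning or the end of the file. If you need to read the whole log or a section in the middle, this method is hard to use.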

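2. Use the split command to split the file

This is a fairly obscure basic command, and it is easy to forget that it exists.

First, run split --help to see its description: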

[root@yeyz ~]# split --help
Usage: split [OPTION]... [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is 'x'.  With no INPUT, or when INPUT
is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   generate suffixes of length N (default 2)
      --additional-suffix=SUFFIX  append an additional SUFFIX to file names
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
  -d, --numeric-suffixes[=FROM]  use numeric suffixes instead of alphabetic;
                                   FROM changes the start value (default 0)
  -e, --elide-empty-files  do not generate empty output files with '-n'
      --filter=COMMAND    write to shell COMMAND; file name is $FILE
  -l, --lines=NUMBER      put NUMBER lines per output file
  -n, --number=CHUNKS     generate CHUNKS output files; see explanation below
  -u, --unbuffered        immediately copy input to output with '-n r/...'
      --verbose           print a diagnostic just before each
                            output file is opened
      --help     display this help and exit
      --version  output version information and exit

SIZE is an integer and optional unit (example: 10M is 10*1024*1024).  Units
are K, M, G, T, P, E, Z, Y (powers of 1024) or KB, MB, ... (powers of 1000).

CHUNKS may be:
N       split into N files based on size of input
K/N     output Kth of N to stdout
l/N     split into N files without splitting lines
l/K/N   output Kth of N to stdout without splitting lines
r/N     like 'l' but use round robin distribution
r/K/N   likewise but only output Kth of N to stdout

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
For complete documentation, run: info coreutils 'split invocation'

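Compared with other Linux commands, the --help output of this command is not very long. The most commonly used option is -b.

-b sets the size of each output file. The size can be written in an easy-to-read form such as 1K, 1M, or 1G (powers of 1024, as the help text above notes).

Below is a small test of the -b option, splitting b.txt into 1M (1048576-byte) pieces. A 32277286-byte file yields 31 pieces, so the listing appears to be cut off after xap: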

[root@yeyz test]# ll
total 31524
-rw-r--r-- 1 root root 32277286 May 12 23:52 b.txt
[root@yeyz test]# split -b 1M b.txt
[root@yeyz test]# ll
total 63048
-rw-r--r-- 1 root root 32277286 May 12 23:52 b.txt
-rw-r--r-- 1 root root  1048576 May 12 23:52 xaa
-rw-r--r-- 1 root root  1048576 May 12 23:52 xab
-rw-r--r-- 1 root root  1048576 May 12 23:52 xac
-rw-r--r-- 1 root root  1048576 May 12 23:52 xad
-rw-r--r-- 1 root root  1048576 May 12 23:52 xae
-rw-r--r-- 1 root root  1048576 May 12 23:52 xaf
-rw-r--r-- 1 root root  1048576 May 12 23:52 xag
-rw-r--r-- 1 root root  1048576 May 12 23:52 xah
-rw-r--r-- 1 root root  1048576 May 12 23:52 xai
-rw-r--r-- 1 root root  1048576 May 12 23:52 xaj
-rw-r--r-- 1 root root  1048576 May 12 23:52 xak
-rw-r--r-- 1 root root  1048576 May 12 23:52 xal
-rw-r--r-- 1 root root  1048576 May 12 23:52 xam
-rw-r--r-- 1 root root  1048576 May 12 23:52 xan
-rw-r--r-- 1 root root  1048576 May 12 23:52 xao
-rw-r--r-- 1 root root  1048576 May 12 23:52 xap
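One caveat with -b is that it counts bytes only, so a piece can end in the middle of a log line. The -C option from the help output above keeps lines whole, as long as no single line is longer than the given size. The sketch below assumes GNU split; the file name sample.log, the prefix part_, and the 4K size are choices made for the demo.

```shell
# Build a small stand-in log (hypothetical file names and sizes).
workdir=$(mktemp -d)
log="$workdir/sample.log"
seq 1 2000 | sed 's/^/slow query entry /' > "$log"

# -C 4K: at most 4 KiB of whole lines per piece; -d: numeric suffixes.
split -C 4K -d "$log" "$workdir/part_"

ls -l "$workdir"

# Every piece should end with a newline, i.e. no line was cut in half.
for f in "$workdir"/part_*; do
    [ -z "$(tail -c 1 "$f")" ] || echo "$f ends mid-line"
done

# Concatenating the pieces in name order reproduces the original file.
cat "$workdir"/part_* | cmp - "$log" && echo "pieces match the original"
```

Passing an explicit prefix also keeps the pieces from landing in the current directory under the default names xaa, xab, and so on.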

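3. Use Linux log rotation

On a Linux server you can use the built-in log rotation facility (logrotate) to rotate database logs. Rotation rules are usually written under the following path: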

/etc/logrotate.d/

Here is the rotation configuration for a MongoDB log from a production environment:

/data1/mongo[0-9]*[0-9]/log/mongod.log {
    minsize 1k
    daily
    dateext
    rotate 10
    maxage 10
    start 0
    missingok
    create 0644 root root
    notifempty
    sharedscripts
    postrotate
        for port in $(ps axu | grep mongo[0-9]*.conf | awk -F '/' '{print $(NF-1)}' | sed 's/mongo//g')
        do
                /bin/rm -f /data1/mongo${port}/log/mongod.log.????-??-??T??-??-?? 2> /dev/null
                /bin/sleep 1
                /bin/kill -SIGUSR1 $(cat /data1/mongo${port}/mongod.pid  2> /dev/null) 2> /dev/null
                /bin/rm -f /data1/mongo${port}/log/mongod.log.????-??-??T??-??-?? 2> /dev/null
        done
    endscript
}

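As you can see, /data1/mongoxxx is the path of our MongoDB logs. It is followed by a set of rotation settings, and at the bottom there is an embedded shell script that makes the rotation take effect: it sends SIGUSR1 to each mongod process so that it rotates its own log, and removes the timestamped files that mongod leaves behind.

Of these settings, rotate 10 and maxage 10 matter most: rotate 10 keeps at most 10 rotated log files, and maxage 10 removes rotated logs older than 10 days. You can look up the other parameters yourself.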
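Before putting a rule like this into /etc/logrotate.d/, it can help to check it in logrotate's debug mode. The sketch below is only an illustration under assumptions: it uses a throwaway directory and a hypothetical demo.log instead of the real MongoDB paths, copies only a few of the directives above, and skips the dry run if logrotate is not installed.

```shell
# Hypothetical paths: a throwaway directory stands in for /etc/logrotate.d/.
workdir=$(mktemp -d)
conf="$workdir/demo-rotate.conf"
touch "$workdir/demo.log"

# A reduced rule that reuses a few directives from the configuration above.
cat > "$conf" <<EOF
$workdir/demo.log {
    daily
    dateext
    rotate 10
    maxage 10
    missingok
    notifempty
}
EOF

# -d is debug mode: logrotate reports what it would do and changes nothing.
# -s points the state file at the throwaway directory.
if command -v logrotate >/dev/null 2>&1; then
    logrotate -d -s "$workdir/status" "$conf"
else
    echo "logrotate not installed; skipping dry run"
fi
```

The debug output shows how logrotate would treat each matching log, which makes it easier to catch a mistyped directive before the rule goes live.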

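Summary

This article shared 3 ways to handle large log files:

1. The tail or head command

Its use is limited: it can only show the beginning or the end of the log.

2. The split command

This method applies more broadly: it splits a large file into pieces by size, and it is the one used most often in practice.

3. Configuring Linux log rotation in /etc/logrotate.d

This is more of a preventive measure: rotating logs ahead of time keeps them from growing too large. I recommend configuring it in production environments.

That's all for now~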
