Basic gdb Debugging Commands

2022-09-26 17:46:13

gdb debugging

A review and summary of gdb.

1. Preparing to debug

The program being debugged must be built with debug symbols, that is, with the -g option passed to gcc/g++.

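g++ hello.cpp -g -o hello
  • The -g option applies equally to Linux programs built through tools such as make or CMake.
  • For debug builds, it is best to turn off compiler optimization (for example with -O0); otherwise variables may be optimized out and stepping may not follow the source line by line.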

2. Ways to start a gdb session

Debugging a target program directly

gdb filename	# filename is the name of the program to debug

Attaching to a running process

gdb -p pid	# attach to the process with ID pid (inside gdb: attach pid)

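Debugging a core file: locating the cause of a crash

By default, Linux does not write a core file when a program crashes. Use ulimit -c to check the current limit (ulimit -a, used here, lists all limits):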

doper@arch-doper ~ $ ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         unlimited	# a size of 0 means core dumps are disabled

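Use ulimit -c xxx to change the limit, for example ulimit -c unlimited.

A change made this way only affects the current session and is lost once that session is closed. Server programs, however, usually run as daemons and outlive the session that started them, so a limit set by hand in one shell does not reliably apply to them, and a crash may leave no core file. To make the setting permanent, use one of the following: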

(1) Add the following to /etc/security/limits.conf:

#<domain>      <type>  <item>         <value>
*				soft	core		 unlimited

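(2) Add ulimit -c unlimited to /etc/profile, then run source /etc/profile. Editing /etc/profile requires root and affects every user's login shell; to apply the setting to a single user only, add the command to that user's ~/.bashrc or ~/.bash_profile instead.

A core file is typically named core.pid, so after a crash you can run gdb filename corename to find the cause, for example: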

gdb msg_server core.21985

Then use bt to inspect the call stack and analyze the crash.
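To practise this workflow, it helps to have a program that crashes on purpose. The sketch below is a minimal one; the Message type and the body_length function are hypothetical names introduced only for illustration. The function deliberately omits a null check, so it works for valid input and faults when handed a null pointer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical message type used only for this crash demo.
struct Message {
    const char* body;
};

// Deliberately missing a null check: passing nullptr (or a Message whose
// body is nullptr) dereferences an invalid pointer, which is undefined
// behaviour and, in an unoptimized build, normally ends in SIGSEGV.
std::size_t body_length(const Message* msg) {
    return std::strlen(msg->body);
}
```

Calling body_length(nullptr) from main in a build made with g++ -g -O0 would normally crash; with core dumps enabled, loading the resulting core file in gdb and running bt should show body_length as the innermost frame.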

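If several programs crash at the same time, however, it is hard to tell which core file belongs to which service. There are two ways to deal with this:

(1) Record the PID at startup

Have each program record its own PID when it starts.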
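One way to record the PID is to have each service write it to a file of its own. The sketch below assumes a hypothetical helper, write_pid_file, and a path chosen by the caller; neither comes from the original text.

```cpp
#include <fstream>
#include <string>
#include <unistd.h>   // getpid

// Hypothetical helper: writes the current process ID to `path`,
// replacing any previous content. Returns true on success.
bool write_pid_file(const std::string& path) {
    std::ofstream out(path, std::ios::trunc);
    if (!out) {
        return false;             // could not open the file for writing
    }
    out << getpid() << '\n';
    out.flush();                  // surface write errors before returning
    return static_cast<bool>(out);
}
```

Calling write_pid_file with a per-service path (for example /tmp/msg_server.pid, an arbitrary choice) early in main makes it possible to match a later core.21985 to the service that recorded PID 21985.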

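(2) Customize the core file name and directory.

/proc/sys/kernel/core_uses_pid controls whether the PID is appended to the core file name as an extension: the file contains 1 if it is, and 0 if it is not.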

doper@arch-doper ~ $ cat /proc/sys/kernel/core_uses_pid
1

/proc/sys/kernel/core_pattern sets a format string for the location and name of core files (writing to it requires root). For example:

echo "/corefile/core-%e-%p-%t" > /proc/sys/kernel/core_pattern

%e: executable name

%p: PID

%t: time the core file was generated (seconds since the Unix epoch)

….
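To see what file name such a pattern produces, the substitution can be imitated in a few lines. This is only an illustration of the three specifiers listed above, not the kernel's implementation; expand_core_pattern and its parameters are hypothetical, and any other specifier is left untouched.

```cpp
#include <cstddef>
#include <string>

// Illustrative only: mimics how %e, %p and %t in a core_pattern string are
// replaced. Specifiers this sketch does not know are copied through verbatim.
std::string expand_core_pattern(const std::string& pattern,
                                const std::string& exe_name,
                                long pid,
                                long epoch_seconds) {
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (pattern[i] == '%' && i + 1 < pattern.size()) {
            const char spec = pattern[++i];
            switch (spec) {
                case 'e': out += exe_name; break;
                case 'p': out += std::to_string(pid); break;
                case 't': out += std::to_string(epoch_seconds); break;
                case '%': out += '%'; break;
                default:  out += '%'; out += spec; break;
            }
        } else {
            out += pattern[i];
        }
    }
    return out;
}
```

With the pattern from the example, a program named msg_server with PID 21985 would get a core file under /corefile whose name carries the program name, the PID, and the dump time, which removes the ambiguity when several services crash together.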

3. Common gdb commands in detail

3.1 run

gdb filename only loads the program into the debugger; it does not start it. Use run (r) to start the program.

(gdb) r
Starting program: /home/doper/github/redis-6.0.3/src/redis-server 
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-101.el8.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
2312:C 12 Nov 2021 22:41:40.104 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
2312:C 12 Nov 2021 22:41:40.104 # Redis version=6.0.3, bits=64, commit=00000000, modified=0, pid=2312, just started
2312:C 12 Nov 2021 22:41:40.104 # Warning: no config file specified, using the default config. In order to specify a config file use /home/doper/github/redis-6.0.3/src/redis-server /path/to/redis.conf
2312:M 12 Nov 2021 22:41:40.109 * Increased maximum number of open files to 10032 (it was originally set to 1024).
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 6.0.3 (00000000/0) 64 bit
  .-`` .-```.  ```/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 2312
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

2312:M 12 Nov 2021 22:41:40.111 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
2312:M 12 Nov 2021 22:41:40.111 # Server initialized
2312:M 12 Nov 2021 22:41:40.111 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
2312:M 12 Nov 2021 22:41:40.111 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
[New Thread 0x7ffff6587700 (LWP 2316)]
[New Thread 0x7ffff5d86700 (LWP 2317)]
[New Thread 0x7ffff5585700 (LWP 2318)]
[New Thread 0x7ffff4d84700 (LWP 2319)]
2312:M 12 Nov 2021 22:41:40.114 * Ready to accept connections

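3.2 continue

After the program stops at a breakpoint or is interrupted with Ctrl+C, continue (c) resumes execution. The program stops again if it reaches another breakpoint.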

2312:M 12 Nov 2021 22:41:40.114 * Ready to accept connections
^C
Thread 1 "redis-server" received signal SIGINT, Interrupt.
0x00007ffff715f1b7 in epoll_wait () from /lib64/libc.so.6
(gdb) c
Continuing.

3.3 break

Sets a breakpoint. Use break (b) followed by a function name, a line number, or filename:line number.

break functionname
break LineNo
break filename:LineNo

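3.4 tbreak

Sets a temporary breakpoint, which is deleted automatically after it has been hit once.

Three types of breakpoints

Normal breakpoints, conditional breakpoints, and data breakpoints.

A normal breakpoint is any breakpoint we set that is neither a conditional breakpoint nor a data (hardware) breakpoint.

A data breakpoint triggers when a watched memory location or variable changes value; some of the breakpoints added with the watch command are data breakpoints.

A conditional breakpoint triggers only when a given condition holds. The format is break [lineNo] if [condition].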

#include <iostream>
using namespace std;
int main() {
    for(int i=0; i<10000; ++i) {
        cout << "!" << endl;
    }
    return 0;
}

Here, break 5 if i == 5000 sets a breakpoint on line 5 that triggers only when i equals 5000.

3.5 backtrace (bt) and frame

bt shows the call stack of the current thread.

(gdb) bt
#0  anetListen (err=0xbb35e8 <server+680> "", s=6, sa=0xdc6c20, len=28, backlog=511) at anet.c:466
#1  0x0000000000430988 in _anetTcpServer (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, af=10, backlog=511) at anet.c:501
#2  0x0000000000430a63 in anetTcp6Server (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, backlog=511) at anet.c:524
#3  0x000000000043684b in listenToPort (port=6379, fds=0xbb34b4 <server+372>, count=0xbb34f4 <server+436>) at server.c:2648
#4  0x0000000000436f80 in initServer () at server.c:2792
#5  0x000000000043cda6 in main (argc=1, argv=0x7fffffffe3b8) at server.c:5128

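The example shows six stack frames: the outermost (#5) is main, and the innermost (#0) is anetListen, where the breakpoint was hit. To switch to another frame, use the frame (f) command.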

(gdb) bt
#0  anetListen (err=0xbb35e8 <server+680> "", s=6, sa=0xdc6c20, len=28, backlog=511) at anet.c:466
#1  0x0000000000430988 in _anetTcpServer (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, af=10, backlog=511) at anet.c:501
#2  0x0000000000430a63 in anetTcp6Server (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, backlog=511) at anet.c:524
#3  0x000000000043684b in listenToPort (port=6379, fds=0xbb34b4 <server+372>, count=0xbb34f4 <server+436>) at server.c:2648
#4  0x0000000000436f80 in initServer () at server.c:2792
#5  0x000000000043cda6 in main (argc=1, argv=0x7fffffffe3b8) at server.c:5128
(gdb) f 0
#0  anetListen (err=0xbb35e8 <server+680> "", s=6, sa=0xdc6c20, len=28, backlog=511) at anet.c:466
466	    return ANET_OK;
(gdb) f 1
#1  0x0000000000430988 in _anetTcpServer (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, af=10, backlog=511) at anet.c:501
501	        if (anetListen(err,s,p->ai_addr,p->ai_addrlen,backlog) == ANET_ERR) s = ANET_ERR;
(gdb) f 2
#2  0x0000000000430a63 in anetTcp6Server (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, backlog=511) at anet.c:524
524	    return _anetTcpServer(err, port, bindaddr, AF_INET6, backlog);
(gdb) f 3
#3  0x000000000043684b in listenToPort (port=6379, fds=0xbb34b4 <server+372>, count=0xbb34f4 <server+436>) at server.c:2648
2648	            fds[*count] = anetTcp6Server(server.neterr,port,NULL,
(gdb) f 4
#4  0x0000000000436f80 in initServer () at server.c:2792
2792	        listenToPort(server.port,server.ipfd,&server.ipfd_count) == C_ERR)
(gdb) f 5
#5  0x000000000043cda6 in main (argc=1, argv=0x7fffffffe3b8) at server.c:5128
5128	    initServer();

The call order here is main -> initServer -> listenToPort -> anetTcp6Server -> _anetTcpServer -> anetListen, where the breakpoint was hit.

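3.6 info break, enable, disable, delete

info break (info b) lists the breakpoints currently set, enable activates a breakpoint, disable deactivates one, and delete removes one.

Note that if enable, disable, or delete is given without a breakpoint number, it applies to all breakpoints.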

(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x000000000043c735 in main at server.c:5001
	breakpoint already hit 1 time
2       breakpoint     keep y   0x0000000000430718 in anetListen at anet.c:455
	breakpoint already hit 1 time
3       breakpoint     keep y   0x0000000000430762 in anetListen at anet.c:458
4       breakpoint     keep y   0x00000000004307ae in anetListen at anet.c:464
5       breakpoint     keep y   0x00000000004307b5 in anetListen at anet.c:466
	breakpoint already hit 1 time

(gdb) disable 2
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x000000000043c735 in main at server.c:5001
	breakpoint already hit 1 time
2       breakpoint     keep n   0x0000000000430718 in anetListen at anet.c:455
	breakpoint already hit 1 time
3       breakpoint     keep y   0x0000000000430762 in anetListen at anet.c:458
4       breakpoint     keep y   0x00000000004307ae in anetListen at anet.c:464
5       breakpoint     keep y   0x00000000004307b5 in anetListen at anet.c:466
	breakpoint already hit 1 time

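3.7 list

The list command shows the source code around the current position.

list (l) prints 10 lines around the current line the first time it is used; each further list prints the next 10 lines.

list - prints the 10 lines before those last printed.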

3.8 print and ptype

print (p) displays the value of a variable; it can also modify a variable's value in memory directly.

(gdb) bt
#0  anetListen (err=0xbb35e8 <server+680> "", s=6, sa=0xdc6c20, len=28, backlog=511) at anet.c:466
#1  0x0000000000430988 in _anetTcpServer (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, af=10, backlog=511) at anet.c:501
#2  0x0000000000430a63 in anetTcp6Server (err=0xbb35e8 <server+680> "", port=6379, bindaddr=0x0, backlog=511) at anet.c:524
#3  0x000000000043684b in listenToPort (port=6379, fds=0xbb34b4 <server+372>, count=0xbb34f4 <server+436>) at server.c:2648
#4  0x0000000000436f80 in initServer () at server.c:2792
#5  0x000000000043cda6 in main (argc=1, argv=0x7fffffffe3b8) at server.c:5128
(gdb) f 4
#4  0x0000000000436f80 in initServer () at server.c:2792
2792	        listenToPort(server.port,server.ipfd,&server.ipfd_count) == C_ERR)
(gdb) p server.port
$1 = 6379
(gdb) p server.ipfd
$2 = {0 <repeats 16 times>}

## now try modifying the value

(gdb) p server.port=6400
$3 = 6400
(gdb) p server.port
$4 = 6400

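When modifying values with print, pay attention to where the variable is used and what it does, so that the change does not make the program crash (for example through a division by zero or a null pointer dereference).

ptype, short for print type as the name suggests, prints the type of a variable.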

(gdb) ptype server
type = struct redisServer {
    pid_t pid;
    char *configfile;
    char *executable;
    char **exec_argv;
    int dynamic_hz;
    int config_hz;
    int hz;
    redisDb *db;
    dict *commands;
    dict *orig_commands;
    aeEventLoop *el;
    unsigned int lruclock;
    int shutdown_asap;
    int activerehashing;
    int active_defrag_running;
    char *pidfile;
    int arch_bits;
    int cronloops;
    char runid[41];
    int sentinel_mode;
    size_t initial_memory_usage;
    int always_show_logo;
    dict *moduleapi;
	... (remaining fields omitted)
(gdb) ptype server.port
type = int

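3.9 info and thread

info is a composite command that displays different kinds of information depending on its subcommand; info threads, for example, shows the state of every thread in the current process.

eg: viewing thread state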

(gdb) info threads
  Id   Target Id                                          Frame 
* 1    Thread 0x7ffff7fdef80 (LWP 2487) "redis-server"    0x00007ffff715f1b7 in epoll_wait () from /lib64/libc.so.6
  2    Thread 0x7ffff6587700 (LWP 2488) "bio_close_file"  0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3    Thread 0x7ffff5d86700 (LWP 2489) "bio_aof_fsync"   0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4    Thread 0x7ffff5585700 (LWP 2490) "bio_lazy_free"   0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5    Thread 0x7ffff4d84700 (LWP 2491) "jemalloc_bg_thd" 0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

eg: viewing breakpoint state

(gdb) info b
Num     Type           Disp Enb Address            What
6       breakpoint     keep y   0x000000000043c735 in main at server.c:5001
7       breakpoint     keep y   0x0000000000433671 in serverLogRaw at server.c:1023

Switching between threads

The * marks the current thread. Use thread followed by a thread Id to switch to another one.

(gdb) info threads
  Id   Target Id                                          Frame 
* 1    Thread 0x7ffff7fdef80 (LWP 2487) "redis-server"    0x00007ffff715f1b7 in epoll_wait () from /lib64/libc.so.6
  2    Thread 0x7ffff6587700 (LWP 2488) "bio_close_file"  0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3    Thread 0x7ffff5d86700 (LWP 2489) "bio_aof_fsync"   0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4    Thread 0x7ffff5585700 (LWP 2490) "bio_lazy_free"   0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5    Thread 0x7ffff4d84700 (LWP 2491) "jemalloc_bg_thd" 0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff5d86700 (LWP 2489))]
#0  0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) info threads
  Id   Target Id                                          Frame 
  1    Thread 0x7ffff7fdef80 (LWP 2487) "redis-server"    0x00007ffff715f1b7 in epoll_wait () from /lib64/libc.so.6
  2    Thread 0x7ffff6587700 (LWP 2488) "bio_close_file"  0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 3    Thread 0x7ffff5d86700 (LWP 2489) "bio_aof_fsync"   0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4    Thread 0x7ffff5585700 (LWP 2490) "bio_lazy_free"   0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5    Thread 0x7ffff4d84700 (LWP 2491) "jemalloc_bg_thd" 0x00007ffff743348c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

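3.10 next, step, until, finish, return, jump

next (n): step over. When the current line contains a function call, the call is executed as a whole without entering the function body.

step (s): step into. When the current line contains a function call, execution enters that function.

until (u): run until a specified line is reached, then stop. Without an argument, it behaves like next but does not go backward in loops, which makes it useful for leaving a loop.

finish: run the rest of the current function, then return normally to the caller and stop there.

return: return to the caller immediately from the current position. Any code remaining in the current function is not executed. An expression may follow return to supply the function's return value.

jump: resume execution at a specified location, skipping the code in between. Unless there is a breakpoint at or after that location, the program simply keeps running from there.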
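A small call chain is enough to try these commands side by side. The sketch below uses two hypothetical functions, square and sum_of_squares, which are not part of the original text; the comments mark where each command is worth trying in an unoptimized (-g -O0) build.

```cpp
// Hypothetical call chain for practising next/step/until/finish/return.
int square(int x) {
    int result = x * x;   // stopped here, `finish` runs to the end of square()
                          // and stops back in sum_of_squares()
    return result;        // stopped here, `return 0` would instead force the
                          // caller to receive 0 without running this line
}

int sum_of_squares(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += square(i);   // `step` enters square(); `next` steps over it
    }
    return total;             // `until` with this line as its argument runs the
                              // remaining loop iterations and stops here
}
```

Calling sum_of_squares(3) from main and setting a breakpoint on sum_of_squares gives a convenient starting point; printing total after each command shows which code actually ran.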

3.11 set args and show args

After starting gdb filename, use set args before run to specify the program's command-line arguments, and show args to display them.

(gdb) set args ../redis.conf 
(gdb) show args
Argument list to give program being debugged when it is started is "../redis.conf ".
(gdb) r
Starting program: /home/doper/github/redis-6.0.3/src/redis-server ../redis.conf

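3.12 watch

watch monitors a variable or a region of memory. When the value of that variable or memory changes, gdb stops the program. Watching a variable or a memory location creates a watchpoint.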

int i;
(gdb) watch i
char *p;
(gdb) watch p	# watches the pointer itself
(gdb) watch *p	# watches the content it points to
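The difference between watching a pointer and watching what it points to is easiest to see on a function that changes both. The sketch below is a hypothetical example (watch_demo is not from the original text) meant for an unoptimized build; set the watchpoints once p has been initialized.

```cpp
// Hypothetical function for practising `watch p` versus `watch *p`.
int watch_demo() {
    int a = 1;
    int b = 2;
    int* p = &a;   // set `watch p` and `watch *p` after this line has run
    *p = 10;       // the pointed-to value changes: reported by `watch *p`
    p = &b;        // the pointer changes: reported by `watch p`
                   // (`watch *p` reports it too, since *p now reads b)
    *p = 20;       // b changes through p: reported by `watch *p`
    return a + b;  // 10 + 20
}
```

Because p, a, and b are locals, gdb removes these watchpoints automatically when watch_demo returns and its frame disappears.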

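3.13 display

display registers a variable, a memory location, or a register to be shown automatically: each time gdb stops the program, the current values of all displayed expressions are printed.

3.14 dir

Sometimes the machine that built the executable is not the machine that runs it. If the program then crashes and you debug the core file with gdb, source listings fail with "No such file or directory".

An executable built by gcc/g++ does not contain the source code; -g only adds a mapping between locations in the executable and locations in the source. The dir (directory) command tells gdb where to find the source files, so that this mapping can be resolved again.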

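4. Debugging multithreaded programs with gdb

4.1 How to debug a multithreaded program

  • Start the program under gdb, interrupt it with Ctrl+C, and use info threads to see which threads are running in the process.
  • Use thread followed by a thread Id (the number in the Id column) to switch to that thread.
  • Use bt to view that thread's call stack.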

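4.2 Controlling thread switching while debugging (important)

set scheduler-locking on locks the scheduler to the current thread so that you observe only that thread. While it is on, all other threads stay paused, which means they do not run when you execute next, step, until, finish, or return in the current thread.

set scheduler-locking step locks the current thread only while you single-step with next or step. With commands such as until, finish, or return, which are not single-step commands, other threads still get a chance to run.

set scheduler-locking off releases the lock, so all threads may run again.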
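The effect of these settings is easier to observe on a program where several threads touch the same data. The following is a minimal sketch under stated assumptions: g_counter, worker, and run_workers are hypothetical names, and the shared counter is atomic so the program itself stays free of data races.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Shared counter that every worker thread increments.
std::atomic<long> g_counter{0};

// Each worker adds 1 to the shared counter `iterations` times.
void worker(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        g_counter.fetch_add(1);   // break here, then compare `next` with
                                  // scheduler-locking on and off
    }
}

// Starts `n_threads` workers, waits for all of them, returns the final count.
long run_workers(int n_threads, int iterations) {
    g_counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t) {
        threads.emplace_back(worker, iterations);
    }
    for (auto& th : threads) {
        th.join();
    }
    return g_counter.load();
}
```

Built with g++ -g -O0 -pthread and stopped inside worker, printing g_counter before and after a next should show the counter advancing by exactly one with scheduler-locking on, and possibly by more with it off, since the other workers are then free to run.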

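5. Debugging multi-process programs with gdb

5.1 Method 1

Start the parent process in one shell window, wait for the child process to be forked, then open another shell window and attach gdb to the child. The transcript below uses gdb attach 2904, which works because gdb treats a numeric second argument as a PID when no file of that name exists; gdb -p 2904 is the documented form.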

# shell1
[root@localhost sbin]# ./nginx -c /usr/local/nginx/conf/nginx.con
# shell2
[root@localhost ~]# lsof -i -Pn | grep nginx
nginx     2903    root    8u  IPv4  49492      0t0  TCP *:80 (LISTEN)
nginx     2904  nobody    8u  IPv4  49492      0t0  TCP *:80 (LISTEN)
[root@localhost ~]# gdb attach 2904
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-15.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
... (output omitted)
(gdb) bt
#0  0x00007f89f141817b in epoll_wait () from /lib64/libc.so.6
#1  0x000000000044e546 in ngx_epoll_process_events (cycle=0xa46cf0, timer=18446744073709551615, flags=1) at src/event/modules/ngx_epoll_module.c:800
#2  0x000000000043f317 in ngx_process_events_and_timers (cycle=0xa46cf0) at src/event/ngx_event.c:247
#3  0x000000000044c38f in ngx_worker_process_cycle (cycle=0xa46cf0, data=0x0) at src/os/unix/ngx_process_cycle.c:750
#4  0x000000000044926f in ngx_spawn_process (cycle=0xa46cf0, proc=0x44c2e1 <ngx_worker_process_cycle>, data=0x0, name=0x4cfd60 "worker process", respawn=-3)
    at src/os/unix/ngx_process.c:199
#5  0x000000000044b5a4 in ngx_start_worker_processes (cycle=0xa46cf0, n=1, type=-3) at src/os/unix/ngx_process_cycle.c:359
#6  0x000000000044acf4 in ngx_master_process_cycle (cycle=0xa46cf0) at src/os/unix/ngx_process_cycle.c:131
#7  0x000000000040bc05 in main (argc=3, argv=0x7ffdec94bca8) at src/core/nginx.c:382
(gdb)

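5.2 Method 2

gdb provides a follow-fork-mode setting. set follow-fork-mode controls whether, after the process forks a new child, gdb keeps debugging the parent (value parent) or switches to the child (value child). The default is to keep debugging the parent.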

# after a fork, gdb follows the child process
set follow-fork-mode child
# after a fork, gdb stays with the parent process (the default)
set follow-fork-mode parent
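A program that forks once is enough to try both values. The sketch below is a minimal one, assuming a Linux/POSIX system; child_work and spawn_and_wait are hypothetical names introduced for illustration.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs only in the child process: a convenient place for a breakpoint.
int child_work() {
    return 7;
}

// Forks once. The child exits with the value of child_work(); the parent
// waits for it and returns the child's exit status, or -1 on any error.
int spawn_and_wait() {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                 // fork failed
    }
    if (pid == 0) {
        _exit(child_work());       // child: never returns to the caller
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a breakpoint on child_work, set follow-fork-mode child should make gdb stop inside the child, whereas with the default parent mode the child runs to completion outside the debugger and gdb stays in spawn_and_wait.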
