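I recently reworked one of my programs to use Go's graceful shutdown mechanism.
The program uses etcd's election SDK for high-availability leader election. When a node goes offline unexpectedly, it has to resign proactively in etcd (delete its 10s lease); otherwise etcd keeps treating the offline node as the leader until that lease expires.
So graceful shutdown is a hard technical requirement here.
In addition, factor 9 of the cloud-native Twelve-Factor App methodology says that fast startup and graceful shutdown maximize robustness, so I recommend following this practice as well. Fast startup and shutdown are advocated for a more robust and resilient system.
The straightforward approach: catch the program's termination signals and resign proactively.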
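Standard signals [1]
Linux supports the standard signals listed below. The second column indicates which standard the signal conforms to.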
Signal      Standard  Action  Comment
────────────────────────────────────────────────────────────────────────
SIGABRT     P1990     Core    Abort signal from abort(3)
SIGALRM     P1990     Term    Timer signal from alarm(2)
SIGBUS      P2001     Core    Bus error (bad memory access)
SIGCHLD     P1990     Ign     Child stopped or terminated
SIGCLD      -         Ign     A synonym for SIGCHLD
SIGCONT     P1990     Cont    Continue if stopped
SIGEMT      -         Term    Emulator trap
SIGFPE      P1990     Core    Floating-point exception
SIGHUP      P1990     Term    Hangup detected on controlling terminal
                              or death of controlling process
SIGILL      P1990     Core    Illegal Instruction
SIGINFO     -                 A synonym for SIGPWR
SIGINT      P1990     Term    Interrupt from keyboard
SIGIO       -         Term    I/O now possible (4.2BSD)
SIGIOT      -         Core    IOT trap. A synonym for SIGABRT
SIGKILL     P1990     Term    Kill signal
SIGLOST     -         Term    File lock lost (unused)
SIGPIPE     P1990     Term    Broken pipe: write to pipe with no
                              readers; see pipe(7)
SIGPOLL     P2001     Term    Pollable event (Sys V);
                              synonym for SIGIO
SIGPROF     P2001     Term    Profiling timer expired
SIGPWR      -         Term    Power failure (System V)
SIGQUIT     P1990     Core    Quit from keyboard
SIGSEGV     P1990     Core    Invalid memory reference
SIGSTKFLT   -         Term    Stack fault on coprocessor (unused)
SIGSTOP     P1990     Stop    Stop process
SIGTSTP     P1990     Stop    Stop typed at terminal
SIGSYS      P2001     Core    Bad system call (SVr4);
                              see also seccomp(2)
SIGTERM     P1990     Term    Termination signal
SIGTRAP     P2001     Core    Trace/breakpoint trap
SIGTTIN     P1990     Stop    Terminal input for background process
SIGTTOU     P1990     Stop    Terminal output for background process
SIGUNUSED   -         Core    Synonymous with SIGSYS
SIGURG      P2001     Ign     Urgent condition on socket (4.2BSD)
SIGUSR1     P1990     Term    User-defined signal 1
SIGUSR2     P1990     Term    User-defined signal 2
SIGVTALRM   P2001     Term    Virtual alarm clock (4.2BSD)
SIGXCPU     P2001     Core    CPU time limit exceeded (4.2BSD);
                              see setrlimit(2)
SIGXFSZ     P2001     Core    File size limit exceeded (4.2BSD);
                              see setrlimit(2)
SIGWINCH    -         Ign     Window resize signal (4.3BSD, Sun)
Of these, the SIGKILL and SIGSTOP signals cannot be caught, blocked, or ignored.
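There are three common ways to terminate a program:
1. CTRL+C actually sends the SIGINT signal.
2. kill pid sends the SIGTERM signal to the specified process (this is the signal kill sends by default). If the application has no logic to catch and respond to it, the default action is to terminate the process. This is the recommended way to terminate a process.
3. kill -9 pid sends the SIGKILL signal to the specified process. SIGKILL can neither be caught by the application nor blocked or ignored.
So to achieve our goal, catching the SIGINT and SIGTERM signals is enough.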
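As a quick cross-check of the three cases above, the following Go sketch prints the number and name that Go's syscall package reports for each signal, plus whether a handler can be installed for it. The helper names catchable, describe, and terminationSignals are made up for this illustration, and the sketch assumes a Unix-like system such as Linux.

```go
package main

import (
	"fmt"
	"syscall"
)

// catchable reports whether a process can install a handler for sig.
// SIGKILL and SIGSTOP are the two signals that cannot be caught,
// blocked, or ignored.
func catchable(sig syscall.Signal) bool {
	return sig != syscall.SIGKILL && sig != syscall.SIGSTOP
}

// describe formats one signal as "<number> <name> catchable=<bool>".
func describe(sig syscall.Signal) string {
	return fmt.Sprintf("%d %s catchable=%t", int(sig), sig, catchable(sig))
}

// terminationSignals describes the signals behind CTRL+C, kill pid,
// and kill -9 pid, in that order.
func terminationSignals() []string {
	sigs := []syscall.Signal{syscall.SIGINT, syscall.SIGTERM, syscall.SIGKILL}
	out := make([]string, 0, len(sigs))
	for _, s := range sigs {
		out = append(out, describe(s))
	}
	return out
}

func main() {
	for _, line := range terminationSignals() {
		fmt.Println(line)
	}
}
```

The 9 in kill -9 is simply the signal number of SIGKILL, which is why that form can never be intercepted by the program.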
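Go provides the os/signal package to listen for incoming signals and relay them to the program.
For a long-running program, you can start a new goroutine that keeps listening for signals and put the graceful shutdown code there.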
    // Buffered channel: signal.Notify does not block when sending to it.
    c := make(chan os.Signal, 1)
    signal.Notify(c, syscall.SIGTERM, syscall.SIGINT)
    go func() {
        sig := <-c // wait for SIGTERM or SIGINT
        log.Infof("Got %s signal. Aborting...\n", sig)
        eCli.Close() // resign proactively through the etcd election SDK
        os.Exit(1)
    }()
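Does this still work in containers?
We need to look at how Docker officially defines the docker stop and docker kill commands.
docker stop [2]: The main process inside the container will receive SIGTERM, and after a grace period, SIGKILL. (default grace period = 10s)
docker kill [3]: The main process inside the container is sent SIGKILL signal (default), or the signal that is specified with the --signal option
The docker stop command we commonly use sends SIGTERM to the process inside the container and then SIGKILL 10s later. Those 10s give the program a chance to shut down gracefully, so the logic in the code above does work in containers.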
Ref: The Twelve-Factor App methodology
References
[1] Standard signals: https://www.man7.org/linux/man-pages/man7/signal.7.html
[2] docker stop: https://docs.docker.com/engine/reference/commandline/stop/
[3] docker kill: https://docs.docker.com/engine/reference/commandline/kill/