Docker Multi-process
An application container generally has only one main service process (which can spawn child processes). It needs an init process (PID 1) to reap children and handle signals. In other words, if your service process forks but does not reap its children, you need an init process; otherwise zombie processes will accumulate.
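To make the zombie problem concrete, here is a minimal Python sketch (Linux only, because it reads /proc). It forks a child that exits at once, shows that the child stays in state Z until the parent calls waitpid, and shows that reaping removes it. The helper names (process_state, spawn_and_exit_child, wait_until_zombie) are made up for this illustration.

```python
import os
import time


def process_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field comes right after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def spawn_and_exit_child():
    """Fork a child that exits immediately and return its PID."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit right away without parent cleanup
    return pid


def wait_until_zombie(pid, timeout=2.0):
    """Poll until the unreaped child shows state 'Z', or give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process_state(pid) == "Z":
            return True
        time.sleep(0.01)
    return False


if __name__ == "__main__":
    child = spawn_and_exit_child()
    print("zombie before reaping:", wait_until_zombie(child))
    os.waitpid(child, 0)  # reaping removes the zombie entry
    print("still in /proc:", os.path.exists(f"/proc/{child}"))
```

A service that forks like this but never calls waitpid leaves one such entry behind per child, which is the situation an init process is meant to clean up.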
This matters especially when you run a third-party app in a container: you often do not know whether it spawns child processes or reaps them, so it is best to use an init process. See this article and what is the advantage of tini.
Exploration
Here is an article about running multiple services in Docker. The options could be:
- using --init, which runs the docker-init process backed by tini.
      # ps aux can see docker-init
- using wrapper script, for example, entrypoint script.
- running the main process along with temporary processes, with job control set in the wrapper script.
- installing and configuring a dedicated init process, for example, supervisord, tini, dumb-init, etc.
For catching signals and reaping child processes, if you are not using tini or another dedicated init process, you need to write that code yourself.
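As a rough idea of what "writing that code yourself" involves, below is a minimal handler-based sketch in Python. It is not how tini or any other init is implemented. The function name run_as_init and the choice of forwarded signals are assumptions made for this example. It starts the service as a child, relays SIGTERM and SIGINT to it, and reaps whatever children exit until the service itself is gone.

```python
import os
import signal


def run_as_init(argv, forwarded=(signal.SIGTERM, signal.SIGINT)):
    """Run argv as a child, relay selected signals to it, and reap children.

    Returns the child's exit code (negative signal number if it was killed).
    """
    state = {"child": None}

    def forward(signum, _frame):
        # Relay the signal to the service once it has been started.
        if state["child"] is not None:
            try:
                os.kill(state["child"], signum)
            except ProcessLookupError:
                pass  # child already gone

    # Install handlers before forking so no signal is missed.
    previous = {sig: signal.signal(sig, forward) for sig in forwarded}
    try:
        pid = os.fork()
        if pid == 0:
            try:
                os.execvp(argv[0], argv)  # exec resets caught signals to default
            finally:
                os._exit(127)  # only reached if exec failed
        state["child"] = pid
        while True:
            # os.wait() reaps any child, including orphans adopted by PID 1.
            reaped, status = os.wait()
            if reaped == pid:
                return os.waitstatus_to_exitcode(status)
    finally:
        # Put the original handlers back (None means "not set from Python").
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)
```

Used as an entrypoint, this would be called as something like run_as_init(["my-service", "--flag"]), where the command is a placeholder. Even this small version has to deal with handler installation order, failed exec, and interrupted waits, which is the argument for using a dedicated init instead.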
init Process
Here are init processes commonly used in containers:
For tini, the steps are:
    # in alpine docker
How tini Proxies Signals
Earlier I read an article about how Linux delivers signals to a container's init process. It explains why kill 1 inside a container sometimes fails, and when the kernel does or does not deliver a signal to the init process. That article only covers part of the source code, namely the init process (SIGNAL_UNKILLABLE) + non-default signal handler + current namespace case; see the second if condition:
https://github.com/torvalds/linux/blob/a76c3d035872bf390d2fd92d8e5badc5ee28b17d/kernel/signal.c#L79-L99
    static bool sig_task_ignored(struct task_struct *t, int sig, bool force)
The emphasis is on the sigcgt bitmask. This matches what Docker documents here:
https://docs.docker.com/engine/reference/run/#foreground
    A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. As a result, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.
In other words, if the user registers a SIGTERM handler in the init process (sigcgt bit set to 1), then handler == SIG_DFL is false, so the init process can receive the signal.
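The link between registering a handler and the sigcgt bit can be checked from user space. The Python sketch below (Linux only) reads the SigCgt field of /proc/self/status with SIGTERM at its default disposition and with a handler installed. The helper names read_mask, has_signal and sigterm_caught_with are hypothetical. It runs as an ordinary process, so it only shows the bitmask and not the PID 1 special case.

```python
import signal


def read_mask(field):
    """Return a signal bitmask such as 'SigCgt' from /proc/self/status as an int."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1], 16)
    raise KeyError(field)


def has_signal(mask, signum):
    # Bit (signum - 1) of the mask stands for signal number signum.
    return bool(mask & (1 << (signum - 1)))


def sigterm_caught_with(handler):
    """Install handler for SIGTERM and report whether the SigCgt bit is set."""
    previous = signal.signal(signal.SIGTERM, handler)
    try:
        return has_signal(read_mask("SigCgt"), signal.SIGTERM)
    finally:
        # Restore whatever was installed before (None means not set from Python).
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    print("default disposition:", sigterm_caught_with(signal.SIG_DFL))
    print("with a handler:", sigterm_caught_with(lambda signum, frame: None))
```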
The problem is that when I checked the signal bitmask of the tini init process, sigcgt was 0 for all fields, so by that logic the kernel would not even deliver the signal. So how can tini forward signals if no signal would be delivered at all? I have opened a question regarding this.
From the author's comment, I learned that the way Tini catches signals is by blocking all signals that should be forwarded to the child, and then waiting for them via sigtimedwait. If you take a closer look at the caller of sig_task_ignored:
https://github.com/torvalds/linux/blob/a76c3d035872bf390d2fd92d8e5badc5ee28b17d/kernel/signal.c#L101-L120
    static bool sig_ignored(struct task_struct *t, int sig, bool force)
There you will see that blocked signals are never ignored! So tini will always receive the signals from the kernel.
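The same pattern can be reproduced in a few lines of Python (Linux only). In this sketch the process blocks SIGTERM without installing any handler, raises the signal against itself, and then picks it up synchronously with sigtimedwait. The helper names are hypothetical, and the script runs as an ordinary process, so it does not exercise the SIGNAL_UNKILLABLE branch. It only shows that the SigCgt bit stays 0 while the SigBlk bit is set, and that the signal is still received.

```python
import signal


def read_mask(field):
    """Return a signal bitmask such as 'SigBlk' from /proc/self/status as an int."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1], 16)
    raise KeyError(field)


def has_signal(mask, signum):
    # Bit (signum - 1) of the mask stands for signal number signum.
    return bool(mask & (1 << (signum - 1)))


def receive_blocked(signum=signal.SIGTERM, timeout=2.0):
    """Block signum, raise it, and collect it with sigtimedwait.

    Returns (blocked_bit, caught_bit, received_signal_number).
    """
    previous = signal.signal(signum, signal.SIG_DFL)  # make sure no handler is set
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        blocked = has_signal(read_mask("SigBlk"), signum)
        caught = has_signal(read_mask("SigCgt"), signum)
        signal.raise_signal(signum)  # stays pending because it is blocked
        info = signal.sigtimedwait({signum}, timeout)
        received = info.si_signo if info is not None else None
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        if previous is not None:
            signal.signal(signum, previous)
    return blocked, caught, received


if __name__ == "__main__":
    print(receive_blocked())
```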
The sigtimedwait call in tini's main loop simply blocks execution until a signal arrives or the timeout expires:
https://github.com/krallin/tini/blob/378bbbc8909a960e89de220b1a4e50781233a740/src/tini.c#L501-L514
    int wait_and_forward_signal(sigset_t const* const parent_sigset_ptr, pid_t const child_pid) {
See the section "Synchronously accepting a signal" in signal(7) for an explanation of sigtimedwait.
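To tie the pieces together, here is a loose Python analogue of a block-then-sigtimedwait loop. It is a sketch of the idea described above, not a port of tini's code. The names spawn, wait_and_forward and supervise, the set of forwarded signals, and the 1 second timeout are assumptions of this example. It also waits on SIGCHLD so that the loop wakes up when the child exits, and it reaps only the main child to keep the example small.

```python
import os
import signal

FORWARDED = {signal.SIGTERM, signal.SIGINT, signal.SIGHUP}
WAITED = FORWARDED | {signal.SIGCHLD}  # SIGCHLD wakes the loop but is not relayed


def spawn(argv, original_mask):
    """Fork and exec argv with the original (unblocked) signal mask."""
    pid = os.fork()
    if pid == 0:
        try:
            # The child must not inherit the blocked mask of the supervisor.
            signal.pthread_sigmask(signal.SIG_SETMASK, original_mask)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    return pid


def wait_and_forward(child_pid, timeout=1.0):
    """Wait for one blocked signal; relay it to the child unless it is SIGCHLD."""
    info = signal.sigtimedwait(WAITED, timeout)
    if info is None:
        return None  # timed out, nothing arrived
    if info.si_signo != signal.SIGCHLD:
        try:
            os.kill(child_pid, info.si_signo)
        except ProcessLookupError:
            pass  # child already gone
    return info.si_signo


def supervise(argv):
    """Block the signals, start the child, then loop: wait, forward, reap."""
    original_mask = signal.pthread_sigmask(signal.SIG_BLOCK, WAITED)
    try:
        child = spawn(argv, original_mask)
        while True:
            wait_and_forward(child)
            # Simplification: only the main child is reaped here.
            reaped, status = os.waitpid(child, os.WNOHANG)
            if reaped == child:
                return os.waitstatus_to_exitcode(status)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, original_mask)
```

No signal handler is installed anywhere in this sketch, so SigCgt stays at 0 for the supervising process while forwarding still works.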
You can also use sudo strace -p <pid> to observe how tini forwards signals:
    # tini is init process