Asynchronous Processes

While studying orphan and zombie processes, I ran into a question about the wait command: in my shell scripts, background child processes were still reaped and never became zombies even though I never called wait. For an explanation, please see my question.

To recap: the bash wait builtin and the Linux wait() API are not the same thing. Bash takes care of reaping child processes for you, and the wait builtin has no effect on reaping. Bash stores each child's exit status in memory, and it becomes available to you when you call wait.
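To see the stored exit status in action, here is a minimal sketch. It assumes a throwaway subshell that exits with status 7 (an arbitrary value chosen for illustration) and a one-second pause that should be long enough for the child to finish before wait is called.

```shell
#!/bin/bash
# Start a short-lived child; the status 7 is arbitrary, chosen for illustration.
(exit 7) &
pid=$!

# Give the child time to finish before we ask about it.
sleep 1

# The child has already exited, yet wait can still report its exit status.
wait "$pid"
status=$?
echo "child $pid exited with status $status"
```

The point of the sketch is that wait here only retrieves a status bash already remembered; it is not what keeps the child from becoming a zombie.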

Sometimes when I run time-consuming tasks, I want them to execute in parallel to improve CPU utilization and reduce execution time (if the machine has multiple cores or processors).

Let's look at different patterns for doing that in a shell script. For example, I have these scripts: back.sh

#!/bin/bash
tail -f /dev/null

hello.sh

#!/bin/bash
echo "====== hello"
exit 0

Wait for all background tasks

Suppose main.sh is:

# $! holds the PID of the most recently started background process
declare -a nums=(1 2 3)
for i in "${nums[@]}"
do
    ./hello.sh &
    echo "###### PID is $!"
done
wait
echo "done!"

You will get a result like this; done! is printed only after all background processes have finished:

###### PID is 11649
###### PID is 11650
###### PID is 11651
====== hello
====== hello
====== hello
done!

But if main.sh is:

./back.sh &
declare -a nums=(1 2 3)
for i in "${nums[@]}"
do
    ./hello.sh &
    echo "###### PID is $!"
done
wait
echo "done!"

wait blocks until all background tasks complete, so you will never see done! because back.sh never exits. You have to use the kill command to terminate it.

A better approach is to pass only the relevant PIDs to wait, so that it ignores the unrelated background task back.sh:

declare -a nums=(1 2 3)
declare -a pids
./back.sh &
for i in "${nums[@]}"
do
    ./hello.sh &
    echo "###### PID is $!"
    # remember only the PIDs we actually want to wait for
    pids+=("$!")
done
wait "${pids[@]}"
echo "done!"

Wait for background tasks in arbitrary order

This approach is very similar to the example above, but we wait for each process individually. Also notice that wait PID returns the exit code of that child process. If no PID is given, wait waits for all currently active child processes and the return status is zero. Check man wait or help wait for details.
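The difference between the two forms of wait can be checked with a small sketch. The subshells and the exit status 3 are stand-ins chosen for illustration, not part of the scripts above.

```shell
#!/bin/bash
# Case 1: wait on a specific PID returns that child's exit status.
(exit 3) &
wait "$!"
with_pid=$?

# Case 2: wait without arguments waits for every child and returns 0,
# even though the child itself failed.
(exit 3) &
wait
without_pid=$?

echo "with PID: $with_pid, without PID: $without_pid"
```

This is why a bare wait is fine for synchronization but not enough when the script has to react to a failing child.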

In main.sh, write:

#!/bin/bash
declare -a nums=(1 2 3)
declare -a pids
# initialize the index; if n were unset, ${n} would expand to an empty subscript
n=0
for i in "${nums[@]}"
do
    ./hello.sh &
    echo "###### PID is $!"
    # the subscript inside [ ] is evaluated as a number, so a bare n works
    pids[n]=$!
    let n+=1
done

for pid in "${pids[@]}"
do
    # wait returns the exit code of that child
    if wait "${pid}"; then
        echo "success"
    else
        echo "abnormal"
    fi
done
echo "done!"
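As a self-contained variation on the loop above, the following sketch also reports each failing exit status and counts the failures. The function task and the list of exit codes are hypothetical stand-ins for hello.sh, introduced only so that the example runs without extra files.

```shell
#!/bin/bash
# Hypothetical stand-in for hello.sh: exits with the status it is given.
task() {
    return "$1"
}

declare -a pids=()
declare -a codes=(0 2 0)   # assumed exit codes, one per background task

for code in "${codes[@]}"; do
    task "$code" &
    pids+=("$!")
done

failed=0
for pid in "${pids[@]}"; do
    if wait "$pid"; then
        echo "PID $pid: success"
    else
        # $? still holds the status that wait just returned
        echo "PID $pid: abnormal (exit status $?)"
        failed=$((failed + 1))
    fi
done
echo "done! failed=$failed"
```

Counting failures this way lets the script decide at the end whether to exit non-zero itself.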

SIGCHLD signal

When a child process exits or is terminated, a SIGCHLD signal is sent to its parent. You can trap it and do something in response, for example release resources. You need to enable job control first; see this issue.

Using SIGCHLD to catch the moment a child process terminates:

#!/bin/bash
# enable job control, see man set
# set -m is the same
set -o monitor
# trap SIGCHLD; the trap action must be a runnable command
trap 'echo "reaping child process"' SIGCHLD

(sleep 2) &

# do other things
tail -f /dev/null
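A variation that terminates on its own can make the trap easier to observe. In this sketch the handler on_child_exit and the counter reaped are hypothetical names, and the script blocks in the wait builtin instead of a never-ending foreground command, on the assumption that this lets the handler run as soon as the child exits across bash versions.

```shell
#!/bin/bash
# Enable job control, as in the script above (same as set -m).
set -o monitor

reaped=0
# Hypothetical handler: count terminated children instead of just printing.
on_child_exit() {
    reaped=$((reaped + 1))
}
trap on_child_exit SIGCHLD

(sleep 1) &

# Block in the wait builtin until the background child is done.
wait "$!"

echo "handler ran $reaped time(s)"
```

Because the handler runs in the main shell, the counter it updates is visible to the rest of the script.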

Others

Actually, the jobs command can monitor background processes:

# ./back.sh  &
[1] 15405
# jobs -l
[1]+ 15405 Running ./back.sh &
# jobs -p
15405
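Since jobs -p prints bare PIDs, it can also feed kill, which is one way to deal with a background task that never exits, such as back.sh. A minimal sketch, assuming tail -f /dev/null as a stand-in for back.sh and that it is the only background job of the script:

```shell
#!/bin/bash
# Stand-in for back.sh: a background task that never exits on its own.
tail -f /dev/null &
pid=$!

# jobs -p lists the PIDs of the current background jobs.
running=$(jobs -p)
echo "background PIDs: $running"

# Terminate them (SIGTERM by default); unquoted on purpose so that
# several PIDs would be passed as separate arguments.
kill $running

# Collect the exit status; a child killed by a signal reports 128 + signal number.
wait "$pid"
status=$?
echo "exit status: $status"
```

With the default SIGTERM (signal 15) on Linux, the reported status is expected to be 143.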