Shell Processes Demystified

Understanding how the shell works under the hood is a must for me: I have run into several interesting and confusing shell issues in my daily work. Let’s dive into shell processes to explore how subshells are created and how parent and child shells relate to each other.

Shell Type

Due to the bash shell’s popularity, it’s rare to use any other shell as a default shell.

The default interactive shell starts whenever a user logs into a virtual console terminal or starts a terminal emulator in the GUI. Another default shell, /bin/sh, is the default system shell. The default system shell is used for system shell scripts, such as those needed at startup.

On my Red Hat and CentOS systems, they are the same:

lrwxrwxrwx. 1 root root 4 Apr 13  2018 /bin/sh -> bash

To see a user’s default login shell, look at the last field of that user’s entry in /etc/passwd, for example:

fyre:x:1000:1000::/home/fyre:/bin/bash
demo:x:1001:1001::/home/demo:/bin/bash
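
As a small sketch of reading that field from a script, the snippet below pulls the login shell out of one of the sample entries above. The variable names are only for illustration, and the commented getent line assumes getent is available on your system.

```shell
#!/usr/bin/env bash
# Sample /etc/passwd entry copied from the listing above.
entry='fyre:x:1000:1000::/home/fyre:/bin/bash'

# Fields are colon-separated; the login shell is the seventh field.
login_shell=$(printf '%s\n' "$entry" | cut -d: -f7)
echo "login shell: $login_shell"

# On a live system, the same field for the current user
# (assumption: getent is installed):
# getent passwd "$(id -un)" | cut -d: -f7
```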

Shell Relationships

Run bash (the child) inside your current shell (the parent), then use ps -f to see the relationship between the two:

# run bash
# then run ps -f

UID PID PPID C STIME TTY TIME CMD
root 7762 7758 0 Jun03 pts/0 00:00:00 -bash
root 14957 7762 0 17:12 pts/0 00:00:00 bash
root 15028 14957 0 17:13 pts/0 00:00:00 ps -f

Here PID 14957 has parent 7762.
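
The same relationship can be checked from inside the shells themselves. The following is a minimal sketch, assuming it is run as a bash script; the temporary file is just a way to get the child's answer back without adding another process in between.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)   # scratch file, only used by this sketch

# Start a child bash that reports its own PID ($$) and its parent's PID ($PPID).
# Writing to a file instead of using $(...) avoids an extra subshell in between.
bash -c 'echo "$$ $PPID"' > "$tmp"
read -r child_pid child_ppid < "$tmp"
rm -f "$tmp"

echo "this shell: PID $$"
echo "child bash: PID $child_pid, PPID $child_ppid"
```

The PPID printed by the child should match the PID of the shell that started it, just like the PID and PPID columns in the ps -f output.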

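A child shell is often loosely called a subshell, although strictly speaking a subshell is a forked copy of the current shell rather than a newly started bash. A child shell can be created from a parent shell or from another child shell. For example, run bash three times: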

# run bash three times, then:
ps --forest -f

UID PID PPID C STIME TTY TIME CMD
root 7762 7758 0 Jun03 pts/0 00:00:00 -bash
root 2264 7762 0 23:52 pts/0 00:00:00  \_ bash
root 2467 2264 0 23:55 pts/0 00:00:00      \_ bash
root 2487 2467 0 23:55 pts/0 00:00:00          \_ bash
root 2510 2487 0 23:55 pts/0 00:00:00              \_ ps --forest -f

Constructs That Create a Subshell

Refer to this article: What is a subshell?

Note that subshells are often used for multi-processing in shell scripts. However, creating a subshell is expensive and can significantly slow down processing.

A subshell is typically implemented by forking a new process (but some shells may optimize this in some cases).

  • Subshell for grouping: (...) does nothing but create a subshell and wait for it to terminate. Contrast with {...} which groups commands purely for syntactic purposes and does not create a subshell.
  • Background &: creates a subshell and does not wait for it to terminate.
  • Pipeline: | creates two subshells, one for the left-hand side and one for the right-hand side, and waits for both to terminate. The shell creates a pipe and connects the left-hand side’s standard output to the write end of the pipe and the right-hand side’s standard input to the read end. In some shells (ksh88, ksh93, zsh, bash with the lastpipe option set and effective), the right-hand side runs in the original shell, so the pipeline construct only creates one subshell.
  • Command substitution: $() creates a subshell with its standard output set to a pipe, collects the output in the parent and expands to that output, minus its trailing newlines. (And the output may be further subject to splitting and globbing, but that’s another story.)
  • Process substitution: <(cmd) creates a subshell with its standard output set to a pipe and expands to the name of the pipe. The parent (or some other process) may open the pipe to communicate with the subshell. >(cmd) does the same but with the pipe on standard input.
  • Coprocess: coproc creates a subshell and does not wait for it to terminate. The subshell’s standard input and output are each set to a pipe with the parent being connected to the other end of each pipe.
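
One way to observe several of these constructs is to assign a variable inside each of them and check whether the change is still visible afterwards. This is a minimal sketch, assuming bash with its default options (in particular, lastpipe not set); the variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Each construct below tries to change x. Only changes made in the
# current shell survive; changes made in a subshell are lost.

x=original
( x=changed-in-parens )               # (...) runs in a subshell
after_parens=$x

x=original
{ x=changed-in-braces; }              # {...} runs in the current shell
after_braces=$x

x=original
echo piped | read -r x                # right-hand side of | is a subshell here
after_pipe=$x

x=original
: "$(x=changed-in-substitution)"      # $() runs in a subshell
after_subst=$x

echo "after (...): $after_parens"     # original
echo "after {...}: $after_braces"     # changed-in-braces
echo "after pipe:  $after_pipe"       # original
echo "after \$():   $after_subst"     # original
```

In shells where the right-hand side of a pipeline runs in the original shell, the pipe case would report the new value instead.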

Process List

For a command list to be considered a process list (a grouping), the commands must be enclosed in parentheses (). Adding the parentheses turns the command list into a process list, which creates a subshell to execute the commands.

# echo $BASH_SUBSHELL
0
# (echo $BASH_SUBSHELL)
1
# ( (echo $BASH_SUBSHELL) )
2

How do the parent’s variables behave inside a subshell ()? Long story short, from this, this and this posts: a subshell () inherits all variables, exported or not. Even $$ keeps reporting the PID of the original shell. The reason is that, for a subshell, the shell just forks and does not execute a new shell (as it would when running a script such as ./xx).
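
The following sketch illustrates this behavior under the assumption that it runs in bash. It uses BASHPID, a bash variable that reports the PID of the current process, to show that the subshell really is a different process even though $$ does not change. The variable names are only for illustration.

```shell
#!/usr/bin/env bash
greeting=hello    # deliberately NOT exported

# A subshell is a fork of the current shell, so it sees unexported variables.
in_subshell=$( echo "$greeting" )

# A newly executed shell only sees exported variables.
in_new_shell=$( bash -c 'echo "${greeting:-unset}"' )

# Inside a subshell, $$ is still the original shell's PID,
# while BASHPID is the PID of the forked process.
sub_pids=$( echo "$$ $BASHPID" )
sub_dollar=${sub_pids% *}
sub_bashpid=${sub_pids#* }

echo "subshell sees greeting:  $in_subshell"     # hello
echo "new shell sees greeting: $in_new_shell"    # unset
echo "original shell PID:      $$"
echo "\$\$ inside subshell:      $sub_dollar"
echo "BASHPID inside subshell: $sub_bashpid"
```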

Note: a subshell () is usually combined with &.

Background mode

Background mode is very handy, and it provides a method for creating useful subshells at the CLI.

# jobs -l
[1]+ 7552 Running sleep 40 &

[1] is the job number, 7552 is the PID, and Running is the job status. The jobs command displays your processes (jobs) that were started from the current shell and are running in background mode.

Using a process list in background mode is one creative way to use subshells at the CLI. Remember how we start Jetty in the conductor container? docker load and scp are also sometimes good candidates for background execution.
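
A minimal sketch of a process list running in the background is shown here. The sleep and echo commands are placeholders for real long-running work, and the scratch log file is an assumption made only for this example.

```shell
#!/usr/bin/env bash
log=$(mktemp)    # scratch file that collects the subshell's output

# The whole process list runs in a single background subshell.
( sleep 1; echo "step one"; sleep 1; echo "step two" ) > "$log" &
bg_pid=$!        # $! holds the PID of the most recent background job

echo "subshell $bg_pid is running; the parent shell is free to continue"

wait "$bg_pid"   # block until the background subshell finishes
bg_status=$?
bg_output=$(cat "$log")
rm -f "$log"

echo "exit status: $bg_status"
echo "$bg_output"
```

Grouping the commands with () keeps them sequential inside the subshell while the parent shell carries on with other work.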

Co-processing

Co-processing behaves almost identically to putting a command in background mode, except that it also connects the subshell’s standard input and output to pipes shared with the parent.

# coproc sleep 2
[1] 8174
[1]+ Done coproc COPROC sleep 2

It is almost the same as:

# (sleep 2) &

COPROC is the name given to the process; you can change it:

# coproc My_Job { sleep 10; }

The only time you need to name a co-process is when you have multiple co-processes running, and you need to communicate with them all. Otherwise, just let the coproc command set the name to the default, COPROC.
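
To make the communication part concrete, here is a small sketch of talking to a named co-process through its pipes. It assumes bash 4.1 or later; ECHOER and the other variable names are hypothetical and chosen only for this example. The file descriptors and PID are copied into plain variables first, because bash clears the co-process variables once the co-process exits.

```shell
#!/usr/bin/env bash
# Start a co-process that answers every input line it receives.
coproc ECHOER { while IFS= read -r line; do echo "got: $line"; done; }

# ECHOER[0] is connected to the co-process's stdout, ECHOER[1] to its stdin.
out_fd=${ECHOER[0]}
in_fd=${ECHOER[1]}
coproc_pid=$ECHOER_PID

echo "hello" >&"$in_fd"          # send one line to the co-process
IFS= read -r reply <&"$out_fd"   # read its answer
echo "$reply"                    # got: hello

exec {in_fd}>&-                  # close the write end so the loop sees EOF
wait "$coproc_pid"               # reap the co-process
```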

This will create a nested subshell:

# coproc ( sleep 10; sleep 2 )

My question

Remember that in the conductor container we start Jetty using (...) &, because we want it to run in a separate process in the background. Why not just &? If I want to run something in the background, should I use & to put the command in the background, or (...) & to put a subshell in the background?

To answer this, I tested the following two cases but did not see any difference:

sleep 1 & ps -f
(sleep 1)& ps -f

Different Linux distros may behave differently, so check yours first. For now, using & directly on a command is fine.

Shell Built-in Commands

An external command, sometimes called a filesystem command, is a program that exists outside of the bash shell; it is not built into the shell program. An external command is typically located in /bin, /usr/bin, /sbin, or /usr/sbin.

# which ps
/usr/bin/ps

# type -a ps
ps is /usr/bin/ps

Whenever an external command is executed, a child process is created. This action is termed forking. It takes time and effort to set up the new child process’s environment. Thus, external commands can be a little expensive.

When using a built-in command, no forking is required. Therefore, built-in commands are less expensive.

Built-in commands are different in that they do not need a child process to execute. They were compiled into the shell and thus are part of the shell’s toolkit.
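
A quick way to tell the two kinds apart is the type builtin. The sketch below assumes bash, since the -t option is specific to bash, and assumes ls is installed as a regular program on the PATH.

```shell
#!/usr/bin/env bash
# type -t prints a single word describing how bash would run a name:
# "builtin" for built-in commands, "file" for external programs on disk.
kind_cd=$(type -t cd)
kind_echo=$(type -t echo)
kind_ls=$(type -t ls)

echo "cd:   $kind_cd"      # builtin
echo "echo: $kind_echo"    # builtin
echo "ls:   $kind_ls"      # file

# Some commands exist in both flavors; type -a lists every match.
type -a echo
```

When a name exists both as a built-in and as an external program, bash runs the built-in unless the program's path is given explicitly.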
