The Linux Command Line, 2nd Edition

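It has been almost two years since I first read the first edition of this book, when I was still a complete newbie. This is my second read, this time of the second edition, to reorganize my notes and consolidate the less commonly used material. (2021/02/06 - 2021/03/13)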

Part 1

When we speak of the command line, we are really referring to the shell. The shell is a program that takes keyboard commands and passes them to the operating system to carry out.

If the last character of the prompt is a hash mark (#) rather than a dollar sign ($), the terminal session has superuser privileges.

Terminal emulator: think about why it is called an emulator. It is a program provided inside the GUI, rather than the terminal you land in directly after the system boots.

Unix-like systems such as Linux always have a single file system tree, regardless of how many drives or storage devices are attached to the computer. The tree starts at the root directory, /.

Linux has no concept of a file extension like some other operating systems. Although Unix-like operating systems don’t use file extensions to determine the contents/purpose of files, many application programs do.

Some options and arguments that were new to me or are not commonly used:

# cd to home of vagrant user
cd ~vagrant
# list multiple directories
ls ~ /usr /opt

# A: list all but . and ..
# F: append / for directories
ls -AF

# the 3 after the permissions is the number of hard links to this entry
drwxr-xr-x 3 root root 24 Jun 6 2020 mnt

# in less page hit 'h': display help screen
less file

# re-initialize the terminal, more thorough than clear
# rarely needed
reset

# -u: When copying files from one directory to another
# only copy files that either don’t exist or are newer
# than the existing corresponding files in the destination directory.
cp -u|--update

# Normally, copies take on the default attributes of the user performing the copy.
# -a (archive): copy recursively and preserve attributes such as permissions,
# ownership (when permitted), and timestamps
# for example, when root copies an ordinary user's files
cp -a

Remember that a file will always have at least one hard link because the file’s name is created by a link. When we create hard links, we are actually creating additional name parts that all refer to the same data part.

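Users cannot create hard links to directories. On traditional file systems the link count shown for a directory is usually 2 + n: one for its name in the parent directory, one for its own . entry, and one for the .. entry inside each of its n subdirectories.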
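
The name part / data part idea can be checked with a small experiment. This is only a sketch in a scratch directory created with mktemp; the file names fun and fun-hard are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)
echo "data" > "$tmp/fun"

# Field 2 of `ls -l` is the hard link count.
before=$(ls -l "$tmp/fun" | awk '{print $2}')

# Create a second name for the same data part.
ln "$tmp/fun" "$tmp/fun-hard"
after=$(ls -l "$tmp/fun" | awk '{print $2}')

echo "links before=$before after=$after"

# -ef is true when both names refer to the same device and inode.
if [ "$tmp/fun" -ef "$tmp/fun-hard" ]; then
    same=yes
else
    same=no
fi
echo "same inode: $same"

rm -rf "$tmp"
```

Removing either name with rm only lowers the link count; the data part is released when the last name is gone.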

The type command is a shell builtin that displays the kind of command the shell will execute. A command can be one of:

  • executable binary
  • script
  • shell builtin
  • function
  • alias

which is for locating executables only. apropos is the same as man -k:

apropos partition
man -k partition

Modern version of redirection:

# &> file is equivalent to > file 2>&1
ls -l /bin/usr &> ls-output.txt
# append
echo "123" &>> ls-output.txt

An interesting accident is shown below. The lesson is that the redirection operator silently creates or overwrites files, so you need to treat it with a lot of respect.

cd /usr/bin
# this overwrites the less program in /usr/bin with the listing (when run as root)
# what was intended: ls | less
ls > less

Pathname expansion (I have a separate blog post covering shell globbing, pattern matching, and expansion):

# list hidden files without . and ..
echo .[!.]*

# arithmetic expansion
# note the difference: (( )) is a compound command for integer arithmetic, used in conditions such as if, while, for, etc.
echo $((3 + 4))
# can use single parentheses inside without $ prefix
echo "$(((5**2) * 3))"

Remember, parameter expansion “$USER”, arithmetic expansion, and command substitution still take place within double quotes. If we need to suppress all expansions, we use single quotes.
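
A small sketch of that difference; the variable name and the strings are made up for illustration. The same text is assigned once in double quotes and once in single quotes.

```shell
name="world"

# Double quotes: parameter expansion, arithmetic expansion, and command
# substitution are all still performed.
double="Hello $name, 2+2=$((2 + 2)), today is $(echo Saturday)"

# Single quotes: everything is kept literally.
single='Hello $name, 2+2=$((2 + 2)), today is $(echo Saturday)'

echo "$double"
echo "$single"
```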

Command history quick rerun:

# execute last command
!!
# execute the command at line 123 of the history
!123

# be cautious
# last command starts with "string"
!string
# last command contains "string"
!?string

The book also mentions a command that was new to me, script, which records an entire terminal session to a file; this is useful, for example, for students following along. The command history itself is controlled by environment variables:

# do not record a command that duplicates the previous history entry
export HISTCONTROL=ignoredups
export HISTSIZE=1000

Load average in the top command: refers to the number of processes that are waiting to run; that is, the number of processes that are in a runnable state and are sharing the CPU. Three values are shown, each for a different period of time. The first is the average for the last 60 seconds, the next the previous 5 minutes, and finally the previous 15 minutes. Values less than 1.0 indicate that the machine is not busy (this also depends on the number of cores: on a multi-core machine the load average can exceed 1.0 without the system being overloaded).

Other useful keys inside top: ? or h for help, f to select columns and the sort field, m to toggle the memory summary display, and Shift+m to sort by memory usage.

About signals (the book explains this part well):

  • HUP: Hang up. Besides indicating that the controlling terminal has gone away, it is also used by many daemon programs to cause a reinitialization. This means that when a daemon is sent this signal, it will reload its configuration file. The Apache web server is an example of a daemon that uses the HUP signal in this way.
  • INT: Interrupt. This performs the same function as ctrl-c sent from the terminal. It will usually terminate a program.
  • TERM: Terminate. This is the default signal sent by the kill command. If a program is still “alive” enough to receive signals, it will terminate.
  • STOP: Stop. This signal causes a process to pause without terminating. Like the KILL signal, it is not sent to the target process, and thus it cannot be ignored.
  • TSTP: Terminal stop. This is the signal sent by the terminal when ctrl-z is pressed. Unlike the STOP signal, the TSTP signal is received by the program, but the program may choose to ignore it.
  • CONT: Continue. This will restore a process after a STOP or TSTP signal. This signal is sent by the bg and fg commands. For example, press ctrl-z to suspend a foreground job, then run bg to resume it in the background.
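
To make the HUP behavior concrete, here is a minimal sketch of a script that catches HUP the way a daemon might when asked to reload. The reload_config function and the counter are hypothetical; a real daemon would re-read its configuration file there.

```shell
# Hypothetical "daemon": count how many times a reload was requested.
reloads=0
reload_config() {
    reloads=$((reloads + 1))
    echo "reload #$reloads requested"
}

# Run reload_config whenever this shell receives HUP instead of terminating.
trap reload_config HUP

# Send HUP to ourselves twice; $$ is the PID of the current shell.
kill -HUP $$
kill -HUP $$

echo "total reloads: $reloads"
```

Signals such as KILL and STOP cannot be trapped this way, which matches the descriptions in the list above.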

Part 2

The set command will show both the shell and environment variables as well as any defined shell functions, while printenv will display only the environment variables.

Environment variables: PS1 stands for prompt string 1 and defines the contents of the shell prompt. The prompt can also be customized with shell functions and scripts. There are also PS2, PS3, and so on; see this article.

Part 3

High- and Low-Level Package Tools

Distributions                            | Low-level tools | High-level tools
-----------------------------------------+-----------------+------------------------
Debian-style                             | dpkg            | apt-get, apt, aptitude
Fedora, Red Hat Enterprise Linux, CentOS | rpm             | yum, dnf

Low-level tools handle tasks such as installing and removing package files. High-level tools perform metadata searching and dependency resolution.

It is enough to know the install, update, remove, and search operations; both the low-level and the high-level tools have options for them.

Linux storage device naming conventions:

  • /dev/fd*: floppy disk drives
  • /dev/hd*: IDE (PATA) disks on older systems
  • /dev/lp*: printers
  • /dev/sd*: SCSI disks; on modern Linux systems, the kernel treats all disk-like devices as SCSI disks
  • /dev/sr*: optical drives (CD/DVD readers and burners)

/var/log/messages vs /var/log/syslog: on non-Debian/non-Ubuntu systems such as RHEL or CentOS, /var/log/messages is usually the main system log. Where both files exist, /var/log/messages aims to store valuable, non-debug and non-critical messages and should be considered the “general system activity” log, while /var/log/syslog logs everything except auth-related messages.

# a great way to watch what the system is doing in near real-time.
tail -f /var/log/messages

The last field of an /etc/fstab entry specifies the order in which fsck (file system check) checks file systems for integrity at boot; 0 means the file system is not routinely checked:

/dev/sdb /data xfs defaults 0 0

fsck can also repair corrupt file systems with varying degrees of success, depending on the amount of damage. On Unix-like file systems, recovered portions of files are placed in the lost+found directory, located in the root of each file system.

# unmount /dev/sdb1 first
sudo fsck /dev/sdb1

dd copies blocks of data from one place to another (the name is often traced to the "data definition" statement of IBM's job control language):

# clone sdb onto sdc (everything on sdc is overwritten, so double-check if= and of=)
dd if=/dev/sdb of=/dev/sdc
# backup to ordinary file
dd if=/dev/sdb of=/tmp/sdb.bk
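
Since dd on real devices is destructive, the same idea can be tried safely on ordinary files. The sketch below assumes a scratch directory from mktemp, and the names src.img and backup.img are made up; it copies a small file block by block and verifies the copy.

```shell
tmp=$(mktemp -d)

# Make a 4 KiB stand-in for a device: 4 blocks of 1024 random bytes.
dd if=/dev/urandom of="$tmp/src.img" bs=1024 count=4 2>/dev/null

# Copy it block by block; dd reports its statistics on stderr, silenced here.
dd if="$tmp/src.img" of="$tmp/backup.img" bs=1024 2>/dev/null

# cmp -s exits 0 only when the two files are byte-for-byte identical.
if cmp -s "$tmp/src.img" "$tmp/backup.img"; then
    result="identical"
else
    result="different"
fi
size=$(( $(wc -c < "$tmp/backup.img") ))
echo "backup is $result, $size bytes"

rm -rf "$tmp"
```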

When it comes to networking, there is probably nothing that cannot be done with Linux. Linux is used to build all sorts of networking systems and appliances, including firewalls, routers, name servers, network-attached storage (NAS) boxes, and on and on.

The locate database is created by another program named updatedb (you can run it manually, or let a system cron job run it). The find command supports logical operators between its tests:

# -and/-a
# -or/-o
# -not/!
# escape ( and ) because they have special meaning to the shell
# an object is either a file or a directory, never both, so the two cases are combined with -or
find ~ \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \)

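# -print is the default action; always test with it before using a destructive action
find ~ -type f -name '*.bak' -print
# the same as
find ~ -type f -and -name '*.bak' -and -print
# order matters: expressions are evaluated left to right with short-circuiting,
# just like conditions in an if statement; here -print comes first, so every path is printed
find ~ -print -and -type f -and -name '*.bak'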
# -delete action
find ~ -type f -name '*.bak' -delete

# -exec command '{}' ';'
# command is the name of a command, {} stands for the current pathname,
# and the semicolon is a required delimiter marking the end of the command
# quote or escape {} and ; because they have special meaning to the shell
find ~ -type f -name 'foo*' -exec ls -l '{}' ';'
# -ok: like -exec, but asks the user for confirmation before each command
find ~ -type f -name 'foo*' -ok ls -l '{}' ';'

# combining several tests: directories directly under /tmp modified more than 50 minutes ago
find /tmp -mindepth 1 -maxdepth 1 -type d -mmin +50 -exec ls -d {} \;

Range searches on timestamps are really useful: use the -atime, -ctime, and -mtime tests (or -amin, -cmin, and -mmin for minutes).

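xargs is more efficient than a regular -exec command {} ; because, when used properly, it runs the command once with many arguments instead of once per file. For a comparison of the two, see FIND -EXEC VS. FIND | XARGS. Different options give very different performance!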

# -r: do not run the command when xargs receives no input
# without -r, xargs always runs the command once, producing misleading output
find ~ -type f -name 'foo*' -print | xargs -r ls -l
# same as -exec with +
find ~ -type f -name 'foo*' -exec ls -l '{}' '+'

# practices
find ~ \( -type f -not -perm 0600 -exec chmod 0600 '{}' ';' \) -or \( -type d -not -perm 0700 -exec chmod 0700 '{}' ';' \)
# with tar
# r: append mode to tar
find ~ -name 'file-A' -exec tar rf old.tar '{}' '+'

# -: stdin or stdout as needed
# --files-from=- means the file list is read from the previous stage of the pipeline
# cf - means the archive is written to stdout for the next stage of the pipeline
find playground -name 'file-A' | tar cf - --files-from=- | gzip > playground.tgz
# -T - is short for --files-from=-
find playground -name 'file-A' | tar cf target.tar -T -

# tar and transfer remote dir to local and untar
ssh remote-sys 'tar cf - Documents' | tar xf -
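
The efficiency difference between the three styles can be made visible by counting how often the command is started. This is only a sketch: the scratch directory and the foo file names are made up, `sh -c 'echo run'` stands in for a real command, and -print0 / -0 / -r are assumed to be available as in GNU findutils.

```shell
tmp=$(mktemp -d)
touch "$tmp/foo 1" "$tmp/foo 2" "$tmp/foo 3"

# One "run" line is printed per invocation of the command.
# ';' starts the command once per file.
per_file=$(( $(find "$tmp" -type f -name 'foo*' -exec sh -c 'echo run' sh '{}' ';' | wc -l) ))

# '+' batches as many pathnames as possible into one invocation.
batched=$(( $(find "$tmp" -type f -name 'foo*' -exec sh -c 'echo run' sh '{}' '+' | wc -l) ))

# xargs batches too; -print0 with -0 keeps names containing spaces intact.
via_xargs=$(( $(find "$tmp" -type f -name 'foo*' -print0 | xargs -0 -r sh -c 'echo run' sh | wc -l) ))

echo "per_file=$per_file batched=$batched via_xargs=$via_xargs"
rm -rf "$tmp"
```

With only three short names everything fits into a single batch; with very long file lists, both + and xargs split the work into several invocations.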

If the filename - is specified, it is taken to mean standard input or output, as needed. (By the way, this convention of using - to represent standard input/output is used by a number of other programs, too).

Next come some text processing commands; part of this functionality is what git builds on. Commonly used and well-known ones: cat, sort, uniq, cut, paste, join, comm, diff, patch, tr, sed, aspell.

sort: many uses of sort involve processing tabular data:

# --key/-k 1,1: the key starts at field 1 and ends at field 1
# --key=2n: sort field 2 numerically
sort --key=1,1 --key=2n distros.txt

# sort on specific part of a field
# Fedora 10 11/25/2008
# Ubuntu 8.10 10/30/2008
sort -k 3.7nbr -k 3.1nbr -k 3.4nbr distros.txt

# sort on the login shell field (field 7, with : as the delimiter)
sort -t ':' -k 7 /etc/passwd | head
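
The key options are easier to follow with concrete data. The sketch below builds a tiny tab-separated file in the same layout as distros.txt (the four rows are sample values chosen for illustration) and passes the tab explicitly with -t so field positions are unambiguous.

```shell
tmp=$(mktemp -d)
tab=$(printf '\t')

# Columns: name, version, release date (MM/DD/YYYY), separated by tabs.
printf '%s\t%s\t%s\n' \
    Fedora 10 11/25/2008 \
    Ubuntu 8.10 10/30/2008 \
    SUSE 11.0 06/19/2008 \
    Fedora 9 05/13/2008 > "$tmp/distros.txt"

# Field 1 alphabetically, then field 2 numerically (so 9 sorts before 10).
by_name=$(sort -t "$tab" -k 1,1 -k 2n "$tmp/distros.txt")

# Newest first: year (from character 7 of field 3), then month, then day.
newest_first=$(sort -t "$tab" -k 3.7nbr -k 3.1nbr -k 3.4nbr "$tmp/distros.txt")

echo "$by_name"
echo "---"
echo "$newest_first"
rm -rf "$tmp"
```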

cut: cut is best used to extract text from files that are produced by other programs, rather than text directly typed by humans.

# Fedora    10   11/25/2008
# Ubuntu 8.10 10/30/2008
# extract the year
# -c: character range
# -f: field index, start from 1
cut -f 3 file | cut -c 7-10
# expand converts tabs to spaces, so character positions become predictable
# (tabs only align the fields visually)
expand file | cut -c 23-

paste: it adds one or more columns of text to a file in argument order

# assume file1, file2, and file3 each contain columns of text
paste file1 file2 file3

join: joins data from multiple files based on a shared key field, the same as a database join operation:

# assume file1 and file2 share a key field and are both sorted on it
join file1 file2
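
A minimal sketch of the join behavior, using two made-up files keyed on their first field. Only keys present in both files appear in the result, which corresponds to an inner join.

```shell
tmp=$(mktemp -d)

# Two hypothetical files sharing a key in field 1, both sorted on that key.
printf '%s\n' '1 apple' '2 banana' '3 cherry' > "$tmp/names.txt"
printf '%s\n' '1 red' '3 dark-red' '4 green' > "$tmp/colors.txt"

# Keys 2 and 4 exist in only one file, so they are dropped.
joined=$(join "$tmp/names.txt" "$tmp/colors.txt")
echo "$joined"

rm -rf "$tmp"
```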

comm: compare two sorted files line by line.
diff: compare files line by line.

# column 1: lines only in file1
# column 2: lines only in file2
# column 3: lines in both file1 and file2
# -n suppresses column n; -12 suppresses columns 1 and 2, leaving only the common lines
comm -12 file1.txt file2.txt
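
The three columns can be pulled apart one at a time by suppressing the other two. A small sketch with two made-up sorted files:

```shell
tmp=$(mktemp -d)
printf '%s\n' a b c d > "$tmp/file1.txt"
printf '%s\n' b c d e > "$tmp/file2.txt"

# Suppress columns 2 and 3: lines unique to file1.
only_in_1=$(comm -23 "$tmp/file1.txt" "$tmp/file2.txt")
# Suppress columns 1 and 3: lines unique to file2.
only_in_2=$(comm -13 "$tmp/file1.txt" "$tmp/file2.txt")
# Suppress columns 1 and 2: lines common to both, joined onto one line.
common=$(comm -12 "$tmp/file1.txt" "$tmp/file2.txt" | paste -s -d ' ' -)

echo "only in file1: $only_in_1"
echo "only in file2: $only_in_2"
echo "in both: $common"
rm -rf "$tmp"
```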

patch: apply a diff to an original. It accepts output from diff and is generally used to convert older version files into newer versions. Git workflows are conceptually similar: changes travel as diffs that are then applied like patches to the target branch.

# run diff and patch from the same directory so the paths recorded in the diff resolve
diff -Naur old_file new_file > diff_file
# diff_file has enough information for patch
patch < diff_file

Here old_file and new_file are either single files or directories containing files. The r option makes diff recurse through a directory tree.

tr: think of this as a sort of character-based search-and-replace operation.

echo "lowercase letters" | tr a-z A-Z
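# quote character classes so the shell does not treat the brackets as a glob
echo "lowercase letters" | tr '[:lower:]' '[:upper:]'
# delete carriage returns to convert a DOS text file to Unix format
tr -d '\r' < dos_file > unix_file
# squeeze runs of a and of b into a single character (prints abccc)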
echo "aaabbbccc" | tr -s ab

nl: number lines; supports header, body, and footer sections.
fold: wrap each line to a specified length.

# -w: width of 50 characters
# -s: break at spaces, not in the middle of a word
fold -w 50 -s test.file

fmt: a simple text formatter; it fills and joins lines in text while preserving blank lines and indentation.

# -w: width 50 characters
# -c: crown margin mode, preserves the indentation of the first two lines of a paragraph;
# the result often reads better than fold output
fmt -cw 50 test.file

printf: not used in pipelines (it does not read standard input); it is used mostly in scripts to format tabular data, similar to C printf:

printf "I formatted '%s' as a string.\n" foo
# the conversion specifications are the same as in C
# %[flags][width][.precision]placeholder
printf "%d, %f, %o, %s, %x, %X\n" 380 380 380 380 380 380
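
The flags and width parts of the specification are what make printf useful for tables. A small sketch with made-up values: - left-aligns, a number sets the minimum width, and a leading 0 pads with zeros. When there are more arguments than conversions, the format string is reused, which produces one row per group of arguments.

```shell
# One row: left-aligned 8-char string, right-aligned 5-char integer,
# zero-padded 5-digit integer, and a hexadecimal value.
row=$(printf '%-8s|%5d|%05d|%x' disk 42 42 255)
echo "$row"

# The format is reused for every pair of arguments, giving two rows.
table=$(printf '%-8s|%5d\n' alpha 1 beta 22)
echo "$table"
```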

The make command is used to build C/C++ programs. Some compilers translate high level instructions into assembly language and then use an assembler to perform the final stage of translation into machine language. A process often used in conjunction with compiling is called linking. A program called a linker is used to form the connections between the output of the compiler and the libraries that the compiled program requires. The final result of this process is the executable program file, ready for use.

Programs such as shell scripts do not require compiling; they are executed directly. They are written in what are known as scripting or interpreted languages, which have grown in popularity in recent years and include Perl, Python, PHP, Ruby, and many others.

We need a Makefile (usually generated by the configure script) to tell the make command how to compile the source code:

# this script usually collects system configuration and checks
# for the required libraries
./configure
# compile and build app
# will choose Makefile in the same directory automatically
make
# install app
# install is another target defined in the Makefile
sudo make install

install will install the final product in a system directory for use. Usually, this directory is /usr/local/bin, the traditional location for locally built software. However, this directory is not normally writable by ordinary users, so we must become the superuser to perform the installation.

Part 4

Part 4 primarily talks about scripting; I keep most of those notes in my shell and scripting blog posts.

Positional parameters: $0 always contains the first item appearing on the command line, exactly as typed (the pathname of the script being run). When there are many parameters, use shift to walk through them:

count=1
# after each shift, $2 moves down to $1, $3 to $2, and so on, and $# decreases by 1
while [[ $# -gt 0 ]]; do
    echo "Argument $count = $1"
    count=$((count + 1))
    shift
done

Just as positional parameters are used to pass arguments to shell scripts, they can also be used to pass arguments to shell functions.

Here documents (<<[-]) and here strings (<<<): <<- strips leading tab characters from the body, which lets a here document be indented inside a script; which one to use depends on your use case (usually << is fine).
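
A short sketch of the variants, assuming bash (here strings are not available in every POSIX shell). The variable names are made up. Quoting the delimiter turns off expansion inside the body, similar to single quotes.

```shell
name="world"

# Unquoted delimiter: expansions are performed inside the body.
expanded=$(cat << EOF
Hello $name, 1+1=$((1 + 1))
EOF
)

# Quoted delimiter: the body is taken literally.
literal=$(cat << 'EOF'
Hello $name, 1+1=$((1 + 1))
EOF
)

# Here string: a single string is fed to the command's standard input.
shouted=$(tr a-z A-Z <<< "hello $name")

echo "$expanded"
echo "$literal"
echo "$shouted"
```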

If the optional in words portion of the for command is omitted, for defaults to processing the positional parameters:

# set positional parameters
set a b c
# without the in keyword, for loops over the positional parameters
for i; do
    echo "$i"
done

The modern form of the for loop mimics C:

# i is treated as an integer; no $ prefix is needed inside (( ))
for (( i=0; i<5; i=i+1 )); do
    echo "$i"
done

Arithmetic evaluation:

# $(( )) uses the same integer arithmetic as (( )), but expands to the result
echo $(( a + 2 ))
echo $(( 5 % 2 ))
# number bases: the output is always printed in base 10
# here $1 is interpreted as a hexadecimal (base 16) value
echo $((0x$1))
echo $((0xff))
# 2#: base 2
echo $((2#11111111))

# C-style shortcuts also work, including inside the for loop (( ))
# use (( )) for the side effect; a bare $(( )) would try to run its result as a command
declare -i a=0
((a+=1))
((a-=1))
((a/=1))
((a*=1))
((a++))
((a--))
((++a))
((--a))

# bit operations are the same as C
for ((i=0;i<8;++i)); do echo $((1<<i)); done
# comparison and logical operators: < > <= >= == != && || !, plus the ternary ?: operator
echo $((a<1?++a:--a))

As in C, an assignment and a test can be done in one expression:

foo=
# foo is assigned 5 and the expression is true because the value is nonzero
# note that = here is assignment, not comparison (==), whereas in [ ] = means the same as ==
if (( foo = 5 )); then echo "It is true."; fi

The bc program reads a file written in its own C-like language and executes it. A bc script may be a separate file, or it may be read from standard input.

# start bc interactively, reading from stdin; -q suppresses the welcome banner
bc -q
# use here string
bc <<< "2+2"

A group command does not create a subshell:

# note the required space after { and before }
# each command, including the last, must end with ; or a newline
{ cmd1; cmd2; [cmd3; ...] }

# the redirection applies to the combined output of the whole group
{ ls -l; echo "Listing of foo.txt"; cat foo.txt; } > output.txt

Therefore, in most cases, unless a script requires a subshell, group commands are preferable to subshells. Group commands are both faster and require less memory.
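
The practical consequence of running in the current shell can be sketched with a variable assignment; the counter variable is made up for illustration. A change made inside { } is still visible afterwards, while a change made inside ( ) is lost with the subshell.

```shell
counter=0

# Group command: runs in the current shell, so the assignment persists.
{ counter=$((counter + 1)); }
after_group=$counter

# Subshell: runs in a child copy of the shell, so the assignment is lost.
( counter=$((counter + 100)) )
after_subshell=$counter

echo "after group: $after_group, after subshell: $after_subshell"
```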

Process substitution feeds the output of a process (or processes) into the stdin of another process. The examples here mostly combine it with read:

# redirect the output of a process substitution into read
read < <(echo "foo")
# or using here string
read <<< "foo"
echo $REPLY

Process substitution allows us to treat the output of a subshell as an ordinary file for purposes of redirection; it can be thought of as a file.

# prints the file name the shell assigned, for example /dev/fd/63
echo <(echo "foo")

Process substitution is often used with loops containing read.
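
The reason is that a loop at the end of a pipeline runs in a subshell, so variables it sets disappear afterwards. The sketch below assumes bash with its default options (process substitution is not POSIX, and some other shells run the last pipeline stage in the current shell); the counters are made up for illustration.

```shell
# Piping into while: the loop body runs in a subshell under bash defaults,
# so the updated count does not survive the loop.
piped=0
printf '%s\n' a b c | while read -r line; do
    piped=$((piped + 1))
done

# Process substitution: the loop runs in the current shell and the
# count is still available after it ends.
substituted=0
while read -r line; do
    substituted=$((substituted + 1))
done < <(printf '%s\n' a b c)

echo "piped=$piped substituted=$substituted"
```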

Other example of process substitution:

grep word /usr/share/dict/linux.words | wc
# can be rewritten as (wc then also prints the /dev/fd/N file name)
wc <(grep word /usr/share/dict/linux.words)

In most Unix-like systems, it is possible to create a special type of file called a named pipe. Named pipes are used to create a connection between two processes and can be used just like other types of files. Named pipes behave like files but actually form first-in first-out (FIFO) buffers.

mkfifo pipe1
# ls -l shows the file type as p (pipe)
# prw-r--r--. 1 root root 0 Mar 13 23:11 pipe1

# in terminal 1
ls -l > pipe1

# in terminal 2
cat < pipe1

The writer blocks until another process opens the pipe for reading (and a reader likewise blocks until there is a writer).
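
The two-terminal experiment can also be reproduced in a single script by putting the writer in the background. This is a sketch in a scratch directory; the pipe name and the message are made up.

```shell
tmp=$(mktemp -d)
mkfifo "$tmp/pipe1"

# Writer in the background: it blocks until a reader opens the pipe.
printf '%s\n' "hello through the pipe" > "$tmp/pipe1" &

# Reader: opening the pipe unblocks the writer and receives its output.
received=$(cat < "$tmp/pipe1")

# Wait for the background writer to finish before cleaning up.
wait

echo "received: $received"
rm -rf "$tmp"
```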
