This article walks you through some common yum and rpm usages, based on real-life scenarios.

################################################################
#   Date           Description
#   03/05/2019     yum autoremove
#   03/02/2019     upgrade rpm
#   03/01/2019     list rpm dependencies
#   02/27/2019     yum provides
#   02/25/2019     search rpm installed
#   02/24/2019     install rpm
#   01/19/2019     remove package
#
################################################################

Yum command cheat sheet

The rpm command is one of the package management commands.

01/19/2019

Remove or erase an installed package along with its dependencies:

rpm -ev <package name>
yum erase <package name>

If the rpm is required by other packages, rpm -ev will fail; in that case you can use yum erase to delete them all:

rpm -ev containerd.io

error: Failed dependencies:
containerd.io >= 1.2.2-3 is needed by (installed) docker-ce-3:18.09.2-3.el7.x86_64

Remove or erase an installed package without checking for dependencies:

rpm -ev --nodeps <package name>

For example:

rpm -ev --nodeps containerd.io

Preparing packages...
containerd.io-1.2.2-3.3.el7.x86_64

02/24/2019

This command installs a single rpm file if all dependencies are met; otherwise the install fails and the output shows you the missing rpms.

rpm -ivh <rpm name>

For example:

rpm -ivh 416b2856f8dbb6f07a50a46018fee8596479ebc0eaeec069c26bedfa29033315-kubeadm-1.13.2-0.x86_64.rpm

warning: 416b2856f8dbb6f07a50a46018fee8596479ebc0eaeec069c26bedfa29033315-kubeadm-1.13.2-0.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 3e1ba8d5: NOKEY
error: Failed dependencies:
cri-tools >= 1.11.0 is needed by kubeadm-1.13.2-0.x86_64
kubectl >= 1.6.0 is needed by kubeadm-1.13.2-0.x86_64
kubelet >= 1.6.0 is needed by kubeadm-1.13.2-0.x86_64
kubernetes-cni >= 0.6.0 is needed by kubeadm-1.13.2-0.x86_64

02/25/2019

Both of the following work:

## query installed packages
rpm -qa | grep <package name>
yum list installed | grep <package name>

For example:

rpm -qa | grep docker
docker-ce-18.06.1.ce-3.el7.x86_64
yum list installed | grep docker
docker-ce.x86_64 18.06.1.ce-3.el7 installed

02/27/2019

Find which packages provide the queried file or command, for example:

yum provides host

32:bind-utils-9.9.4-14.el7.x86_64 : Utilities for querying DNS name servers
Repo : Local-Base
Matched from:
Filename : /usr/bin/host
...

Next you can install it:

yum install -y bind-utils

03/01/2019

If you have a local rpm file, you can list its dependencies by running:

rpm -qpR <rpm name>

For example:

rpm -qpR 416b2856f8dbb6f07a50a46018fee8596479ebc0eaeec069c26bedfa29033315-kubeadm-1.13.2-0.x86_64.rpm

warning: 416b2856f8dbb6f07a50a46018fee8596479ebc0eaeec069c26bedfa29033315-kubeadm-1.13.2-0.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 3e1ba8d5: NOKEY
cri-tools >= 1.11.0
kubectl >= 1.6.0
kubelet >= 1.6.0
kubernetes-cni >= 0.6.0
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

03/02/2019

If you run man rpm, there are two similar statements:

The general form of an rpm upgrade command is

rpm {-U|--upgrade} [install-options] PACKAGE_FILE ...

This upgrades or installs the package currently installed to a newer version. This is the same as install,
except all other version(s) of the package are removed after the new package is installed.

rpm {-F|--freshen} [install-options] PACKAGE_FILE ...

This will upgrade packages, but only ones for which an earlier version is installed.

rpm -Fvh and rpm -Uvh can perform the same upgrade task; the difference is that rpm -Uvh also behaves like rpm -ivh, so you can use either rpm -ivh or rpm -Uvh to install a package.

For upgrading an installed package you can use either rpm -Fvh or rpm -Uvh.

In other words: rpm -Fvh only upgrades an existing (installed) package, while rpm -Uvh both installs and upgrades.

For example, upgrade ansible from 2.4.6.0 to 2.7.8:

rpm -Fvh ansible-2.7.8-1.el7.ans.noarch.rpm

warning: ansible-2.7.8-1.el7.ans.noarch.rpm: Header V4 RSA/SHA1 Signature, key ID 442667a9: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:ansible-2.7.8-1.el7.ans ################################# [ 50%]
Cleaning up / removing...
2:ansible-2.4.6.0-1.el7.ans ################################# [100%]

03/05/2019

Remove any unneeded dependencies that are no longer in use from your system, for example:

yum autoremove docker-ce
Dependencies Resolved

=========================================================================================================================
Package Arch Version Repository Size
=========================================================================================================================
Removing:
docker-ce x86_64 18.06.1.ce-3.el7 @docker-local.repo 168 M
Removing for dependencies:
container-selinux noarch 2:2.68-1.el7 @Local-Extras 36 k
libcgroup x86_64 0.41-20.el7 @Local-Base 134 k
libseccomp x86_64 2.3.1-3.el7 @Local-Base 297 k
libtool-ltdl x86_64 2.4.2-22.el7_3 @Local-Base 66 k
policycoreutils-python x86_64 2.5-29.el7_6.1 @Local-Base 1.2 M

Transaction Summary
=========================================================================================================================
Remove 1 Package (+5 Dependent packages)

You can also add clean_requirements_on_remove=1 to the /etc/yum.conf file, then run

yum remove docker-ce

which has the same effect as using autoremove.

awk is designed for data extraction and reporting.

awk is a programming language in its own right and contains a lot of really good tools. It enables a programmer to write tiny but effective programs in the form of statements that define text patterns to be searched for in each line of a document, and the action to be taken when a match is found within a line.

Reference: GeeksforGeeks, awk in 20 mins.

What can we do with AWK?

  1. AWK Operations: (a) Scans a file line by line (b) Splits each input line into fields (c) Compares input line/fields to pattern (d) Performs action(s) on matched lines

  2. Useful For: (a) Transform data files (b) Produce formatted reports

  3. Programming Constructs: (a) Format output lines (b) Arithmetic and string operations (c) Conditionals and loops

The dated entries below are scattered notes from things I run into day to day:

################################################################
#   Date           Description
#   09/11/2019     skip first line
#   02/28/2019     print last column
#   02/26/2019     awk remote execution
#
################################################################

02/26/2019

When using awk inside a script, you may suffer from unexpected shell expansion:

ssh -o StrictHostKeyChecking=no sshrm1 "ifconfig eth0 | grep \"inet\" | awk '{print $2}'"

The above will not get the right data; instead, put a preceding \ before $:

ssh -o StrictHostKeyChecking=no sshrm1 "ifconfig eth0 | grep \"inet\" | awk '{print \$2}'"

Another method is to run awk on the value returned from ssh, rather than wrapping it inside the ssh command.
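The quoting issue can be reproduced locally without ssh; in this sketch bash -c stands in for the remote shell, so the same double-quote expansion applies (the sample line is made up):

```shell
line="inet 10.0.0.5 netmask 255.255.255.0"

# unescaped: the *local* shell expands $2 (empty here), so the
# remote awk receives '{print }' and prints the whole line
wrong=$(bash -c "echo '$line' | awk '{print $2}'")

# escaped: \$2 survives to the inner awk, which prints the 2nd field
right=$(bash -c "echo '$line' | awk '{print \$2}'")

echo "wrong: $wrong"
echo "right: $right"
```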

02/28/2019

Print last column separated by space:

## NF: count of fields of a line
awk '{print $NF}' <file>

09/11/2019

Skip the first line:

## NR: current count of lines
awk 'NR>1 {print $1}' <file>

You can use NR>=2, NR<5, NR==3, etc to limit the range.
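A quick sanity check of the NR conditions on a throwaway file (file name and contents are made up):

```shell
printf 'h1 h2\nr1 v1\nr2 v2\n' > /tmp/nr_demo.txt

# skip the header line, keep the first column of the rest
skip_header=$(awk 'NR>1 {print $1}' /tmp/nr_demo.txt)

# only the 2nd line (default action is print)
second_line=$(awk 'NR==2' /tmp/nr_demo.txt)

echo "$skip_header"   # r1 and r2, one per line
echo "$second_line"   # r1 v1

rm -f /tmp/nr_demo.txt
```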

Quick Start

## check version
awk -W version
## this also works
awk --version

awk has BEGIN and END blocks; in between is the body:

## BEGIN and END run only once
## the body runs once per input line
awk 'BEGIN {print "start..."} {print NR, $0} END {print NR}' /etc/hosts

## BEGIN
start...
## body
1 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
2 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
3 172.16.163.83 myk8s1.fyre.ibm.com myk8s1
4 172.16.182.156 myk8s2.fyre.ibm.com myk8s2
5 172.16.182.187 myk8s3.fyre.ibm.com myk8s3
## END
5

We can also put the awk program into an awk script file:

awk -f file.awk /etc/passwd

file.awk content:

## FS specifies the delimiter used to parse each line; by default awk uses whitespace
BEGIN { FS=":" ; print "User Name:"}
## $3 > 999 is the condition to match
## NR is an internal awk variable
$3 > 999 {print NR, $0; count++ }
END {print "Total Lines: " NR " Count Lines: " count}

Let's see more examples. sed may perform the same tasks, but awk is more readable.

## set "," as delimiter, $1 to uppercase, $2 to lowercase
## toupper and tolower are awk built-in functions
awk -F"," '{print toupper($1), tolower($2), $3}' <file>
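For example, on a small made-up CSV line:

```shell
# toupper/tolower on the first two comma-separated fields
out=$(echo "alice,BOB,30" | awk -F"," '{print toupper($1), tolower($2), $3}')
echo "$out"   # ALICE bob 30
```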

A lastlog.awk file to show non-root user login statistics:

## exclude lines that match any of these:
!(/Never logged in/ || /^Username/ || /^root/) {
    cnt++
    ## the line has 8 fields
    if (NF == 8)
        printf "%8s %2s %3s %4s\n", $1, $5, $4, $8
    else
        printf "%8s %2s %3s %4s\n", $1, $6, $5, $9
}
END {
    print "==============================="
    print "Total # of user processed: " cnt
}

I need to clean up pending or unfinished yum transactions before our installer starts to work; otherwise yum update or yum install may fail. But where do these pending transactions come from? Sometimes the machine goes down or something unexpected happens, and a yum installation process fails midway.

Problem

You may see an error like this:

There are unfinished transactions remaining. You might consider running yum-complete-transaction first to finish them.
The program yum-complete-transaction is found in the yum-utils package.

Solution

According to the prompt, we first need to install yum-utils:

yum install -y yum-utils

yum-complete-transaction is a program which finds incomplete or aborted yum transactions on a system and attempts to complete them. It looks at the transaction-all* and transaction-done* files which can normally be found in /var/lib/yum if a yum transaction aborted in the middle of execution.

If it finds more than one unfinished transaction it will attempt to complete the most recent one first. You can run it more than once to clean up all unfinished transactions.

Then just issue the following command to do a cleanup:

yum-complete-transaction --cleanup-only

You can also check how many pending transactions exist:

find /var/lib/yum -maxdepth 1 -type f -name 'transaction-all*' -not -name '*disabled' -printf . | wc -c

In an Ansible playbook, add a task:

# ensure existence of yum-utils first
- name: clean yum pending transactions
  command: yum-complete-transaction --cleanup-only
  become: true
  args:
    warn: no

Now, let's practice what we have learned from Offline Package Installation I. For example, I want to install docker, kubeadm, etc. offline on the target machine.

Docker

Note: here we only download the actual dependencies needed for installation on this box, not all rpms as we would with the --installroot option.

I want to install Docker 18.06.3 (kubeadm now properly recognizes Docker 18.09.0 and newer, but still treats 18.06 as the default supported version). You should perform the steps below on a machine that doesn't have docker installed yet.

Note: if you install from a package, the rpms listed in this link are not complete; they are only the top-level packages, but they can be used to upgrade a version.

Uninstall old version

yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine

The contents of /var/lib/docker/, including images, containers, volumes, and networks, are preserved. The Docker CE package is now called docker-ce.

Set up docker repository

Before you install Docker CE for the first time on a new host machine, you need to set up the Docker repository. Afterward, you can install and update Docker from the repository.

yum install -y yum-utils \
device-mapper-persistent-data \
lvm2

Use the following command to set up the stable repository, yum-utils contains yum-config-manager:

yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo

List docker version

List and sort the versions available in your repo. This example sorts results by version number, highest to lowest, and is truncated:

yum list docker-ce --showduplicates | sort -r
Loaded plugins: product-id, search-disabled-repos
docker-ce.x86_64 3:18.09.2-3.el7 docker-ce-stable
docker-ce.x86_64 3:18.09.1-3.el7 docker-ce-stable
docker-ce.x86_64 3:18.09.0-3.el7 docker-ce-stable
docker-ce.x86_64 18.06.3.ce-3.el7 docker-ce-stable
docker-ce.x86_64 18.06.2.ce-3.el7 docker-ce-stable
docker-ce.x86_64 18.06.1.ce-3.el7 docker-ce-stable
docker-ce.x86_64 18.06.0.ce-3.el7 docker-ce-stable
docker-ce.x86_64 18.03.1.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 18.03.0.ce-1.el7.centos docker-ce-stable
docker-ce.x86_64 17.12.1.ce-1.el7.centos docker-ce-stable
...

Download docker rpms

Install a specific version by its fully qualified package name, which is the package name (docker-ce) plus the version string (2nd column) starting at the first colon (:), up to the first hyphen, separated by a hyphen (-). For example, docker-ce-18.06.3.ce.
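The version-string surgery described above can be scripted. This is a sketch that extracts the install argument from a copy of one line of the yum list output, using pure text processing (no yum needed to try it):

```shell
# one line of `yum list docker-ce --showduplicates` output
line="docker-ce.x86_64            18.06.3.ce-3.el7                    docker-ce-stable"

pkgspec=$(echo "$line" | awk '{
    name = $1
    sub(/\.[a-z0-9_]+$/, "", name)   # strip the ".x86_64" arch suffix
    ver = $2
    sub(/^[0-9]+:/, "", ver)         # drop the epoch before ":" if present
    sub(/-.*/, "", ver)              # keep up to the first hyphen
    print name "-" ver
}')

echo "$pkgspec"   # docker-ce-18.06.3.ce
```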

mkdir -p /root/docker-18.06.3-rpms
yum install --downloadonly --downloaddir=/root/docker-18.06.3-rpms docker-ce-18.06.3.ce

List the rpms in the target folder:

audit-2.8.4-4.el7.x86_64.rpm               libselinux-utils-2.5-14.1.el7.x86_64.rpm
audit-libs-2.8.4-4.el7.x86_64.rpm libsemanage-2.5-14.el7.x86_64.rpm
audit-libs-python-2.8.4-4.el7.x86_64.rpm libsemanage-python-2.5-14.el7.x86_64.rpm
checkpolicy-2.5-8.el7.x86_64.rpm libsepol-2.5-10.el7.x86_64.rpm
container-selinux-2.68-1.el7.noarch.rpm libtool-ltdl-2.4.2-22.el7_3.x86_64.rpm
docker-ce-18.06.3.ce-3.el7.x86_64.rpm policycoreutils-2.5-29.el7_6.1.x86_64.rpm
libcgroup-0.41-20.el7.x86_64.rpm policycoreutils-python-2.5-29.el7_6.1.x86_64.rpm
libseccomp-2.3.1-3.el7.x86_64.rpm python-IPy-0.75-6.el7.noarch.rpm
libselinux-2.5-14.1.el7.x86_64.rpm setools-libs-3.3.8-4.el7.x86_64.rpm
libselinux-python-2.5-14.1.el7.x86_64.rpm

Note that the required components may change in later versions; for 18.09.2, for example, there are 2 more packages: docker-ce-cli-18.09.2 and containerd.io.

mkdir -p /root/docker-18.09.2-rpms
yum install --downloadonly --downloaddir=/root/docker-18.09.2-rpms docker-ce-18.09.2 docker-ce-cli-18.09.2 containerd.io

Install docker rpms

Now install docker 18.06.3 offline by running:

yum --disablerepo=* -y install /root/docker-18.06.3-rpms/*.rpm

Note: please refer to my blog Set up and Use Local Yum Repository if you want to create and use a local yum repository.

When I was working on the DS k8s installer upgrade issue, I ran into the problem that I needed to install ansible, docker and kubeadm offline. In the production environment we may not have internet access, which means we need to prepare the rpms and their dependencies in advance and create a self-contained installer.

Download missing rpms without installing

Note: this method is (by design) sensitive to already-installed packages. It will only download the dependencies missing on that particular box, not all rpms.

First let’s install the yum-plugin-downloadonly:

yum install -y yum-plugin-downloadonly
yum install --downloadonly --downloaddir=<directory> <package:version>

For example, I want to get the missing rpms for the vim editor and put them in the /root/vim folder:

mkdir -p /root/vim
yum install --downloadonly --downloaddir=/root/vim vim

List the target folder:

Another way is using yumdownloader, which comes from yum-utils. The difference is that if the package is already fully installed, yumdownloader will still download the outermost rpm, whereas --downloadonly will do nothing.

yum install -y yum-utils
yumdownloader --resolve --destdir=/root/vim vim

Download all rpms without installing

yum & yumdownloader

Usually what we really want is to resolve all dependencies and download them, even if some required rpms are already installed on the box; yumdownloader or yum --downloadonly with the --installroot option is the solution.

Keep in mind that yumdownloader will use your yum database when resolving dependencies.

For example if you download bash, which needs glibc, it will resolve glibc and skip it, since it is installed. If you want to download all dependencies, use a different installroot instead.

mkdir -p /root/vim
mkdir -p /root/new_root
yumdownloader --installroot=/root/new_root --destdir=/root/vim/ --resolve vim

This is what I need for a self-contained offline installer.

Let's check how many vim-related rpms are here: way more than what we got in the first section.

ls -ltr /root/vim | wc -l
57

repotrack

This method can also resolve and download all dependencies. repotrack comes from yum-utils; by default it downloads the dependencies for every architecture.

mkdir -p /root/vim
repotrack -p /root/vim vim-enhanced

If you check /root/vim, there are some i686 rpms; once you delete them and count again, you get 57, the same as with yumdownloader above.

Note: repotrack actually has an -a option to specify the arch, but I was not able to make it work; when I specify x86_64, it still downloads i686.

Install local rpms

Now the problem is how to install these rpms in the correct order. Installing them one by one is obviously infeasible, so a method that resolves their dependencies and installs them automatically is what we want. Both commands like:

yum --disablerepo=* --skip-broken install -y /root/vim/*.rpm

and

rpm --force -ivh /root/vim/*.rpm

may work, but neither is a good way; you may encounter rpm version upgrade issues and duplicate problems. To my knowledge, creating a local yum repository is clean and elegant; please refer to my blog Set up and Use Local Yum Repository.

This blog reformats and builds on top of this stackoverflow topic. Big thanks to rahmu and the people who contributed.

Problem

Let’s say the command conky stopped responding on my desktop, and I want to kill it manually. I know a little bit of Unix, so I know that what I need to do is execute the command kill <PID>. In order to retrieve the PID, I can use ps or top or whatever tool my Unix distribution has given me. But how can I do this in one command?

Answer

ps aux | grep conky | grep -v grep | awk '{print $2}' | xargs kill

DISCLAIMER: This command only works in certain cases. Don’t copy/paste it in your terminal and start using it, it could kill processes unsuspectingly. Rather learn how to build it.

How it works

  • ps aux

This command will output the list of running processes and some info about them. The interesting info is that it’ll output the PID of each process in its 2nd column. Here’s an extract from the output of the command on my box:

$ ps aux
rahmu 1925 0.0 0.1 129328 6112 ? S 11:55 0:06 tint2
rahmu 1931 0.0 0.3 154992 12108 ? S 11:55 0:00 volumeicon
rahmu 1933 0.1 0.2 134716 9460 ? S 11:55 0:24 parcellite
rahmu 1940 0.0 0.0 30416 3008 ? S 11:55 0:10 xcompmgr -cC -t-5 -l-5 -r4.2 -o.55 -D6
rahmu 1941 0.0 0.2 160336 8928 ? Ss 11:55 0:00 xfce4-power-manager
rahmu 1943 0.0 0.0 32792 1964 ? S 11:55 0:00 /usr/lib/xfconf/xfconfd
rahmu 1945 0.0 0.0 17584 1292 ? S 11:55 0:00 /usr/lib/gamin/gam_server
rahmu 1946 0.0 0.5 203016 19552 ? S 11:55 0:00 python /usr/bin/system-config-printer-applet
rahmu 1947 0.0 0.3 171840 12872 ? S 11:55 0:00 nm-applet --sm-disable
rahmu 1948 0.2 0.0 276000 3564 ? Sl 11:55 0:38 conky -q

  • grep conky

I’m only interested in one process, so I use grep to find the entry corresponding to my program conky.

$ ps aux | grep conky
rahmu 1948 0.2 0.0 276000 3564 ? Sl 11:55 0:39 conky -q
rahmu 3233 0.0 0.0 7592 840 pts/1 S+ 16:55 0:00 grep conky
  • grep -v grep

As you can see in step 2, the command ps outputs the grep conky process in its list (it’s a running process after all). In order to filter it, I can run grep -v grep. The option -v tells grep to match all the lines excluding the ones containing the pattern.

$ ps aux | grep conky | grep -v grep
rahmu 1948 0.2 0.0 276000 3564 ? Sl 11:55 0:39 conky -q
  • awk '{print $2}'

Now that I have isolated my target process, I want to retrieve its PID. In other words, I want to retrieve the 2nd word of the output. Lucky for me, most (all?) modern unices will provide some version of awk, a scripting language that does wonders with tabular data. Our task becomes as easy as print $2.

$ ps aux | grep conky | grep -v grep | awk '{print $2}'
1948
  • xargs kill

I have the PID. All I need is to pass it to kill. To do this, I will use xargs.

xargs kill will read from the input (in our case from the pipe), form a command consisting of kill <items> (<items> are whatever it read from the input), and then execute the command created. In our case it will execute kill 1948. Mission accomplished.

Final words

Note that depending on what version of unix you’re using, certain programs may behave a little differently (for example, ps might output the PID in column $3). If something seems wrong or different, read your vendor’s documentation (or better, the man pages). Also be careful as long pipes can be dangerous.

Don’t make any assumptions especially when using commands like kill or rm. For example, if there was another user named ‘conky’ (or ‘Aconkyous’) my command may kill all his running processes too!

Complement

Actually you can simplify the pipeline further to

pkill conky

or

kill $(pgrep conky)

For more information see man bash; it is comprehensive.

This blog collects the commonly used code snippets based on my daily work, also do summary from related stackoverflow topics.

set builtin

Usually I use set -x for debugging purposes; today I saw a new statement, set -ex. What is this, and what is set in Bash? I learned a lot more about it later; see the bash tutorial in the awesome list.

The Set Builtin: in short, set allows you to change the values of shell options and set the positional parameters, or to display the names and values of shell variables.

set -e, causes the shell to exit if any subcommand or pipeline returns a non-zero status. This tells bash that it should exit the script if any statement returns a non-true return value. The benefit of using -e is that it prevents errors snowballing into serious issues when they could have been caught earlier.

But sometimes set -e may not be good; see these two posts: What does 'set -e' do, and why might it be considered dangerous? That answer is very enlightening: which approach to use depends on the specific scenario, so think it through carefully.

“set -e” usage

get path of running script

curpath=$(dirname $(readlink -f $0))

readlink -f $0 follows every symlink in every component of the given name recursively and returns the canonical path. A single file on a system can have many different paths that refer to it, but only one canonical path; canonical means a unique absolute path for a given file. That means even if you call a script from its current directory, readlink -f $0 will give you the absolute path!

dirname $0 cuts off the script name to get the calling path; note this path is relative, not absolute.
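A side-by-side comparison, using a throwaway script under /tmp (the file name is made up; the readlink output will differ per machine):

```shell
cat > /tmp/whereami.sh <<'EOF'
#!/bin/bash
echo "dirname: $(dirname $0)"
echo "readlink: $(dirname $(readlink -f $0))"
EOF
chmod +x /tmp/whereami.sh

# called with a relative path: dirname gives ".", readlink -f gives the absolute dir
out=$(cd /tmp && ./whereami.sh)
echo "$out"

rm -f /tmp/whereami.sh
```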

run script in its directory

Sometimes we want to force a script to be run from its own folder via ./xxx.sh. We can check that:

SCRIPT_PATH=$(dirname $0)
if [[ "X""${SCRIPT_PATH}" != "X." ]]; then
    LogMsg "###### ERROR: Please run this script in its directory!"
    exit 1
fi

create tmp file to store log

Create a temporary file or directory owned and grouped by the current user. Aside from the obvious step of setting proper permissions for files exposed to all users of the system, it is important to give temporary files nonpredictable filenames, for example:

# $$: current PID
OUT_FILE=/tmp/$(basename $0).$$.$RANDOM$RANDOM
# or
OUT_FILE=$(mktemp /tmp/log.$$.XXXXXXXXX)

For regular use, it may be wiser to avoid /tmp and create a tmp directory under the user's home.

mktemp randomly generates characters to replace the Xs. You may need to delete the tmp file when the script exits, for example using trap:

function exitHook {
    rm -f $OUT_FILE
    rm -f ${OUT_FILE}.yml
    rm -f ${OUT_FILE}.out
    rm -f ${OUT_FILE}.err
}
## must be registered at the beginning of the script
trap exitHook EXIT
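A self-contained check that the EXIT trap actually fires: the script runs in a child bash so the trap triggers when the child finishes (the file name is illustrative):

```shell
probe=/tmp/trap_demo.$$
bash -c '
    tmpfile=$1
    touch "$tmpfile"
    cleanup() { rm -f "$tmpfile"; }
    trap cleanup EXIT      # registered early; runs on any exit path
    exit 0
' _ "$probe"

# the file was removed by the trap in the child shell
[ ! -e "$probe" ] && echo "cleaned up"
```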

Actually, you can get random number from

echo $RANDOM

you can also seed it to generate reproducible sequence: https://stackoverflow.com/questions/42004870/seed-for-random-environment-variable-in-bash

if condition

List of test command condition Or check manual man test.

The test command can be written as [ ] or as a test expression; [[ ]] is the modern format, and it supports the regular expression operator =~ for strings. Which one is preferred? test is traditional (and part of the POSIX specification for standard shells, which are often used for system startup scripts), whereas [[ ]] is specific to bash (and a few other modern shells). It's important to know how to use test since it is widely used, but [[ ]] is clearly more useful and easier to code, so it is preferred for modern scripts.

## don't double quote the regexp
if [[ "$name" =~ colou?r ]]; then
    echo "..."
fi

Other variable operands of test are generally wrapped in double quotes, to prevent errors when the value is empty.

For the file system, the main checks are -e, -f, -d, -L, -r, -w, -x, etc. There are more options; see the man page.

For strings, the main checks are -n, -z, =, ==, !=, =~, >, <.

For comparing integers, -eq, -ne, -ge, -gt, -le, -lt. Or use (( xxx )), this is a compound command designed for integers:

INT=-3

if [ -z "$INT" ]; then
    echo "INT is empty." >&2
    exit 1
fi

if [ "$INT" -lt 0 ]; then
    echo "INT is negative."
else
    echo "INT is positive."
fi

if [ $((INT % 2)) -eq 0 ]; then
    echo "INT is even."
else
    echo "INT is odd."
fi

# or using (())
if ((1)); then echo "It is true."; fi
if ((0)); then echo "It is true."; fi

# note: variables inside (()) no longer need the expansion symbol $;
# just use the bare variable name
declare -i day=30
if (( day > 0 || day < 31 )); then
    echo "day is good"
fi

# combined with the read command: check whether the input is a single item
read -p "input one item -> "
(( "$(echo \"$REPLY\" | wc -w)" > 1 )) && echo "invalid input"

== or =, != and =~ are used for string comparison:

# sth does not exist? or use -z
if [[ "${sth}""X" == "X" ]]; then
    LogMsg "###### INFO: ..."
fi

or

# True if the length of "STRING" is zero.
if [[ -z "${sth}" ]]; then
    LogMsg "###### INFO: ..." >&2
    exit 1
fi
# directory does not exist?
if [[ ! -d "${folder_path}" ]]; then
    LogMsg "###### ERROR: ${folder_path} directory doesn't exist!"
    exit 1
fi

For logical operators, there are two styles. One is used inside a command, e.g. test (-a, -o, !), [[ ]], and (()) (&&, ||, !):

if [[ "$INT" -ge "$MIN_VAL" && "$INT" -le "$MAX_VAL" ]]
# same as test
if [ "$INT" -ge "$MIN_VAL" -a "$INT" -le "$MAX_VAL" ]
# note in test need escape
if [[ ! ("$INT" -ge "$MIN_VAL" && "$INT" -le "$MAX_VAL") ]]
if [ ! \( "$INT" -ge "$MIN_VAL" -a "$INT" -le "$MAX_VAL" \) ]

Since all expressions and operators used by test are treated as command arguments by the shell (unlike [[ ]] and (( )) ), characters that have special meaning to bash, such as <, >, (, and ), must be quoted or escaped.

The other is used outside the command, provided by bash itself, for example: [[ ]] && [[ ]] || [[ ]], [[ ! xxx ]]. They obey the short-circuit rule.

Tip: a simple if-condition can be replaced with forms like:

# chaining commands
[ -r ~/.profile ] && . ~/.profile
cat ~/.profile && echo "this is profile" || echo "failed to read profile"
test -f "$FILE" && source "$_" || echo "$_ does not exist" >&2
[ ! -r "$FILE" ] && { echo "$FILE is not readable" ; exit 1 ; }
# with parameter expansion you don't even need an if-condition
: ${var:="hello"}

select loop

The select loop provides an easy way to create a numbered menu from which users can select options. It is useful when you need to ask the user to choose one or more items from a list of choices.

Note that this loop was introduced in ksh and has been adapted into bash. It is not available in sh.

# PS3 is the prompt used by the select command
PS3="Enter your choice (must be a number): "
select DRINK in tea coffee water juice apple all none
do
    # After a match is found, no further matches are attempted.
    # The word being matched doesn't need double quotes, and the
    # patterns work like pathname expansion,
    # for example: ???) [[:alpha:]]) *.txt)
    case $DRINK in
        tea | coffee | water | all)
            echo "Go to canteen"
            break
            ;;
        juice | apple)
            echo "Available at home"
            break
            ;;
        none)
            break
            ;;
        # match anything last
        *)
            echo "ERROR: Invalid selection"
            ;;
    esac
done

At the select prompt you type the index number of an item; if there is no break, the loop runs forever. If you want case to try more than one branch, use ;;& instead of ;; at the end of a branch: the ;;& syntax allows case to continue to the next test rather than simply terminating.
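A minimal demonstration of ;;& (bash 4+; the classify function is my own illustration): both matching branches run for "42".

```shell
classify() {
    case $1 in
        [0-9]*) echo "starts with a digit" ;;&   # ;;& keeps testing further patterns
        *[0-9]) echo "ends with a digit" ;;      # ;; terminates as usual
        *)      echo "no digit at the edges" ;;
    esac
}

classify "42"     # both of the first two branches fire
classify "x9"     # only the second branch fires
classify "abc"    # falls through to the catch-all
```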

input password and confirm

The password the user types must not be echoed:

echo "****************************************************************"
echo "Please input the password:"
echo "****************************************************************"
while true; do
    read -s -p "PASSWORD: " PASSWORD
    echo
    read -s -p "CONFIRM: " PASSWORD_CONFIRM
    echo
    [ ${#PASSWORD} -lt 6 ] && echo "The password must be at least 6 characters, please try again" && continue
    [ "${PASSWORD}" = "${PASSWORD_CONFIRM}" ] && break
    echo "Passwords do not match, please try again..."
done

script input parameters

if [ $# -eq 0 ]; then
    echo "No command-line arguments were specified..."
    # call Usage function here
    exit 1
fi

## like switch in C, case can keep comparing the following patterns
## (with ;;&); that is not needed here, because of the shift calls
## and because no flags repeat
while [ $# -gt 0 ]
do
    case "$1" in
        -p1)
            shift
            P1=${1}
            shift;;

        -p2)
            shift
            P2=${1}
            shift;;

        -h|--help)
            # Usage
            exit 0;;

        *) # Usage
            exit 1;;
    esac
done

[[ "X$P1" = "X" ]] && exit 1
[[ "X$P2" = "X" ]] && exit 1

[[ "X$P1" = "X" ]] && exit 1
[[ "X$P2" = "X" ]] && exit 1

Note there are 2 shifts in one case branch; after each shift, $# decreases by 1.
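The parser above can be exercised without a separate script by wrapping it in a function, whose own positional parameters stand in for the script's (flag names follow the snippet):

```shell
parse_args() {
    P1="" ; P2=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -p1) shift ; P1=$1 ; shift ;;   # first shift eats the flag,
            -p2) shift ; P2=$1 ; shift ;;   # second eats its value
            *)   return 1 ;;
        esac
    done
}

parse_args -p1 foo -p2 bar
echo "P1=$P1 P2=$P2"   # P1=foo P2=bar
```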

function

The function refers to passed arguments by their position (not by name), that is $1, $2, and so forth. $0 is the name of the script itself.

function example()
{
    ## local prevents the variable leaking into the calling shell
    local first=$1
    local second=$2
    ## the return code is similar to an exit code, but return
    ## only stops the function, skipping the rest of its body
    return <return code>
}

You need to call your function after it is declared.

example "p1" "p2"

args #0 is <absolute path to script itself>
args #1 is p1
args #2 is p2

Show functions:

## list all function names
declare -F
## show definition
declare -f [function name]
## clear a function
unset -f <function name>

Export functions, to make it available to subshells, similarly to export variables:

## -xf: export a function
declare -xf <function name>

log message

LogMsg()
{
    # parse input and reformat
    logMsg="$@"
    echo "["`date +"%Y/%m/%d %r"`"] " ${logMsg}
}
LogMsg "[INFO] ..."
LogMsg "[WARNING] ..."
LogMsg "[ERROR]..."

Actually, this style is better: [INFO] [2019-10-11 15:59:26-0081] ...
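A sketch of that style with the level passed as the first argument (the function name and format string are my own choices, not from the original LogMsg):

```shell
LogMsg2() {
    local level=$1 ; shift
    # level first, then a timestamp, then the message
    echo "[${level}] [$(date +'%Y-%m-%d %H:%M:%S%z')] $*"
}

LogMsg2 INFO "service started"
# output: [INFO] [<timestamp>] service started
```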

check last command result

echo_success_failure() {
    if [ $? -eq 0 ]; then
        LogMsg "###### INFO: Success..."
    else
        LogMsg "###### INFO: Failure..."
    fi
}

run as root

effective_uid=$(id -u 2>/dev/null)
if [ $effective_uid -ne 0 ]; then
    LogMsg "###### ERROR: Please run this script as root or sudo"
    exit 1
fi

IFS and read array

The default value of IFS contains a space, a tab, and a newline character. Convert string to array with specific delimiter, for example:

string="item1:item2:item3"
# <<<: here string, like a here doc but for a short single string
OLD_IFS=$IFS
IFS=':' read -a array <<< "${string}"
# or using process substitution
IFS=':' read -a array < <(echo "${string}")
IFS=$OLD_IFS

This version has no globbing problem: the delimiter is set via $IFS (here a colon) and the variables are quoted. Don't forget to do a sanity check after converting.

${array[0]}  ===> item1
${array[1]} ===> item2
${array[2]} ===> item3
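As a compact runnable check; bash 4's mapfile (a.k.a. readarray) is shown for the newline-delimited case, and -r disables backslash mangling in read:

```shell
string="item1:item2:item3"
IFS=':' read -ra parts <<< "$string"
echo "${#parts[@]} fields, first=${parts[0]}"   # 3 fields, first=item1

# bash 4+: mapfile fills an array from newline-delimited input
mapfile -t lines < <(printf 'a\nb\nc\n')
echo "${#lines[@]} lines"                        # 3 lines
```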

Why do we use a here string rather than a pipeline? For example:

echo "${string}" | read

This does not work because a pipeline is essentially a subshell, while read needs to modify the parent shell's state. Here read actually updates $REPLY inside the subshell; once the command finishes, the subshell is gone and the parent shell is unchanged.

Besides, validating the input is also important; a [[ =~ ]] regular expression check is the usual way.
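For instance, a hedged sketch of such a check (the pattern below only asserts a colon-separated shape and is my own example):

```shell
input="item1:item2:item3"
# one or more non-colon fields separated by single colons
if [[ "$input" =~ ^[^:]+(:[^:]+)*$ ]]; then
  echo "valid"
else
  echo "invalid input: $input" >&2
fi
```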

Actually, if the string uses spaces as the delimiter, we can loop over the items directly:

string="item1 item2 item3"
for i in ${string}
do
echo ${i}
done

loop array

break and continue can be used in loops. Also note that when do is written on the next line, no ; is needed before it.

declare -a array=("element1" "element2" "element3")
for i in "${array[@]}"
do
echo "${i}"
done

declare and typeset are explicit ways of declaring variables in shell scripts.

In Bash it is safer to quote the variable with "" for cases where $i may contain whitespace or shell-expandable characters.

If you want to use the index of each array element:

# get length of an array
arraylength=${#array[@]}

# use a for loop to read all values and indexes
for (( i=0; i<${arraylength}; i++ ))
do
## in ${array[$i]}, note that $i is expanded first
echo $i " / " ${arraylength} " : " ${array[$i]}
done

If we use declare to define an integer variable:

declare -i x=10
while (( x > 0 ))
do
echo $x
## no need to use 'let x=x-1'
## because x is type integer
x=x-1
done

# three ways to write an infinite loop
while true | while ((1)) | while :
do
## pass
done

An until loop continues until its test returns a zero exit status.

count=1

until [[ "$count" -gt 5 ]]; do
echo "$count"
count=$((count + 1))
done

In ZSH shell, you can use foreach loop:

## the parentheses are required
foreach item (`ls /tmp`)
echo $item
end

Another index loop using seq:

for i in $(seq 1 10)
do
echo $i
done

read file

# read 3 fields per line, line by line, from the distros.txt file
# note that < is placed after done; it is the input for the loop
while read distro version release; do
printf "Distro: %s\tVersion: %s\tReleased: %s\n" \
"$distro" \
"$version" \
"$release"
# no need for cat here
done < distros.txt
# or equivalently: done < <(cat distros.txt)

# you can also pipe input into a loop,
# but then while and read run in a subshell
sort -k 1,1 -k 2n distros.txt | while read distro version release; do
printf "Distro: %s\tVersion: %s\tReleased: %s\n" \
"$distro" \
"$version" \
"$release"
done

# using process substitution
# skip the header line of the ls output
while read attr links owner group size date time filename; do
cat << EOF
Filename: $filename
Size: $size
EOF
done < <(ls -ltrh | tail -n +2)

chmod

chmod recursively for a directory and its content

chmod -R 0755 <target directory>

Or add the executable bit only for matching files:

find . -name '<file name>' -type f | xargs chmod +x
-rwxr-xr-x ...

pass parameters to script for read

read can take input from the keyboard, a file, or a pipeline: read [-options] [variables...]. If no variable name is supplied, the shell variable $REPLY holds the line of data. If read receives fewer fields than the number of variables, the extra variables are left empty, while excess input ends up in the final variable.
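These assignment rules are easy to demonstrate with a here string (bash):

```shell
# excess input collects in the last variable
read a b c <<< "one two three four"
echo "$c"               # three four

# missing fields leave the extra variables empty
read x y z <<< "only"
echo "x=$x y=$y z=$z"   # x=only y= z=
```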

# pass parameters to the read command
# the piped input must follow this line format
echo "admin
123456" | ./script.sh

# receiving code snippet in script.sh
# ${username} ===> admin
# ${password} ===> 123456
echo -n "Please enter username -> "
read username
echo -n "Please enter a password -> "
# -s: silent
read -s password

Other options:

# -p: prompt
read -p "Enter one or more values > "
echo "REPLY = '$REPLY'"

# -t: timeout
# -s: silent
if read -t 10 -sp "Enter secret passphrase > " secret_pass; then
echo -e "\nSecret passphrase = '$secret_pass'"
else
echo -e "\nInput timed out" >&2
exit 1
fi

# -e: use readline editing; pairs with -i
# -i: default value passed to read
read -e -p "What is your user name? " -i "$USER"
echo "REPLY = '$REPLY'"

setup ssh password-less

Idempotence:

ssh-keyscan -H ${remote} >> ~/.ssh/known_hosts
sshpass -p "<password>" ssh-copy-id -i ~/.ssh/id_rsa.pub root@${remote}
if [[ $? -ne 0 ]]; then
LogMsg "######ERROR: Something went wrong with ssh-copy-id. Check for incorrect credentials ... "
exit 1
fi

recursive call

example()
{
<execute sth>
if [[ $? -ne 0 ]]; then
LogMsg "######ERROR: Something went wrong... "
## retry by calling itself; beware of infinite recursion
example
fi
}

tee command

The tee command reads standard input and writes it to both standard output and one or more files. The -a flag appends to an existing file; without -a, tee creates the file if it does not exist (and truncates it if it does).

LogMsg()
{
logMsg="$@"
echo "[$(date +"%Y/%m/%d %r")] ${logMsg}" | tee -a logs/ds_${stage}_${timeStamp}.log
}
# note: tee has two output directions; handy for inspecting
# a pipeline's intermediate output
+-------------+ +-------+ +--------------+
| command | | tee | | stdout |
| output +---->+ +--->+ |
+-------------+ +---+---+ +--------------+
|
+---v---+
| file |
| |
+-------+

statement block

This is interesting; I had not seen it before: {} statement blocks in shell scripts.
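A small sketch of the difference between { } and ( ) grouping (my own example):

```shell
# { } groups commands in the current shell; ( ) runs them in a subshell
x=1
{ x=2; echo "inside group"; } > /dev/null   # the redirection applies to the whole group
echo "$x"   # 2: the assignment survived, no subshell involved

( x=3 )
echo "$x"   # still 2: the subshell change is lost
```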

do something after reboot

#!/usr/bin/env bash
# this script will do sth after reboot
# in /root/completeme.sh
# then restore /etc/profile
#################################################

echo "Warning! This script is going to reboot now to complete the procedure"
echo "After reboot, login as root to perform the final steps"
echo "Press Ctrl-C now to stop this script in case you don't want to reboot"

## heredoc
cat << REBOOT >> /root/completeme.sh
## do sth after reboot

touch /tmp/after-reboot
rm -f /etc/profile
mv /etc/profile.bak /etc/profile
echo DONE
REBOOT

chmod +x /root/completeme.sh
cp /etc/profile /etc/profile.bak
## /etc/profile is sourced at login, so /root/completeme.sh will run
echo /root/completeme.sh >> /etc/profile
reboot

monitor CPU load

#!/usr/bin/env bash

## to increase CPU load
## dd if=/dev/zero of=/dev/null
## or use stress command!

while sleep 60
do
## to remove the header of ps output, append `=` or use the --no-headers flag
## sort numerically by CPU% descending and take the top process
REC=`ps -eo pcpu= -o pid= -o comm= | sort -k1 -n -r | head -1`
USAGE=`echo $REC | awk '{print $1}'`
## truncate decimal part
USAGE=${USAGE%.*}
PID=`echo $REC | awk '{print $2}'`
PNAME=`echo $REC | awk '{print $3}'`

# Only if we have a high CPU load on one process, run a check within 7 seconds
# In this check, we should monitor if the process is still that active
# If that's the case, root gets a message

## man test
if [ $USAGE -gt 80 ]
then
USAGE1=$USAGE
PID1=$PID
PNAME1=$PNAME
sleep 7
REC=`ps --no-headers -eo pcpu,pid -o comm= | sort -k1 -n -r | head -1`
USAGE2=`echo $REC | awk '{print $1}'`
USAGE2=${USAGE2%.*}
PID2=`echo $REC | awk '{print $2}'`
PNAME2=`echo $REC | awk '{print $3}'`

# Now we have variables with the old process information and with the
# new information

[ $USAGE2 -gt 80 ] && [ $PID1 = $PID2 ] && mail -s "CPU load of $PNAME is above 80%" root@blah.com < /dev/null
fi
done

I want to introduce you to Git LFS, a command line extension and specification for managing large files with Git. LFS is great for large, changing files: the repository stores a small text pointer, while the large file itself is archived someplace else.

Usually we store large files or objects in an artifact repository, for example: JFrog Artifactory, Nexus, etc.

Install Git LFS

Note: you need to install Git LFS if you git pull from a remote repository that uses it.

For example, I am working on a RHEL machine. First go to source page, follow the installation guide to install:

This will create a yum repo for git-lfs:

yum install -y git-lfs

You can then verify that git-lfs is installed on your machine.

Once downloaded and installed, set up Git LFS and its respective hooks by running:

git lfs install

Note: You’ll need to run this in your repository directory, once per repository.

Track Large File

Select the file types you'd like Git LFS to manage (or edit your .gitattributes directly). You can configure additional file extensions at any time.

git lfs track "*.tar.gz"

Note: run this track command at the top level of your repository; then you need to git add the .gitattributes file.

Manage Large File

Then, just do a normal git add and git commit to manage your large files.

git add *.tar.gz
git commit -m "add tar.gz file"
git push origin <your branch>

Actually, you can check the large files being managed by running:

git lfs ls-files

This article walks you through some common tar usages, based on real-life scenarios.

################################################################
#   Date           Description
#   05/29/2019     vim tar files
#   05/29/2019     extract single file to another directory
#   05/28/2019     extract file to another directory
#   05/23/2019     extract single file from archive
#   04/21/2019     untar keep owner and permission
#   02/27/2019     untar to specified folder
#   02/22/2019     list tar content
#   02/21/2019     tar exclude
#   02/20/2019     untar multiple files
#   02/19/2019     tar multiple files
#
################################################################

Sometimes I see people use -czf and sometimes czf; dash or no dash to pass flags? It's a historical and compatibility matter, and the no-dash version is probably more portable.

tar is one of those ancient commands from the days when option syntax hadn’t been standardized. Because all useful invocations of tar require specifying an operation before providing any file name, most tar implementations interpret their first argument as an option even if it doesn’t begin with a -. Most current implementations accept a -.

Note that most examples here use the old option style for compatibility. For example, with czf this set of letters must appear first on the command line, right after the tar program name and some whitespace; old options cannot appear anywhere else.

Also note that if your current directory is /tmp when you run tar, the archive ends up in /tmp regardless of -C; the -C option only changes to the given directory temporarily during execution.
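The two option styles are interchangeable; a self-contained sketch under a temp directory (file names are illustrative):

```shell
work=$(mktemp -d)
echo hello > "$work/demo.txt"

# old bundled style: letters first, no dash
tar czf "$work/old.tar.gz" -C "$work" demo.txt
# new short-option style
tar -czf "$work/new.tar.gz" -C "$work" demo.txt

tar tzf "$work/old.tar.gz"   # demo.txt
```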

02/19/2019

Basic operation: tar multiple files into example.tar.gz

## use -C to go to the target directory
## target directory: the directory which contains file1/2/3
tar czf example.tar.gz -C <target directory> file1 file2 file3

## tar a directory as a whole
## target directory: <folder name>'s parent folder
## untar produces the <folder name> directory
tar czf example.tar.gz -C <target directory> <folder name>


# to archive only the contents of a directory, use -C to enter that directory
## but then tar tvf shows a ./ prefix, because the trailing `.` expands to
## everything including hidden files and the directory entry itself
tar czf example.tar.gz -C <target directory> .

## with `*` there is no ./ prefix, but hidden files are not included; list them yourself
## then tar tvf shows entries without the prefix
tar czf example.tar.gz -C <target directory> * .hidden1 .hidden2

The file path matters! see my blog.

02/20/2019

When untarring multiple files, you cannot do this; it will fail:

tar zxf file1.tar.gz file2.tar.gz file3.tar.gz

For the reason, please see this link; the solution is to use xargs instead:

# -I: specify replace-str
# {}: placeholder
ls *.tar.gz | xargs -I{} tar xzf {}

Or you can use find with -exec

find . -maxdepth 1 -name "*.tar.gz" -exec tar zxf '{}' \;

02/21/2019

For example, if you want to tar things inside a folder folder1 while excluding some files:

## note: the files/globs to archive must come last
## first cd into the directory you want to archive
cd folder1
tar czf folder1.tar.gz --exclude="folder1.tar.gz" --exclude='file1' --exclude='file2' *
## if you want to include hidden files
tar czf folder1.tar.gz --exclude="folder1.tar.gz" --exclude='file1' --exclude='file2' * .file3 .file4

If you don’t exclude folder1.tar.gz, it will tar itself again.

02/22/2019

List tar.gz file content; the z flag distinguishes tar.gz from plain tar:

tar tvf target.tar
tar ztvf target.tar.gz

02/27/2019

If you don’t specify target folder, untar will put things in current directory, use -C option to specify it. For example, I want to untar source.tar.gz to /etc/yum.repos.d/ folder:

tar zxf /tmp/source.tar.gz -C /etc/yum.repos.d/

For the -C option: in c and r modes it changes the directory before adding the following files; in x mode it changes directory after opening the archive but before extracting entries from it.

04/21/2019

When unpacking, consider using the p option to preserve file permissions. Use this in extract mode to override your umask and get the exact permissions specified in the archive. The p option is the default when running as the superuser; a regular user should add p to keep permissions.

tar zxpf target.tar.gz

Note that the umask still applies for a regular user: extracting a file stored with rwxrwxrwx under umask 0002 yields rwxrwxr-x (0777 & ~0002 = 0775).

if you want to keep owner as well:

tar --same-owner -zxpf target.tar.gz

Note that there is a - before zxpf.

05/23/2019

Extract specific files from tarball to current directory:

tar xzf target.tar.gz file1 file2

Note there is no leading / in the path (tar stores relative paths); you can use tar ztvf target.tar.gz to check the paths.

05/28/2019

tar by default extracts file to current directory, if you want to place the untar files to another directory, run:

tar zxf target.tar.gz -C /target/directory

Note that the target directory has to exist before running that command.
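A self-contained sketch of the whole round trip (paths built under a temp directory, since tar -C does not create missing destinations):

```shell
work=$(mktemp -d)
echo demo > "$work/file.txt"
tar czf "$work/target.tar.gz" -C "$work" file.txt

dest="$work/extracted/here"
mkdir -p "$dest"                          # destination must exist before tar -C
tar zxf "$work/target.tar.gz" -C "$dest"
ls "$dest"                                # file.txt
```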

05/29/2019

If you want to extract specific files into another directory:

## file1 and file2 put at end
tar xzf target.tar.gz -C /target/directory file1 file2

11/12/2020

Recent Vim versions support editing files inside a tar archive:

## then select file in dashboard, edit and save normally
vim source.tar.gz

This blog walks through some ssh (secure shell), scp and sftp use cases.

Aside: difference between OpenSSH vs OpenSSL The file format is different but they both encode the same kind of keys. Moreover, they are both generated with the same code.

Notice that restarting the sshd daemon does not disconnect current ssh connections. Even if you stop sshd briefly, or restart the network daemon (don't stop it!), the current ssh session keeps working; see this issue. The reason is that sshd forks a child process for each connection, and the child does not die when sshd or the whole network is restarted. sshd listens on port 22 for incoming connections; when someone connects, it spawns a new process for that connection and goes back to listening.

# check pid and ppid of current ssh child process
ps -ef | grep -v grep | grep ssh

# restart some daemon
systemctl restart sshd
systemctl restart network

# the old child ssh session ppid was changed to 1
ps -ef | grep -v grep | grep ssh

Install SSH (SCP SFTP)

Notice that ssh, scp and sftp are all installed from the openssh-clients package. The user and password for the target machine are the ones defined in /etc/passwd on that machine.

First, understand that there are openssh client and openssh server: Installing and Enabling OpenSSH on CentOS 7, this article briefly introduces openssh configuration and firewall setting for it.

yum -y install openssh-server openssh-clients

If only install openssh-clients, you can ssh to others but others cannot ssh to you, since you don’t have ssh server listening at port 22.

After installing openssh-server, enable and start the sshd daemon

systemctl enable sshd
systemctl start sshd
# check status
systemctl status sshd

The system OpenSSH server configuration file is /etc/ssh/sshd_config, the custom configuration file is ~/.ssh/config. The /etc/ssh/ssh_config is for system-wide client behavior.

Restricted configuration you may need on server side:

Port 22
PermitRootLogin prohibit-password
PubkeyAuthentication yes
# after copy the public key in
PasswordAuthentication no

After making changes, restart sshd daemon.

Firewall setting for ssh is file /etc/sysconfig/iptables.

SSHFS

This is a remote mount implemented by SSH, handy if NFS is not workable, search my blog <<Linux Storage System>>.

SSH Tunnel

Forward a local port to a remote port, a one-to-one mapping. Useful when, for example, database or web servers expose no public port: we can tunnel through the firewall over SSH (whose port is open), map their remote ports locally, and then access them via localhost.

# -L: port forward
# 10003: local port
# 8000: remote port of a web server, for example a python simple http server.
# -N: Do not execute a remote command. This is useful for just forwarding ports.
# Even if the 8000 port is blocked by the firewall remotely, after mapping
# we can access it locally via the local port 10003.
ssh -L [127.0.0.1:]10003:remotehost:8000 user@remotehost -N

Then go to localhost:10003 on browser to see the web page.

The port forwarding approach is limited to that single port mapping; for unrestricted access, you need SOCKS proxy tunneling, see the next section.

SSH SOCKS Proxy Tunnel

Introduction to SOCKS proxy

Although a SOCKS proxy does not provide encryption by default, we run it over SSH, so the traffic is encrypted.

How To Route Web Traffic Securely Without a VPN Using a SOCKS Tunnel: A SOCKS proxy is basically an SSH tunnel in which specific applications forward their traffic down the tunnel to the server, and then on the server end, the proxy forwards the traffic out to the general Internet. Unlike a VPN, a SOCKS proxy has to be configured on an app by app basis on the client machine, but can be set up without any specialty client agents.

The remote host must have an SSH server running.

# -D: dynamic application-level port forwarding, see the curl man page for more
# explanation about SOCKS support.
# [127.0.0.1:]11000: local mapping port.
# -N: Do not execute a remote command. This is useful for just forwarding ports.
# -C: Compresses the data before sending it
# -q: quiet
# -f: Forks the process in the background
# unlike the -L tunnel, the forwarding on the remote side is dynamic
ssh -D [127.0.0.1:]11000 -f -C -N -q user@remotehost

This is actually a SOCKS5 proxy created over SSH; after it is established, you can check it with:

# Now you can access sites that originally were reachable only from the remote host.
curl -ILk -x socks5://localhost:11000 "https://web_can_only_access_by_remotehost"

Or configure the web browser to use this SOCKS5 proxy: localhost:11000. On Firefox, set it in the FoxyProxy plugin and use it. Now we can access whatever the remotehost can access.

Manually kill the tunnel process if you use -f.

SSH X11 Forwarding

Similar to VNC, but VNC transmits the whole desktop, which is more expensive. Linux has good X11 support; on macOS you need to install XQuartz (it still did not work for me on Mac).

# -X: X11 forwarding
ssh -X user@remotehost

# gedit is running on remotehost but reflect GUI locally
> gedit

SSH Agent

I didn't understand this concept when I read about it long ago. One common use is protecting the originating host's private key. A handy program called ssh-agent simplifies working with SSH private keys.

On macOS, ssh-agent runs by default; on Linux, start it yourself (and ensure only one instance of ssh-agent is running).

# No need to do this on a Mac, or if your company laptop already has an agent
# running by default; you can check with:
ssh-add -l

# First check if only one instance is running
ps aux | grep ssh-agent
# if it is there but cannot work, kill it.

If you run ssh-agent, it outputs the environment variables you need to set; you can export these manually instead of using eval:

ssh-agent

# exporting these manually also works
SSH_AUTH_SOCK=/tmp/ssh-YI7PBGlkOteo/agent.2547; export SSH_AUTH_SOCK;
SSH_AGENT_PID=2548; export SSH_AGENT_PID;
echo Agent pid 2548;

# Start it.
eval $(ssh-agent)

Add your private key to ssh-agent; sometimes a git ssh clone fails and you may need to add the private key to the agent:

# default path ~/.ssh/id_rsa
ssh-add
ssh-add <other private key path>

# list all identities
ssh-add -l

# delete all identities
ssh-add -D
# delete specified identity
ssh-add -d <private key path>

How to start ssh-agent on login: https://stackoverflow.com/questions/18880024/start-ssh-agent-on-login Add the following to your .bash_profile:

SSH_ENV="$HOME/.ssh/env"

function start_agent {
echo "Initialising new SSH agent..."
/usr/bin/ssh-agent | sed 's/^echo/#echo/' > "${SSH_ENV}"
echo 'succeeded'
chmod 600 "${SSH_ENV}"
. "${SSH_ENV}" > /dev/null
# add the default private key ~/.ssh/id_rsa
/usr/bin/ssh-add;
}

# Source SSH settings, if applicable
if [ -f "${SSH_ENV}" ]; then
# re-source the settings; ssh-agent should already be there
. "${SSH_ENV}" > /dev/null
# ps ${SSH_AGENT_PID} doesn't work under cygwin
ps -ef | grep ${SSH_AGENT_PID} | grep ssh-agent$ > /dev/null || {
# statement block
start_agent;
}
else
start_agent;
fi

When you try to make a connection to a remote host, and you have ssh-agent running, the SSH client will automatically use the keys stored in ssh-agent to authenticate with the host.

Advantages:

  1. For encrypted SSH private keys, the passphrase is asked only the first time the key is added to ssh-agent; without ssh-agent, every SSH connection asks for the passphrase.
  2. If you are using Ansible to manage hosts that use different SSH keys, using an SSH agent simplifies your Ansible configuration files.
  3. ssh-agent forwarding, see below

ssh-agent also solves another problem: if you have separate personal and work git accounts with different SSH key pairs, how do you specify which private key to use for git clone? See this link.

SSH Agent Forwarding

If you are cloning a Git repository on a remote host via SSH, you'll need to use an SSH private key recognized by your Git server. I like to avoid copying private SSH keys to my remote host (for example, an EC2 instance), in order to limit the damage if a host is ever compromised.

# The example.xx.com does not have the private key to access git repo but the
# local host has.
# -A: agent forwarding
ssh -A root@example.xxx.com

# git clone via ssh mechanism on remote host with the private key provided by
# agent from local host.
git clone git@github.com:lorin/mezzanine-example.git

Here -A limits the agent forwarding to this session only; you can use ssh config to enable agent forwarding more broadly.

ProxyJump

Thinking about it now, logging in to the openshift or softlayer master also went through a bastion host, so ProxyJump could have been configured there. This is not inherently tied to ssh-agent; without ssh-agent, you can specify the key location in the config file.

Using OpenSSH ProxyJump: it uses vagrant VMs to demonstrate. But I think there is no need to specify the port, user, and identity key again for the target server; those should already be configured on the bastion.

SSH agent and ProxyJump explained: discusses the risks of SSH agent forwarding; accessing internal hosts through a bastion with ProxyJump is much safer. It also covers the SSH handshake and how a new symmetric key is used for the transport.

JumpBox or Bastion Host: Notice that you need to generate key and copy the public key to bastion host first.

Bastion hosts are usually public-facing, hardened systems that serve as an entrypoint to systems behind a firewall or in another restricted location; they are especially popular with the rise of cloud computing.

The ssh command has an easy way to make use of bastion hosts to connect to a remote host with a single command. Instead of first SSHing to the bastion host and then using ssh on the bastion to connect to the remote host, ssh can create the initial and second connections itself by using ProxyJump.

# -J specify the jumphost
ssh -J <bastion-host> <remote-host> [-l <remote login user>] [-i <pem file>]
ssh -J user@<bastion:port> <user@remote:port>
# Jump through a series of hosts
ssh -J <bastion1>,<bastion2> <remote>

The most important part is configuring the ~/.ssh/config file; basic settings below. Note: I once hit a strange problem where the same config file worked for everyone else but not for me; after simplifying the config file it worked. The fix at the time was to move the Match host block below the matching Host block; actually the Match host block is optional, and many blocks can be merged.

# May have more options, for example, User, Port, AgentForward, etc.
# refer `man ssh_config`

# The `Host` sections are read in order and the options matched will get
# accumulated

# The Bastion Host
Host <jump-host-nickname>
User <user name>
# default is no
ProxyUseFdpass no
# jumpbox port
Port 22
# jumpbox IP
HostName <hostname or IP>

# The Remote Host
Host <remote-host-nickname>
Hostname <remote-hostname or ip address for example: 172.12.234.12>
User <user name>
AddKeysToAgent yes
IdentitiesOnly yes
# may need pem file, the private key
IdentityFile ~/.ssh/file.pem
StrictHostKeyChecking no
ServerAliveInterval 60

# The remote host match this IP will use jumpbox
# this Match block is optional; it is just another way to match hosts
Match host 172.??.*
ProxyJump <jump-host-nickname>

# Or can specify jumpbox directly
Host <remote-host-nickname>
HostName < remote-hostname or ip address>
ProxyJump bastion-host-nickname

Then you can ssh directly: ssh remote-host-nickname

ProxyCommand is an older alternative to ProxyJump.

ssh -o ProxyCommand="ssh -W %h:%p bastion-host" remote-host

Force SSH Password Login

Usually SSH password authentication is disabled, meaning you can only log in over SSH using public key authentication. To enable password login:

# /etc/ssh/sshd_config
# set to yes
PasswordAuthentication yes
# you may also need to allow root login
PermitRootLogin yes
# restart sshd
systemctl restart sshd

Create new user to test:

useradd alice
passwd alice

Logout and try:

# PubkeyAuthentication may be needed
# then input the password
ssh -p 2222 \
-o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null \
-o PubkeyAuthentication=no \
alice@127.0.0.1

Debug

  1. Use the -v, -vv, -vvv flags with the ssh command (for example, wrong pem file permissions or format will be reported)
  2. Wireshark capture ssh traffic on that interface, you should see SSHv2 protocol and more details
  3. Check system log, journalctl | grep sshd.
  4. Launch sshd on another port in debug mode: sudo /usr/bin/sshd -d -p 2020, then ssh to this port 2020 from client ssh -p 2020 user@remote_server.
  5. Possibly restricted by firewall

Usage Summary

10/01/2018 ssh send command
11/14/2018 ssh run shell script
12/19/2018 ssh-copy-id
01/06/2019 ssh-keyscan
01/08/2019 ECDSA host key changed
01/22/2019 no prompt first time
01/23/2019 sshpass
02/21/2019 scp folder or files
03/11/2019 ssh -i option
03/12/2019 recover public key
09/05/2020 sftp
01/20/2021 ssh config
03/17/2022 ssh config permission

10/01/2018

Use ssh to send commands for execution on the remote machine; the -t flag allows you to interact with the remote machine.

11/14/2018

Use ssh to run a shell script on the remote machine.
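One common pattern is feeding a local script to a remote bash over stdin; user@remotehost below is hypothetical, and the same bash -s mechanism is demonstrated locally so it runs standalone:

```shell
# remote form (hypothetical host):
#   ssh user@remotehost 'bash -s' -- arg1 arg2 < ./local_script.sh

# the bash -s mechanism, demonstrated locally: the script arrives on
# stdin and arguments follow --
printf 'echo "hello $1"\n' > /tmp/local_script.sh
bash -s -- world < /tmp/local_script.sh   # hello world
```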

12/19/2018

Use ssh-copy-id to copy the local machine's public key to the remote machine's ~/.ssh/authorized_keys file, so the next time you ssh, scp or sftp there is no password prompt.

Sometimes I see people use ~/.ssh/id_rsa with ssh-copy-id, which confused me because that is the private key; the man page explains why:

-i identity_file
...If the filename does not end in .pub this is added. If the filename
is omitted, the default_ID_file is used.

01/06/2019

Use ssh-keyscan to get the remote machine's ecdsa identity; you can put this entry into the local known_hosts file, so the first ssh login does not prompt you to type yes:

Actually better to use -o StrictHostKeyChecking=no flag.

01/08/2019

I created a new cluster with the same master hostname as the deleted one, so when I tried to ssh to it, an interesting thing happened:

Go to the ~/.ssh/known_hosts file and delete the corresponding ECDSA line.

01/22/2019

The first time you ssh, scp or sftp to a remote machine, it prompts you to add the remote machine to the ~/.ssh/known_hosts file; this can interrupt an ansible or shell script run, so I want to skip it. For example:

Use the -o StrictHostKeyChecking=no option; it silently adds the remote host name to the ~/.ssh/known_hosts file.

ssh-copy-id -i .ssh/id_dsa.pub -o StrictHostKeyChecking=no root@example.com
scp -o StrictHostKeyChecking=no -r ./source root@example.com:~

if you don’t want to add the host name, -o UserKnownHostsFile=/dev/null option can save you.

01/23/2019

scp or ssh without prompt input password

yum install -y sshpass
# Explicitly input password
sshpass -p <password> scp/ssh ...

It’s useful to set password-less at first time, combine all of these, no prompt will show up:

sshpass -p <password> ssh-copy-id -i ~/.ssh/id_rsa.pub -o StrictHostKeyChecking=no ...

02/21/2019

scp the source directory and its content recursively to the root user's home directory on example.com:

scp -o StrictHostKeyChecking=no -r ~/source root@example.com:~

scp all files in the source directory to the target directory on example.com:

scp -o StrictHostKeyChecking=no ./source/* root@example.com:~/target

03/11/2019

The ssh command has a -i option; you associate a private key with this flag:

ssh -i ~/.ssh/id_rsa xxx

Note that SSH never sends the private key over the network; -i is merely used to answer a challenge generated using the corresponding public key on the target machine. You don't need -i explicitly if your default private key is in the standard location.

03/12/2019

If public key is lost, you can use existing private key to generate one:

ssh-keygen -y -f ~/.ssh/id_rsa > ~/.ssh/id_rsa.pub

Or just create new key pair

echo "yes" | ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

09/05/2020

The sftp server is up once the openssh-server package is installed, and can be configured in the /etc/ssh/sshd_config file: https://www.techrepublic.com/article/how-to-set-up-an-sftp-server-on-linux/

For the interactive commands, see man sftp; there are lots of regular commands available in sftp, for example: cd, chmod, ln, rm, etc.

There are free online ftp servers for testing purposes; mmnt.net can also be used to find free ftp servers.

# Use password or ssh-public key to login to sftp server
sftp -o UserKnownHostsFile=/dev/null \
-o StrictHostKeyChecking=no \
demo@test.rebex.net[:path]

# print local working directory, the default place to hold download file
sftp> lpwd
# change local working directory
sftp> lcd [path]
# escape to a local shell; type `exit` to come back
sftp> !
# escape to run a single local command
sftp> ![command]

# enable/disable progress meter
sftp> progress
# download file to local working directory
sftp> get <filename>
# download file to specified directory
sftp> get <filename> <local file path>

# upload file
# default file is in local working directory and upload to sftp current folder
# if no path is specified
sftp> put [local file path] [remote file path]

# quit sftp
sftp> bye

For non-interactive download/upload file:

# download
sftp user@hostname[:path] <local file path>
# upload, tricky
echo "put <local file path>" | sftp user@hostname[:path]
sftp user@hostname[:path] <<< $'put <local file path>'

Used in shell script:

sftp user@hostname <<EOF
cd /xxx/yyy/zzz
cd /aaa/bbb/ccc
put file.tgz
bye
EOF

01/20/2021

When a poor network connection keeps ruining your SSH session, you can adjust the connection settings with a larger probe interval and retry count:

Host myhostshortcut
# or ip address
HostName myhost.com
User barthelemy
Port 22
# no-op probe to the server every 60s
ServerAliveInterval 60
# probe at most 10 times if no response
ServerAliveCountMax 10
# no tcp no-op probe
TCPKeepAlive no

08/13/2021

When I ran git pull on a Gitlab local repo from my cloudtop, I got this error output:

Received disconnect from UNKNOWN port 65535:2: Too many authentication failures
Disconnected from UNKNOWN port 65535
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

This was resolved by adding IdentitiesOnly yes in the ssh config file under the gitlab config block, which instructs ssh to use only the authentication identity files specified on the command line or configured in the ssh_config file.
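A sketch of what that config block can look like (the host name and key path are made-up examples):

```
# ~/.ssh/config
Host gitlab.example.com
    IdentitiesOnly yes
    IdentityFile ~/.ssh/gitlab_id_rsa
```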

Reference: https://www.tecmint.com/fix-ssh-too-many-authentication-failures-error/

03/17/2022

The Ansible playbook failed to ssh to the target VM and reported this error:

module.backups-v2.null_resource.setup_cluster (local-exec): [WARNING]: Unhandled
error in Python interpreter discovery for host
module.backups-v2.null_resource.setup_cluster (local-exec): 10.10.16.205: Failed
to connect to the host via ssh: Bad owner or permissions
module.backups-v2.null_resource.setup_cluster (local-exec): on /root/.ssh/config

It turned out to be a permission and ownership issue on the /root/.ssh/config file; see this ticket for details. The fix is to chmod it to 600 and make sure it is owned by the connecting user.
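The fix, shown on a scratch file since touching /root/.ssh/config needs root (the real commands target that file):

```shell
cfg=$(mktemp)            # stand-in for /root/.ssh/config
chmod 600 "$cfg"
chown "$(id -un)" "$cfg" # owned by the user running ssh/ansible
ls -l "$cfg"             # -rw------- ...
```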
