Kubernetes version 1.13.2

Server Login Failed

This morning I found I had lost the connection to my ICP4D Kubernetes server (it was fine last night). If I run:

# kubectl get pods
error: the server doesn't have a resource type "pods"

then:

# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4+icp-ee", GitCommit:"d03f6421b5463042d87aa0211f116ba4848a0d0f", GitTreeState:"clean", BuildDate:"2019-01-17T13:14:09Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

error: You must be logged in to the server (the server has asked for the client to provide credentials)

But the kubectl config seems fine; the token is there:

# kubectl config view
apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: https://172.16.3.23:8001
  name: icp-cluster1
- cluster:
    certificate-authority: mycluster/ca.pem
    server: https://172.16.3.23:8001
  name: mycluster
contexts:
- context:
    cluster: icp-cluster1
    user: admin
  name: ""
- context:
    cluster: mycluster
    user: mycluster
  name: mycluster
- context:
    cluster: mycluster
    namespace: zen
    user: mycluster-user
  name: mycluster-context
current-context: mycluster-context
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate: /etc/cfc/conf/kubecfg.crt
    client-key: /etc/cfc/conf/kubecfg.key
- name: mycluster
  user:
    client-certificate: /ibm/InstallPackage/ibm-cp-app/cluster/cfc-certs/kubernetes/kubecfg.crt
    client-key: /ibm/InstallPackage/ibm-cp-app/cluster/cfc-certs/kubernetes/kubecfg.key
- name: mycluster-user
  user:
    token: eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJhdF9oYXNoIjoiYjVkNDMxZDMwNGZmMGUyYWM0NWJlOTY1NjU5YTQyN2ViOWUwNzE5NCIsInJlYWxtTmFtZSI6ImN1c3RvbVJlYWxtIiwidW5pcXVlU2VjdXJpdHlOYW1lIjoiYWRtaW4iLCJpc3MiOiJodHRwczovL215Y2x1c3Rlci5pY3A6OTQ0My9vaWRjL2VuZHBvaW50L09QIiwiYXVkIjoiYTc1ZTZmZjQ3YzQyZTJhZDA3YjZiMjUzMTVmZTExMTQiLCJleHAiOjE1NjAyMzkwNzEsImlhdCI6MTU2MDIxMDI3MSwic3ViIjoiYWRtaW4iLCJ0ZWFtUm9sZU1hcHBpbmdzIjpbXX0.cwGioosvwjONIllJExWRADicgibShbSl2x05r3hpiMpXQQia_4HDuvfUCNNyvLiFkBfz1xvuoz9JeAkOdRa7QVR0RD8TGVnYyu10S50AQ5b_LjGaTNoxdGJjLLEGkBt5gzJCsZaVw49ttd-lzDV28badpUBtm1cih4-3o-wbM6inJqCqR97ujgImRW0BS0Jj1pbENAEidAquyZscGMje5vyyRc9A67VWWJxZXo0J1fG081yhvaryRWbvinLLSPRm8_eley1GqItUMvRmIpzC-X7xsg4zIvCE8QhPoKrJp2xRFjDwsvCN44wJv9hdkfx3cGxjjOBdg6ofsVkNND5njg

I check the docker and kubelet statuses; both are active:

systemctl status docker
systemctl status kubelet

Then I reboot all nodes, and the cluster comes up correctly. Sadly, I don't know how to reproduce this issue, what happened, or how to fix it without rebooting. In hindsight, one plausible cause: the mycluster-user token in the kubeconfig is an OIDC JWT whose exp claim is only about 8 hours after its iat, so it may simply have expired overnight.

Server Connection Refused 6443

A similar issue happened again in my dstest cluster:

# kubectl get pods -n test-1
The connection to the server 9.30.188.95:6443 was refused - did you specify the right host or port?

# kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:35:51Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server 9.30.188.95:6443 was refused - did you specify the right host or port?

Check against the required ports list:

# netstat -tunlp | grep 6443
tcp6 0 0 :::6443 :::* LISTEN 27047/kube-apiserve

Note that there is no kube-apiserver service in systemctl, so how do we restart it? The kube-apiserver runs as a static pod, so I think I can restart the container directly with docker restart <container ID>:

# docker ps -a | grep apiserver
8f661411fa02 177db4b8e93a "kube-apiserver --au..." About an hour ago Up 4 minutes k8s_kube-apiserver_kube-apiserver-dstest1.fyre.ibm.com_kube-system_d175f38c007e23cc443d6ba50ba15533_0
0f05946a8a59 k8s.gcr.io/pause:3.1 "/pause" About an hour ago Up About an hour k8s_POD_kube-apiserver-dstest1.fyre.ibm.com_kube-system_d175f38c007e23cc443d6ba50ba15533_0
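A hedged sketch of the restart (the container ID comes from the docker ps output above; the grep -v POD drops the pause container):

docker restart 8f661411fa02
# or grab the ID on the fly:
docker restart $(docker ps | grep kube-apiserver | grep -v POD | awk '{print $1}')

Since it is a static pod, even removing the container should be safe: kubelet recreates it from the static pod manifest.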

I haven't had a chance to reproduce this issue, so this solution may not work…

In a healthy cluster:

# kbc cluster-info
Kubernetes master is running at https://9.30.188.95:6443
KubeDNS is running at https://9.30.188.95:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Server Connection Refused 8080

This issue is similar to the 6443 one, but it shows:

The connection to the server localhost:8080 was refused - did you specify the right host or port?

Recall that when we set up the K8s cluster with kubeadm, we see output like:

...
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
...
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

I can reproduce this issue when the KUBECONFIG environment variable is missing, so export it; either way works:

export KUBECONFIG=$HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf

A good /etc/kubernetes folder has these items:

# ls -ltr /etc/kubernetes/
total 36
drwxr-xr-x 3 root root 4096 Jun 20 10:02 pki
-rw------- 1 root root 5447 Jun 20 10:02 admin.conf
-rw------- 1 root root 5539 Jun 20 10:02 kubelet.conf
-rw------- 1 root root 5483 Jun 20 10:02 controller-manager.conf
-rw------- 1 root root 5435 Jun 20 10:02 scheduler.conf
drwxr-xr-x 2 root root 113 Jun 20 10:02 manifests

The manifests directory contains the YAML files for creating etcd, kube-apiserver, kube-controller-manager, and kube-scheduler.
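On a kubeadm cluster it typically looks like this (file names may vary slightly by version):

# ls /etc/kubernetes/manifests/
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml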

How to list images and tags in a docker registry? How to delete images (layers) in a docker registry? These are common needs in my daily work; let's figure them out.

A brief digression: the OpenShift platform has a web UI for dealing with images in the integrated docker registry (they are called imagestreams in OpenShift); after you log in to the terminal, running oc version will show you the web address. You can list and delete imagestreams there.

For example, I use the OpenShift integrated docker registry and push my docker images to a project called datastage (I configured the settings so other projects can pull images from this project).

Resources

  • Docker Registry HTTP API V2
  • Registry 清理镜像 v2 (Registry image cleanup v2)
  • Docker registry authentication
  • Registry tool Git project
  • Cleanup Your Docker Registry

Quick Set up

After installing docker, get and run the docker registry from Docker Official Images - registry.

docker pull registry

you will get:

docker images

REPOSITORY TAG IMAGE ID CREATED SIZE
registry latest f32a97de94e1 3 months ago 25.8MB

then run it locally with image deletion enabled:

docker run -d -p 5000:5000 -e REGISTRY_STORAGE_DELETE_ENABLED=true --restart always --name registry registry

To remove images, you need to set up the docker registry with deletion enabled (it's off by default); see my blog Docker Registry Configure.

docker ps

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3021266dca1f registry "/entrypoint.sh /etc..." 2 seconds ago Up 1 second 0.0.0.0:5000->5000/tcp registry

Next, let’s use busybox to illustrate:

docker pull busybox
docker tag busybox localhost:5000/busybox:v1
docker push localhost:5000/busybox:v1

Insecure Docker Registry

The quick setup gives you an insecure private docker registry (no docker login required, and the API is accessed over HTTP).

Note that you can use the -v option in curl to get verbose messages such as header information.

Check Availability

curl -k --head -X GET http://localhost:5000/v2/

HTTP/1.1 200 OK
Content-Length: 2
Content-Type: application/json; charset=utf-8
Docker-Distribution-Api-Version: registry/2.0
X-Content-Type-Options: nosniff
Date: Sun, 23 Jun 2019 05:20:01 GMT

This means the registry is accessible and the user has permission.

List Images

curl -k -X GET http://localhost:5000/v2/_catalog

{"repositories":["busybox"]}

List Image Tags

curl -k -X GET http://localhost:5000/v2/busybox/tags/list

{"name":"busybox","tags":["v1"]}

Delete Images

Delete unused image digests to avoid unnecessary space growth in a private docker registry.

Deletion is more complicated than listing. Per the Deleting an Image API, there are two main steps:

Delete through API

  1. Get the digest of the image with tag v1:
curl -k --head -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X GET http://localhost:5000/v2/busybox/manifests/v1

HTTP/1.1 200 OK
Content-Length: 527
Content-Type: application/vnd.docker.distribution.manifest.v2+json
Docker-Content-Digest: sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff
Docker-Distribution-Api-Version: registry/2.0
Etag: "sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff"
X-Content-Type-Options: nosniff
Date: Sun, 16 Jun 2019 20:12:07 GMT

Note when deleting a manifest from a registry version 2.3 or later, the following header must be used when HEAD or GET-ing the manifest to obtain the correct digest to delete: Accept: application/vnd.docker.distribution.manifest.v2+json.

You can refer to Image Manifest V2, Schema 2 for more header details.

Here, we use the digest from the Docker-Content-Digest field in the header; the value is sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff.

Actually, if the docker image is loaded, you can inspect it by:

docker inspect localhost:5000/busybox:v1 | less

There is a RepoDigests field that also contains the same digest:

...
"RepoDigests": [
"busybox@sha256:7a4d4ed96e15d6a3fe8bfedb88e95b153b93e230a96906910d57fc4a13210160",
"localhost:5000/busybox@sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff"
],
...
  2. Issue the delete command:
curl -k -v -X DELETE http://localhost:5000/v2/busybox/manifests/sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff

* About to connect() to localhost port 5000 (#0)
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 5000 (#0)
> DELETE /v2/busybox/manifests/sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:5000
> Accept: */*
>
< HTTP/1.1 202 Accepted
< Docker-Distribution-Api-Version: registry/2.0
< X-Content-Type-Options: nosniff
< Date: Sun, 16 Jun 2019 20:27:05 GMT
< Content-Length: 0

The response HTTP/1.1 202 Accepted means the deletion succeeded; let's check the tags again:

curl -k -X GET http://localhost:5000/v2/busybox/tags/list

{"name":"busybox","tags":null}

Note that if deletion is not enabled on the docker registry, you will get the response {"errors":[{"code":"UNSUPPORTED","message":"The operation is unsupported."}]}.

Delete in File System

Note that this method doesn't require the registry to have deletion enabled!

Actually, the docker registry stores images in /var/lib/registry/docker/registry/v2/, which holds blobs and repositories directories. The blobs directory is where the image data resides, and repositories is where the metadata and references live.

If you mount the docker registry storage on the host, you need to delete two directories:

rm -rf <mount path>/registry/v2/repositories/busybox/_manifests/tags/v1/index/sha256/<hash dir>

rm -rf <mount path>/registry/v2/repositories/busybox/_manifests/revisions/sha256/<hash dir>

While deleting those directories, the docker registry should be in read-only mode; nobody should be pushing to the registry.
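A minimal sketch of switching the registry to read-only mode, assuming the stock config file at /etc/docker/registry/config.yml (restart the container after changing it):

storage:
  maintenance:
    readonly:
      enabled: true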

Garbage Collection

However, the API and file-system deletions above only remove the metadata or dereference the link between the manifest and the layer data on disk; we need to run garbage collection in the docker registry to remove the layers:

docker exec -it registry sh

Check the space used before cleaning:

du -sch /var/lib/registry/docker/

764.0K /var/lib/registry/docker/
764.0K total

Then run garbage collection:

bin/registry garbage-collect /etc/docker/registry/config.yml

Note that /etc/docker/registry/config.yml is the configuration file for docker registry.
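The garbage-collect subcommand also supports a dry run, handy for previewing what would be removed:

bin/registry garbage-collect --dry-run /etc/docker/registry/config.yml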

Then check the space used again:

du -sch /var/lib/registry/docker/

8.0K /var/lib/registry/docker/
8.0K total

Other Notes

If you have one image with multiple tags and the digests are the same, deleting one of them removes them all.

If you have one image with multiple tags and the digests are different for each tag, deletion is per-tag.

Secure Docker Registry

In the ICP4D cluster, we use a secure docker registry with HTTPS and login credentials. But first let's understand how to set up a secure docker registry; see my blog <<Secure Docker Registry>>.

(Draft note: after docker login, credentials are stored in ~/.docker/config.json; can curl reuse them?)

Check Availability

If you don't have authentication, you will get a 401 Unauthorized status. For example, here https://mycluster.icp:8500 is the private secure docker registry location:

curl --head -k -X GET https://mycluster.icp:8500/v2/

HTTP/1.1 401 Unauthorized
Content-Type: application/json; charset=utf-8
Docker-Distribution-Api-Version: registry/2.0
Www-Authenticate: Bearer realm="https://mycluster.icp:8600/image-manager/api/v1/auth/token",service="token-service"
Date: Mon, 24 Jun 2019 16:14:10 GMT
Content-Length: 87

Here the Www-Authenticate header tells you the auth server address.

In my OpenShift cluster:

curl -k --head -X GET https://172.30.159.11:5000/v2/

HTTP/1.1 401 Unauthorized
Content-Type: application/json; charset=utf-8
Docker-Distribution-Api-Version: registry/2.0
Www-Authenticate: Bearer realm="https://172.30.159.11:5000/openshift/token"
X-Registry-Supports-Signatures: 1
Date: Mon, 24 Jun 2019 16:37:40 GMT
Content-Length: 87

{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}

We need to request a token from the auth server.

======================================================================================
I think I get stuck here... The situation is:
1. I am able to docker login.
2. Where can I get the token for API access? Which auth server, where, and how?
3. What does the platform use to secure the docker registry?

The ICP4D cluster is more transparent than OpenShift.
======================================================================================
curl -u openshift:NOyEoOrA0FDm2IgYqlHCDkDepQ7I0vw-7Sx8RzPUmzw -X GET "https://172.30.159.11:5000/openshift/token?service=172.30.159.11:5000&scope=repository:demo1-ds/busybox:pull,push"

https://docs.docker.com/registry/spec/auth/token/

List Images

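I haven't verified this end to end, but the standard Docker token-auth flow should apply: request a bearer token from the realm advertised in Www-Authenticate, then pass it to the V2 API. A hedged sketch (the admin user, the password placeholder, and the jq usage are assumptions):

# token scoped to the catalog endpoint
TOKEN=$(curl -sk -u admin:<password> \
  "https://mycluster.icp:8600/image-manager/api/v1/auth/token?service=token-service&scope=registry:catalog:*" \
  | jq -r .token)

curl -sk -H "Authorization: Bearer $TOKEN" https://mycluster.icp:8500/v2/_catalog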

List Tags

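Same flow with a repository-scoped token (a sketch under the same assumptions; zen/busybox is a made-up repository name):

TOKEN=$(curl -sk -u admin:<password> \
  "https://mycluster.icp:8600/image-manager/api/v1/auth/token?service=token-service&scope=repository:zen/busybox:pull" \
  | jq -r .token)

curl -sk -H "Authorization: Bearer $TOKEN" https://mycluster.icp:8500/v2/zen/busybox/tags/list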

Remove Images

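Deletion should mirror the insecure case: fetch the digest with the v2 manifest Accept header, then DELETE it, both with a token whose scope covers the operation (a sketch; the exact scope action required for DELETE is an assumption, * covers everything):

TOKEN=$(curl -sk -u admin:<password> \
  "https://mycluster.icp:8600/image-manager/api/v1/auth/token?service=token-service&scope=repository:zen/busybox:*" \
  | jq -r .token)

DIGEST=$(curl -sk --head -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
  https://mycluster.icp:8500/v2/zen/busybox/manifests/v1 \
  | grep -i docker-content-digest | awk '{print $2}' | tr -d '\r')

curl -sk -X DELETE -H "Authorization: Bearer $TOKEN" \
  "https://mycluster.icp:8500/v2/zen/busybox/manifests/$DIGEST"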

Kubernetes version 1.13.2

In ICP4D non-root development, I see that in each tier the pod contains an init container, for example:

...
hostname: is-xmetadocker
hostIPC: true
initContainers:
- name: load-data
  image: "mycluster.icp:8500/zen/is-db2xmeta-image:11.7.1-1.0"
  imagePullPolicy: IfNotPresent
  securityContext:
    capabilities:
      add: ["SETFCAP", "SYS_NICE", "IPC_OWNER"]
  command: ['/bin/bash', '-c', '--']
  args: [ " set -x;
    ...
    "
  ]
  env:
  - name: DEDICATED_REPOS_VOLPATH
    value: /mnt/dedicated_vol/Repository
  volumeMounts:
  - name: xmeta-pv-volume
    mountPath: "/mnt/dedicated_vol/Repository"
...

Understanding

What is this and what is it for? Let's take a look, starting with the official documentation.

Init Containers, which are specialized Containers that run before app Containers and can contain utilities or setup scripts not present in an app image.

A Pod can have multiple Containers running apps within it, but it can also have one or more Init Containers, which are run before the app Containers are started.

Init Containers are exactly like regular Containers, except:

  • They always run to completion.
  • Each one must complete successfully before the next one is started.

If an Init Container fails for a Pod, Kubernetes restarts the Pod repeatedly until the Init Container succeeds. However, if the Pod has a restartPolicy of Never, it is not restarted.

Differences from Regular Containers

Init Containers support all the fields and features of app Containers, including resource limits, volumes, and security settings. However, the resource requests and limits for an Init Container are handled slightly differently.

Init Containers do not support readiness probes because they must run to completion before the Pod can be ready.

Usage

Init Containers have separate images from app Containers (though they can share the same image), which gives them some advantages for startup-related code:

  • They can contain and run utilities that are not desirable to include in the app Container image for security reasons.
  • They can contain utilities or custom code for setup that is not present in an app image. For example, there is no need to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
  • The application image builder and deployer roles can work independently without the need to jointly build a single app image.
  • They use Linux namespaces so that they have different filesystem views from app Containers. Consequently, they can be given access to Secrets that app Containers are not able to access.
  • They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met.

You can check the logs of an init container:

kubectl logs <pod name> -c <init container> -n test-1

During the startup of a Pod, the Init Containers are started in order, after the network and volumes are initialized. Each Container must exit successfully before the next is started.

If the Pod is restarted, all Init Containers must execute again. Init Container code should be idempotent.

This post records the difference between orphan and zombie processes and how they manifest, how to create them in bash, and how to handle zombies.

Today, after I killed a running process, it didn't get removed but was marked as <defunct> (visible when running ps). What is this?

Zombie vs Orphan process On Unix and Unix-like computer operating systems, a zombie process or defunct process is a process that has completed execution but still has an entry in the process table. This entry is still needed to allow the parent process to read its child’s exit status.

Zombie processes should not be confused with orphan processes: an orphan process is a process that is still executing, but whose parent has died. When the parent dies, the orphaned child process is adopted by init (process ID 1). When orphan processes die, they do not remain as zombie processes; instead, they are waited on by init. The result is that a process that is both a zombie and an orphan will be reaped automatically.

To create an orphan in bash:

# this subshell spawns a new sleep process in the background,
# prints the child pid, then dies immediately
(sleep 10 & echo orphan pid $!)
# now the ppid is taken over by the init process
ps axo pid,ppid,comm | grep <orphan pid>

Notice that & is the background operator; don't append ; after it.

To make a zombie in bash:

(sleep 1 & echo zombie pid $!; exec /bin/sleep 60) &
# check <defunct> mark
ps axo pid,ppid,comm | grep <zombie pid>

Analysis: in the subshell we first spawn sleep 1 in the background and output the child PID, then we exec to replace the parent shell. So sleep 1 will not be reaped by its original parent, nor by the init process (its parent hasn't died), and thus it becomes a zombie. Eventually both are reaped by init.

There is no harm in letting such processes be unless there are many of them. Processes that stay zombies for a long time are generally an error and cause a resource leak, but the only resource they occupy is the process table entry – process ID.

To check the maximum number of processes that can be created:

cat /proc/sys/kernel/pid_max

Usually a zombie process lasts a very short time; it is reaped by its parent or by init as mentioned above. If not, we need to reap it manually.

I checked the PPID of that defunct process: it was not PID 1 but another shell process. By killing its parent process, the zombie is handed over to init and reaped. Just check the PPID:

ps axo pid,ppid | grep defunct
# or
ps -ef | grep defunct
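Then kill the living parent; init adopts and reaps the zombie:

kill <PPID of the defunct process>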

You can also use these commands to verify the defunct process is gone.

Notice you cannot kill a zombie with kill -9, since it is already dead!

Understanding how the shell works under the hood is a must for me; I have encountered several interesting and confusing shell issues in my daily work. Let's dive deep into shell processes and their relationships to explore how subshells are created, and the relationship between parent and child shells.

Shell Type

Due to the bash shell’s popularity, it’s rare to use any other shell as a default shell.

The default interactive shell starts whenever a user logs into a virtual console terminal or starts a terminal emulator in the GUI. Another default shell, /bin/sh, is the default system shell. The default system shell is used for system shell scripts, such as those needed at startup.

In my Redhat and CentOS system, they are the same:

lrwxrwxrwx. 1 root root 4 Apr 13  2018 /bin/sh -> bash

To see a user's default login shell, look at /etc/passwd, for example:

fyre:x:1000:1000::/home/fyre:/bin/bash
demo:x:1001:1001::/home/demo:/bin/bash

Shell Relationships

You can use ps -f to see the difference after you run bash (child) several times in a shell (parent):

# run bash
# then run ps -f

UID PID PPID C STIME TTY TIME CMD
root 7762 7758 0 Jun03 pts/0 00:00:00 -bash
root 14957 7762 0 17:12 pts/0 00:00:00 bash
root 15028 14957 0 17:13 pts/0 00:00:00 ps -f

Here PID 14957 has parent 7762.

A child shell is also called a subshell. A subshell can be created from a parent shell or from another subshell. For example, run bash 3 times:

# run bash three times, then:
ps --forest -f

UID PID PPID C STIME TTY TIME CMD
root 7762 7758 0 Jun03 pts/0 00:00:00 -bash
root 2264 7762 0 23:52 pts/0 00:00:00 \_ bash
root 2467 2264 0 23:55 pts/0 00:00:00 \_ bash
root 2487 2467 0 23:55 pts/0 00:00:00 \_ bash
root 2510 2487 0 23:55 pts/0 00:00:00 \_ ps --forest -f

Constructs That Create a Subshell

Refer to this article: what is a subshell.

Note: subshells are often used for multi-processing in shell scripts. However, entering a subshell is expensive and can significantly slow down processing.

A subshell is typically implemented by forking a new process (but some shells may optimize this in some cases).

  • Subshell for grouping: (...) does nothing but create a subshell and wait for it to terminate. Contrast with {...} which groups commands purely for syntactic purposes and does not create a subshell.
  • Background &: creates a subshell and does not wait for it to terminate.
  • Pipeline: | creates two subshells, one for the left-hand side and one for the right-hand side, and waits for both to terminate. The shell creates a pipe and connects the left-hand side’s standard output to the write end of the pipe and the right-hand side’s standard input to the read end. In some shells (ksh88, ksh93, zsh, bash with the lastpipe option set and effective), the right-hand side runs in the original shell, so the pipeline construct only creates one subshell. (See the demo after this list.)
  • Command substitution: $() creates a subshell with its standard output set to a pipe, collects the output in the parent and expands to that output, minus its trailing newlines. (And the output may be further subject to splitting and globbing, but that’s another story.)
  • Process substitution: <(cmd) creates a subshell with its standard output set to a pipe and expands to the name of the pipe. The parent (or some other process) may open the pipe to communicate with the subshell. >(cmd) does the same but with the pipe on standard input.
  • Coprocess: coproc creates a subshell and does not wait for it to terminate. The subshell’s standard input and output are each set to a pipe with the parent being connected to the other end of each pipe.
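A quick demo of the pipeline case (bash defaults, no lastpipe): the right-hand side runs in a subshell, so its variable assignment does not survive:

x=1
echo hello | read x
echo $x    # still prints 1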

Process List

For a command list to be considered a process list (a grouping), the commands must be enclosed in parentheses (). Adding parentheses and turning the command list into a process list creates a subshell to execute the commands.

# echo $BASH_SUBSHELL
0
# (echo $BASH_SUBSHELL)
1
# ( (echo $BASH_SUBSHELL) )
2

As for how parent variables behave in a subshell (): from this, this, and this posts, long story short, a subshell () inherits all variables. Even $$ (the PID of the original shell) is kept. The reason is that for a subshell, the shell just forks and doesn't execute a new shell (unlike running a script ./xx).

Note: a subshell () is usually used together with &.

Background mode

Background mode is very handy, and it provides a method for creating useful subshells at the CLI.

# jobs -l
[1]+ 7552 Running sleep 40 &

[1] is the job number, 7552 is the PID, and Running is the job status. The jobs command displays any of the user's processes (jobs) currently running in background mode.

Using a process list in background mode is one creative method for using subshells at the CLI. Remember how we start Jetty in the conductor container? docker load and scp are also good candidates for background execution sometimes.

Co-processing

Co-processing performs almost identically to putting a command in background mode, except for the fact that it creates a subshell.

# coproc sleep 2
[1] 8174
[1]+ Done coproc COPROC sleep 2

It is the same as:

# (sleep 2) &

COPROC is the name given to the process; you can change it:

# coproc My_Job { sleep 10; }

The only time you need to name a co-process is when you have multiple co-processes running, and you need to communicate with them all. Otherwise, just let the coproc command set the name to the default, COPROC.

This will create a nested subshell:

# coproc ( sleep 10; sleep 2 )

My question

Remember that in the conductor container we start Jetty using (...) &. We want to run it as a separate process in the background. Why not just &? So if I want to run something in the background, should I use & to put the command in the background, or ()& to put a subshell in the background?

Referring to my question, I tested these 2 cases but did not see a difference:

1
2
sleep 1 & ps -f
(sleep 1)& ps -f

Maybe different Linux distros give different results; check first. For now, using & directly on the command is fine.

Shell Built-in Commands

An external command, sometimes called a filesystem command, is a program that exists outside of the bash shell. They are not built into the shell program. An external command program is typically located in /bin, /usr/bin, /sbin, or /usr/sbin.

# which ps
/usr/bin/ps

# type -a ps
ps is /usr/bin/ps

Whenever an external command is executed, a child process is created. This action is termed forking. It takes time and effort to set up the new child process’s environment. Thus, external commands can be a little expensive.

When using a built-in command, no forking is required. Therefore, built-in commands are less expensive.

Built-in commands are different in that they do not need a child process to execute. They were compiled into the shell and thus are part of the shell’s toolkit.
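You can check which kind a command is with type:

type cd        # cd is a shell builtin
type ps        # ps is /usr/bin/ps
type -a echo   # echo is a shell builtin AND /usr/bin/echo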

This article talks about the variable scope in shell, especially how to access variables between different scripts.

Local Variables

The set command displays all functions and local variables defined for a specific process:

# this definition will be in `set`, not in `env`
HELLO=999
# run `set` will see `HELLO`
set

It also sorts the display alphabetically.

Note that variables shown in set do not necessarily appear in env; as the name suggests, env shows only the variables that will be adopted by commands run from the current shell.

ENV Variables

Environment variables are visible from the shell session and from any spawned child subshells, and are also visible to commands launched from that shell. This makes environment variables useful in applications that create child subshells, which require parent shell information.

To view environment variables, use the env or the printenv command:

env | grep ^HOME
printenv HOME

You can use export to create an environment variable.

export demo='hello world'

You can use unset to remove an existing environment variable.

unset demo

If you want to set a variable for just one command invocation, prefix the command with it, e.g.:

# EDITOR only takes effect for this crontab invocation
EDITOR=vim crontab -e
# one-off KUBECONFIG env variable for this command
KUBECONFIG=xx.yml kubectl get nodes

A common trick for programmers is to include the single dot symbol in their PATH environment variable. The single dot symbol represents the current directory:

PATH=$PATH:.

Declare Command

Note that declare can print both set and env variables:

declare | grep ^HOME=

The declare command can be used to create local variables (visible in set) or to export them to env:

# -i: integer
declare -i n=34
# can do arithmetic operations directly without let or expr
n=n/2
# the result is 17
echo $n

# -l: convert to lower case
declare -l hello=WORLD
# -u: convert to upper case
declare -u hello=world
# if you reassign a value to hello, it will be converted to lower/upper case automatically

hello=world
# -p: display the attributes and values of each name
declare -p hello
## result is below; -- means a local variable with no options, compared to others like -x, -i, etc.
declare -- hello="world"

## declare can also export a variable
declare -x apple=123
## you will see apple in env
env | grep apple

declare -p apple
## result is below; -x means an exported variable
declare -x apple="123"

## remove from env
declare +x apple

You can mix the options:

declare -x -l EDITOR
EDITOR=vIM

declare -p EDITOR
## auto fix it to lower case
declare -xl EDITOR="vim"

Other ways to convert characters to lower/upper case:

val=abcdEFG
# pattern is optional, eg [A-F]
# all to lower cases
echo ${val,,pattern}
# only first char to lower case
echo ${val,pattern}
# all to upper cases
echo ${val^^pattern}
# only first char to upper case
echo ${val^pattern}

Create a constant (read-only) variable; it cannot be unset and persists for the shell session:

## set name as readonly
declare -r name=bob
## export it and readonly
declare -xr name=alice
## define a readonly array
declare -ra array=(1 2 3 4)

Declare different kinds of arrays; see my other blog Array in Script:

## indexed array
declare -a user_name
## associative array
declare -A user_name

Shell Variables

Shell variables (i.e., local variables) are available only in the shell that creates them. In fact, the Linux system also defines standard shell environment variables for you by default.

To define a shell variable, no space can appear between the variable name, the equals sign, and the value:

var1=value

Variables defined within a shell script maintain their values throughout the life of the shell script but vanish when the shell script completes.

If you want to assign the value of one variable to another, you must use $:

var2=${var1}

otherwise, var1 will be interpreted as a text string:

var2=var1

Variable Scope

My question is how to pass variables from one script to another. Basically there are a few options (see the sketch after the list):

  1. Make the variable an environment variable (export it) before calling the child script (./script2).

  2. source the child script and it will run in the same shell, This would let you share more complex variables like arrays easily, but also means that the other script could modify variables in the caller shell.

  3. pass as parameters to child script: ./script2 var1 var2

  4. If you use exec to replace the current process, you still need to export the variable.
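A minimal sketch of the options (the script names are made up):

# script1.sh
export VAR1=hello            # option 1: environment variable,
./script2.sh                 #   visible inside script2.sh as $VAR1

. ./script2.sh               # option 2: source; runs in the same shell

./script2.sh "$VAR1" world   # option 3: positional parameters $1 and $2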

Preserve ENV variables

Refer to this question. This is interesting and easy to get wrong: if I want to preserve global ENV variables when switching user in the shell, use su xxx instead of su - xxx. The - option does the following (see the demo after the list):

  1. clears all environment variables except for TERM
  2. initializes the environment variables HOME, SHELL, USER, LOGNAME, PATH
  3. changes to the target user’s home directory
  4. sets argv[0] of the shell to ‘-’ in order to make the shell a login shell
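A quick demo of the difference (assuming a user demo exists):

export FOO=bar
su demo -c 'echo $FOO'     # prints bar: the environment is preserved
su - demo -c 'echo $FOO'   # prints an empty line: "-" resets the environment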

But be very careful not to let the shell expand variables inside the command option:

su xxx -c "... echo $a"

Here $a will be expanded before execution; use single quotes or escape the $ sign:

su xxx -c '... echo $a'
su xxx -c "... echo \$a"

Normally we use an interactive text editor, like vim. The sed command is one of the most commonly used command-line editors in the Linux world. sed stands for stream editor; it uses the rules supplied to edit a stream of data on the fly.

Note: be careful that sed -i will break a softlink and create a regular file with the same name! For example:

ln -s /tmp/source.txt /tmp/link.txt
sed -i -e "s#aaa#bbb#" /tmp/link.txt

Then the softlink is gone and a new regular file named link.txt is created instead. So use readlink first to resolve the symbolic link, then run sed on the target.
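A safer sketch that resolves the symlink first and edits the real target in place:

sed -i -e "s#aaa#bbb#" "$(readlink -f /tmp/link.txt)"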

First you need to understand how sed works with text:

  1. Reads one data line at a time from the input
  2. Matches that data with the supplied editor commands
  3. Changes data in the stream as specified in the commands
  4. Outputs the new data to STDOUT

Two flags that are very helpful for controlling output; they can be appended after other commands:

# -n: quiet
# /p: print the edited line
# this combination prints only the changed lines
sed -n -e 's/root/toor/p' file

Substitution

I have a text file sedtxt:

Using Java print hello world and hello tree.
Using Java print hello world and hello tree.
Using Java print hello world and hello tree.
Using Java print hello world and hello tree.

If I want to substitute all hello with hi in place:

sed -i -e 's/hello/hi/g' sedtxt

-i: does the substitution in place in file sedtxt; use -i.bak to create a backup file automatically.
-e: followed by commands.
s/x/y/flags: the substitute command; / is the delimiter (other characters work too); the g flag replaces all occurrences (global).

Note that without g, it replaces only the first occurrence on each line. What if I want to replace only the first occurrence in the whole file? The workaround is to limit the scanning range:

sed '0,/Apple/{s/Apple/Banana/}' input_filename

See the explanation here.

If I want to substitute the second hello with goodbye in each line:

sed -e 's/hello/goodbye/2' sedtxt
Using Java print hello world and goodbye tree.
Using Java print hello world and goodbye tree.
Using Java print hello world and goodbye tree.
Using Java print hello world and goodbye tree.

If I want to substitute hello with hi and Java with Python:

sed -e 's/hello/hi/g' -e 's/Java/Python/g' sedtxt
# or
sed -e 's/hello/hi/g; s/Java/Python/g' sedtxt

Using Python print hi world and hi tree.
Using Python print hi world and hi tree.
Using Python print hi world and hi tree.
Using Python print hi world and hi tree.

If I want to print only the matching lines (convenient for debugging):

sed -n -e 's/hello/hi/gp' sedtxt
Using Java print hi world and hi tree.
Using Java print hi world and hi tree.
Using Java print hi world and hi tree.
Using Java print hi world and hi tree.

-n: quiet output.
p: substitute flag to print the matching line.

Note: combining with grep for debugging works well.

Using Address

The sed editor assigns the first line in the text stream as line number 1 and continues sequentially for each new line.

Only replace the 2nd line:

sed -e '2s/hello/hi/g' sedtxt
Using Java print hello world and hello tree.
Using Java print hi world and hi tree.
Using Java print hello world and hello tree.
Using Java print hello world and hello tree.

Range substitution: '1,$s/hello/hi/g' means from top to bottom.

sed -e '2,3s/hello/hi/g' sedtxt
Using Java print hello world and hello tree.
Using Java print hi world and hi tree.
Using Java print hi world and hi tree.
Using Java print hello world and hello tree.

We can also use a text pattern to filter lines; this applies only to lines containing the word print:

sed -e '/print/s#and#or#' sedtxt
Using Java print hello world or hello tree.
Using Java print hello world or hello tree.
Using Java print hello world or hello tree.
Using Java print hello world or hello tree.

Deletion

Delete consecutive lines after match

I have a text file seddel:

The 1st line is 1
The 2rd line is 2
The 3rd line is 3
The 4th line is 4

Delete from line 2 to the end:

sed -e '2,$d' seddel
The 1st line is 1

We can also use pattern matching; delete the line containing 3rd:

sed -e '/3rd/d' seddel
The 1st line is 1
The 2rd line is 2
The 4th line is 4

You can combine the two address syntaxes: this starts from the first line and continues until 3rd matches, replacing 3rd with NAN within that range:

sed -e '1,/3rd/{s/3rd/NAN/g}' seddel
The 1st line is 1
The 2rd line is 2
The NAN line is 3
The 4th line is 4

Delete commented and empty lines

sed -e '/^#/d; /^$/d' <file>

Insertion and Appending

The insert command (i) adds a new line before the specified line. The append command (a) adds a new line after the specified line.

I have a text file sedins:

The 1st line is 1
The 2rd line is 2
The 3rd line is 3
The 4th line is 4

Insert before the first line:

sed -e '1iNew line coming!' sedins
New line coming!
The 1st line is 1
The 2rd line is 2
The 3rd line is 3
The 4th line is 4

Append after the 2nd line (the new line becomes the 3rd):

sed -e '2aNew line coming!' sedins
# using regexp
sed -e '/The 2rd line is 2/a New line coming!' sedins
The 1st line is 1
The 2rd line is 2
New line coming!
The 3rd line is 3
The 4th line is 4

Insert with white spaces

For example, when developing non-root, I want to add runAsUser: 1000 right after securityContext: with correct alignment:

sed -i -e '/securityContext/a\         runAsUser: 1000' xxx.yml

You only need to escape the first space; sed automatically recognizes the rest of the spaces.

...
securityContext:
  runAsUser: 1000
  privileged: false
...

Changing

The change command allows you to change the contents of an entire line of text in the data stream.

I have a text file sedch:

The 1st line is 1
The 2rd line is 2
The 3rd line is 3
The 4th line is 4

change the second line:

sed -e '2cNONE' sedch
## or
sed -e '/2rd/cNONE' sedch
The 1st line is 1
NONE
The 3rd line is 3
The 4th line is 4

Transforming chars

The transform command (y) is the only sed editor command that operates on a single character.

I have a text file sedtrans:

The 1st line is 1
The 2rd line is 2
The 3rd line is 3
The 4th line is 4

The transform command performs a one-to-one mapping of the inchars and the outchars values.

sed -e 'y/1234/5678/' sedtrans
The 5st line is 5
The 6rd line is 6
The 7rd line is 7
The 8th line is 8

The transform command is a global command; that is, it performs the transformation on any character found in the text line automatically, without regard to the occurrence. You can’t limit the transformation to a specific occurrence of the character.

sed files in directory and subdirectories recursively

Actually, we can find all files with find and then exec sed; see this post:

find <dir> -type f -name "*sh" -exec sed -i -e "s|${old}|${new}|g" {} \;

Note that -type f is necessary; otherwise directory names will be passed to sed.

While studying orphan and zombie processes, a question about the wait command came up: I noticed that in a shell script, even without calling wait, background child processes were still reaped and did not become zombies. For the explanation, please see my question.

To recap, the bash wait builtin and the Linux wait() API are not the same thing. Bash takes care of reaping processes for you; the wait bash command has no effect on reaping. Bash stores the child process exit status in memory, and it becomes available to you upon calling wait.

Sometimes when I run time-consuming tasks, I want them to execute in parallel to improve CPU utilization and reduce execution time (if the machine has multiple cores or processors).

Let's talk about different patterns for doing that in a shell script. For example, I have these scripts: back.sh

#!/bin/bash
tail -f /dev/null

hello.sh

#!/bin/bash
echo "====== hello"
exit 0

Wait for all background tasks

In main.sh, if:

# $! captures the PID of the most recent background process
declare -a nums=(1 2 3)
for i in "${nums[@]}"
do
./hello.sh &
echo "###### PID is $!"
done
wait
echo "done!"

You will get a result like this; done! appears only after all background processes have finished:

###### PID is 11649
###### PID is 11650
###### PID is 11651
====== hello
====== hello
====== hello
done!

But if main.sh is:

./back.sh &
declare -a nums=(1 2 3)
for i in "${nums[@]}"
do
./hello.sh &
echo "###### PID is $!"
done
wait
echo "done!"

wait will block until all background tasks complete; you will never see done! because back.sh never exits. You have to kill it with the kill command.

The improved way is to pass only the related PIDs to wait, so it does not wait for the unrelated background task back.sh:

declare -a nums=(1 2 3)
declare -a pids
./back.sh &
for i in "${nums[@]}"
do
./hello.sh &
echo "###### PID is $!"
pids+=($!)
done
wait ${pids[@]}
echo "done!"

Wait for background tasks in arbitrary order

This way is very similar to the example above, but we wait individually. Also notice that wait PID returns the subprocess's exit code! If no PID is given, all currently active child processes are waited for, and the return status is zero. Check man wait for details.

In main.sh, write:

#!/bin/bash
declare -a nums=(1 2 3)
declare -a pids
for i in "${nums[@]}"
do
./hello.sh &
echo "###### PID is $!"
# pids[n]=$! also works; [ ] already treats n as a number
pids[${n}]=$!
# n is not declared; it defaults to 0
let n+=1
done

for pid in "${pids[@]}"
do
# check exit code
if wait ${pid}; then
echo "success"
else
echo "abnormal"
fi
done
echo "done!"
echo "done!"

SIGCHLD signal

When a child process finishes or is terminated, it sends the SIGCHLD signal to its parent; you can trap it and do something, such as recycling resources. You need to enable job control first; see this issue.

Using SIGCHLD to catch the moment of child process termination:

#!/bin/bash
# enable job control, see man set
# set -m is the same
set -o monitor
# trap SIGCHLD; the first argument must be a command to run
trap 'echo "reaping child process"' SIGCHLD

(sleep 2) &

# do other things
tail -f /dev/null

Others

Actually, the jobs command can monitor background processes:

# ./back.sh  &
[1] 15405
# jobs -l
[1]+ 15405 Running ./back.sh &
# jobs -p
15405

Overall, tmux is the best among all the options here, but you should know how the others work in case tmux is unavailable. You also need to understand how VNC, Screen, and tmux work under the hood: why do processes keep running after the SSH connection drops when you use them? Where is the state being collected and recorded?

Run in Background

The most straightforward way is to put the long-running task in the background and redirect stdout/stderr to files or /dev/null, for example:

./task.sh &> log.$(date "+%Y-%m-%d") &
# then manage it with the jobs, bg, fg and ctrl+z commands

Note that ctrl+z also works when editing a file in Vim; it suspends editing, puts it in the background, and brings you back to the terminal.

To detach a job from the current jobs list, use the disown %<job id> command. The detached job will still be running, but you cannot bring it back with jobs.

Nohup

The reason work gets lost when the SSH session drops is that the SIGHUP signal is sent to the foreground process. If we let the process ignore this signal, it can keep running after the remote SSH connection is gone.

# nohup will redirect the stdin from /dev/null
# stdout/err to FILE $HOME/nohup.out
nohup [bash|sh] <script> &
# similar to
# &> is the modern version of 2>&1
<script> </dev/null &>/dev/null &

You can use jobs, bg and fg command to operate it.

VNC

One of the reasons I used VNC before was to keep long-running SSH tasks on a remote machine and prevent loss of work; the alternatives are the screen and tmux commands (tmux is the best).

Virtual Network Computing (VNC) is a graphical desktop-sharing system that uses the Remote Frame Buffer protocol (RFB) to remotely control another computer. It transmits the keyboard and mouse events from one computer to another, relaying the graphical-screen updates back in the other direction, over a network.

Popular uses for this technology include remote technical support and accessing files on one’s work computer from one’s home computer, or vice versa.

I usually use it for remote development on Fyre. I have a Fyre central control machine with VNC installed; after VNCing to that machine, I use it to SSH to other machines within the same internal network (these connections usually won't break). You can also install an IDE in VNC to do general programming work rather than developing locally.

Note, there are other remote terminal tools such as Termius.

Install VNC Server

The settings vary across Linux distros.

yum update -y

# seems this is not needed:
yum install open-vm-tools

# if you don't have a desktop installed: note Linux has many
# different desktop environments; only install the core packages
# of KDE or GNOME. KDE is better.
yum groupinstall 'X Window System' 'KDE'
yum groupinstall 'X Window System' 'GNOME'

yum list | grep tiger
yum -y install tigervnc-server

# Fyre firewalld is inactive by default, no need to deal with the firewall
# if it's enabled, you may need to configure it
systemctl status firewalld
firewall-cmd --permanent --zone=public --add-service vnc-server
firewall-cmd --reload

# start vnc
# select yes, then create password: 123456
# the first time it will open display :1
# then you will get the address and port to log in:
# centctl1.fyre.ibm.com:1
vncserver

Note that you get a new VNC session every time you run vncserver; check with:

ps aux | grep vnc

root 2682 0.0 3.0 284860 116840 ? Sl May27 1:38 /usr/bin/Xvnc :1 -auth /root/.Xauthority -desktop mycentctl1.fyre.ibm.com:1 (root) -fp catalogue:/etc/X11/fontpath.d -geometry 1024x768 -pn -rfbauth /root/.vnc/passwd -rfbport 5901 -rfbwait 30000
root 9199 1.4 1.6 231660 62388 pts/3 Sl 08:59 0:00 /usr/bin/Xvnc :2 -auth /root/.Xauthority -desktop mycentctl1.fyre.ibm.com:2 (root) -fp catalogue:/etc/X11/fontpath.d -geometry 1024x768 -pn -rfbauth /root/.vnc/passwd -rfbport 5902 -rfbwait 30000

you can kill it by running:

vncserver -kill :2

If VNC is broken due to Fyre maintenance, open a new VNC session again with vncserver.

Install VNC viewer

Install a VNC viewer on your local laptop, then connect, for example:

centctl1.fyre.ibm.com:1

Copy and Paste

If copying from local to the viewer sometimes malfunctions, kill the klipper process:

ps aux | grep klipper
kill -9 <PID of klipper>

You can adjust shortcuts in VNC viewer for copy and paste: Settings -> Configure Shortcuts -> copy / paste

Other Settings

Other useful settings: Settings -> Edit Current Profile -> Mouse -> copy on select / Trim trailing space

Adjust the font size and themes: Settings -> Manage Profiles -> Edit Profile -> Appearance -> Black on Random Light -> check Vary the background color for each tab

Also change the text size under Appearance.

Screen resolution

If the screen opened by the VNC viewer is small, you can change the resolution; run:

xrandr -s 1920x1080

on the terminal in your remote machine.

Screen

Create a virtual terminal that lives beyond your terminal session.

How To Use Linux Screen: Screen or GNU Screen is a terminal multiplexer. In other words, you can start a screen session and then open any number of windows (virtual terminals) inside that session. Processes running in Screen continue to run when their window is not visible, even if you get disconnected; the work or progress will not be lost! So you no longer need & to run things in the background.

See the screen command cheat sheet, or the help inside screen: ctrl+a ?, hit enter to quit. You can use either screen commands or the ctrl+a key bindings.

Note: install and run screen on the target machine; the same applies to VNC.

apt update
apt install -y screen
# or
yum install -y screen

# start a screen session with description
# and enter into it
screen -S remote_playbook

You can create multiple windows in one session

Ctrl+a c          Create a new window (with shell)
Ctrl+a p          Go back to the previous window
Ctrl+a n          Go to the next window
Ctrl+a Ctrl+a     Toggle between the current and previous region or window

Ctrl+a A          Rename the current window

Ctrl+a "          List all windows
Ctrl+a 0/1/2/3    Switch to window 0 (by number)

Ctrl+a S          Split the current region horizontally into two regions
Ctrl+a |          Split the current region vertically into two regions
Ctrl+a tab        Switch the input focus to the next region
Ctrl+a Q          Close all regions but the current one
Ctrl+a X          Close the current region

Ctrl+a k          Kill the current window, or `exit`
Ctrl+a \          Kill all windows
# list running sessions
screen -ls

# detach a session
screen -d [session name]
# detach the current session
Ctrl+a d

# attach a session
screen -r [session name]
# attach to a running session
screen -x

# detach and log out all sessions
# will not terminate the session
Ctrl+a D D

# quit and terminate the current session
Ctrl+a :quit
# or kill all windows of the session
Ctrl+a k

How do I know whether I am in screen, and in which session?

# shows the session id if you are in screen
# for example: 3384204.screen-es-upgrade
echo $STY

# returns a value with a screen prefix if you are in screen
# for example: screen.xterm-256color
echo $TERM

# press ctrl+a then t to show the time and system load
# otherwise nothing will show
# for example: 22:26:35 Mar 04 chengdolcent 0.36 0.28 0.2

Tmux

Compared with Screen, tmux has a status line at the bottom of the terminal, which makes it harder to confuse with the host terminal.

Note that tmux has 3 concepts (same as Screen): session -> window -> pane. You can have multiple sessions, each session can have multiple windows, and each window can have several panes. Usually, rename the session and window with a meaningful name.

# version
tmux -V
# start a session; notice the bottom line for window status
tmux
tmux new -s <session name>

# rename session; by default it has no meaningful name
Ctrl+b $

# list tmux sessions
tmux ls

# switch between sessions:
# use arrow keys to expand each session and select
Ctrl+b s

# detach tmux session
Ctrl+b d
# attach again
tmux attach -t <session name>   # or: tmux a -t <session name>
# attach the most recent one
tmux attach

# kill all sessions but the current one
tmux kill-session -a
# kill all sessions but my session
tmux kill-session -a -t <my session>

In the bottom status line, * marks the current window; you will see the window list and names there.

Some key bindings for window:

Ctrl+b ?         help, press `q` to quit

Ctrl+b c create new window
Ctrl+b , rename window
Ctrl+b w list windows

Ctrl+b <id> switch to other window
Ctrl+b n go to next window
Ctrl+b p go to previous window

Ctrl+b & kill and close the window

Ctrl+b t clock display, press any key goes back

Add panes to windows; the active one is highlighted in green.

Ctrl+b "         horizontal split
Ctrl+b % vertical split
Ctrl+b o switch pane
Ctrl+b arrow switch pane by arrow key
Ctrl+b ; switch between most recently panes
Ctrl+b x close pane

You can also adjust the pane size, or even promote a pane to its own window.

Ctrl+b [Esc + arrow]    repeatedly press Esc + arrow to resize the pane
Ctrl+b !                promote the pane to its own window

Tmux also has a command mode that accomplishes the same tasks as the key bindings, similar to Vim. For example, the monitor-activity setting highlights a window when its output changes. You can also pre-set the tmux config in .tmux.conf; there are many examples on the Internet.

This command is awesome for backups with a complicated file structure and frequent modifications. It's more elegant and smarter (it skips unchanged files) than using portable storage or scp to transfer the files.

Note that an SSHFS mount may also fit the need.

Introduction

Let’s see what is rsync from wiki: rsync is a utility for efficiently transferring and synchronizing files between a computer and an external hard drive and across networked computers by comparing the modification times and sizes of files.

rsync will use SSH to connect. Once connected, it will invoke the remote host’s rsync and then the two programs will determine what parts of the local file need to be transferred so that the remote file matches the local one.

rsync can also operate in daemon mode, serving and receiving files in the native rsync protocol (using the rsync:// syntax). Here I only talk about the SSH way.

  • How to Exclude Files and Directories with Rsync
  • Rsync Command in Linux with Examples

Usage

To get rsync working between two hosts, the rsync program must be installed on both the source and destination, and you’ll need a way to access one machine from the other.

Copy files to remote home or from remote to local

rsync files remote:
rsync files user@remote:
rsync user@remote:source dest

If rsync isn’t in the remote path but is on the system, use --rsync-path=path to manually specify its location. Unless you supply extra options, rsync copies only files. You will see:

skipping directory xxx

To transfer entire directory hierarchies, complete with symbolic links, permissions, modes, and devices, use the -a option.

# -n: dry-run, this is vital when you are not sure.
# -P: show progress bar
# -v: verbose mode
# -z: compress during transfer
# -a: archive mode, equals -rlptgoD
# here rsync a file and a dir
rsync -n -P -vza file dir user@remote:<path>

# -q: quiet
# -e: choose a different remote shell
# for example remote ssh uses a port other than 22
rsync -q -e "ssh -p 2322" file user@remote:<path>

To make an exact replica of the source directory, you must delete files in the destination directory that do not exist in the source directory:

# --delete: delete extraneous files from dest dirs
rsync -v --delete -a dir user@remote:

Please use -n (dry run) to see what would be deleted before running the command.

Be particularly careful with a trailing slash after dir:

# dir vs dir/
rsync -a dir/ user@remote:dest

This copies all files under dir into the dest folder on the remote, instead of copying dir itself into dest.

You can also use --exclude/--include=PATTERN and --exclude-from/--include-from=PATTERN_FILE in the command.

To speed operation, rsync uses a quick check to determine whether any files on the transfer source are already on the destination. The quick check uses a combination of the file size and its last-modified date.

When the files on the source side are not identical to the files on the destination side, rsync transfers the source files and overwrites any files that exist on the remote side. The default behavior may be inadequate, though, because you may need additional reassurance that files are indeed the same before skipping over them in transfers, or you may want to put in some extra safeguards:

  • --checksum(abbreviation: -c) Compute checksums (mostly unique signatures) of the files to see if they’re the same. This consumes additional I/O and CPU resources during transfers, but if you’re dealing with sensitive data or files that often have uniform sizes, this option is a must. (This will focus on file content, not date stamp)

  • --ignore-existing Doesn’t clobber files already on the target side.

  • --backup (abbreviation: -b) Doesn’t clobber files already on the target but rather renames these existing files by adding a ~ suffix to their names before transferring the new files.

  • --suffix=s Changes the suffix used with --backup from ~ to s.

  • --update (abbreviation: -u) Doesn’t clobber any file on the target that has a later date than the corresponding file on the source.

For example, I sync my code repo from the local host to the remote for testing and development; after verifying, I sync back to the local host to check in:

# forward sync: copy the proj folder itself to dest
# result on remote: /home/chengdol/proj
rsync -vza \
  ./proj \
  remote_user@remote:/home/chengdol

# then coding and editing

# backward sync: copy the remote proj folder itself to the current directory,
# excluding some folders inside the remote proj
# result: ./proj
rsync -vza \
  --exclude .terraform \
  --exclude output \
  --exclude utils/__pycache__ \
  --exclude deployment/__pycache__ \
  remote_user@remote:/home/chengdol/proj \
  .

# you will see the incrementally transferred files, as well as the .git changes