Linux Networking Summary

//TODO [ ] https://www.youtube.com/watch?v=kQYQ_3ayz8w&list=PLvadQtO-ihXt5k8XME2iv0cKpKhcYqe7i&index=5

Commonly used commands for checking networking: ss, lsof, netstat, ifconfig, hostname, ip, route, iptables, nc, ping, arp, curl, wget, host, nslookup, dig.

This summary is mainly drawn from the Network chapter of the LPIC-1 course on PluralSight and from the LFCE Advanced Networking training. Some iptables content from YouTube was added later. Environment: CentOS 7 Enterprise Linux or RedHat.

Frequently Asked Question: What happens when you enter a URL in the browser?

About domain name: www.microsoft.com.:

  • root domain: .
  • top-level domain: com
  • second-level domain: microsoft
  • third-level domain: www
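
The breakdown above can be sketched in a few lines of Python. This is only an illustration of how the labels are ordered from the root outwards; `domain_levels` is a hypothetical helper name, not part of any library.

```python
def domain_levels(fqdn: str) -> list:
    """Return the labels of a domain name ordered from the root outwards."""
    # Drop the trailing dot (the root), then split into labels.
    labels = fqdn.rstrip(".").split(".")
    # Reverse so the list reads: root, top-level, second-level, third-level, ...
    return ["."] + labels[::-1]


levels = domain_levels("www.microsoft.com.")
print(levels)  # ['.', 'com', 'microsoft', 'www']
```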

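That is the most basic flow. If HTTPS is used, you can also describe the TLS handshake; if there is a proxy in between there will be a tunnel, and if there is a load balancer there may be TLS termination, and so on.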

Ip vs Ifconfig

ifconfig is obsolete, use ip instead. I have a separate post dedicated to the ip command.

ipv4: 32 bits long, dotted decimal
ipv6: 128 bits long, colon-separated groups of four hex digits
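
A small sketch with Python's standard `ipaddress` module makes the two address lengths concrete. The addresses used here are arbitrary examples (192.168.1.50 appears later in this post, 2001:db8::1 is from the IPv6 documentation range).

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.50")
v6 = ipaddress.ip_address("2001:db8::1")

# max_prefixlen is the address length in bits.
print(v4.version, v4.max_prefixlen)  # 4 32
print(v6.version, v6.max_prefixlen)  # 6 128

# The fully expanded IPv6 form: eight groups of four hex digits.
print(v6.exploded)  # 2001:0db8:0000:0000:0000:0000:0000:0001
```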

Hostname

# show full hostname
hostname -f

# node hostname
uname -n

# query and change the system hostname and related settings
hostnamectl

Static hostname: halos1.fyre.xxx.com
Icon name: computer-vm
Chassis: vm
Machine ID: f7bbe4af93974cbfa5c55b68c011d41c
Boot ID: 4e30e7107fa441a9b3ad70d0b784782d
Virtualization: kvm
Operating System: Red Hat Enterprise Linux Server 7.6 (Maipo)
CPE OS Name: cpe:/o:redhat:enterprise_linux:7.6:GA:server
Kernel: Linux 3.10.0-957.10.1.el7.x86_64
Architecture: x86-64

# show the DNS domain name
# dnsdomainname prints the domain part of the FQDN, i.e. whatever follows the first dot of
# `hostname -f`. If the hostname does not resolve to a fully qualified name (for example it is
# missing from /etc/hosts and DNS), the command prints nothing.
dnsdomainname

# this will not be persistent
# the static hostname is still unchanged but the transient hostname is xxx.example.com
# you can see the transient name with hostnamectl
hostname xxx.example.com

# this will be persistent in
# /etc/hostname
hostnamectl set-hostname xxx.example.com

# set pretty hostname which includes '
# /etc/machine-info
hostnamectl set-hostname "xxx'ok.example.com"

Notice that the order of the names in an /etc/hosts entry is important! Put the fully qualified hostname first, then the aliases; otherwise some scenarios will break!

# /etc/hosts
<ip> <fully qualified domain name: FQDN> <aliases>
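
To make the ordering concrete, here is a minimal Python sketch that splits one hosts line the same way: the first name is treated as the canonical (fully qualified) name and the rest as aliases. `parse_hosts_line` and the sample entry are made up for illustration.

```python
def parse_hosts_line(line: str):
    """Split one /etc/hosts line into (ip, canonical_name, aliases)."""
    # Anything after '#' is a comment.
    fields = line.split("#", 1)[0].split()
    if len(fields) < 2:
        return None  # blank line, comment-only line, or malformed entry
    ip, canonical, *aliases = fields
    return ip, canonical, aliases


entry = parse_hosts_line("192.168.1.50 web1.example.com web1 www  # hypothetical host")
print(entry)  # ('192.168.1.50', 'web1.example.com', ['web1', 'www'])
```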

Besides the local hosts file, let's look at the DNS settings; I have a blog post covering this. The dig command (DNS lookup utility) is used to check the response from a DNS server and to look up a hostname.

# use default dns server
# -t A: type A record
dig www.pluralsight.com -t A
# use specified dns server, for example, google dns server 8.8.8.8
dig www.pluralsight.com @8.8.8.8 -t A

Output:

<<>> DiG 9.9.4-RedHat-9.9.4-61.el7_5.1 <<>> www.pluralsight.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14726
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.pluralsight.com. IN A

# 59,186,186 is TTL (second, keep changing)
;; ANSWER SECTION:
www.pluralsight.com. 59 IN CNAME www.pluralsight.com.cdn.cloudflare.net.
www.pluralsight.com.cdn.cloudflare.net. 186 IN A 104.19.162.127
www.pluralsight.com.cdn.cloudflare.net. 186 IN A 104.19.161.127

# server now is 8.8.8.8
;; Query time: 60 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sun Apr 12 13:03:48 PDT 2020
;; MSG SIZE rcvd: 132

Add short format +short to return the IP address only:

# only show resolved output
dig +short www.pluralsight.com @8.8.8.8

How to check a DNS record's TTL: the TTL set on a DNS record defines how long a resolver is supposed to cache the answer before it expires. TTLs are typically used to reduce the load on your authoritative name servers and to speed up DNS queries for clients.

# A is the record type; this queries the local DNS resolver
dig A google.com

# other type
# AAAA: ipv6
dig AAAA google.com
# canonical name
dig cname google.com

# get authoritative dns server
# NS: name server
dig +short NS google.com
# check by authoritative dns server
dig A google.com @ns1.google.com.

# only show the answer section with its TTL
dig +nocmd +noall +answer +ttlid A google.com
# human-readable
dig +nocmd +noall +answer +ttlunits A google.com
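
As a rough sketch of how to read the TTL out of such output, the Python below splits answer lines into their five columns (name, TTL, class, type, data). The sample lines are copied from the ANSWER SECTION shown earlier; `parse_answer_line` is a hypothetical helper, and it assumes numeric TTLs (with `+ttlunits` the TTL column would be a string such as `3m6s` instead).

```python
def parse_answer_line(line: str) -> dict:
    """Split one dig answer line into name, TTL, class, type and data."""
    name, ttl, rclass, rtype, rdata = line.split(None, 4)
    return {"name": name, "ttl": int(ttl), "class": rclass, "type": rtype, "data": rdata}


answer_section = """\
www.pluralsight.com. 59 IN CNAME www.pluralsight.com.cdn.cloudflare.net.
www.pluralsight.com.cdn.cloudflare.net. 186 IN A 104.19.162.127
www.pluralsight.com.cdn.cloudflare.net. 186 IN A 104.19.161.127
"""

records = [parse_answer_line(l) for l in answer_section.splitlines() if l.strip()]
for r in records:
    print(r["type"], r["ttl"], r["data"])

# The shortest TTL in the chain bounds how long the whole answer can be cached.
min_ttl = min(r["ttl"] for r in records)
print(min_ttl)  # 59
```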

Network services

04/12/2020: So far I have only inspected the configuration; I have not changed it.

Display and set IP address

ip -4 addr
ip addr show eth0
# not persistent
ip addr add 192.168.1.50/24 dev eth0

I did not fully understand how these configurations are used. The NetworkManager tool is not a cure-all and does not apply everywhere, but it can be used to make changes persistent so that they are not lost.

# check status
systemctl status NetworkManager
# if not active, start it
systemctl start NetworkManager

# nmcli command
# command-line tool for controlling NetworkManager
# show all connections
nmcli connection show
# pretty format
nmcli -p connection show eth0

# text user interface (TUI)
nmtui
# then edit a connection, select network interface
# config ipv4 ip address/gateway.
systemctl restart network

Traditional network service, more flexible and common.

systemctl status network

The network configuration is read from scripts under /etc/sysconfig/network-scripts/.

ifcfg-eth0  ifcfg-eth1  ifcfg-lo ...

These files hold the configuration; for more details see this link: https://www.computernetworkingnotes.com/rhce-study-guide/network-configuration-files-in-linux-explained.html

TYPE=Ethernet
BOOTPROTO=dhcp
NAME=eth0
DEVICE=eth0
ONBOOT=yes
...
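
The ifcfg files are plain KEY=value lines, so they are easy to inspect from a script. The Python below is only a sketch of such a reader, using the sample content above; `parse_ifcfg` is a made-up helper and does not handle every shell quoting rule.

```python
def parse_ifcfg(text: str) -> dict:
    """Parse KEY=value lines of an ifcfg-style file into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


sample = """TYPE=Ethernet
BOOTPROTO=dhcp
NAME=eth0
DEVICE=eth0
ONBOOT=yes
"""

cfg = parse_ifcfg(sample)
print(cfg["DEVICE"], cfg["BOOTPROTO"], cfg["ONBOOT"])  # eth0 dhcp yes
```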

After editing the ifcfg-xx file, bring down and up that interface:

ifdown eth0
ifup eth0

Routing

[ ] iptables vs routing tables: what is the difference, and when is each one used? See this question and the diagram in the comment.

Display routing tables

# see below
ip r
# route and netstat show the meaning of each column more clearly
# -n: displays the results as IP addresses only and does not attempt to perform a DNS lookup
netstat -rn
# -e: display as netstat format
route -n [-ee]

Explaining a host routing table (this machine is not a router): for the meaning of the column names see man route, for example the Flags letters. The order of entries in the routing table does not matter; the longest matching prefix always takes priority.

# In short, the routing table answers: where to, through which exit, and who is the next hop

# Destination is the destination network or host.
# The destination IP of an outgoing packet is ANDed with the mask (Genmask) and the result is
# compared against Destination.
# If it matches, the packet is sent out through Iface (interface).
# If several entries match after masking, the one with the longest prefix wins.

# 0.0.0.0 in Destination means the default route; its network mask is also 0.0.0.0. Any IP ANDed
# with 0.0.0.0 gives 0.0.0.0, so every IP without a more specific match goes to the default gateway.

# Gateway: the gateway address, for example 192.168.0.1. 0.0.0.0 means unspecified or none
# (sometimes shown as *).
# This assumes that the network is locally connected, as there is no intermediate hop.
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.0.1     0.0.0.0         UG    100    0        0 ens4
# note that this one is a host IP, not a network
192.168.0.1     0.0.0.0         255.255.255.255 UH    100    0        0 ens4
192.168.9.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0
Compare this with the ip r command, which displays the same routes differently:

# proto [type]: routing protocol identifier of this route
# scope link: the destination is reachable directly on the network segment this device is attached to

# default gateway
default via 192.168.0.1 dev ens4 proto dhcp metric 100
192.168.0.1 dev ens4 proto dhcp scope link metric 100
# docker0
192.168.9.0/24 dev docker0 proto kernel scope link src 192.168.9.1

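Adding routes: send all traffic that has no matching route to 192.168.56.104 via eth0. For example, the current machine cannot reach the external network but 192.168.56.104 can. Afterwards, 192.168.56.104 also needs to be configured as a router.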

# this command is not persistent
# "default" can be replaced by a specific destination prefix such as 192.168.1.0/24
ip route add default via 192.168.56.104 dev eth0

To make it persistent, edit the file corresponding to eth0 under /etc/sysconfig/network-scripts/, or add your own script, then restart the network service with systemctl restart network.

Configuring a linux system as router:

# now let's configure machine 192.168.56.104 as a router
vim /etc/sysctl.conf
# add this line to enable ipv4 forward
net.ipv4.ip_forward=1
# reload
sysctl -p

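On one project I needed to look at performance in the DataStage Ops Console, but the OpenShift worker nodes were not directly reachable from outside and could only be reached by routing through the infra node. So I first exposed the service with a nodePort, then mapped the port on the infra node to the corresponding worker node port, and finally applied MASQUERADE for the outgoing side.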

# this is operating on nat iptables
# run in infra node
# DNAT: destination nat
iptables -t nat -A PREROUTING -p tcp --dport 32160 -j DNAT --to-destination <worker private IP>:32160
iptables -t nat -A POSTROUTING -j MASQUERADE
iptables -t nat -nvL

Allowing access to the internet via NAT, so traffic can get back to private network.

Note that this routing section does not involve the firewall yet; the firewall is inactive.

# -t nat: working on nat table
# -A POSTROUTING: appending to post routing chain
# -o eth0: outbound via eth0, eth0 connects to internet
# -j MASQUERADE: jump to the MASQUERADE target

# not persistent, see iptables section below
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

Then, if you check iptables -t nat -nvL, you will see the new rule appended to the POSTROUTING chain.

Firewall

Many Linux systems actually rely on iptables to implement the firewall (see the next section); the firewalld service also modifies iptables behind the scenes.

Implement packet filtering (both iptables and firewalld can do this).

Firewall zone: a concept for managing incoming traffic more transparently. Zones are bound to network interfaces or assigned a range of source addresses. You manage the firewall rules for each zone independently.

The configuration commands have a form similar to kubectl/oc.

systemctl start firewalld

# show default zone
firewall-cmd --get-default-zone
# show active zones, will see interfaces apply to it
firewall-cmd --get-active-zones
# show available zones
firewall-cmd --get-zones

# permanently remove interface eth0 from public zone
firewall-cmd --permanent --zone=public --remove-interface=eth0
# permanently add eth0 to external zone
firewall-cmd --permanent --zone=external --add-interface=eth0
# permanently add eth1 to internal zone
firewall-cmd --permanent --zone=internal --add-interface=eth1

# change default zone
firewall-cmd --set-default-zone=external
# after updating, restart to take effect
systemctl restart firewalld

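The rest of the course mainly covered firewall configuration: you can add or remove services, ports and so on for each zone. The default service definition files live in /usr/lib/firewalld/services, while service files you create yourself go in /etc/firewalld/services/.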

Iptables

iptables can also implement firewall functionality via the filter table.

There are currently five independent tables:

  • filter: This is the default table (if no -t option is passed). It contains the built-in chains INPUT (for packets destined to local sockets), FORWARD (for packets being routed through the box), and OUTPUT (for locally-generated packets).
  • nat: This table is consulted when a packet that creates a new connection is encountered. It consists of three built-ins: PREROUTING (for altering packets as soon as they come in), OUTPUT (for altering locally-generated packets before routing), and POSTROUTING (for altering packets as they are about to go out). IPv6 NAT support is available since kernel 3.7.
  • mangle: This table is used for specialized packet alteration.
  • raw: This table is used mainly for configuring exemptions from connection tracking in combination with the NOTRACK target.
  • security: This table is used for Mandatory Access Control (MAC) networking rules.

# list the 3 basic chains in the filter table: INPUT, FORWARD, OUTPUT
# INPUT: traffic comes in firewall
# FORWARD: traffic pass through firewall
# OUTPUT: traffic leaving firewall
iptables [-t filter] -L

# policy ACCEPT: the default policy is ACCEPT if no rule matches
# a built-in chain policy can only be ACCEPT or DROP; REJECT (which sends an ICMP error
# back to the sender) is available as a rule target, not as a policy
# by default, most systems won't have any rules
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

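Change default policies. Note: you can add your own rules to add functionality, but do not casually change the default ACCEPT policy. Otherwise, if something goes wrong, you will not be able to connect at all.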

# set default policy to DROP
# accept any traffic for INPUT and OUTPUT

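# rules work like the cases of a switch statement: matched from top to bottom, so order matters!
# -A: append
iptables -A INPUT -j ACCEPT
iptables -A OUTPUT -j ACCEPT
# setting the policy to DROP is safe here because of the ACCEPT rules just added above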
iptables -P INPUT DROP
iptables -P OUTPUT DROP
iptables -P FORWARD DROP

# accept any loopback traffic
# loopback traffic never leaves machine
# -i: in-interface
# -o: out-interface
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT

# -v: verbose,
# -n: numeric output
# --line-numbers: show rules index
iptables -nvL --line-numbers

# keep current traffic, for example, current ssh connection
iptables -A INPUT -j ACCEPT -m conntrack --ctstate ESTABLISHED,RELATED
iptables -A OUTPUT -j ACCEPT -m conntrack --ctstate ESTABLISHED,RELATED

# remove rule by index from --line-numbers
# -D: delete rule
# this removes the earlier ACCEPT-all rules, but the connection does not drop because of the conntrack ESTABLISHED rule
iptables -D INPUT 1
iptables -D OUTPUT 1

# at this point, no new traffic can come in or go out
# add filter rules to iptables firewall for inbound and outbound traffic
# others can ping me
iptables -A INPUT -j ACCEPT -p icmp --icmp-type 8
# I can ping others
iptables -A OUTPUT -j ACCEPT -p icmp --icmp-type 8
# others can ssh in
# add comment
iptables -A INPUT -j ACCEPT -p tcp --dport 22 -m comment --comment "allow ssh from all"

# I can access others
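# I was confused here at first: why is no INPUT rule for port 80 needed?
# the replies belong to an established connection, so the conntrack ESTABLISHED,RELATED rule above accepts them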
iptables -A OUTPUT -j ACCEPT -p tcp --dport 80
iptables -A OUTPUT -j ACCEPT -p tcp --dport 443
# DNS
iptables -A OUTPUT -j ACCEPT -p tcp --dport 53
iptables -A OUTPUT -j ACCEPT -p udp --dport 53
# NTP (uses UDP port 123)
iptables -A OUTPUT -j ACCEPT -p udp --dport 123

# save current config
# can edit in this output file
iptables-save > orgset
iptables-restore < orgset

# drop if not match
# put this last, otherwise everything is dropped right away; not needed if the default policy is already DROP
iptables -A INPUT -j DROP
# not acting as a router
iptables -A FORWARD -j DROP

# -I: insert
# insert this rule as the first line of the INPUT chain
iptables -I INPUT 1 -p tcp --dport 80 -j ACCEPT

# flush the rules in the given chain (all chains in the table if no chain is given)
iptables -F [chain name]
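
Why rule order matters, and why -I with index 1 differs from -A, can be illustrated with a toy model in Python. This is not how iptables is implemented: it only mimics first-match evaluation with a fallback policy, and `evaluate` and the rule dictionaries are invented for the sketch.

```python
def evaluate(rules, packet, policy="DROP"):
    """Return the target of the first matching rule, or the chain policy."""
    for match, target in rules:
        # A rule matches when every field it names equals the packet's value.
        if all(packet.get(field) == value for field, value in match.items()):
            return target
    return policy


rules = [
    ({"proto": "tcp", "dport": 22}, "ACCEPT"),
    ({"proto": "tcp"}, "DROP"),                  # catches all remaining TCP
    ({"proto": "tcp", "dport": 80}, "ACCEPT"),   # appended too late, never reached
]

http = {"proto": "tcp", "dport": 80}
print(evaluate(rules, {"proto": "tcp", "dport": 22}))  # ACCEPT
print(evaluate(rules, http))                           # DROP
print(evaluate(rules, {"proto": "udp", "dport": 53}))  # DROP (falls through to the policy)

# Equivalent of `iptables -I INPUT 1 ...`: put the rule at the top instead.
rules.insert(0, ({"proto": "tcp", "dport": 80}, "ACCEPT"))
print(evaluate(rules, http))                           # ACCEPT
```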

Now let's look at the iptables service: it is managed as a systemctl service, which is a more formal way to use it.

yum install -y iptables-services

Under /etc/sysconfig there are the iptables and iptables-config files. If the two values below are set to yes in iptables-config, the rules are saved automatically to the iptables file, which makes them easy to maintain.

# Save current firewall rules on stop.
# Value: yes|no, default: no
# Saves all firewall rules to /etc/sysconfig/iptables if firewall gets stopped
# (e.g. on system shutdown).
IPTABLES_SAVE_ON_STOP="yes"

# Save current firewall rules on restart.
# Value: yes|no, default: no
# Saves all firewall rules to /etc/sysconfig/iptables if firewall gets
# restarted.
IPTABLES_SAVE_ON_RESTART="yes"

Monitoring Network

Measure network performance, bottleneck

# shows the IPs along the path, for example to check whether a VPN route is the expected one
tracepath www.google.com

traceroute vs tracepath: https://askubuntu.com/questions/114264/what-are-the-significant-differences-between-tracepath-and-traceroute Some traceroute options need root privilege, and traceroute has more features than tracepath.

Display network status

# shows how many errors and dropped packets there are, to tell whether the network has a problem
ip -s -h link
ip -s -h link show eth0

netstat command can also do the same thing.

netstat -i

Kernel Interface table
Iface   MTU    RX-OK     RX-ERR RX-DRP RX-OVR TX-OK    TX-ERR TX-DRP TX-OVR Flg
eth0    1500   16008695  0      5      0      8446165  0      0      0      BMRU
eth1    1500   461914    0      12     0      35082    0      0      0      BMRU
lo      65536  277761    0      0      0      277761   0      0      0      LRU
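
If you want to flag interfaces with errors or drops automatically, the table is easy to parse. The Python below is a sketch that works on the sample output above (whitespace-separated columns, header on the first table row); `interfaces_with_problems` is a hypothetical helper, and the column layout may differ between netstat versions.

```python
def interfaces_with_problems(table: str) -> dict:
    """Return {iface: {counter: value}} for non-zero error/drop counters."""
    lines = [l for l in table.splitlines() if l.strip()]
    header = lines[0].split()
    problems = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        bad = {
            name: int(row[name])
            for name in ("RX-ERR", "RX-DRP", "TX-ERR", "TX-DRP")
            if int(row[name]) > 0
        }
        if bad:
            problems[row["Iface"]] = bad
    return problems


sample = """Iface MTU RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 16008695 0 5 0 8446165 0 0 0 BMRU
eth1 1500 461914 0 12 0 35082 0 0 0 BMRU
lo 65536 277761 0 0 0 277761 0 0 0 LRU
"""

print(interfaces_with_problems(sample))
# {'eth0': {'RX-DRP': 5}, 'eth1': {'RX-DRP': 12}}
```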

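The course also introduced the sysstat package, which has to be installed with yum; once installed, it collects daily historical system data that you can review. It is also an important system monitoring tool. Another command is nmap, used to scan ports: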

yum install -y nmap
# scan which ports are open on a host (scanme.nmap.org is the Nmap project's public test target)
nmap scanme.nmap.org
# list interface and route information
nmap --iflist

Can use ss command (similar to netstat) to show listening tcp ports:

# show listening ipv4 tcp sockets in numeric format
ss -ltn -4

# *:* means listening from any address and any port
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 64 *:2049 *:*
LISTEN 0 128 *:36168 *:*
LISTEN 0 128 *:111 *:*

# list current active connections
ss -t

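# 9.30.166.179:ssh is the local sshd endpoint; the peer 9.160.91.147:62991 is the client (my Mac) that ssh'd into this host
# the State here is ESTAB; if the handshake gets no reply, it shows SYN-SENT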
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 128 9.30.166.179:ssh 9.160.91.147:62991
ESTAB 0 0 9.30.166.179:54556 54.183.140.32:https

Network Basic

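This part mainly walks through the basic concepts by running experiments. I used VirtualBox to set up the lab; running wireshark and tcpdump inside the virtual machines gives a clean view with no unrelated traffic. The lab can have one primary and two secondary machines: the primary can reach the outside world (Adapter 1 set to NAT, Adapter 2/3 set to Internal Network), and the secondaries can reach the primary and, through it, the outside (each one's Adapter 1 set to Internal Network, connected to the primary's Internal Network). After that you can run all kinds of ip, route and iptables experiments.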

Network topology: LAN, WAN (bus, star, ring, full mesh)
Network devices: adapter, switch, router, firewall
OSI model

subnetting: a logically grouped collection of devices on the same network
subnet mask: network portion / host portion
special addresses: network address (all 0s in the host portion), broadcast (all 1s in the host portion), loopback 127.0.0.1
classful subnets: class A/B/C, they are inefficient

VLSM: variable length subnet mask, for example x.x.x/25
NAT: one-to-one or many-to-one mapping
ARP: address resolution protocol (IP -> MAC), broadcasts on the local segment to find who has the MAC for a particular IP
DNS: maps hostname to IP, mostly over UDP
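
The subnet terms above can be checked with a short Python sketch using the standard `ipaddress` module. The /25 network here is an arbitrary example chosen for illustration.

```python
import ipaddress

net = ipaddress.ip_network("192.168.1.128/25")

print(net.netmask)            # 255.255.255.128  (25 network bits, 7 host bits)
print(net.network_address)    # 192.168.1.128    (host portion all 0s)
print(net.broadcast_address)  # 192.168.1.255    (host portion all 1s)

# Usable host addresses exclude the network and broadcast addresses.
usable_hosts = net.num_addresses - 2
print(usable_hosts)           # 126

print(ipaddress.ip_address("127.0.0.1").is_loopback)  # True
```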

IP packet: can be fragmented and reassembled by routers and hosts. Fragmentation really hurts throughput because every IP fragment carries its own header. Also note that some IP encryption (VPN) adds extra length to the IP packet, which causes fragmentation.
TTL: time to live in the IP header; this is how traceroute works.

Routing Table:
  static: path defined by admin
  dynamic: path programmatically defined by routing protocol software, for example Quagga on Linux

TCP: connection oriented
  three-way handshake
  connection establishment/termination
  data transfer
  ports: a system can have more than one IP; ports are only unique per IP
  well-known ports: 0-1023
  flow control: maintained by the receiver
  congestion control: the sender slows down
  error detection and retransmission

UDP: send it and forget it
  DNS (dig, host commands)
  VoIP

  1. Set up an HTTP service on the server host:
yum install -y httpd
# if firewall is on
firewall-cmd --permanent --add-port=80/tcp
firewall-cmd --reload
# set page content
echo "hello world" > /var/www/html/index.html
systemctl enable httpd
systemctl start httpd
  2. Get the web page from another host:
wget http://<ip or hostname>/index.html
  3. Install tcpdump and wireshark on the other host:
yum install -y tcpdump wireshark wireshark-gnome
# if you have desktop in linux, start wireshark
wireshark &

Check the arp cache

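# in arp -a output, '?' is shown where the hostname could not be resolved;
# ip neighbor shows the entry state instead, for example REACHABLE or STALE
arp -a
ip neighbor
# delete an entry from the ARP cache
arp -d 192.168.1.1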

Specify the payload size and the number of pings:

# -c 1: ping once
# -s 1472: 1472 bytes of payload (not the total IP length: the 8-byte ICMP header and the 20-byte IP header are added)
# 1472 + 8 + 20 = 1500 just fits a 1500 MTU; anything larger exceeds it and the packet will be fragmented
ping -c 1 -s 1472 192.168.1.1
# -t set TTL
ping -c 2 -t 5 192.168.0.1
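
The size arithmetic behind -s 1472 can be written out in Python. This sketch assumes an IPv4 header without options (20 bytes) and the 8-byte ICMP echo header; the helper names are made up.

```python
IP_HEADER = 20    # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo request header


def ip_packet_size(payload: int) -> int:
    """Total IPv4 packet size for a ping with the given payload (-s value)."""
    return payload + ICMP_HEADER + IP_HEADER


def needs_fragmentation(payload: int, mtu: int = 1500) -> bool:
    """True if the resulting IP packet does not fit into one MTU."""
    return ip_packet_size(payload) > mtu


print(ip_packet_size(1472))        # 1500
print(needs_fragmentation(1472))   # False, exactly fills a 1500-byte MTU
print(needs_fragmentation(1473))   # True, one byte too many
```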

Create a large file to transfer:

# fast allocate file
# -l5G: length of file is 5G
fallocate -l5G test.bin
# then using scp to copy from network
scp ...
# you can check wireshark to see the tcp window scaling graph
# will see slow start and speed up

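Traffic control (tc) settings are used to simulate a bad network. For example, while transferring a file with scp, apply a tc rule that degrades performance and then remove it; you will see the transmission rate go up again. You can inspect the Wireshark window scaling graph and IO graph. See also: Linux 下 TC 命令原理及详解 (principles and details of the tc command on Linux).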

tc qdisc add dev eth1 root netem delay 3000ms loss 5%
# remove the above policy
tc qdisc del dev eth1 root

Let's look at the statistics. After performance recovers, the TCP congestion window grows quickly (Wireshark window scaling graph).

The IO graph shows the TCP window size and the update points.

Network Troubleshooting

Network is not reachable. For example, cannot ping through.

# check subnet and gateway, then
ip route
# check interface, state DOWN? NO-CARRIER? then
ip addr
# check MAC mapping in layer 2, then
arp -a
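# is layer 1 OK? does it report "Link detected: no"?
# note that virtual machines do not report this; only physical NICs do (I ran into this before)
# the port speed can also be checked here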
ethtool eth0

No route to host, for example during scp: go to the server host and check whether the port is open.

ss -lnt4

Use wireshark to look at the client side; you may find that it is a firewall issue and the port is blocked.
