The external provisioner can be backed by many types of filesystems; here we focus on nfs-client.
Notice that in this project you will see nfs-client and nfs directories: nfs-client means we already have an NFS server and consume it from the client side; nfs means we don't have an NFS server, but we export some other filesystem over NFS.
I found a bug in this project: when setting up RBAC, you need to specify -n test-1, otherwise the Role is created in test-1 but the RoleBinding is created in the default namespace.
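For example (rbac.yaml is a placeholder name; adjust it to the manifest file in the repo):

```bash
# apply the RBAC objects into the same namespace, so the Role and the
# RoleBinding end up together
kubectl apply -f rbac.yaml -n test-1
```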
The NFS provisioner is cluster-scoped.
The NFS_SERVER env in deployment.yaml can be a hostname or an IP address.
If several pods use the same PVC, they share the same PV.
You can customize the StorageClass if needed. For example, set the reclaim policy to Retain instead of Delete; see the doc:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
## default is delete
reclaimPolicy: Retain
## allow resizing the volume by editing the corresponding PVC object
## cannot shrink
allowVolumeExpansion: true
volumeBindingMode: Immediate
```
//TODO
[ ] read official document
[ ] udemy course
The course mainly covers Anthos; since service mesh is a key component of it, a lot of the content is about service mesh, and it is covered well.
Qwiklabs and slides are from the PluralSight Anthos special.
The service mesh labs are worth revisiting to recall how the operations are done in GCloud. The slides can also be downloaded.
Istio is an implementation of a service mesh that improves application resilience as you connect, manage, and secure microservices. It provides operational control and performance insights for a network of containerized applications, and it can work across environments (think about Google Anthos)!
A service mesh decouples important network functions (listed below) from the applications:
Istio uses Envoy and the sidecar pattern in Kubernetes pods.
Istio main components:
Pilot: the control plane that manages the distributed proxies across environments and pushes service communication policies to them, much like a software-defined network.
service discovery
traffic management
intelligent routing
resiliency
Mixer: collects info and sends telemetry, logs, and traces to your system of choice (Prometheus, InfluxDB, Stackdriver, etc.)
Citadel: policy management, service-to-service auth[n,z] using mutual TLS, credential management.
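As a rough illustration of the kind of routing policy Pilot distributes, here is a sketch of an Istio VirtualService doing a weighted traffic split (the service name and subsets are hypothetical, and a DestinationRule defining the v1/v2 subsets is assumed to exist):

```bash
kubectl apply -f - <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service-b              # hypothetical service
spec:
  hosts:
    - service-b
  http:
    - route:
        - destination:
            host: service-b
            subset: v1         # assumes a DestinationRule defines these subsets
          weight: 90
        - destination:
            host: service-b
            subset: v2
          weight: 10
EOF
```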
How does Istio work, for example, life of a request in the mesh:
service A comes up.
envoy is deployed with it and fetches service information, routing and configuration policy from Pilot.
If Citadel is being used, TLS certs are securely distributed as well.
service A calls service B.
client-side envoy intercepts the call.
envoy consults config to know how/where to route call to service B.
envoy forwards to appropriate instance of service B, the envoy on server side intercepts the request.
server-side envoy checks with Mixer to validate the call should be allowed.
server-side envoy forwards request to service B for response.
envoy forwards response to the original caller, the response is intercepted by envoy on the caller side.
envoy reports telemetry to Mixer, which in turn notifies appropriate plugins.
client-side envoy forwards response to service A
client-side envoy reports telemetry to Mixer, which in turn notifies appropriate plugins.
The charger hub does not charge batteries in parallel; it charges them one by one.
Note that the antennas work best with their flat sides facing the drone, which means the antennas should be at an angle to the display.
[x] Multiple flight modes: turn on
[x] Return to home (the take-off point) altitude: 30 ~ 60 m (sets the altitude for autonomous return)
[x] Turn off beginner mode
[x] Calibrate the IMU with the drone folded (suggested once out of the box); calibration runs through several orientations
[x] Enable visual obstacle avoidance, and others in advanced settings
[x] Aircraft battery settings: 15% threshold, RTH
[x] Gimbal settings: gimbal auto calibration
[x] file index mode: continuous
[x] set center point: cross
When I was working on securing a Docker registry, I followed the instructions, but when running docker push I always got the x509: certificate signed by unknown authority error, which means the self-signed certificate is not recognized by the Docker daemon.
To get more detailed information, check the Docker daemon log:
- Ubuntu (old, using upstart): /var/log/upstart/docker.log
- Ubuntu (new, using systemd): sudo journalctl -fu docker.service
- Amazon Linux AMI: /var/log/docker
- Boot2Docker: /var/log/docker.log
- Debian GNU/Linux: /var/log/daemon.log
- CentOS: /var/log/daemon.log | grep docker
- CoreOS: journalctl -u docker.service
- Fedora: journalctl -u docker.service
- Red Hat Enterprise Linux Server: /var/log/messages | grep docker
On Red Hat, the /var/log/messages file clearly shows that the Docker daemon picks up certificates under the /etc/docker/certs.d/<domain, no port number!> folder.
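Based on that, a minimal sketch of placing the self-signed CA where the daemon looks (the registry domain is a placeholder; match the directory name to what the log shows):

```bash
# put the CA cert in the per-registry trust directory
sudo mkdir -p /etc/docker/certs.d/myregistry.example.com
sudo cp ca.crt /etc/docker/certs.d/myregistry.example.com/ca.crt
```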
If your OS is using systemd, the journalctl command helps, but container output is also dumped there; see this issue: https://github.com/moby/moby/issues/23339.
This is all about securing servers with SSL/TLS certificates, from the Udemy course SSL Complete Guide.
The quality of SSL setups varies; having SSL set up doesn't mean your site is well secured. HTTPS may be misconfigured or sub-optimal. You can test it here:
https://www.ssllabs.com/index.html
If you click the lock icon at the left of the website address, it shows whether the connection is secure, plus its certificates, cookies, and so on. Click further into the certificate view and you will see the root CA, intermediate CA, and the certificate itself.
To install Wireshark on Mac, download the stable dmg package and double-click to install.
You can use Chrome DevTools -> Network to see traffic, or use Wireshark.
For example, from Network select one item and check the HEADERS information to get the IP address of the remote server, or just use the host or nslookup commands to get the IP address.
Interestingly, from Network I see that Chrome sometimes uses an IPv6 address to talk to the server, for example for Facebook and some other sites. See this question.
For the concrete commands, see <<Set up Secure Docker Registry Container>>.
To automate this process, e.g. automatically requesting certificates for a site, renewing them, and revoking them, you need automation tools such as certbot, or cert-manager in K8s; see the next chapter:
tls.crt is the certificate. It is a PEM formatted file; PEM means Privacy Enhanced Mail (concatenated certificate container files), and it can have different extensions such as tls.cert, tls.cer, tls.pem, etc. PEM is a container format that may include just the public certificate, or an entire certificate chain including public key, private key, and root certificates. Confusingly, it may also encode a CSR. For example, here it contains 2 certificate blocks:
symmetric encryption: the same key is used by both sides, for example AES. This is the kind of algorithm used within SSL for the HTTPS data.
asymmetric encryption, for example: RSA.
Hash
How does hash work to verify data integrity:
```
(sender) data + hash(data)  --------->  data + hash(data) (receiver)
                                            |
                                            |--hash--> (compare if they are the same)
```
Notice that in databases the password is stored hashed, not as plain text.
Hash algorithms: MD5, SHA…
MD5: 128 bits, e.g. echo 123 | md5
For SHA, use the shasum command on Linux or an online tool.
SHA-1: 160 bits
SHA-256: 256 bits
SHA-512: 512 bits
```bash
## SHA-256
shasum -a 256 -t test.txt
```
HMAC: can be used with md5 or sha. In cryptography, an HMAC (sometimes expanded as either keyed-hash message authentication code or hash-based message authentication code) is a specific type of message authentication code (MAC) involving a cryptographic hash function and a secret cryptographic key. It may be used to simultaneously verify both the data integrity and the authenticity of a message, as with any MAC. Any cryptographic hash function, such as SHA-256 or SHA-3, may be used in the calculation of an HMAC.
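For a quick hands-on check, openssl can compute an HMAC directly (the key here is just an example):

```bash
# HMAC-SHA256 over test.txt with a secret key
openssl dgst -sha256 -hmac "mysecretkey" test.txt
```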
Asymmetric Keys
Encryption
```
data -------------> code =========> code -------------> data (owner side)
      public key                          private key
      encryption                          decryption
```
Usually (but not necessarily) the keys are interchangeable, in the sense that if key A encrypts a message, then key B can decrypt it, and if key B encrypts a message, then key A can decrypt it. While common, this property is not essential to asymmetric encryption.
Signature
```
(signer)                                   (verifier)
data ----(hash)----> hash value            hash value <----(hash)---- data
                         |                     /|\  compare
            private key encrypt         public key decrypt
                        \|/                     |
data + encrypted hash  ==========>  data + encrypted hash
```
Signing ensures the data was sent by the owner of the private key and has not been modified in between.
Public key infrastructure (PKI) is a set of roles, policies, hardware, software, and procedures needed to create, manage, distribute, use, store, and revoke digital certificates and manage public-key encryption. The purpose of a PKI is to facilitate the secure electronic transfer of information for a range of network activities such as e-commerce, internet banking, and confidential email.
Certificate
A file with some contents:
certificate owner
certificate issuer
signature (RSA created, made by issuer)
public key (from the owner; this is the public key later used for HTTPS)
Self-signed certificate: issued and signed by the owner.
The basic rule is: we trust the CA (the issuer), and therefore we trust the certificate owner.
Why do we need intermediate CAs?
There are not many public root CAs because of the problem of trust. Actually, anybody can create their own root CA, but nobody will trust it. That's why there is a limited set of global root CAs that are trusted worldwide by operating systems and browsers. You can view the list of such global CAs with their root certificates in any browser or OS.
Such root CAs have certificates with long validity periods, and their main responsibility is simply to create a "source of trust". That's why they don't issue certificates to end users: it avoids extra work and minimizes the risk that their private keys will be compromised. Intermediate CA certificates don't necessarily need to be in the OS or browser's list of trusted certificates; they simply need to be issued by a trusted root CA.
Chain of Trust
Let's look at the openssl commands to generate an RSA private key and public key.
(Self-signed certificates are not covered here; see below.)
```bash
## check help for sub-command genrsa
openssl genrsa -h

## generate private.pem (private key) protected with aes256 encryption
## will ask you to input a pass phrase
## note: private.pem actually contains the public key information as well!
openssl genrsa -aes256 -out private.pem

## generate the public key from the above private key
## will ask for the pass phrase of the private key
openssl rsa -in private.pem -outform PEM -pubout -out public.pem
```
Root CAs in OS
How does web browser trust the root CAs and certificates?
The OS ships with a list of trusted certificates; on Mac, search for Keychain Access.
In Linux, see this link: on Red Hat/CentOS, all trusted certificate authorities are bundled in /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt. Just add your new certificate authority file(s) to the directory /etc/pki/ca-trust/source/anchors, then run /bin/update-ca-trust to update the certificate authority bundle.
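A minimal sketch of that procedure (the CA file name is an example):

```bash
# trust a custom CA system-wide on Red Hat/CentOS
sudo cp my-ca.crt /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust
```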
CSR: certificate signing request. (The root CA receives a CSR from the intermediate CA; the intermediate CA's signature is made by the root CA, and the root CA also provides the issuer info for the intermediate CA. Similarly, the end-entity certificate is signed by the intermediate CA.)
The web server (the end entity) sends you its certificate and all intermediate certificates. Then on your side the verification process starts from the end-entity certificate and works back up to the top intermediate certificate, then the root certificate.
How the verification works: to verify a certificate, a browser will obtain a sequence of certificates, each one having signed the next certificate in the sequence, connecting the signing CA's root to the server's certificate.
Before creating a certificate, you need a CSR, and before the CSR, you first need to generate asymmetric keys (because the certificate needs to include the signature from upstream as well as your public key).
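For example, a sketch that generates the key pair and the CSR in one shot (file names and subject values are placeholders):

```bash
# create a new RSA key and a CSR for it; without the -x509 flag, the output is
# a signing request rather than a self-signed certificate
openssl req -newkey rsa:4096 -nodes \
  -keyout server.key -out server.csr \
  -subj "/C=US/ST=CA/L=San Jose/O=Company Name/OU=Org/CN=example.com"
```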
Choose the common name (CN) according to the main domain where the certificate will be used (for example, in the secure docker registry, the CN is the registry address). Actually, CN is deprecated and the Subject Alternative Name (SAN) is used instead.
What is the /etc/ssl/certs directory? Actually it is a symlink to /etc/pki/tls/certs.
```bash
# this is a CN certificate; if you want a SAN (Subject Alternative Name), see below
openssl req \
  -newkey rsa:4096 -nodes -x509 -sha256 \
  -keyout key.pem -out cert.pem -days 365 \
  -subj "/C=US/ST=CA/L=San Jose/O=Company Name/OU=Org/CN=<domain>"
```
-nodes: short for No DES; use it if you don't want to protect your private key with a passphrase.
Add -subj '/CN=localhost' to suppress questions about the contents of the certificate (replace localhost with your desired domain).
For anyone else using this in automation, here’s all of the common parameters for the subject: -subj "/C=US/ST=CA/L=San Jose/O=Company Name/OU=Org/CN=<domain>"
Remember to use -sha256 to generate SHA-256-based certificate.
Note that sometimes the cert file may contain multiple CERTIFICATE blocks, including the intermediate CA. When constructing a tls secret in Kubernetes, just use the file directly.
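A quick sketch of building that secret (file and secret names are examples):

```bash
# cert.pem may contain the server certificate followed by intermediate CA blocks;
# kubectl accepts the concatenated file as-is
kubectl create secret tls my-tls-secret --cert=cert.pem --key=key.pem
```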
To generate a SAN certificate, see this post; the key point is how to construct the san.cnf file for openssl. SAN is mainly about using one certificate for multiple names; note this differs from a wildcard TLS certificate, which is mainly about covering subdomains.
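A sketch of what such a san.cnf and the matching openssl invocation might look like (domains and file names are examples):

```bash
cat > san.cnf <<'EOF'
[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no
[req_distinguished_name]
CN = example.com
[v3_req]
subjectAltName = @alt_names
[alt_names]
DNS.1 = example.com
DNS.2 = www.example.com
EOF

# self-signed certificate carrying both names from the SAN section
openssl req -x509 -newkey rsa:4096 -nodes -sha256 -days 365 \
  -keyout san.key -out san.crt -config san.cnf
```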
SSL/TLS and HTTPS
Both are cryptographic protocols used in HTTPS:
SSL: Secure Socket Layer
TLS: Transport Layer Security.
Difference between SSL and TLS: TLS is the updated, more secure version of SSL.
https://www.globalsign.com/en/blog/ssl-vs-tls-difference/
It's important to note that certificates are not dependent on protocols. You sometimes hear "SSL/TLS certificate"; it may be more accurate to call them certificates for use with SSL and TLS, since the protocols are determined by your server configuration, not by the certificates themselves.
Go to SSL Labs to check which version of TLS the web server uses: input the web server address, scan, then click the IP icon.
Why is RSA not used for bulk data encryption?
it is too slow.
bi-directional data encryption would require RSA key pairs on both sides.
We encrypt data with a symmetric key after the secure connection is set up.
Why would RSA key pairs be needed on both sides? Because the keys are interchangeable: everyone who has the public key can decrypt data encrypted with the private key!
the web server sends its certificate (intermediate and others) to the browser
the browser generates a symmetric key, secures it with the server's public key, and sends it to the server; alternatively, Diffie–Hellman key exchange is used.
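You can also watch this from the command line; openssl s_client prints the certificate chain the server sends plus the negotiated protocol and cipher suite (the host is just an example):

```bash
# dump the TLS handshake details for a site
openssl s_client -connect en.wikipedia.org:443 -servername en.wikipedia.org </dev/null
```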
Let’s see wireshark for wikipedia connection:
The top 3 packets are the TCP handshake; then you see the TLS Client Hello, then the TLS Server Hello.
In the Client Hello there is a lot of information to negotiate with the server; here you see the supported TLS versions and the cipher suites.
In the Server Hello, you see the server has selected one of the cipher suites. TLS_ECDHE.._SHA256 means it uses Diffie–Hellman key exchange and SHA-256 as the hash.
–>> How Does SSL Work?
The main use case for SSL/TLS is securing communications between a client and a server, but it can also secure email, VoIP, and other communications over unsecured networks.
–>> TLS handshakes
During a TLS handshake, the two communicating sides exchange messages to acknowledge each other, verify each other, establish the encryption algorithms they will use, and agree on session keys.
SSL handshakes are now called TLS handshakes, although the “SSL” name is still in wide use.
A TLS handshake also happens whenever any other communications use HTTPS, including API calls and DNS over HTTPS queries.
–>> How does a proxy handle TLS handshakes (remember Envoy?)
HTTPS knows how to tunnel the TLS handshake even through the proxy.
In other words, TLS/SSL through a proxy is implemented with an HTTP CONNECT tunnel; in particular, have a look at the second answer and its comments:
So, the proxy is not MITM’ing the HTTPS connection, by replacing the server’s certificate with its own - it’s simply passing the HTTPS connection straight through between the client and the server. Is that right?
Normally, when HTTPS is done through a proxy, this is done with the CONNECT mechanism: the client talks to the proxy and asks it to provide a bidirectional tunnel for bytes with the target system. In that case, the certificate that the client sees is really from the server, not from the proxy. In that situation, the proxy is kept on the outside of the SSL/TLS session – it can see that some SSL/TLS is taking place, but it has no access to the encryption keys.
Diffie–Hellman
Diffie–Hellman uses one-way function, for example, mod operation.
See, here a and b are the private keys on each side, while g and p are public parameters. A and B are the mod results, and K is the final value that both sides arrive at and can use to encrypt the data.
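A toy run of the math with tiny numbers (illustrative only; real DH uses very large primes):

```bash
p=23; g=5               # public parameters
a=6;  b=15              # private keys, never sent over the wire
A=$(( (g ** a) % p ))   # Alice sends A = g^a mod p
B=$(( (g ** b) % p ))   # Bob sends B = g^b mod p
K1=$(( (B ** a) % p ))  # Alice computes the shared key
K2=$(( (A ** b) % p ))  # Bob computes the same shared key
echo "$K1 $K2"          # both sides arrive at the same K
```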
Elliptic-curve cryptography is used in Diffie–Hellman.
Custom Domain
Purchase a custom domain and use free hosting to set up our website.
This summary mainly comes from the Network chapter of the LPIC-1 course on PluralSight, plus the LFCE Advanced Networking training. Some iptables content from YouTube was added later.
Environment: CentOS 7 Enterprise Linux or Red Hat.
Frequently Asked Question:
What happens when you hit a URL in the browser?
```bash
# query and change the system hostname and related settings
hostnamectl
```

```
   Static hostname: halos1.fyre.xxx.com
         Icon name: computer-vm
           Chassis: vm
        Machine ID: f7bbe4af93974cbfa5c55b68c011d41c
           Boot ID: 4e30e7107fa441a9b3ad70d0b784782d
    Virtualization: kvm
  Operating System: Red Hat Enterprise Linux Server 7.6 (Maipo)
       CPE OS Name: cpe:/o:redhat:enterprise_linux:7.6:GA:server
            Kernel: Linux 3.10.0-957.10.1.el7.x86_64
      Architecture: x86-64
```

```bash
# show the DNS domain name
# unless the host has a DNS domain name configured, this prints nothing,
# which is the default on most systems
dnsdomainname
```
```bash
# this will not be persistent
# the static hostname is still unchanged, but the transient hostname is xxx.example.com
# you can see the transient name via hostnamectl
hostname xxx.example.com

# this will be persistent in
# /etc/hostname
hostnamectl set-hostname xxx.example.com

# set a pretty hostname which includes '
# /etc/machine-info
hostnamectl set-hostname "xxx'ok.example.com"
```
Notice that the order of entries we add in the /etc/hosts file is important!
Put the fully qualified hostname first, then the aliases; otherwise things break in some scenarios! See the example below.
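For example (the address and names are illustrative):

```bash
# FQDN first, aliases after
echo "192.168.1.10  host1.example.com  host1" >> /etc/hosts
```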
Besides the local hosts file, let's look at DNS settings; I have a blog post that covers this.
The dig command (DNS lookup utility) is used to check responses and look up hostnames against a DNS server.
```bash
# use default dns server
# -t A: type A record
dig www.pluralsight.com -t A
# use a specified dns server, for example google dns server 8.8.8.8
dig www.pluralsight.com @8.8.8.8 -t A
```

```
# server now is 8.8.8.8
;; Query time: 60 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sun Apr 12 13:03:48 PDT 2020
;; MSG SIZE  rcvd: 132
```
Add the +short option to return the IP address only:
```bash
# only show resolved output
dig +short www.pluralsight.com @8.8.8.8
```
How to check a DNS record's TTL: you can set a TTL on a DNS record that defines how long a resolver is supposed to cache the DNS query before it expires. TTL is typically used to reduce the load on your authoritative name servers and to speed up DNS queries for clients.
```bash
# A is the type, query the local dns resolver
dig A google.com

# other types
# AAAA: ipv6
dig AAAA google.com
# canonical name
dig cname google.com

# get the authoritative dns server
# NS: name server
dig +short NS google.com
# query the authoritative dns server directly
dig A google.com @ns1.google.com.

# only show the ttl
dig +nocmd +noall +answer +ttlid A google.com
# human-readable
dig +nocmd +noall +answer +ttlunits A google.com
```
Network services
04/12/2020: so far I have only inspected these settings, not changed them.
Display and set IP address
```bash
ip -4 addr
ip addr show eth0
# not persistent
ip addr add 192.168.1.50/24 dev eth0
```
I haven't fully understood the concrete usage of these settings.
The NetworkManager tool can be used to make persistent changes so we don't lose them, though it is not a silver bullet and doesn't apply everywhere.
```bash
# check status
systemctl status NetworkManager
# if not active, start it
systemctl start NetworkManager

# nmcli command
# command-line tool for controlling NetworkManager
# show all connections
nmcli connection show
# pretty format
nmcli -p connection show eth0

# terminal graphical interface
nmtui
# then edit a connection, select the network interface,
# and configure the ipv4 address/gateway
systemctl restart network
```
The traditional network service is more flexible and common:

```bash
systemctl status network
```
The network configuration is read from scripts under /etc/sysconfig/network-scripts/.
After editing the ifcfg-xx file, bring down and up that interface:
```bash
ifdown eth0
ifup eth0
```
Routing
[ ] iptables vs routing tables: what is the difference, and when is each used? See this question and the diagram in the comments.
Display the routing tables:
```bash
# see below
ip r
# route and netstat make the meaning of each column clearer
# -n: displays the results as IP addresses only and does not attempt a DNS lookup
netstat -rn
# -e: display in netstat format
route -n [-ee]
```
Let's explain the host routing table (this machine is not a router). The column names are explained in man route, e.g. the meaning of the Flags letters.
The order of entries in the routing table does not matter; the longest prefix match always takes priority.
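You can ask the kernel directly which route wins for a given destination, which is a handy way to confirm the longest-prefix behavior (the address is an example):

```bash
# show the route (and interface/source) the kernel would pick
ip route get 192.168.9.5
```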
```bash
# In short, a routing table answers: for each destination, where is the exit
# (interface) and who is the next hop.

# Destination is the destination `network name` or `host name`.
# It is compared against the outgoing packet's destination IP ANDed with the
# (Genmask) mask.
# If it matches, the packet is sent out via Iface (interface).
# If multiple entries match after applying the mask, the longest matching
# destination wins.

# Gateway: the gateway address, e.g. 192.168.0.1; 0.0.0.0 means unspecified or
# `none`, sometimes shown as *.
# this assumes that the network is locally connected, as there is no intermediate hop.
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.0.1     0.0.0.0         UG    100    0        0 ens4
# note this one is a host IP, not a network name
192.168.0.1     0.0.0.0         255.255.255.255 UH    100    0        0 ens4
192.168.9.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0
```
Compare with the ip r command; the output looks a bit different:
```bash
# proto [type]: routing protocol identifier of this route
# scope link: communication is allowed within the device's own network segment
# over this link

# default gateway
default via 192.168.0.1 dev ens4 proto dhcp metric 100
192.168.0.1 dev ens4 proto dhcp scope link metric 100
# docker0
192.168.9.0/24 dev docker0 proto kernel scope link src 192.168.9.1
```

```bash
# this command is not persistent
# default can also be written as a network, e.g. 192.168.1.0/24
ip route add default via 192.168.56.104 dev eth0
```
To make it persistent, edit the corresponding file (e.g. for eth0) under /etc/sysconfig/network-scripts/, or add your own script, then restart the network with systemctl restart network.
Configuring a Linux system as a router:

```bash
# now let's configure machine 192.168.56.104 as a router
vim /etc/sysctl.conf
# add this line to enable ipv4 forwarding
net.ipv4.ip_forward=1
# reload
sysctl -p
```

```bash
# this operates on the nat iptables table
# run on the infra node
# DNAT: destination nat
iptables -t nat -A PREROUTING -p tcp --dport 32160 -j DNAT --to-destination <worker private IP>:32160
iptables -t nat -A POSTROUTING -j MASQUERADE
iptables -t nat -nvL
```
Allowing access to the internet via NAT, so traffic can get back to private network.
Note that this routing part does not yet involve the firewall; the firewall is inactive.
```bash
# -t nat: working on nat table
# -A POSTROUTING: appending to post routing chain
# -o eth0: outbound via eth0, eth0 connects to internet
# -j MASQUERADE: jump to MASQUERADE rule

# not persistent, see iptables section below
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
```
Then if you check iptables -t nat -nvL you will see the POSTROUTING chain with the new rule added.
Implement packet filtering (iptables and firewalld both can do this)
firewall zone: a concept for managing incoming traffic more transparently. Zones are connected to network interfaces or assigned a range of source addresses, and you manage the firewall rules for each zone independently.
```bash
# show default zone
firewall-cmd --get-default-zone
# show active zones, will see interfaces apply to it
firewall-cmd --get-active-zones
# show available zones
firewall-cmd --get-zones

# permanently remove interface eth0 from public zone
firewall-cmd --permanent --zone=public --remove-interface=eth0
# permanently add eth0 to external zone
firewall-cmd --permanent --zone=external --add-interface=eth0
# permanently add eth1 to internal zone
firewall-cmd --permanent --zone=internal --add-interface=eth1

# change default zone
firewall-cmd --set-default-zone=external
# after updating, restart to take effect
systemctl restart firewalld
```
filter: This is the default table (if no -t option is passed). It contains the built-in chains INPUT (for packets destined to local sockets), FORWARD (for packets being routed through the box), and OUTPUT (for locally-generated packets).
nat: This table is consulted when a packet that creates a new connection is encountered. It consists of three built-ins: PREROUTING (for altering packets as soon as they come in), OUTPUT (for altering locally-generated packets before routing), and POSTROUTING (for altering packets as they are about to go out). IPv6 NAT support is available since kernel 3.7.
mangle: This table is used for specialized packet alteration.
raw: This table is used mainly for configuring exemptions from connection tracking in combination with the NOTRACK target.
security: This table is used for Mandatory Access Control (MAC) networking rules
```bash
# list the 3 basic chains in the filter table: INPUT, FORWARD, OUTPUT
# INPUT: traffic coming into the firewall
# FORWARD: traffic passing through the firewall
# OUTPUT: traffic leaving the firewall
iptables [-t filter] -L

# policy ACCEPT: the default policy is ACCEPT if no specific rules match
# other policies: DROP, REJECT (sends an ICMP reject back to the sender)
# by default, most systems won't have any rules
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

# set default policy to DROP
# accept any traffic for INPUT and OUTPUT

# rules work like cases in a switch statement: matched top to bottom, order matters!
# -A: append
iptables -A INPUT -j ACCEPT
iptables -A OUTPUT -j ACCEPT
# setting the policy to DROP here is safe because we just appended the ACCEPT rules
iptables -P INPUT DROP
iptables -P OUTPUT DROP
iptables -P FORWARD DROP

# accept any loopback traffic
# loopback traffic never leaves the machine
# -i: in-interface
# -o: out-interface
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT

# -v: verbose
# -n: numeric data
# --line-numbers: show rule indexes
iptables -nvL --line-numbers

# keep current traffic, for example the current ssh connection
iptables -A INPUT -j ACCEPT -m conntrack --ctstate ESTABLISHED,RELATED
iptables -A OUTPUT -j ACCEPT -m conntrack --ctstate ESTABLISHED,RELATED

# remove a rule by its index from --line-numbers
# -D: delete rule
# this removes the earlier blanket ACCEPT rules; the current connection won't
# drop, thanks to the conntrack rule with ESTABLISHED
iptables -D INPUT 1
iptables -D OUTPUT 1

# at this point, no new traffic can come in or go out
# add filter rules to the iptables firewall for inbound and outbound traffic
# others can ping me
iptables -A INPUT -j ACCEPT -p icmp --icmp-type 8
# I can ping others
iptables -A OUTPUT -j ACCEPT -p icmp --icmp-type 8
# others can ssh in
# add a comment
iptables -A INPUT -j ACCEPT -p tcp --dport 22 -m comment --comment "allow ssh from all"

# I can access others
# I wondered here: why don't we need an INPUT rule for port 80?
# (the conntrack ESTABLISHED,RELATED rule lets the return traffic in)
iptables -A OUTPUT -j ACCEPT -p tcp --dport 80
iptables -A OUTPUT -j ACCEPT -p tcp --dport 443
# DNS
iptables -A OUTPUT -j ACCEPT -p tcp --dport 53
iptables -A OUTPUT -j ACCEPT -p udp --dport 53
# NTP
iptables -A OUTPUT -j ACCEPT -p tcp --dport 123
```
```bash
# save the current config
# you can edit this output file
iptables-save > orgset
iptables-restore < orgset

# drop if nothing matched
# put this rule last, otherwise everything is dropped immediately;
# it is unnecessary if the default policy is already DROP
iptables -A INPUT -j DROP
# not acting as a router
iptables -A FORWARD -j DROP
```
Under the /etc/sysconfig directory there are iptables and iptables-config files. If you set these two values to yes, iptables will save the config automatically into the iptables file, which is easy to maintain.
```bash
# Save current firewall rules on stop.
# Value: yes|no, default: no
# Saves all firewall rules to /etc/sysconfig/iptables if firewall gets stopped
# (e.g. on system shutdown).
IPTABLES_SAVE_ON_STOP="yes"

# Save current firewall rules on restart.
# Value: yes|no, default: no
# Saves all firewall rules to /etc/sysconfig/iptables if firewall gets
# restarted.
IPTABLES_SAVE_ON_RESTART="yes"
```
```bash
yum install -y nmap
# scan a target to see which ports are open (scanme.nmap.org is nmap's test host)
nmap scanme.nmap.org
# list interface and route information
nmap --iflist
```
You can use the ss command (similar to netstat) to show listening TCP ports:
```bash
# show listening ipv4 tcp sockets in numeric format
ss -ltn -4

# *:* means listening on any address and any port
State      Recv-Q Send-Q   Local Address:Port     Peer Address:Port
LISTEN     0      64                   *:2049                *:*
LISTEN     0      128                  *:36168               *:*
LISTEN     0      128                  *:111                 *:*

# list current active connections
ss -t

# 9.160.91.147 is my Mac, which is ssh'd into the current host
# the State here is ESTAB; if the handshake got no response it would show SYN-SENT
State      Recv-Q Send-Q   Local Address:Port     Peer Address:Port
ESTAB      0      128       9.30.166.179:ssh      9.160.91.147:62991
ESTAB      0      0         9.30.166.179:54556   54.183.140.32:https
```
Network topology: LAN, WAN (bus, star, ring, full mesh)
Network devices: adapter, switch, router, firewall
OSI model
subnetting: a logically grouped collection of devices on the same network
subnet mask: network portion / host portion
special address:
network address (all 0 in host portion)
broadcast (all 1 in host portion)
loopback 127.0.0.1
classful subnet: class A/B/C, they are inefficient
VLSM: variable length subnet mask, for example x.x.x.x/25
NAT: one to one, many to one map
ARP: address resolution protocol (IP -> MAC), broadcast on bus to see who has MAC for a particular IP
DNS: map hostname to IP, UDP protocol
IP packet: can be fragmented and reassembled by routers and hosts. Fragmentation actually hurts throughput, because every IP packet carries its own header. Also note that some IP encryption (VPN) adds extra length to the IP packet, causing fragmentation.
TTL: time to live in IP header, this is how traceroute works
Routing Table:
static: path defined by admin
dynamic: path programmatically defined, routing protocol software Quagga on Linux
TCP:
connection oriented: three way handshake
connection establishment/termination
data transfer
ports: system can have more than one IP, ports are only unique per IP
well-known ports: 0-1023
flow control: maintained by receiver
congestion control: the sender slows down
error detection and retransmission
UDP:
send it and forget it
DNS (dig, host commands)
VoIP
Set up an HTTP service on the server host:

```bash
yum install -y httpd
# if the firewall is on
firewall-cmd --permanent --add-port=80/tcp
firewall-cmd --reload
# set page content
echo "hello world" > /var/www/html/index.html
systemctl enable httpd
systemctl start httpd
```
```bash
yum install -y tcpdump wireshark wireshark-gnome
# if you have a desktop on linux, start wireshark
wireshark &
```
Check the ARP cache:

```bash
# '?' means stale
arp -a
ip neighbor
# delete an arp cache entry
arp -d 192.168.1.1
```
Specify the size of the data and the total number of pings:

```bash
# -c 1: ping once
# -s 1472: 1472 bytes of payload (this is not the total length of the IP packet;
# headers are appended, so it may exceed the 1500 MTU and the packet will be fragmented)
ping -c 1 -s 1472 192.168.1.1
# -t: set the TTL
ping -c 2 -t 5 192.168.0.1
```
Create a large file to transfer:
```bash
# fast allocate a file
# -l 5G: length of file is 5G
fallocate -l 5G test.bin
# then use scp to copy it over the network
scp ...
# check wireshark to see the tcp window scaling graph;
# you will see slow start and then the speed-up
```
Traffic control setting
Used to simulate a bad network. For example, while transferring a file with scp, apply a bad-performance tc policy and then remove it; you will see the transmission rate increase again. You can check the Wireshark window scaling graph and the IO graph.
Linux 下 TC 命令原理及详解
```bash
# add a 3000ms delay and 5% packet loss on eth1
tc qdisc add dev eth1 root netem delay 3000ms loss 5%
# remove the above policy
tc qdisc del dev eth1 root
```
Let's see the statistics:
After the performance recovers, the TCP congestion window size grows quickly:
This is the IO graph, which shows the TCP window size and update points:
Network Troubleshooting
The network is not reachable, for example ping does not get through.
```bash
# check subnet and gateway:
ip route
# check the interface: state DOWN? NO-CARRIER?
ip addr
# check the MAC mapping in layer 2:
arp -a
# is layer 1 ok? link detected: no?
# note: VMs don't have these statistics, only real NICs do (ran into this before)
# the port speed can also be checked here
ethtool eth0
```
No route to host: for example during scp; in that case go to the host server and check whether the port is open.
This blog post is for system design; please revisit it frequently to refresh. The notes are mainly from https://www.educative.io/ and YouTube channels.
Some system designs are mainly about combining functional components sensibly:
Design Instagram
Design Dropbox
Design Twitter
post tweets(photos, videos), follow others, favorite tweets
generate timeline of top tweets
low latency
highly available
consistency can take a hit
storage: text + photo + video
ingress (write): new generated storage / sec
egress (read): read volume / sec
read heavy system
data sharding: user id -> tweet id -> (creation time + tweet id, sort by time)
query all servers and aggregate
cache for hot users and tweets
Designing Twitter Search
Designing a Web Crawler (BFS, modular, url frontier, DNS, fetcher, DIS, content filter, extractor, url filter)
Designing Facebook Messenger
each chat server serves a group of users; the LB maps a user to their chat server, and chat servers communicate with each other to send/receive messages
message handling: long polling to receive message
a hashtable keeps track of online users; if the recipient is offline, notify the sender of the delivery failure
handle message ordering: timestamps alone are not enough; use a sequence number with every message for each user
database: needs to support high-frequency row writes/reads, quick small updates, and range-based search: HBase, a column-oriented key-value NoSQL database
partition by UserID, low latency
Some designs mainly involve data structures and algorithms:
Typeahead suggestion (trie, reference)
API rate limiter (dynamic sliding window)
Designing Facebook’s Newsfeed (offline feed generation)
contain updates, posts, video, photos from all people user follows
a user has 200 followers on average; with 300M DAU, fetching 5 times a day, and 1KB per post, we can estimate the traffic.
cache each user's news feed in memory for quick fetching.
feed generation:
retrieve, rank, store
offline generation by dedicated servers: Map<UserID, LinkedHashMap/TreeMap<PostID, PostItem>> + LastGenerateTime in memory; use an LRU cache for users, or find a user's activity pattern to help generate the newsfeed
feed publishing:
push to notify, pull for serving
Designing Yelp (querying, objects don’t change often, QuadTree)
Let me explain my understanding: the partitioning here is about partitioning the QuadTree.
Location IDs are read from the DB and mapped to different QuadTree servers via hashing (this mapping is effectively the QuadTree index, and can be used to rebuild a QuadTree server's data after it fails); each server then builds its own QuadTree. These QuadTree servers sit behind an aggregator server (which has its own replicas), so every request queries all QuadTree servers and aggregates the returned data. Each QuadTree server also keeps a local mapping for the location IDs it holds, to know which DB servers contain each location ID's info; this mapping is also implemented with hashing.
Designing Uber backend (requirements, objects do change often, QuadTree)
Design Ticketmaster (first come first serve, highly concurrent, financial transactions ACID)
CAP Theorem
CAP theorem states that it is impossible for a distributed software system to simultaneously provide more than two out of three of the following guarantees (CAP): Consistency, Availability, and Partition tolerance.
When we design a distributed system, trading off among CAP is almost the first thing we want to consider.
Thinking process
requirements clarification
back-of-the-envelope estimation: scale, storage, bandwidth.
system interface definition
defining data model
high level design
detailed design
identifying and resolving bottlenecks
Crucial Components
The notes here are organized around the following:
Database (book: 7 weeks 7 databases)
Cache system (redis, memcache)
Message queue (kafka && zookeeper or others)
Load balancer (nginx, Round Robin approach)
Log systems
monitor system
My domain of knowledge k8s, docker, micro-services
Key Characteristics of Distributed Systems
Scalability: scaling without performance loss (but actually will).
Reliability: keep delivering services when some components fail.
Availability: reliable means available, but not vice versa
Efficiency: latency and throughput (bandwidth).
Manageability: ease of diagnosing and understanding problems when they occur.
Commonly used technical knowledge
Ways of saying "backup":
Standby replicas
Failover to other healthy copies
Duplicates
Backup (spare)
Redundancy (redundant secondary copy)
NoSQL Database:
An Introduction To NoSQL Databases. Big Data: social networks, search engines; traditional methods of processing and storage are inadequate.
Key-value stores: Redis, Dynamo (redis can also be cache)
Advantage of NOSQL database:
no fixed data model (no pre-defined schema), unstructured data, easy to scale up and down (horizontal data sharding), high performance with big data.
Advantage of SQL database:
relational data, normalization (eliminate redundancy), SQL, data integrity, ACID compliance.
Consistent Hashing (with virtual replicas)
https://www.youtube.com/watch?v=ffE1mQWxyKM
Using the hash-mod strategy is not efficient: think about adding a new server, then an object that hashed to 20 % 3 = 2 now maps to 20 % 4 = 0. We would have to re-organize all the existing mappings, as the quick count below illustrates.
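A quick illustration of how badly plain mod hashing behaves when growing from 3 to 4 servers (assuming each object's hash is the object id itself):

```bash
# count how many of 100 objects land on a different server after adding one server
total=0; moved=0
for o in $(seq 0 99); do
  if [ $((o % 3)) -ne $((o % 4)) ]; then moved=$((moved+1)); fi
  total=$((total+1))
done
echo "moved $moved of $total objects"   # the vast majority get remapped
```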
For example, we have n servers.
Hash the request to get its location on the ring, find the server whose hash value is equal to or larger than it, and send the request to that server (moving clockwise). But the servers may not be distributed evenly on the ring, or the requests may not be uniform (so a server's load factor is not 1/n); to fix this we can use virtual replicas, which can be implemented with additional hash functions.
With consistent hashing, adding or removing servers does not cause much overhead. A newly added server takes over objects from its neighboring servers; when a server is removed, all of its objects move to the next server on the ring.
A side note: push mode also establishes a persistent connection, but the server pushes new information to the client as soon as it is available, regardless of the client's processing capacity, which is a drawback. Long polling is more flexible for the client (since the client requests first).
This is a variation of the traditional polling technique that allows the server to push information to a client whenever the data is available. With Long-Polling, the client requests information from the server exactly as in normal polling, but with the expectation that the server may not respond immediately (keep the connection connected). That’s why this technique is sometimes referred to as a Hanging GET.
Each long-poll request has a timeout. The client has to reconnect periodically after the connection is closed, due to a timeout or a disconnect from the server.
Typically, proxies are used to filter requests, log requests, or sometimes transform requests (by adding/removing headers, encrypting/decrypting, or compressing a resource). Another advantage of a proxy server is that its cache can serve a lot of requests.
open (forwarding) proxy: hide clients
reverse proxy: hide servers
Map Reduce
We can have a Map-Reduce (MR) setup. These MR jobs will calculate the frequencies of all searched terms in the past hour.
Exponential Moving Average (EMA)
In EMA, we give more weight to the latest data. It’s also known as the exponentially weighted moving average.
Some Design Bottlenecks
is data compression needed? how to choose?
capacity estimation: consider both metadata and content. The high-level estimations mainly include: storage per day, storage over the years, incoming bandwidth, and outgoing bandwidth. These mainly derive from: total users, daily active users (DAU), size of each request, how many entries each user produces, and data growth. Sometimes it is better to estimate a particular quantity separately.
read-heavy or write-heavy? bandwidth: ingress = newly generated data per second; egress = data viewed or downloaded by users per second.
which characteristics should the database have for the scenario? e.g. quick small updates, ACID, range-based search, etc.
also consider the peak-time read and write throughput.
handling hot users in the database: how to design the database to mitigate this problem.
we may need aggregator server for fetching and process data from different DB or caches.
monitoring system, collect metrics: daily peak, latency. we will realize if we need more replication, load balancing, or caching.
a load balancer can sit: between client and web server, between web server and application server (or cache), and between application server and database. A load balancer can be a single point of failure, so it needs redundancy to take over when the main one goes down.
load balancer: Round Robin approach, or more intelligent.
Note: The number of processors shown by /proc/cpuinfo might not be the actual number of cores on the processor. For example a processor with 2 cores and hyperthreading would be reported as a processor with 4 cores.
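A quick way to compare the two on Linux:

```bash
# logical processors (what /proc/cpuinfo lists)
grep -c ^processor /proc/cpuinfo
# sockets, cores per socket, and threads per core
lscpu | grep -E '^(CPU\(s\)|Thread|Core|Socket)'
```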
Adding to either end of a linked list does not require a traversal, as long as you keep a reference to both ends of the list. This is what Java does for its add and addFirst/addLast methods.
Same goes for parameterless remove and removeFirst/removeLast methods - they operate on list ends.
remove(int) and remove(Object) operations, on the other hand, are not O(1). They require traversal, so their cost is O(n).
When issuing k8s instructions from inside a container, we usually use the curl command (if the kubectl binary is on the container's execution path, you can use kubectl as well).
First you need the credentials and the API server information:
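A common sketch using the service account token mounted into every pod (these are the standard in-cluster paths):

```bash
# in-cluster API access via the mounted service account
APISERVER=https://kubernetes.default.svc
SA=/var/run/secrets/kubernetes.io/serviceaccount
TOKEN=$(cat ${SA}/token)
curl --cacert ${SA}/ca.crt -H "Authorization: Bearer ${TOKEN}" ${APISERVER}/api
```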