There is a better alternative; see <<VSC Developing inside a Container>>.
To use and manage third-party libraries without messing up the system Python environment, and to organize projects that each have their own unique dependencies:
pip: package management
virtualenv: isolated project environments and dependencies
virtualenvwrapper: makes virtualenv more convenient
Note: we are not talking about the kind of package that is a folder with __init__.py; we are talking about Python distribution packages.
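A distribution package is what pip installs, typically described by a setup.py. A minimal sketch (the project name here is made up for illustration):

# setup.py: a minimal distribution package definition
from setuptools import setup, find_packages

setup(
    name="example-pkg",        # hypothetical distribution name
    version="0.0.1",
    packages=find_packages(),  # picks up import packages (folders with __init__.py)
)

Running python -m pip install . (or pip install -e . for development) in that directory installs it like any other distribution package.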
Pip
Best practice:
always work inside a virtual environment, to keep things nice and clean.
be careful about running pip with sudo; that installs packages system-wide.
macOS's pre-installed Python is not meant for development; use Homebrew to install Python or download it from python.org, which comes with pip. On Linux, use the system package manager to install pip or Python. On macOS, first check whether pip is present and its version.
To install pip(2/3) on Linux:
# search the pip2 or pip3 package
sudo yum search python* | grep pip
# this will also install libtirpc,
# python3, python3-libs, and python3-setuptools
sudo yum install python3-pip
pip3 -V
pip 9.0.3 from /usr/lib/python3.6/site-packages (python 3.6)

pip2 -V
pip 9.0.3 from /usr/lib/python2.7/site-packages (python 2.7)
# local or global config info
# you will see the package repo
pip3 config [debug, edit, get, list, set, unset]

# search
pip3 search <package name>
# download a package into the current dir
pip3 download <package name>
# install; other dependencies are installed automatically
pip3 install <package name>

# list installed packages
pip3 list
# show outdated packages
pip3 list -o

# uninstall; will not uninstall its dependencies
pip3 uninstall <package name>

# show package info
# you will see the location where the package is installed
# and its source code url
pip3 show <package name>

# get help
pip3 help
pip actually fetches packages from the Python Package Index (or your own package repo):
https://pypi.org/
How to find what you need:
search keywords directly.
go to Browse projects -> Operating system -> Linux, then select other classifiers (but this is still a hard way to find exactly what is needed).
check the development status, and prefer packages marked production/stable.
Pip install from a specified repo:
# use an additional repo
pip install --extra-index-url '<repo url>' vault-client==0.0.4

# or set it via pip config set <key> <value>
pip config set global.extra-index-url '<repo url>'
pip config set global.timeout '10'
pip config set global.trusted-host 'registry.corp.xxx.com'
# then run
pip install vault-client==0.0.4
Virtualenv

# create a python3-based virtual env called rates_py3
virtualenv -p python3 rates_py3
virtualenv -p python3.8.4 rates_py3
# python2-based
virtualenv -p python2 rates_py2
# activate
cd rates_py3
# after this you will see a prefix in your prompt
# once activated, the environment applies no matter where you are!
. ./bin/activate
# check
python -V
pip -V
# you will see far fewer packages installed
pip list
# then start your work in your project folder...
# deactivate
deactivate
A similar tool is the built-in venv module; it may be pre-installed or may need a global pip install:
# python >= 3.3, may become more popular in the future
python3 -m venv <virtual env name>
To sync packages with colleagues, put the requirements file in version control to share and update:
# first activate the virtual environment
# dump the package list
python -m pip freeze > requirements.txt
# the version condition can be ==, !=, >=, <=

# create another virtual environment with the same python version as yours
# activate this new environment
# then run
python -m pip install -r requirements.txt
You can specify a version in pip install:
pip install flask==1.0.0
pip install 'Django<2.0'
# upgrade to the latest version
pip install -U flask

# upgrade pip itself
# take care not to overwrite the system pip by using sudo
pip install -U pip
How to manage the project and virtual environment?
Separate the project from the virtual environment! Keep them in different folders and just activate the venv when you use it. Usually one venv maps to one project, but if you need to test against multiple different environments, multiple venvs can map to one project.
Real-world example: when developing the flask framework, use setup.py with an editable pip install to install the package into the virtual environment, so you can edit the flask source code and the change is reflected in real time:
When would the -e, --editable option be useful with pip install?
git clone https://github.com/pallets/flask
# activate the virtual environment
# go to the root level of the flask directory
python -m pip install -e .
Now you have a development environment for flask.
You can also see the tox.ini file in the flask git repo; it is used for testing against different Python versions in different virtual environments.
Virtualenvwrapper
A user-friendly wrapper around virtualenv: easy creation and activation, and it can bind projects to virtualenvs.
# get path
which virtualenvwrapper.sh
/usr/local/bin/virtualenvwrapper.sh
# add the lines below to ~/.bashrc
# point virtualenvwrapper to python3 explicitly
# the path could be /usr/local/bin/python3, check your install
export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python3
# example path:
source /Library/Frameworks/Python.framework/Versions/3.10/bin/virtualenvwrapper.sh
# if you don't want to use the default virtual env home,
# use an absolute path
export WORKON_HOME="/home/<user>/virtualenvs"
# set the project home; mkproject will create project folders here
# use an absolute path
export PROJECT_HOME="/home/<user>/dev"
# enter or switch virtual environment
workon <venv name>

# create both a venv and a project
# if the project is bound to the venv, workon will auto-switch to the project folder
mkproject <pro name>
mkproject -p python3 <pro name>
mkproject -p python2 <pro name>

# create a venv only
mkvirtualenv <venv name>

# for an old project that is not bound to a venv:
# activate the venv, go to the old project folder, then run this to bind them
setvirtualenvproject

# remove a venv
# you need to manually remove the project folder if you want
rmvirtualenv <venv name>
What Go packages and modules are is explained here:
https://go.dev/doc/code
A package is a collection of source files in the same directory that are
compiled together. Functions, types, variables, and constants defined in one
source file are visible to all other source files within the same package.
A repository contains one or more modules. A module is a collection of related
Go packages that are released together. A Go repository typically contains only
one module, located at the root of the repository. A file named go.mod there
declares the module path: the import path prefix for all packages within the
module. The module contains the packages in the directory containing its go.mod
file as well as subdirectories of that directory, up to the next subdirectory
containing another go.mod file (if any).
An import path is a string used to import a package. A package’s import path
is its module path joined with its subdirectory within the module. For example,
the module github.com/google/go-cmp contains a package in the directory cmp/.
That package’s import path is github.com/google/go-cmp/cmp. Packages in the
standard library do not have a module path prefix.
// A module is a collection of related Go packages that are released together
// the first line of go.mod shows the module name

// the main package and func main are the entrypoint of the app
// the file name can be anything, not necessarily main.go
package main

import (
    . "fmt"                     // call fmt funcs directly without the fmt prefix
    err "errors"                // alias
    "os"
    "net/http"
    "regexp"
    "builtin"                   // no need to import
    _ "github.com/ziutek/mysql" // only run init() in that package
)

// Go auto-appends `;`, so keep `{` on the same line
func main() {
    // Println works without the fmt prefix thanks to the dot import
    Println("Hello world")
}
The package name in a Go file should be the same as the folder name.
## https://maelvls.dev/go111module-everywhere/
## in 1.15, it's equivalent to auto
## in >= 1.16, it's equivalent to on
GO111MODULE=""

## you can export or use `go env -w`
## GOROOT is set automatically
export GOROOT=/usr/local/go
export GOBIN=$GOROOT/bin

## only need to set GOPATH
## GOPATH is the project workspace
## it has src (user-created), plus pkg and bin (auto-created for you on build or install)
export GOPATH=$HOME/go
export PATH=$PATH:$GOBIN

## persistently set and unset
go env -w GOPATH=$HOME/go
go env GOPATH
go env -u GOPATH
Understand the run, build, install, and get subcommands. Pluralsight has a Go CLI Playbook course.
## for GO >= 1.16
## has a version suffix
## installs a binary without changing go.mod
go install sigs.k8s.io/kind@v0.9.0
go install sigs.k8s.io/kind@latest

## create a module
## link it with your code repo url
## generates go.mod
go mod init example.com/example
## add missing and remove unused modules
## you can edit the package versions in the require block
go mod tidy

# clean the mod cache
go clean --modcache

## link a dependency to a local path
## ../greetings is a local package at a relative path
## example.com/greetings is that package's module path
## see this tutorial:
## https://go.dev/doc/tutorial/call-module-code
go mod edit -replace example.com/greetings=../greetings
go mod tidy

# download dependencies for an off-line go build
# see this ticket:
# https://stackoverflow.com/questions/68544611/what-is-the-purpose-of-go-mod-vendor-command
go mod vendor

## GO >= 1.16, only for editing go.mod
## -x: verbose
## it will modify go.mod
## or you can edit go.mod manually
go get github.com/sirupsen/logrus@v1.8.0
## similar to python dir()/help()
## case-insensitive
go doc fmt
go doc net/http
go doc time.Since
go doc fmt.Println
## local API server, browse your own packages
godoc -http :8080

## formatter; VSC does this automatically when you save go files
go fmt <package>  # it runs gofmt -l -w

## test
## files with the _test.go suffix
go test [-v] [-run="Hello|Bye"] [-cover|-coverprofile]

## -n: dry run
go run -n main.go
## -work: print the $WORK temp folder
## you will see the intermediate files created
go run -work main.go
## run the main package
go run main.go
go run .

## build (compile), not install
go build hello
## it will create an executable in the bin folder
cd $GOPATH/bin
## the executable name is the same as the folder name
./hello

## the install dir is controlled by
## the GOPATH and GOBIN env vars
go install hello

# check docs from the CLI
go doc rand.Float64
go doc math.Sin
go doc strconv.Atoi
// var block
var (
    xx int = 34
    yy = 78.99
    zz = "123"
)
// short declaration
a := "hello"

// swap
b := "world"
a, b = b, a

// global variables do not support the short declaration
var GLOBAL int = 100
Constants
Use all upper case names.
String types
str1 := "abc"
str2 := `abc`
Type conversion
int()
Arithmetic operators
+, -, *, /, %, ++, --
Relational operators
<, >, <=, >=, ==, !=
Logical operators
&&, ||, !
Bitwise operators
&, |, ^, &^ (bit clear), <<, >>
If-else
The condition has no parentheses; there is no ternary operator.
// an if can also have an init statement, similar to a for loop
// the init can also be written outside
if err := file.Chmod(0664); err != nil {
    log.Print(err)
    return err
}
Switch
Each case has an implicit break.
The fallthrough keyword can only appear as the last statement in a case; it chains into the next case, which then always executes.
The data type after case must match the switch expression; cases can be in any order, and multiple conditions in one case are separated by commas.
A switch with the expression omitted is equivalent to switch true (a tagless switch); you can then write the conditions on the cases, which behaves like if-else if.
A switch can also take an init statement, e.g. switch t := 100; {...}; t is only visible inside the switch block.
For loop
no while loop
The syntax is like C: for init; condition; post {}
The condition has no parentheses.
for {}
for ; ; {}
for key, value := range map {}
for index, value := range [slice, array, string] {}
Goto
It can jump both forward and backward.
Array
Elements without an assigned value get the zero value of the element type:
var a [10]int  // element default is 0
var a = [10]int{}
var a = [3]float64{1, 2, 3}
a := [3]float64{1, 2, 3}
// you can set values at specific indexes
a := [3]float64{0: 32, 2: 99}
a := [...]float64{0: 32, 2: 99}
len(a)
cap(a)
// ascending sort; sort.Ints takes a slice of int
sort.Ints(a[:])
For arrays, len(a) (length) equals cap(a) (capacity).
// same address semantics as a C array
var arr = [3]int32{99, 100, 101}

// declared without a length: slice = nil, no underlying array
// you have to append before using it
var slice1 []int
var slice1 = []float64{1, 2, 3}
// or
// make(type, len, capacity), len must be <= cap
// the make function is in the builtin package,
// used to create objects
slice1 := make([]int, 2)    // len, cap = 2
slice1 := make([]int, 0, 2) // len = 0, cap = 2
// slice1 itself holds the address of the slice, no address-of operator needed
// while an array's address is &arr
fmt.Printf("%p\n", slice1)

substr := str[1:3]
// the strings package is mainly string manipulation functions
// the strconv package mainly converts between strings and basic data types
// go's '+' operands must have the same type, so use strconv (or Sprintf) to concat
// different types
"123" + strconv.Itoa(100)
b1, err := strconv.ParseBool("true")
str1 := strconv.FormatBool(b1)
var c func(a ...interface{}) (n int, err error)
c = fmt.Println
Anonymous functions and closures
An anonymous function has no name; it can be assigned to a variable or called directly. So Go supports functional programming: an anonymous function can be passed as an argument to another function (a function passed this way is called a callback). Note it is the function itself that is the argument, not the function's return value! An anonymous function can also be a return value (a closure).
// no function name, call it immediately
func() {
    fmt.Println()
}()
// or
fun3 := func() {
    fmt.Println()
}
fun3()
Callback function:
// callback function
func add(a, b int) int {
    return a + b
}
// calling function
// here we could define a func type to replace 'func(int, int) int' in the argument list
func operate(a, b int, fun func(int, int) int) int {
    return fun(a, b)
}
// call
operate(1, 3, add)
Closure: here, after the outer function returns, the anonymous function keeps the outer function's internal resource i alive. Local functions in Python work the same way.
A closure is a function value that references variables from outside its body.
// `func() int` is treated as the return type as a whole
// increment() returns a closure
func increment() func() int {
    // stays alive in the closure function
    i := 0
    return func() int {
        i++
        return i
    }
}
func B() {
    fmt.Println("from B")
    defer fmt.Println("from B defer 1")
    for i := 0; i <= 10; i++ {
        fmt.Println("i = ", i)
        if i == 5 {
            // throw a panic
            panic("panic happened")
        }
    }
    // this defer will not execute
    defer fmt.Println("from B defer 2")
}
Why Python scripting?
Easy to learn and write, interactive, with powerful built-in data types and object orientation. Included packages support a huge range of tasks.
## walk the file system; wraps Linux system calls
import os

os.getcwd()
## cd /tmp
os.chdir("/tmp")
## current-level dir listing
os.listdir("/")
## walks recursively into subdirs, yielding tuples of
## (current dir name, child dirs, child files)
os.walk("/tmp")
os.stat("/etc/hosts")
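os.walk is a generator, so you typically iterate over it; a small sketch of unpacking those tuples:

import os

for dirpath, dirnames, filenames in os.walk("/tmp"):
    ## dirpath is the current dir, dirnames its child dirs, filenames its files
    for name in filenames:
        print(os.path.join(dirpath, name))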
import sys

## print command line parameters
for arg in sys.argv[:]:
    print(f"{arg}", end=' ')
print()
To parse parameters, use the optparse module; see the example 1_optparse-demo.py in the git repo.
Besides optparse and argparse from the standard library, the click module is a good alternative.
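For instance, a minimal click-based CLI could look like this (a sketch, not the repo's demo; the command, option, and argument names are made up):

import click

@click.command()
@click.option("--count", default=1, help="number of greetings")  # hypothetical option
@click.argument("name")
def hello(count, name):
    """Greet NAME for a total of COUNT times."""
    for _ in range(count):
        click.echo(f"Hello {name}!")

if __name__ == "__main__":
    hello()

click generates the --help text from the decorators, which is a large part of its appeal over hand-rolled optparse code.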
To get env variables:
import os
## set a default value for when the variable is unset
os.getenv("EDITOR", "/usr/bin/vim")
os.getenv("HOME")
os.environ.get("HOME")
The git repo examples for this section are interesting: 5_signal-primes-v5.py and 6_timeout.py use a signal handler to change the value of a condition variable and thereby change the running logic. I used to do pre-termination handling with trap in BASH. Linux has 2 signals set aside for users: SIGUSR1, SIGUSR2.
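A minimal sketch of that pattern, assuming a long-running worker loop (this is not the repo code, just the idea):

import os
import signal
import time

keep_running = True

def on_sigusr1(signum, frame):
    ## flip the condition variable; the main loop checks it
    global keep_running
    keep_running = False

signal.signal(signal.SIGUSR1, on_sigusr1)
print(f"send SIGUSR1 to stop me: kill -USR1 {os.getpid()}")

while keep_running:
    time.sleep(1)  ## real work goes here
print("SIGUSR1 received, exiting cleanly")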
Executing Commands
Run external commands, for example other shell commands or executables, via subprocess, similar to Linux ().
import subprocess

## the simplest invocation
## copying keeps the current env undisturbed
env = os.environ.copy()
## the run method calls Popen under the hood
cmd = ["ls", "-ltr"]
## if stdout/stderr are not set, the command prints its result to the console
process = subprocess.run(cmd, env=env,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)

## for commands that need shell features, such as shell pipes, filename
## wildcards, environment variable expansion, and expansion of ~ to a
## user's home directory:
## if shell is True, it is recommended to pass args as a string rather than as a sequence
env = os.environ.copy()
cmd = "ls -ltr ~ | grep -i download"
process = subprocess.run(cmd, shell=True, env=env,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)

## the same as
cmd = ["/bin/sh", "-c", "--", "ls -ltr ~ | grep -i download"]
process = subprocess.run(cmd, env=env,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)

## the advanced interface is Popen; run calls it behind the scenes
env = os.environ.copy()
## the same as export KUBECONFIG=clusters.yaml
env['KUBECONFIG'] = "clusters.yaml"
## this kubectl will pick up the KUBECONFIG env variable above
cmd = ["kubectl", "get", "sts"]
process = subprocess.Popen(cmd, env=env,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)

## out, err are bytes
out, err = process.communicate()
## convert to str
out = out.decode("utf-8")
err = err.decode("utf-8")

if process.returncode != 0:
    raise subprocess.CalledProcessError(process.returncode, cmd)
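On newer Python (3.7+), the same check-and-decode pattern is shorter with run's convenience flags (a sketch; the command is arbitrary):

import subprocess

## check=True raises CalledProcessError on a non-zero exit code
## capture_output=True wires up both PIPEs; text=True decodes bytes to str
result = subprocess.run(["ls", "-ltr"], check=True,
                        capture_output=True, text=True)
print(result.stdout)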
There is a dedicated summary of Python concurrency: Python Concurrency
String Manipulation
Besides basic operations, it covers datetime:
import datetime
## depends on your local time zone
rightnow = datetime.datetime.now()
utc_rightnow = datetime.datetime.utcnow()

rightnow.month
rightnow.hour

## lots of % formats
rightnow.strftime("Today is %A")
## datetime.timedelta, to add and subtract time spans
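For example, a quick timedelta sketch (standard library only):

import datetime

now = datetime.datetime.now()
## a week ahead, three hours back
print(now + datetime.timedelta(weeks=1))
print(now - datetime.timedelta(hours=3))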
It then covers the re module; refer to this section's git code.
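Not the repo code, but a minimal re example of the usual search-and-group pattern (the log line is made up):

import re

line = "2024-05-01 ERROR disk full on /dev/sda1"
m = re.search(r"(\d{4}-\d{2}-\d{2}) (\w+) (.+)", line)
if m:
    date, level, msg = m.groups()
    print(level, msg)  ## ERROR disk full on /dev/sda1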
Processing Text, Logging
For a long-running background service, logging is a must; we can log events to a file, syslog, or the systemd journal.
A logger can have different handlers, for example: StreamHandler (stdout, stderr), FileHandler, WatchedFileHandler, SysLogHandler, SocketHandler, JournalHandler, etc.
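A minimal sketch wiring a few of those handlers onto one logger (the logger name and file path are arbitrary; the /dev/log address for SysLogHandler is the Linux default):

import logging
from logging.handlers import SysLogHandler

logger = logging.getLogger("myservice")  ## hypothetical service name
logger.setLevel(logging.INFO)

fmt = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")

## console: StreamHandler writes to stderr by default
stream = logging.StreamHandler()
stream.setFormatter(fmt)
logger.addHandler(stream)

## file
filehandler = logging.FileHandler("/tmp/myservice.log")
filehandler.setFormatter(fmt)
logger.addHandler(filehandler)

## local syslog daemon
logger.addHandler(SysLogHandler(address="/dev/log"))

logger.info("service started")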
divmod(x, y)
## returns a pair of numbers: the quotient and the remainder
## x is the numerator, y the denominator; both must be non-complex
divmod(17, 5)  ## (3, 2)
import collections

## init
queue = collections.deque([])
## `while queue` is the same as `while len(queue)`
## the canonical way for all collections (tuples, strings, lists,
## dicts and all their many subtypes) to check empty or not
while queue:
    pass
def lowestCommonAncestor(self, root, p, q):
    ## a neat shortcut for three checks at once
    if root in (None, p, q):
        return root
    ## generator
    left, right = (self.lowestCommonAncestor(kid, p, q)
                   for kid in (root.left, root.right))
    return root if left and right else left or right
// disable the remember me checkbox
Jenkins.instance.setDisableRememberMe(true)
Jenkins.instance.setSystemMessage('Jenkins Server - Automating Jenkins with Groovy')
Jenkins.instance.save()
logger.info "Init script complete"
In the browser, go to http://localhost:8080/restart; after Jenkins restarts you will see the difference: the checkbox is gone.
Solarized Dark is bad for some syntax highlighting. To import other themes,
download the zip file from https://iterm2colorschemes.com/, uncompress it, and
put the downloaded folder in the user home directory, for example /Users/chengdol.
The theme I use is Chalk; import the Chalk.itermcolors file(s) from the
schemes folder.
Additionally, go and set this in iTerm2:
Preferences -> Advanced -> Mouse ->
"Scroll wheel sends arrow keys when in alternate screen mode" -> Yes
This fixes the mess-up issue when you scroll inside Vim.
iTerm2 Tips
Hotkeys, the floating terminal window setup:
Go to Preferences -> Keys -> Hotkey and create a Dedicated Hotkey Window.
My customized hotkey is ctrl + shift + t. Set the hotkey window profile's text
font the same as the default iTerm2 window; here I use MesloLGS NF.
Locating the cursor in iTerm2 terminal: command + /.
Send commands to multiple panes in the same tab: shift + command + i;
to disable, use the same command.
Go to split pane by direction: command + shift + arrow key.
By default it uses robbyrussell theme, you can see it from ~/.zshrc file,
the theme files are located in ~/.oh-my-zsh/themes folder.
# Set name of the theme to load --- if set to "random", it will
# load a random theme each time oh-my-zsh is loaded, in which case,
# to know which specific one was loaded, run: echo $RANDOM_THEME
# See https://github.com/ohmyzsh/ohmyzsh/wiki/Themes
ZSH_THEME="powerlevel10k/powerlevel10k"

# Which plugins would you like to load?
# Standard plugins can be found in $ZSH/plugins/
# Custom plugins may be added to $ZSH_CUSTOM/plugins/
# Example format: plugins=(rails git textmate ruby lighthouse)
# Add wisely, as too many plugins slow down shell startup.
plugins=(git kubectl docker docker-compose gcloud)
Next time you start a terminal session, the Powerlevel10k configuration wizard
will launch to set your prompt pattern; it automatically checks and installs
the font for you. When selecting the encoding, choose Unicode, otherwise no
icons will show.
If you want to reset the configuration, simply run:

p10k configure
# Example aliases
# alias zshconfig="mate ~/.zshrc"
# alias ohmyzsh="mate ~/.oh-my-zsh"

### some aliases for convenience
alias cdd='cd ~/Desktop'
alias cdmb='cd ~/Desktop/chengdol.blog/'

## docker
alias di='docker images'
alias dp='docker ps'
alias drp='docker rm -f'
alias dri='docker rmi -f'
alias dl='docker logs -f'
EC2: Elastic Compute Cloud
AMI: Amazon Machine Image
EBS: Elastic Block Store, used for EC2 file systems
Security Group: a set of firewall rules that control the traffic for a single instance, for example controlling who can ssh to an EC2 instance; a VPC is for groups of instances
S3: Simple Storage Service; maximum file size is 5 TB; a bucket is accessed via URL, the same as Google Cloud Storage. Can be used for hosting a static web site
RDS: Relational Database Service
Route53: Domain Name System (DNS) service, where you can register your domain name!
EC2
Enhancing Services
EB: Elastic Beanstalk, application service running on EC2
Lambda: Serverless option for executing code, function as a service, only pay when the code is running, significant cost savings if you have infrequent activity. Great for small, irregular tasks, for example, nightly ETL kickoffs, notification type functions
DynamoDB: a managed NoSQL database, supports both key-values and document models
VPC: for securing your services; components in the VPC can connect to each other through private IPs. Multiple subnets can live in a VPC; for example, you can configure a public subnet and a private subnet.
How does VPC work?
route table: controls what goes where
network ACL (access control list): acts as a subnet-level firewall, controlling who can come and go
CloudWatch: monitoring resources and acting on alerts, for example, CPU usage on EC2 instances, DynamoDB read/write throughput, estimated billing charges
CloudFront: super fast CDN, works seamlessly with S3, EC2, load balancing and route53
CloudWatch
For example: increasing network traffic -> EC2 -> CloudWatch alarm -> action -> Auto Scaling Group -> more EC2 instances. SNS can also be integrated with CloudWatch.
SNS: Simple Notification Service, pub/sub messaging for microservices and serverless applications. First create a topic, then subscribe to it with an email address or SMS, etc.
IAM
MFA: multi-factor authentication, requiring more than one factor to authenticate.
MFA process: password + device code (an app-generated code that refreshes every 60 seconds), similar to a one-time-password token; first download an MFA app on your phone.
After logging in to the AWS console, click the account user name -> My security credentials -> MFA.
IAM policies make it easy to assign permissions to users or groups in an administrative way. Users have no permissions by default. Policy properties:
Effect: allow, deny
Action: operations user can perform
Resources: what the user performs the action on
Root account permissions are dangerous; follow Amazon's suggested best practices for better security. For example, create an admin group, attach a policy to it, then add a user to this group and log in as this user.
I use the aws cli docker image to run commands; the docker container is ephemeral for each command (for convenience, set an alias for the docker run command), and you need to mount ~/.aws into the container:
docker run --rm -ti -v ~/.aws:/root/.aws amazon/aws-cli s3 ls
## after installing the gcloud SDK
## init
gcloud init --console-only

## you can log in multiple user accounts
gcloud auth login
## ADC (application default credentials)
## for code to interact with GCP, such as the terraform CLI
gcloud auth application-default login [--project]

## same as gcloud auth login but using SA credentials
## and roles
gcloud auth activate-service-account [--key-file]

## list auth accounts / service accounts
gcloud auth list
## switch account
gcloud config set account <account name>
## revoke account
gcloud auth revoke <account name>

## show and install components, i.e. alpha, beta, kubectl...
gcloud components list
gcloud components install [beta]

## all projects under my account, not only the one in use
gcloud projects list

## set which project to use
gcloud config set project <project name>
## current project in use
gcloud config list project
## get project ID
gcloud config get-value project
## list service accounts in the project
gcloud iam service-accounts list [--project <project ID>]
## create a service account after auth login and setting the project
gcloud iam service-accounts create <SA name> [--display-name=<"description">] [--project <project id>]
## can update display name and description
gcloud iam service-accounts update ...
## disable/enable a service account
gcloud iam service-accounts enable/disable ...
## delete: when a service account is deleted, its role bindings
## are not immediately removed; they are automatically purged from
## the system after a maximum of 60 days
gcloud iam service-accounts delete ...
## generate a credentials json file for terraform
## it can also be deleted
gcloud iam service-accounts keys create ~/key.json \
  --iam-account <SA name>@<project ID>.iam.gserviceaccount.com
## see the roles bound to a service account
gcloud iam service-accounts get-iam-policy <SA>

## see available contexts
## -o name: show context names only
kubectl config get-contexts [-o name]
## switch context
kubectl config use-context <context name>
## rename a context to something human readable
kubectl config rename-context <old> <new>

## export the current configuration to a yaml file
kubectl config view --minify --flatten > cluster.yaml
## the same as this gcloud command
## KUBECONFIG=clusters.yaml: specify clusters.yaml to store the credentials
KUBECONFIG=clusters.yaml gcloud container clusters \
  get-credentials <cluster name> --zone=<cluster zone>

## currently enabled API list
gcloud services list [--project <project ID>]
gcloud services enable <API>
gcloud services disable <API>

## grant IAM roles to an end user in the project
## the member can be serviceAccount:email
gcloud projects add-iam-policy-binding <project ID> \
  --member user:<member> \
  --role=roles/gkehub.admin \
  --role=roles/resourcemanager.projectIamAdmin
Terms
Cloud SDK commands:
gcloud
kubectl
gsutil (google storage)
bq (big query)
Cloud Shell actually runs on an ephemeral Compute Engine instance.
A zone is under a region; you can think of a zone as a data center in a region.
Anthos is Google's modern solution for hybrid and multi-cloud systems and
services management.
GCP cloud functions: serverless execution environment for building and
connecting cloud services. With Cloud Functions you write simple, single-purpose
functions that are attached to events emitted from your cloud infrastructure and
services. Your Cloud Function is triggered when an event being watched is fired.
Your code executes in a fully managed environment. There is no need to provision
any infrastructure or worry about managing any servers.
GCP deployment manager: like Terraform, infrastructure as code.
GCP Dataproc: for running Apache Spark and Apache Hadoop clusters.
GCP Dataflow: offers managed data pipelines, serverless fully managed data processing.
GCP Dataprep: visually explore, clean, and prepare data for analysis and machine learning.
BigQuery is a fully managed data warehouse.
Pub/Sub (publisher/subscriber) is scalable, reliable messaging.
DataLab offers interactive data exploration. Built on Jupyter.
Kubernetes Architecting
Built on top of Compute Engine.
A container is an isolated user space for running application code; it is
lightweight and represented as a process:
process
linux namespace
cgroups
union file systems
GKE abstracts away the master, showing only the worker nodes on the dashboard.
Use Node Pools to manage different kinds of nodes.
Google maintains a container registry: gcr.io
Cloud Run: built on Knative, for serverless workloads.
Cloud Build: Build, test, and deploy on serverless CI/CD platform.
Private Cluster: only Google products and authorized networks can access it.
Foundations
Compute Engine lets you run virtual machines. In GCP, K8s nodes are actually
virtual machines running in Compute Engine, just like IBM Fyre; you can see them
in the Compute Engine dashboard.
Fully customized virtual machines
Persistent disk/SSD or optional local SSDs
Global load balancing and autoscaling
Per-second billing
VM has built-in SDK commands.
A vCPU is equal to 1 hardware hyper-thread.
Preemptible VM: can be terminated by GCP if the resources are needed in other
places.
VPC: virtual private cloud. A VPC is global in scope, while subnets are
regional; the same subnet can span different zones. Each VPC network is
contained in a GCP project. A VPC lets components connect to each other or stay
isolated from each other. You control the VPC network and use its route table
to forward traffic within the network, even across subnets.
VPC: 3 types:
default mode
auto mode
custom mode (for production)
VPN can connect the on-premises network to GCP network.
VMs can be on the same subnet but in different zones. Every subnet has four
reserved IP addresses in its primary IP range: .0 for the subnet network itself,
.1 for the subnet gateway, plus the second-to-last and the last address in the range.
The external IP is transparent to the VM and managed by the VPC. You will not
see it with the ip a s command.
In /etc/hosts:
10.128.0.2 instance-1.us-central1-a.c.terraform-k8s-282804.internal instance-1  # Added by Google
## internal DNS resolver
169.254.169.254 metadata.google.internal  # Added by Google
Set up VPC peering or a VPN to allow internal network connections between VPCs.
You can delete the whole default network setup and create your own, for
example an auto or custom mode network.
Private Google access (for example, to access Cloud Storage) and Cloud NAT
(only outbound is allowed) help VMs without an external IP to access the internet.
RAM Disk: tmpfs, a fast scratch disk or cache; faster than disk but slower than memory.
A VM comes with a single root persistent disk; you can attach additional disks
to the VM, and they are network storage! An extended disk needs to be formatted
and mounted by yourself, for example:
sudo mount -o discard,defaults /dev/disk/by-id/<disk name> /home/<target directory>
App Engine is not like Compute Engine: it does not comprise virtual machines;
instead you get access to a family of services that the application needs.
Containers (K8s, hybrid) sit in the middle between Compute Engine (IaaS) and
App Engine (PaaS). Use App Engine when you don't want to focus on the
infrastructure at all, just on your application code. It is especially suited
for building scalable web applications/web sites, mobile backends, and RESTful APIs.
First understand RBAC, which is used in many contexts. It has 3 parts: identities, roles, and resources. An identity can be a Google account, a Google group, or a service account (not human). Roles come in several categories, such as primitive roles, predefined roles, and custom roles.
IAM: identity and access management, i.e. who can do what on which resources. A user in IAM can be a person, a group, or an application. Always select the least privilege to reduce the exposure to risk.
When adding new members in IAM, GCP and G Suite share (human) user information.
Identities:
google accounts
service accounts, belongs to your applications
google groups (collection of google accounts and service accounts)
G suite domains
Cloud identity domains
Service Account: used by an application or a virtual machine running code on your behalf; it can have IAM policies attached to it:
user-managed SA: for example service-account-name@project-id.iam.gserviceaccount.com; you choose the service account name.
gcloud iam service-accounts set/get-iam-policy <service account>
Storage and Database
Storage access control has many options; IAM is one of them and is usually enough. Others include ACLs, signed URLs, and signed policy documents.
Cloud Storage: a fully managed object store. In the demo, the gsutil command can do versioning, ACLs, set restrictions, etc.
# if you want to skip the http_proxy setting
gs='env -u http_proxy gsutil'
The slide has info about how to choose which service: SQL, NoSQL …?
Cloud SQL: a fully managed database service (MySQL or PostgreSQL). If the Cloud SQL instance is located in the same VPC and the same region, connect to it with a private IP; otherwise use a Cloud SQL proxy connection (set up via a script).
Cloud Spanner: Cloud Spanner combines the benefits of relational database structure with non-relational horizontal scale. Used for financial and inventory applications.
Cloud Firestore: the next generation of Cloud Datastore. Cloud Firestore is a NoSQL document database
Cloud Bigtable: a fully managed, wide-column NoSQL database that offers low latency and replication for high availability.
Cloud Memorystore: creates and manages Redis instances on the Google Cloud Platform.
Resource Management
Resource manager, quotas, labels and billing.
Resource Monitor
From the Stackdriver collection.
Scaling and Automation
Interconnecting Networks
In the demo, two VMs are in different regions and subnets; after setting up the VPN tunnel they can ping each other via private IP.
Once you understand this part, you can set up your own VPN to bypass network restrictions.
Cloud VPN: securely connect your infrastructure to GCP VPC network, useful for low-volume data connections.
Options: IPsec VPN tunnel, Dedicated Interconnect (for large traffic), and Partner Interconnect (via another service provider's network).
Configure a Cloud VPN gateway and an on-premises VPN gateway, then set up the VPN tunnel (encrypted traffic); the two ends must be paired.
Load Balancing and Auto Scaling
Managed instance groups, typically used with autoscaler.
In the demo: create a VM with a detached disk, install apache2, then keep the disk to create a custom image; use this image to create an instance template, then create instance groups.
Infrastructure Automation
Deployment manager and Terraform, can also use Ansible, Chef, Puppet…
Built on open source technologies pioneered by Google—including Kubernetes, Istio, and Knative—Anthos enables consistency between on-premises and cloud environments.
Config management is the single source of truth: you can keep all policies in one git repo as the desired state; in use, it is propagated to all managed objects, and whether an object is managed is marked by an annotation in its manifest.
Multi-control-plane DNS uses Istio CoreDNS, not kube-dns (which is for local resolution).
Ingress of Anthos
https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-for-anthos
This page lays out the concept, the components, and diagrams of Ingress for Anthos very clearly.
Ingress for Anthos is designed to meet the load balancing needs of multi-cluster, multi-regional environments. It’s a controller for the external HTTP(S) load balancer to provide ingress for traffic coming from the internet across one or more clusters.
Ingress for Anthos updates the load balancer, keeping it consistent with the environment and desired state of Kubernetes resources.
Ingress for Anthos uses a centralized Kubernetes API server to deploy Ingress across multiple clusters. This centralized API server is called the config cluster. Any GKE cluster can act as the config cluster. The config cluster uses two custom resource types: MultiClusterIngress and MultiClusterService. By deploying these resources on the config cluster, the Anthos Ingress Controller deploys load balancers across multiple clusters.
There can be multiple MultiClusterServices (mcs) but only one MultiClusterIngress (mci). An mcs can select specific clusters with the clusters field; an mci can specify a default backend and other backends with rules.
Clusters that you register to an environ(An environ is a domain that groups clusters and infrastructure, manages resources, and keeps a consistent policy across them) become visible to Ingress, so they can be used as backends for Ingress.
Environs possess a characteristic known as namespace sameness, which assumes that resources with identical names in the same namespace across clusters are instances of the same resource.
Build images for cloud and on-premises; a Packer template is in JSON format (easy to source control).
variables
builders: multiple builders can run in parallel.
provisioners: run in order; you only need to specify where they run.
post-processors: automatic post-build tasks, e.g. compression.
A machine image is a single static unit that contains a pre-configured operating system and installed software which is used to quickly create new running machines. Machine image formats change for each platform. Some examples include AMIs for EC2, VMDK/VMX files for VMware, OVF exports for VirtualBox, etc.
The -debug flag of build helps run steps one by one; a parallel build in debug mode runs sequentially.
To build an Ubuntu VirtualBox image (OVF), the ISO is downloaded from the Ubuntu web site, then Packer launches it to run the provisioners. Then use a post-processor to compress the OVF to tar.gz, or convert it to a Vagrant box.
What are the differences between Packer and Docker?
# build image
# -debug: pause at each step
packer build [-debug] <file.json>
# -force: delete the existing artifact, then build
packer build -force [-debug] <file.json>

When using the -debug flag, Packer writes the private pem file to the current directory; you can use that pem file to ssh into the running VM, for example in Google Cloud:
# jenkins is the ssh_username you set in the template
ssh -i gce_googlecompute.pem jenkins@172.16.160.49