```json
{
  // The VSC font size.
  "editor.fontSize": 15,
  // 2 spaces when indenting.
  "editor.tabSize": 2,
  // Column ruler and color.
  "editor.rulers": [80],
  // Stay consistent with the iTerm2 oh-my-zsh font.
  "terminal.integrated.fontFamily": "MesloLGS NF",
  // This can be customized as needed.
  "python.defaultInterpreterPath": "/usr/bin/python3",
  "files.autoSave": "afterDelay",
  "workbench.iconTheme": "material-icon-theme",
  "workbench.colorCustomizations": {
    "editorRuler.foreground": "#5c7858"
  },
  // After installing the Go plugin.
  "go.useLanguageServer": true,
  "go.toolsManagement.autoUpdate": true,
  // Make VSC compile code correctly for files that carry the build tags below,
  // otherwise VSC cannot find them, for example, the struct in a mock file.
  "go.buildTags": "integration test mock lasting",
  // For the Markdown Preview Github Styling plugin.
  "markdown-preview-github-styles.colorTheme": "light",
  // For the Markdown Preview Mermaid Support plugin.
  "markdown-mermaid.darkModeTheme": "neutral"
}
```
Image Rendering
The pictures displayed in some blog posts are stored in my Google Drive; you need
to allowlist them for anyone on the internet and reference the Google Drive share
path in the markdown image link.

GitLens: code blame, heatmaps, authorship, etc.
Markdown Preview Github Styling: renders markdown files in GitHub style.
Markdown Preview Mermaid Support: draws sequence diagrams, see
Example and Live Editor.
NOTE: The Mermaid plugin here is for VSCode, not for Hexo deployment; to
enable Mermaid in Hexo rendering, please check the Hexo setup blog.
I have encountered an issue where the Go extension does not work properly, for
example jumping back to source definitions was not working; for troubleshooting
see here.
Most likely you need to run go mod vendor to download the dependencies locally.
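A minimal sketch of the usual fix, assuming a Go modules project at the repo root:

```bash
# resolve and prune module requirements
go mod tidy
# copy dependencies into the local vendor/ directory so the Go extension can resolve them
go mod vendor
```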
Understand the purpose of every email, for example:
Educate or inform
Make a request
Introduction
Respond
Before sending, review for purpose alignment. If not, adjust the message.
Key elements for good email:
Subject line: keep email from getting deleted
Introduction: create context, build trust, remind who you are
Message: bulk of the email
Call to action: request, the last part of the body
Signature: provide contact information, your brand
Adjust the From name and email address, for example chengdol <email address>;
this can be done through Send mail as in Gmail settings.
The Signature helps people understand your skill set, value proposition, may
include:
your name
tagline: directly under your name, a few words, can be your title.
phone, address, links
To, CC and BCC
To: directly talk to a person.
CC: people who can hear the conversation but don’t have to act on it.
BCC: hidden from other recipients; only the person who put you in the BCC field
knows you are listening, and no one else knows you are on the recipient list. It
is fine to reply to the sender, but not to reply all.
Communicating Better
Visuals may help, but they should not be distracting; bold or highlighting is
good for making a response stand out.
When many people are involved, use Reply All, unless you intend to start a side
conversation. When moving to Reply, for example:
Thanks again for the introduction, I'll move this conversation with xx to another
thread so we don't clutter your inbox.
Vacation Responder
Set an out-of-office auto reply, for example when at a conference or on leave. A
good example:
I’m currently consumed with a project that is taking almost all of my time. If
you are emailing me with a product or support question, please email Liz.
Otherwise, I will try to respond within 24 hours.
Send Report
This is usually a weekly project progress report to stakeholders. Draft it in a
spreadsheet and copy it into the email.
There are 5 stages for a project, each with a status (Not Started, In Progress, Complete):
Concept Exit
Design & Planning
Execution/Implementation
Preview
GA
You can also point out weekly:
Highlights
Lowlights
The risk status of milestones for different teams or components:
On Track (green): good, healthy
At Risk (amber): signals caution
Off Track (red): unlikely to be successful or correct
When the stakes are high and your email can have a major impact on the outcome,
it pays to invest your time in proper proofreading; you can ask an LLM to help
proofread.
Demos
You can also generate a draft or template from Bard or ChatGPT, with input from
you about your purpose.
The First Communication
## subject
Reaching out from Twitter

## introduction
Hi Chris, I have been following you on Twitter for a while, and have interacted
a little with you over the last few weeks. I wanted to bring the conversation
over to email.

## message body

## call to action
Can we get on a call in the next week or so? I am open all day Thursday, just
let me know what works for you.

## signature
Virtual Introduction
## subject
Virtual introduction: Matt and Jesse

## introduction
Hi Matt and Jesse, as per our previous conversations I wanted to introduce you
to one another.

## message
Matt, I’ve known Jesse for a few years and know him to be a very clever
developer, and a loyal friend. I know he can help you with some of the coding
challenges you are facing right now.

Jesse, Matt is my friend and colleague, and can better explain his challenges
than I can, but I think you are the right person for him to talk to.

## call to action
I hope you two can get together soon. I’ll let you take it from here.

## signature
Information Heavy
Follow-up on job search information

Hi Laurie, you had asked for information to help you with your job search
Thursday morning when we spoke.

Below are a few of my favorite blog posts which I think are relevant to where
you are, based on our conversation. I am happy to talk about any of this with
you, over email or on a call. Just let me know what works best for you.

## some links here

I have been blogging for over 14 years, and have plenty to share, but I thought
these would be the most interesting and meaningful to you.

I would love to jump on a call this week to talk about your next steps. Are you
available for a call Friday before 2?

## signature
Respond to Questions
Hi Jim, thank you for your thoughtful email. You have a lot of questions and
ideas in there. Please scroll down and see my comments in `yellow`.

## copy the original email and answer right after each question and highlight
## with yellow
Negative Situation
Team performance

Mike, I promised you an email follow-up from our conversation this morning. I
know this is an uncomfortable conversation and I appreciate your willingness to
address this with me.

There are two issues we need to address.
Project Foo meeting this morning

Hi Carlos, let me first apologize for how the meeting went this afternoon. I
could tell that you were uncomfortable. I wanted to share my perspective on
what was happening.

While I knew there was a chance your Project Foo was going to be killed, I was
not aware of the reasons stated in the meeting. I have seen what you and your
team have done with Project Foo and I have been very impressed.
Seek Help
Hi xx, Hope you are doing well!

We have started the work related to ...
## body

At this point it is not clear for us how to proceed ... Can you please provide
guidance, or point us to someone who can assist?
figure out customer requirements
hold meetings
write agendas
status reports
Critical skills of being a leader:
Communication
Effective management skill
Emotional intelligence (EQ) and empathy
Communicate in clear, credible and authentic way.
Use passion and confidence to enhance the message (tone of voice, facial expressions, body language).
Inspire, motivate others
Informs, persuades, guides, assures
When you become the leader of a new team:
devote time and energy to establishing how you want your team to work
first few weeks are critical
get to know your team members
showcase your values
explain how you want the team to work
set and clarify goals, walk the talk
don’t be afraid to over communicate
Don’t:
Don’t assume you can get the work done without building relationships
Don’t assume team members understand your working style and expectations
Don’t worry about repeating yourself too much in the beginning (reiterate the strategy over and over)
What does leadership mean to you?
connection:
focus on the person
influence
words
changes:
vision
action
drive change
motivate:
inspire motivation
long-lasting motivation
Why will they follow you?
Make them feel comfortable, confident and satisfied:
trust
be open, fair and listen
admit mistake
be decisive
respect the opinions of others
compassion
stability
hope
Effective Leader
The more senior you become, the more it becomes about people, the less it
becomes about your personal tech expertise, and the broader your domain becomes,
making you even more removed from the details.
Always be deciding (identify ambiguity, make tradeoffs)
Always be leaving (make the team self-driving without you)
Always be scaling
//TODO:
[ ] elasticsearch certificate
[ ] es ML node for anomaly detection (this one is quite interesting, used for analyzing data)
[ ] logstash input data for testing, movielens
[x] linkedin learning:
[x] JKSJ elasticsearch training and git repo
[x] cerebro, tutorial
ELK Stack
The Elastic Stack is one of the most
effective ways to leverage open source technology to build a central logging,
monitoring, and alerting system for servers and applications.
Logstash: aggregates, filters, and supplements log data, then forwards it to
Elasticsearch or other destinations.
Kibana: web-based front-end to visualize and analyze log data.
Beats: lightweight utilities for reading logs from a variety of sources and
sending the data to Logstash or other backends.
Alerting: sends notifications to email, Slack, PagerDuty, and so on.
ELK Stack vs Prometheus:
ELK is a general-purpose NoSQL stack that can be used for monitoring by
aggregating all the logs and shipping them to Elasticsearch for easy browsing
and similar tasks. Prometheus is a dedicated monitoring system, usually deployed
alongside service discovery (Consul) and Alertmanager.
My Vagrant elasticsearch cluster setup.
A Java runtime and a /bin/bash that supports arrays are required; also note that
Elasticsearch cannot be started as the root user.
Another option is to use docker compose to create testing elasticsearch cluster,
see my repo here.
Set index.unassigned.node_left.delayed_timeout to hours (see the curl sketch after this list)
Start from the data nodes, then the master nodes, one by one; ensure the config yaml
file is correct for each role
Wait for recovery with a large number of retries
Revert index.unassigned.node_left.delayed_timeout
Upgrade the Kibana version
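A minimal sketch of the delayed-timeout toggle referenced in the list above, assuming the cluster is reachable at localhost:9200; the endpoint address and the timeout values are illustrative:

```bash
# bump the delay so shards are not reallocated while a node is restarting
curl -X PUT "localhost:9200/_all/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"index.unassigned.node_left.delayed_timeout": "5h"}}'

# revert to the default (1m) after the upgrade
curl -X PUT "localhost:9200/_all/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"index.unassigned.node_left.delayed_timeout": "1m"}}'
```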
This blog series
talks about kafka + elastic architecture.
This blog
shares ways to achieve high data reliability as well as scale resources. As we
can see, the Kafka message queue also benefits data reliability, besides
providing throttling.
There are several ways to
install: binary, RPM, or on Kubernetes. The package ships with a bundled Java
runtime; you can also set ES_JAVA_HOME to use an external Java.
Install using the archive.
The Elasticsearch .tar.gz package does not include a systemd unit (you have to
create one yourself). To manage Elasticsearch as a service easily, use the Debian
or RPM package instead.
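A minimal sketch of the archive install, assuming version 7.11.1 (the version seen elsewhere in this post) on an x86_64 Linux host; the version, URL, and JDK path are illustrative:

```bash
# download and unpack the archive (run as a non-root user)
curl -LO https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.11.1-linux-x86_64.tar.gz
tar -xzf elasticsearch-7.11.1-linux-x86_64.tar.gz
cd elasticsearch-7.11.1

# optional: point to an external JDK instead of the bundled one
# export ES_JAVA_HOME=/usr/lib/jvm/java-11-openjdk

# start in the foreground (use -d to daemonize)
./bin/elasticsearch
```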
It is advisable to change the default locations of the config directory, the
data directory, and the logs directory. Usually data directory is mounted on
separate disk.
Before launching, go to edit $ES_HOME/config/elasticsearch.yml. The
configuration files should contain settings which are node-specific (such as
node.name, node.role and storage paths), or settings which a node requires in
order to be able to join a cluster, such as cluster.name and network.host.
The config path can be changed by ES_PATH_CONF env variable.
```yaml
# cluster name
cluster.name: chengdol-es
# node name
node.name: master
# ip to access, the host public IP
# or using interface name such as _eth1_
network.host: 9.30.94.85
```

```yaml
# a list of master-eligible nodes in the cluster
# Each address can be either an IP address or a hostname
# that resolves to one or more IP addresses via DNS.
discovery.seed_hosts:
  - 192.168.1.10:9300   # port default 9300
  - 192.168.1.11
  - seeds.mydomain.com
  # ipv6
  - "[0:0:0:0:0:ffff:c0a8:10c]:9301"
```
To form a production cluster, you need to specify the node roles; see this
document
about how to statically assign master and data nodes.
```yaml
cluster.name
network.host
discovery.seed_hosts
cluster.initial_master_nodes

# specify dedicated node roles
# lower versions have different syntax
node.roles: [ master ]
node.roles: [ data ]
```
The master node is responsible for lightweight cluster-wide actions such as creating or deleting an index, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes.
High availability (HA) clusters require at least three master-eligible nodes, at least two of which are not voting-only nodes. Such a cluster will be able to elect a master node even if one of the nodes fails.
Data nodes hold the shards that contain the documents you have indexed. Data nodes handle data related operations like CRUD, search, and aggregations. These operations are I/O-, memory-, and CPU-intensive. It is important to monitor these resources and to add more data nodes if they are overloaded.
```bash
# 9200 is the default port
# on browser or kibana dev console
curl -XGET "http://<master/data bind IP>:9200"
```

The response from the Elasticsearch server:

```json
{
  "name": "master",
  "cluster_name": "chengdol-es",
  "cluster_uuid": "XIRbI3QxRq-ZXNuGDRqDFQ",
  "version": {
    "number": "7.11.1",
    "build_flavor": "default",
    "build_type": "tar",
    "build_hash": "ff17057114c2199c9c1bbecc727003a907c0db7a",
    "build_date": "2021-02-15T13:44:09.394032Z",
    "build_snapshot": false,
    "lucene_version": "8.7.0",
    "minimum_wire_compatibility_version": "6.8.0",
    "minimum_index_compatibility_version": "6.0.0-beta1"
  },
  "tagline": "You Know, for Search"
}
```
Check cluster health and number of master and data nodes:
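A hedged sketch of the checks implied above, using the standard cluster and cat APIs; the bind IP is whatever you configured in network.host:

```bash
# overall cluster health: status, number of nodes and data nodes, unassigned shards
curl -XGET "http://<master/data bind IP>:9200/_cluster/health?pretty"

# per-node view: roles (m = master-eligible, d = data), heap, CPU load, elected master
curl -XGET "http://<master/data bind IP>:9200/_cat/nodes?v"
```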
Besides the master and data node roles, there are ingest nodes, remote-eligible nodes, and coordinating nodes (which can be dedicated).
Index
[x] how to create and config index
[x] how to check index, status, config
[x] how to create actually document data: check document API
[x] how to reroute shards, through cluster APIs
First, an index is some type of data organization mechanism, allowing the user to partition data a certain way. The second concept relates to replicas and shards, the mechanism Elasticsearch uses to distribute data around the cluster.
So just remember, Indices organize data logically, but they also organize data physically through the underlying shards. When you create an index, you can define how many shards you want. Each shard is an independent Lucene index that can be hosted anywhere in your cluster.
Index module: the per-index settings that control all aspects of an index (for example, the number of shards and replicas).
Index template: tells Elasticsearch how to configure an index when it is created. Elasticsearch applies templates to new indices based on an index pattern that matches the index name. Templates are configured prior to index creation; when an index is then created, either manually or by indexing a document, the template settings are used as a basis for creating the index. If a new data stream or index matches more than one index template, the index template with the highest priority is used.
There are two types of templates: index templates and component templates (note that old versions only have index templates, see this legacy index template); a template effectively contains the index module settings.
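A minimal sketch of creating a composable index template via the API; the template name, pattern, and settings are illustrative:

```bash
curl -X PUT "localhost:9200/_index_template/logs-template" \
  -H 'Content-Type: application/json' \
  -d '{
    "index_patterns": ["logs-*"],
    "priority": 100,
    "template": {
      "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1
      }
    }
  }'
```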
The Elasticsearch APIs, for example the cluster status, index, document, shard, and reroute APIs, can be used to examine the node types (master and data nodes), shard distribution, and CPU load statistics.
Let’s see an example to display doc content:
```bash
# cat indices
curl -X GET "172.20.21.30:9200/_cat/indices?format=json" | jq

# search index
# get list of docs and their ids
curl -X GET "172.20.21.30:9200/<index name>/_search?format=json" | jq

# get doc via its id
curl -X GET "172.20.21.30:9200/<index name>/_doc/<doc id>?format=json" | jq
```
To upload single or bulk document data to elasticsearch, see document API. You can download sample here: sample data, let’s try accounts data:
```bash
# the accounts.json does not have index and type in it, so specify them in the curl command
# /bank/account is the index and type
# es will create index bank and type account for you automatically
curl -s -H "Content-Type: application/x-ndjson" \
  -XPOST 172.20.21.30:9200/bank/account/_bulk?pretty \
  --data-binary "@accounts.json"; echo

# display indices
curl -XGET 172.20.21.30:9200/_cat/indices
# check doc id 1 of index `bank`
curl -XGET 172.20.21.30:9200/bank/account/1?pretty
```
Query data, run on kibana dev console:
```bash
# query accounts from CA
curl -XGET 172.20.21.30:9200/bank/account/_search
{
  "query": {
    "match": {
      "state": "CA"
    }
  }
}
```
In the response message, each match has a _score field, which tells you how relevant the match is.
Plugins
Elasticsearch provides a variety of plugins to extend the system, for example, snapshot plugin, see here.
```bash
# list all installed plug-ins
bin/elasticsearch-plugin list
# example
bin/elasticsearch-plugin install analysis-icu
# api
localhost:9200/_cat/plugins
```
Access by http://172.20.21.30:5601 in firefox browser.
Kibana has built-in sample data that you can play with: go to Add sample data, then
move to Analytics -> Discover to query and analyze the data. You need to know
KQL to query documents; Dashboard is also helpful.
The Kibana dev console can issue HTTP requests to explore the ES APIs; press command + enter
to run (for more shortcuts see the help menu, very helpful!).
```bash
# each command has a play button
GET /_cat/indices?v
GET /_cat/nodes?v
```
Or the data can be ingested from Logstash, see below and my Vagrant demo. Need
to create Index Pattern to load data and query.
Also you can install plug-ins for Kibana:
```bash
bin/kibana-plugin install <plugin>
bin/kibana-plugin list
bin/kibana-plugin remove <plugin>
```
Discover
I usually use Discover to filter and check log messages, and use Dashboard to
make graphs, such as Area, Bar, etc., to extract data patterns.
How to draw a graph easily from Discover:
In Discover, query and filter to get the target log records.
In the leftside panel, right click one of the selected fields -> Visualize.
I want to highlight the Area graph: it shows you the proportion of the
target field value along the timeline. For example, in the graph settings,
the horizontal axis is @timestamp, and the vertical axis uses count, broken down
by the selected field of the message.
There is a
saved object
management section; in Discover, Dashboard, and Index pattern you can save
items and manage them, as well as export/import them to or from another Kibana instance.
Logstash
Ingest data to elasticsearch or other downstream consumers, introduction. Usually be paired with Beats.
Logstash offers self-contained architecture-specific downloads that include AdoptOpenJDK 11, the latest long term support (LTS) release of JDK. Use the JAVA_HOME environment variable if you want to use a JDK other than the version that is bundled.
There is a pipeline.workers setting in the logstash.yml file, and some input plugins such as UDP have their own workers setting; what's the difference? Read this post: A History of Logstash Output Workers. The input stage and the (filter + output) stage are separate pools with separate worker thread settings; pipeline.workers is for the (filter + output) part, and its default value is equal to the number of CPU cores.
```
input {
  ## on the Beats side, listening on port 5043
  beats {
    port => "5043"
  }
}

filter {
  if [type] == "syslog" {
    ## grok filter
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
    }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch {
    ## Elasticsearch address
    hosts => [ "9.30.94.85:9200" ]
    ## write to index
    ## %{[@metadata][beat]}: these are field and sub_field in the message
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
```
```bash
# --config.test_and_exit: parses the configuration
# file and reports any errors.
bin/logstash -f beats.conf --config.test_and_exit

# --config.reload.automatic: enables automatic config reloading
# so that you don't have to stop and restart Logstash every time
# you modify the configuration file.
bin/logstash -f beats.conf --config.reload.automatic
```
Beats
https://www.elastic.co/beats/
Beats, written in golang, can output data to Elasticsearch, Logstash and Redis. But usually we send data to Logstash (pre-processing) then forward to Elasticsearch.
Each Beat has a configuration YAML file with detailed configuration guidance. For example, in that YAML file, comment out the Elasticsearch output and use the Logstash output instead.
Filebeat: text log files
Heartbeat: uptime
Metricbeat: OS and applications
Packetbeat: network monitoring
Winlogbeat: windows event log
Libbeat: write your own
X-Pack
X-Pack is an Elastic Stack extension that provides security, alerting, monitoring, reporting, machine learning, and many other capabilities. By default, when you install Elasticsearch, X-Pack is installed, it is open-source now.
Check X-Pack and you will see the availability and status of each component.
To alter a planet for the purpose of sustaining life.
This article from IBM can give you a good overview:
Terraform vs Kubernetes
Terraform vs Ansible
Removing the manual build process and adopting a declarative approach to deploy infrastructure as code: reusable, idempotent, and consistently repeatable deployments.
The gcloud API or client binary can do the same work as Terraform, so what are the benefits?
cloud agnostic, multi-cloud portable.
Unified workflow: If you are already deploying infrastructure to Google Cloud with Terraform, your resources can fit into that workflow.
Full lifecycle management: Terraform doesn’t only create resources, it updates, and deletes tracked resources without requiring you to inspect the API to identify those resources.
Graph of relationships: Terraform understands dependency relationships between resources.
The Terraform documentation is very informative and well structured.
Terraform is to cloud infrastructure on public clouds what Helm is to applications on k8s: it greatly reduces operational complexity and enables fast, automated deployment, while also providing reuse, versioning, and similar features. But you still need to understand the purpose and composition of the resources each cloud provider offers.
Terraform executable: download from the web (or build a Terraform docker image)
Terraform files: written in the HashiCorp configuration language (HCL) DSL
Terraform plugins: interact with providers: AWS, GCP, Azure, etc.
Terraform state file: JSON; don't touch it, but you can view it to get deployment details
You can have multiple Terraform files (.tf); when you run terraform it stitches them together to form a single configuration, for example to separate variables, outputs, resources, and tfvars.
The tfvars file is by default named terraform.tfvars; otherwise, when running plan you need to specify the file path. This tfvars file is usually generated from some metadata configuration and then combined with the variable declaration file.
Commands
To execute terraform commands, build a docker image and mount the cloud auth credentials when starting the container. BTW, if you run on a Google Compute Engine VM, the SDK will inject the host auth into the container automatically.
If you update the terraform file with different configuration, rerun init, plan and apply.
```bash
# list all commands
terraform --help
terraform version

# create workspace, see below section
terraform workspace

# linter
terraform validate

# show a tree of providers in the main and sub-modules
# for example, google, kubernetes, random, null, locals are all providers
terraform providers

# init will download plugins, for example, aws, gcp or azure..
terraform init

# will show you the diff if you update your terraform file
# loads terraform.tfvars by default, if not, need to specify
terraform plan -out plan.tfplan

# will generate a tfstate file
# performs creation with as much parallelism as possible
# --auto-approve: for scripts, no interactive prompt
terraform apply "plan.tfplan" [--auto-approve]

# -state: output state to a specific file
# when running different envs with a single definition file
terraform apply -state=qa-env.tfstate -var environment=qa "plan.tfplan"

# Manually mark a resource as tainted, forcing a destroy and recreate
# on the next plan/apply.
terraform taint <google_compute_instance.vm_instance>

# output terraform state or plan file in a human-readable form
# show what has been created
terraform show

# show output variable value
# useful for scripts to extract outputs from your configuration
terraform output [output name]

# Update variables
terraform refresh

# show objects being managed by the state file
terraform state list
```
Configuration steps that run after the infrastructure is deployed, for example using Ansible or shell scripts as provisioners.
Provisioners can run in the creation or destruction stage; you can also have multiple provisioners in one resource and they execute in order within that resource.
Provisioner can be local or remote:
file: copy a file from local to the remote VM instance.
local-exec: executes a command locally on the machine running Terraform, not the VM instance itself.
remote-exec: executes on remote VM instance.
Terraform treats provisioners differently from other arguments. Provisioners only run when a resource is created, adding a provisioner does not force that resource to be destroyed and recreated. Use terraform taint to tell Terraform to recreate the instance.
There are other ways to use variables rather than specifying them in a single .tf file.
The scenario, we need development, QA(Quality Assurance)/UAT(User Acceptance Testing), production environment, how to implement with one configuration and multiple inputs?
The variable values can be from, precedence from low to high:
environment variable: TF_VAR_<var name>.
file: terraform.tfvars or specify by -var-file in terraform command.
terraform command flags -var.
You can override variables and precedence, select value based on environment, for example:
```
# specify a default value in the tf file
variable "env_name" {
  type    = string
  default = "development"
}

# or specify in a tfvars file
env_name = "uat"

# or specify on the command line
terraform plan -var 'env_name=production'
```
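The environment-variable form from the precedence list above can be exercised like this (a minimal sketch; env_name is the variable declared above and the plan file name is illustrative):

```bash
# lowest-precedence source: picked up automatically by terraform plan/apply
export TF_VAR_env_name=qa
terraform plan -out qa.tfplan
```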
Variable types:
string, the default type if not explicitly specified
variable "web_instance_count" { type = number default = 1 }
# list variable "cidrs" { default = [] }
# map variable "machine_types" { # map type and key is string type = map(string) default = { dev = "f1-micro" test = "n1-highcpu-32" prod = "n1-highcpu-32" } }
Terraform uses the same ${} syntax for interpolation as bash:
```hcl
# local variable definition
locals {
  # random_integer is a terraform resource
  tags = "${var.bucket_name_prefix}-${var.environment_tag}-${random_integer.rand.result}"
}

# usage
resource "aws_instance" "example" {
  tags = local.tags
}
```
Workspace
Workspaces are the recommended way of working with multiple environments, for example for:
state management
variables data
credentials management
State file example: we have dev, QA, and prod environments; put each into a separate folder, and when running a command, specify the input and output:
```bash
# for the dev environment
# -state: where to write the state file
# -var-file: load the file
terraform plan -state="./dev/dev.state" \
  -var-file="common.tfvars" \
  -var-file="./dev/dev.tfvars"
```
Workspace example: there is a terraform.workspace built-in variable that indicates which workspace you are currently in; use it in a map variable to select the right value for each environment. (No need to create separate folders per environment anymore.)
```bash
# create the dev workspace and switch to it
# similar to a git branch
terraform workspace new dev
# show workspaces
terraform workspace list

terraform plan -out dev.tfplan
terraform apply "dev.tfplan"

# now create the QA workspace
terraform workspace new QA

# switch workspace
terraform workspace select dev
```
Special terraform variable to get workspace name
```hcl
locals {
  env_name = lower(terraform.workspace)
}
```
Managing secrets
HashiCorp Vault serves this purpose: it can hand over credentials from the cloud provider to Terraform and set a TTL for the secrets.
Or you can use environment variables to specify the credentials; Terraform picks them up automatically, but bear in mind to use the right env var names. For example:
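A hedged sketch with the standard variable names for the Google and AWS providers (the key path and values are placeholders; adjust for your provider):

```bash
# Google provider: path to a service account key file
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/terraform-sa.json"

# AWS provider
export AWS_ACCESS_KEY_ID="<access key id>"
export AWS_SECRET_ACCESS_KEY="<secret access key>"

terraform plan
```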
```bash
## you will see the default jenkins port is 8080
ps aux | grep jenkins
```
Then you can open the web interface at <node ip>:8080. If the UI shows up in Chinese, it is probably the browser language setting; change Chrome's language to English.
If wizard install failed with some plugins, you can fix this later in Manage Plugins.
Jenkins uses the file system to store everything; if using systemd, the configuration is in /var/lib/jenkins. You can back up this folder, or, if you want to wipe it out, delete it and run systemctl restart jenkins; Jenkins then goes back to its initial state.
The Jenkins UI is not even needed to create a project: you can create the directories and files in the Jenkins working directory, then go to Manage Jenkins -> Reload Configuration from Disk.
freestyle project -> pipeline (series of freestyle project), freestyle project is not recommended.
Jenkins Workspace: you can see it in the console output or by clicking the Workspace icon in your project dashboard. Everything runs in the project workspace, and every build overrides the previous one. You can use the tar command to back up and restore this workspace, or clean the workspace.
To store build artifacts, use a Post-build Action in the project configuration; for example, if you want to archive some jar or zip files, then after the build is done these archives will show up on the build page.
Build trend
To the right of the Build History list there is a trend button; clicking it shows the build time history statistics and distribution.
Testing and Continuous integration
Now start with the pipeline job type. After creating a pipeline job, you will see a Pipeline Syntax button at the bottom of the page; it contains the necessary resources to get started. You can also use Copy from to copy the pipeline configuration from another job for a quick start.
Add slave nodes
Manage Jenkins -> Manage Nodes and Clouds
To add slaves, usually use SSH to launch the agent nodes. (If the node is not discovered, an error is shown; troubleshoot according to the error message.)
Before adding a slave node to the Jenkins master we need to prepare the node by installing Java on it: Jenkins installs a client program on the slave node, and to run that client we need the same Java version that is installed on the Jenkins master. You also need to install and configure any necessary tools on the slave node.
```bash
yum install -y java-1.8.0-openjdk
```
When configuring the node agent, choose a Host Key Verification Strategy.
When initializing Jenkins, some plugins may fail to install; you can install them yourself later in Manage Plugins and then restart Jenkins (restart only when no job is running; the restart method differs by installation type, or choose "restart Jenkins after install" when installing plugins), for example: systemctl restart jenkins. This clears the plugin failure warnings on the dashboard.
Continuous delivery
In Blue Ocean, you can run multiple builds in parallel. If more than one build runs on the same agent, the workspace path is distinguished by a suffix (@count number). Whether you can run multiple builds on one agent depends on how you design your pipeline and tasks.
Blue Ocean's UI also displays parallel stages very intuitively, which makes them easy to inspect.
Trigger builds remotely
Set pipeline can be triggered remotely by URL, also optionally set pipeline trigger token in pipeline configure UI (can be empty).
You also need to know the user token, set from the current user's profile menu; you must keep your user token somewhere, for example store it in a secret text credential, so you can refer to the token, for example:
Then in the upstream pipeline script, trigger other pipeline by running curl command:
```bash
## can be an http or https connection
## --user is the jenkins user and its token or password
## token=${PIPELINE_TRIGGER_TOKEN} can be omitted if it's empty
curl --user ${TRIGGER_USER}:${TRIGGER_USER_TOKEN} --request POST \
  "https://<url>/job/${PIPELINE_NAME}/buildWithParameters?token=${PIPELINE_TRIGGER_TOKEN}&para1=val1&para2=val2"
```
You don’t need to specify all parameters in URL, the parameters default values will be used if they are not specified in the URL.
Notice that para1 and para2 must exist in parameters section of the triggered pipeline, otherwise you cannot use them. So far, based on testing, I can pass string, bool and file parameter types.
To query a build's status:

```bash
curl --user ${LOGIN_USER}:${LOGIN_USER_TOKEN} --request GET \
  "https://<url>/job/${PIPELINE_NAME}/<build number>/api/json"
```
Then you can parse the returned JSON; the fields of interest include:
artifacts, and result: SUCCESS, FAILURE, ABORTED
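A minimal sketch of extracting the result with jq, assuming the JSON response from the call above:

```bash
# prints SUCCESS, FAILURE, ABORTED, or null while the build is still running
curl -s --user ${LOGIN_USER}:${LOGIN_USER_TOKEN} \
  "https://<url>/job/${PIPELINE_NAME}/<build number>/api/json" | jq -r '.result'
```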
Flyweight executor
Flyweight executors reside on the Jenkins master and are used to execute code outside of node allocation; the others are heavyweight executors. Flyweight executors are not counted toward executor capacity.
```groovy
pipeline {
  // default agent
  agent any
  // pipeline level env var, global scope
  environment {
    // you can put the release number here
    // referred to by env.RELEASE
    RELEASE = '1.1.3'
  }
  // can have multiple stages
  stages {
    // list tool versions
    stage('Audit tools') {
      steps {
        sh '''
          git version
          docker version
        '''
      }
    }
    stage('Build') {
      // stage-level agent
      agent any
      // stage level env var, stage scope
      environment {
        USER = 'root'
      }
      steps {
        echo "this is Build stage"
        // executable in your repo
        sh 'chmod +x ./build.sh'
        // put the value of the Jenkins credential named api-key into the API_KEY env var
        // API_KEY is only visible inside this block
        withCredentials([string(credentialsId: 'api-key', variable: 'API_KEY')]) {
          sh '''
            ./build.sh
          '''
        }
      }
    }
    // stages can have different types
    stage('Test') {
      environment {
        LOG_LEVEL = "INFO"
      }
      // parallel tasks
      parallel {
        // they can run on different agents
        // depending on your agent setting
        stage('test1') {
          steps {
            // show current stage name test1
            echo "parallel ${STAGE_NAME}"
            // switch to the ./gradle directory
            dir('./gradle') {
              sh '''
                ./gradlew -p xxx test1
              '''
            }
          }
        }
        stage('test2') {
          steps {
            echo "parallel ${STAGE_NAME}"
          }
        }
        stage('test3') {
          steps {
            echo "parallel ${STAGE_NAME}"
          }
        }
      }
    }
    stage('Deploy') {
      // wait for user input before deploying
      input {
        message "Continue Deploy?"
        ok "Do it!"
        parameters {
          string(name: 'TARGET', defaultValue: 'PROD', description: 'target environment')
        }
      }
      steps {
        echo "this is Deploy with ${env.RELEASE}"
        // groovy code block
        // potential security hole, jenkins will not make it easy for you
        script {
          // you need to approve use of these classes/methods
          if (Math.random() > 0.5) {
            throw new Exception()
          }
          // you can use a try/catch block for security reasons
        }
        // if the above fails, this wouldn't get executed
        // write 'passed' into the file test-results.txt
        writeFile file: 'test-results.txt', text: 'passed'
      }
    }
  }
  post {
    // will always be executed
    always {
      echo "prints whether deploy happened or not, success or failure."
    }
    // others like: success, failure, cleanup, etc
    success {
      // archive files
      archiveArtifacts 'test-results.txt'
      // slack notification
      slackSend channel: '#chengdol-private',
        message: "Release ${env.RELEASE}, success: ${currentBuild.fullDisplayName}."
    }
    failure {
      slackSend channel: '#chengdol-private',
        color: 'danger',
        message: "Release ${env.RELEASE}, FAILED: ${currentBuild.fullDisplayName}."
    }
  }
}
```
If you don’t want to check out the SCM in stages that run on the same agent, you can use the `skipDefaultCheckout()` option in the pipeline `options` block.
First install the Slack Notification plugin. Then go to Manage Jenkins -> Configure System and scroll down to the bottom; you will see the Slack section (see the question mark icons for explanations).
Then go to your target Slack channel, select Add an app, search for Jenkins CI, add it to Slack, and follow the instructions to get the secret token; add this token to the Jenkins credentials and use it in the Slack configuration above.
After everything is set, try Test Connection and you will see a message in your Slack channel.
```groovy
// this is a dynamic reference, explicitly specify the library in the Jenkinsfile
library identifier: 'jenkins-pipeline-demo-library@master',
  retriever: modernSCM([
    $class: 'GitSCMSource',
    remote: 'https://github.com/sixeyed/jenkins-pipeline-demo-library.git',
    // if the repo is private, you can add a credential here
    credentialsId: '<credential id>'
  ])
```
Shared library is under vars folder. In Groovy, we can add a method named call to a class and then invoke the method without using the name call, crossPlatformBuild is actually the file name, inside file there is a call method.
```bash
## if not successful, it will show you the overall problems with your Jenkinsfile
curl -X POST -F "jenkinsfile=<[jenkins file path]" http://<IP>:8080/pipeline-model-converter/validate
```
Visual Studio code has Jenkins linter plugin, you need to configure it with linter url.
Restart or replay
In every build interface: with restart from stage you can select which stage to restart (sometimes a stage fails due to an external reason); with replay you can edit your Jenkinsfile and library and then rerun, and the changes only live in the current build (after it succeeds, check your updates into source control).
https://hub.docker.com/r/jenkins/jenkins
You can run several Jenkins versions in parallel on one machine, for purposes such as testing new features, testing an upgrade, and so on. But you may need to customize the Jenkins docker image and expose different ports.
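A minimal sketch of running two versions side by side (image tags, volume names, and host ports are illustrative):

```bash
# current LTS on host port 8080
docker run -d --name jenkins-lts -p 8080:8080 -p 50000:50000 \
  -v jenkins_lts_home:/var/jenkins_home jenkins/jenkins:lts

# a newer version on host port 8081 for testing the upgrade
docker run -d --name jenkins-test -p 8081:8080 -p 50001:50000 \
  -v jenkins_test_home:/var/jenkins_home jenkins/jenkins:latest
```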
```groovy
agent {
  docker {
    image 'myregistry.com/node'
    // can pass arguments to docker run
    args '-v /tmp:/tmp'
    // the node must be pre-configured to have the docker label
    label 'my-defined-label'
    // optionally set the registry to pull the image from
    registryUrl 'https://myregistry.com/'
    registryCredentialsId 'myPredefinedCredentialsInJenkins'
  }
}
```
Official Document
To install helm in the control node, download the corresponding binary and untar to execution path, or using container and mount necessary k8s credentials.
For helm2, the Tiller server in the cluster may not be stable or secure; a workaround is to run it locally, where it talks to the remote k8s cluster via the kubectl config.
Note that HELM_VER < 2.17.0 does not work anymore, the default stable repo is gone, so upgrade to 2.17.0 in environment variable.
Then run it:
```bash
# go to the tillerless folder that contains the dockerfile above
docker build -f tillerless.dockerfile -t tillerless:1.0 .

## mounts: gcloud config, kubectl config, and the working repo
## tillerless env vars: by default tiller uses secret storage
docker run -d --name=xxx \
  -v ~/.config:/root/.config \
  -v ~/.kube:/root/.kube \
  -v $(pwd)/../envoy-proxy:/envoy-proxy \
  -w /envoy-proxy \
  -e HELM_TILLER_STORAGE=configmap \
  -e HELM_HOST=127.0.0.1:44134 \
  --entrypoint=/bin/bash \
  tillerless:1.0 \
  -c "tail -f /dev/null"
```
The first time you exec into the docker container, kubectl may not work; try exiting, running kubectl on the host, and then exec-ing in again.
If you switch the k8s context, stop and restart tillerless to pick up the change.
```bash
## export if they are gone
export HELM_TILLER_STORAGE=configmap
export HELM_HOST=127.0.0.1:44134

## by default the tiller namespace is kube-system
helm tiller start [tiller namespace]

helm list
helm install ..
helm delete ..

exit
```
or
```bash
## export if they are gone
export HELM_TILLER_STORAGE=configmap
export HELM_HOST=127.0.0.1:44134

## by default the tiller namespace is kube-system
helm tiller start-ci [tiller namespace]

helm list
helm install ..
helm delete ..

helm tiller stop
```
or
```bash
helm tiller run <command>
```
Overview
helm3 does not have default repo, usually we use https://kubernetes-charts.storage.googleapis.com/ as our stable repo. helm2 can skip this as it has default stable repo.
```bash
## add the stable repo to the local repo list
## 'stable' is your custom repo name
helm repo add stable https://kubernetes-charts.storage.googleapis.com/
## display the local repo list
helm repo list
## remove repo 'stable'
helm repo remove stable

## install charts
## make sure we get the latest list of charts
helm repo update
helm install stable/mysql --generate-name
helm install <release name> stable/mysql -n <namespace>
helm install <path to unpacked/packed chart>

## show the status of your release
helm status <release name>
```
Whenever you install a chart, a new release is created. So one chart can be installed multiple times into the same cluster. Each can be independently managed and upgraded.
```bash
## show deployed releases
helm ls -n <namespace>

## uninstall
## with --keep-history, you can check the status of the release
## or even undelete it
helm uninstall <release name> [--keep-history] -n <namespace>
```
Install order
Helm installs resources in a certain order, click to see. Or you can split the chart into different parts or use an init container.
```
<chart name>/
  Chart.yaml          # A YAML file containing information about the chart
  LICENSE             # OPTIONAL: A plain text file containing the license for the chart
  README.md           # OPTIONAL: A human-readable README file
  values.yaml         # The default configuration values for this chart
  values.schema.json  # OPTIONAL: A JSON Schema imposing a structure on the values.yaml file;
                      # values.yaml must conform to this schema, otherwise it will not pass
  charts/             # other dependent charts
  requirements.yaml   # other dependencies (for helm2)
  crds/               # Custom Resource Definitions
  templates/          # A directory of templates that, when combined with values,
                      # will generate valid Kubernetes manifest files.
    xxx.yaml
    _xx.tpl           # functions
    NOTES.txt         # OPTIONAL: A plain text file containing short usage notes,
                      # shown after running helm install
```
To drop a dependency into your charts/ directory, use the helm pull command
Chart.yaml
apiVersion: v2 for helm3, v1 for helm2
appVersion: the application version
version: the chart version, bumped for example when the chart files/structure change
keywords: used by helm search (see the example after this list)
type: either an application or a library chart
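A minimal sketch of searching by keyword ('stable' is the repo name added earlier; the keyword is illustrative):

```bash
# search the locally added repos by chart name/keywords
helm search repo mysql
# search the Artifact Hub
helm search hub mysql
```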
Managing dependencies
Package the chart into an archive; you could use tar, but helm has a dedicated command for this purpose:
```bash
## it will create a .tgz archive
## and append the chart version to the archive name
## the chart version comes from Chart.yaml
helm package <chart_name>
```
Publish charts to repos, e.g. chartmuseum (like Docker Hub); just like a private docker registry, you can create a private chartmuseum on your host (there is a dedicated installation package).
```bash
## go to the dir that contains the chart archive
## this will generate an index.yaml file
helm repo index .

## for security, charts can be signed and verified
## for verification, we need the provenance file
helm package --sign
helm verify <chart>
## verify when installing
helm install --verify ...

## download dependency chart archives into your charts folder
## according to the definitions in Chart.yaml
helm dependency update <chart name>
## list dependencies, their version, repo and status
helm dependency list <chart name>
```
You can also use conditions and tags to control whether a dependency is needed, for example in the Chart.yaml file, as shown in the sketch below.
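A minimal sketch of a conditional dependency entry, written here as a heredoc appended to Chart.yaml; the chart path, dependency name, version, and repository are illustrative:

```bash
cat >> ./mychart/Chart.yaml <<'EOF'
dependencies:
  - name: redis
    version: "17.x.x"
    repository: "https://charts.bitnami.com/bitnami"
    # only pulled in when the parent chart sets redis.enabled=true in values
    condition: redis.enabled
    tags:
      - cache
EOF
```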
```
function usage               --  pipeline usage
================================================================
default default_value value  --  value | default default_value
quote value                  --  value | quote
upper value                  --  value | upper
trunc value 20               --  value | trunc 20
trimSuffix "-" value         --  value | trimSuffix "-"
b64enc value                 --  value | b64enc
randAlphaNum 10              --  value | randAlphaNum 10
toYaml value                 --  value | toYaml
printf format value          --  list value | join "-"
```
modify the scope using with to simplify the directives, so you don't have to write a long chain of references
control whitespaces and indent
use - to remove whitespace (newline is treated as white space!)
Usually first use the static check then dynamic check.
```bash
## static
## works without a k8s cluster
## you can also specify which values yaml file
helm template <chart dir or archive file> [--debug] | less
## for helm2
helm template --tiller-namespace tiller --values ./xxx/values.second.yaml --debug <chart dir or archive file> | less

## dynamic
## a real helm install but without committing
## can generate a release name as [release]
helm install [release] <chart> --dry-run --debug 2>&1 | less

## install with a specified release name
helm install [release name] [chart] -n <namespace> --values <path to values yaml>
## check release status
helm list -n <namespace>
## display yaml files
helm get manifest [release] -n <namespace> | less

## check release specification and revision numbers
helm status [release] -n <namespace>

## get all info
## helm2: helm get [release]
helm get all [release] -n <namespace>

## upgrade
helm upgrade [release] [chart] -n <namespace>
## check revisions
helm history [release] -n <namespace>
## rollback
## the revision number can be obtained from helm history
helm rollback [release] [revision] -n <namespace>

## if the helm install is aborted, check helm list then uninstall the broken release
## helm2: helm delete --purge [release]
helm uninstall [release] -n <namespace>
```
In helm2, helm client uses gRPC protocol to access Tiller server (in production secure connection is required, set TLS/SSL), then Tiller (need service account with privilege) will call K8s API to instantiate the charts. In helm3, no Tiller no security issue.
```bash
# system version
# these are actually softlinks
cat /etc/os-release
cat /etc/system-release
cat /etc/redhat-release

# kernel release number
uname -r
cat /proc/version
```
Shutdown
Send message to others
```bash
# send to an individual user terminal
write dsadm
> xxx

# send to all users in terminals
wall < message.txt
```
Shutdown system and prompt
```bash
# reboot now
shutdown -r now
# halt/poweroff in 10 mins and use wall to send a message to logged-in users
shutdown -h 10 "The system is going down in 10 min"
# cancel shutdown
shutdown -c
```
Changing runlevels
what is runlevel in linux?
https://www.liquidweb.com/kb/linux-runlevels-explained/
For example:
runlevel 1 allows only the root user, with no network enabled; it is also called rescue.target and can be used for operations that need isolation.
runlevel 3 is the default multi-user mode with network enabled (most servers run in this state).
runlevel 5 is the desktop interface on top of runlevel 3.
```bash
# show current runlevel
who -r
runlevel

# different systemd daemons can have different target runlevels
# default runlevel
systemctl get-default
# set default runlevel
systemctl set-default multi-user.target
```
More about systemd, see my systemd blog.
Manage processes
```bash
# show processes on the current shell
# using a dash is the UNIX option style
ps -f
# -e means all processes
ps -ef --forest
# -F shows the full-format columns
ps -F -p $(pgrep sshd)
# kill all sleep processes
pkill sleep

# BSD options
ps aux
```
$$ the PID of current running process
```bash
cd /proc/$$

# we can interrogate this directory
# current dir
ls -l cwd
# current exe
ls -l exe
```
Do you still remember the options of the top command? For example, switching the memory display unit, or choosing whether to sort by CPU/MEM occupied…
Process priority
If something runs in the foreground and prevents you from doing anything, use ctrl+z to suspend it (it stays in memory but takes no CPU time), then put it in the background.
```bash
sleep 10000
^Z
[1]+  Stopped                 sleep 10000

# use the jobs command, `+` means current focus
jobs
[1]+  Stopped                 sleep 10000

# use bg to put the current focus in the background
bg
[1]+ sleep 10000 &

# check it is running in the background
jobs
[1]+  Running                 sleep 10000 &

# use fg to bring the current focus to the foreground again
```
If you run sleep 1000 & in a bash shell and then exit that shell, the sleep process is handed over to the init process; you can check via ps -F -p $(pgrep sleep) and you will see the PPID is now 1. In another bash shell, jobs will not show the background processes of the previous shell.
```bash
# show PRI (priority) and NI (nice) numbers
ps -l

F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
4 S     0 23785 23781  0  80   0 - 28891 do_wai pts/1    00:00:00 bash
0 S     0 24859 23785  0  80   0 - 26987 hrtime pts/1    00:00:00 sleep
0 S     0 24861 23785  0  80   0 - 26987 hrtime pts/1    00:00:00 sleep
...
```
The PRI value for real-time processes is in [60,99] and in [100,139] for user processes; the bigger, the better.
The NI value is in [-20,19]: the higher, the nicer, so less CPU time is taken. Under the same PRI, NI determines how much of the resources a process gets (see the sketch below).
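A minimal sketch of adjusting niceness; the command being niced and the values are illustrative:

```bash
# start a command with a nice value of 10 (lower priority)
nice -n 10 tar -czf /tmp/backup.tar.gz /var/log

# change the nice value of a running process (root can also lower it, e.g. -5)
renice -n 5 -p $(pgrep -f backup.tar.gz)
```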
List content of the package procps-ng, procps is the package that has a bunch of small useful utilities that give information about processes using the /proc filesystem. The package includes the programs ps, top, vmstat, w, kill, free, slabtop, and skill.
```bash
# see executable files under the procps package via rpm
rpm -ql procps-ng | grep "^/usr/bin/"

# check how long the system has been running
# load average is not normalized for the number of CPUs: if you know how many
# CPUs there are, the load average tells you how busy the system is; if it
# exceeds the CPU count, processes have to queue or wait
# this command actually reads its data from /proc/uptime and /proc/loadavg
uptime
 18:53:14 up 39 days,  3:50,  1 user,  load average: 0.00, 0.01, 0.05

# check how many CPUs
# the number of CPUs is equal to the processor count,
# but you may have fewer cores, see /proc/cpuinfo
lscpu

# the same first line as w
w
 18:59:29 up 12 days, 23:40,  3 users,  load average: 0.04, 0.26, 0.26
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    9.160.1.111      08:47    6:46m  0.03s  0.03s -bash
...
```
Monitoring load or output:
```bash
# execute a program periodically, showing output fullscreen
# this example runs uptime every 4 seconds
watch -n 4 uptime

# graphic representation of the system load average
# if you run a tar at the same time, you will see the loadavg change noticeably
tload
```
```bash
# check system reboot info
# the last command reads data from the wtmp log and displays it in a terminal window
last reboot
# check still-logged-in users, the same as `w`
last | grep still

# show the last 10 lines
journalctl -n 10
# see real-time appending
journalctl -f
# -u: systemd unit
journalctl -u sshd
# timestamp
journalctl --since "10 minutes ago"
journalctl --since "2020-04-26 13:00:00"
```
Selinux
O’Reilly had a related course; the link is still in my work email. For now you only need to know what SELinux is and how to enable and disable it.
SELINUX= can take one of these three values:
enforcing - SELinux security policy is enforced.
permissive - SELinux prints warnings instead of enforcing.
disabled - No SELinux policy is loaded.
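A minimal sketch of toggling SELinux; the runtime toggle does not survive a reboot, while the config edit persists (and moving out of disabled requires a reboot):

```bash
# switch to permissive mode at runtime (0 = permissive, 1 = enforcing)
setenforce 0

# persist the mode across reboots
sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
```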
```bash
# see if selinux is permissive, enforcing or disabled
getenforce
# more detail
sestatus

# see the user selinux config
id -Z
# user, role, type
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

# see files' selinux config
/bin/ls -Z

# see processes' selinux config
ps -Zp $(pgrep sshd)
LABEL                                       PID TTY  STAT TIME COMMAND
system_u:system_r:kernel_t:s0               968 ?    Ss   0:00 /usr/sbin/sshd -D
unconfined_u:unconfined_r:unconfined_t:s0  1196 ?    Ss   0:00 sshd: root@pts/0
```
Code that fails to honor the Catch or Specify Requirement will not compile.
Not all exceptions are subject to the Catch or Specify Requirement. To understand why, we need to look at the three basic categories of exceptions, only one of which is subject to the Requirement.
The Three Kinds of Exceptions
The first kind of exception is the checked exception.
Checked exceptions are subject to the Catch or Specify Requirement. All exceptions are checked exceptions, except for those indicated by Error, RuntimeException, and their subclasses.
The second kind of exception is the error. These are exceptional conditions that are external to the application, and that the application usually cannot anticipate or recover from.
Errors are not subject to the Catch or Specify Requirement. Errors are those exceptions indicated by Error and its subclasses.
The third kind of exception is the runtime exception.
Runtime exceptions are not subject to the Catch or Specify Requirement. Runtime exceptions are those indicated by RuntimeException and its subclasses.
Errors and runtime exceptions are collectively known as unchecked exceptions.
```java
try {
    // code that could throw an exception
}
// checked in order
catch (IOException | SQLException ex) {
    logger.log(ex);
    throw ex;
} catch (IndexOutOfBoundsException e) {
    System.err.println("IndexOutOfBoundsException: " + e.getMessage());
}
// The finally block always executes when the try block exits.
finally {
    if (out != null) {
        System.out.println("Closing PrintWriter");
        out.close();
    } else {
        System.out.println("PrintWriter not open");
    }
}
```
The finally block allows the programmer to avoid having cleanup code accidentally bypassed by a return, continue, or break. Putting cleanup code in a finally block is always a good practice, even when no exceptions are anticipated.
The try-with-resources statement ensures that each resource is closed at the end of the statement. Any object that implements java.lang.AutoCloseable, which includes all objects which implement java.io.Closeable, can be used as a resource.
Note: A try-with-resources statement can have catch and finally blocks just like an ordinary try statement. In a try-with-resources statement, any catch or finally block is run after the resources declared have been closed.
Throw exception
declare throws exception for method
```java
public void writeList() throws IOException {}
```
throw an exception
```java
public void test() {
    if (size == 0) {
        throw new EmptyStackException();
    }
}
```
About Deque: my earlier takeaway was that if you use it as a Stack, stick to the Stack methods such as push, pop, and peek; if you use it as a Queue, stick to the Queue methods such as add/offer, poll/remove, and peek. For the implementation, I usually use ArrayDeque, a resizable double-ended array.
Today a question suddenly came to mind: for a Queue or Stack implemented with a Deque, how does Java know the correct order in which to produce elements in an enhanced for loop? And if you mix the Queue and Stack methods, what will peek return, and in what order will the iterator yield results?
```java
/**
 * Pushes an element onto the stack represented by this deque. In other
 * words, inserts the element at the front of this deque.
 *
 * <p>This method is equivalent to {@link #addFirst}.
 *
 * @param e the element to push
 * @throws NullPointerException if the specified element is null
 */
public void push(E e) {
    addFirst(e);
}

/**
 * Inserts the specified element at the front of this deque.
 *
 * @param e the element to add
 * @throws NullPointerException if the specified element is null
 */
public void addFirst(E e) {
    if (e == null)
        throw new NullPointerException();
    final Object[] es = elements;
    es[head = dec(head, es.length)] = e;
    if (head == tail)
        grow(1);
}

/**
 * Circularly decrements i, mod modulus.
 * Precondition and postcondition: 0 <= i < modulus.
 */
static final int dec(int i, int modulus) {
    if (--i < 0) i = modulus - 1;
    return i;
}

/**
 * Inserts the specified element at the end of this deque.
 *
 * <p>This method is equivalent to {@link #addLast}.
 *
 * @param e the element to add
 * @return {@code true} (as specified by {@link Collection#add})
 * @throws NullPointerException if the specified element is null
 */
public boolean add(E e) {
    addLast(e);
    return true;
}

/**
 * Inserts the specified element at the end of this deque.
 *
 * <p>This method is equivalent to {@link #add}.
 *
 * @param e the element to add
 * @throws NullPointerException if the specified element is null
 */
public void addLast(E e) {
    if (e == null)
        throw new NullPointerException();
    final Object[] es = elements;
    es[tail] = e;
    if (head == (tail = inc(tail, es.length)))
        grow(1);
}

/**
 * Circularly increments i, mod modulus.
 * Precondition and postcondition: 0 <= i < modulus.
 */
static final int inc(int i, int modulus) {
    if (++i >= modulus) i = 0;
    return i;
}
```
peek always reads from the head pointer:
```java
/**
 * Retrieves, but does not remove, the head of the queue represented by
 * this deque, or returns {@code null} if this deque is empty.
 *
 * <p>This method is equivalent to {@link #peekFirst}.
 *
 * @return the head of the queue represented by this deque, or
 *         {@code null} if this deque is empty
 */
public E peek() {
    return peekFirst();
}
```
The iterator always goes from the head pointer to the tail pointer:
```java
/**
 * Returns an iterator over the elements in this deque. The elements
 * will be ordered from first (head) to last (tail). This is the same
 * order that elements would be dequeued (via successive calls to
 * {@link #remove}) or popped (via successive calls to {@link #pop}).
 *
 * @return an iterator over the elements in this deque
 */
public Iterator<E> iterator() {
    return new DeqIterator();
}
```