Consul Quick Start

Lab Environment Setup

Consul is easy to install: it is just a single executable binary; put it in /usr/local/bin: https://www.consul.io/downloads

I modified the course demo and built a Consul lab cluster via Vagrant: https://github.com/chengdol/InfraTree/tree/master/vagrant-consul

Glossary: https://github.com/chengdol/InfraTree/blob/master/vagrant-consul/glossary.md

Introduction

Challenges in managing services:

  • Service discovery
  • Failure Detection
  • Multi-data center
  • Service configuration

In a typical application architecture there is usually an API tier that adds flexibility and offers additional services which other applications can consume directly.

Consul is distributed.

These services need to discover each other. As the internal structure grows more complex, with many internal load balancers for example, Consul can step in and provide an internal DNS service for service discovery.

Failure detection: Consul runs a lightweight agent (in server or client mode) on every node in your environment, and the agent health-checks all services running locally.

Reactive configuration via the key/value store reflects changes in near real time. Consul is also multi-data-center aware.

Consul vs. other software, see here. Especially Consul vs. Istio, see here.

Consul UI online demo: https://demo.consul.io

Monitor Nodes

The example in this chapter shows a nice modeling approach: install Docker inside the Vagrant virtual machines, run the services as containers (here the Nginx web servers and the HAProxy LB), and then expose those ports on localhost (which changes the machine's iptables). This avoids a lot of installation and configuration work on the virtual machines themselves.

Start the Consul server agent:

# -dev: development agent; server mode is enabled on this agent, for a quick start
# in production, don't use -dev

# -advertise: the address advertised to other cluster members (pick one IPv4 interface)
# -client: the address the client interfaces (HTTP, DNS) bind to, usually 0.0.0.0
consul agent -dev -bind 0.0.0.0 -advertise 172.20.20.31 -client 127.0.0.1

# log output
==> Starting Consul agent...
Version: 'v1.8.0'
Node ID: '95b60a36-f350-8a2b-b1cb-54f7b79657dc'
Node name: 'consul-server'
Datacenter: 'dc1' (Segment: '<all>')
Server: true (Bootstrap: false)
Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
Cluster Addr: 172.20.20.31 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:
...
2020-06-19T23:59:25.729Z [INFO] agent.server: New leader elected: payload=consul-server
...

[ ] I changed the Vagrantfile a bit. I suspect the routing table is the problem: from the macOS host I cannot reach the virtual machines on the private network via their private IPs: https://stackoverflow.com/questions/23497855/unable-to-connect-to-vagrant-private-network-from-host

So I added a ui VM to serve the Consul UI with port forwarding, but it still did not work; the log shows that port 8500 is bound to 127.0.0.1:

Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: -1, DNS: 8600)
Cluster Addr: 172.20.20.41 (LAN: 8301, WAN: 8302)

First I thought of changing the client address to 172.20.20.41, since that is the private IP I set in the Vagrantfile:

consul agent -config-file /vagrant/ui.consul.json -advertise 172.20.20.41 -client 172.20.20.41

Still no luck; localhost:8500 on the host could not connect. To confirm the -client flag was being used correctly, I checked with netstat whether the port was listening on that interface. Then I realized it was probably an iptables issue: traffic on that interface was not being forwarded out, so I simply switched to 0.0.0.0 (which means "any IPv4 address at all"):

# /vagrant/ui.consul.json sets ui to true
consul agent -config-file /vagrant/ui.consul.json -advertise 172.20.20.41 -client 0.0.0.0

# output
==> Starting Consul agent...
Version: 'v1.8.0'
Node ID: '10ccbe63-bef0-3cf6-b24b-e0a53bdef213'
Node name: 'ui'
Datacenter: 'dc1' (Segment: '')
Server: false (Bootstrap: false)
Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, gRPC: -1, DNS: 8600)
Cluster Addr: 172.20.20.41 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:
...
==> Consul agent running!
...
2020-06-20T03:38:12.476Z [INFO] agent: (LAN) joining: lan_addresses=[172.20.20.31]
2020-06-20T03:38:12.477Z [WARN] agent.client.manager: No servers available
2020-06-20T03:38:12.477Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
2020-06-20T03:38:12.480Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: consul-server 172.20.20.31
2020-06-20T03:38:12.480Z [INFO] agent: (LAN) joined: number_of_nodes=1
...
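
To double-check which interface the HTTP port ends up bound to (the netstat check mentioned above), something like this on the VM should work (a sketch; netstat comes from the net-tools package, ss works the same way):

# 8500 should be listed on 0.0.0.0 (or the -client address), not only 127.0.0.1
sudo netstat -tlnp | grep 8500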

Alternatively, client_addr can be defined in the config JSON:

{
  "retry_join": ["172.20.20.31"],
  "data_dir": "/tmp/consul",
  "client_addr": "0.0.0.0"
}

Although the web UI is exposed through the ui virtual machine, all of the information comes from the Consul server, much like the Kubernetes NodePort pattern. It can also be accessed via the HTTP API: https://www.consul.io/api-docs

http://localhost:8500/v1/catalog/nodes
# human-readable format
http://localhost:8500/v1/catalog/nodes?pretty

DNS queries: go to the ui node. When the Consul agent is running, the DNS port is 8600:

# query node
dig @localhost -p 8600 consul-server.node.consul
# query service
dig @localhost -p 8600 consul.service.consul
# query service record, will show you the server port, such as 8300
dig @localhost -p 8600 consul.service.consul SRV

The RPC Protocol is deprecated and support was removed in Consul 0.8. Please use the HTTP API, which has support for all features of the RPC Protocol.

Consul Commands

Two useful commands are mentioned here; they were originally implemented over RPC, but that has since changed:

# the target agent can be specified via -http-addr
# show debugging info
consul info [-http-addr=172.20.20.31:8500]
# stream log messages; this lets you view any other agent's logs from one agent
consul monitor [-http-addr=172.20.20.31:8500]

Here 172.20.20.31 is the Consul server; you must start it with -client 0.0.0.0, otherwise the HTTP port is bound to the loopback interface and cannot be reached from other nodes.

Other commands:

# node maintenance mode
# when maintenance is enabled, services on the node will not show up in Consul DNS
# -service: put only a specific service into maintenance
consul maint -enable -reason "Because..."
consul maint
consul maint -disable

# validate config file
# the config file must be complete; it cannot be split into several parts
consul validate [config file]

# show members
consul members

# similar to docker/k8s exec
consul exec uptime

Note that consul exec is disabled by default: https://www.consul.io/docs/agent/options.html#disable_remote_exec This command is quite dangerous; it is essentially the same as SSHing to a node and running a command line. For example, if a node serves its workload from a Docker container, you could exec to the node and run docker stop xxx.

By the way, gracefully exiting the Consul process will not cause a warning or error in the UI. If you force-kill it, the node will be marked as critical.
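
For example, a graceful shutdown versus a force kill might look like this (a sketch; the pgrep pattern is only illustrative):

# graceful: the agent leaves the cluster cleanly, no critical node in the UI
consul leave
# force kill: the node stays in the member list and is flagged as critical
kill -9 $(pgrep -f 'consul agent')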

Service Discovery

One way to register a service with Consul is a service definition: https://www.consul.io/docs/agent/services For example, register the LB service with Consul; the benefit, as mentioned earlier, is that Consul will promptly update HAProxy's configuration based on the web Nginx status reported by the other agents. We will see this shortly.

Registering a service does not mean the service is healthy; a health check is also needed. For example:

{
  "service": {
    "name": "web",
    "port": 8080,
    "check": {
      "http": "http://localhost:8080",
      "interval": "10s"
    }
  }
}

Then launch the Consul client agent and add one more config file, web.service.json, for the registration. For example, on the web1 node:

consul agent -config-file /vagrant/common.json \
-advertise 172.20.20.21 \
-config-file /vagrant/web.service.json

Then check the Consul UI: the node is good but the service is unhealthy, because no Nginx is running yet, so create Nginx on the web1 node:

/vagrant/setup.web.sh

Then refresh the web page; everything is good.

You can dig the web service from the ui node; this is so-called internal service discovery, not public facing. The LB can use this data to direct traffic, which is the benefit of Consul's built-in DNS: no extra setup, and health checks come with it, which is very convenient. The public-facing LB is also registered in Consul, so if the LB goes down it is detected immediately.

dig @localhost -p 8600 web.service.consul SRV
# you will see one record per running web service

Besides querying DNS with dig, the Consul HTTP API can do it too:

# services list
curl http://localhost:8500/v1/catalog/services?pretty
# service web detail
curl http://localhost:8500/v1/catalog/service/web?pretty
# health check
# see the Status field: passing or critical
curl http://localhost:8500/v1/health/service/web?pretty

Earlier we used a service definition to register the service, but that is only one approach; registration can also be done through the HTTP API, as sketched below. There are also some tools for automatic registration: https://www.consul.io/downloads_tools
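
A minimal sketch of registering (and deregistering) the same web service through the local agent's HTTP API, assuming the agent listens on localhost:8500:

# register a service plus its health check on the local agent
curl -X PUT -d '{
  "Name": "web",
  "Port": 8080,
  "Check": { "HTTP": "http://localhost:8080", "Interval": "10s" }
}' http://localhost:8500/v1/agent/service/register

# deregister it again by service ID (defaults to the name when no ID is set)
curl -X PUT http://localhost:8500/v1/agent/service/deregister/web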

LB Dynamic Config

HAProxy: The Reliable, High Performance TCP/HTTP Load Balancer. HAProxy config file haproxy.cfg example:

global
  maxconn 4096

defaults
  mode http
  timeout connect 5s
  timeout client 50s
  timeout server 50s

listen http-in
  bind *:80
  server web1 172.20.20.21:8080
  server web2 172.20.20.22:8080

Port 8080 is where the Nginx web service comes from, and bind *:80 exposes the port through which outside traffic reaches the backend web servers via the LB (and which the health check hits). That is also why the LB's health check output in Consul is, surprisingly, "Welcome to nginx!": it is the page returned by the backend.

In the demo, we run the HAProxy container on the lb machine. How do we verify it is up and running? On any machine:

dig @localhost -p 8600 lb.service.consul SRV
# the lb record will show up

Now let's verify the LB is actually working:

# try several times; the LB will cycle through the backend servers
# you will see different IPs returned
curl http://localhost/ip.html

If you shut down a web server at this point, and HAProxy has no health check enabled, requests will still be sent to the dead server and the user gets a 503 error. This is a problem with many load balancers: they need their own health checks configured. But if you use Consul's DNS, since every server's health check is already integrated, Consul returns only healthy servers. So we can feed information from Consul to the LB dynamically.

Consul Template

Consul Template uses the Go template format: https://github.com/hashicorp/consul-template It is not only for configuring the LB; any application driven by a config file can take advantage of this tool!

Workflow: consul-template watches for changes in Consul; as changes occur, they are pushed to the consul-template daemon (running on the lb machine). The daemon renders a new HAProxy config file from an HAProxy template, and then we tell Docker to restart HAProxy (or have HAProxy reload its config).

This is the haproxy.ctmpl file:

global
  maxconn 4096

defaults
  mode http
  timeout connect 5s
  timeout client 50s
  timeout server 50s

listen http-in
  bind *:80{{range service "web"}}
  server {{.Node}} {{.Address}}:{{.Port}}{{end}}

  stats enable
  stats uri /haproxy
  stats refresh 5s

This part enables HAProxy's statistics report in the web UI. The stats page is quite intuitive, but I cannot see it here because of the routing issue mentioned earlier; access it from http://<Load balancer IP>/haproxy:

stats enable
stats uri /haproxy
stats refresh 5s

Next, install consul-template on the lb machine and run some tests with the template file:

# dry run
consul-template -template /vagrant/provision/haproxy.ctmpl -dry

Meanwhile, go to the web1 machine and run docker stop/start web; you will see real-time updates in the output of the consul-template command above.

Then create the consul-template configuration file lb.consul-template.hcl, which tells consul-template how to do its job.
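
A minimal sketch of what it might contain; the destination path and the restart command are assumptions based on the demo layout, not the repo's exact file:

# write a basic consul-template config (HCL)
cat > /vagrant/provision/lb.consul-template.hcl <<'EOF'
consul {
  address = "localhost:8500"
}

template {
  source      = "/vagrant/provision/haproxy.ctmpl"
  destination = "/vagrant/provision/haproxy.cfg"
  command     = "docker restart lb"
}
EOF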

consul-template -config /vagrant/provision/lb.consul-template.hcl
# you will see haproxy.cfg replaced by a newly rendered one

Then we can run the daemon in the background on the lb machine:

(consul-template -config /vagrant/provision/lb.consul-template.hcl >/dev/null 2>&1)&

Open the Consul UI, then in a terminal go to the web1 or web2 machine and stop/start the Docker container to watch the updates. Also, on the lb machine, run the command below to see that the LB still works well; it will not send you to the unhealthy server:

curl http://localhost/ip.html

Other tools

  • Envconsul Envconsul provides a convenient way to launch a subprocess with environment variables populated from HashiCorp Consul and Vault. Earlier we rendered a config file for a process; here Envconsul sets environment variables for the process and kicks it off for us (see the sketch after this list).

  • confd confd is a lightweight configuration management tool

  • fabio fabio is a fast, modern, zero-conf load balancing HTTP(S) and TCP router for deploying applications managed by consul
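
A quick envconsul sketch, reusing the KV prefix from the next section purely as an illustration (the prefix and the child command are arbitrary here):

# launch `env` with environment variables populated from the Consul KV prefix
envconsul -prefix prod/portal/haproxy env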

Reactive Configuration

One of the primary use cases is updating application configuration: for example, when services change, inject the changes into Consul key/value pairs and have them pushed into our application.

Note that the key/value store should not be used as a database; that is not what it is intended for! But the way it works is almost identical to etcd: https://etcd.io/

Go to the Consul UI to add key/value pairs: create the folder path /prod/portal/haproxy, then create key/value pairs in it:

maxconn 2048
stats enable
timeout-client 50s
timeout-connect 5s
timeout-server 50s
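
The same pairs can also be created and inspected from the command line with the consul kv subcommand (a sketch; run it on any cluster member):

consul kv put prod/portal/haproxy/maxconn 2048
consul kv put prod/portal/haproxy/stats enable
# list everything under the prefix
consul kv get -recurse prod/portal/haproxy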

SSH to the ui node and read the stored key/value pairs:

# list all pairs
curl http://localhost:8500/v1/kv/?recurse'&'pretty

# add key/value via HTTP API
# /prod/portal/haproxy is the path we created before
curl -X PUT -d '50s' http://localhost:8500/v1/kv/prod/portal/haproxy/timeout-server
# delete
curl -X DELETE http://localhost:8500/v1/kv/prod/portal/haproxy/timeout-server
# get one
curl -X GET http://localhost:8500/v1/kv/prod/portal/haproxy/timeout-server?pretty
curl -X GET http://localhost:8500/v1/kv/prod/portal/haproxy/timeout-server?raw

The API returns JSON data; you can use jq to parse it.
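
For example, KV values come back base64-encoded in the Value field, so a decoded read might look like this (assuming jq is installed):

curl -s http://localhost:8500/v1/kv/prod/portal/haproxy/maxconn | jq -r '.[0].Value' | base64 -d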

Update the LB config template haproxy.ctmpl as follows:

global
  maxconn {{key "prod/portal/haproxy/maxconn"}}

defaults
  mode http
  timeout connect {{key "prod/portal/haproxy/timeout-connect"}}
  timeout client {{key "prod/portal/haproxy/timeout-client"}}
  timeout server {{key "prod/portal/haproxy/timeout-server"}}

listen http-in
  bind *:80{{range service "web"}}
  server {{.Node}} {{.Address}}:{{.Port}}{{end}}

  stats {{key "prod/portal/haproxy/stats"}}
  stats uri /haproxy
  stats refresh 5s

Then make the consul-template process reload its config without killing it:

# a HUP signal makes consul-template reload
killall -HUP consul-template

Then you will see the haproxy.cfg file regenerated!
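
Once the daemon is watching those keys, any further KV change is rendered automatically, without another HUP. For example (the haproxy.cfg path is assumed from the demo layout):

# lower maxconn in the KV store, then confirm the re-rendered config picked it up
curl -X PUT -d '1024' http://localhost:8500/v1/kv/prod/portal/haproxy/maxconn
grep maxconn /vagrant/provision/haproxy.cfg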

Let's talk about why this key/value setup is so important. Sometimes you do not know the right parameter values up front, and in production you may want to update them in real time. Take maxconn in the LB here: in practice, machine CPU, memory, and other constraints may force you to lower it. You could adjust it with consul maint or some other mechanism, but that would be a pain and the change would take time to converge across the infrastructure.

Using the key/value store is truly reactive configuration!

Blocking query

https://www.consul.io/api-docs/features/blocking

A blocking query is used to wait for a potential change using long polling. Not all endpoints support blocking, but each endpoint uniquely documents its support for blocking queries in the documentation.

Endpoints that support blocking queries return an HTTP header named X-Consul-Index. This is a unique identifier representing the current state of the requested resource.

Use curl -v to check HEADER info to see if it has X-Consul-Index.

This feature can be used, for example, by your own app to long-poll the Consul API and wait for changes to happen, listening to Consul reactively. This saves a lot of resources compared with periodic polling. For example:

curl -v http://localhost:8500/v1/kv/prod/portal/haproxy/stats?index=<X-Consul-Index value in header>'&'wait=40s

Whenever a change happens, the X-Consul-Index value changes.
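
A minimal long-polling loop built on that behavior might look like this (a sketch; the key, temp-file paths, and 40s wait are arbitrary):

INDEX=0
while true; do
  # block until the key changes or the wait time elapses
  curl -s -D /tmp/hdr -o /tmp/body \
    "http://localhost:8500/v1/kv/prod/portal/haproxy/stats?index=${INDEX}&wait=40s"
  # re-arm with the index returned in the response header
  INDEX=$(awk 'tolower($1)=="x-consul-index:" {print $2}' /tmp/hdr | tr -d '\r')
  echo "change or timeout at index ${INDEX}"
  cat /tmp/body
done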

Health Check

Gossip pool via Serf with edge-triggered updates, peer to peer. Serf: https://www.serfdom.io/ (in the UI, every node shows a Serf health status).

If you kill and restart the Consul agent on one node, you will see log entries like:

serf: EventMemberFailed ...
serf: EventMemberJoin ...

There are LAN gossip and WAN gossip pools.

Information disseminated:

  • Membership (discovery, joining) - joining the cluster entails only knowing the address of one other node (not required to be a server)
  • Failure detection - affords distributed health checks, no need for centralized health checking
  • Event broadcast - e.g. leader elected, custom events

System-Level Check

Very similar to the Kubernetes liveness probe.

https://www.consul.io/docs/agent/checks.html One of the primary roles of the agent is management of system-level and application-level health checks. A health check is considered to be application-level if it is associated with a service. If not associated with a service, the check monitors the health of the entire node.

So far we have only used service checks; here we add checks on node status as well, for example disk usage, memory usage, etc.

Update the common.json config file; this config file takes effect on the lb and web machines. (Note that this part of the configuration has changed in recent Consul versions.)

{
  "retry_join": [
    "172.20.20.31"
  ],
  "data_dir": "/tmp/consul",
  "client_addr": "0.0.0.0",
  "enable_script_checks": true,
  "checks": [
    {
      "id": "check_cpu_utilization",
      "name": "CPU Utilization",
      "args": ["/vagrant/provision/hc/cpu_utilization.sh"],
      "interval": "10s"
    },
    {
      "id": "check_mem_utilization",
      "name": "MEM Utilization",
      "args": ["/vagrant/provision/hc/mem_utilization.sh"],
      "interval": "10s"
    },
    {
      "id": "check_hdd_utilization",
      "name": "HDD Utilization",
      "args": ["/vagrant/provision/hc/hdd_utilization.sh"],
      "interval": "10s"
    }
  ]
}

Let's look at the mem_utilization.sh file:

AVAILABLE_RAM=`grep MemAvailable /proc/meminfo | awk '{print $2}'`
TOTAL_RAM=`grep MemTotal /proc/meminfo | awk '{print $2}'`
RAM_UTILIZATION=$(echo "scale = 2; 100-$AVAILABLE_RAM/$TOTAL_RAM*100" | bc)
RAM_UTILIZATION=${RAM_UTILIZATION%.*}

echo "RAM: ${RAM_UTILIZATION}%, ${AVAILABLE_RAM} available of ${TOTAL_RAM} total "

if (( $RAM_UTILIZATION > 95 ));
then
exit 2
fi

if (( $RAM_UTILIZATION > 70 ));
then
exit 1
fi

exit 0
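
For comparison, a hypothetical cpu_utilization.sh could follow the same convention Consul uses for script checks (exit 0 = passing, 1 = warning, anything else = critical); the repo's real script may differ:

#!/bin/bash
# sample CPU usage over one second and derive utilization from the vmstat idle column
CPU_IDLE=$(vmstat 1 2 | tail -1 | awk '{print $15}')
CPU_UTILIZATION=$((100 - CPU_IDLE))

echo "CPU: ${CPU_UTILIZATION}%"

if (( CPU_UTILIZATION > 95 )); then
  exit 2
fi
if (( CPU_UTILIZATION > 70 )); then
  exit 1
fi
exit 0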

The system-level health checks will be displayed in the Consul UI. For a stress test, install the stress tool on the web1 machine (in the demo code it is already added):

# install
sudo apt-get install stress

Run a CPU stress test; you will then see in the Consul UI that the node is unhealthy and is cycled out of the LB:

stress -c 1

Watching the Consul UI for web1, you will see the CPU check fail:

CPU: 100%
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
vagrant 3122 97.2 0.0 7312 100 pts/0 R+ 21:26 0:14 stress -c 1
root 822 1.4 10.7 578220 53828 ? Ssl 21:19 0:06 /usr/bin/docker daemon --raw-logs
vagrant 2121 0.6 11.1 785204 55636 ? Sl 21:20 0:02 consul agent -config-file /vagrant/config/common.json -config-file /vagrant/config/web.service.json -advertise 172.20.20.21
vagrant 3099 0.2 1.1 23012 5724 pts/0 Ss 21:26 0:00 -bash
root 1 0.1 0.7 33604 3792 ? Ss 21:19 0:00 /sbin/init

Once it recovers, the node puts itself back into the pool. This feature is very useful: it gives early warning for nodes that are likely to run into problems. For example, if a web server is overloaded and detected as unhealthy, the LB removes it, and once it recovers it is automatically added back!

0%