The Vagrant demo file please see my git repo Infratree
Introduction
Zookeeper quorum architecture, at least 3 nodes cluster setup via Vagrant for Kafka use. Hght level information about zk:
- distributed key value store
- has voting mechanism
- used by many big data tools
- broker registration, heart-beating check
- maintaining a list of topics alongside
- leader election
- store kafka cluster id
- store ACLs if security is enabled
- quotas config if enabled
使用kafka自带的zookeeper 还是 独立的zookeeper呢?见这个回答参考. 工作项目中还是用的独立的zk. Zookeeper is going to be removed from kafka, see this article.
More references please see Zookeeper main page.
System Requirements for version 3.6.2
: ZooKeeper (also Kafka) runs in Java
, release 1.8 or greater (JDK 8 LTS, JDK 11 LTS, JDK 12 - Java 9 and 10 are not supported).
Archive download link. For example, here uses Zookeeper version 3.6.2
.
Performance Factors
Latency is key for zookeeper:
- fast disk
- no RAM swap
- separate disk for snapshots and logs
- high performance network
- resonable number of zk servers
- isolation zk process from others
Cluster Setup
Architecture
Clustered (Multi-Server) Setup. 这里面说了很多注意事项, 比如Java heap size to avoid swapping or disable swapping.
The architecture diagram of this experiment, co-locate zk and kafka in the same node, this is not recommended on production.
1 | // zk leader can be anyone of them |
Config
Generate zoo.cfg
file in each node:
1 | # download release binary version 3.6.2 |
Create zk data, log and conf directories in each zookeeper node:
1 | mkdir -p /root/zk/data |
Edit zoo.cfg
file for each zookeeper instance, use the same configuration for them:
1 | # the basic time unit in milliseonds used by zk |
In zk data directory, create myid
for each zk instance, much be unqiue:
1 | # current in 192.168.20.20 |
Commands
Run zk service commands:
1 | cd /root/apache-zookeeper-3.6.2-bin/bin |
[x] why port 2181 is bound on ipv6? 其他环境上也是如此。
After start zookeeper, check ./zkServer.sh status
to see if current node is leader or follower.
Run ./zkCli.sh
, you can execute zk CLI, see this reference.
./zkCli.sh
也可用于从外部连接一个zk server, 只要指定accessable IP 和 port即可,demo中由于就在本机,所以其实是localhost.
If start failed, see <zk package>/logs/zookeeper_audit.log
, If you don’t start another 2 zk instance, you will see periodically exception errors in <zk package>/logs/zookeeper-root-server-kafka1.out
, it will be resolved after you start all of them. The log setting is by log4j configuration from <zk package>/conf/log4j.properties
.
After the cluster is up and running, you can check the ports:
1 | # see 2181, 2888, 3888 |
在旧版本中,可以使用four letter words去检查一些状态,比如:
1 | # are you ok |
新版本已经切换到AdminServer. The AdminServer is enabled by default, access by http 8080 port:
1 | # list all commands |
ZooNavigator is a web based ZooKeeper UI and editor/browser with many features. You can launched it by docker container.
Run as Daemon
Set zookeeper as system daemon so that it will be launched every time on system boots.
systemd
zookeeper cluster setup on ubuntu, see here.
Basically speaking, first generate a zookeeper service file, for example zookeeper.service
, the prefix zookeeper
is the service name used in systemctl command. Place this file in /etc/systemd/system
folder and owned by root
.
Double curly brackets is placeholder in jinja2 template.
1 | [Unit] |
More detail about systemd
please search and see my systemd blog.
Then you must enable zookeeper starts on boot:
1 | systemctl daemon-reload |
Other systemd commands:
1 | systemctl start zookeeper |