yq is another open source tool that relies on jq to query YAML files; it has most common operations available and is syntactically similar to jq.
Must quote the expression
For example:
# -r: raw string
k get pod banzai-vault-0 -o json | jq -r '.spec.containers[].name'
Filter out null values
For example, I want to get the configMap name from a deployment. The volumes list may have multiple subitems and only one of them is a configMap, so we need to filter out the nulls:
Here I pipe through cat at the end to strip the color codes, otherwise kubectl apply will fail.
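A minimal sketch of such a null filter, assuming the configMap volume lives under .spec.template.spec.volumes (the deployment name is a placeholder):

# print the configMap name from a deployment's volumes, dropping volumes
# that have no configMap entry (secrets, emptyDirs, ...)
kubectl get deployment <deployment name> -o json | \
  jq -r '.spec.template.spec.volumes[].configMap | select(. != null) | .name'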
Loop and extract specified values
For example, the Elasticsearch API returns a list of map objects where each map has the same set of fields; I want to extract a few of the fields and output them in a specific format.
# [.index,.shard,.prirep,.node]: put the selected fields into an array, then @csv outputs each array as a CSV row
curl -s "http://localhost:9200/_cat/shards/.ds-*?h=index,shard,prirep,state,node&format=json" | \
  jq -r '.[] | [.index,.shard,.prirep,.node] | @csv' | sed 's/\"//g'
# add helm repo if needed
helm repo add <repo name> <repo URL> \
  --username xx \
  --password xx

# sync helm repo
# you need to run this if any chart version is updated
helm repo update

# list helm repos
helm repo list

# search
# -l: list all versions
# --version: regexp to filter version
helm search repo <chart name> [-l] [--version ^1.0.0]
# CHART VERSION is for chart upgrade or install
# shows the latest VERSION here
NAME    CHART VERSION   APP VERSION     DESCRIPTION
xxx     2.1.0           3.3.2           xxxxxx
View installed helm charts:
# list installed charts
helm list [-n <namespace>]

# check chart history
# xxx-1.0.1: xxx is chart name, 1.0.1 is chart version
helm history <release name> [-n <namespace>]
REVISION  UPDATED                   STATUS      CHART         APP VERSION  DESCRIPTION
1         Mon Oct 11 19:41:41 2021  superseded  xxx-1.0.1     1.8.3        Install complete
2         Mon Oct 18 17:39:37 2021  superseded  xxx-1.0.2     1.8.3        Upgrade complete
3         Mon Oct 18 17:41:29 2021  superseded  xxx-1.0.1     1.8.3        Rollback to 1
4         Mon Oct 18 17:44:21 2021  superseded  xxx-1.0.1     1.8.3        Upgrade complete
5         Mon Oct 18 17:55:28 2021  deployed    xxx-1.0.2-66  1.8.3        Upgrade complete

# get current chart values
# edit if necessary
helm get values <release name> [-n <namespace>] > values.yaml

# upgrade with specified version and values.yaml
# -f: specify values if necessary; if not given, the existing values are reused (as in helm get values output)
# where to get version: helm search
helm install/upgrade <release name> [repo/chart] [-n <namespace>] --version <version> [-f values.yaml]
# helm upgrade example gcloud-helm/example --version 1.0.1 -f values.yaml
# if you run helm history, it will be displayed as xxx-1.0.3 in the CHART column
# 1.0.3 is the --version value
# note that if the version format is 1.0.3.19, it has to be converted to 1.0.3-19 in the command

# see upgrade result
helm history <release name> [-n <namespace>]
Rollback chart:
# REVISION is from helm history, see above
# if REVISION is omitted, roll back to the previous release
helm rollback <release name> [-n <namespace>] [REVISION]
# note that rollback will also roll back the values

# see rollback result
helm history <release name> [-n <namespace>]
Uninstall and install chart:
# --keep-history if necessary
# in case you need to rollback
helm uninstall <release name> [--keep-history] [-n <namespace>]
Download a chart package to local disk:
# helm-repo is the repo name from "helm repo list"
# example is a chart name from that repo
# --version specifies the version of the chart
helm pull helm-repo/example --version 1.0.1
Local Chart
For development purposes, we can install directly from a local chart folder.
You need to create a Chart.yaml which contains the chart version, appVersion, etc.
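A minimal Chart.yaml sketch, written here with a heredoc (the chart name, folder and versions are placeholders):

cat > ./mychart/Chart.yaml <<'EOF'
apiVersion: v2
name: mychart
description: example chart for local development
# chart version, shown in the CHART column of helm history
version: 1.0.0
# version of the application packaged by the chart
appVersion: "1.8.3"
EOF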
# without k8s cluster
# render the template and check
# --debug: verbose output
helm template [--values <path to yaml file>] \
  [--debug] \
  <path to chart folder> \
  | less

# in k8s cluster
# real helm install but without commit
# a release name can be given as [release name]
helm install [release name] <path to chart folder> \
  [-n <namespace>] \
  --dry-run \
  --debug 2>&1 \
  | less

# real install
helm install [release name] <path to chart folder>
An application container generally has only one main service process (it can spawn child processes). An init process (PID 1) is needed to handle child reaping and signals. In other words, if your service process forks but does not reap its children, then you need an init process, otherwise zombie processes will accumulate.
# in alpine docker
RUN apk add --no-cache tini
# tini is now available at /sbin/tini
ENTRYPOINT ["/sbin/tini", "--"]
# or
ENTRYPOINT ["/sbin/tini", "--", "/docker-entrypoint.sh"]
A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. As a result, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.
In other words, if the user registers a SIGTERM handler in the init process (the SigCgt bit is set to 1), then handler == SIG_DFL is false, so the init process can receive the signal.
But the problem is that when I checked the tini init process signal bitmask, SigCgt was 0 for all fields, so the kernel would not even deliver the signal. So how come tini forwards signals if no signal would be delivered at all? I have opened a question regarding this.
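For reference, the signal bitmasks can be checked from /proc (a quick sketch; PID 1 here assumes tini runs as the container's init process):

# SigBlk: blocked, SigIgn: ignored, SigCgt: caught (a handler is registered)
grep -E "^Sig(Blk|Ign|Cgt)" /proc/1/status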
static bool sig_ignored(struct task_struct *t, int sig, bool force)
{
	/*
	 * Blocked signals are never ignored, since the
	 * signal handler may change by the time it is
	 * unblocked.
	 */
	if (sigismember(&t->blocked, sig) || sigismember(&t->real_blocked, sig))
		return false;

	/*
	 * Tracers may want to know about even ignored signal unless it
	 * is SIGKILL which can't be reported anyway but can be ignored
	 * by SIGNAL_UNKILLABLE task.
	 */
	if (t->ptrace && sig != SIGKILL)
		return false;

	return sig_task_ignored(t, sig, force);
}
You will see that blocked signals are never ignored! So tini will always receive the signals from the kernel.
Another useful benchmark tool is iperf3, which measures various aspects of network performance in a server-client mode. Search <<iperf3 Command>> for more details.
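A minimal sketch of the server-client usage (the server IP is a placeholder):

# on the server side
iperf3 -s

# on the client side, run a 10-second TCP throughput test against the server
# -u for UDP, -R to reverse the direction (server sends)
iperf3 -c <server ip> -t 10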
Analysis
Need to install perf.
# similar to top, real time cpu usage display
# Object: [.] userspace, [k] kernel
perf top
# -g: enables call-graph (stack chain/backtrace) recording
perf top -g -p <pid>

# record profiling and inspect later
# -g: enables call-graph (stack chain/backtrace) recording
perf record -g
perf report
# record can also help find short-lived processes
# boot time, load average (runnable/running + uninterruptible IO), users
uptime
w

# -c: show command line
# -b: batch mode
# -n: iteration
top -c -b -n 1 | head -n 1

# -d: highlight the successive differences
watch -d "uptime"

# overall system metrics
# focus on in, cs, r, b; check man for description
# check whether the r count far exceeds the number of CPUs
# too many in (interrupts) is also a problem
# us/sy shows whether the cpu is mainly occupied by user space or the kernel
vmstat -w -S m 2

# cpu core number
lscpu
## press 1 to see the cpu list
top

# check all cpus metrics
# determine whether the cpu usage increase is due to iowait or computing
mpstat -P ALL 1

# check which process causes high cpu utilization
# -u: cpu status: usr, sys, guest, total
pidstat -u 1

# short-lived process check
perf top
execsnoop
Versatile tool for generating system resource statistics
# combination of cpu, disk, net, system
# when CPU iowait is high, can use it to compare
# iowait vs disk read/write vs network recv/send
dstat
# buffer is from /proc/meminfo Buffers
cat /proc/meminfo | grep -E "Buffers"
# cache is from /proc/meminfo Cached + SReclaimable
cat /proc/meminfo | grep -E "SReclaimable|Cached"

# to understand what buffer and cache are, see man proc
# --Buffers:
#   Relatively temporary storage for raw disk blocks that shouldn't get tremendously large (20MB or so)
# --Cached:
#   In-memory cache for files read from the disk (the page cache). Doesn't include SwapCached
# --Slab:
#   In-kernel data structures cache.
# --SReclaimable:
#   Part of Slab, that might be reclaimed, such as caches.
# simple top-like I/O monitor
# -b: batch mode
# -n: iteration number
# -o: only show processes/threads actually doing I/O
# -P: only show processes
iotop -b -n 1 -o [-P]
Check system calls on I/O to locate files:
# -f: threads
# -T: execution time
# -tt: system timestamp
# any read/write operations?
strace [-f] [-T] [-tt] -p <pid>

# check files opened by the process
lsof -p <pid>
Also search and check <<Linux Check Disk Space>> for lsof usage.
And <<Linux Storage System>> to manage disk storage.
And <<Linux Make Big Files>> to make big file for testing.
Other useful BCC tools:
# trace file read/write
filetop
# trace kernel open system call
opensnoop
Network
Check system-level errors:
# not only used for network but for general purposes
# -e: show local timestamp
dmesg -e | tail
sar network-related commands:
# -n: statistics of network devices
# DEV: network device statistics
sar -n DEV 1

# see man sar for details
sar -n UDP 1
# ETCP: statistics about TCPv4 network errors
sar -n ETCP 1
# EDEV: statistics on failures (errors) from the network devices
sar -n EDEV 1
Network stack statistics:
# see tcp, udp numeric listening sockets
# -p: PID and name of the program to which each socket belongs
netstat -tunlp

# check tcp connection status
# LISTEN/ESTAB/TIME-WAIT, etc
# -a: display all sockets
# -n: do not resolve service names
# -t: tcp sockets
ss -ant | awk 'NR>1 {++s[$1]} END {for(k in s) print k,s[k]}'

# check interface statistics
ip -s -s link
Network sniffing: see another blog <<Logstash UDP Input Data Lost>> for more tcpdump usage. It is a last resort and expensive; check whether mirrored traffic is available in production.
# -i: interface
# -nn: no resolution
# tcp port 80 and src 192.168.1.4: filter to reduce kernel overhead
tcpdump -i eth0 -nn tcp port 80 and src 192.168.1.4 -w log.pcap
For example, tcpdump to a pcap file and analyze it with Wireshark later, using rotated or timed files to control the file size. Don't force-kill the tcpdump process, because that will corrupt the pcap file.
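A rough sketch of a timed, rotated capture (interface and limits are placeholders): -G rotates the output file every N seconds and -W caps the number of files kept.

# rotate the capture file every hour and keep at most 24 files
tcpdump -i eth0 -nn -w 'log-%Y%m%d-%H%M.pcap' -G 3600 -W 24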
UDP statistics
# -s: summary statistics for each protocol
# -u: UDP statistics
# for example 'receive buffer errors' usually indicates UDP packet dropping
watch -n1 -d netstat -su
For example, if receive buffer errors increases frequently, it usually means UDP packets are being dropped and you need to increase the socket receive buffer size or the app-level buffer/queue size.
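As a sketch, the kernel-side receive buffer limits can be raised with sysctl (the values are examples, not recommendations):

# max and default socket receive buffer sizes, in bytes
sysctl -w net.core.rmem_max=26214400
sysctl -w net.core.rmem_default=26214400
# an application can also request a bigger buffer per socket via setsockopt(SO_RCVBUF)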
Simulate packet loss for inbound (iptables) and outbound (tc-netem) traffic; check this post for details.
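A rough sketch of both approaches (interface, port and loss ratio are placeholders):

# inbound: drop ~10% of UDP packets to port 5000 using the iptables statistic module
iptables -A INPUT -p udp --dport 5000 -m statistic --mode random --probability 0.1 -j DROP

# outbound: drop 10% of packets leaving eth0 using tc-netem
tc qdisc add dev eth0 root netem loss 10%
# remove the qdisc when done
tc qdisc del dev eth0 root netem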
# --no-cache: do not rely on build cache
# -f: specify dockerfile
# context: build context location, usually . (current dir)
docker build --no-cache -t helloapp:v2 -f dockerfiles/Dockerfile <context path>
Make sure you do not include unnecessary files in your build context, as that will result in a larger image size. Or use .dockerignore to exclude files from the build context.
Pipe in the Dockerfile; no files will be sent to the build context:
# cannot use COPY in this way
# -: read Dockerfile from stdin
echo -e 'FROM busybox\nRUN echo "hello world"' | docker build -

# here document
docker build -<<EOF
FROM busybox
RUN echo "hello world"
EOF
Omitting the build context can be useful in situations where your Dockerfile does not require files to be copied into the image, and improves the build-speed, as no files are sent to the daemon.
Multi-stage builds allow you to drastically reduce the size of your final image, without struggling to reduce the number of intermediate layers and files. For example, the Elasticsearch curator Dockerfile also adopts this workflow (a sketch follows the list below):
Install tools you need to build your application
Install or update library dependencies
Generate your application
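A minimal multi-stage sketch assuming a hypothetical Go application (base images, paths and the build command are placeholders), written as a heredoc so it stays copy-pastable:

cat > Dockerfile <<'EOF'
# stage 1: build environment with the compiler and dependencies
FROM golang:1.19 AS builder
WORKDIR /src
COPY . .
RUN go build -o /app .

# stage 2: copy only the final binary into a small runtime image
FROM alpine:3.17
COPY --from=builder /app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
EOF
docker build -t myapp:latest .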
# the simplest base image
FROM scratch
To reduce complexity, dependencies, file sizes, and build times, avoid installing extra or unnecessary packages just because they might be “nice to have.” For example, you don’t need to include a text editor in a database image.
Only the instructions RUN, COPY, ADD create layers. Other instructions create temporary intermediate images, and do not increase the size of the build.
Sort multi-line arguments, for example debian:
# Always combine RUN apt-get update with apt-get install in the same RUN
# otherwise the apt-get update clause will be skipped on rebuild if no --no-cache
RUN apt-get update && apt-get install -y \
    bzr \
    cvs \
    git \
    mercurial \
    subversion \
    && rm -rf /var/lib/apt/lists/*
# cleaning up the apt cache by removing /var/lib/apt/lists reduces the image size
Dockerfile Clause
LABEL can be used to filter images with the -f option of the docker images command.
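For example (the label key/value is an assumption; it must match a LABEL set at build time):

# list only images built with LABEL com.example.version="1.0"
docker images -f "label=com.example.version=1.0"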
Using pipe:
# Docker executes these commands using the /bin/sh -c interpreter
RUN set -o pipefail && wget -O - https://some.site | wc -l > /number

# or explicitly specify a shell that supports -o pipefail
RUN ["/bin/bash", "-c", "set -o pipefail && wget -O - https://some.site | wc -l > /number"]
CMD should rarely be used in the manner of CMD ["param", "param"] in conjunction with ENTRYPOINT, unless you and your expected users are already quite familiar with how ENTRYPOINT works. CMD should almost always be used in the form of CMD ["executable", "param1", "param2"…].
Using ENTRYPOINT with a docker-entrypoint.sh helper script is also common:
COPY ./docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]
# will be substituted with the command at the end of docker run
# docker run -it --rm image_name:tag <param1> <param2> ...
CMD ["--help"]
Each ENV line creates a new intermediate layer, just like RUN commands. This means that even if you unset the environment variable in a future layer, it still persists in this layer and its value can be dumped. To prevent this, and really unset the environment variable, use a RUN command with shell commands, to set, use, and unset the variable all in a single layer. You can separate your commands with ; or &&:
# syntax=docker/dockerfile:1
FROM alpine
RUN export ADMIN_USER="mark" \
    && echo $ADMIN_USER > ./mark \
    && unset ADMIN_USER
CMD sh
Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. If multiple files need to be copied, COPY them separately where they are used rather than all in one go; this ensures each step's build cache is only invalidated when the files it actually needs change.
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they’ve been extracted and you don’t have to add another layer in your image.
You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image (I rarely use it).
Avoid installing or using sudo as it has unpredictable TTY and signal-forwarding behavior that can cause problems. If you absolutely need functionality similar to sudo, such as initializing the daemon as root but running it as non-root, consider using gosu.
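A common shape for this is an entrypoint script that does root-only setup and then drops privileges via gosu, roughly (the user name and the init step are placeholders):

#!/bin/sh
# docker-entrypoint.sh sketch: run as root, then hand off to the app user
set -e
chown -R appuser:appuser /data   # hypothetical root-only init step
exec gosu appuser "$@"           # exec the real command as the non-root user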
Lastly, to reduce layers and complexity, avoid switching USER back and forth frequently.
For clarity and reliability, you should always use absolute paths for your WORKDIR.
Think of the ONBUILD command as an instruction the parent Dockerfile gives to the child Dockerfile.
Grafana supports querying time-series databases like Prometheus and InfluxDB, and also supports Elasticsearch as a logging & analytics data source.
## pull the alpine-based image
docker pull grafana/grafana:7.0.0
docker run --detach --name=grafana --publish-all grafana/grafana:7.0.0
The default login is admin/admin. After login, go to Data Sources, select Prometheus and specify the URL, then import a dashboard to visualize the data.
Readings
Grafana vs. Kibana: The Key Differences to Know
The key difference between the two visualization tools stems from their purpose. Grafana is designed for analyzing and visualizing metrics such as system CPU, memory, disk and I/O utilization. Grafana does not allow full-text data querying. Kibana, on the other hand, runs on top of Elasticsearch and is used primarily for analyzing log messages.
The top command run in a container within a pod shows the host machine's overall metrics plus container-level process metrics. The reason is that containers inside a pod partially share /proc with the host system, including the paths for memory and CPU information. top uses /proc/stat (host machine) and /proc/<pid>/stat (container process), which are not namespace-aware.
P.S.: lxcfs, a FUSE filesystem, can create a container-native /proc, making a container behave more like a VM.
The two methods below collect data from different sources and they also refer to different metrics.
For k8s OOMKiller events, using kubectl top to predict and track is more accurate.
Kubectl Top
K8s OOMKiller uses container_memory_working_set_bytes (from cAdvisor metrics, also visible in Prometheus if deployed) as the baseline to decide whether or not to kill the pod. It is an estimate of how much memory cannot be evicted; kubectl top uses this metric as well.
After metrics-server is installed:
# show all containers' resource usage inside a pod
kubectl top pod <pod name> --containers
# show pod resource usage
kubectl top pod
# show node resource usage
kubectl top node
In prometheus expression browser, you can get the same value as kubectl top:
# value in MiB
# pod, container are the label names, depending on your case
container_memory_working_set_bytes{pod=~"<pod name>",container=~"<container name>"} / 1024 / 1024
docker stats memory display collects data from the path /sys/fs/cgroup/memory with some calculations; see the explanation below.
On the host machine, display the container stats (CPU and memory usage):
# similar to top
docker stats --no-stream <container id>
Actually the docker CLI fetches data from the Docker API, for instance v1.41 (run docker version to see the supported API version); you can get the stats data by using a curl command:
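A sketch of such a curl call through the Docker unix socket (the container id is a placeholder; stream=false returns a single sample):

curl -s --unix-socket /var/run/docker.sock \
  "http://localhost/v1.41/containers/<container id>/stats?stream=false" \
  | jq '{usage: .memory_stats.usage, inactive_file: .memory_stats.stats.inactive_file}'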
From this docker stats description:
On Linux, the Docker CLI reports memory usage by subtracting cache usage from the total memory usage. The API does not perform such a calculation but rather provides the total memory usage and the amount from the cache so that clients can use the data as needed. The cache usage is defined as the value of total_inactive_file field in the memory.stat file on cgroup v1 hosts.
On Docker 19.03 and older, the cache usage was defined as the value of cache field. On cgroup v2 hosts, the cache usage is defined as the value of inactive_file field.
memory_stats.usage is from /sys/fs/cgroup/memory/memory.usage_in_bytes.
memory_stats.stats.inactive_file is from /sys/fs/cgroup/memory/memory.stat.
So here it is:
80388096 - 17829888 = 62558208 => 59.66 MiB
This perfectly matches the docker stats value in the MEM USAGE column.
The dockershim is deprecated in k8s!! If the containerd runtime is used instead, to explore metrics usage you can check the cgroup on the host machine or go into the container and check /sys/fs/cgroup/cpu.
To calculate the container memory usage as docker stats does, inside the pod and without installing any third-party tool:
# memory in MiB: used
cd /sys/fs/cgroup/memory
cat memory.usage_in_bytes | numfmt --to=iec

# memory in MiB: used - inactive(cache)
cd /sys/fs/cgroup/memory
used=$(cat memory.usage_in_bytes)
inactive=$(grep -w inactive_file memory.stat | awk '{print $2}')
# numfmt: readable format
echo $(($used-$inactive)) | numfmt --to=iec
To calculate the container cpu usage as docker stats does, inside the pod and without installing any third-party tool:
# cpu, cpuacct dirs are softlinks
cd /sys/fs/cgroup/cpu,cpuacct
# cpuacct.stat:
# reports the total CPU time (in USER_HZ ticks)
# spent in user and system mode by all tasks in the cgroup
utime_start=$(cat cpuacct.stat | grep user | awk '{print $2}')
stime_start=$(cat cpuacct.stat | grep system | awk '{print $2}')
sleep 1
utime_end=$(cat cpuacct.stat | grep user | awk '{print $2}')
stime_end=$(cat cpuacct.stat | grep system | awk '{print $2}')
# getconf CLK_TCK aka sysconf(_SC_CLK_TCK) returns USER_HZ
# aka CLOCKS_PER_SEC which seems to be always
# 100 independent of the kernel configuration
HZ=$(getconf CLK_TCK)
# get container cpu usage
# on top of user/system cpu time
echo $(( (utime_end+stime_end-utime_start-stime_start)*100/HZ/1 )) "%"
# if the outcome is 200%, it means 2 cpus are being used, and so on
Readings
How much is too much? The Linux OOMKiller and “used” memory
We can see from this experiment that container_memory_usage_bytes does account for some filesystem pages that are being cached. We can also see that OOMKiller is tracking container_memory_working_set_bytes. This makes sense as shared filesystem cache pages can be evicted from memory at any time. There’s no point in killing the process just for using disk I/O.
Kubernetes top vs Linux top
kubectl top shows metrics for a given pod. That information is based on reports from cAdvisor, which collects real pods resource usage.
cAdvisor: container advisor
cAdvisor (Container Advisor, go project) provides container users an understanding of the resource usage and performance characteristics of their running containers.
Open-source monitoring and alerting system:
https://prometheus.io/
Prometheus collects and stores its metrics as time-series data, i.e. metrics
information is stored with the timestamp at which it was recorded, alongside
optional key-value pairs called labels.
Architecture
Learning targets:
Know how to set up prometheus cluster for testing purpose
Know how to configure prometheus/alertmanager/grafana
Counter: request count, tasks completed, error count, etc.
Query how fast the value is increasing; rate() only applies to counters as they are
monotonically increasing.
Gauge: memory usage, queue size, kafka lag, etc.
For example, avg_over_time() on gauge type.
Histogram: duration of http request, response size, etc.
To later calculate averages and percentiles, when you are happy with an approximation.
You can use the default buckets or customize your own.
The value in a bucket is cumulative: an observation is added to every bucket whose upper bound (le) is at least the observed value; see the query sketch after this list.
Summary: duration of http request, response size, etc.
More complex than Histogram; use it when you have no idea of the value range and so cannot define histogram buckets.
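A typical histogram query sketch (the metric name is an assumption): estimate the 95th percentile request duration from the cumulative buckets.

# aggregate the per-second bucket rates by the le label, then estimate the quantile
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))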
How to know the labels of a specific metric? In the Prometheus expression browser, run the
metric name and look at the console output; it will contain all labels of that metric.
There is a case where we want to capture the first 0 (non-existence) -> 1 counter
event and fire an alert; this can be captured by unless + offset, and after 1
we can use increase to catch further increments:
# ((0 -> 1 case capture) or (1 -> 1+ case capture))
((_metric_counter_ unless _metric_counter_ offset 15m) or (increase(_metric_counter_[15m]))) > 0
Query Example
Here I list some examples to explain and practice common PromQL. Some of them
are from Grafana dashboards so they contain embedded variables, but the syntax and
usage are the same in the Prometheus expression browser and Grafana.
rate (average rate!) or irate (instant rate, last 2 data points only)
calculates the per-second average rate at which a value increases over a
period of time; both automatically adjust for counter resets. If you want to
use any other aggregation (such as sum) together with rate, then you must
apply rate first, otherwise the counter resets will not be caught and you will
get weird results.
irate(spike) should only be used when graphing volatile, fast-moving counters.
Use rate(trend) for alerts and slow-moving counters, as brief changes in the
rate can reset the FOR clause and graphs consisting entirely of rare spikes
are hard to read.
For group_left(many to one!) and group_right(one to many!), here is the
example.
One query example for system load average dashboard:
# ${interval}, ${load}, ${service}, $env:
# these variables are defined by the dashboard config variables

# explanation of label_replace:
# in the vector returned by avg_over_time(), look at the instance label and check whether it
# matches the regex $env-(.+)
# if it matches, $1 is the actual value captured by the first (.+) in the regex, and the new
# vector returned by label_replace gets an extra label instance_group=$1; if it does not
# match, the original vector is returned unchanged
avg(
  label_replace(avg_over_time(node_load${load}{instance=~"^.+-${service:regex}-[0-9]+$"}[${interval}]),
                "instance_group", "$1", "instance", "$env-(.+)")
) by (instance_group) > 6
# then average the new vector, group it by the instance_group label and check whether the
# group-level average LA > 6
Run identical Prometheus servers on two or more separate machines. Identical
alerts will be deduplicated by the Alertmanager.
For high availability of the Alertmanager, you can run multiple instances in a
Mesh cluster and configure the Prometheus servers to send notifications to each
of them.
To silence one alert, use New Silence and in the matcher use alertname as the
key and the alertname value as the value (you can add more key-value pairs to filter
further). To silence multiple alerts, use a regex. Preview silence shows you how many
currently active alerts are affected, or you can just silence it so no new alert
will fire.
# list topics created
./bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --list
# display:
# number of partitions of this topic
# replica factor
# overridden configs
# in-sync replicas
./bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --topic <topic name> \
  --describe

# example output:
# "Isr" is a status showing which replicas are in-sync; the below means all replicas are good
# the Configs field shows the settings overriding the topic defaults
Topic: apple  PartitionCount: 3  ReplicationFactor: 3  Configs: cleanup.policy=delete,segment.bytes=536870912,retention.ms=172800000,retention.bytes=2000000000
    Topic: apple  Partition: 0  Leader: 28  Replicas: 28,29,27  Isr: 28,29,27
    Topic: apple  Partition: 1  Leader: 29  Replicas: 29,27,28  Isr: 29,27,28
    Topic: apple  Partition: 2  Leader: 27  Replicas: 27,28,29  Isr: 27,28,29
# only show overridden config
./bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --topics-with-overrides \
  --topic <topic name> \
  --describe

# or using kafka-configs
./bin/kafka-configs.sh \
  --zookeeper <zookeeper>:2181 \
  --entity-type topics \
  --entity-name <topic name> \
  --describe
# list consumer groups
./bin/kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --list
# check partition/offset/lag messages for each consumer group/topic
# also see topics consumed by the group
# Consumer lag indicates the lag between Kafka producers and consumers. If the rate of
# production of data far exceeds the rate at which it is getting consumed, consumer
# groups will exhibit lag.
# From the column names:
# LAG = LOG-END-OFFSET - CURRENT-OFFSET
# CLIENT-ID: xxx-0-0 means consumer 0 and its worker thread 0
./bin/kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --group <consumer group name> \
  --describe
# read the last message in the topic for the consumer group
# note that one topic can be consumed by different consumer groups,
# each has a separate consumer offset
./bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic <topic name> \
  --group <consumer group name> \
  --max-messages 1
Partition
# increase partition number
# partition number can only grow
./bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --topic <topic name> \
  --partitions <new partition number> \
  --alter
Delete Messages
If there is bad data in a message that gets the consumer stuck, we can delete messages
from the specified partition:
{
  "partitions": [
    {
      "topic": "<topic name>",
      // partition number, such as 0
      "partition": 0,
      // offset: delete all messages from the beginning of the partition up to this
      // offset (excluded).
      // The offset specified is one higher than the problematic offset reported
      // in the log
      "offset": 149615102
    }
  ],
  // check ./bin/kafka-delete-records.sh --help to see the version
  "version": 1
}
Note that if all messages need deleting from the topic, then specify in the
JSON an offset of -1.
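Then feed the JSON file to the delete-records tool, roughly (the file name is a placeholder):

./bin/kafka-delete-records.sh \
  --bootstrap-server localhost:9092 \
  --offset-json-file delete-records.json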