In my blog <<Linux Capability>>, I covered the basic, general knowledge about capabilities. This post focuses on capabilities in Docker containers.
The docker run command has several flags that control runtime privilege and capabilities:

    --cap-add: Add Linux capabilities
    --cap-drop: Drop Linux capabilities
    --privileged: Give extended privileges to this container
By default, Docker containers are unprivileged and cannot, for example, run a Docker daemon inside a Docker container. This is because by default a container is not allowed to access any devices (/dev) on the host, whereas a “privileged” container is given access to all devices on the host.
The --privileged flag gives all capabilities to the container, and it also lifts all the limitations enforced by the device cgroup controller. In other words, the container can then do almost everything that the host can do. This flag exists to allow special use cases, like running Docker within Docker.
How to verify? Run a busybox container with and without --privileged. First, try it with the flag enabled:

    docker run --rm -it --privileged busybox sh
Then check the capabilities of the init process (busybox doesn’t have getpcaps):

    # cat /proc/1/status | grep -i cap
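The grep above prints the Cap* lines of the status file (CapInh, CapPrm, CapEff, CapBnd and, on newer kernels, CapAmb), each followed by a hex mask. If you only care about one of them, a small helper along these lines could pull it out. This is just a sketch: cap_field is a name I made up for illustration, and it assumes awk is available in the container.

```shell
#!/bin/sh
# cap_field FIELD: read the text of a /proc/<pid>/status file on stdin
# and print the hex mask of one capability field, e.g. CapEff.
# (cap_field is a hypothetical helper name, not an existing tool.)
cap_field() {
  awk -v field="$1:" '$1 == field { print $2 }'
}

# Effective capabilities of PID 1, if /proc is available.
if [ -r /proc/1/status ]; then
  cap_field CapEff < /proc/1/status
fi
```

The printed mask is the value that gets decoded in the next step.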
Then decode the mask on another machine that has capsh installed. With --privileged, we see the full set of capabilities:

    # capsh --decode=0000001fffffffff
Without --privileged, we only see the default ones:

    # capsh --decode=00000000a80425fb
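If capsh is not at hand, the mask can also be decoded by hand: bit N of the mask corresponds to capability number N in <linux/capability.h>. Below is a minimal sketch of that idea in plain sh. decode_caps is a hypothetical helper of mine, not part of any package, and the name table covers the capabilities I know of up to checkpoint_restore (bit 40), so newer kernels may define more.

```shell
#!/bin/sh
# Capability names in bit order (bit 0 first), as numbered in
# <linux/capability.h>. Assumption: the table ends at bit 40.
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid setuid
setpcap linux_immutable net_bind_service net_broadcast net_admin net_raw
ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace sys_pacct
sys_admin sys_boot sys_nice sys_resource sys_time sys_tty_config mknod
lease audit_write audit_control setfcap mac_override mac_admin syslog
wake_alarm block_suspend audit_read perfmon bpf checkpoint_restore"

# decode_caps HEXMASK: print one cap_<name> line per bit set in the mask.
# (decode_caps is a hypothetical helper, not an existing command.)
decode_caps() {
  mask=$((0x$1))
  bit=0
  for name in $CAP_NAMES; do
    if [ $(( ($mask >> $bit) & 1 )) -eq 1 ]; then
      echo "cap_$name"
    fi
    bit=$(($bit + 1))
  done
}

decode_caps 00000000a80425fb
```

For 00000000a80425fb this prints 14 names, from cap_chown to cap_setfcap, one per line. For 0000001fffffffff it prints every name from bit 0 through bit 36 (cap_block_suspend).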
Docker keeps a default set of capabilities for every container. The following list shows the Linux capabilities that are allowed by default and can be dropped.
- SETPCAP: Modify process capabilities.
- MKNOD: Create special files using mknod(2).
- AUDIT_WRITE: Write records to kernel auditing log.
- CHOWN: Make arbitrary changes to file UIDs and GIDs (see chown(2)).
- NET_RAW: Use RAW and PACKET sockets.
- DAC_OVERRIDE: Bypass file read, write, and execute permission checks.
- FOWNER: Bypass permission checks on operations that normally require the file system UID of the process to match the UID of the file.
- FSETID: Don’t clear set-user-ID and set-group-ID permission bits when a file is modified.
- KILL: Bypass permission checks for sending signals.
- SETGID: Make arbitrary manipulations of process GIDs and supplementary GID list.
- SETUID: Make arbitrary manipulations of process UIDs.
- NET_BIND_SERVICE: Bind a socket to internet domain privileged ports (port numbers less than 1024).
- SYS_CHROOT: Use chroot(2), change root directory.
- SETFCAP: Set file capabilities.
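To tie this list back to the hex value seen earlier: each capability is a single bit in the mask, so OR-ing the bits of these 14 defaults should give 00000000a80425fb, and dropping a capability simply clears its bit. The sketch below works through that arithmetic in plain sh. caps_mask is a hypothetical helper I made up for illustration (it is not a Docker command), the bit numbers come from <linux/capability.h>, and it assumes the default set is exactly the list above.

```shell
#!/bin/sh
# NAME:BIT pairs for the default capabilities listed above.
DEFAULT_CAPS="CHOWN:0 DAC_OVERRIDE:1 FOWNER:3 FSETID:4 KILL:5 SETGID:6 SETUID:7
SETPCAP:8 NET_BIND_SERVICE:10 NET_RAW:13 SYS_CHROOT:18 MKNOD:27
AUDIT_WRITE:29 SETFCAP:31"

# caps_mask [NAME...]: print the 16-digit hex mask of the default set
# with the named capabilities removed.
# (caps_mask is a hypothetical helper, not part of Docker.)
caps_mask() {
  mask=0
  for entry in $DEFAULT_CAPS; do
    name=${entry%%:*}
    bit=${entry##*:}
    dropped=0
    for d in "$@"; do
      if [ "$d" = "$name" ]; then
        dropped=1
      fi
    done
    if [ "$dropped" -eq 0 ]; then
      mask=$(( $mask | (1 << $bit) ))
    fi
  done
  printf '%016x\n' "$mask"
}

caps_mask            # prints 00000000a80425fb
caps_mask NET_RAW    # prints 00000000a80405fb
```

The second value is what I would expect to find in CapEff for a container started with something like docker run --rm -it --cap-drop NET_RAW busybox sh, under the same assumption about the default set.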
Further reference information is available in the capabilities(7) Linux man page.