File capabilities allow users to execute programs with higher privileges. Best example is network utility ping.
A ping binary has capabilities CAP_NET_ADMIN and CAP_NET_RAW. A normal user doesn’t have CAP_NET_ADMIN privilege, since the executable file ping has that capability you can run it.
which ping /usr/bin/ping = cap_net_admin,cap_net_raw+p Which normally works as follows:
$ ping -c 1 126.96.36.199 PING 188.8.131.52 (184.108.40.206) 56(84) bytes of data. 64 bytes from 1.
Here are simple steps that you can follow to prove that the root user inside container is also root on the host. And how to mitigate this.
Root in container, root on host I have a host with docker daemon running on it. I start a normal container on it with sleep process as PID1. See in the following output that the container clever_lalande started with sleep process.
$ docker run -d –rm alpine sleep 9999 6c541cf8f7b315783d2315eebc2f7dddd1f7b26f427e182f8597b10f2746ab0b $ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 6c541cf8f7b3 alpine "sleep 9999" 12 seconds ago Up 11 seconds clever_lalande Now let’s find out the process sleep on the host.
What is Seccomp? A large number of system calls are exposed to every userland process with many of them going unused for the entire lifetime of the process. A certain subset of userland applications benefit by having a reduced set of available system calls. The resulting set reduces the total kernel surface exposed to the application. System call filtering is meant for use with those applications. Seccomp filtering provides a means for a process to specify a filter for incoming system calls.
A volume mount CVE was discovered in Kubernetes 1.9 and older which allowed access to node file system using emptyDir volume mount using subpath. The official description goes as follows:
In Kubernetes versions 1.3.x, 1.4.x, 1.5.x, 1.6.x and prior to versions 1.7.14, 1.8.9 and 1.9.4 containers using subpath volume mounts with any volume type (including non-privileged pods, subject to file permissions) can access files/directories outside of the volume, including the host’s filesystem.
I enabled PodSecurityPolicy on a minikube cluster by appending PodSecurityPolicy to the apiserver flag in minikube like this:
–extra-config=apiserver.enable-admission-plugins=Initializers,NamespaceLifecycle,\ LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,\ NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,\ ResourceQuota,PodSecurityPolicy Ideally when you have PSP enabled and if you don’t define any PSP and authorize it with right RBAC no pod will start in the cluster. But what I saw was that there were some pods still running in kube-system namespace.
$ kubectl -n kube-system get pods NAME READY STATUS RESTARTS AGE coredns-576cbf47c7-g2t8v 1⁄1 Running 4 5d11h etcd-minikube 1⁄1 Running 2 5d11h heapster-bn5xp 1⁄1 Running 2 5d11h influxdb-grafana-qzpv4 2⁄2 Running 4 5d11h kube-addon-manager-minikube 1⁄1 Running 2 5d11h kube-controller-manager-minikube 1⁄1 Running 1 4d20h kube-scheduler-minikube 1⁄1 Running 2 5d11h kubernetes-dashboard-5bb6f7c8c6-9d564 1⁄1 Running 8 5d11h storage-provisioner 1⁄1 Running 7 5d11h Which got me thinking what is wrong with the way PSPs work.
It’s always a hassle creating certificates and lot of technical jargons involved. This can be simplified, using mkcert. Install by following one of the steps mentioned in the docs.
Once installed just run:
$ mkcert -install Created a new local CA at "/home/hummer/.local/share/mkcert" 💥 [sudo] password for hummer: The local CA is now installed in the system trust store! ⚡ The local CA is now installed in the Firefox and/or Chrome/Chromium trust store (requires browser restart)!