
Container Security

This page aims to summarize security best practices that apply to all Docker or Podman projects.

Some recommendations are explicitly marked for cases where you are working with sensitive data.

Why Worry?

In Docker, containers are not hardened by default.

When you use Docker with its default settings, the whole process runs under the root account. This is mainly because of how Docker was designed (see Architecture below).

Running a container as root means that the user is root in the container. If the user breaks out of the container, they will still be root.

This short video gives a great example of how to escalate privileges on the host with a default Docker container running.

This article explains how you can root your Docker host in 10 seconds.

If it is in a lab environment, for testing purposes, and you don't use any sensitive data, then this may not be a big issue (depending on the use case). But if you are in production with all the default Docker parameters, then you may be in for some bad surprises.

The recommendations presented in this document apply to the whole container life cycle. You may not have control over everything. That is okay; secure what you can.

Introduction

First, if you are not familiar with Docker, have a look at the architecture (source).

Basically, the Docker client makes API calls to the Docker daemon, which is responsible for creating and managing containers. The Docker daemon must run with root privileges, which is why a container is root by default.

Development Vs. Production

As mentioned before, if you are in a production environment or deal with sensitive data, you must apply these best practices. They may not all be relevant, and some may be impossible to implement due to technical limitations, but do as much as you can. If you have any doubts, feel free to ask a question and discuss your specific use case and the associated security risks.

If you are in a development environment, you may apply only the most relevant best practices. If you are just exploring possibilities and testing ideas, you may drop all these recommendations - just keep in mind the short video and the article mentioned above.

Also remember that it is easier to implement security at the beginning of a process rather than at the end, when everything else is running.

Best Practices

Basics

These recommendations apply to everyone

Keep OS & Docker Up-to-Date

Staying up to date protects you from known vulnerabilities. These vulnerabilities usually end in escalating privileges to root or admin and taking over the host.

Image

These recommendations apply when you need to choose/pull an image

Only Use Images From Trusted/Official Sources

Pin Image Version

Latest tag or specific version?

An image reference follows the OCI distribution spec.

Some examples are:

  • alpine:latest (using a tag)
  • docker.io/library/alpine:3.20.1 (using a tag)
  • docker.io/library/alpine@sha256:dabf91b69c191a1a0a1628fd6bdd029c0c4018041c7f052870bb13c5a222ae76 (using a digest)

By being very specific and pinning down a version, you shield yourself from future breaking changes. On the other hand, using the latest tag ensures that more vulnerabilities are patched on the fly. Remember that a tag is just a label, and the true version behind it may change from one build to another.

This is a trade-off, but pinning to a stable release is generally recommended. Thus, you make your work reproducible, which is another key security concept.

Referencing images in deployment: To correctly pin a specific version, you must use the digest (not a tag!).

For image references in production systems you must use a digest (e.g. sha256:...) instead of a tag (e.g. 3.20.1). This ensures that an attacker cannot easily inject another image into the image registry which then gets used on service start-up.

Pin Binaries in Images

Pinning the software binaries you install is not that straightforward to do. There is a trade-off between how rigorously you pin binaries and how maintainable your Containerfile stays in the future with updates etc. Different techniques exist for the different package managers:

Nix: If you build images directly with Nix or in a Containerfile with nix installed, pinning is never an issue since Nix solves this problem by design and most of the security considerations on this page do not apply.

Debian's apt: Pinning is not that easy with apt, but the recommended way is this and this.
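A minimal sketch of apt version pinning in a Containerfile (the curl version string is illustrative; it must match a version actually available in the configured repositories, e.g. as listed by apt-cache madison curl):

```dockerfile
FROM debian:12-slim
# pin an exact package version; the version shown is an example only
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl=7.88.1-10+deb12u5 \
    && rm -rf /var/lib/apt/lists/*
```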

Alpine Package Keeper apk:

The recommended way is the following. For example:

```dockerfile
FROM alpine:3.18
RUN apk add \
    --no-cache curl=8.2.1-r2 \
    --repository=http://dl-cdn.alpinelinux.org/alpine/v3.18/main
```

Note: This will however make your Containerfile unbuildable if the package is dropped from alpine:3.18 (there is no silver bullet).

Build from Source: This method is best if you want to be more independent of the OS. Download the source, verify its checksum (sha256sum or cksum) against the published one, and then build the software from source at that specific version.
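The checksum step can be sketched as follows (the tarball here is a local stand-in; in practice you download the release archive and take the expected hash from the project's release page):

```shell
# stand-in for a downloaded source archive
printf 'example source code' > app-1.0.tar.gz

# the expected hash would normally come from the upstream release page
expected=$(sha256sum app-1.0.tar.gz | cut -d' ' -f1)

# abort the build if the archive does not match the published checksum
echo "${expected}  app-1.0.tar.gz" | sha256sum -c - || exit 1
```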

Use an Image with a Minimal Number of Layers

Alpine Linux is a recommended choice, if possible. Bitnami containers follow security best practices. Check if the application you need is already available!

Use Multi-Stage Build

Link: multi-stage build

A great way to eliminate unnecessary content in the final image, especially if you are using Go or another compiled language.
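A minimal multi-stage sketch for a Go program (image tags and paths are illustrative):

```dockerfile
# build stage: full toolchain, sources, build cache
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# final stage: only the compiled binary, no compiler or sources
FROM alpine:3.20
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```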

Verify the Signature of the Image

Enable Docker Content Trust

Docker Content Trust (DCT) provides the ability to use digital signatures for data sent to and received from remote Docker registries. These signatures allow client-side or runtime verification of the integrity and publisher of specific image tags.

If your goal is to push an image, you will need to generate a delegation key. Follow the steps here.

If your goal is to pull images, you can enable the docker trust variable like this:

export DOCKER_CONTENT_TRUST=1

Then, a docker pull someimage:latest will only succeed if someimage:latest is signed. However, an operation with an explicit content hash always succeeds as long as the digest exists. For instance:

docker pull registry.example.com/user/image@sha256:d149ab53f8718e987c3a3024bb8aa0e2caadf6c0328f1d9d850b2a2a67f2819a

No Secrets in Image

Don’t include anything in any layer that you are not prepared to be seen by anyone who has access to the image.

Scanners

Scanners run on the image. They check if there are known vulnerabilities in the technologies used. Here, we present only static scanners. They only examine the code and don’t execute it. In this document, we will not cover dynamic scanners, as they often require non-free tools and a whole infrastructure to analyze the runtime.

Scan Image with a Security Scanner

Some scanners: trivy, docker scout

You can pick one of them or any other you know. It is not necessary to run multiple scanners; they will normally all give pretty much the same results.

If you have any critical/high/medium vulnerability and you work with sensitive data, you need to fix the issues or find another image.

Install docker scout

You will need a Docker account to use it.

curl -sSfL https://raw.githubusercontent.com/docker/scout-cli/main/install.sh | sh -s --

or install manually here

To run a scan with all vulnerabilities and their summary, use:

docker scout cves python:3.4-alpine

If you just want a quick overview (no details):

docker scout quickview python:3.4-alpine
Install trivy

docker pull aquasec/trivy:0.42.1
docker run aquasec/trivy:0.42.1 image python:3.4-alpine

If you want to also check for secrets or misconfiguration, use the command (obviously, it’s a bit slower):

docker run aquasec/trivy:0.42.1 image --scanners config,secret,vuln python:3.4-alpine

Run Docker Bench Security

Link: docker bench security

The Docker Bench for Security is a script that checks for dozens of common best practices around deploying Docker containers in production. It will not scan your code for common vulnerabilities (like trivy and docker scout do), but it can help you get an insight into your global configuration.

It helps you automatically check most of the security best practices/requirements presented here. Be careful, some checks still need to be done manually.

Install

git clone https://github.com/docker/docker-bench-security.git
cd docker-bench-security
sudo sh docker-bench-security.sh

Flags

These recommendations apply if you have control over the docker run command

Set File System and Volumes to Read-Only

Set the flag --read-only when running the container

This requirement is strongly recommended when working with sensitive data.

Limit Hardware Resources

Avoid Denial of Service attacks by limiting resources. You can limit memory, CPU, the maximum number of restarts (--restart=on-failure:<number_of_restarts>), the maximum number of file descriptors (--ulimit nofile=<number>) and the maximum number of processes (--ulimit nproc=<number>).
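Put together, a resource-limited run could look like this (the values and the image name are illustrative and depend on the workload):

```shell
docker run \
  --memory=128m \
  --cpus=0.5 \
  --restart=on-failure:5 \
  --ulimit nofile=256 \
  --ulimit nproc=64 \
  myimage:latest
```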

Set the Flag no-new-privileges

Add the security flag --security-opt=no-new-privileges to prevent privilege escalation through setuid or setgid binaries.

Limit Capabilities

Docker, by default, runs with only a subset of capabilities. You can change it and drop some capabilities (flag --cap-drop) to harden your docker containers, or add some capabilities (flag --cap-add) if needed.

  • The most secure setup is to drop all capabilities (--cap-drop all) and then add only the required ones.
  • NEVER run containers with the --privileged flag (it adds ALL Linux kernel capabilities to the container)

More on this subject: Secure Your Containers with this One Weird Trick

Set Flag --icc=false or Disable docker0

By default, inter-container communication (ICC) is enabled, meaning that all containers can talk to each other over the docker0 bridge network.

docker0 is a virtual bridge interface created by Docker. It randomly chooses an address and subnet from a private defined range. All the Docker containers are connected to this bridge and use the NAT rules created by docker to communicate with the outside world (source).

One solution is to disable ICC by running the Docker daemon with the --icc=false flag. It is then required to specify which containers can communicate, using the --link=CONTAINER_NAME_or_ID:ALIAS option or by creating a dedicated network.

Note that the icc parameter only applies to the default bridge.

Alternatively, you can disable this default connection (docker0) for all containers by running the daemon with the --bridge=none option and creating a dedicated network for every connection with the command:

docker network create <network_name>

And then use it when running a container:

docker run --network=<network_name>

This requirement is strongly recommended when working with sensitive data; you should use one of these two solutions.

Containers

These recommendations apply when you write a Dockerfile

Set a User Instead of root

```dockerfile
FROM <base image>
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
# <rest of Dockerfile>
```

The USER instruction sets the user to be used inside the container. It is useful to limit the privileges of the specific program which runs within the container itself.

What risks it mitigates: The primary benefit of these approaches is that they limit an attacker's ability to gain root privileges within the container, which helps prevent them from breaking out of the container to the host.

When is it ok to run a container as root?

  • The container needs to modify the host system (i.e. the kernel).

  • The container needs to bind to privileged ports (below 1024).

    • In practice, this can be avoided by using port mapping.
  • The container needs to install software at runtime.

    • This is a really, really bad practice.

Short answer: it’s not.

Enable User Namespaces

Link: user namespaces

You should enable user namespace support in Docker daemon to utilize container user to host user re-mapping. This recommendation is beneficial where the containers you are using do not have an explicit container user defined in the container image.

If the container images that you are using have a pre-defined non-root user, this recommendation may be skipped as this feature is still in its infancy, and might result in unpredictable issues or difficulty in configuration.

Prefer COPY Instead of ADD

COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Use ADD only if you completely trust the source.

Keep Your Secrets Safe

  • Never include sensitive information in plaintext in an ENV file.
  • Never put any hardcoded secret in the image.

In general, vulnerability scanners also check if secrets are stored in the wrong places.

If you need secrets at runtime, pass them through a file or use a secret manager.

If you need secrets at build time (docker build), pass them through the --secret mechanism.
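A sketch of the BuildKit --secret mechanism (the secret id api_token, the file token.txt, and the URL are placeholder names for illustration):

```dockerfile
FROM alpine:3.20
# the secret is mounted only for this RUN step and is never stored in a layer
RUN --mount=type=secret,id=api_token \
    wget --header="Authorization: Bearer $(cat /run/secrets/api_token)" \
         -O /tmp/asset https://example.com/protected/asset
```

Built with, for example: docker build --secret id=api_token,src=./token.txt .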

Don't Allow Software Installation at Runtime

Install all the software you need during the build process, then remove the possibility of installing any other tools (by changing the user, using a minimal image, …).

Installing software at runtime is not efficient; it means the new tool has not been scanned for security vulnerabilities and the container is not immutable.

Infrastructure

These recommendations apply if you have access to the host and to the Docker daemon.

Enable SELinux or Seccomp or AppArmor

SELinux and AppArmor are access control systems built into the Linux kernel. They grant or deny access to resources and are normally installed by default on most Linux distributions.

These systems improve server security by restricting and defining how server processes handle requests and how users can interact with the system.

If you are running Docker in rootless mode, you will not be able to run AppArmor.

You can choose one of these applications (depending on the platform or your use case, you may have only one option). You don’t need to install all of them at the same time.

Some tips on how to choose, depending on your use case:

  • If you are running a CentOS container, use SELinux. CentOS uses SELinux by default; it's a built-in feature.
  • SELinux, by default, separates containers from each other and also from the host filesystem.
  • With AppArmor, it’s not possible to keep separation between containers. AppArmor separates containers from the host, but the default container policy is very loose and needs to be improved to prevent access to the entire host filesystem.
  • AppArmor is more user-friendly.
  • You can combine AppArmor and Seccomp.

This requirement is strongly recommended when working with sensitive data.
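For example, Seccomp and AppArmor can be combined on a single docker run (the profile paths and names below are placeholders; Docker also applies a default seccomp profile when none is given):

```shell
docker run \
  --security-opt seccomp=/path/to/seccomp-profile.json \
  --security-opt apparmor=my-apparmor-profile \
  myimage:latest
```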

Run Docker in Rootless Mode

Link: rootless mode

Rootless mode allows running the Docker daemon and containers as a non-root user to mitigate potential vulnerabilities in the daemon and the container runtime. Rootless mode does not require root privileges even during the installation of the Docker daemon, as long as the prerequisites are met.

Although Docker lets you run Docker commands as a regular (non-root) user, the Docker daemon that carries out those requests runs as root. As a result, regular users can make requests through their containers that can harm the system (privilege escalation).

What risks it mitigates:

If an attacker breaks out of the container, they will not have root privileges on the host, limiting the attack surface substantially. If your host is properly isolated on the network, or if it is in a lab environment and the host has no access to any sensitive data, you may not need to run in rootless mode.

This feature is not enabled by default, and running Docker in rootless mode may break some functionalities (see the official page for more information).

Set Logging Level to Min. INFO

Logs are important because they keep a trace of all interactions with Docker. Setting the logging level is the first step. If you are running a whole infrastructure, you should also consider how you will monitor these logs and how you will manage backups.

Network

Don’t Enable TCP Docker Daemon Socket

If you are running the Docker daemon with -H tcp://0.0.0.0:XXX or similar, you are exposing unencrypted and unauthenticated direct access to the Docker daemon. If the host is connected to the internet, the Docker daemon on your machine can be used by anyone on the public internet.

Don’t Expose /var/run/docker.sock to Other Containers

For some use cases, you may want to run docker inside a docker container and may use the following command:

docker run -v /var/run/docker.sock:/var/run/docker.sock -ti docker

/var/run/docker.sock is the default Unix socket. Sockets are meant for communication between processes on the same host. Docker daemon by default listens to docker.sock. If you are on the same host where Docker daemon is running, you can use the /var/run/docker.sock to manage containers.

If the Docker daemon socket has overly permissive permissions (for example 777), any user on the system can access it.

Anyone with access to the Docker daemon can perform actions such as creating, executing, and deleting containers. This can lead to the modification or destruction of important data or systems.

It is recommended to restrict the access (with 750 for instance).
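The check and fix can be sketched with a stand-in file (on a real host the path is /var/run/docker.sock and changing it requires root):

```shell
touch docker.sock.demo            # stand-in for /var/run/docker.sock
chmod 777 docker.sock.demo        # wide-open permissions, as described above
stat -c '%a' docker.sock.demo     # inspect the current mode
chmod 750 docker.sock.demo        # restrict access
stat -c '%a' docker.sock.demo     # confirm the new mode
```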

Example

This is an example of a Dockerfile/Containerfile following the security best practices (source):

```dockerfile
# minimal base image
# specific tag
FROM python:3.10-slim AS base

# environment variables to prevent buffer issues and ensure predictable behavior (if applicable)
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    LANG=C.UTF-8 \
    LC_ALL=C.UTF-8

# non-root user for the app
RUN groupadd -r appgroup && useradd --no-log-init -r -g appgroup appuser

# working directory and ensure permissions are secure
WORKDIR /app
RUN chown appuser:appgroup /app

# install only necessary packages
# clean up dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libssl-dev \
    && rm -rf /var/lib/apt/lists/*

# non-root user
USER appuser

# copy application code with proper permissions
COPY --chown=appuser:appgroup . /app

# python dependencies
# dedicated virtual environment to isolate dependencies
RUN python -m venv /app/venv \
    && . /app/venv/bin/activate \
    && pip install --no-cache-dir --upgrade pip setuptools wheel \
    && pip install --no-cache-dir -r requirements.txt

# expose application port
EXPOSE 8000

# health check to ensure container is operating correctly
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# specify the entry point
ENTRYPOINT ["/app/venv/bin/python"]
CMD ["app.py"]
```

And this is an example of a docker command:

```shell
docker run \
  --name myapp \
  -it \
  -u 4000 \
  --security-opt no-new-privileges \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  -p 8080:80 \
  --cpus=0.5 \
  --restart=on-failure:5 \
  --ulimit nofile=5 \
  --ulimit nproc=5 \
  --memory 128m \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid \
  -v /usr/local/myapp:/app/:ro \
  --network=web \
  --log-driver=<logging driver> \
  myimage:latest
```

Documentation

To Infinity and Beyond

Podman

Podman, just like Docker, is a tool for managing containers.

Unlike Docker, it has some built-in security mechanisms. The biggest difference is that Podman is daemonless by design.

Docker versus Podman Architecture (source):

In Docker, starting a container means sending a request to the Docker daemon via the Docker client. This means that the container processes are children of the Docker daemon, not the user session.

With Podman, the container is instantiated directly through a user login session.

Podman understands Docker commands. It is possible to install Podman and create an alias:

alias docker=podman

Some considerations about the two technologies (source):

| | Podman | Docker |
| --- | --- | --- |
| Daemonless architecture | ✔️ | |
| Integration with systemd | ✔️ | |
| Group containers in pods | ✔️ | |
| Supports Docker Swarm | | ✔️ |
| Supports Kubernetes YAML | ✔️ | |

Singularity

The principle of Singularity seems close to Podman (secure by design), but the documentation is hard to find (so, to be continued).