Distroless: Using Minimal Container Images for Kubernetes Workloads
Introduction
To fully understand what Distroless is, one must first comprehend the concept of a Docker image. A Docker image is a lightweight, standalone, executable package that encompasses all the necessities to run a piece of software, including the code, runtime, system tools, libraries, and settings. Docker images are built from a Dockerfile, a script containing a series of instructions used to generate the image.
Distroless Docker images are a stripped-down version of regular Docker images. In essence, they only contain the absolute essentials required to run an application. This means they lack the usual operating system tools and shells. They are not devoid of an OS, but they carry only the OS’s minimal runtime components.
Why is this significant?
When building Docker images, developers often start with a base image that includes a full OS. For example, one might use a base image that contains a lightweight Linux distribution. However, most applications don’t require the entirety of that distribution. By including only the application and its direct dependencies, Distroless images reduce the attack surface and the image size, which in turn can shorten image pulls and deployments.
Structure of a Distroless Image
A Distroless image typically contains:
- The application binary or interpreted code
- The application’s direct dependencies (libraries, modules, etc.)
- Minimal system libraries required to run the application
- A runtime if needed (e.g., for Java applications, it would have a JVM)
In contrast, it doesn’t have:
- Shell utilities
- Package managers
- Any other binaries or tools typical of an OS distribution
This minimalistic structure is what lends Distroless images their unique combination of efficiency and security.
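One practical consequence of the missing shell is that instructions in a Distroless stage have to use exec form (a JSON array), because shell-form instructions are run through /bin/sh. The fragment below is a minimal sketch of that point; the app.js entry point is a placeholder, and the gcr.io/distroless/nodejs:14 tag is assumed to be available in your registry.

```dockerfile
# Sketch only: app.js is a placeholder entry point.
FROM gcr.io/distroless/nodejs:14
WORKDIR /app
COPY . .

# Shell-form instructions need /bin/sh, which this image does not ship,
# so lines like the following would fail in this stage:
#   RUN npm install
#   CMD node app.js

# Exec form runs without a shell. The image's entrypoint is already the
# node binary, so CMD only lists the arguments passed to it.
CMD ["app.js"]
```

Any step that needs a shell or a package manager, such as installing dependencies, therefore has to happen in an earlier build stage based on a regular image.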
Benefits
The main benefit of using Distroless is enhanced security. This heightened security is achieved through several attributes inherent to Distroless:
- Reduced Attack Surface: By excluding unnecessary tools, binaries, and shell, Distroless images offer a smaller surface for potential attacks. There’s simply less in the image that can be exploited.
- Minimized Vulnerabilities: With fewer components in the image, there are fewer potential points of failure. This can reduce the number of vulnerabilities and the frequency of required patches.
- No Shell: Distroless images don’t contain a shell. This means if an attacker manages to get into the container, they won’t have a shell to execute further malicious commands, making it more challenging to move laterally or escalate privileges.
- Clearer Dependency Management: By including only what’s necessary to run the application, it’s clearer what dependencies are present, making it easier to manage and update them. This clarity ensures that security patches are more straightforward to track and apply.
- Originated from Google: Being a Google project, Distroless benefits from the company’s vast expertise in cloud security and infrastructure. This pedigree can instill confidence in users prioritizing security.
In summary, the primary advantage of Distroless is its ability to provide a highly secure environment for running applications in containers by minimizing unnecessary components, thus substantially reducing potential vectors for exploitation.
Distroless vs Alpine
Choosing between Alpine and Distroless often hinges on your specific needs and the nature of your application. Here’s a breakdown of scenarios where one might be more suitable than the other:
When to Use Alpine
- Flexibility with Shell Access: If you require shell access for debugging, scripting, or system utilities, Alpine provides the ash shell, which can be immensely helpful.
- Package Management: Alpine comes with the apk package manager. If you need to add additional software packages to your container, Alpine offers a more straightforward path.
- Extended Functionality: Alpine is a full-fledged OS, albeit lightweight. This can be beneficial for applications that depend on various OS-level tools or utilities.
- Broader Compatibility: Alpine is likely to have broader compatibility with various Linux applications compared to the minimalist Distroless, which might not have all the libraries an application expects.
- Customization: If you need to customize your container environment extensively, starting with a minimal Linux distribution like Alpine might be preferable.
When to Use Distroless
- Maximized Security: Distroless images significantly reduce the attack surface since they exclude unnecessary binaries, tools, and shell. If security is a paramount concern, Distroless might be a better choice.
- Simplified Dependencies: Distroless ensures you only bundle what’s necessary to run your application, making dependency management simpler and clearer.
- Smaller Image Sizes (Potentially): For applications that don’t require additional OS-level tools or utilities, a Distroless image can be smaller than even an Alpine-based image.
- Fewer Vulnerabilities: Since Distroless images don’t include package managers, shells, or extraneous utilities, they have fewer components that can be exploited.
- Optimized Performance: With fewer components in the image, there’s less to download and unpack, which can shorten image pulls and container start-up. The runtime performance of the application itself is generally unaffected.
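For comparison, an Alpine-based image for a Node.js service might look like the sketch below. The app.js entry point and the curl package are placeholders chosen for illustration; node:14-alpine is the Alpine variant of the official Node image.

```dockerfile
# Alpine-based variant, shown for comparison only (a sketch).
FROM node:14-alpine
WORKDIR /app
COPY package*.json ./

# apk and a shell are available, so OS packages can be added at build time.
# curl is just an example of an extra tool.
RUN apk add --no-cache curl && npm install

COPY . ./

# node is on the PATH and no entrypoint wraps it, so the command names it explicitly.
CMD ["node", "app.js"]
```

Because the image ships ash and apk, both the shell-form RUN and the extra package install work here; neither would in a Distroless stage.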
Example Usage
Assume we have a simple NodeJS application whose main file is named app.js and which uses ExpressJS to serve content. We also have package.json and package-lock.json files that specify our application's dependencies.
The Dockerfile with Staged Builds
# ---- Base Node ----
FROM node:14 AS base
WORKDIR /app
COPY package*.json ./
# ---- Dependencies ----
FROM base AS dependencies
RUN npm install
# ---- Build ----
FROM base AS build
COPY . ./
COPY --from=dependencies /app/node_modules ./node_modules
# ---- Release Distroless ----
FROM gcr.io/distroless/nodejs:14
WORKDIR /app
COPY --from=build /app .
CMD ["app.js"]
- Base Node: We use a regular NodeJS image (like node:14) to set up our basic environment, copying our package.json files.
- Dependencies: Using the base as a starting point, we install our NodeJS application’s dependencies. Separating out dependency installation can be beneficial if you have a complex build process.
- Build: Here, we copy our application code and the installed dependencies. This step can include other build processes like transpiling TypeScript or other build scripts.
- Release Distroless: In the final stage, we use the Distroless NodeJS image. We copy our application and its dependencies from the build stage to this Distroless image. The result is a lightweight, minimal, and secure image that contains just our application and what’s necessary to run it.
This Dockerfile uses the multi-stage feature of Docker to first set up, resolve dependencies, and build the application using a standard NodeJS image. It then deploys the built application into a Distroless image. This approach combines the power and flexibility of standard images with the minimalism and security of Distroless.
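One optional hardening step, not part of the Dockerfile above, is to run the container as an unprivileged user. The fragment below is a sketch of a drop-in replacement for the final stage; it assumes the earlier stages are unchanged and relies on the nonroot user (UID and GID 65532) that Distroless images define.

```dockerfile
# ---- Release Distroless (non-root variant, a sketch) ----
FROM gcr.io/distroless/nodejs:14
WORKDIR /app

# --chown hands ownership of the copied files to the unprivileged user.
COPY --from=build --chown=65532:65532 /app .

# 65532 is the UID/GID of the "nonroot" user in Distroless images.
# Numeric IDs are used so the instruction does not depend on name lookup.
USER 65532:65532

# The entrypoint is already the node binary, so CMD lists only its arguments.
CMD ["app.js"]
```

Running as a non-root user means the application can no longer bind to ports below 1024 by default, so the ExpressJS server would need to listen on a higher port.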