Ercan Ermis

notes for everyone about cloud technology

Mastering Dockerfile: Writing Efficient, Scalable Container Builds

Ercan, September 11, 2024 / October 1, 2024

Docker has revolutionized the way we develop, package, and ship applications. At the heart of this container magic is the Dockerfile—the blueprint for building Docker images. If you want to master Docker, you need to know how to write efficient and scalable Dockerfiles. Let’s dive deep into the best practices for crafting a Dockerfile that ensures optimized, lean, and maintainable container images.

1. Start with the Right Base Image

Your choice of base image sets the foundation for your container. The smaller the base, the lighter your resulting image will be. You have two main strategies here:

  • Alpine Linux: A very small image (~5 MB), well suited to microservices or applications where minimal size is critical. Note that Alpine uses musl instead of glibc, so software that expects glibc (for example, some prebuilt binaries and Python wheels) may need to be rebuilt or adjusted.
  • Official Language Images: For language-specific applications (Node.js, Python, Golang), official images are often preconfigured with necessary dependencies, saving you time. Opt for slim versions (like python:3.9-slim) to avoid bloated images.

Example:

FROM python:3.9-slim
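
If Alpine is the target instead, the musl caveat tends to show up when dependencies are installed. Here is a minimal sketch, assuming a Python app whose dependencies include C extensions without musl-compatible wheels. The requirements.txt file and the package list are assumptions made for illustration:

```dockerfile
# Alpine-based variant of the Python image
FROM python:3.9-alpine
WORKDIR /app

# Assumption: some dependencies must be compiled from source on musl,
# so a C compiler and the musl headers are installed first.
RUN apk add --no-cache gcc musl-dev

# Hypothetical dependency file, used here only for illustration
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

The toolchain can add back a good part of the size Alpine saved, which is one reason the slim images are often the easier default.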

2. Use Multi-stage Builds

Multi-stage builds are a game-changer for keeping Docker images lightweight and clean. You use multiple FROM statements in your Dockerfile, where the first few stages build the application, and the final stage only copies the necessary artifacts.

Why Multi-stage? This approach eliminates build-time dependencies, reducing your final image size dramatically.

Example:

# Build stage
FROM golang:1.17 AS builder
WORKDIR /app
COPY . .
# Disable cgo so the binary is statically linked and runs on Alpine (musl)
RUN CGO_ENABLED=0 go build -o myapp

# Production stage
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/myapp .
CMD ["./myapp"]

In this example, the final image only contains the compiled Go binary and the Alpine base image—no Go SDK, no source files—just the essentials!
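
The same idea can be applied to interpreted languages. The following is a rough sketch for a Python app, assuming a requirements.txt and an entry point called main.py (both names are hypothetical). Dependencies are installed into a virtual environment in the first stage, and only that environment is copied into the final stage:

```dockerfile
# Build stage: install dependencies into an isolated virtual environment
FROM python:3.9-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN python -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir -r requirements.txt

# Production stage: same base image, so the copied venv paths stay valid
FROM python:3.9-slim
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
# Put the venv first on PATH so "python" resolves to it
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
# main.py is a hypothetical entry point
CMD ["python", "main.py"]
```

Using the same base image in both stages is a deliberate choice here: a virtual environment hard-codes the interpreter path, so copying it into a different base may not work.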


3. Leverage .dockerignore

Just like .gitignore, .dockerignore lets you exclude unnecessary files and directories from the build context, so they are never sent to the Docker daemon and cannot end up in your image through a broad COPY. For example, why include your .git directory, local environment configs, or temporary files in your image?

This is an easy win for both performance and security.

Example .dockerignore file:

.git
node_modules
*.log
.env

4. Minimize Layer Creation

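Not every instruction adds a filesystem layer: RUN, COPY, and ADD do, while instructions such as ENV, LABEL, and CMD only add metadata. Docker caches these layers, but files written in one layer stay in the image even if a later layer deletes them, so spreading related work across many RUN instructions can produce larger images. Try to group related commands into a single RUN instruction.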

Before:

RUN apt-get update
RUN apt-get install -y python3 python3-pip
RUN pip3 install -r requirements.txt

After:

RUN apt-get update && apt-get install -y python3 python3-pip && pip3 install -r requirements.txt

This approach reduces the number of layers, and it lets any cleanup done in the same RUN instruction actually shrink the image.
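
To see why grouping matters for size, here is a small sketch with two independent stages that do the same work. The stage names (split, combined) and the choice of curl as the package are assumptions made for the comparison:

```dockerfile
# Variant 1: work split across layers
FROM debian:bookworm-slim AS split
RUN apt-get update
RUN apt-get install -y --no-install-recommends curl
# The package lists were already stored in the first RUN layer,
# so deleting them here does not make the image smaller.
RUN rm -rf /var/lib/apt/lists/*

# Variant 2: same work in a single layer
FROM debian:bookworm-slim AS combined
# The lists are created and removed within one layer,
# so they never become part of the image.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

Building each stage with docker build --target split -t demo:split . and docker build --target combined -t demo:combined . and then comparing the two in docker image ls should show the combined variant as the smaller one.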

5. Use COPY Instead of ADD

Both COPY and ADD let you move files into the image, but COPY is generally preferred because it’s more explicit and simpler. ADD has extra features like unpacking tarballs and fetching URLs, but this adds complexity. If you don’t need those extras, stick with COPY.

Example:

COPY ./source /app/source
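
The difference is easiest to see with a local archive. In this sketch, vendor.tar.gz is a hypothetical file in the build context:

```dockerfile
FROM alpine:latest

# COPY places the archive in the image unchanged, as a single file
COPY vendor.tar.gz /tmp/vendor.tar.gz

# ADD recognizes the local tar archive and unpacks it
# into the destination directory
ADD vendor.tar.gz /opt/vendor/
```

Because ADD changes behavior depending on what the source is, COPY is the safer default when a plain file copy is all you want.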

6. Use Minimal User Permissions

By default, Docker containers run as the root user, which can be a security risk. Always aim to run your applications as a non-root user.

Example:

# Create a non-root user
RUN groupadd -r appgroup && useradd -r -g appgroup appuser
USER appuser
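
In context, the user also needs to own (or at least be able to read) the application files. The sketch below assumes a Debian-based Python image and a hypothetical main.py entry point:

```dockerfile
FROM python:3.9-slim

# Create a system group and user (Debian/Ubuntu tools).
# On Alpine the rough equivalent would be:
#   addgroup -S appgroup && adduser -S -G appgroup appuser
RUN groupadd -r appgroup && useradd -r -g appgroup appuser

WORKDIR /app
# Give the non-root user ownership of the copied files
COPY --chown=appuser:appgroup . .

# Everything after this line, including the running container,
# uses the non-root user
USER appuser
# main.py is a hypothetical entry point
CMD ["python", "main.py"]
```

Placing USER after the steps that need root (such as package installation) keeps the build working while the container itself still runs unprivileged.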

7. Optimize Caching

Docker caches layers to speed up subsequent builds. Structure your Dockerfile in a way that takes full advantage of this caching. Put frequently-changing layers (like source code) at the bottom and more static layers (like dependencies) at the top.

Example:

# Install dependencies first (less frequent changes)
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy application code (more frequent changes)
COPY . .

With this approach, Docker only re-builds the later layers when code changes, speeding up the build process.
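
The same ordering applies to other ecosystems. Here is a sketch for a Node.js app, assuming a package.json with a lockfile and a hypothetical server.js entry point:

```dockerfile
FROM node:20-slim
WORKDIR /app

# Dependency manifests change rarely, so this layer and the
# install below are usually served from cache
COPY package.json package-lock.json ./
RUN npm ci

# Source code changes often; only the layers from here on are rebuilt
COPY . .
# server.js is a hypothetical entry point
CMD ["node", "server.js"]
```

Pairing this with a .dockerignore that excludes node_modules keeps the final COPY from overwriting the dependencies installed by npm ci.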

8. Clean Up After Yourself

Commands like apt-get install can leave behind unnecessary files like caches and package lists. Clean them up to reduce image size.

Example:

RUN apt-get update && apt-get install -y \
    python3 \
    && apt-get clean && rm -rf /var/lib/apt/lists/*

This keeps your image lean and avoids leftover cruft.
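
The same principle extends to build-only tools and to pip's download cache. The following is a sketch under the assumption that some Python dependencies need a compiler at install time but not at runtime. The requirements.txt file is hypothetical:

```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .

# Install the compiler, build the dependencies, then remove the compiler
# again, all in one RUN so none of it is kept in a layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && pip install --no-cache-dir -r requirements.txt \
    && apt-get purge -y --auto-remove build-essential \
    && rm -rf /var/lib/apt/lists/*
```

If the build tooling grows beyond a package or two, a multi-stage build is usually the cleaner way to get the same result.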

9. Label Your Images

Adding metadata using LABEL makes your Docker images more manageable, especially in large environments. Use labels for versioning, authorship, and other useful metadata.

Example:

LABEL maintainer="[email protected]"
LABEL version="1.0.0"
LABEL description="My Python App"
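
One possible variant uses the standardized OCI annotation keys and a build argument for the version, so the label does not need to be edited by hand. The argument name APP_VERSION is an assumption for this sketch:

```dockerfile
FROM python:3.9-slim

# Hypothetical build argument; override it with --build-arg at build time
ARG APP_VERSION=1.0.0

# Several key/value pairs can share a single LABEL instruction
LABEL org.opencontainers.image.title="My Python App" \
      org.opencontainers.image.version="${APP_VERSION}" \
      org.opencontainers.image.description="My Python App"
```

A build such as docker build --build-arg APP_VERSION=1.2.0 -t myapp . would then stamp that version into the image, and docker inspect myapp shows the resulting labels.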

10. Test Locally Before Pushing

Always test your Docker builds locally before pushing them to a registry or deploying to production. Use docker build and docker run to verify that your image is working as expected.

docker build -t myapp .
docker run --rm -p 8080:8080 myapp

Final Thoughts:

Writing a Dockerfile that’s both efficient and scalable doesn’t have to be daunting. By focusing on best practices like using multi-stage builds, minimizing layers, and optimizing caching, you can build images that are not only lightweight but also fast to deploy and easy to maintain. As you continue to build Dockerfiles, you’ll develop your own strategies for balancing image size, build speed, and functionality.

Remember: the goal is to create Docker images that scale with your application while maintaining performance and security. Keep experimenting, keep optimizing, and most importantly—keep shipping!

That’s it! Now, it’s time to go and try these practices out in your next Docker project. Let me know how it goes, or if you have any more tips to add. Happy containerizing! 🚀
