Ercan Ermis

notes for everyone about cloud technology

Automating ECR Image Cleanup with Bash

Ercan, April 11, 2025

Managing container images in Amazon ECR (Elastic Container Registry) is crucial for keeping your registry clean and cost-effective. Over time, unused or deprecated images can accumulate, potentially leading to increased storage costs and operational overhead. One common scenario is removing images that follow a specific tagging pattern—in this case, any image tagged with versions following the “9.x.x” format, where x represents one or more digits.

This article introduces a Bash script designed to automate the cleanup of ECR images. The script provides two main modes—dry-run and apply—allowing you to simulate or execute deletions based on your needs.


Script Overview

The Bash script leverages the AWS CLI and jq (a lightweight and flexible command-line JSON processor) to:

  1. List All ECR Repositories:
    It queries AWS ECR for all available repositories in your account.
  2. Fetch Images and Tags:
    For each repository, it lists image details (including image digests and tags).
  3. Apply Tag Filtering:
    It uses a regular expression (^9\.[0-9]+\.[0-9]+$) to find images whose tags match the “9.x.x” format (e.g., “9.0.1” or “9.12.3”).
  4. Operational Modes:
    • Dry Run (--dry-run): The script only lists the images that would be deleted, offering an opportunity to verify which images qualify for deletion without any risk.
    • Apply (--apply): It goes ahead and deletes the identified images from the repositories.
  5. Usage Information:
    If the script is executed without a valid argument, it outputs a helpful usage message explaining the available options.
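
The flow above can be sketched without touching AWS. In this minimal sketch, list_repos and list_tags are hypothetical stand-ins for the aws ecr describe-repositories and describe-images calls, and the repository names and tags are made-up sample data. Only the regex is taken from the script.

```shell
#!/bin/bash
# Stand-ins for the AWS calls; the data below is invented for illustration.
list_repos() { echo "web api"; }
list_tags() {
    case "$1" in
        web) echo "9.0.1 10.2.0 latest" ;;
        api) echo "9.12.3 9.1" ;;
    esac
}

REGEX='^9\.[0-9]+\.[0-9]+$'   # same pattern as the script
MODE="--dry-run"

plan=""
for repo in $(list_repos); do
    for tag in $(list_tags "$repo"); do
        # Keep only tags where the whole tag matches 9.x.x.
        if [[ $tag =~ $REGEX ]]; then
            plan+="$repo:$tag "
        fi
    done
done

echo "Mode: $MODE"
echo "Candidates: ${plan% }"   # prints: Candidates: web:9.0.1 api:9.12.3
```

Tags such as 10.2.0, latest, and 9.1 are left alone because the pattern requires exactly three numeric groups starting with 9.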

Key Features

1. Automatic Repository Discovery

  • What It Does:
    The script dynamically retrieves all available ECR repositories rather than requiring manual input or hardcoding repository names.
  • Benefit:
    This ensures that the script remains scalable and adaptable, handling changes in your account’s repository list without extra configuration.

2. Flexible Tag Filtering with Regex

  • What It Does:
    Uses an anchored regular expression that matches a tag only when the whole tag is 9. followed by two more groups of digits separated by a period.
  • Benefit:
    The regex-based approach provides precise control over which images should be targeted. It allows for maintaining a naming convention and cleaning up images that fit a specific outdated or deprecated versioning pattern.
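
To see why the anchors and the escaped dots matter, the sketch below compares the script's pattern with a looser variant. LOOSE, the check helper, and the sample tags are assumptions invented for this comparison and do not appear in the script.

```shell
#!/bin/bash
STRICT='^9\.[0-9]+\.[0-9]+$'   # pattern used by the script
LOOSE='9.[0-9]+.[0-9]+'        # hypothetical variant: no anchors, unescaped dots

check() {
    # Prints which of the two patterns accept the given tag.
    local tag="$1" strict=no loose=no
    [[ $tag =~ $STRICT ]] && strict=yes
    [[ $tag =~ $LOOSE ]] && loose=yes
    echo "$tag strict=$strict loose=$loose"
}

check 9.0.1       # 9.0.1 strict=yes loose=yes
check 19.0.1      # 19.0.1 strict=no loose=yes
check 9.0.1-rc1   # 9.0.1-rc1 strict=no loose=yes
check 9x0y1       # 9x0y1 strict=no loose=yes
```

With a destructive operation, the stricter pattern is the safer default: a loose pattern would also select release candidates and unrelated major versions.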

3. Dual-Mode Operation (Dry-Run vs. Apply)

  • What It Does:
    The script accepts command-line arguments to either simulate the deletion (--dry-run) or actually execute it (--apply).
  • Benefit:
    This dual-mode approach provides safety by allowing you to verify the impact before performing any destructive actions. It’s a valuable feature for administrators who need to test scripts in production-like environments first.
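
One way to keep the two modes from drifting apart is to route every destructive command through a single guard. The sketch below is an alternative to the inline if used in the script, not the script's own structure; run_cmd is a hypothetical helper and demo is a made-up repository name.

```shell
#!/bin/bash
MODE="--dry-run"

# run_cmd is a hypothetical helper: it executes its arguments only in apply
# mode and otherwise prints what would have been run.
run_cmd() {
    if [ "$MODE" = "--apply" ]; then
        "$@"
    else
        echo "[dry-run] $*"
    fi
}

# In dry-run mode this only prints the command; nothing is sent to AWS.
run_cmd aws ecr batch-delete-image --repository-name demo --image-ids imageTag=9.0.1
```

Because the command line is identical in both modes, what you review in the dry run is what gets executed with --apply.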

4. Robust Error Handling and Strict Mode

  • What It Does:
    The script uses Bash’s set -euo pipefail to enforce strict error checking. This setting causes the script to stop execution when errors occur, when referencing unset variables, or when any command in a pipeline fails.
  • Benefit:
    This minimizes the risk of incomplete operations or unpredictable script behavior, promoting a more reliable execution environment.
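
The effect of each strict-mode option can be observed in isolation. The following sketch runs small throwaway commands in child Bash processes; UNSET_REPO is a made-up variable name that is assumed not to exist in your environment.

```shell
#!/bin/bash
# Without pipefail, a pipeline reports the status of its last command only.
bash -c 'false | true; echo "no pipefail: $?"'                 # no pipefail: 0

# With pipefail, a failure anywhere in the pipeline is reported.
bash -c 'set -o pipefail; false | true; echo "pipefail: $?"'   # pipefail: 1

# With -u, an unset variable aborts instead of expanding to an empty string.
bash -c 'set -u; echo "repo=$UNSET_REPO"; echo "unreachable"' 2>/dev/null \
    || echo "aborted on unset variable"

# With -e, the first failing command stops the script.
bash -c 'set -e; false; echo "unreachable"' || echo "stopped on error"
```

This is also why the script appends || true to the jq call that extracts tags: with strict mode on, an image without tags would otherwise make jq fail and terminate the whole run.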

5. Clear Usage Documentation

  • What It Does:
    If no argument or an invalid argument is supplied, the script prints a help message describing each mode and exits with a non-zero status.
  • Benefit:
    This helps reduce user error and clarifies how to operate the script, making it more accessible even for less experienced users.

Pros and Cons

Pros

  • Safety and Predictability:
    The inclusion of a dry-run mode helps administrators preview changes before committing to deletion, reducing the risk of accidental data loss.
  • Automated Discovery:
    Automatic querying of all repositories simplifies administration and ensures comprehensive cleanup.
  • Regex-Based Filtering:
    A robust regular expression for tag filtering makes it easy to target specific images based on version patterns.
  • Simplicity and Extensibility:
    Written in Bash, the script is both simple to understand and modify. Its straightforward structure makes it accessible for further customization to suit additional needs.
  • Robust Error Handling:
    The script’s error-handling mechanisms (using set -euo pipefail) enhance reliability by halting execution on unexpected errors.

Cons

  • Dependency on AWS CLI and jq:
    The script requires both the AWS CLI and jq to be installed and configured correctly. In environments where these tools aren’t pre-installed, extra setup is necessary.
  • Limited Regex Flexibility:
    The regex is hardcoded to match “9.x.x” versions. Targeting a different version range, or a different tagging scheme, means editing the pattern in the script.
  • Potential for Over-Deletion:
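    If an image carries several tags and one of them matches the regex, the delete request identifies that image by its digest, so the image may be removed together with its non-matching tags (for example, latest). Review the dry-run output for images that share a digest before using --apply in a production environment.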
  • No Rollback Mechanism:
    Once an image is deleted in --apply mode, recovery is not straightforward: the image has to be rebuilt or pushed again from another copy. Keep backups or version-controlled build sources in place in case a deletion turns out to be accidental.
  • Verbose Logging:
    The script prints a line for every tag it inspects, including the ones it skips, which can be overwhelming in registries with a large number of images. Logging only the matches would make the output more concise.
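
The over-deletion risk listed above can be made visible before anything is deleted. The sketch below is a possible pre-check, not part of the script: classify_image is a hypothetical helper that receives all tags of one image and reports whether the image carries only matching tags, a mix of matching and other tags, or no matching tag at all.

```shell
#!/bin/bash
REGEX='^9\.[0-9]+\.[0-9]+$'   # same pattern as the script

# classify_image is a hypothetical helper. Arguments: every tag of one image.
classify_image() {
    local matched=0 other=0 tag
    for tag in "$@"; do
        if [[ $tag =~ $REGEX ]]; then
            matched=1
        else
            other=1
        fi
    done
    if (( matched && other )); then
        echo "shared"    # a matching tag shares the digest with other tags
    elif (( matched )); then
        echo "delete"    # every tag matches the pattern
    else
        echo "keep"      # nothing matches
    fi
}

classify_image 9.0.1            # delete
classify_image 9.0.1 latest     # shared
classify_image 10.0.0 latest    # keep
```

Images reported as shared are the ones worth inspecting by hand, since removing them could also take away a tag that is still in use.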

Best Practices and Recommendations

  1. Test in a Staging Environment First:
    Always run the script in --dry-run mode on a staging environment to verify that it correctly identifies the target images.
  2. Implement Backups:
    Ensure that images are backed up or versioned appropriately if there’s a risk of needing to roll back any changes.
  3. Monitor Logs:
    If you intend to run this script periodically (e.g., via cron), consider redirecting output logs to a file and implementing log rotation.
  4. Regular Updates:
    Maintain your script by reviewing and testing it whenever your ECR usage or image tagging strategy changes.
  5. Fine-Tune Regex When Needed:
    Adjust the regex pattern if your versioning system evolves or if you introduce new tagging conventions.
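
If the target version changes often, the pattern can be derived from a parameter instead of being edited by hand. This is a minimal sketch under the assumption that only the major version varies; build_tag_regex is a hypothetical helper and is not part of the script.

```shell
#!/bin/bash
# build_tag_regex is a hypothetical helper: it prints an anchored
# MAJOR.x.x pattern for the given major version.
build_tag_regex() {
    local major="$1"
    # Reject anything that is not a plain number, so no regex
    # metacharacters can slip into the pattern.
    if [[ ! $major =~ ^[0-9]+$ ]]; then
        echo "major version must be numeric: '$major'" >&2
        return 1
    fi
    printf '^%s\\.[0-9]+\\.[0-9]+$' "$major"
}

REGEX=$(build_tag_regex 9)
echo "$REGEX"   # prints: ^9\.[0-9]+\.[0-9]+$
```

Validating the input matters here because the resulting pattern decides what gets deleted; a value such as .* would otherwise match every tag.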

The Script

#!/bin/bash
# This script processes all ECR repositories and their images, checks for image tags matching the pattern "9.x.x"
# (where x represents one or more digits), and then either lists (dry run) or deletes (apply) those images.
#
# Usage:
#   ./cleanup_ecr.sh --dry-run    # Lists images that would be deleted without performing deletion.
#   ./cleanup_ecr.sh --apply      # Deletes images with tags matching "9.x.x".
#   ./cleanup_ecr.sh              # Displays this help message.

set -euo pipefail

function usage() {
    echo "Usage: $0 [--dry-run | --apply]"
    echo ""
    echo "  --dry-run   List images that match the tag pattern '9.x.x' without deleting them."
    echo "  --apply     Delete images that match the tag pattern '9.x.x'."
}

# Check command-line arguments
if [ "$#" -ne 1 ]; then
    usage
    exit 1
fi

MODE="$1"
if [[ "$MODE" != "--dry-run" && "$MODE" != "--apply" ]]; then
    usage
    exit 1
fi

# Regex for tags: "9." followed by digit(s), a dot, then digit(s). Anchored so the whole tag must match.
REGEX="^9\.[0-9]+\.[0-9]+$"

echo "Mode: $MODE"
echo "Fetching list of ECR repositories..."

# Retrieve all repository names.
repositories=$(aws ecr describe-repositories --query 'repositories[].repositoryName' --output text)

# Check if any repositories were returned.
if [ -z "$repositories" ]; then
    echo "No ECR repositories found."
    exit 0
fi

# Process each repository.
for repo in $repositories; do
    echo "------------------------------"
    echo "Processing repository: $repo"
    
    # Retrieve image details: digest and tags.
    images_json=$(aws ecr describe-images \
        --repository-name "$repo" \
        --query 'imageDetails[*].{Digest: imageDigest, Tags: imageTags}' \
        --output json)
    
    # Determine the number of images in this repository.
    image_count=$(echo "$images_json" | jq 'length')
    if [ "$image_count" -eq 0 ]; then
        echo "  No images found in repository $repo."
        continue
    fi

    # Iterate over each image.
    for (( i=0; i<image_count; i++ )); do
        image_digest=$(echo "$images_json" | jq -r ".[$i].Digest")
        # Extract tags; some images might have no tags.
        tags=$(echo "$images_json" | jq -r ".[$i].Tags[]" 2>/dev/null || true)

        if [ -z "$tags" ]; then
            continue
        fi
        
        # Loop through each tag.
        for tag in $tags; do
            if [[ $tag =~ $REGEX ]]; then
                echo "Match found - Repo: '$repo', Digest: '$image_digest', Tag: '$tag'"
                if [ "$MODE" == "--apply" ]; then
                    echo "  Deleting image..."
                    aws ecr batch-delete-image \
                        --repository-name "$repo" \
                        --image-ids imageDigest="$image_digest",imageTag="$tag"
                fi
            else
                echo "Skipping - Repo: '$repo', Digest: '$image_digest', Tag: '$tag'"
            fi
        done
    done
done

echo "Process completed."
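
The script above prints a "Skipping" line for every tag that does not match. One possible refinement, sketched here under the assumption that an environment variable controls verbosity, is to route messages through small helpers. VERBOSE, log_match, and log_skip are hypothetical names that the script does not define.

```shell
#!/bin/bash
# VERBOSE defaults to 0 (quiet); set VERBOSE=1 to see skipped tags as well.
VERBOSE="${VERBOSE:-0}"

log_match() {
    echo "Match found - $*"
}

log_skip() {
    # An if statement (rather than `[ ... ] && echo`) keeps the return
    # status at 0, so the helper stays safe under `set -e`.
    if [ "$VERBOSE" = "1" ]; then
        echo "Skipping - $*"
    fi
}

log_match "Repo: 'web', Tag: '9.0.1'"
log_skip "Repo: 'web', Tag: 'latest'"   # silent unless VERBOSE=1
```

Running the script as VERBOSE=1 ./cleanup_ecr.sh --dry-run would then restore the full per-tag output when it is needed for debugging.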

Conclusion

The provided Bash script offers a robust and practical solution for automating the cleanup of Amazon ECR images that match a specified tag pattern. With its clear separation between simulation (dry-run) and execution (apply) modes, along with its automated repository discovery and error-handling mechanisms, the script is a powerful tool for container image management.

However, administrators must also be aware of its dependencies and carefully tailor the regex filter to align with their specific versioning scheme. By following best practices, such as testing in a staging environment and monitoring deletion logs, the script can significantly streamline ECR maintenance, reduce storage bloat, and improve overall operational efficiency.

