Ercan Ermis

notes for everyone about cloud technology

Secure Your Media Files by Removing Metadata with AWS Lambda

Ercan, April 20, 2023 | April 25, 2023

In today’s digital world, images and videos often contain metadata that reveals a surprising amount of information about the media file. This metadata, such as EXIF data in images, can include sensitive details like location, device information, and more. To protect user privacy and enhance security, businesses in various industries can benefit from removing this metadata from media files. In this blog post, we’ll walk you through a simple AWS Lambda script that automatically removes metadata from uploaded images and videos in S3 buckets.

Industries That Can Benefit:

  1. Social Media Platforms: Social media platforms handle a massive number of media uploads every day. By removing metadata from images and videos, these platforms can better protect user privacy and minimize the risk of unintentional information leaks.
  2. E-Commerce: E-commerce websites often display user-generated content, such as product images and reviews. Stripping metadata from these media files ensures that customers’ private information is not inadvertently exposed.
  3. Healthcare: The healthcare industry deals with sensitive patient information, including images and videos from medical procedures. Removing metadata from these files is essential to comply with privacy regulations and protect patient confidentiality.
  4. News and Media: Journalists and media organizations publish images and videos that may contain sensitive information about sources or locations. Stripping metadata can help protect this information and maintain the integrity of their reporting.
  5. Education: Educational institutions often host and share various media files, such as lecture videos, research images, and student presentations. Removing metadata from these files ensures that private information about students, faculty, and research subjects is protected.

Benefits of Removing Metadata:

  1. Enhanced Privacy: Stripping metadata from media files helps protect sensitive information about users, locations, and devices, safeguarding user privacy.
  2. Security: By removing metadata, you reduce the risk of accidentally leaking sensitive information, which could be exploited by malicious actors.
  3. Compliance: Removing metadata can help organizations comply with data protection regulations, such as GDPR or HIPAA, that require safeguarding user data.
  4. Simplified Management: Automating metadata removal with AWS Lambda reduces the manual work needed to process media files, streamlining media management across your organization.
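
The Lambda Script:

import io
import os
import urllib.parse

import boto3
from PIL import Image
from moviepy.editor import VideoFileClip  # moviepy 1.x; 2.x uses "from moviepy import VideoFileClip"

s3 = boto3.client('s3')
TAG = 'ExifDeleted=True'

# HEIC/HEIF need a Pillow plugin such as pillow-heif; Pillow cannot open them on its own.
supported_image_extensions = ['.jpg', '.jpeg', '.png', '.tiff', '.tif', '.heic', '.heif']
# Note: H.264/AAC output is not valid for every container listed here (for example .webm).
supported_video_extensions = ['.mp4', '.mov', '.avi', '.mkv', '.webm']

def lambda_handler(event, _):
    bucket_name = os.environ['S3_BUCKET_NAME']
    # Object keys in S3 event notifications are URL-encoded.
    object_name = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    file_extension = os.path.splitext(object_name)[1].lower()

    # The upload below fires this trigger again, so skip objects that are already tagged.
    tags = s3.get_object_tagging(Bucket=bucket_name, Key=object_name)['TagSet']
    if any(t['Key'] == 'ExifDeleted' and t['Value'] == 'True' for t in tags):
        return

    if file_extension in supported_image_extensions:
        body = s3.get_object(Bucket=bucket_name, Key=object_name)['Body'].read()
        image = Image.open(io.BytesIO(body))
        image_format = image.format
        # Copy only the pixels into a fresh image so nothing from image.info
        # (EXIF, ICC profile, text chunks) is carried over on save.
        clean = Image.frombytes(image.mode, image.size, image.tobytes())
        if image.mode == 'P':
            clean.putpalette(image.getpalette())

        with io.BytesIO() as new_image_data:
            clean.save(new_image_data, format=image_format)
            new_image_data.seek(0)
            s3.put_object(Bucket=bucket_name, Key=object_name, Body=new_image_data, Tagging=TAG)

    elif file_extension in supported_video_extensions:
        # moviepy works with file paths, not in-memory buffers; /tmp is Lambda's writable directory.
        source_path = '/tmp/source' + file_extension
        output_path = '/tmp/clean' + file_extension
        s3.download_file(bucket_name, object_name, source_path)

        video = VideoFileClip(source_path)
        try:
            video.write_videofile(output_path, codec='libx264', audio_codec='aac',
                                  temp_audiofile='/tmp/temp-audio.m4a')
        finally:
            video.close()

        s3.upload_file(output_path, bucket_name, object_name, ExtraArgs={'Tagging': TAG})
        os.remove(source_path)
        os.remove(output_path)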
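
Two details of the S3 trigger are easy to miss, so here is a small sketch of them using only the standard library. The helper names `uploaded_objects` and `already_processed` are hypothetical, not part of boto3. The first reads the bucket and key from every record in the event (the script above reads the bucket from an environment variable instead) and decodes the key, since S3 URL-encodes keys in event notifications. The second inspects a tag list shaped like the `TagSet` returned by `get_object_tagging`, which is one way to stop the function from reprocessing the file it has just written back.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Yield (bucket, key) for every record in an S3 event notification."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Keys arrive URL-encoded, with spaces sent as '+'.
        yield s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def already_processed(tag_set, key="ExifDeleted", value="True"):
    """Return True if the tag list already carries the marker tag."""
    return any(tag["Key"] == key and tag["Value"] == value for tag in tag_set)


# A trimmed-down event with made-up bucket and key names.
example_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-media-bucket"},
                "object": {"key": "uploads/holiday+photo+%281%29.jpg"}}}
    ]
}
objects = list(uploaded_objects(example_event))
```

Because the cleaned file is written back under the same key, the upload triggers the function again; checking the tag before doing any work lets that second invocation exit early.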

Please note that the PIL (Pillow) and moviepy libraries require some shared libraries, including FFmpeg for moviepy, which may not be available in the default Lambda environment. You’ll need to create a custom Lambda layer that includes both libraries along with their shared dependencies. You can follow the official guide to create a custom Lambda layer for FFmpeg.
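
Before wiring up the trigger, it can help to confirm that the layer actually delivers what the script needs. The sketch below is one way to do that with the standard library. The function name `missing_dependencies` is hypothetical, and the check assumes your layer places an `ffmpeg` binary on the `PATH`; moviepy can also be pointed at a binary in other ways, so treat a reported missing `ffmpeg` as a hint rather than a verdict.

```python
import importlib.util
import shutil


def missing_dependencies(modules=("PIL", "moviepy"), binaries=("ffmpeg",)):
    """Return the names of Python modules and executables that cannot be found."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    missing += [name for name in binaries if shutil.which(name) is None]
    return missing
```

Calling `missing_dependencies()` at the top of a test invocation and logging the result gives a quick signal about whether the layer is attached and laid out as expected.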

Here is the GitHub repository: https://github.com/flightlesstux/EXIF-Metadata-Remover

Conclusion

The AWS Lambda script we’ve provided makes it easy to remove metadata from images and videos uploaded to S3 buckets, enhancing privacy and security across a wide range of industries. By implementing this solution, you can protect user information, reduce potential risks, and support compliance with data protection regulations.
