
How to Stream Media Files from S3 Directly to AWS Lambda Using FFmpeg in Python

4 min read
Serhii Hrekov
software engineer, creator, artist, programmer, projects founder

Processing media files inside AWS Lambda can be a challenge due to its resource limits, small ephemeral storage, and timeouts. However, it’s entirely possible to stream files directly from S3 into FFmpeg, process them on the fly, and avoid writing anything to disk.

In this guide, we’ll cover:

  • Streaming files from S3 into Lambda
  • Using ffmpeg with stdin and stdout in memory
  • Avoiding /tmp bottlenecks
  • Examples and production-ready patterns

Why This Approach?

Traditional file processing in AWS Lambda often looks like:

  1. Download file from S3 to /tmp
  2. Run FFmpeg on the file
  3. Upload result to S3

That works, but /tmp is limited to 512 MB by default (larger ephemeral storage can be configured, at extra cost), and the extra disk I/O adds latency. Streaming avoids this.
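For reference, the traditional three-step pattern can be sketched like this. The bucket names, keys, and the `trim_cmd` helper are illustrative, and `/opt/bin/ffmpeg` assumes the layer setup described later in this guide:

```python
import subprocess

def trim_cmd(src: str, dst: str, seconds: int = 10) -> list[str]:
    # FFmpeg invocation: overwrite output (-y), read src, keep the first `seconds`
    return ["/opt/bin/ffmpeg", "-y", "-i", src, "-t", str(seconds), dst]

def handler_with_tmp(event, context):
    import boto3  # imported lazily so the sketch is importable without the AWS SDK
    s3 = boto3.client("s3")
    # 1. Download from S3 to /tmp, the only writable path in Lambda
    s3.download_file("your-source-bucket", "input.mp3", "/tmp/input.mp3")
    # 2. Run FFmpeg against the local copy
    subprocess.run(trim_cmd("/tmp/input.mp3", "/tmp/output.mp3"), check=True)
    # 3. Upload the result back to S3
    s3.upload_file("/tmp/output.mp3", "your-destination-bucket", "trimmed.mp3")
```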

Benefits:

  • Zero disk usage (works in low-memory environments)
  • Fully stateless function
  • Faster, scalable media processing

Prerequisites

  • Python 3.9+ (AWS Lambda compatible)
  • FFmpeg binary packaged in your Lambda layer
  • boto3 for AWS SDK
  • subprocess or asyncio.subprocess
  • IAM role with S3 read/write permissions

🛠️ Step-by-Step Example

We’ll show how to:

  • Stream an MP3 file from S3
  • Trim the first 10 seconds using FFmpeg
  • Output result as MP3
  • Upload back to S3

1. Install FFmpeg to Lambda

You need a Lambda Layer that ships a statically compiled FFmpeg binary. Lambda mounts layer contents under /opt, so make sure ffmpeg is available at /opt/bin/ffmpeg.
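A small startup check helps the function fail fast when the layer is missing. This `resolve_ffmpeg` helper is a sketch (not part of any AWS API); the fallback to PATH is handy when testing the handler locally:

```python
import os
import shutil

def resolve_ffmpeg(preferred: str = "/opt/bin/ffmpeg") -> str:
    # Prefer the layer binary at /opt/bin/ffmpeg
    if os.path.isfile(preferred) and os.access(preferred, os.X_OK):
        return preferred
    # Fall back to whatever is on PATH (useful for local testing)
    found = shutil.which("ffmpeg")
    if found is None:
        raise RuntimeError("ffmpeg binary not found; is the Lambda layer attached?")
    return found
```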


2. Lambda Handler Example (sync)

import boto3
import subprocess
import io

s3 = boto3.client('s3')

def lambda_handler(event, context):
    input_bucket = 'your-source-bucket'
    input_key = 'input.mp3'
    output_bucket = 'your-destination-bucket'
    output_key = 'trimmed.mp3'

    # Get input file as a stream
    input_stream = s3.get_object(Bucket=input_bucket, Key=input_key)['Body']

    # Use ffmpeg to keep the first 10 seconds
    ffmpeg_cmd = [
        "/opt/bin/ffmpeg",
        "-i", "pipe:0",   # read input from stdin
        "-ss", "00:00:00",
        "-t", "00:00:10",
        "-f", "mp3",
        "pipe:1"          # write output to stdout
    ]

    result = subprocess.run(
        ffmpeg_cmd,
        input=input_stream.read(),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    )

    if result.returncode != 0:
        print(result.stderr.decode())
        raise Exception("FFmpeg failed")

    # Upload result to S3
    s3.upload_fileobj(io.BytesIO(result.stdout), output_bucket, output_key)

    return {
        'statusCode': 200,
        'body': f"Processed and uploaded to {output_bucket}/{output_key}"
    }

Notes and Best Practices

  • Memory limits: use higher Lambda memory (512 MB–1 GB) for better CPU.
  • Avoid loading entire file: For large files, consider chunked streaming (advanced).
  • FFmpeg options: Customize codec (-c:a), bitrate (-b:a), and format (-f) as needed.
  • Logs: Always log stderr from FFmpeg to catch conversion issues.

Advanced: Asyncio Streaming with asyncio.subprocess

With asyncio (asyncio.subprocess is available on every Python runtime Lambda currently supports), you can:

  • Stream bytes directly into FFmpeg
  • Consume stdout in chunks
  • Drive the coroutine from the synchronous handler with asyncio.run, since the Lambda Python runtime does not natively await async handlers

See aiobotocore or async-ffmpeg for more ideas.
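A minimal sketch of the async variant using only the standard library's asyncio.create_subprocess_exec. The `cmd` parameter is an assumption added so the helper can be exercised with any stdin-to-stdout command; by default it runs the same FFmpeg trim as the sync example:

```python
import asyncio

async def run_ffmpeg(data: bytes, cmd=None) -> bytes:
    """Pipe `data` through a subprocess and return its stdout."""
    if cmd is None:
        cmd = ["/opt/bin/ffmpeg", "-i", "pipe:0",
               "-t", "00:00:10", "-f", "mp3", "pipe:1"]
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() feeds stdin and collects stdout without touching disk
    out, err = await proc.communicate(input=data)
    if proc.returncode != 0:
        raise RuntimeError(err.decode())
    return out

def lambda_handler(event, context):
    # Lambda's entry point is synchronous, so drive the coroutine explicitly
    ...
```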


Security Notes

  • Ensure that S3 keys and buckets are validated (do not trust user input directly).
  • Do not expose FFmpeg to external input without sanitization (avoid command injection).
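A minimal allow-list validator for untrusted keys might look like this. The pattern is an illustrative assumption; tighten it to your own key naming scheme:

```python
import re

# Allow-list: alphanumerics plus a few safe separators, no leading slash or dot
SAFE_KEY = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]{0,1023}")

def validate_key(key: str) -> str:
    # fullmatch rejects anything outside the allow-list; the ".." check
    # additionally blocks path-traversal-looking keys
    if not SAFE_KEY.fullmatch(key) or ".." in key:
        raise ValueError(f"rejected S3 key: {key!r}")
    return key
```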

Summary

  • Streaming S3 → FFmpeg → S3: zero disk I/O
  • FFmpeg stdin/stdout: fast and memory-safe
  • No /tmp writes: works in constrained environments
  • Fully serverless: a good fit for AWS Lambda microservices

Final Thoughts

Streaming media files from S3 into FFmpeg inside Lambda is one of the cleanest ways to process content in modern cloud environments. Whether you’re trimming audio, extracting thumbnails from video, or re-encoding formats — this approach scales beautifully.