
How to Stream Media Files from S3 Directly to AWS Lambda Using FFmpeg in Python

6 min read
Serhii Hrekov
software engineer, creator, artist, programmer, projects founder

Processing media files inside AWS Lambda can be a challenge due to its resource limits, limited local disk, and timeouts. However, it's entirely possible to stream files directly from S3 into FFmpeg, process them on the fly, and avoid writing anything to disk.

In this guide, we'll cover:

  • Streaming files from S3 into Lambda
  • Using ffmpeg with stdin and stdout in memory
  • Avoiding /tmp bottlenecks
  • Examples and production-ready patterns

Why This Approach?

Traditional file processing in AWS Lambda often looks like:

  1. Download file from S3 to /tmp
  2. Run FFmpeg on the file
  3. Upload result to S3
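
For reference, a minimal sketch of that disk-based pattern (bucket and key names here are placeholders):

import subprocess
import boto3

s3 = boto3.client("s3")

def handler_with_tmp(event, context):
    # 1. Download the source file to local ephemeral storage
    s3.download_file("your-source-bucket", "input.mp3", "/tmp/input.mp3")

    # 2. Run FFmpeg against the local copy
    subprocess.run(
        ["/opt/bin/ffmpeg", "-i", "/tmp/input.mp3", "-t", "00:00:10", "/tmp/output.mp3"],
        check=True,
    )

    # 3. Upload the result back to S3
    s3.upload_file("/tmp/output.mp3", "your-destination-bucket", "trimmed.mp3")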

That works, but /tmp is limited to 512 MB by default (ephemeral storage can be raised to 10 GB, at extra cost), and disk I/O is relatively slow. Streaming avoids this.

Benefits:

  • Zero-disk usage (works in low-memory environments)
  • Fully stateless function
  • Faster, scalable media processing

Prerequisites

  • Python 3.9+ (AWS Lambda compatible)
  • FFmpeg binary packaged in your Lambda layer
  • boto3 for AWS SDK
  • subprocess or asyncio.subprocess
  • IAM role with S3 read/write permissions

๐Ÿ› ๏ธ Step-by-Step Exampleโ€‹

We'll show how to:

  • Stream an MP3 file from S3
  • Trim it to the first 10 seconds using FFmpeg
  • Output result as MP3
  • Upload back to S3

1. Install FFmpeg in Lambda

You need a Lambda Layer containing a statically compiled FFmpeg binary; you can build one yourself or use an existing public layer.

Make sure ffmpeg is available at /opt/bin/ffmpeg.
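
A quick sanity check at cold start can save debugging time. This is a small, optional sketch assuming the /opt/bin/ffmpeg path used throughout this post:

import os
import subprocess

FFMPEG_PATH = "/opt/bin/ffmpeg"

# Fail fast at import time if the layer is missing or mounted at a different path
if not os.path.exists(FFMPEG_PATH):
    raise RuntimeError(f"FFmpeg binary not found at {FFMPEG_PATH}; check your Lambda layer")

# Log the version once per cold start so it shows up in CloudWatch
version = subprocess.run([FFMPEG_PATH, "-version"], capture_output=True, text=True)
print(version.stdout.splitlines()[0] if version.stdout else version.stderr)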


2. Lambda Handler Example (sync)

import boto3
import subprocess
import io

s3 = boto3.client('s3')

def lambda_handler(event, context):
    input_bucket = 'your-source-bucket'
    input_key = 'input.mp3'
    output_bucket = 'your-destination-bucket'
    output_key = 'trimmed.mp3'

    # Get input file as a stream
    input_stream = s3.get_object(Bucket=input_bucket, Key=input_key)['Body']

    # FFmpeg command: keep only the first 10 seconds, reading from stdin and writing to stdout
    ffmpeg_cmd = [
        "/opt/bin/ffmpeg",
        "-i", "pipe:0",     # stdin
        "-ss", "00:00:00",
        "-t", "00:00:10",
        "-f", "mp3",
        "pipe:1"            # stdout
    ]

    result = subprocess.run(
        ffmpeg_cmd,
        input=input_stream.read(),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    )

    if result.returncode != 0:
        print(result.stderr.decode())
        raise Exception("FFmpeg failed")

    # Upload result to S3
    s3.upload_fileobj(io.BytesIO(result.stdout), output_bucket, output_key)

    return {
        'statusCode': 200,
        'body': f"Processed and uploaded to {output_bucket}/{output_key}"
    }

Notes and Best Practices

  • Memory limits: use higher Lambda memory (512 MB–1 GB); CPU allocation scales with the memory setting.
  • Avoid loading entire file: For large files, consider chunked streaming (advanced; see the sketch after this list).
  • FFmpeg options: Customize codec (-c:a), bitrate (-b:a), and format (-f) as needed.
  • Logs: Always log stderr from FFmpeg to catch conversion issues.
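
As a rough illustration of that chunked approach, here is a minimal sketch; the helper name, chunk size, and error handling are simplified assumptions. It feeds the S3 stream into FFmpeg's stdin from a background thread while the main thread consumes stdout, so neither pipe blocks:

import subprocess
import threading

def stream_through_ffmpeg(body, ffmpeg_cmd, chunk_size=1024 * 1024):
    """Feed an S3 StreamingBody into FFmpeg chunk by chunk and yield output chunks."""
    proc = subprocess.Popen(
        ffmpeg_cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # simplified; redirect to a file or a reader thread in production
    )

    def feed_stdin():
        # Copy the S3 stream into FFmpeg without holding the whole file in memory
        for chunk in body.iter_chunks(chunk_size):
            proc.stdin.write(chunk)
        proc.stdin.close()

    writer = threading.Thread(target=feed_stdin, daemon=True)
    writer.start()

    # Read converted output as it becomes available
    while True:
        out = proc.stdout.read(chunk_size)
        if not out:
            break
        yield out

    writer.join()
    if proc.wait() != 0:
        raise RuntimeError("FFmpeg failed")

Keep in mind that some containers (MP4, for example) expect a seekable output and will not write to a pipe without extra muxer flags; stream-friendly formats like MP3 work well here.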

Advanced: Asyncio Streaming with asyncio.subprocess

If you prefer asyncio (available in any supported Lambda Python runtime), you can:

  • Stream bytes directly into FFmpeg
  • Consume stdout in chunks
  • Run a fully async pipeline inside the handler (e.g. via asyncio.run)

See aiobotocore or async-ffmpeg for more ideas.
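
A rough, hedged sketch of that idea is below; boto3 is still used synchronously for brevity, so swap in aiobotocore if you want the S3 calls to be async as well:

import asyncio
import io

import boto3

s3 = boto3.client("s3")

async def trim_async(bucket, key, out_bucket, out_key):
    # For simplicity the whole object is read here; chunked feeding works too
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    proc = await asyncio.create_subprocess_exec(
        "/opt/bin/ffmpeg", "-i", "pipe:0", "-t", "00:00:10", "-f", "mp3", "pipe:1",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate(input=body)
    if proc.returncode != 0:
        raise RuntimeError(stderr.decode())

    s3.upload_fileobj(io.BytesIO(stdout), out_bucket, out_key)

def lambda_handler(event, context):
    # The Python runtime expects a sync handler, so drive the coroutine with asyncio.run
    asyncio.run(trim_async("your-source-bucket", "input.mp3",
                           "your-destination-bucket", "trimmed.mp3"))
    return {"statusCode": 200}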


Security Notes

  • Ensure that S3 keys and buckets are validated (do not trust user input directly); see the sketch after this list.
  • Do not expose FFmpeg to external input without sanitization (avoid command injection).
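
A minimal validation sketch; the bucket allow-list and key pattern below are assumptions you would adapt to your own naming scheme:

import re

ALLOWED_BUCKETS = {"your-source-bucket"}  # assumption: adjust to your account
KEY_PATTERN = re.compile(r"^[\w\-/]+\.(mp3|wav|mp4)$")  # no spaces, no "..", known extensions only

def validate_request(bucket, key):
    if bucket not in ALLOWED_BUCKETS:
        raise ValueError(f"Bucket not allowed: {bucket}")
    if ".." in key or not KEY_PATTERN.match(key):
        raise ValueError(f"Suspicious S3 key: {key}")
    return bucket, key

Because the handler builds the FFmpeg command as an argument list and never passes shell=True, user input does not reach a shell, but validating bucket names and keys still keeps the function from processing unexpected objects.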

Summary

Feature                       Benefit
Streaming S3 → FFmpeg → S3    Zero disk I/O
FFmpeg stdin/stdout           Fast and memory-safe
No /tmp writes                Works in constrained environments
Fully serverless              Perfect for AWS Lambda microservices


Final Thoughts

Streaming media files from S3 into FFmpeg inside Lambda is one of the cleanest ways to process content in modern cloud environments. Whether you're trimming audio, extracting thumbnails from video, or re-encoding formats, this approach scales beautifully.