Lambda Response Streaming lets you send back large payloads from your Lambda functions without hitting the 6 MB limit that Lambda places on buffered (non-streamed) synchronous responses.

A synchronous invocation request has this shape. Both the request payload and the buffered response that comes back are capped at 6 MB:

{
  "FunctionName": "my-streaming-function",
  "Payload": "...",
  "InvocationType": "RequestResponse",
  "LogType": "Tail"
}

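When you invoke a Lambda function synchronously, Lambda buffers the entire response before returning it. If that response exceeds 6 MB, the invocation fails even though your handler code may have run to completion; behind API Gateway, the client typically sees this as a 502 Bad Gateway error. Response streaming changes this by letting the function send chunks of data as they become available, rather than waiting for the entire payload to be built. The chunks only reach the client incrementally if the invocation path supports streaming: a Lambda function URL with its invoke mode set to RESPONSE_STREAM, the InvokeWithResponseStream API, or an API Gateway integration that has been explicitly configured for streaming where the API type supports it. Any other path buffers the response as before.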
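
To make the buffered limit concrete, here is a minimal sketch that estimates whether a response dict would fit in a buffered reply. The names `BUFFERED_RESPONSE_LIMIT`, `encoded_size`, and `fits_buffered_limit` are hypothetical, and treating 6 MB as exactly 6 * 1024 * 1024 bytes is an assumption; check the current Lambda quotas for the precise figure. The size is also only an estimate, since the runtime's own serialization may differ slightly from `json.dumps`.

```python
import json

# Assumption: "6 MB" is taken as 6 * 1024 * 1024 bytes for illustration.
BUFFERED_RESPONSE_LIMIT = 6 * 1024 * 1024


def encoded_size(response: dict) -> int:
    """Approximate size in bytes of the response once serialized as JSON."""
    return len(json.dumps(response).encode("utf-8"))


def fits_buffered_limit(response: dict) -> bool:
    """Return True if the serialized response is within the assumed limit."""
    return encoded_size(response) <= BUFFERED_RESPONSE_LIMIT
```

A check like this can run in tests or just before returning, so that an oversized buffered response is caught by your own code rather than surfacing as a failed invocation.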

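To use response streaming, the function code has to write to a response stream rather than return a finished value. Lambda has no StreamingResponse object. In the managed Node.js runtimes, you wrap the handler in awslambda.streamifyResponse and write to the stream it passes in. At the time of writing, the managed Python runtime has no equivalent: a handler's return value is always serialized and buffered as a whole, so streaming from Python requires a custom runtime or a tool such as the AWS Lambda Web Adapter in front of a web framework.

Here’s a simplified Python example of the chunked generation pattern. As written, it runs on the managed Python runtime and returns a buffered response:

import json
import time

def lambda_handler(event, context):
    def stream_large_data():
        # Simulate generating a large response, chunk by chunk
        for i in range(1000):  # Generate 1000 chunks
            chunk = {"data": f"This is chunk number {i}", "timestamp": time.time()}
            yield json.dumps(chunk) + '\n'  # One JSON object per line

    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json"
        },
        # The managed Python runtime buffers the return value, so the body
        # must be a single string; a generator or file-like object here
        # cannot be serialized.
        "body": "".join(stream_large_data()),
        "isBase64Encoded": False
    }

The generator (stream_large_data) is the part that carries over to a streaming setup: with a custom runtime or a web framework behind the Lambda Web Adapter, each yielded chunk would be written to the response stream as soon as it is produced. On the managed Python runtime the handler has to join the chunks into one string, because the return value is serialized in one piece and remains subject to the 6 MB limit. A tempting shortcut, wrapping the generator in io.StringIO, does not work: its constructor accepts only a string and raises a TypeError for anything else. The Content-Type header in this case is application/json, although application/x-ndjson describes newline-delimited JSON more precisely; you could also stream plain text or binary data by adjusting this header and how you encode your chunks.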

The key benefit is that no single component has to hold the complete response at once. Instead of building the entire multi-megabyte response in memory, the function writes it in smaller, manageable chunks, and each chunk is forwarded as it arrives. This lets you serve responses larger than the 6 MB buffered limit (streamed responses have their own, higher size quota). Think of it like a video stream versus downloading an entire movie file before watching it.
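
The difference between the whole payload and a single chunk can be measured directly. The sketch below is illustrative only: `stream_stats` and `sample_chunks` are hypothetical names, and the chunk format simply mirrors the newline-delimited JSON used in this article.

```python
import json
from typing import Iterable, Iterator, Tuple


def sample_chunks(count: int) -> Iterator[str]:
    """Yield newline-delimited JSON chunks, one object per line."""
    for i in range(count):
        yield json.dumps({"data": f"This is chunk number {i}"}) + "\n"


def stream_stats(chunks: Iterable[str]) -> Tuple[int, int]:
    """Return (total bytes, largest single chunk in bytes) for a chunk stream.

    Only one chunk is held at a time, which is the memory profile a
    streaming writer has; a buffered response holds the full total instead.
    """
    total = 0
    largest = 0
    for chunk in chunks:
        size = len(chunk.encode("utf-8"))
        total += size
        largest = max(largest, size)
    return total, largest
```

Calling `stream_stats(sample_chunks(1000))` walks the generator once and reports both figures; the largest chunk stays a few dozen bytes while the total grows with the number of chunks.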

The biggest surprise is that streaming is not detected automatically from what your function returns. Setting isBase64Encoded to False or returning a file-like body does not switch anything into streaming mode. Streaming has to be enabled on the invocation path: a function URL defaults to the BUFFERED invoke mode and must be set to RESPONSE_STREAM, and SDK callers must use InvokeWithResponseStream instead of Invoke. An API Gateway integration likewise buffers unless it is explicitly configured to stream, where the API type supports that.
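
For callers that use the AWS SDK rather than HTTP, a streamed response can be read through the Lambda InvokeWithResponseStream API. The following is a minimal sketch with boto3, assuming credentials and a region are already configured and that the target function actually writes to a response stream. The helper names `iter_payload_chunks` and `invoke_streaming` are hypothetical; the event keys (`PayloadChunk`, `InvokeComplete`) follow the boto3 response shape for `invoke_with_response_stream`.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_payload_chunks(event_stream: Iterable[Dict[str, Any]]) -> Iterator[bytes]:
    """Yield raw payload bytes from an InvokeWithResponseStream event stream."""
    for event in event_stream:
        if "PayloadChunk" in event:
            yield event["PayloadChunk"]["Payload"]
        elif "InvokeComplete" in event:
            # The final event reports an error code if the function failed mid-stream.
            error_code = event["InvokeComplete"].get("ErrorCode")
            if error_code:
                details = event["InvokeComplete"].get("ErrorDetails", "")
                raise RuntimeError(f"{error_code}: {details}")


def invoke_streaming(function_name: str, payload: bytes) -> Iterator[bytes]:
    """Invoke a function and yield its response chunks as they arrive."""
    import boto3  # imported lazily so the parsing helper works without boto3

    client = boto3.client("lambda")
    response = client.invoke_with_response_stream(
        FunctionName=function_name,
        Payload=payload,
    )
    yield from iter_payload_chunks(response["EventStream"])
```

A caller would iterate over `invoke_streaming("my-streaming-function", b"{}")` and handle each chunk as it arrives. Keeping the event parsing in its own function makes it testable without touching AWS.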

The Content-Type header is vital for the client to correctly interpret the streamed data. If you’re streaming JSON objects, it’s common practice to delimit each JSON object with a newline character (\n). This allows clients to easily parse the stream, treating each line as a distinct JSON object. For example, a Node.js client might use the readline module to process the stream line by line.
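
The same line-by-line parsing can be sketched in Python. This is an illustrative helper, not part of any AWS library: `iter_ndjson` is a hypothetical name, and it assumes the stream arrives as an iterable of UTF-8 byte chunks whose boundaries may fall in the middle of a line.

```python
import json
from typing import Any, Iterable, Iterator


def iter_ndjson(chunks: Iterable[bytes]) -> Iterator[Any]:
    """Parse newline-delimited JSON from byte chunks of arbitrary size."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line; keep any partial line for the next chunk.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    # A final object without a trailing newline is still valid input.
    if buffer.strip():
        yield json.loads(buffer)
```

Buffering until a newline is seen is what makes this safe: network chunk boundaries rarely line up with object boundaries, so parsing each chunk on its own would fail intermittently.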

This capability is particularly powerful when dealing with data exports, large reports, or any scenario where you’re generating a significant amount of data programmatically. It also means your function can start returning data much faster, improving perceived performance for your users.

The next hurdle is handling connection errors or timeouts gracefully. If the client disconnects mid-stream, your Lambda function might continue processing and generating data unnecessarily, incurring costs.
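
One way to limit wasted work is to generate data lazily and stop at the first failed write. The sketch below shows only that pattern. How a client disconnect surfaces depends on the runtime and the streaming setup, and it may not be reported to the function at all; this example assumes the hypothetical `write` callable raises BrokenPipeError or ConnectionResetError when the client is gone. The `pump` name is also hypothetical.

```python
from typing import Callable, Iterable


def pump(chunks: Iterable[str], write: Callable[[str], None]) -> int:
    """Write chunks until they run out or the writer reports a disconnect.

    Returns the number of chunks written successfully. Because `chunks` is
    consumed lazily, no further chunks are generated after a failed write.
    """
    sent = 0
    for chunk in chunks:
        try:
            write(chunk)
        except (BrokenPipeError, ConnectionResetError):
            # Assumed disconnect signal: stop generating instead of finishing the job.
            break
        sent += 1
    return sent
```

The important design choice is that the data source is a generator rather than a pre-built list, so stopping the loop also stops the expensive generation.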
