
Compressing the Uncompressible: How Gzip Saved Our Lambda Pipeline

backend-core

Key Takeaway

Large geopandas computation payloads exceeded AWS Lambda's 256KB invocation limit, causing pipeline failures. Implementing gzip compression reduced payload sizes by 60-80%, enabling reliable Lambda-to-Lambda communication and preventing invocation errors.

The Problem

Our workflow engine orchestrates spatial computations by invoking specialized Lambda functions. These functions process geospatial data using geopandas and return results for further processing. We encountered five critical issues:

  1. Payload Size Limits: AWS Lambda caps asynchronous (Event) invocation payloads at 256KB; synchronous request/response payloads allow up to 6MB
  2. JSON Verbosity: GeoJSON and computational results are inherently large
  3. Invocation Failures: Large responses caused Lambda to reject invocations
  4. Data Loss: Failed invocations resulted in incomplete workflow execution
  5. Unpredictable Failures: Payload size varied by data, making failures hard to reproduce

Example failure:

# This Lambda invocation failed when the payload exceeded the 256KB limit
import json
import boto3

lambda_client = boto3.client('lambda')

# large_geojson_data: a large GeoJSON dict built earlier in the workflow
response = lambda_client.invoke(
    FunctionName='geopandas-processor',
    Payload=json.dumps(large_geojson_data)
)
# Error: Payload size exceeded maximum allowed

Context and Background

Our architecture uses a microservices pattern where the main backend Lambda orchestrates specialized compute functions:

Backend Core Lambda
    ↓ (invokes)
Geopandas Utils Lambda (compute-intensive)
    ↓ (returns)
Backend Core Lambda (continues workflow)

The geopandas Lambda performs operations like:

  • Polygon intersection and union
  • Spatial join operations
  • Coordinate transformations
  • Annotation overlap detection
  • Area calculations with precision

These operations return large GeoJSON structures containing:

  • Polygon coordinates (hundreds of points)
  • Feature properties and metadata
  • Computed statistics
  • Relationship mappings

A typical response for 1,000 annotations could be 400-500KB uncompressed—far exceeding Lambda limits.
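For intuition, a quick back-of-the-envelope sketch (synthetic geometry with illustrative names, not our production data) shows how fast a feature collection blows past the limit:

```python
import json

# Hypothetical sketch: estimate the JSON size of a feature collection whose
# polygons each carry a couple hundred coordinate pairs, roughly matching
# the annotation payloads described above. Sizes are illustrative only.
def synthetic_feature(points_per_polygon=200):
    coords = [[round(0.001 * i, 6), round(0.002 * i, 6)]
              for i in range(points_per_polygon)]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coords]},
        "properties": {"id": "annotation", "area": 123.456},
    }

def payload_size_kb(n_annotations):
    collection = {
        "type": "FeatureCollection",
        "features": [synthetic_feature() for _ in range(n_annotations)],
    }
    return len(json.dumps(collection)) / 1024

# A thousand such features easily exceeds the 256 KB limit
print(f"{payload_size_kb(1000):.0f} KB")
```

Exact numbers depend on coordinate precision and property richness, but the shape of the problem is clear: payload size scales linearly with annotation count, so any fixed limit will eventually be hit.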

The Solution

We implemented gzip compression for all Lambda-to-Lambda communication:

Request Compression (Client Side)

# In geopandas_client.py
import base64
import gzip
import json

import boto3

class GeopandasClient:
    def __init__(self):
        self.lambda_client = boto3.client('lambda')

    def invoke_with_compression(self, function_name, payload):
        # Serialize to JSON
        json_payload = json.dumps(payload)

        # Compress with gzip
        compressed = gzip.compress(
            json_payload.encode('utf-8'),
            compresslevel=6  # Balance between speed and ratio
        )

        # Base64-encode so the binary data survives JSON transport
        encoded_payload = base64.b64encode(compressed).decode('utf-8')

        # Invoke Lambda with the compressed envelope
        response = self.lambda_client.invoke(
            FunctionName=function_name,
            Payload=json.dumps({
                'compressed': True,
                'data': encoded_payload
            })
        )

        return self._handle_response(response)

Response Decompression (Client Side)

def _handle_response(self, response):
    payload = json.loads(response['Payload'].read())

    # Check if response is compressed
    if payload.get('compressed'):
        # Decode base64
        compressed_data = base64.b64decode(payload['data'])

        # Decompress
        decompressed = gzip.decompress(compressed_data)

        # Parse JSON
        return json.loads(decompressed.decode('utf-8'))

    # Handle uncompressed responses (backward compatible)
    return payload

Handler Updates (Server Side)

# In geopandas_utils/handler.py
import base64
import gzip
import json

def lambda_handler(event, context):
    # Detect and decompress incoming payload
    if event.get('compressed'):
        compressed_data = base64.b64decode(event['data'])
        decompressed = gzip.decompress(compressed_data)
        actual_payload = json.loads(decompressed.decode('utf-8'))
    else:
        actual_payload = event

    # Process request
    result = process_geopandas_operation(actual_payload)

    # Compress response if it's large
    result_json = json.dumps(result)

    if len(result_json) > 100_000:  # Compress if > 100KB
        compressed = gzip.compress(result_json.encode('utf-8'))
        encoded = base64.b64encode(compressed).decode('utf-8')

        return {
            'compressed': True,
            'data': encoded
        }

    return result  # Return uncompressed for small responses
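The two halves of the protocol can be exercised together in a quick round-trip check. This is a condensed sketch of the envelope logic above (`wrap` and `unwrap` are stand-in names, not functions from our codebase):

```python
import base64
import gzip
import json

# Minimal round-trip sketch of the compress-on-send / decompress-on-receive
# protocol. The envelope format ({'compressed': True, 'data': ...}) matches
# the handler and client snippets above.
def wrap(payload, threshold=100_000):
    body = json.dumps(payload)
    if len(body) > threshold:
        data = base64.b64encode(gzip.compress(body.encode("utf-8"))).decode("utf-8")
        return {"compressed": True, "data": data}
    return payload  # small payloads pass through unchanged

def unwrap(envelope):
    if isinstance(envelope, dict) and envelope.get("compressed"):
        raw = gzip.decompress(base64.b64decode(envelope["data"]))
        return json.loads(raw.decode("utf-8"))
    return envelope

big = {"coords": [[i, i + 1] for i in range(20_000)]}
envelope = wrap(big)
assert envelope["compressed"] is True       # large payload got compressed
assert unwrap(envelope) == big              # round-trip is lossless
assert unwrap(wrap({"ok": 1})) == {"ok": 1}  # small payload passes through
```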

Implementation Details

1. Compression Level Tuning

We tested different gzip compression levels:

| Level | Compression Ratio | Speed    | Use Case                     |
|-------|-------------------|----------|------------------------------|
| 1     | 40% reduction     | Fastest  | Time-critical, moderate size |
| 6     | 65% reduction     | Balanced | Default (our choice)         |
| 9     | 70% reduction     | Slowest  | Maximum compression needed   |

Level 6 provided the best balance for our use case—significant size reduction without excessive CPU time.
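A comparison like the one in the table is easy to reproduce. This sketch measures the ratio at each level on a synthetic payload (the sample data is illustrative; measure against real responses before settling on a level):

```python
import gzip
import json

# Synthetic coordinate-heavy payload standing in for a real GeoJSON response
sample = json.dumps(
    {"coords": [[round(i * 0.001, 3), round(i * 0.002, 3)] for i in range(50_000)]}
).encode("utf-8")

for level in (1, 6, 9):
    compressed = gzip.compress(sample, compresslevel=level)
    reduction = (1 - len(compressed) / len(sample)) * 100
    print(f"level {level}: {len(compressed)} bytes ({reduction:.0f}% reduction)")
```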

2. Compression Threshold

Not all payloads benefit from compression:

COMPRESSION_THRESHOLD = 50_000  # 50KB

def should_compress(payload):
    """Small payloads incur overhead without benefit"""
    payload_size = len(json.dumps(payload))
    return payload_size > COMPRESSION_THRESHOLD

3. Error Handling

class DecompressionError(Exception):
    """Raised when a payload flagged as compressed cannot be decompressed."""

def safe_decompress(compressed_data):
    try:
        return gzip.decompress(compressed_data).decode('utf-8')
    except gzip.BadGzipFile:  # Python 3.8+
        # Possibly not compressed despite the flag
        return compressed_data.decode('utf-8')
    except Exception as e:
        raise DecompressionError(f"Failed to decompress: {e}") from e

4. Backward Compatibility

We maintained compatibility with non-compressed clients:

def handle_request(event):
    if 'compressed' in event and event['compressed']:
        return decompress_and_process(event)
    else:
        return process_directly(event)

5. Monitoring

Added CloudWatch metrics to track compression effectiveness:

import logging

import boto3

logger = logging.getLogger(__name__)
cloudwatch = boto3.client('cloudwatch')

def log_compression_metrics(original_size, compressed_size):
    ratio = (1 - compressed_size / original_size) * 100
    logger.info(f"Compression: {original_size}B → {compressed_size}B ({ratio:.1f}% reduction)")

    cloudwatch.put_metric_data(
        Namespace='GeopandasClient',
        MetricData=[{
            'MetricName': 'CompressionRatio',
            'Value': ratio,
            'Unit': 'Percent'
        }]
    )

Performance Metrics

Real-world compression results:

| Payload Type             | Uncompressed Size | Compressed Size | Reduction |
|--------------------------|-------------------|-----------------|-----------|
| 1K annotations GeoJSON   | 485 KB            | 92 KB           | 81%       |
| Overlap detection result | 320 KB            | 78 KB           | 76%       |
| Spatial join output      | 150 KB            | 55 KB           | 63%       |
| Small feature collection | 30 KB             | 28 KB           | 7%        |

Compression overhead:

  • Compression time: 15-30ms for typical payloads
  • Decompression time: 8-15ms
  • Total overhead: ~40ms (acceptable for multi-second operations)
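The overhead numbers above can be spot-checked with a sketch like this one. Timings depend heavily on payload shape and hardware, so treat any output as a ballpark, not a benchmark:

```python
import gzip
import json
import time

# Synthetic payload standing in for a typical response
payload = json.dumps({"coords": [[i, i * 2] for i in range(60_000)]}).encode("utf-8")

t0 = time.perf_counter()
compressed = gzip.compress(payload, compresslevel=6)
t1 = time.perf_counter()
restored = gzip.decompress(compressed)
t2 = time.perf_counter()

assert restored == payload
print(f"compress: {(t1 - t0) * 1000:.1f} ms, decompress: {(t2 - t1) * 1000:.1f} ms")
```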

Impact and Results

After implementing compression:

  • Reliability: Zero payload size failures since deployment
  • Scalability: Can now handle 5,000+ annotation operations
  • Latency: Added ~40ms overhead (negligible for 2-5 second operations)
  • Cost: Reduced Lambda invocation failures and retries
  • Flexibility: Can process larger datasets without architecture changes

Lessons Learned

  1. Compression is Cheap: Modern CPUs compress/decompress faster than networks transfer data
  2. Test with Real Data: Synthetic data often compresses better than production data
  3. Threshold Matters: Don't compress everything—small payloads aren't worth the overhead
  4. Base64 Overhead: base64 encoding inflates the compressed bytes by roughly 33%, but gzip's 60-80% reduction still leaves a large net saving
  5. Monitor Compression Ratios: Track effectiveness to identify optimization opportunities
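Lesson 4 is worth quantifying. This sketch (synthetic data, illustrative sizes) shows that even after base64's roughly 4/3 inflation, the encoded result stays well below the raw JSON size:

```python
import base64
import gzip
import json

# Synthetic coordinate-heavy JSON standing in for a real response
raw = json.dumps(
    {"coords": [[round(i * 0.001, 3), round(i * 0.002, 3)] for i in range(50_000)]}
).encode("utf-8")

compressed = gzip.compress(raw, compresslevel=6)
encoded = base64.b64encode(compressed)

# base64 adds ~33% on top of gzip, yet the net result is far smaller than raw
print(f"raw={len(raw)}B gzip={len(compressed)}B gzip+b64={len(encoded)}B")
```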

Additional Considerations

Alternative Approaches We Considered

  1. Async Invocation with S3:

    • Store large payloads in S3, pass S3 keys
    • Pros: No size limits
    • Cons: Added complexity, latency, cost
  2. Step Functions:

    • Use AWS Step Functions for large state
    • Pros: Built-in state management
    • Cons: Higher cost, vendor lock-in
  3. Direct API Gateway:

    • Replace Lambda invoke with HTTP API
    • Pros: Higher size limits (10MB)
    • Cons: More complex networking, authentication

Compression was the simplest solution that solved our immediate problem.
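For comparison, the S3-pointer alternative (option 1 above) would have looked roughly like this. A plain dict stands in for the bucket so the sketch runs offline; in production the put/get calls would be boto3 `s3.put_object` / `s3.get_object`, and all names here are assumptions, not our actual implementation:

```python
import json
import uuid

_bucket = {}  # stand-in for an S3 bucket; replace with boto3 calls in production

def put_payload(payload):
    # Store the oversized payload and return a small pointer to pass between Lambdas
    key = f"payloads/{uuid.uuid4()}.json"
    _bucket[key] = json.dumps(payload)
    return {"s3_key": key}

def get_payload(pointer):
    # Resolve the pointer back into the original payload on the receiving side
    return json.loads(_bucket[pointer["s3_key"]])

pointer = put_payload({"big": list(range(10))})
assert get_payload(pointer) == {"big": list(range(10))}
```

The pointer itself is tiny regardless of payload size, which is why this pattern has no practical limit, but every invocation pays an extra S3 round trip.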

When Not to Use Compression

  • Payloads under 50KB (minimal benefit)
  • Already compressed data (images, videos)
  • Time-critical operations where milliseconds matter
  • CPU-constrained environments

Gzip compression is a powerful tool for working within AWS Lambda constraints without architectural overhauls. The implementation is straightforward and the benefits are substantial for data-heavy workflows.