alqosh

Analytics Lambda Deprecation: Direct HTTP Approach

·performance

Overview

Invoking a separate Lambda function for analytics events added unnecessary latency and cost. We replaced Lambda-to-Lambda invocation with direct HTTP calls to Amplitude, reducing average event latency by 11% (450ms → 400ms) and eliminating the $150/month analytics Lambda bill, a 13% cut in total Lambda costs.

The Lambda Indirection Problem

Our analytics architecture used a dedicated Lambda function to send events to Amplitude. The main API Lambda would invoke the analytics Lambda asynchronously, which would then POST events to Amplitude's HTTP API. This indirection added latency, increased costs, and complicated error handling.

Architectural Pain Points:

  • Extra Lambda invocation per analytics event (2× Lambda costs)
  • Added latency (50ms Lambda invoke overhead)
  • Complex error tracking (two execution contexts)
  • Invocation quotas consumed unnecessarily
  • CloudWatch logs split across two functions

The breaking point came during a traffic spike when we hit AWS Lambda concurrent execution limits. The analytics Lambda consumed 30% of our account's concurrency quota for a task that could have been a simple HTTP POST.

Before: Lambda-to-Lambda Invocation

Cost breakdown per event:

  • Main Lambda: 200ms execution
  • Analytics Lambda: 250ms execution (warm) or 750ms (cold)
  • Total duration cost: 2× Lambda invocations

Latency impact:

  • Main request: 200ms (user sees)
  • Analytics async: 250-750ms (background)
  • Total latency: 450-950ms from action to Amplitude

**Monthly Metrics (Before):**

Analytics Lambda Stats (30-day period)
┌────────────────────────────────────────────────┐
│ Total events:          5,000,000               │
│ Analytics invocations: 5,000,000               │
│ Avg execution time:    250ms (warm)            │
│ Cold start %:          5% (250,000 cold starts)│
│ Cold start time:       750ms                   │
│                                                │
│ Lambda costs:                                  │
│ - Invocations:         $10                     │
│ - Duration:            $140                    │
│ - Total:               $150/month              │
│                                                │
│ Concurrency impact:                            │
│ - Peak concurrent:     150 executions          │
│ - % of account quota:  30%                     │
└────────────────────────────────────────────────┘
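As a rough check on these figures, the blended analytics duration and the action-to-Amplitude latency range can be recomputed from the numbers stated above (250ms warm, 750ms cold, 5% cold starts, 200ms main request). This is only a sketch; the function names are illustrative and not part of the codebase.

```python
def blended_duration_ms(warm_ms, cold_ms, cold_start_pct):
    """Average duration when cold_start_pct percent of invocations are cold."""
    warm_pct = 100 - cold_start_pct
    return (warm_pct * warm_ms + cold_start_pct * cold_ms) / 100


def end_to_end_range_ms(main_ms, warm_ms, cold_ms):
    """Best and worst case latency from user action to Amplitude."""
    return (main_ms + warm_ms, main_ms + cold_ms)


if __name__ == "__main__":
    print(blended_duration_ms(250, 750, 5))      # 275.0
    print(end_to_end_range_ms(200, 250, 750))    # (450, 950)
```

The blended 275ms sits slightly above the warm figure because only 1 in 20 invocations pays the cold start penalty, but those invocations define the tail latency.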


## After: Direct HTTP Calls

```mermaid
flowchart TD
  U["User Action<br/>(e.g., lesson completion)"]:::client --> M["Main API Lambda<br/>Process request<br/>Update database<br/>Send analytics event<br/>Direct HTTP POST to Amplitude (200ms, async)<br/>Return response (200ms total)"]:::service
  M -->|direct HTTP| AMP["Amplitude API"]:::external
```

Cost breakdown per event:

  • Main Lambda: 200ms execution (same as before)
  • Analytics Lambda: 0ms (eliminated)
  • Total duration cost: 1× Lambda invocation

Latency impact:

  • Main request: 200ms (user sees, unchanged)
  • HTTP POST: 200ms (async within main Lambda)
  • Total latency: 400ms from action to Amplitude (200ms request + 200ms POST)

Implementation Details

Removing Analytics Lambda

Before (Separate Lambda):

# src/lambda_analytics.py
import json
import os

import requests

def handler(event, context):
    """Dedicated analytics Lambda handler."""
    # SQS-triggered invocations wrap payloads in 'Records'; a direct
    # lambda_client.invoke() call delivers the payload as the event itself.
    if 'Records' in event:
        payloads = [json.loads(record['body']) for record in event['Records']]
    else:
        payloads = [event]

    for payload in payloads:
        send_to_amplitude(payload)

    return {'statusCode': 200}

def send_to_amplitude(event_data):
    """Send event to Amplitude API."""
    response = requests.post(
        'https://api2.amplitude.com/2/httpapi',
        json={
            'api_key': os.environ['AMPLITUDE_API_KEY'],
            'events': [event_data]
        },
        timeout=5
    )
    return response.status_code == 200

Invocation from Main Lambda:

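# src/resources/user/lessons.py
import json

import boto3
from flask import g, jsonify

# `app` is the Flask application object, defined elsewhere
lambda_client = boto3.client('lambda')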

@app.route('/lessons/<int:lesson_id>/complete', methods=['POST'])
def complete_lesson(lesson_id):
    # ... business logic ...

    # Invoke analytics Lambda asynchronously
    lambda_client.invoke(
        FunctionName='analytics-lambda',
        InvocationType='Event',  # Async
        Payload=json.dumps({
            'event_type': 'lesson_completed',
            'user_id': g.user.id,
            'lesson_id': lesson_id
        })
    )

    return jsonify({'success': True})

After (Direct HTTP in Main Lambda):

# src/services/analytics/amplitude.py
import logging
import os
import threading

import requests

logger = logging.getLogger(__name__)
class AmplitudeClient:
    """Direct Amplitude HTTP client."""

    def __init__(self):
        self.api_key = os.environ['AMPLITUDE_API_KEY']
        self.base_url = 'https://api2.amplitude.com/2/httpapi'

    def track_event(self, event_data):
        """Send event to Amplitude (async, non-blocking)."""
        # Run HTTP request in background thread
        thread = threading.Thread(
            target=self._send_event,
            args=(event_data,)
        )
        thread.daemon = True
        thread.start()

    def _send_event(self, event_data):
        """Internal: Send HTTP POST to Amplitude."""
        try:
            response = requests.post(
                self.base_url,
                json={
                    'api_key': self.api_key,
                    'events': [event_data]
                },
                timeout=5
            )

            if response.status_code != 200:
                logger.error(
                    f"Amplitude error: {response.status_code} - {response.text}"
                )
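        except requests.RequestException as e:
            # Log and move on: an analytics failure must not affect the user request
            logger.error(f"Failed to send event to Amplitude: {e}")

# Shared instance (assumed to be re-exported from src/services/analytics/__init__.py)
amplitude_client = AmplitudeClient()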

Usage in Main Lambda:

# src/resources/user/lessons.py
import time

from flask import g, jsonify

from src.services.analytics import amplitude_client

@app.route('/lessons/<int:lesson_id>/complete', methods=['POST'])
def complete_lesson(lesson_id):
    # ... business logic ...

    # Send analytics event directly.
    # Amplitude's HTTP V2 API takes the timestamp as `time` (ms since epoch)
    # and custom fields under `event_properties`.
    amplitude_client.track_event({
        'event_type': 'lesson_completed',
        'user_id': g.user.id,
        'time': int(time.time() * 1000),
        'event_properties': {'lesson_id': lesson_id}
    })

    return jsonify({'success': True})

Async HTTP Request Pattern

Threading for Non-Blocking Requests:

# Why threading instead of asyncio?
# - The Lambda handler runs synchronously
# - Lightweight for a single HTTP POST
# - A daemon thread never keeps the process alive
# - Simple error handling
#
# Caveat: Lambda freezes the execution environment once the handler returns,
# so a POST that is still in flight may be paused until the next invocation
# (or lost if the environment is recycled).

import logging
import threading

import requests

logger = logging.getLogger(__name__)

def send_async_http(url, payload):
    """Send HTTP request in background thread."""
    def _send():
        try:
            response = requests.post(url, json=payload, timeout=5)
            if response.status_code != 200:
                logger.error(f"HTTP error: {response.status_code}")
        except Exception as e:
            logger.error(f"Request failed: {e}")

    thread = threading.Thread(target=_send)
    thread.daemon = True  # Do not block interpreter exit
    thread.start()
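Because Lambda may freeze the execution environment as soon as the handler returns, a fire-and-forget thread is not guaranteed to finish its POST. One possible mitigation, sketched below, is to keep references to the in-flight threads and join them with a short timeout just before returning the response. `BackgroundSender` and `flush` are hypothetical names, and the `send` callable stands in for the real HTTP POST so the example runs without network access.

```python
import threading
import time


class BackgroundSender:
    """Runs a send callable in daemon threads and lets the caller wait for them."""

    def __init__(self, send):
        self._send = send            # callable taking one payload (assumed)
        self._threads = []
        self._lock = threading.Lock()

    def submit(self, payload):
        """Start sending `payload` in the background and return immediately."""
        thread = threading.Thread(target=self._send, args=(payload,), daemon=True)
        with self._lock:
            self._threads.append(thread)
            thread.start()

    def flush(self, timeout=1.0):
        """Wait up to `timeout` seconds in total; True if every send finished."""
        deadline = time.monotonic() + timeout
        with self._lock:
            threads, self._threads = self._threads, []
        for thread in threads:
            thread.join(max(0.0, deadline - time.monotonic()))
        still_running = [t for t in threads if t.is_alive()]
        with self._lock:
            self._threads.extend(still_running)
        return not still_running


if __name__ == "__main__":
    delivered = []
    sender = BackgroundSender(delivered.append)
    sender.submit({"event_type": "lesson_completed"})
    print(sender.flush(timeout=1.0), delivered)
```

Calling `flush` can add up to `timeout` seconds to the response time, so it trades part of the latency win for more reliable delivery.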

Connection Pooling:

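# Reuse HTTP connections for better performance
import os

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

API_KEY = os.environ['AMPLITUDE_API_KEY']

# Create persistent session
session = requests.Session()

# Configure retries. urllib3 does not retry POST on status codes by default,
# so it must be allowed explicitly (allowed_methods needs urllib3 >= 1.26).
retry = Retry(
    total=3,
    backoff_factor=0.3,
    status_forcelist=[500, 502, 503, 504],
    allowed_methods=frozenset({'POST'})
)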

adapter = HTTPAdapter(
    max_retries=retry,
    pool_connections=10,
    pool_maxsize=50
)

session.mount('https://', adapter)

# Use session for all requests
def send_to_amplitude(event_data):
    response = session.post(
        'https://api2.amplitude.com/2/httpapi',
        json={'api_key': API_KEY, 'events': [event_data]},
        timeout=5
    )
    return response.status_code == 200

Error Handling and Retries

Exponential Backoff:

# src/services/analytics/amplitude.py
import logging
import os
import time

import requests

logger = logging.getLogger(__name__)

AMPLITUDE_URL = 'https://api2.amplitude.com/2/httpapi'
API_KEY = os.environ['AMPLITUDE_API_KEY']

def send_with_retry(event_data, max_retries=3):
    """Send event with exponential backoff retry."""
    for attempt in range(max_retries):
        is_last_attempt = attempt == max_retries - 1
        wait_time = 2 ** attempt  # 1s, 2s, ... (no wait after the last attempt)
        try:
            response = requests.post(
                AMPLITUDE_URL,
                json={'api_key': API_KEY, 'events': [event_data]},
                timeout=5
            )
        except requests.RequestException as e:
            if is_last_attempt:
                logger.error(f"Failed after {max_retries} attempts: {e}")
                return False
            logger.warning(f"Request failed, retrying in {wait_time}s: {e}")
            time.sleep(wait_time)
            continue

        if response.status_code == 200:
            return True

        if response.status_code < 500:  # Client error, don't retry
            logger.error(f"Amplitude client error: {response.status_code}")
            return False

        # Server error: retry unless this was the final attempt
        if is_last_attempt:
            logger.error(f"Amplitude server error after {max_retries} attempts")
            return False
        logger.warning(
            f"Amplitude server error (attempt {attempt + 1}), "
            f"retrying in {wait_time}s"
        )
        time.sleep(wait_time)

    return False
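The same retry policy can be written independently of the HTTP client, which makes it easy to unit test without sleeping or touching the network. The sketch below assumes a hypothetical `attempt_send` callable that returns `"ok"`, `"retry"`, or `"fail"`, and it optionally adds random jitter to each delay, a common refinement that the code above does not use.

```python
import random
import time


def backoff_delays(max_retries, base=1.0, jitter=0.0, rng=random.random):
    """Yield the delay before each retry: base * 2**n plus up to `jitter` seconds."""
    for attempt in range(max_retries - 1):   # no delay after the final attempt
        yield base * (2 ** attempt) + jitter * rng()


def send_with_backoff(attempt_send, max_retries=3, sleep=time.sleep, **delay_opts):
    """Call attempt_send until it reports 'ok' or 'fail', or retries run out."""
    delays = backoff_delays(max_retries, **delay_opts)
    for _ in range(max_retries):
        outcome = attempt_send()
        if outcome == "ok":
            return True
        if outcome == "fail":        # non-retryable, e.g. a 4xx response
            return False
        delay = next(delays, None)
        if delay is None:            # that was the last allowed attempt
            break
        sleep(delay)
    return False


if __name__ == "__main__":
    outcomes = iter(["retry", "retry", "ok"])
    waits = []
    print(send_with_backoff(lambda: next(outcomes), sleep=waits.append), waits)
```

Injecting `sleep` keeps tests fast, and jitter helps avoid many concurrent Lambda instances retrying against Amplitude at the same instant.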

Dead Letter Queue for Failed Events

Persistent Failure Handling:

# Store failed events in SQS for later retry
import json
import logging
import os

import boto3
import requests

logger = logging.getLogger(__name__)

sqs = boto3.client('sqs')
DLQ_URL = os.environ['ANALYTICS_DLQ_URL']

# AMPLITUDE_URL and API_KEY are module-level constants in amplitude.py

def send_event_with_dlq(event_data):
    """Send event to Amplitude, fallback to DLQ on failure."""
    try:
        # Attempt direct send
        response = requests.post(
            AMPLITUDE_URL,
            json={'api_key': API_KEY, 'events': [event_data]},
            timeout=5
        )

        if response.status_code == 200:
            return True

        logger.warning("Amplitude send failed, sending to DLQ")

    except requests.RequestException as e:
        logger.error(f"Amplitude request exception: {e}")

    # Non-200 response or network error: park the event in the DLQ
    sqs.send_message(
        QueueUrl=DLQ_URL,
        MessageBody=json.dumps(event_data)
    )
    return False

DLQ Processor (Scheduled Lambda):

# Scheduled Lambda runs every 5 minutes to retry DLQ events
def process_dlq(event=None, context=None):
    """Retry failed analytics events from DLQ."""
    messages = sqs.receive_message(
        QueueUrl=DLQ_URL,
        MaxNumberOfMessages=10  # SQS maximum per receive call
    ).get('Messages', [])

    for msg in messages:
        event_data = json.loads(msg['Body'])

        try:
            delivered = send_to_amplitude(event_data)
        except requests.RequestException:
            delivered = False

        if delivered:
            # Success - delete from DLQ
            sqs.delete_message(
                QueueUrl=DLQ_URL,
                ReceiptHandle=msg['ReceiptHandle']
            )
            logger.info("DLQ event successfully retried")
        else:
            # Failed again - the message becomes visible again after the
            # visibility timeout and is picked up by a later run
            logger.warning("DLQ event retry failed, will retry later")
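The retry-then-delete flow can be exercised without AWS by passing the queue operations in as callables. This is a minimal sketch: `drain_batch`, `send`, and `delete` are hypothetical names, and the message dicts only mimic the `Body` and `ReceiptHandle` keys that SQS returns.

```python
import json


def drain_batch(messages, send, delete):
    """Retry each message and delete the ones that were delivered.

    Returns a (delivered, remaining) tuple of counts.
    """
    delivered = 0
    for msg in messages:
        event_data = json.loads(msg["Body"])
        try:
            ok = send(event_data)
        except Exception:
            ok = False               # treat any send error as "retry later"
        if ok:
            delete(msg["ReceiptHandle"])
            delivered += 1
    return delivered, len(messages) - delivered


if __name__ == "__main__":
    batch = [
        {"Body": json.dumps({"n": i}), "ReceiptHandle": f"handle-{i}"}
        for i in range(3)
    ]
    deleted = []
    # Pretend the second event still fails to deliver.
    print(drain_batch(batch, lambda e: e["n"] != 1, deleted.append), deleted)
```

Messages that are not deleted simply stay in the queue, which is what lets the next scheduled run pick them up again.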

Performance Impact

Latency Reduction

End-to-End Event Latency:

Analytics Event Latency (30-day comparison)
┌────────────────────────────────────────────────┐
│ Flow                  Before    After   Change │
│ User action → API:    200ms     200ms   0%     │
│ API → Amplitude:      250ms     200ms   -20%   │
│ Total latency:        450ms     400ms   -11%   │
│                                                │
│ P95 latency:                                   │
│ - Before:             950ms (cold start)       │
│ - After:              400ms                    │
│ - Improvement:        -58%                     │
└────────────────────────────────────────────────┘

Concurrency Impact:

Lambda Concurrent Executions
┌────────────────────────────────────────────────┐
│ Metric                Before    After   Change │
│ Main Lambda:          350       350     0%     │
│ Analytics Lambda:     150       0       -100%  │
│ Total concurrent:     500       350     -30%   │
│                                                │
│ % of account quota:   50%       35%     -30%   │
│ Headroom for growth:  50%       65%     +30%   │
└────────────────────────────────────────────────┘

Error Rate Improvement

Event Delivery Success Rate:

Analytics Event Delivery (30-day period)
┌────────────────────────────────────────────────┐
│ Metric               Before    After     Change│
│ Total events:        5,000,000 5,000,000 0%    │
│ Successful delivery: 4,950,000 4,985,000 +1%   │
│ Failed delivery:     50,000    15,000    -70%  │
│ Success rate:        99.0%     99.7%     +0.7% │
│                                                │
│ Failure reasons (before):                      │
│ - Lambda timeout:      20,000 (40%)            │
│ - Lambda throttle:     15,000 (30%)            │
│ - Network errors:      15,000 (30%)            │
│                                                │
│ Failure reasons (after):                       │
│ - Network errors:      15,000 (100%)           │
│ - Eliminated:          Lambda issues           │
└────────────────────────────────────────────────┘
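The percentage columns in the comparison tables are plain relative changes. The snippet below recomputes a few of them from the stated before and after values; `percent_change` is an illustrative helper, not part of the codebase.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


rows = {
    "Total latency (ms)": (450, 400),
    "P95 latency (ms)": (950, 400),
    "Concurrent executions": (500, 350),
    "Failed deliveries": (50_000, 15_000),
}

for label, (before, after) in rows.items():
    print(f"{label}: {percent_change(before, after):+.1f}%")
# Total latency (ms): -11.1%
# P95 latency (ms): -57.9%
# Concurrent executions: -30.0%
# Failed deliveries: -70.0%
```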

Cost Impact

Lambda Cost Savings

Analytics Lambda Elimination:

Lambda Costs (Monthly)
┌────────────────────────────────────────────────┐
│                        Before    After  Savings│
│ Main Lambda:                                   │
│ - Invocations:         $400      $400   $0     │
│ - Duration:            $600      $600   $0     │
│ Subtotal:              $1,000    $1,000 $0     │
│                                                │
│ Analytics Lambda:                              │
│ - Invocations:         $10       $0     $10    │
│ - Duration:            $140      $0     $140   │
│ Subtotal:              $150      $0     $150   │
│                                                │
│ Total Lambda costs:    $1,150    $1,000 $150   │
│                                                │
│ Savings: 13% reduction in total Lambda costs   │
└────────────────────────────────────────────────┘

Breakdown by Analytics Volume:

Cost Per Million Analytics Events
┌────────────────────────────────────────────────┐
│                        Before    After  Savings│
│ Lambda invocations:    $2        $0     $2     │
│ Lambda duration:       $28       $0     $28    │
│ Total per 1M events:   $30       $0     $30    │
│                                                │
│ At 5M events/month:    $150      $0     $150   │
└────────────────────────────────────────────────┘

CloudWatch Logs Cost Reduction

Log Volume Reduction:

CloudWatch Logs (Monthly)
┌────────────────────────────────────────────────┐
│                        Before    After  Savings│
│ Log groups:            2         1      -50%   │
│ Log volume:            12 GB     8 GB   -33%   │
│ Log storage cost:      $6        $4     $2     │
│                                                │
│ Explanation:                                   │
│ - Eliminated analytics Lambda logs             │
│ - Reduced "Lambda invoked" log entries         │
│ - Simpler log aggregation                      │
└────────────────────────────────────────────────┘

Total Savings: $150 (Lambda) + $2 (CloudWatch) = $152/month
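Since the per-million figures scale linearly with event volume, the saving at other volumes can be estimated directly. The sketch below uses this post's own per-million costs ($2 for invocations, $28 for duration) rather than AWS list prices; the function name is illustrative.

```python
INVOCATION_COST_PER_MILLION = 2    # USD, from the per-million cost table
DURATION_COST_PER_MILLION = 28     # USD, from the per-million cost table


def analytics_lambda_cost(events_per_month):
    """Monthly cost of the separate analytics Lambda at a given event volume."""
    millions = events_per_month / 1_000_000
    return millions * (INVOCATION_COST_PER_MILLION + DURATION_COST_PER_MILLION)


if __name__ == "__main__":
    for volume in (1_000_000, 5_000_000, 20_000_000):
        print(f"{volume:>12,} events/month -> ${analytics_lambda_cost(volume):,.0f}")
```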

Operational Benefits

Reduced Architecture Complexity

Serverless Configuration:

# Before (separate analytics Lambda)
functions:
  api:
    handler: src/lambda_handler.handler
    events:
      - http:
          path: /{proxy+}
          method: ANY

  analytics:
    handler: src/lambda_analytics.handler
    events:
      - sqs:
          arn: !GetAtt AnalyticsQueue.Arn

resources:
  Resources:
    AnalyticsQueue:
      Type: AWS::SQS::Queue
      Properties:
        VisibilityTimeout: 30

# After (analytics in main Lambda)
functions:
  api:
    handler: src/lambda_handler.handler
    events:
      - http:
          path: /{proxy+}
          method: ANY
    environment:
      AMPLITUDE_API_KEY: ${env:AMPLITUDE_API_KEY}

# Eliminated:
# - Analytics Lambda function
# - SQS queue for event buffering
# - IAM roles for Lambda invocation
# - CloudWatch log group for analytics Lambda

Simplified Debugging

Log Aggregation:

Debugging Analytics Event (Before)
┌────────────────────────────────────────────────┐
│ 1. Find request in main Lambda logs            │
│    - RequestId: abc-123                        │
│    - User action logged                        │
│    - Analytics invocation logged               │
│                                                │
│ 2. Find invocation in analytics Lambda logs    │
│    - Search for correlation ID                 │
│    - Different RequestId: def-456              │
│    - Check Amplitude API response              │
│                                                │
│ 3. Correlate errors across 2 log groups        │
│    - Time-based correlation                    │
│    - Manual stitching of execution flow        │
└────────────────────────────────────────────────┘

Debugging Analytics Event (After)
┌────────────────────────────────────────────────┐
│ 1. Find request in main Lambda logs            │
│    - RequestId: abc-123                        │
│    - User action logged                        │
│    - Amplitude POST logged (same context)      │
│    - Response status logged                    │
│                                                │
│ Done - all information in single log stream    │
└────────────────────────────────────────────────┘
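With everything in one function, the Amplitude call can be logged under the same request ID as the user action, which is what makes the single-log-stream debugging flow under Simplified Debugging possible. Below is a minimal sketch using the standard library's `LoggerAdapter`; the `request_logger` helper and the log format are assumptions, and in a real handler the ID would come from `context.aws_request_id`.

```python
import logging


def request_logger(request_id, base_logger=None):
    """Return a logger that attaches `request_id` to every record it emits."""
    if base_logger is None:
        base_logger = logging.getLogger("analytics")
    return logging.LoggerAdapter(base_logger, {"request_id": request_id})


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(levelname)s RequestId=%(request_id)s %(message)s",
    )
    log = request_logger("abc-123")
    log.info("lesson_completed recorded")
    log.info("Amplitude POST status=%s", 200)
```

Both lines carry the same `RequestId`, so a single search in one log group returns the user action and the Amplitude response together.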

Results Summary

Quantified Outcomes:

  • 11% latency reduction - 450ms → 400ms average
  • 58% P95 latency improvement - 950ms → 400ms (eliminated cold starts)
  • $152/month saved - Lambda + CloudWatch costs
  • 30% concurrency freed - 500 → 350 concurrent executions
  • 50% fewer Lambda functions - 2 functions → 1 function

Analytics Lambda Deprecation Impact (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric                 Before    After  Change │
│ Analytics latency:     450ms     400ms  -11%   │
│ P95 latency:           950ms     400ms  -58%   │
│ Event success rate:    99.0%     99.7%  +0.7%  │
│ Lambda concurrency:    500       350    -30%   │
│ Lambda functions:      2         1      -50%   │
│ Lambda costs:          $1,150    $1,000 -13%   │
│ CloudWatch log groups: 2         1      -50%   │
│ Monthly savings:       -         -      $152   │
└────────────────────────────────────────────────┘

When to Use Separate Lambda vs Direct HTTP

Use Separate Lambda When:

  1. Complex processing required - Event transformation, enrichment, batching
  2. Retry logic complex - Dead letter queues, exponential backoff with state
  3. Rate limiting needed - Throttle third-party API calls
  4. Different scaling patterns - Analytics needs 10× more concurrency than API

Use Direct HTTP When:

  1. Simple passthrough - Minimal event transformation
  2. Low latency critical - Every millisecond matters
  3. Third-party API reliable - 99.9%+ uptime SLA
  4. Event volume manageable - Won't overwhelm third-party API

Our Use Case (Direct HTTP):

  • Amplitude API has 99.99% uptime
  • Events are simple JSON payloads (no transformation)
  • Latency matters for real-time analytics dashboards
  • Volume is under Amplitude's rate limits (3,600 events/second)
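
The volume claim is easy to sanity-check: 5,000,000 events over 30 days is a small fraction of the stated 3,600 events/second limit on average. The calculation below uses only figures from this post. Peak traffic is burstier than the average, so treat the result as a lower bound on utilization.

```python
EVENTS_PER_MONTH = 5_000_000
DAYS = 30
RATE_LIMIT_PER_SECOND = 3_600


def average_events_per_second(events, days):
    """Mean event rate over the period, assuming evenly spread traffic."""
    return events / (days * 24 * 60 * 60)


avg = average_events_per_second(EVENTS_PER_MONTH, DAYS)
utilization = avg / RATE_LIMIT_PER_SECOND
print(f"average: {avg:.2f} events/s, {utilization:.3%} of the limit")
# average: 1.93 events/s, 0.054% of the limit
```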

Key Takeaways

  1. Lambda indirection isn't always free. In our setup, each Lambda-to-Lambda hop added about 50ms of invoke overhead and a second billed invocation per event.

  2. Simplicity has value. Removing the analytics Lambda reduced debugging time, CloudWatch log complexity, and deployment coordination.

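  3. Async HTTP in Lambda works. Background threads with daemon=True provide non-blocking HTTP calls without additional infrastructure, as long as the request finishes before Lambda freezes the execution environment after the handler returns.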

  4. Concurrency quota matters. Eliminating the analytics Lambda cut our peak concurrent executions by 30% (500 → 350), freeing that capacity for business-critical requests.

  5. Direct integration beats abstraction when appropriate. For simple passthrough use cases, direct HTTP calls outperform Lambda-to-Lambda invocation.

Deprecating the analytics Lambda showed that architectural simplification can deliver performance improvements, cost savings, and operational benefits at the same time, a rare engineering win-win-win.


Related Posts:

  • Thin Lambda Consolidation: Unified Function Architecture
  • Batch API Calls: Drip Email Optimization
  • API Response Caching Strategy: Reduce Database Load

Commits: 489f4b1, 8aa5c17
Impact: 11% latency reduction, $152/month saved, 50% fewer Lambda functions