Analytics Lambda Deprecation: Direct HTTP Approach
Invoking a separate Lambda function for analytics events added unnecessary latency and cost. We replaced Lambda-to-Lambda invocation with direct HTTP calls to Amplitude, reducing average event latency by 11% (450ms → 400ms) and eliminating the analytics Lambda's $150/month bill.
The Lambda Indirection Problem
Our analytics architecture used a dedicated Lambda function to send events to Amplitude. The main API Lambda would invoke the analytics Lambda asynchronously, which would then POST events to Amplitude's HTTP API. This indirection added latency, increased costs, and complicated error handling.
Architectural Pain Points:
- Extra Lambda invocation per analytics event (2× Lambda invocations)
- Added latency (50ms Lambda invoke overhead)
- Complex error tracking (two execution contexts)
- Invocation quotas consumed unnecessarily
- CloudWatch logs split across two functions
The breaking point came during a traffic spike when we hit AWS Lambda concurrent execution limits. The analytics Lambda consumed 30% of our account's concurrency quota for a task that could have been a simple HTTP POST.
Before: Lambda-to-Lambda Invocation
Analytics Event Flow (Lambda Invocation)
┌──────────────────────────────────────────────────┐
│  User Action (e.g., lesson completion)           │
│         │                                        │
│         ▼                                        │
│  ┌────────────────────────────────────────────┐  │
│  │ Main API Lambda                            │  │
│  │  ├─ Process request                        │  │
│  │  ├─ Update database                        │  │
│  │  └─ Trigger analytics event                │  │
│  │       │                                    │  │
│  │       ├─ Invoke analytics Lambda           │  │
│  │       │  (asynchronous invocation)         │  │
│  │       │  Time: 50ms overhead               │  │
│  │       │                                    │  │
│  │ Return response (200ms total)              │  │
│  └────────────────────────────────────────────┘  │
│         │                                        │
│         │  (async invoke)                        │
│         ▼                                        │
│  ┌────────────────────────────────────────────┐  │
│  │ Analytics Lambda                           │  │
│  │  ├─ Initialize runtime (cold start: 500ms) │  │
│  │  ├─ Parse event payload                    │  │
│  │  ├─ POST to Amplitude API                  │  │
│  │  │  https://api2.amplitude.com/2/httpapi   │  │
│  │  │  Time: 200ms                            │  │
│  │  └─ Return success                         │  │
│  │                                            │  │
│  │ Total analytics time: 250ms (warm)         │  │
│  │                       750ms (cold start)   │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘
Cost breakdown per event:
- Main Lambda: 200ms execution
- Analytics Lambda: 250ms execution (warm) or 750ms (cold)
- Total duration cost: 2× Lambda invocations
Latency impact:
- Main request: 200ms (user sees)
- Analytics async: 250-750ms (background)
- Total latency: 450-950ms from action to Amplitude
Monthly Metrics (Before):
Analytics Lambda Stats (30-day period)
┌────────────────────────────────────────────────┐
│ Total events:          5,000,000               │
│ Analytics invocations: 5,000,000               │
│ Avg execution time:    250ms (warm)            │
│ Cold start %:          5% (250,000 cold starts)│
│ Cold start time:       750ms                   │
│                                                │
│ Lambda costs:                                  │
│ - Invocations:         $10                     │
│ - Duration:            $140                    │
│ - Total:               $150/month              │
│                                                │
│ Concurrency impact:                            │
│ - Peak concurrent:     150 executions          │
│ - % of account quota:  30%                     │
└────────────────────────────────────────────────┘
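The figures in this table can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers shown above; the blended average duration is a derived value that the table does not state, so treat it as an estimate rather than a measured metric.

```python
# Figures taken from the "Monthly Metrics (Before)" table
TOTAL_EVENTS = 5_000_000
COLD_START_PERCENT = 5
WARM_MS = 250
COLD_MS = 750
MONTHLY_COST_USD = 150

# 5% of invocations hit a cold start
cold_starts = TOTAL_EVENTS * COLD_START_PERCENT // 100

# Average duration once cold starts are weighted in (derived, not measured)
blended_ms = (
    (100 - COLD_START_PERCENT) * WARM_MS + COLD_START_PERCENT * COLD_MS
) / 100

# Analytics Lambda cost normalized per million events
cost_per_million_usd = MONTHLY_COST_USD / (TOTAL_EVENTS / 1_000_000)

print(cold_starts)           # 250000
print(blended_ms)            # 275.0
print(cost_per_million_usd)  # 30.0
```

Even at a 5% cold start rate, the blended duration sits about 10% above the warm figure, which is part of why the tail latency looked so much worse than the average.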
After: Direct HTTP Calls
Analytics Event Flow (Direct HTTP)
┌──────────────────────────────────────────────────┐
│  User Action (e.g., lesson completion)           │
│         │                                        │
│         ▼                                        │
│  ┌────────────────────────────────────────────┐  │
│  │ Main API Lambda                            │  │
│  │  ├─ Process request                        │  │
│  │  ├─ Update database                        │  │
│  │  └─ Send analytics event                   │  │
│  │       │                                    │  │
│  │       ├─ Direct HTTP POST to Amplitude     │  │
│  │       │  https://api2.amplitude.com/       │  │
│  │       │  Time: 200ms                       │  │
│  │       │  (async, non-blocking)             │  │
│  │       │                                    │  │
│  │ Return response (200ms total)              │  │
│  └────────────────────────────────────────────┘  │
│         │                                        │
│         │  (direct HTTP)                         │
│         ▼                                        │
│  ┌─────────────┐                                 │
│  │  Amplitude  │                                 │
│  │     API     │                                 │
│  └─────────────┘                                 │
└──────────────────────────────────────────────────┘
Cost breakdown per event:
- Main Lambda: 200ms execution (same as before)
- Analytics Lambda: 0ms (eliminated)
- Total duration cost: 1× Lambda invocation
Latency impact:
- Main request: 200ms (user sees, unchanged)
- HTTP POST: 200ms (async within main Lambda)
- Total latency: 400ms from action to Amplitude (200ms request + 200ms POST)
Implementation Details
Removing Analytics Lambda
Before (Separate Lambda):
# src/lambda_analytics.py
import json
import os

import requests


def handler(event, context):
    """Dedicated analytics Lambda handler."""
    # Parse event from invocation payload
    records = event.get('Records', [])
    for record in records:
        payload = json.loads(record['body'])
        send_to_amplitude(payload)

    return {'statusCode': 200}


def send_to_amplitude(event_data):
    """Send event to Amplitude API."""
    response = requests.post(
        'https://api2.amplitude.com/2/httpapi',
        json={
            'api_key': os.environ['AMPLITUDE_API_KEY'],
            'events': [event_data]
        },
        timeout=5
    )
    return response.status_code == 200
Invocation from Main Lambda:
# src/resources/user/lessons.py
import json

import boto3
from flask import g, jsonify  # Flask-style app assumed; `app` is defined elsewhere

lambda_client = boto3.client('lambda')


@app.route('/lessons/<int:lesson_id>/complete', methods=['POST'])
def complete_lesson(lesson_id):
    # ... business logic ...

    # Invoke analytics Lambda asynchronously
    lambda_client.invoke(
        FunctionName='analytics-lambda',
        InvocationType='Event',  # Async
        Payload=json.dumps({
            'event_type': 'lesson_completed',
            'user_id': g.user.id,
            'lesson_id': lesson_id
        })
    )

    return jsonify({'success': True})
After (Direct HTTP in Main Lambda):
# src/services/analytics/amplitude.py
import logging
import os
import threading

import requests

logger = logging.getLogger(__name__)


class AmplitudeClient:
    """Direct Amplitude HTTP client."""

    def __init__(self):
        self.api_key = os.environ['AMPLITUDE_API_KEY']
        self.base_url = 'https://api2.amplitude.com/2/httpapi'

    def track_event(self, event_data):
        """Send event to Amplitude (async, non-blocking)."""
        # Run HTTP request in background thread
        thread = threading.Thread(
            target=self._send_event,
            args=(event_data,)
        )
        thread.daemon = True
        thread.start()

    def _send_event(self, event_data):
        """Internal: Send HTTP POST to Amplitude."""
        try:
            response = requests.post(
                self.base_url,
                json={
                    'api_key': self.api_key,
                    'events': [event_data]
                },
                timeout=5
            )

            if response.status_code != 200:
                logger.error(
                    f"Amplitude error: {response.status_code} - {response.text}"
                )
        except requests.RequestException as e:
            # Log and swallow the error - don't block the user request
            logger.error(f"Failed to send event to Amplitude: {e}")
Usage in Main Lambda:
# src/resources/user/lessons.py
import time

from flask import g, jsonify  # Flask-style app assumed; `app` is defined elsewhere

from src.services.analytics import amplitude_client


@app.route('/lessons/<int:lesson_id>/complete', methods=['POST'])
def complete_lesson(lesson_id):
    # ... business logic ...

    # Send analytics event directly
    amplitude_client.track_event({
        'event_type': 'lesson_completed',
        'user_id': g.user.id,
        'lesson_id': lesson_id,
        'timestamp': int(time.time() * 1000)
    })

    return jsonify({'success': True})
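The event dictionary above is assembled inline at the call site. A small builder keeps the shape consistent across endpoints. The sketch below is illustrative only: `build_event` is a hypothetical helper, and the field names `time`, `event_properties`, and `insert_id` reflect my reading of Amplitude's HTTP V2 event schema, so verify them against the current API reference before relying on them.

```python
import time
import uuid


def build_event(event_type, user_id, properties=None, now_ms=None, insert_id=None):
    """Build one event dict for the Amplitude `events` list (hypothetical helper)."""
    return {
        'event_type': event_type,
        'user_id': str(user_id),
        # Event time in milliseconds since the epoch
        'time': now_ms if now_ms is not None else int(time.time() * 1000),
        # Custom fields such as lesson_id live under event_properties
        'event_properties': dict(properties or {}),
        # A stable ID lets the receiver drop duplicates if an event is retried
        'insert_id': insert_id or str(uuid.uuid4()),
    }


event = build_event('lesson_completed', 42, {'lesson_id': 7})
print(event['event_type'], event['event_properties'])
```

Generating `insert_id` once, when the event is created, matters if the event is later retried: every retry then carries the same ID.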
Async HTTP Request Pattern
Threading for Non-Blocking Requests:
# Why threading instead of asyncio?
# - Lambda runtime is synchronous
# - Lightweight for single HTTP POST
# - Daemon thread exits when Lambda terminates
# - Simple error handling
# Caveat: Lambda freezes the execution environment once the handler returns,
# so a POST still in flight may only complete on the next invocation.
import logging
import threading

import requests

logger = logging.getLogger(__name__)


def send_async_http(url, payload):
    """Send HTTP request in background thread."""
    def _send():
        try:
            response = requests.post(url, json=payload, timeout=5)
            if response.status_code != 200:
                logger.error(f"HTTP error: {response.status_code}")
        except Exception as e:
            logger.error(f"Request failed: {e}")

    thread = threading.Thread(target=_send)
    thread.daemon = True  # Exit when main thread exits
    thread.start()
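One thing to watch with fire-and-forget threads on Lambda: the execution environment is frozen once the handler returns, so a request that has not finished by then is not guaranteed to go out promptly. One possible mitigation, sketched below, is to track background threads and give them a short, bounded window to finish before returning. `run_in_background` and `flush_pending` are hypothetical names, not part of the code above, and the timeout value is an assumption to tune.

```python
import threading
import time

_pending = []
_pending_lock = threading.Lock()


def run_in_background(func, *args):
    """Start func(*args) on a daemon thread and remember the thread."""
    thread = threading.Thread(target=func, args=args, daemon=True)
    with _pending_lock:
        _pending.append(thread)
    thread.start()
    return thread


def flush_pending(timeout_s=1.0):
    """Wait up to timeout_s (in total) for background work to finish.

    Returns the number of threads still running when the budget ran out.
    """
    deadline = time.monotonic() + timeout_s
    with _pending_lock:
        threads = list(_pending)
        _pending.clear()

    unfinished = []
    for thread in threads:
        thread.join(max(0.0, deadline - time.monotonic()))
        if thread.is_alive():
            unfinished.append(thread)

    with _pending_lock:
        _pending.extend(unfinished)  # give them another chance on the next flush
    return len(unfinished)
```

Calling `flush_pending()` just before the handler returns trades a small, capped amount of response time for a better chance that the event has actually been sent.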
Connection Pooling:
# Reuse HTTP connections for better performance
import os

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

API_KEY = os.environ['AMPLITUDE_API_KEY']

# Create persistent session
session = requests.Session()

# Configure retries
retry = Retry(
    total=3,
    backoff_factor=0.3,
    status_forcelist=[500, 502, 503, 504],
    # Retry skips POST by default; opt in explicitly (urllib3 >= 1.26)
    allowed_methods=['POST']
)

adapter = HTTPAdapter(
    max_retries=retry,
    pool_connections=10,
    pool_maxsize=50
)

session.mount('https://', adapter)


# Use session for all requests
def send_to_amplitude(event_data):
    response = session.post(
        'https://api2.amplitude.com/2/httpapi',
        json={'api_key': API_KEY, 'events': [event_data]},
        timeout=5
    )
    return response.status_code == 200
Error Handling and Retries
Exponential Backoff:
# src/services/analytics/amplitude.py
import logging
import os
import time

import requests

logger = logging.getLogger(__name__)

AMPLITUDE_URL = 'https://api2.amplitude.com/2/httpapi'
API_KEY = os.environ['AMPLITUDE_API_KEY']


def send_with_retry(event_data, max_retries=3):
    """Send event with exponential backoff retry."""
    for attempt in range(max_retries):
        is_last_attempt = attempt == max_retries - 1
        wait_time = 2 ** attempt  # 1s, 2s, 4s, ...
        try:
            response = requests.post(
                AMPLITUDE_URL,
                json={'api_key': API_KEY, 'events': [event_data]},
                timeout=5
            )
        except requests.RequestException as e:
            if is_last_attempt:
                logger.error(f"Failed after {max_retries} attempts: {e}")
                return False
            logger.warning(f"Request failed, retrying in {wait_time}s: {e}")
            time.sleep(wait_time)
            continue

        if response.status_code == 200:
            return True

        if response.status_code < 500:  # Client error, don't retry
            logger.error(f"Amplitude client error: {response.status_code}")
            return False

        # Server error: retry, but don't sleep after the final attempt
        if is_last_attempt:
            logger.error(f"Amplitude server error after {max_retries} attempts")
            return False
        logger.warning(
            f"Amplitude server error (attempt {attempt + 1}), "
            f"retrying in {wait_time}s"
        )
        time.sleep(wait_time)

    return False
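The retry loop above waits a fixed 1s, 2s, 4s. When many Lambda instances fail at the same moment, fixed delays make them all retry together. A common refinement, not something the code above does, is to randomize each delay ("full jitter"). The sketch below only computes the delays; `backoff_delays` and its parameters are hypothetical, and `rng` is injectable so the behavior can be checked deterministically.

```python
import random


def backoff_delays(max_retries=3, base_s=1.0, cap_s=8.0, rng=random.random):
    """Return one jittered delay per retry.

    Each delay is drawn uniformly from [0, min(cap_s, base_s * 2**attempt)].
    """
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap_s, base_s * (2 ** attempt))
        delays.append(ceiling * rng())
    return delays


# With rng pinned to 1.0 the delays are the un-jittered ceilings
print(backoff_delays(3, rng=lambda: 1.0))  # [1.0, 2.0, 4.0]
```

In a retry loop, each `time.sleep(wait_time)` would use the corresponding entry from this list instead of `2 ** attempt`.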
Dead Letter Queue for Failed Events
Persistent Failure Handling:
# Store failed events in SQS for later retry
# (logger, AMPLITUDE_URL and API_KEY are module-level names defined elsewhere)
import json
import os

import boto3
import requests

sqs = boto3.client('sqs')
DLQ_URL = os.environ['ANALYTICS_DLQ_URL']


def send_event_with_dlq(event_data):
    """Send event to Amplitude, fallback to DLQ on failure."""
    try:
        # Attempt direct send
        response = requests.post(
            AMPLITUDE_URL,
            json={'api_key': API_KEY, 'events': [event_data]},
            timeout=5
        )

        if response.status_code == 200:
            return True

        # Failed - send to DLQ
        logger.warning("Amplitude send failed, sending to DLQ")
    except requests.RequestException as e:
        # Network error - send to DLQ
        logger.error(f"Amplitude request exception: {e}")

    sqs.send_message(
        QueueUrl=DLQ_URL,
        MessageBody=json.dumps(event_data)
    )
    return False
DLQ Processor (Scheduled Lambda):
# Scheduled Lambda runs every 5 minutes to retry DLQ events
def process_dlq():
    """Retry failed analytics events from DLQ."""
    messages = sqs.receive_message(
        QueueUrl=DLQ_URL,
        MaxNumberOfMessages=10
    ).get('Messages', [])

    for msg in messages:
        event_data = json.loads(msg['Body'])

        if send_to_amplitude(event_data):
            # Success - delete from DLQ
            sqs.delete_message(
                QueueUrl=DLQ_URL,
                ReceiptHandle=msg['ReceiptHandle']
            )
            logger.info("DLQ event successfully retried")
        else:
            # Failed again - will retry next run
            logger.warning("DLQ event retry failed, will retry later")
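Because `receive_message` returns at most 10 messages per call, the processor above retries at most 10 events per run. If a backlog builds up during an outage, a bounded drain loop clears it faster. The following is a minimal sketch under that assumption: `drain_dlq`, `send_func`, and `max_batches` are hypothetical names, and the SQS client is passed in so the loop can be exercised without AWS.

```python
import json


def drain_dlq(sqs_client, queue_url, send_func, max_batches=10):
    """Retry queued events until the queue looks empty or max_batches is hit.

    Returns (retried, failed) counts for this run.
    """
    retried, failed = 0, 0
    for _ in range(max_batches):
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 per call
            WaitTimeSeconds=1,
        )
        messages = response.get('Messages', [])
        if not messages:
            break  # nothing visible right now

        for msg in messages:
            if send_func(json.loads(msg['Body'])):
                sqs_client.delete_message(
                    QueueUrl=queue_url,
                    ReceiptHandle=msg['ReceiptHandle'],
                )
                retried += 1
            else:
                # Left in the queue; it becomes visible again after the
                # visibility timeout and is picked up by a later run
                failed += 1
    return retried, failed
```

Capping the number of batches keeps a single run well inside the function timeout even when the backlog is large.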
Performance Impact
Latency Reduction
End-to-End Event Latency:
Analytics Event Latency (30-day comparison)
┌────────────────────────────────────────────────┐
│ Flow                 Before   After    Change  │
│ User action → API:   200ms    200ms    0%      │
│ API → Amplitude:     250ms    200ms    -20%    │
│ Total latency:       450ms    400ms    -11%    │
│                                                │
│ P95 latency:                                   │
│ - Before:       950ms (cold start)             │
│ - After:        400ms                          │
│ - Improvement:  -58%                           │
└────────────────────────────────────────────────┘
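The Change column can be reproduced directly from the Before and After values. A short check, using only the numbers in the table (`pct_change` is just a helper name for this example):

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


post_only = pct_change(250, 200)   # API → Amplitude
average = pct_change(450, 400)     # total latency
p95 = pct_change(950, 400)         # P95 latency

print(round(post_only, 1))  # -20.0
print(round(average, 1))    # -11.1
print(round(p95, 1))        # -57.9
```

The average improves by about 11%, while the P95 improves by about 58%: most of the win comes from removing cold starts from the tail, not from shaving the typical request.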
Concurrency Impact:
Lambda Concurrent Executions
┌────────────────────────────────────────────────┐
│ Metric               Before   After    Change  │
│ Main Lambda:         350      350      0%      │
│ Analytics Lambda:    150      0        -100%   │
│ Total concurrent:    500      350      -30%    │
│                                                │
│ % of account quota:  50%      35%      -30%    │
│ Headroom for growth: 50%      65%      +30%    │
└────────────────────────────────────────────────┘
Error Rate Improvement
Event Delivery Success Rate:
Analytics Event Delivery (30-day period)
┌────────────────────────────────────────────────┐
│ Metric               Before    After     Change│
│ Total events:        5,000,000 5,000,000 0%    │
│ Successful delivery: 4,950,000 4,985,000 +0.7% │
│ Failed delivery:     50,000    15,000    -70%  │
│ Success rate:        99.0%     99.7%     +0.7% │
│                                                │
│ Failure reasons (before):                      │
│ - Lambda timeout:    20,000 (40%)              │
│ - Lambda throttle:   15,000 (30%)              │
│ - Network errors:    15,000 (30%)              │
│                                                │
│ Failure reasons (after):                       │
│ - Network errors:    15,000 (100%)             │
│ - Eliminated:        Lambda issues             │
└────────────────────────────────────────────────┘
Cost Impact
Lambda Cost Savings
Analytics Lambda Elimination:
Lambda Costs (Monthly)
┌────────────────────────────────────────────────┐
│                      Before   After    Savings │
│ Main Lambda:                                   │
│ - Invocations:       $400     $400     $0      │
│ - Duration:          $600     $600     $0      │
│ Subtotal:            $1,000   $1,000   $0      │
│                                                │
│ Analytics Lambda:                              │
│ - Invocations:       $10      $0       $10     │
│ - Duration:          $140     $0       $140    │
│ Subtotal:            $150     $0       $150    │
│                                                │
│ Total Lambda costs:  $1,150   $1,000   $150    │
│                                                │
│ Savings: 13% reduction in total Lambda costs   │
└────────────────────────────────────────────────┘
Breakdown by Analytics Volume:
Cost Per Million Analytics Events
┌────────────────────────────────────────────────┐
│                      Before   After    Savings │
│ Lambda invocations:  $2       $0       $2      │
│ Lambda duration:     $28      $0       $28     │
│ Total per 1M events: $30      $0       $30     │
│                                                │
│ At 5M events/month:  $150     $0       $150    │
└────────────────────────────────────────────────┘
CloudWatch Logs Cost Reduction
Log Volume Reduction:
CloudWatch Logs (Monthly)
┌────────────────────────────────────────────────┐
│                      Before   After    Savings │
│ Log groups:          2        1        -50%    │
│ Log volume:          12 GB    8 GB     -33%    │
│ Log storage cost:    $6       $4       $2      │
│                                                │
│ Explanation:                                   │
│ - Eliminated analytics Lambda logs             │
│ - Reduced "Lambda invoked" log entries         │
│ - Simpler log aggregation                      │
└────────────────────────────────────────────────┘
Total Savings: $150 (Lambda) + $2 (CloudWatch) = $152/month
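For anyone adapting these numbers to their own bill, the totals follow from the line items in the cost tables. The sketch below recomputes them; the annual figure is a simple extrapolation that assumes volume and pricing stay flat, which the tables do not claim.

```python
# Monthly figures (USD) from the cost tables
main_lambda = {'invocations': 400, 'duration': 600}
analytics_lambda = {'invocations': 10, 'duration': 140}
cloudwatch_savings = 2

lambda_before = sum(main_lambda.values()) + sum(analytics_lambda.values())
lambda_after = sum(main_lambda.values())
lambda_savings = lambda_before - lambda_after

# Share of the total Lambda bill that disappeared
lambda_savings_pct = lambda_savings / lambda_before * 100

total_monthly_savings = lambda_savings + cloudwatch_savings
annual_savings = total_monthly_savings * 12  # assumes flat volume and pricing

print(lambda_before, lambda_after)   # 1150 1000
print(round(lambda_savings_pct))     # 13
print(total_monthly_savings)         # 152
print(annual_savings)                # 1824
```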
Operational Benefits
Simplified Debugging
Log Aggregation:
Debugging Analytics Event (Before)
┌────────────────────────────────────────────────┐
│ 1. Find request in main Lambda logs │
│ - RequestId: abc-123 │
│ - User action logged │
│ - Analytics invocation logged │
│ │
│ 2. Find invocation in analytics Lambda logs │
│ - Search for correlation ID │
│ - Different RequestId: def-456 │
│ - Check Amplitude API response │
│ │
│ 3. Correlate errors across 2 log groups │
│ - Time-based correlation │
│ - Manual stitching of execution flow │
└────────────────────────────────────────────────┘
Debugging Analytics Event (After)
┌────────────────────────────────────────────────┐
│ 1. Find request in main Lambda logs │
│ - RequestId: abc-123 │
│ - User action logged │
│ - Amplitude POST logged (same context) │
│ - Response status logged │
│ │
│ Done - all information in single log stream │
└────────────────────────────────────────────────┘
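The single-stream workflow depends on every analytics log line carrying the request ID of the invocation that produced it. One way to get that, sketched here with the standard library, is a `logging.LoggerAdapter` that prefixes each message. `RequestLogger` and `logger_for_request` are hypothetical names; in a real handler the ID would come from `context.aws_request_id`.

```python
import logging


class RequestLogger(logging.LoggerAdapter):
    """Prefix every log message with the request ID."""

    def process(self, msg, kwargs):
        return f"[{self.extra['request_id']}] {msg}", kwargs


def logger_for_request(request_id, name='analytics'):
    """Return a logger bound to one request (hypothetical helper)."""
    return RequestLogger(logging.getLogger(name), {'request_id': request_id})


log = logger_for_request('abc-123')
log.info("Amplitude POST returned %s", 200)
# Formats as: [abc-123] Amplitude POST returned 200
```

Passing this adapter into the background thread keeps the Amplitude response on the same searchable ID as the user action that triggered it.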
Reduced Architecture Complexity
Serverless Configuration:
# Before (separate analytics Lambda)
functions:
  api:
    handler: src/lambda_handler.handler
    events:
      - http:
          path: /{proxy+}
          method: ANY
  analytics:
    handler: src/lambda_analytics.handler
    events:
      - sqs:
          arn: !GetAtt AnalyticsQueue.Arn

resources:
  Resources:
    AnalyticsQueue:
      Type: AWS::SQS::Queue
      Properties:
        VisibilityTimeout: 30

# After (analytics in main Lambda)
functions:
  api:
    handler: src/lambda_handler.handler
    events:
      - http:
          path: /{proxy+}
          method: ANY
    environment:
      AMPLITUDE_API_KEY: ${env:AMPLITUDE_API_KEY}

# Eliminated:
# - Analytics Lambda function
# - SQS queue for event buffering
# - IAM roles for Lambda invocation
# - CloudWatch log group for analytics Lambda
Results Summary
Analytics Lambda Deprecation Impact (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric                 Before  After   Change  │
│ Analytics latency:     450ms   400ms   -11%    │
│ P95 latency:           950ms   400ms   -58%    │
│ Event success rate:    99.0%   99.7%   +0.7%   │
│ Lambda concurrency:    500     350     -30%    │
│ Lambda functions:      2       1       -50%    │
│ Lambda costs:          $1,150  $1,000  -13%    │
│ CloudWatch log groups: 2       1       -50%    │
│ Monthly savings:       -       -       $152    │
└────────────────────────────────────────────────┘
Quantified Outcomes:
- 11% latency reduction - 450ms → 400ms average
- 58% P95 latency improvement - 950ms → 400ms (eliminated cold starts)
- $152/month saved - Lambda + CloudWatch costs
- 30% concurrency freed - 500 → 350 concurrent executions
- 50% fewer Lambda functions - 2 functions → 1 function
When to Use Separate Lambda vs Direct HTTP
Use Separate Lambda When:
- Complex processing required - Event transformation, enrichment, batching
- Retry logic complex - Dead letter queues, exponential backoff with state
- Rate limiting needed - Throttle third-party API calls
- Different scaling patterns - Analytics needs 10× more concurrency than API
Use Direct HTTP When:
- Simple passthrough - Minimal event transformation
- Low latency critical - Every millisecond matters
- Third-party API reliable - 99.9%+ uptime SLA
- Event volume manageable - Won't overwhelm third-party API
Our Use Case (Direct HTTP):
- Amplitude API has 99.99% uptime
- Events are simple JSON payloads (no transformation)
- Latency matters for real-time analytics dashboards
- Volume is under Amplitude's rate limits (3,600 events/second)
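The last point is easy to sanity-check. Using the monthly volume from earlier in the post and the rate limit quoted above, the average send rate works out as follows. This is a back-of-the-envelope sketch that assumes a 30-day month and evenly spread traffic, so real peaks will be higher than the average.

```python
EVENTS_PER_MONTH = 5_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # 2,592,000 (30-day month assumed)
RATE_LIMIT_PER_SECOND = 3_600           # limit quoted in the list above

average_rate = EVENTS_PER_MONTH / SECONDS_PER_MONTH
headroom_factor = RATE_LIMIT_PER_SECOND / average_rate

print(round(average_rate, 2))   # 1.93
print(round(headroom_factor))   # 1866
```

At roughly 2 events per second on average, even a spike of several hundred times normal traffic would stay under the quoted limit, which is what makes the direct call safe without a buffering layer.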
Key Takeaways
- Lambda indirection isn't always free. In our setup, each analytics event paid for a second billed invocation plus roughly 50ms of invoke overhead.
- Simplicity has value. Removing the analytics Lambda reduced debugging time, CloudWatch log complexity, and deployment coordination.
- Async HTTP in Lambda works. Background threads with daemon=True provide non-blocking HTTP calls without additional infrastructure.
- Concurrency quota matters. Eliminating the analytics Lambda freed 30% of our concurrent execution quota for business-critical requests.
- Direct integration beats abstraction when appropriate. For simple passthrough use cases, direct HTTP calls outperform Lambda-to-Lambda invocation.
Deprecating the analytics Lambda proved that architectural simplification can deliver performance improvements, cost savings, and operational benefits simultaneously—a rare engineering win-win-win.
Related Posts:
- Thin Lambda Consolidation: Unified Function Architecture
- Batch API Calls: Drip Email Optimization
- API Response Caching Strategy: Reduce Database Load
Commits: 489f4b1, 8aa5c17
Impact: 11% latency reduction, $152/month saved, Lambda functions cut from 2 to 1