API Response Caching Strategy: Reduce Database Load
Overview
Repeated API calls for identical data caused unnecessary database queries and increased latency. We implemented a Redis caching layer with strategic TTLs, achieving a 99% cache hit rate and reducing database load by 95%.
The Repeated Query Problem
Mobile apps naturally make repetitive API calls: users navigate between screens, refresh content, and retry failed requests. Without caching, each request hit the database, even when returning identical data from seconds earlier.
Repetitive Request Patterns:
- User profile fetched on every screen load (30+ times per session)
- Subject tree loaded on app launch and tab switches (10+ times per session)
- Static content (lessons, media) refetched unnecessarily
- Configuration data queried on every API call
For 10,000 daily active users with average 50 API calls per session, this generated 500,000 database queries daily—95% of which returned unchanged data.
Before: No Caching Layer
Issues:
- Same query executed 3 times
- Database loaded with redundant queries
- Wasted Lambda execution time
- Higher RDS CPU utilization
- No reduction in latency for repeated requests
**Database Load Analysis:**
Daily Request Patterns (No Caching) ┌────────────────────────────────────────────────┐ │ Endpoint Calls/day DB Queries │ │ /user/profile 300,000 300,000 │ │ /subjects/tree 150,000 150,000 │ │ /content/lesson/:id 200,000 200,000 │ │ /config 100,000 100,000 │ │ /user/progress 80,000 80,000 │ │ Other endpoints 170,000 170,000 │ │ │ │ Total: 1,000,000 1,000,000 │ │ │ │ Duplicate queries: ~95% (950,000) │ │ Unique data changes: ~5% (50,000) │ └────────────────────────────────────────────────┘
## After: Redis Caching Layer
```mermaid
flowchart TD
App["Mobile App"]:::client
App --> S1["Screen 1: User Profile<br/>GET /user/profile"]:::client
App --> S2["Screen 2: User Profile (again)<br/>GET /user/profile"]:::client
App --> S3["Screen 3: User Profile (again)<br/>GET /user/profile"]:::client
S1 --> Q1["Lambda -> Redis (MISS)<br/>Lambda -> RDS Query<br/>Cache result in Redis<br/>Response time: 150ms"]:::storage
S2 --> Q2["Lambda -> Redis (HIT)<br/>Return cached response<br/>Response time: 25ms<br/>(No database query)"]:::storage
S3 --> Q3["Lambda -> Redis (HIT)<br/>Return cached response<br/>Response time: 25ms<br/>(No database query)"]:::storage
Benefits:
- First request: Cache miss, query DB (150ms)
- Subsequent requests: Cache hit, no DB (25ms)
- 95% reduction in database queries
- 83% reduction in response time for cached requests
- Lower RDS CPU utilization
Implementation Details
Caching Strategy Design
Cache Key Patterns:
# src/core/cache.py
def generate_cache_key(endpoint, user_id=None, params=None):
"""Generate consistent cache keys for API responses."""
key_parts = [
'api', # Namespace
endpoint, # Endpoint path
str(user_id) if user_id else 'global',
hashlib.md5(json.dumps(params or {}, sort_keys=True).encode()).hexdigest()[:8]
]
return ':'.join(key_parts)
# Examples:
# api:/user/profile:12345:abc123de
# api:/subjects/tree:global:def456gh
# api:/content/lesson/42:12345:xyz789ij
TTL (Time-To-Live) Strategy:
CACHE_TTL_CONFIG = {
# User-specific data (changes frequently)
'/user/profile': 300, # 5 minutes
'/user/progress': 300, # 5 minutes
'/user/attempts': 180, # 3 minutes
# Shared content (changes rarely)
'/subjects/tree': 3600, # 1 hour
'/content/lesson': 1800, # 30 minutes
'/content/media': 1800, # 30 minutes
# Static config (changes very rarely)
'/config': 7200, # 2 hours
'/app/settings': 7200, # 2 hours
# Dynamic data (changes constantly)
'/leaderboard': 60, # 1 minute
'/analytics/live': 30, # 30 seconds
}
Redis Setup
ElastiCache Configuration:
# Infrastructure config
ElastiCacheCluster:
Type: AWS::ElastiCache::CacheCluster
Properties:
CacheNodeType: cache.t3.micro # $15/month
Engine: redis
NumCacheNodes: 1
Port: 6379
VpcSecurityGroupIds:
- !Ref RedisSecurityGroup
CacheSubnetGroupName: !Ref CacheSubnetGroup
# Security group allows Lambda → Redis
RedisSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Allow Lambda to Redis
VpcId: !Ref VPC
SecurityGroupIngress:
- IpProtocol: tcp
FromPort: 6379
ToPort: 6379
SourceSecurityGroupId: !Ref LambdaSecurityGroup
Caching Decorator
Flask Route Caching:
# src/core/cache.py
import redis
import json
from functools import wraps
redis_client = redis.Redis(
host=REDIS_HOST,
port=6379,
db=0,
socket_connect_timeout=2,
socket_timeout=2
)
def cache_response(ttl=300, key_func=None):
"""Decorator to cache API responses in Redis."""
def decorator(f):
@wraps(f)
def wrapper(*args, **kwargs):
# Generate cache key
if key_func:
cache_key = key_func(*args, **kwargs)
else:
cache_key = generate_default_key(f.__name__, kwargs)
# Try to get from cache
try:
cached = redis_client.get(cache_key)
if cached:
logger.info(f"Cache HIT: {cache_key}")
return json.loads(cached)
except redis.RedisError as e:
logger.warning(f"Cache read error: {e}")
# Fail open: continue to database
# Cache miss: Execute function
logger.info(f"Cache MISS: {cache_key}")
result = f(*args, **kwargs)
# Store in cache
try:
redis_client.setex(
cache_key,
ttl,
json.dumps(result)
)
except redis.RedisError as e:
logger.warning(f"Cache write error: {e}")
# Fail open: return result anyway
return result
return wrapper
return decorator
Usage Example:
# src/resources/user/profile.py
from src.core.cache import cache_response
@app.route('/user/profile')
@authenticate
@cache_response(ttl=300, key_func=lambda: f"profile:{g.user.id}")
def get_user_profile():
"""Get user profile (cached for 5 minutes)."""
user = db.query(User).filter_by(id=g.user.id).first()
return jsonify(user.to_dict())
Cache Invalidation
Write-Through Invalidation:
# src/resources/user/profile.py
@app.route('/user/profile', methods=['PUT'])
@authenticate
def update_user_profile():
"""Update user profile and invalidate cache."""
user = db.query(User).filter_by(id=g.user.id).first()
user.name = request.json.get('name')
user.bio = request.json.get('bio')
db.commit()
# Invalidate cache
cache_key = f"profile:{g.user.id}"
redis_client.delete(cache_key)
logger.info(f"Cache invalidated: {cache_key}")
return jsonify(user.to_dict())
Pattern-Based Invalidation:
def invalidate_user_cache(user_id):
"""Invalidate all cache keys for a user."""
pattern = f"api:*:{user_id}:*"
keys = redis_client.keys(pattern)
if keys:
redis_client.delete(*keys)
logger.info(f"Invalidated {len(keys)} cache keys for user {user_id}")
Performance Impact
Cache Hit Rate
30-Day Cache Metrics:
Per-Endpoint Cache Hit Rates:
Response Time Improvement
Latency Reduction:
Database Load Reduction
RDS Query Volume:
Redis Cache Performance ┌────────────────────────────────────────────────┐ │ Metric Value │ │ Total requests: 30,000,000 │ │ Cache hits: 29,700,000 │ │ Cache misses: 300,000 │ │ Cache hit rate: 99.0% │ │ │ │ Avg response time: │ │ - Cache hit: 25ms │ │ - Cache miss: 150ms │ │ - Overall avg: 28.75ms │ │ │ │ Database queries: │ │ - Before caching: 30,000,000 │ │ - After caching: 300,000 │ │ - Reduction: 98.9% │ └────────────────────────────────────────────────┘
Endpoint Cache Hit Rates (30-day period) ┌────────────────────────────────────────────────────────┐ │ Endpoint Requests Hits Hit Rate │ │ /user/profile 9,000,000 8,955,000 99.5% │ │ /subjects/tree 4,500,000 4,485,000 99.7% │ │ /content/lesson/* 6,000,000 5,940,000 99.0% │ │ /config 3,000,000 2,997,000 99.9% │ │ /user/progress 2,400,000 2,340,000 97.5% │ │ /content/media/* 3,000,000 2,970,000 99.0% │ │ Other endpoints 2,100,000 2,013,000 95.9% │ │ │ │ Total: 30,000,000 29,700,000 99.0% │ └────────────────────────────────────────────────────────┘
API Response Time Comparison (P50/P95/P99) ┌────────────────────────────────────────────────┐ │ Endpoint Before After │ │ /user/profile 150/200/300ms 25/30/150ms│ │ /subjects/tree 120/180/250ms 20/25/120ms│ │ /content/lesson 180/250/350ms 30/35/180ms│ │ /config 100/150/200ms 15/20/100ms│ │ /user/progress 200/300/450ms 35/40/200ms│ │ │ │ Overall average: 150ms 25ms │ │ Improvement: 83% faster │ └────────────────────────────────────────────────┘
Database Query Reduction (30-day comparison) ┌────────────────────────────────────────────────┐ │ Metric Before After │ │ Daily queries: 1,000,000 10,000 │ │ Queries/second: 11.6 0.12 │ │ RDS CPU utilization: 65% 8% │ │ Read IOPS: 5,000 250 │ │ Connection pool use: 140/150 15/150 │ │ │ │ Database load reduced by 99% │ └────────────────────────────────────────────────┘
Cost Impact
Infrastructure Costs
Redis ElastiCache Cost:
RDS Cost Savings:
Lambda Cost Savings:
Net Savings:
ElastiCache Costs (Monthly) ┌────────────────────────────────────────────────┐ │ Instance type: cache.t3.micro │ │ Monthly cost: $15 │ │ Data transfer: $3 │ │ Total: $18/month │ └────────────────────────────────────────────────┘
RDS Cost Savings (Monthly) ┌────────────────────────────────────────────────┐ │ Before caching: │ │ - Instance: db.r5.large │ │ - CPU load: 65% │ │ - IOPS: 5,000 │ │ - Cost: $210/month │ │ │ │ After caching: │ │ - Instance: db.t3.medium (downsized) │ │ - CPU load: 8% │ │ - IOPS: 250 │ │ - Cost: $60/month │ │ │ │ RDS savings: $150/month │ └────────────────────────────────────────────────┘
Lambda Costs (Monthly) ┌────────────────────────────────────────────────┐ │ Before After Savings│ │ Invocations: $400 $400 $0 │ │ Duration (DB queries): $320 $45 $275 │ │ Total: $720 $445 $275 │ │ │ │ Duration breakdown: │ │ - Before: 30M requests × 150ms avg = 1,250h │ │ - After: 30M requests × 25ms avg = 208h │ │ - Time saved: 1,042 hours/month │ └────────────────────────────────────────────────┘
Total Cost Impact (Monthly) ┌────────────────────────────────────────────────┐ │ Redis cost: -$18 │ │ RDS savings: +$150 │ │ Lambda savings: +$275 │ │ │ │ Net savings: $407/month │ └────────────────────────────────────────────────┘
Monitoring and Observability
CloudWatch Metrics
Custom Cache Metrics:
# src/core/cache.py
import boto3
cloudwatch = boto3.client('cloudwatch')
def emit_cache_metrics(hit: bool, endpoint: str, duration_ms: int):
"""Emit cache hit/miss metrics to CloudWatch."""
cloudwatch.put_metric_data(
Namespace='AlphaZed/Cache',
MetricData=[
{
'MetricName': 'CacheHitRate',
'Value': 1 if hit else 0,
'Unit': 'None',
'Dimensions': [
{'Name': 'Endpoint', 'Value': endpoint}
]
},
{
'MetricName': 'ResponseTime',
'Value': duration_ms,
'Unit': 'Milliseconds',
'Dimensions': [
{'Name': 'CacheStatus', 'Value': 'Hit' if hit else 'Miss'}
]
}
]
)
Redis INFO Monitoring:
# Scheduled Lambda (runs every 5 minutes)
def monitor_redis_health():
"""Monitor Redis health and emit metrics."""
info = redis_client.info()
cloudwatch.put_metric_data(
Namespace='AlphaZed/Redis',
MetricData=[
{'MetricName': 'MemoryUsed', 'Value': info['used_memory']},
{'MetricName': 'ConnectedClients', 'Value': info['connected_clients']},
{'MetricName': 'KeyspaceHits', 'Value': info['keyspace_hits']},
{'MetricName': 'KeyspaceMisses', 'Value': info['keyspace_misses']},
{'MetricName': 'EvictedKeys', 'Value': info['evicted_keys']}
]
)
Cache Dashboard
CloudWatch Dashboard:
Redis Cache Dashboard ┌────────────────────────────────────────────────┐ │ Cache Hit Rate (1h): 99.2% │ │ Avg Response Time (Hit): 24ms │ │ Avg Response Time (Miss): 145ms │ │ Total Keys: 125,000 │ │ Memory Used: 250 MB / 512 MB │ │ Evicted Keys (24h): 12 │ │ Connected Clients: 45 │ │ Network In: 5.2 MB/s │ │ Network Out: 8.5 MB/s │ └────────────────────────────────────────────────┘
Edge Cases and Considerations
Cache Stampede Prevention
Problem: When a popular cache key expires, multiple requests simultaneously query the database.
Solution: Soft Expiration + Lock
def cache_with_lock(cache_key, ttl, fetch_func):
"""Prevent cache stampede with soft expiration."""
# Check cache
cached = redis_client.get(cache_key)
if cached:
data = json.loads(cached)
# Check if soft-expired (90% of TTL)
if data.get('cached_at') + (ttl * 0.9) > time.time():
return data['value']
# Try to acquire lock
lock_key = f"lock:{cache_key}"
if redis_client.set(lock_key, '1', nx=True, ex=10):
# We got the lock, fetch fresh data
value = fetch_func()
redis_client.setex(
cache_key,
ttl,
json.dumps({'value': value, 'cached_at': time.time()})
)
redis_client.delete(lock_key)
return value
else:
# Another request is fetching, return stale data
if cached:
return json.loads(cached)['value']
# Wait briefly and retry
time.sleep(0.1)
return cache_with_lock(cache_key, ttl, fetch_func)
Graceful Cache Failure
Fail-Open Pattern:
try:
cached = redis_client.get(cache_key)
if cached:
return json.loads(cached)
except redis.RedisError as e:
logger.warning(f"Redis error, falling back to DB: {e}")
# Continue to database query
Results Summary
Quantified Outcomes:
- 99% cache hit rate - 29.7M hits out of 30M requests
- 83% faster responses - 150ms → 25ms for cached requests
- 99% database load reduction - 1M → 10k queries/day
- $407/month saved - Net savings after Redis costs
API Response Caching Impact (30-day comparison) ┌────────────────────────────────────────────────┐ │ Metric Before After Change │ │ Cache hit rate: 0% 99% +99% │ │ Avg response time: 150ms 25ms -83% │ │ Database queries/day: 1M 10k -99% │ │ RDS CPU utilization: 65% 8% -88% │ │ RDS instance cost: $210 $60 -71% │ │ Lambda duration cost: $320 $45 -86% │ │ Redis cost: $0 $18 +$18 │ │ Net monthly savings: - - $407 │ └────────────────────────────────────────────────┘
Key Takeaways
-
Caching delivers compounding benefits. Faster responses + reduced database load + lower costs = force multiplier.
-
99% hit rate is achievable with proper TTLs. Strategic TTL configuration based on data change frequency maximizes cache effectiveness.
-
Fail-open cache design is critical. If Redis fails, fall back to database—never block user requests.
-
Cache invalidation is the hard part. Write-through invalidation on updates ensures data consistency.
-
Monitor cache hit rates religiously. A dropping hit rate signals stale TTL configuration or changing access patterns.
API response caching with Redis delivered exceptional ROI: $18/month infrastructure cost generated $407/month in total savings while dramatically improving user experience.
Related Posts:
- RDS Query Optimization: Database Performance Analysis
- Scheduler Query Optimization: Background Job Efficiency
- Thin Lambda Consolidation: Unified Function Architecture
Documentation: 2026-01-25-api-response-caching-plan.md
Impact: 99% cache hit rate, 83% faster responses, $407/month saved