API Response Caching Strategy: Reduce Database Load
Repeated API calls for identical data caused unnecessary database queries and increased latency. We implemented a Redis caching layer with TTLs tuned to each endpoint's change frequency, achieving a 99% cache hit rate and cutting daily database queries by 99%.
The Repeated Query Problem
Mobile apps naturally make repetitive API calls: users navigate between screens, refresh content, and retry failed requests. Without caching, each request hit the database, even when returning identical data from seconds earlier.
Repetitive Request Patterns:
- User profile fetched on every screen load (30+ times per session)
- Subject tree loaded on app launch and tab switches (10+ times per session)
- Static content (lessons, media) refetched unnecessarily
- Configuration data queried on every API call
For 10,000 daily active users averaging 50 API calls per session (roughly two sessions a day), this generated 1,000,000 database queries daily, roughly 95% of which returned unchanged data.
Before: No Caching Layer
API Request Flow (No Caching)
Mobile App
├─ Screen 1: User Profile
│  └─ GET /user/profile
│        Lambda → RDS Query
│        SELECT * FROM users WHERE id = 123
│        Response time: 150ms
│
├─ Screen 2: User Profile (again)
│  └─ GET /user/profile
│        Lambda → RDS Query (DUPLICATE)
│        SELECT * FROM users WHERE id = 123
│        Response time: 150ms
│
└─ Screen 3: User Profile (again)
   └─ GET /user/profile
         Lambda → RDS Query (DUPLICATE)
         SELECT * FROM users WHERE id = 123
         Response time: 150ms
Issues:
- Same query executed 3 times
- Database loaded with redundant queries
- Wasted Lambda execution time
- Higher RDS CPU utilization
- No reduction in latency for repeated requests
Database Load Analysis:
Daily Request Patterns (No Caching)
┌───────────────────────────────────────────────┐
│ Endpoint               Calls/day   DB Queries │
│ /user/profile            300,000      300,000 │
│ /subjects/tree           150,000      150,000 │
│ /content/lesson/:id      200,000      200,000 │
│ /config                  100,000      100,000 │
│ /user/progress            80,000       80,000 │
│ Other endpoints          170,000      170,000 │
│                                               │
│ Total:                 1,000,000    1,000,000 │
│                                               │
│ Duplicate queries: ~95% (950,000)             │
│ Unique data changes: ~5% (50,000)             │
└───────────────────────────────────────────────┘
After: Redis Caching Layer
API Request Flow (With Redis Caching)
Mobile App
├─ Screen 1: User Profile
│  └─ GET /user/profile
│        Lambda → Redis (MISS)
│        Lambda → RDS Query
│        SELECT * FROM users WHERE id = 123
│        Lambda → Cache result in Redis
│        Response time: 150ms
│
├─ Screen 2: User Profile (again)
│  └─ GET /user/profile
│        Lambda → Redis (HIT) ✓
│        Return cached response
│        Response time: 25ms (no database query)
│
└─ Screen 3: User Profile (again)
   └─ GET /user/profile
         Lambda → Redis (HIT) ✓
         Return cached response
         Response time: 25ms (no database query)
Benefits:
- First request: Cache miss, query DB (150ms)
- Subsequent requests: Cache hit, no DB (25ms)
- 99% reduction in database queries
- 83% reduction in response time for cached requests
- Lower RDS CPU utilization
Implementation Details
Caching Strategy Design
Cache Key Patterns:
# src/core/cache.py
import hashlib
import json

def generate_cache_key(endpoint, user_id=None, params=None):
    """Generate consistent cache keys for API responses."""
    key_parts = [
        'api',                                  # Namespace
        endpoint,                               # Endpoint path
        str(user_id) if user_id else 'global',
        hashlib.md5(json.dumps(params or {}, sort_keys=True).encode()).hexdigest()[:8]
    ]
    return ':'.join(key_parts)

# Examples:
# api:/user/profile:12345:abc123de
# api:/subjects/tree:global:def456gh
# api:/content/lesson/42:12345:xyz789ij
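Because `params` are serialized with `sort_keys=True`, the same logical query always hashes to the same key regardless of how the caller orders its parameters. A quick self-contained check (repeating the function above so it runs standalone):

```python
import hashlib
import json

def generate_cache_key(endpoint, user_id=None, params=None):
    """Generate consistent cache keys for API responses (as defined above)."""
    key_parts = [
        'api',
        endpoint,
        str(user_id) if user_id else 'global',
        hashlib.md5(json.dumps(params or {}, sort_keys=True).encode()).hexdigest()[:8]
    ]
    return ':'.join(key_parts)

# sort_keys=True makes the hash independent of parameter ordering
k1 = generate_cache_key('/content/lesson/42', user_id=123, params={'a': 1, 'b': 2})
k2 = generate_cache_key('/content/lesson/42', user_id=123, params={'b': 2, 'a': 1})
assert k1 == k2
```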
TTL (Time-To-Live) Strategy:
CACHE_TTL_CONFIG = {
    # User-specific data (changes frequently)
    '/user/profile': 300,        # 5 minutes
    '/user/progress': 300,       # 5 minutes
    '/user/attempts': 180,       # 3 minutes

    # Shared content (changes rarely)
    '/subjects/tree': 3600,      # 1 hour
    '/content/lesson': 1800,     # 30 minutes
    '/content/media': 1800,      # 30 minutes

    # Static config (changes very rarely)
    '/config': 7200,             # 2 hours
    '/app/settings': 7200,       # 2 hours

    # Dynamic data (changes constantly)
    '/leaderboard': 60,          # 1 minute
    '/analytics/live': 30,       # 30 seconds
}
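A lookup helper has to map a concrete request path (e.g. `/content/lesson/42`) onto the closest configured prefix. A minimal sketch, assuming longest-prefix-wins semantics and a default TTL for unlisted endpoints (both the `ttl_for_path` name and `DEFAULT_TTL` fallback are assumptions for illustration):

```python
# Minimal longest-prefix TTL lookup over the config above.
CACHE_TTL_CONFIG = {
    '/user/profile': 300,
    '/user/progress': 300,
    '/user/attempts': 180,
    '/subjects/tree': 3600,
    '/content/lesson': 1800,
    '/content/media': 1800,
    '/config': 7200,
    '/app/settings': 7200,
    '/leaderboard': 60,
    '/analytics/live': 30,
}
DEFAULT_TTL = 300  # assumed fallback for endpoints not listed above

def ttl_for_path(path, config=CACHE_TTL_CONFIG, default=DEFAULT_TTL):
    """Return the TTL of the longest configured prefix matching `path`."""
    matches = [p for p in config if path.startswith(p)]
    return config[max(matches, key=len)] if matches else default
```

With this, `/content/lesson/42` resolves to the 30-minute `/content/lesson` TTL, while an unknown endpoint falls back to the default.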
Redis Setup
ElastiCache Configuration:
# Infrastructure config (CloudFormation)
ElastiCacheCluster:
  Type: AWS::ElastiCache::CacheCluster
  Properties:
    CacheNodeType: cache.t3.micro   # $15/month
    Engine: redis
    NumCacheNodes: 1
    Port: 6379
    VpcSecurityGroupIds:
      - !Ref RedisSecurityGroup
    CacheSubnetGroupName: !Ref CacheSubnetGroup

# Security group allows Lambda → Redis
RedisSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Allow Lambda to Redis
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 6379
        ToPort: 6379
        SourceSecurityGroupId: !Ref LambdaSecurityGroup
Caching Decorator
Flask Route Caching:
# src/core/cache.py
import json
import logging
import os
from functools import wraps

import redis

logger = logging.getLogger(__name__)

# Redis endpoint comes from the Lambda environment
REDIS_HOST = os.environ.get('REDIS_HOST', 'localhost')

redis_client = redis.Redis(
    host=REDIS_HOST,
    port=6379,
    db=0,
    socket_connect_timeout=2,
    socket_timeout=2
)

def cache_response(ttl=300, key_func=None):
    """Decorator to cache API responses in Redis."""
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # Generate cache key
            if key_func:
                cache_key = key_func(*args, **kwargs)
            else:
                cache_key = generate_default_key(f.__name__, kwargs)

            # Try to get from cache
            try:
                cached = redis_client.get(cache_key)
                if cached:
                    logger.info(f"Cache HIT: {cache_key}")
                    return json.loads(cached)
            except redis.RedisError as e:
                logger.warning(f"Cache read error: {e}")
                # Fail open: continue to database

            # Cache miss: execute the wrapped handler
            logger.info(f"Cache MISS: {cache_key}")
            result = f(*args, **kwargs)

            # Store in cache
            try:
                redis_client.setex(cache_key, ttl, json.dumps(result))
            except redis.RedisError as e:
                logger.warning(f"Cache write error: {e}")
                # Fail open: return result anyway
            return result
        return wrapper
    return decorator
Usage Example:
# src/resources/user/profile.py
from src.core.cache import cache_response

@app.route('/user/profile')
@authenticate
@cache_response(ttl=300, key_func=lambda: f"profile:{g.user.id}")
def get_user_profile():
    """Get user profile (cached for 5 minutes)."""
    user = db.query(User).filter_by(id=g.user.id).first()
    return jsonify(user.to_dict())
Cache Invalidation
Write-Through Invalidation:
# src/resources/user/profile.py
@app.route('/user/profile', methods=['PUT'])
@authenticate
def update_user_profile():
    """Update user profile and invalidate cache."""
    user = db.query(User).filter_by(id=g.user.id).first()
    user.name = request.json.get('name')
    user.bio = request.json.get('bio')
    db.commit()

    # Invalidate cache
    cache_key = f"profile:{g.user.id}"
    redis_client.delete(cache_key)
    logger.info(f"Cache invalidated: {cache_key}")
    return jsonify(user.to_dict())
Pattern-Based Invalidation:
def invalidate_user_cache(user_id):
    """Invalidate all cache keys for a user."""
    pattern = f"api:*:{user_id}:*"
    # NOTE: KEYS is O(N) and blocks Redis while it scans the whole keyspace;
    # acceptable for a small keyspace, but prefer SCAN at production scale.
    keys = redis_client.keys(pattern)
    if keys:
        redis_client.delete(*keys)
        logger.info(f"Invalidated {len(keys)} cache keys for user {user_id}")
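A production-safe variant would walk the keyspace incrementally with SCAN instead of the blocking KEYS command. A sketch using redis-py's `scan_iter`, shown here against a minimal in-memory stand-in (`FakeRedis` is purely illustrative) so the logic is checkable without a server:

```python
import fnmatch

class FakeRedis:
    """Tiny in-memory stand-in for the subset of redis-py used here."""
    def __init__(self, keys):
        self.store = dict.fromkeys(keys, '1')

    def scan_iter(self, match):
        # redis-py yields matching keys incrementally; order is unspecified.
        # (fnmatch globbing approximates Redis glob patterns for this demo.)
        return (k for k in list(self.store) if fnmatch.fnmatch(k, match))

    def delete(self, *keys):
        for k in keys:
            self.store.pop(k, None)
        return len(keys)

def invalidate_user_cache_scan(client, user_id, batch=500):
    """SCAN-based invalidation: non-blocking iteration, batched deletes."""
    deleted = 0
    buf = []
    for key in client.scan_iter(match=f"api:*:{user_id}:*"):
        buf.append(key)
        if len(buf) >= batch:
            deleted += client.delete(*buf)
            buf = []
    if buf:
        deleted += client.delete(*buf)
    return deleted
```

Against a real cluster the same function works unchanged with a `redis.Redis` client in place of the stub.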
Performance Impact
Cache Hit Rate
30-Day Cache Metrics:
Redis Cache Performance
┌────────────────────────────────────────────┐
│ Metric                               Value │
│ Total requests:                 30,000,000 │
│ Cache hits:                     29,700,000 │
│ Cache misses:                      300,000 │
│ Cache hit rate:                      99.0% │
│                                            │
│ Avg response time:                         │
│ - Cache hit:                          25ms │
│ - Cache miss:                        150ms │
│ - Overall avg:                     26.25ms │
│                                            │
│ Database queries:                          │
│ - Before caching:               30,000,000 │
│ - After caching:                   300,000 │
│ - Reduction:                         98.9% │
└────────────────────────────────────────────┘
Per-Endpoint Cache Hit Rates:
Endpoint Cache Hit Rates (30-day period)
┌────────────────────────────────────────────────────────┐
│ Endpoint                Requests         Hits Hit Rate │
│ /user/profile          9,000,000    8,955,000    99.5% │
│ /subjects/tree         4,500,000    4,485,000    99.7% │
│ /content/lesson/*      6,000,000    5,940,000    99.0% │
│ /config                3,000,000    2,997,000    99.9% │
│ /user/progress         2,400,000    2,340,000    97.5% │
│ /content/media/*       3,000,000    2,970,000    99.0% │
│ Other endpoints        2,100,000    2,013,000    95.9% │
│                                                        │
│ Total:                30,000,000   29,700,000    99.0% │
└────────────────────────────────────────────────────────┘
Response Time Improvement
Latency Reduction:
API Response Time Comparison (P50/P95/P99)
┌──────────────────────────────────────────────────┐
│ Endpoint                    Before         After │
│ /user/profile        150/200/300ms   25/30/150ms │
│ /subjects/tree       120/180/250ms   20/25/120ms │
│ /content/lesson      180/250/350ms   30/35/180ms │
│ /config              100/150/200ms   15/20/100ms │
│ /user/progress       200/300/450ms   35/40/200ms │
│                                                  │
│ Overall average:             150ms          25ms │
│ Improvement: 83% faster                          │
└──────────────────────────────────────────────────┘
Database Load Reduction
RDS Query Volume:
Database Query Reduction (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric                        Before     After │
│ Daily queries:             1,000,000    10,000 │
│ Queries/second:                 11.6      0.12 │
│ RDS CPU utilization:             65%        8% │
│ Read IOPS:                     5,000       250 │
│ Connection pool use:         140/150    15/150 │
│                                                │
│ Database load reduced by 99%                   │
└────────────────────────────────────────────────┘
Cost Impact
Infrastructure Costs
Redis ElastiCache Cost:
ElastiCache Costs (Monthly)
┌──────────────────────────────────────────┐
│ Instance type:            cache.t3.micro │
│ Monthly cost:                        $15 │
│ Data transfer:                        $3 │
│ Total:                         $18/month │
└──────────────────────────────────────────┘
RDS Cost Savings:
RDS Cost Savings (Monthly)
┌────────────────────────────────────────────────┐
│ Before caching:                                │
│ - Instance: db.r5.large                        │
│ - CPU load: 65%                                │
│ - IOPS: 5,000                                  │
│ - Cost: $210/month                             │
│                                                │
│ After caching:                                 │
│ - Instance: db.t3.medium (downsized)           │
│ - CPU load: 8%                                 │
│ - IOPS: 250                                    │
│ - Cost: $60/month                              │
│                                                │
│ RDS savings: $150/month                        │
└────────────────────────────────────────────────┘
Lambda Cost Savings:
Lambda Costs (Monthly)
┌──────────────────────────────────────────────────┐
│                          Before   After  Savings │
│ Invocations:               $400    $400       $0 │
│ Duration (DB queries):     $320     $45     $275 │
│ Total:                     $720    $445     $275 │
│                                                  │
│ Duration breakdown:                              │
│ - Before: 30M requests × 150ms avg = 1,250h      │
│ - After: 30M requests × 25ms avg = 208h          │
│ - Time saved: 1,042 hours/month                  │
└──────────────────────────────────────────────────┘
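The hours in the breakdown follow directly from request volume and average duration; a quick sanity check of that arithmetic:

```python
# Back-of-envelope check of the Lambda duration figures above
requests_per_month = 30_000_000

before_hours = requests_per_month * 0.150 / 3600  # 150 ms average per request
after_hours = requests_per_month * 0.025 / 3600   # 25 ms average per request

print(round(before_hours))                # 1250 hours
print(round(after_hours))                 # 208 hours
print(round(before_hours - after_hours))  # 1042 hours saved per month
```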
Net Savings:
Total Cost Impact (Monthly)
┌────────────────────────────────────┐
│ Redis cost:                   -$18 │
│ RDS savings:                 +$150 │
│ Lambda savings:              +$275 │
│                                    │
│ Net savings:            $407/month │
└────────────────────────────────────┘
Monitoring and Observability
CloudWatch Metrics
Custom Cache Metrics:
# src/core/cache.py
import boto3

cloudwatch = boto3.client('cloudwatch')

def emit_cache_metrics(hit: bool, endpoint: str, duration_ms: int):
    """Emit cache hit/miss metrics to CloudWatch."""
    cloudwatch.put_metric_data(
        Namespace='AlphaZed/Cache',
        MetricData=[
            {
                'MetricName': 'CacheHitRate',
                'Value': 1 if hit else 0,
                'Unit': 'None',
                'Dimensions': [
                    {'Name': 'Endpoint', 'Value': endpoint}
                ]
            },
            {
                'MetricName': 'ResponseTime',
                'Value': duration_ms,
                'Unit': 'Milliseconds',
                'Dimensions': [
                    {'Name': 'CacheStatus', 'Value': 'Hit' if hit else 'Miss'}
                ]
            }
        ]
    )
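One `put_metric_data` call per request adds latency and CloudWatch API cost at 30M requests/month. An alternative worth considering is the CloudWatch Embedded Metric Format (EMF), where a Lambda simply prints a structured log line and CloudWatch extracts the metrics asynchronously. A sketch (the `emf_cache_metric` helper is hypothetical; namespace and metric names follow the code above):

```python
import json
import time

def emf_cache_metric(hit: bool, endpoint: str, duration_ms: int) -> str:
    """Build a CloudWatch Embedded Metric Format log line for one request."""
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": "AlphaZed/Cache",
                "Dimensions": [["Endpoint"]],
                "Metrics": [
                    {"Name": "CacheHitRate", "Unit": "None"},
                    {"Name": "ResponseTime", "Unit": "Milliseconds"},
                ],
            }],
        },
        "Endpoint": endpoint,
        "CacheHitRate": 1 if hit else 0,
        "ResponseTime": duration_ms,
    })

# In a Lambda, printing the line to stdout is enough:
# print(emf_cache_metric(True, '/user/profile', 25))
```

This trades a synchronous API call per request for a log write, at the cost of slightly delayed metric availability.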
Redis INFO Monitoring:
# Scheduled Lambda (runs every 5 minutes)
def monitor_redis_health():
    """Monitor Redis health and emit metrics."""
    info = redis_client.info()
    cloudwatch.put_metric_data(
        Namespace='AlphaZed/Redis',
        MetricData=[
            {'MetricName': 'MemoryUsed', 'Value': info['used_memory']},
            {'MetricName': 'ConnectedClients', 'Value': info['connected_clients']},
            {'MetricName': 'KeyspaceHits', 'Value': info['keyspace_hits']},
            {'MetricName': 'KeyspaceMisses', 'Value': info['keyspace_misses']},
            {'MetricName': 'EvictedKeys', 'Value': info['evicted_keys']}
        ]
    )
Cache Dashboard
CloudWatch Dashboard:
Redis Cache Dashboard
┌────────────────────────────────────────────────┐
│ Cache Hit Rate (1h):                     99.2% │
│ Avg Response Time (Hit):                  24ms │
│ Avg Response Time (Miss):                145ms │
│ Total Keys:                            125,000 │
│ Memory Used:                   250 MB / 512 MB │
│ Evicted Keys (24h):                         12 │
│ Connected Clients:                          45 │
│ Network In:                           5.2 MB/s │
│ Network Out:                          8.5 MB/s │
└────────────────────────────────────────────────┘
Edge Cases and Considerations
Cache Stampede Prevention
Problem: When a popular cache key expires, multiple requests simultaneously query the database.
Solution: Soft Expiration + Lock
import json
import time

def cache_with_lock(cache_key, ttl, fetch_func, _retries=3):
    """Prevent cache stampede with soft expiration plus a fetch lock."""
    # Check cache
    cached = redis_client.get(cache_key)
    if cached:
        data = json.loads(cached)
        # Fresh if still inside the soft-expiry window (90% of TTL)
        if data.get('cached_at', 0) + (ttl * 0.9) > time.time():
            return data['value']

    # Try to acquire the fetch lock
    lock_key = f"lock:{cache_key}"
    if redis_client.set(lock_key, '1', nx=True, ex=10):
        # We hold the lock: fetch fresh data and repopulate the cache
        try:
            value = fetch_func()
            redis_client.setex(
                cache_key,
                ttl,
                json.dumps({'value': value, 'cached_at': time.time()})
            )
            return value
        finally:
            redis_client.delete(lock_key)

    # Another request is fetching; serve stale data if we have it
    if cached:
        return json.loads(cached)['value']

    # No stale copy: wait briefly and retry (bounded to avoid deep recursion)
    if _retries <= 0:
        return fetch_func()  # Last resort: hit the source directly
    time.sleep(0.1)
    return cache_with_lock(cache_key, ttl, fetch_func, _retries - 1)
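To see the lock behaviour end to end, here is a condensed, self-contained re-run of the pattern against an in-memory stub. `StubRedis` and `cached_fetch` are illustrative names, and the stub only implements the handful of redis-py calls the pattern uses (TTLs are not enforced):

```python
import json
import time

class StubRedis:
    """In-memory stub of the redis-py calls used by the stampede pattern."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value, nx=False, ex=None):
        if nx and key in self.store:
            return None  # lock already held
        self.store[key] = value
        return True
    def setex(self, key, ttl, value):
        self.store[key] = value
    def delete(self, key):
        self.store.pop(key, None)

def cached_fetch(client, cache_key, ttl, fetch_func):
    """Condensed cache_with_lock: serve fresh, else lock-and-fetch, else stale."""
    cached = client.get(cache_key)
    if cached:
        data = json.loads(cached)
        if data['cached_at'] + ttl * 0.9 > time.time():
            return data['value'], 'fresh'
    if client.set(f"lock:{cache_key}", '1', nx=True, ex=10):
        try:
            value = fetch_func()
            client.setex(cache_key, ttl,
                         json.dumps({'value': value, 'cached_at': time.time()}))
            return value, 'fetched'
        finally:
            client.delete(f"lock:{cache_key}")
    return (json.loads(cached)['value'], 'stale') if cached else (fetch_func(), 'direct')

calls = []
fetch = lambda: calls.append(1) or 'profile-data'  # counts real fetches
r = StubRedis()
v1, how1 = cached_fetch(r, 'profile:123', 300, fetch)  # miss → fetches and caches
v2, how2 = cached_fetch(r, 'profile:123', 300, fetch)  # fresh hit → no fetch
```

The second call returns from cache without invoking `fetch`, and the lock key is released after the first fetch.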
Graceful Cache Failure
Fail-Open Pattern:
try:
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)
except redis.RedisError as e:
    logger.warning(f"Redis error, falling back to DB: {e}")
    # Continue to database query
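The fail-open path is easy to unit-test by injecting a client that always raises. A minimal sketch (`DownRedis` and `get_cached_or_db` are illustrative names, not part of the codebase above):

```python
import json
import logging

logger = logging.getLogger(__name__)

class DownRedis:
    """Stub client behaving like a Redis whose connection always fails."""
    class RedisError(Exception):
        pass
    def get(self, key):
        raise self.RedisError("connection refused")

def get_cached_or_db(client, cache_key, db_query):
    """Serve from cache when possible; on any cache error, fall back to the DB."""
    try:
        cached = client.get(cache_key)
        if cached:
            return json.loads(cached)
    except Exception as e:  # in production: except redis.RedisError
        logger.warning(f"Redis error, falling back to DB: {e}")
    return db_query()

# Cache is down, yet the request still succeeds via the DB fallback
row = get_cached_or_db(DownRedis(), 'profile:123', lambda: {'id': 123})
```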
Results Summary
API Response Caching Impact (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric Before After Change │
│ Cache hit rate: 0% 99% +99% │
│ Avg response time: 150ms 25ms -83% │
│ Database queries/day: 1M 10k -99% │
│ RDS CPU utilization: 65% 8% -88% │
│ RDS instance cost: $210 $60 -71% │
│ Lambda duration cost: $320 $45 -86% │
│ Redis cost: $0 $18 +$18 │
│ Net monthly savings: - - $407 │
└────────────────────────────────────────────────┘
Quantified Outcomes:
- 99% cache hit rate - 29.7M hits out of 30M requests
- 83% faster responses - 150ms → 25ms for cached requests
- 99% database load reduction - 1M → 10k queries/day
- $407/month saved - Net savings after Redis costs
Key Takeaways
- Caching delivers compounding benefits. Faster responses + reduced database load + lower costs = force multiplier.
- 99% hit rate is achievable with proper TTLs. Strategic TTL configuration based on data change frequency maximizes cache effectiveness.
- Fail-open cache design is critical. If Redis fails, fall back to the database; never block user requests.
- Cache invalidation is the hard part. Write-through invalidation on updates ensures data consistency.
- Monitor cache hit rates religiously. A dropping hit rate signals stale TTL configuration or changing access patterns.
API response caching with Redis delivered exceptional ROI: an $18/month Redis instance yielded $425/month in gross savings ($407 net) while dramatically improving user experience.
Related Posts:
- RDS Query Optimization: Database Performance Analysis
- Scheduler Query Optimization: Background Job Efficiency
- Thin Lambda Consolidation: Unified Function Architecture
Documentation: 2026-01-25-api-response-caching-plan.md
Impact: 99% cache hit rate, 83% faster responses, $407/month saved