API Response Caching Strategy: Reduce Database Load

·performance

Repeated API calls for identical data caused unnecessary database queries and increased latency. We implemented a Redis caching layer with strategic TTLs, achieving a 99% cache hit rate and cutting database load by 99%.

The Repeated Query Problem

Mobile apps naturally make repetitive API calls: users navigate between screens, refresh content, and retry failed requests. Without caching, each request hit the database, even when returning identical data from seconds earlier.

Repetitive Request Patterns:

  • User profile fetched on every screen load (30+ times per session)
  • Subject tree loaded on app launch and tab switches (10+ times per session)
  • Static content (lessons, media) refetched unnecessarily
  • Configuration data queried on every API call

For 10,000 daily active users averaging 50 API calls per session (about 100 calls per user per day across sessions), this generated roughly 1,000,000 database queries daily, about 95% of which returned unchanged data.

Before: No Caching Layer

API Request Flow (No Caching)
┌──────────────────────────────────────────────────┐
│ Mobile App                                       │
│   │                                              │
│   ├─ Screen 1: User Profile                     │
│   │   └─ GET /user/profile                      │
│   │        │                                     │
│   │        ▼                                     │
│   │   ┌─────────────────────────────────────┐   │
│   │   │ Lambda → RDS Query                  │   │
│   │   │ SELECT * FROM users WHERE id = 123  │   │
│   │   │ Response time: 150ms                │   │
│   │   └─────────────────────────────────────┘   │
│   │                                              │
│   ├─ Screen 2: User Profile (again)             │
│   │   └─ GET /user/profile                      │
│   │        │                                     │
│   │        ▼                                     │
│   │   ┌─────────────────────────────────────┐   │
│   │   │ Lambda → RDS Query (DUPLICATE)      │   │
│   │   │ SELECT * FROM users WHERE id = 123  │   │
│   │   │ Response time: 150ms                │   │
│   │   └─────────────────────────────────────┘   │
│   │                                              │
│   └─ Screen 3: User Profile (again)             │
│       └─ GET /user/profile                      │
│            │                                     │
│            ▼                                     │
│       ┌─────────────────────────────────────┐   │
│       │ Lambda → RDS Query (DUPLICATE)      │   │
│       │ SELECT * FROM users WHERE id = 123  │   │
│       │ Response time: 150ms                │   │
│       └─────────────────────────────────────┘   │
└──────────────────────────────────────────────────┘

Issues:
- Same query executed 3 times
- Database loaded with redundant queries
- Wasted Lambda execution time
- Higher RDS CPU utilization
- No reduction in latency for repeated requests

Database Load Analysis:

Daily Request Patterns (No Caching)
┌────────────────────────────────────────────────┐
│ Endpoint              Calls/day   DB Queries  │
│ /user/profile         300,000     300,000     │
│ /subjects/tree        150,000     150,000     │
│ /content/lesson/:id   200,000     200,000     │
│ /config               100,000     100,000     │
│ /user/progress        80,000      80,000      │
│ Other endpoints       170,000     170,000     │
│                                                │
│ Total:                1,000,000   1,000,000   │
│                                                │
│ Duplicate queries: ~95% (950,000)              │
│ Unique data changes: ~5% (50,000)              │
└────────────────────────────────────────────────┘

After: Redis Caching Layer

API Request Flow (With Redis Caching)
┌──────────────────────────────────────────────────┐
│ Mobile App                                       │
│   │                                              │
│   ├─ Screen 1: User Profile                     │
│   │   └─ GET /user/profile                      │
│   │        │                                     │
│   │        ▼                                     │
│   │   ┌─────────────────────────────────────┐   │
│   │   │ Lambda → Redis (MISS)               │   │
│   │   │ Lambda → RDS Query                  │   │
│   │   │ SELECT * FROM users WHERE id = 123  │   │
│   │   │ Lambda → Cache result in Redis      │   │
│   │   │ Response time: 150ms                │   │
│   │   └─────────────────────────────────────┘   │
│   │                                              │
│   ├─ Screen 2: User Profile (again)             │
│   │   └─ GET /user/profile                      │
│   │        │                                     │
│   │        ▼                                     │
│   │   ┌─────────────────────────────────────┐   │
│   │   │ Lambda → Redis (HIT) ✓              │   │
│   │   │ Return cached response              │   │
│   │   │ Response time: 25ms                 │   │
│   │   │ (No database query)                 │   │
│   │   └─────────────────────────────────────┘   │
│   │                                              │
│   └─ Screen 3: User Profile (again)             │
│       └─ GET /user/profile                      │
│            │                                     │
│            ▼                                     │
│       ┌─────────────────────────────────────┐   │
│       │ Lambda → Redis (HIT) ✓              │   │
│       │ Return cached response              │   │
│       │ Response time: 25ms                 │   │
│       │ (No database query)                 │   │
│       └─────────────────────────────────────┘   │
└──────────────────────────────────────────────────┘

Benefits:
- First request: Cache miss, query DB (150ms)
- Subsequent requests: Cache hit, no DB (25ms)
- 99% reduction in database queries
- 83% reduction in response time for cached requests
- Lower RDS CPU utilization

Implementation Details

Caching Strategy Design

Cache Key Patterns:

# src/core/cache.py
import hashlib
import json

def generate_cache_key(endpoint, user_id=None, params=None):
    """Generate consistent cache keys for API responses."""
    key_parts = [
        'api',              # Namespace
        endpoint,           # Endpoint path
        str(user_id) if user_id else 'global',
        # sort_keys makes the hash independent of parameter order
        hashlib.md5(json.dumps(params or {}, sort_keys=True).encode()).hexdigest()[:8]
    ]
    return ':'.join(key_parts)

# Examples:
# api:/user/profile:12345:abc123de
# api:/subjects/tree:global:def456gh
# api:/content/lesson/42:12345:xyz789ij
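Because `sort_keys=True` normalizes the JSON before hashing, identical parameters always map to the same key regardless of dict ordering. A quick standalone check (restating the helper here so the snippet runs on its own):

```python
import hashlib
import json

def generate_cache_key(endpoint, user_id=None, params=None):
    """Same helper as above: namespace, endpoint, user scope, params hash."""
    digest = hashlib.md5(
        json.dumps(params or {}, sort_keys=True).encode()
    ).hexdigest()[:8]
    return ':'.join(['api', endpoint, str(user_id) if user_id else 'global', digest])

# Parameter order does not change the key...
key_a = generate_cache_key('/subjects/tree', params={'lang': 'en', 'depth': 2})
key_b = generate_cache_key('/subjects/tree', params={'depth': 2, 'lang': 'en'})
assert key_a == key_b

# ...but different parameter values do
key_c = generate_cache_key('/subjects/tree', params={'depth': 3, 'lang': 'en'})
assert key_a != key_c
```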

TTL (Time-To-Live) Strategy:

CACHE_TTL_CONFIG = {
    # User-specific data (changes frequently)
    '/user/profile': 300,           # 5 minutes
    '/user/progress': 300,          # 5 minutes
    '/user/attempts': 180,          # 3 minutes

    # Shared content (changes rarely)
    '/subjects/tree': 3600,         # 1 hour
    '/content/lesson': 1800,        # 30 minutes
    '/content/media': 1800,         # 30 minutes

    # Static config (changes very rarely)
    '/config': 7200,                # 2 hours
    '/app/settings': 7200,          # 2 hours

    # Dynamic data (changes constantly)
    '/leaderboard': 60,             # 1 minute
    '/analytics/live': 30,          # 30 seconds
}
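Config keys like `/content/lesson` also have to match concrete paths like `/content/lesson/42`, so the TTL lookup needs prefix matching. A sketch of such a helper (hypothetical: shown with a trimmed config and an assumed default TTL for unlisted endpoints):

```python
CACHE_TTL_CONFIG = {
    '/user/profile': 300,
    '/subjects/tree': 3600,
    '/content/lesson': 1800,
    '/config': 7200,
    '/leaderboard': 60,
}

DEFAULT_TTL = 300  # assumed fallback for endpoints not in the config

def get_ttl(path, config=CACHE_TTL_CONFIG, default=DEFAULT_TTL):
    """Return the TTL for the longest configured prefix of `path`."""
    matches = [p for p in config if path == p or path.startswith(p + '/')]
    if not matches:
        return default
    # Longest prefix wins, so '/content/lesson' beats a hypothetical '/content'
    return config[max(matches, key=len)]
```

For example, `get_ttl('/content/lesson/42')` resolves to the 30-minute lesson TTL, while an unknown path falls back to the default.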

Redis Setup

ElastiCache Configuration:

# Infrastructure config
ElastiCacheCluster:
  Type: AWS::ElastiCache::CacheCluster
  Properties:
    CacheNodeType: cache.t3.micro  # $15/month
    Engine: redis
    NumCacheNodes: 1
    Port: 6379
    VpcSecurityGroupIds:
      - !Ref RedisSecurityGroup
    CacheSubnetGroupName: !Ref CacheSubnetGroup

# Security group allows Lambda → Redis
RedisSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Allow Lambda to Redis
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 6379
        ToPort: 6379
        SourceSecurityGroupId: !Ref LambdaSecurityGroup

Caching Decorator

Flask Route Caching:

# src/core/cache.py
import json
import logging
import os
from functools import wraps

import redis

logger = logging.getLogger(__name__)
REDIS_HOST = os.environ['REDIS_HOST']  # ElastiCache endpoint

redis_client = redis.Redis(
    host=REDIS_HOST,
    port=6379,
    db=0,
    socket_connect_timeout=2,
    socket_timeout=2
)

def cache_response(ttl=300, key_func=None):
    """Decorator to cache API responses in Redis."""
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # Generate cache key
            if key_func:
                cache_key = key_func(*args, **kwargs)
            else:
                cache_key = generate_default_key(f.__name__, kwargs)

            # Try to get from cache
            try:
                cached = redis_client.get(cache_key)
                if cached:
                    logger.info(f"Cache HIT: {cache_key}")
                    return json.loads(cached)
            except redis.RedisError as e:
                logger.warning(f"Cache read error: {e}")
                # Fail open: continue to database

            # Cache miss: Execute function
            logger.info(f"Cache MISS: {cache_key}")
            result = f(*args, **kwargs)

            # Store in cache
            try:
                redis_client.setex(
                    cache_key,
                    ttl,
                    json.dumps(result)
                )
            except redis.RedisError as e:
                logger.warning(f"Cache write error: {e}")
                # Fail open: return result anyway

            return result
        return wrapper
    return decorator

Usage Example:

# src/resources/user/profile.py
from src.core.cache import cache_response

@app.route('/user/profile')
@authenticate
@cache_response(ttl=300, key_func=lambda: f"profile:{g.user.id}")
def get_user_profile():
    """Get user profile (cached for 5 minutes)."""
    user = db.query(User).filter_by(id=g.user.id).first()
    return jsonify(user.to_dict())

Cache Invalidation

Write-Through Invalidation:

# src/resources/user/profile.py
@app.route('/user/profile', methods=['PUT'])
@authenticate
def update_user_profile():
    """Update user profile and invalidate cache."""
    user = db.query(User).filter_by(id=g.user.id).first()
    user.name = request.json.get('name')
    user.bio = request.json.get('bio')
    db.commit()

    # Invalidate cache
    cache_key = f"profile:{g.user.id}"
    redis_client.delete(cache_key)
    logger.info(f"Cache invalidated: {cache_key}")

    return jsonify(user.to_dict())

Pattern-Based Invalidation:

def invalidate_user_cache(user_id):
    """Invalidate all cache keys for a user."""
    # SCAN instead of KEYS: KEYS blocks Redis while it walks the keyspace
    pattern = f"api:*:{user_id}:*"
    keys = list(redis_client.scan_iter(match=pattern))
    if keys:
        redis_client.delete(*keys)
        logger.info(f"Invalidated {len(keys)} cache keys for user {user_id}")

Performance Impact

Cache Hit Rate

30-Day Cache Metrics:

Redis Cache Performance
┌────────────────────────────────────────────────┐
│ Metric                   Value                 │
│ Total requests:          30,000,000            │
│ Cache hits:              29,700,000            │
│ Cache misses:            300,000               │
│ Cache hit rate:          99.0%                 │
│                                                │
│ Avg response time:                             │
│ - Cache hit:             25ms                  │
│ - Cache miss:            150ms                 │
│ - Overall avg:           26.25ms               │
│                                                │
│ Database queries:                              │
│ - Before caching:        30,000,000            │
│ - After caching:         300,000               │
│ - Reduction:             99.0%                 │
└────────────────────────────────────────────────┘
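The overall average follows directly from the hit rate:

```python
hit_rate = 0.99
hit_ms, miss_ms = 25, 150

# Weighted average: 99% of requests cost 25ms, the remaining 1% cost 150ms
overall_ms = hit_rate * hit_ms + (1 - hit_rate) * miss_ms
print(round(overall_ms, 2))  # 26.25
```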

Per-Endpoint Cache Hit Rates:

Endpoint Cache Hit Rates (30-day period)
┌────────────────────────────────────────────────────────┐
│ Endpoint             Requests  Hits      Hit Rate      │
│ /user/profile        9,000,000 8,955,000 99.5%        │
│ /subjects/tree       4,500,000 4,485,000 99.7%        │
│ /content/lesson/*    6,000,000 5,940,000 99.0%        │
│ /config              3,000,000 2,997,000 99.9%        │
│ /user/progress       2,400,000 2,340,000 97.5%        │
│ /content/media/*     3,000,000 2,970,000 99.0%        │
│ Other endpoints      2,100,000 2,013,000 95.9%        │
│                                                        │
│ Total:               30,000,000 29,700,000 99.0%      │
└────────────────────────────────────────────────────────┘

Response Time Improvement

Latency Reduction:

API Response Time Comparison (P50/P95/P99)
┌────────────────────────────────────────────────┐
│ Endpoint           Before          After       │
│ /user/profile      150/200/300ms   25/30/150ms │
│ /subjects/tree     120/180/250ms   20/25/120ms │
│ /content/lesson    180/250/350ms   30/35/180ms │
│ /config            100/150/200ms   15/20/100ms │
│ /user/progress     200/300/450ms   35/40/200ms │
│                                                │
│ Overall average:   150ms           25ms        │
│ Improvement:       83% faster                  │
└────────────────────────────────────────────────┘

Database Load Reduction

RDS Query Volume:

Database Query Reduction (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric                Before      After        │
│ Daily queries:        1,000,000   10,000       │
│ Queries/second:       11.6        0.12         │
│ RDS CPU utilization:  65%         8%           │
│ Read IOPS:            5,000       250          │
│ Connection pool use:  140/150     15/150       │
│                                                │
│ Database load reduced by 99%                   │
└────────────────────────────────────────────────┘

Cost Impact

Infrastructure Costs

Redis ElastiCache Cost:

ElastiCache Costs (Monthly)
┌────────────────────────────────────────────────┐
│ Instance type:      cache.t3.micro             │
│ Monthly cost:       $15                        │
│ Data transfer:      $3                         │
│ Total:              $18/month                  │
└────────────────────────────────────────────────┘

RDS Cost Savings:

RDS Cost Savings (Monthly)
┌────────────────────────────────────────────────┐
│ Before caching:                                │
│ - Instance: db.r5.large                        │
│ - CPU load: 65%                                │
│ - IOPS: 5,000                                  │
│ - Cost: $210/month                             │
│                                                │
│ After caching:                                 │
│ - Instance: db.t3.medium (downsized)           │
│ - CPU load: 8%                                 │
│ - IOPS: 250                                    │
│ - Cost: $60/month                              │
│                                                │
│ RDS savings: $150/month                        │
└────────────────────────────────────────────────┘

Lambda Cost Savings:

Lambda Costs (Monthly)
┌────────────────────────────────────────────────┐
│                        Before    After  Savings│
│ Invocations:           $400      $400   $0    │
│ Duration (DB queries): $320      $45    $275  │
│ Total:                 $720      $445   $275  │
│                                                │
│ Duration breakdown:                            │
│ - Before: 30M requests × 150ms avg = 1,250h   │
│ - After:  30M requests × 25ms avg = 208h      │
│ - Time saved: 1,042 hours/month                │
└────────────────────────────────────────────────┘
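The duration breakdown checks out with a quick back-of-the-envelope calculation:

```python
requests_per_month = 30_000_000

before_hours = requests_per_month * 0.150 / 3600  # 150ms avg: every request hits RDS
after_hours = requests_per_month * 0.025 / 3600   # 25ms avg: mostly Redis hits

print(round(before_hours))                # 1250
print(round(after_hours))                 # 208
print(round(before_hours - after_hours))  # 1042
```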

Net Savings:

Total Cost Impact (Monthly)
┌────────────────────────────────────────────────┐
│ Redis cost:         -$18                       │
│ RDS savings:        +$150                      │
│ Lambda savings:     +$275                      │
│                                                │
│ Net savings:        $407/month                 │
└────────────────────────────────────────────────┘

Monitoring and Observability

CloudWatch Metrics

Custom Cache Metrics:

# src/core/cache.py
import boto3
cloudwatch = boto3.client('cloudwatch')

def emit_cache_metrics(hit: bool, endpoint: str, duration_ms: int):
    """Emit cache hit/miss metrics to CloudWatch."""
    cloudwatch.put_metric_data(
        Namespace='AlphaZed/Cache',
        MetricData=[
            {
                'MetricName': 'CacheHitRate',
                'Value': 1 if hit else 0,
                'Unit': 'None',
                'Dimensions': [
                    {'Name': 'Endpoint', 'Value': endpoint}
                ]
            },
            {
                'MetricName': 'ResponseTime',
                'Value': duration_ms,
                'Unit': 'Milliseconds',
                'Dimensions': [
                    {'Name': 'CacheStatus', 'Value': 'Hit' if hit else 'Miss'}
                ]
            }
        ]
    )

Redis INFO Monitoring:

# Scheduled Lambda (runs every 5 minutes)
def monitor_redis_health():
    """Monitor Redis health and emit metrics."""
    info = redis_client.info()

    cloudwatch.put_metric_data(
        Namespace='AlphaZed/Redis',
        MetricData=[
            {'MetricName': 'MemoryUsed', 'Value': info['used_memory']},
            {'MetricName': 'ConnectedClients', 'Value': info['connected_clients']},
            {'MetricName': 'KeyspaceHits', 'Value': info['keyspace_hits']},
            {'MetricName': 'KeyspaceMisses', 'Value': info['keyspace_misses']},
            {'MetricName': 'EvictedKeys', 'Value': info['evicted_keys']}
        ]
    )
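CloudWatch can graph the raw counters, but the hit rate itself has to be derived from `keyspace_hits` and `keyspace_misses`. A small helper (hypothetical; it takes the INFO stats dict directly so it is easy to test without a live Redis):

```python
def keyspace_hit_rate(info):
    """Compute cache hit rate from a Redis INFO stats dict."""
    hits = info.get('keyspace_hits', 0)
    misses = info.get('keyspace_misses', 0)
    total = hits + misses
    return hits / total if total else 0.0

# Example with the 30-day figures from above
rate = keyspace_hit_rate({'keyspace_hits': 29_700_000, 'keyspace_misses': 300_000})
print(f"{rate:.1%}")  # 99.0%
```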

Cache Dashboard

CloudWatch Dashboard:

Redis Cache Dashboard
┌────────────────────────────────────────────────┐
│ Cache Hit Rate (1h):        99.2%              │
│ Avg Response Time (Hit):    24ms               │
│ Avg Response Time (Miss):   145ms              │
│ Total Keys:                 125,000            │
│ Memory Used:                250 MB / 512 MB    │
│ Evicted Keys (24h):         12                 │
│ Connected Clients:          45                 │
│ Network In:                 5.2 MB/s           │
│ Network Out:                8.5 MB/s           │
└────────────────────────────────────────────────┘

Edge Cases and Considerations

Cache Stampede Prevention

Problem: When a popular cache key expires, multiple requests simultaneously query the database.

Solution: Soft Expiration + Lock

# Assumes redis_client and json from src/core/cache.py
import time

def cache_with_lock(cache_key, ttl, fetch_func):
    """Prevent cache stampede with soft expiration + a Redis lock."""
    # Check cache
    cached = redis_client.get(cache_key)
    if cached:
        data = json.loads(cached)
        # Still fresh if within the soft expiry window (90% of TTL)
        if data.get('cached_at', 0) + (ttl * 0.9) > time.time():
            return data['value']

    # Try to acquire lock (expires after 10s so a crashed worker can't wedge it)
    lock_key = f"lock:{cache_key}"
    if redis_client.set(lock_key, '1', nx=True, ex=10):
        # We got the lock: fetch fresh data
        try:
            value = fetch_func()
            redis_client.setex(
                cache_key,
                ttl,
                json.dumps({'value': value, 'cached_at': time.time()})
            )
        finally:
            redis_client.delete(lock_key)
        return value
    else:
        # Another request is already refreshing: serve stale data if we have it
        if cached:
            return json.loads(cached)['value']
        # Cold cache and no lock: wait briefly and retry
        time.sleep(0.1)
        return cache_with_lock(cache_key, ttl, fetch_func)

Graceful Cache Failure

Fail-Open Pattern:

try:
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)
except redis.RedisError as e:
    logger.warning(f"Redis error, falling back to DB: {e}")
    # Continue to database query

Results Summary

API Response Caching Impact (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric                 Before    After  Change │
│ Cache hit rate:        0%        99%    +99%  │
│ Avg response time:     150ms     25ms   -83%  │
│ Database queries/day:  1M        10k    -99%  │
│ RDS CPU utilization:   65%       8%     -88%  │
│ RDS instance cost:     $210      $60    -71%  │
│ Lambda duration cost:  $320      $45    -86%  │
│ Redis cost:            $0        $18    +$18  │
│ Net monthly savings:   -         -      $407  │
└────────────────────────────────────────────────┘

Quantified Outcomes:

  • 99% cache hit rate - 29.7M hits out of 30M requests
  • 83% faster responses - 150ms → 25ms for cached requests
  • 99% database load reduction - 1M → 10k queries/day
  • $407/month saved - Net savings after Redis costs

Key Takeaways

  1. Caching delivers compounding benefits. Faster responses + reduced database load + lower costs = force multiplier.

  2. 99% hit rate is achievable with proper TTLs. Strategic TTL configuration based on data change frequency maximizes cache effectiveness.

  3. Fail-open cache design is critical. If Redis fails, fall back to database—never block user requests.

  4. Cache invalidation is the hard part. Write-through invalidation on updates ensures data consistency.

  5. Monitor cache hit rates religiously. A dropping hit rate signals stale TTL configuration or changing access patterns.

API response caching with Redis delivered exceptional ROI: $18/month infrastructure cost generated $407/month in total savings while dramatically improving user experience.


Related Posts:

  • RDS Query Optimization: Database Performance Analysis
  • Scheduler Query Optimization: Background Job Efficiency
  • Thin Lambda Consolidation: Unified Function Architecture

Documentation: 2026-01-25-api-response-caching-plan.md
Impact: 99% cache hit rate, 83% faster responses, $407/month saved