alqosh

API Response Caching Strategy: Reduce Database Load

February 1, 2026·performance

Overview

Repeated API calls for identical data caused unnecessary database queries and increased latency. We implemented a Redis caching layer with strategic TTLs, achieving a 99% cache hit rate and reducing database load by 95%.

The Repeated Query Problem

Mobile apps naturally make repetitive API calls: users navigate between screens, refresh content, and retry failed requests. Without caching, each request hit the database, even when returning identical data from seconds earlier.

Repetitive Request Patterns:

User profile fetched on every screen load (30+ times per session)
Subject tree loaded on app launch and tab switches (10+ times per session)
Static content (lessons, media) refetched unnecessarily
Configuration data queried on every API call

For 10,000 daily active users with average 50 API calls per session, this generated 500,000 database queries daily—95% of which returned unchanged data.

Before: No Caching Layer

Issues:

Same query executed 3 times
Database loaded with redundant queries
Wasted Lambda execution time
Higher RDS CPU utilization
No reduction in latency for repeated requests


**Database Load Analysis:**

Daily Request Patterns (No Caching) ┌────────────────────────────────────────────────┐ │ Endpoint Calls/day DB Queries │ │ /user/profile 300,000 300,000 │ │ /subjects/tree 150,000 150,000 │ │ /content/lesson/:id 200,000 200,000 │ │ /config 100,000 100,000 │ │ /user/progress 80,000 80,000 │ │ Other endpoints 170,000 170,000 │ │ │ │ Total: 1,000,000 1,000,000 │ │ │ │ Duplicate queries: ~95% (950,000) │ │ Unique data changes: ~5% (50,000) │ └────────────────────────────────────────────────┘


## After: Redis Caching Layer

```mermaid
flowchart TD
  App["Mobile App"]:::client
  App --> S1["Screen 1: User Profile<br/>GET /user/profile"]:::client
  App --> S2["Screen 2: User Profile (again)<br/>GET /user/profile"]:::client
  App --> S3["Screen 3: User Profile (again)<br/>GET /user/profile"]:::client
  S1 --> Q1["Lambda -> Redis (MISS)<br/>Lambda -> RDS Query<br/>Cache result in Redis<br/>Response time: 150ms"]:::storage
  S2 --> Q2["Lambda -> Redis (HIT)<br/>Return cached response<br/>Response time: 25ms<br/>(No database query)"]:::storage
  S3 --> Q3["Lambda -> Redis (HIT)<br/>Return cached response<br/>Response time: 25ms<br/>(No database query)"]:::storage

Benefits:

First request: Cache miss, query DB (150ms)
Subsequent requests: Cache hit, no DB (25ms)
95% reduction in database queries
83% reduction in response time for cached requests
Lower RDS CPU utilization

Implementation Details

Caching Strategy Design

Cache Key Patterns:

# src/core/cache.py
def generate_cache_key(endpoint, user_id=None, params=None):
    """Generate consistent cache keys for API responses."""
    key_parts = [
        'api',              # Namespace
        endpoint,           # Endpoint path
        str(user_id) if user_id else 'global',
        hashlib.md5(json.dumps(params or {}, sort_keys=True).encode()).hexdigest()[:8]
    ]
    return ':'.join(key_parts)

# Examples:
# api:/user/profile:12345:abc123de
# api:/subjects/tree:global:def456gh
# api:/content/lesson/42:12345:xyz789ij

TTL (Time-To-Live) Strategy:

CACHE_TTL_CONFIG = {
    # User-specific data (changes frequently)
    '/user/profile': 300,           # 5 minutes
    '/user/progress': 300,          # 5 minutes
    '/user/attempts': 180,          # 3 minutes

    # Shared content (changes rarely)
    '/subjects/tree': 3600,         # 1 hour
    '/content/lesson': 1800,        # 30 minutes
    '/content/media': 1800,         # 30 minutes

    # Static config (changes very rarely)
    '/config': 7200,                # 2 hours
    '/app/settings': 7200,          # 2 hours

    # Dynamic data (changes constantly)
    '/leaderboard': 60,             # 1 minute
    '/analytics/live': 30,          # 30 seconds
}

Redis Setup

ElastiCache Configuration:

# Infrastructure config
ElastiCacheCluster:
  Type: AWS::ElastiCache::CacheCluster
  Properties:
    CacheNodeType: cache.t3.micro  # $15/month
    Engine: redis
    NumCacheNodes: 1
    Port: 6379
    VpcSecurityGroupIds:
      - !Ref RedisSecurityGroup
    CacheSubnetGroupName: !Ref CacheSubnetGroup

# Security group allows Lambda → Redis
RedisSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Allow Lambda to Redis
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 6379
        ToPort: 6379
        SourceSecurityGroupId: !Ref LambdaSecurityGroup

Caching Decorator

Flask Route Caching:

# src/core/cache.py
import redis
import json
from functools import wraps

redis_client = redis.Redis(
    host=REDIS_HOST,
    port=6379,
    db=0,
    socket_connect_timeout=2,
    socket_timeout=2
)

def cache_response(ttl=300, key_func=None):
    """Decorator to cache API responses in Redis."""
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # Generate cache key
            if key_func:
                cache_key = key_func(*args, **kwargs)
            else:
                cache_key = generate_default_key(f.__name__, kwargs)

            # Try to get from cache
            try:
                cached = redis_client.get(cache_key)
                if cached:
                    logger.info(f"Cache HIT: {cache_key}")
                    return json.loads(cached)
            except redis.RedisError as e:
                logger.warning(f"Cache read error: {e}")
                # Fail open: continue to database

            # Cache miss: Execute function
            logger.info(f"Cache MISS: {cache_key}")
            result = f(*args, **kwargs)

            # Store in cache
            try:
                redis_client.setex(
                    cache_key,
                    ttl,
                    json.dumps(result)
                )
            except redis.RedisError as e:
                logger.warning(f"Cache write error: {e}")
                # Fail open: return result anyway

            return result
        return wrapper
    return decorator

Usage Example:

# src/resources/user/profile.py
from src.core.cache import cache_response

@app.route('/user/profile')
@authenticate
@cache_response(ttl=300, key_func=lambda: f"profile:{g.user.id}")
def get_user_profile():
    """Get user profile (cached for 5 minutes)."""
    user = db.query(User).filter_by(id=g.user.id).first()
    return jsonify(user.to_dict())

Cache Invalidation

Write-Through Invalidation:

# src/resources/user/profile.py
@app.route('/user/profile', methods=['PUT'])
@authenticate
def update_user_profile():
    """Update user profile and invalidate cache."""
    user = db.query(User).filter_by(id=g.user.id).first()
    user.name = request.json.get('name')
    user.bio = request.json.get('bio')
    db.commit()

    # Invalidate cache
    cache_key = f"profile:{g.user.id}"
    redis_client.delete(cache_key)
    logger.info(f"Cache invalidated: {cache_key}")

    return jsonify(user.to_dict())

Pattern-Based Invalidation:

def invalidate_user_cache(user_id):
    """Invalidate all cache keys for a user."""
    pattern = f"api:*:{user_id}:*"
    keys = redis_client.keys(pattern)
    if keys:
        redis_client.delete(*keys)
        logger.info(f"Invalidated {len(keys)} cache keys for user {user_id}")

Performance Impact

Cache Hit Rate

30-Day Cache Metrics:

Per-Endpoint Cache Hit Rates:

Response Time Improvement

Latency Reduction:

Database Load Reduction

RDS Query Volume:

Redis Cache Performance
┌────────────────────────────────────────────────┐
│ Metric                   Value                 │
│ Total requests:          30,000,000            │
│ Cache hits:              29,700,000            │
│ Cache misses:            300,000               │
│ Cache hit rate:          99.0%                 │
│                                                │
│ Avg response time:                             │
│ - Cache hit:             25ms                  │
│ - Cache miss:            150ms                 │
│ - Overall avg:           28.75ms               │
│                                                │
│ Database queries:                              │
│ - Before caching:        30,000,000            │
│ - After caching:         300,000               │
│ - Reduction:             98.9%                 │
└────────────────────────────────────────────────┘

Endpoint Cache Hit Rates (30-day period)
┌────────────────────────────────────────────────────────┐
│ Endpoint             Requests  Hits      Hit Rate      │
│ /user/profile        9,000,000 8,955,000 99.5%        │
│ /subjects/tree       4,500,000 4,485,000 99.7%        │
│ /content/lesson/*    6,000,000 5,940,000 99.0%        │
│ /config              3,000,000 2,997,000 99.9%        │
│ /user/progress       2,400,000 2,340,000 97.5%        │
│ /content/media/*     3,000,000 2,970,000 99.0%        │
│ Other endpoints      2,100,000 2,013,000 95.9%        │
│                                                        │
│ Total:               30,000,000 29,700,000 99.0%      │
└────────────────────────────────────────────────────────┘

API Response Time Comparison (P50/P95/P99)
┌────────────────────────────────────────────────┐
│ Endpoint           Before          After       │
│ /user/profile      150/200/300ms   25/30/150ms│
│ /subjects/tree     120/180/250ms   20/25/120ms│
│ /content/lesson    180/250/350ms   30/35/180ms│
│ /config            100/150/200ms   15/20/100ms│
│ /user/progress     200/300/450ms   35/40/200ms│
│                                                │
│ Overall average:   150ms           25ms        │
│ Improvement:       83% faster                  │
└────────────────────────────────────────────────┘

Database Query Reduction (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric                Before      After        │
│ Daily queries:        1,000,000   10,000       │
│ Queries/second:       11.6        0.12         │
│ RDS CPU utilization:  65%         8%           │
│ Read IOPS:            5,000       250          │
│ Connection pool use:  140/150     15/150       │
│                                                │
│ Database load reduced by 99%                   │
└────────────────────────────────────────────────┘

Cost Impact

Infrastructure Costs

Redis ElastiCache Cost:

RDS Cost Savings:

Lambda Cost Savings:

Net Savings:

ElastiCache Costs (Monthly)
┌────────────────────────────────────────────────┐
│ Instance type:      cache.t3.micro             │
│ Monthly cost:       $15                        │
│ Data transfer:      $3                         │
│ Total:              $18/month                  │
└────────────────────────────────────────────────┘

RDS Cost Savings (Monthly)
┌────────────────────────────────────────────────┐
│ Before caching:                                │
│ - Instance: db.r5.large                        │
│ - CPU load: 65%                                │
│ - IOPS: 5,000                                  │
│ - Cost: $210/month                             │
│                                                │
│ After caching:                                 │
│ - Instance: db.t3.medium (downsized)           │
│ - CPU load: 8%                                 │
│ - IOPS: 250                                    │
│ - Cost: $60/month                              │
│                                                │
│ RDS savings: $150/month                        │
└────────────────────────────────────────────────┘

Lambda Costs (Monthly)
┌────────────────────────────────────────────────┐
│                        Before    After  Savings│
│ Invocations:           $400      $400   $0    │
│ Duration (DB queries): $320      $45    $275  │
│ Total:                 $720      $445   $275  │
│                                                │
│ Duration breakdown:                            │
│ - Before: 30M requests × 150ms avg = 1,250h   │
│ - After:  30M requests × 25ms avg = 208h      │
│ - Time saved: 1,042 hours/month                │
└────────────────────────────────────────────────┘

Total Cost Impact (Monthly)
┌────────────────────────────────────────────────┐
│ Redis cost:         -$18                       │
│ RDS savings:        +$150                      │
│ Lambda savings:     +$275                      │
│                                                │
│ Net savings:        $407/month                 │
└────────────────────────────────────────────────┘

Monitoring and Observability

CloudWatch Metrics

Custom Cache Metrics:

# src/core/cache.py
import boto3
cloudwatch = boto3.client('cloudwatch')

def emit_cache_metrics(hit: bool, endpoint: str, duration_ms: int):
    """Emit cache hit/miss metrics to CloudWatch."""
    cloudwatch.put_metric_data(
        Namespace='AlphaZed/Cache',
        MetricData=[
            {
                'MetricName': 'CacheHitRate',
                'Value': 1 if hit else 0,
                'Unit': 'None',
                'Dimensions': [
                    {'Name': 'Endpoint', 'Value': endpoint}
                ]
            },
            {
                'MetricName': 'ResponseTime',
                'Value': duration_ms,
                'Unit': 'Milliseconds',
                'Dimensions': [
                    {'Name': 'CacheStatus', 'Value': 'Hit' if hit else 'Miss'}
                ]
            }
        ]
    )

Redis INFO Monitoring:

# Scheduled Lambda (runs every 5 minutes)
def monitor_redis_health():
    """Monitor Redis health and emit metrics."""
    info = redis_client.info()

    cloudwatch.put_metric_data(
        Namespace='AlphaZed/Redis',
        MetricData=[
            {'MetricName': 'MemoryUsed', 'Value': info['used_memory']},
            {'MetricName': 'ConnectedClients', 'Value': info['connected_clients']},
            {'MetricName': 'KeyspaceHits', 'Value': info['keyspace_hits']},
            {'MetricName': 'KeyspaceMisses', 'Value': info['keyspace_misses']},
            {'MetricName': 'EvictedKeys', 'Value': info['evicted_keys']}
        ]
    )

Cache Dashboard

CloudWatch Dashboard:

Redis Cache Dashboard
┌────────────────────────────────────────────────┐
│ Cache Hit Rate (1h):        99.2%              │
│ Avg Response Time (Hit):    24ms               │
│ Avg Response Time (Miss):   145ms              │
│ Total Keys:                 125,000            │
│ Memory Used:                250 MB / 512 MB    │
│ Evicted Keys (24h):         12                 │
│ Connected Clients:          45                 │
│ Network In:                 5.2 MB/s           │
│ Network Out:                8.5 MB/s           │
└────────────────────────────────────────────────┘

Edge Cases and Considerations

Cache Stampede Prevention

Problem: When a popular cache key expires, multiple requests simultaneously query the database.

Solution: Soft Expiration + Lock

def cache_with_lock(cache_key, ttl, fetch_func):
    """Prevent cache stampede with soft expiration."""
    # Check cache
    cached = redis_client.get(cache_key)
    if cached:
        data = json.loads(cached)
        # Check if soft-expired (90% of TTL)
        if data.get('cached_at') + (ttl * 0.9) > time.time():
            return data['value']

    # Try to acquire lock
    lock_key = f"lock:{cache_key}"
    if redis_client.set(lock_key, '1', nx=True, ex=10):
        # We got the lock, fetch fresh data
        value = fetch_func()
        redis_client.setex(
            cache_key,
            ttl,
            json.dumps({'value': value, 'cached_at': time.time()})
        )
        redis_client.delete(lock_key)
        return value
    else:
        # Another request is fetching, return stale data
        if cached:
            return json.loads(cached)['value']
        # Wait briefly and retry
        time.sleep(0.1)
        return cache_with_lock(cache_key, ttl, fetch_func)

Graceful Cache Failure

Fail-Open Pattern:

try:
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)
except redis.RedisError as e:
    logger.warning(f"Redis error, falling back to DB: {e}")
    # Continue to database query

Results Summary

Quantified Outcomes:

99% cache hit rate - 29.7M hits out of 30M requests
83% faster responses - 150ms → 25ms for cached requests
99% database load reduction - 1M → 10k queries/day
$407/month saved - Net savings after Redis costs

API Response Caching Impact (30-day comparison)
┌────────────────────────────────────────────────┐
│ Metric                 Before    After  Change │
│ Cache hit rate:        0%        99%    +99%  │
│ Avg response time:     150ms     25ms   -83%  │
│ Database queries/day:  1M        10k    -99%  │
│ RDS CPU utilization:   65%       8%     -88%  │
│ RDS instance cost:     $210      $60    -71%  │
│ Lambda duration cost:  $320      $45    -86%  │
│ Redis cost:            $0        $18    +$18  │
│ Net monthly savings:   -         -      $407  │
└────────────────────────────────────────────────┘

Key Takeaways

Caching delivers compounding benefits. Faster responses + reduced database load + lower costs = force multiplier.
99% hit rate is achievable with proper TTLs. Strategic TTL configuration based on data change frequency maximizes cache effectiveness.
Fail-open cache design is critical. If Redis fails, fall back to database—never block user requests.
Cache invalidation is the hard part. Write-through invalidation on updates ensures data consistency.
Monitor cache hit rates religiously. A dropping hit rate signals stale TTL configuration or changing access patterns.

API response caching with Redis delivered exceptional ROI: $18/month infrastructure cost generated $407/month in total savings while dramatically improving user experience.

Related Posts:

RDS Query Optimization: Database Performance Analysis
Scheduler Query Optimization: Background Job Efficiency
Thin Lambda Consolidation: Unified Function Architecture

Documentation: 2026-01-25-api-response-caching-plan.md Impact: 99% cache hit rate, 83% faster responses, $407/month saved