Rate Limiting: Protecting APIs from Abuse and DoS Attacks
Unlimited API access invites abuse. A single malicious actor can overwhelm your infrastructure, degrade service for legitimate users, and rack up cloud costs. We implemented per-IP rate limiting with Redis to protect our APIs from abuse while maintaining performance for normal usage patterns.
The Problem
Our API endpoints had no request throttling. Any client could send unlimited requests per minute, creating three critical vulnerabilities:
- Denial of Service (DoS) - A single client could overwhelm the database with 10,000+ requests/minute
- Credential Stuffing - Attackers could brute-force authentication endpoints without limitation
- Cost Exposure - Malicious traffic consumed Lambda invocations and RDS connections, increasing AWS bills
Real incident that triggered this work:
Date: 2026-01-15
Event: Automated bot attempted 15,000 login requests in 3 minutes
Impact: Database connection pool exhausted, 500 errors for legitimate users
Duration: 12 minutes until manual IP block via CloudFront
Cost: $47 in excess Lambda invocations
This incident proved we needed automated protection.
Before: Unlimited API Access
API Request Flow (Vulnerable)
┌──────────────────────────────────────┐
│ Client (Malicious or Misconfigured)  │
│                                      │
│ Sends 10,000 requests/minute         │
│          │                           │
│          v                           │
│ ┌──────────────────┐                 │
│ │ API Gateway      │                 │
│ │ - No throttling  │                 │
│ │ - All pass thru  │                 │
│ └────────┬─────────┘                 │
│          │                           │
│          v                           │
│ ┌──────────────────┐                 │
│ │ Lambda Function  │                 │
│ │ - Processes all  │                 │
│ │ - No filtering   │                 │
│ └────────┬─────────┘                 │
│          │                           │
│          v                           │
│ ┌──────────────────┐                 │
│ │ RDS Database     │                 │
│ │ - Overwhelmed    │                 │
│ │ - Conn exhausted │                 │
│ │ - Query timeouts │                 │
│ └──────────────────┘                 │
│                                      │
│ Result: Service degradation          │
│         for ALL users                │
└──────────────────────────────────────┘
Consequences:
- Any client could consume unlimited resources
- No protection against brute-force attacks
- Legitimate users affected by malicious traffic
- Unpredictable AWS costs from abuse
After: Per-IP Rate Limiting
API Request Flow (Protected)
┌──────────────────────────────────────┐
│ Client (Any)                         │
│                                      │
│ Sends requests                       │
│          │                           │
│          v                           │
│ ┌──────────────────┐                 │
│ │ API Gateway      │                 │
│ │ - Extracts IP    │                 │
│ └────────┬─────────┘                 │
│          │                           │
│          v                           │
│ ┌──────────────────┐                 │
│ │ Rate Limiter     │                 │
│ │ Middleware       │                 │
│ │                  │                 │
│ │ Check Redis:     │                 │
│ │ IP:1.2.3.4       │                 │
│ │ Count: 95/100    │                 │
│ │                  │                 │
│ │ ├─ < limit? PASS │                 │
│ │ └─ ≥ limit? BLOCK│                 │
│ │    (HTTP 429)    │                 │
│ └────────┬─────────┘                 │
│          │ (passed)                  │
│          v                           │
│ ┌──────────────────┐                 │
│ │ Lambda Function  │                 │
│ │ - Only legit     │                 │
│ │   requests       │                 │
│ └────────┬─────────┘                 │
│          │                           │
│          v                           │
│ ┌──────────────────┐                 │
│ │ RDS Database     │                 │
│ │ - Normal load    │                 │
│ │ - Fast queries   │                 │
│ └──────────────────┘                 │
│                                      │
│ Result: Protected infrastructure     │
│         Fair resource allocation     │
└──────────────────────────────────────┘
Protection:
- First 100 requests/minute: Processed ✓
- Requests 101+: Blocked with HTTP 429 ✗
- Legitimate users: Unaffected
- Malicious actors: Neutralized
Implementation Details
Phase 1: Rate Limiting Strategy
We evaluated three approaches:
Option 1: API Gateway Throttling
- Built-in AWS feature
- Simple to configure
- Limitation: Global limits only, not per-IP
Option 2: Application-Level Token Bucket
- In-memory rate limiting
- Fast performance
- Limitation: Doesn't persist across Lambda cold starts
Option 3: Redis-Backed Per-IP Limiting ✓ Selected
- Persistent state across requests
- Per-IP granularity
- Minimal latency (<5ms per check)
We chose Redis-backed limiting for precision and persistence.
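For contrast, the Option 2 approach can be sketched as a minimal in-memory token bucket. Its state lives in the process, which is exactly why it resets on every Lambda cold start:

```python
# Minimal in-memory token bucket (Option 2), shown for contrast only.
# All state is in the Python process, so a cold start wipes the counters.
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # max tokens (burst size)
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self):
        """Consume one token if available; True means the request may pass."""
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, refill_rate=1.0)  # 5-burst, 1 req/s sustained
results = [bucket.allow() for _ in range(7)]       # 7 back-to-back requests
# results -> first 5 allowed, the rest rejected until tokens refill
```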
Phase 2: Architecture Design
Rate Limiting Middleware:
# src/middleware/rate_limiter.py
import os

import redis
from flask import request, jsonify
from functools import wraps

# Redis connection (host comes from the environment in our deployments)
REDIS_HOST = os.environ.get('REDIS_HOST', 'localhost')

redis_client = redis.StrictRedis(
    host=REDIS_HOST,
    port=6379,
    db=0,
    decode_responses=True
)

# Rate limit configurations
RATE_LIMITS = {
    'default': {'requests': 100, 'window': 60},  # 100 req/min
    'auth':    {'requests': 10,  'window': 60},  # 10 req/min
    'content': {'requests': 200, 'window': 60},  # 200 req/min
}

def rate_limit(limit_type='default'):
    """
    Decorator for rate limiting API endpoints.

    Args:
        limit_type: Rate limit configuration to use
    """
    def decorator(f):
        @wraps(f)
        def wrapped(*args, **kwargs):
            # Get client IP (handles proxies)
            client_ip = request.headers.get('X-Forwarded-For', request.remote_addr)
            if client_ip and ',' in client_ip:
                client_ip = client_ip.split(',')[0].strip()

            # Get rate limit config
            config = RATE_LIMITS.get(limit_type, RATE_LIMITS['default'])
            max_requests = config['requests']
            window_seconds = config['window']

            # Redis key for this IP + endpoint
            redis_key = f"rate_limit:{limit_type}:{client_ip}"

            # INCR is atomic and creates the key at 1 if it is missing,
            # avoiding the read-then-write race of a GET/SETEX pair
            current_count = redis_client.incr(redis_key)
            if current_count == 1:
                # First request in window: start the expiry clock
                redis_client.expire(redis_key, window_seconds)

            if current_count > max_requests:
                # Rate limit exceeded
                return jsonify({
                    'error': 'Rate limit exceeded',
                    'limit': max_requests,
                    'window': f'{window_seconds}s',
                    'retry_after': redis_client.ttl(redis_key)
                }), 429

            # Process request
            return f(*args, **kwargs)
        return wrapped
    return decorator
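The fixed-window counting above can be exercised without a live Redis instance using a tiny stand-in store. FakeWindowStore is a hypothetical helper for illustration, not part of the middleware:

```python
# Minimal stand-in for the Redis fixed-window counter: a dict mapping
# key -> (count, window expiry). Illustrates the counting semantics only;
# unlike Redis, it is not shared across processes.
import time

class FakeWindowStore:
    def __init__(self):
        self.data = {}

    def hit(self, key, window_seconds):
        """Record one request and return its count within the current window."""
        now = time.monotonic()
        count, expiry = self.data.get(key, (0, now + window_seconds))
        if now >= expiry:
            # Window elapsed: start a fresh one
            count, expiry = 0, now + window_seconds
        count += 1
        self.data[key] = (count, expiry)
        return count

MAX_REQUESTS = 3
store = FakeWindowStore()
decisions = []
for _ in range(5):
    count = store.hit("rate_limit:auth:1.2.3.4", window_seconds=60)
    decisions.append("PASS" if count <= MAX_REQUESTS else "429")
# decisions -> ['PASS', 'PASS', 'PASS', '429', '429']
```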
Endpoint Integration:
# src/resources/auth/login.py
from middleware.rate_limiter import rate_limit

@app.route('/auth/login', methods=['POST'])
@rate_limit('auth')  # 10 requests/min
def login():
    email = request.json.get('email')
    password = request.json.get('password')
    # ... authentication logic
Response Headers: We added rate limit information to response headers for client transparency:
def add_rate_limit_headers(response, limit_info):
    """Add rate limit headers to response."""
    response.headers['X-RateLimit-Limit'] = str(limit_info['limit'])
    response.headers['X-RateLimit-Remaining'] = str(limit_info['remaining'])
    response.headers['X-RateLimit-Reset'] = str(limit_info['reset_time'])
    return response
Example response:
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1642534920
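The header values in that example can be derived from the middleware's counter state. The sketch below assumes a hypothetical build_limit_info helper fed with the count and TTL read from the same Redis key:

```python
# Hypothetical helper: turn counter state into X-RateLimit-* header values.
# current_count and ttl would come from the middleware's Redis key.
import time

def build_limit_info(limit, current_count, ttl, now=None):
    """Translate counter state into the X-RateLimit-* header values."""
    now = int(now if now is not None else time.time())
    return {
        'limit': limit,
        'remaining': max(0, limit - current_count),
        'reset_time': now + max(0, ttl),  # epoch second when the window resets
    }

# Reproduces the example response above: 53 of 100 requests used,
# 30 seconds left in the window as of epoch 1642534890
info = build_limit_info(limit=100, current_count=53, ttl=30, now=1642534890)
# info -> {'limit': 100, 'remaining': 47, 'reset_time': 1642534920}
```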
Phase 3: Redis Configuration
Infrastructure Setup:
# ElastiCache Redis configuration
redis:
  instance_type: cache.t3.micro    # $13/month
  engine_version: 7.0
  parameter_group:
    maxmemory-policy: allkeys-lru  # Evict old keys when full
    timeout: 300                   # Close idle connections
Cost Analysis:
- Redis instance: $13/month
- Data transfer: <$1/month
- Total cost: $14/month
- Value: Prevents $500+ in abuse-related costs
Phase 4: Testing & Validation
Load Testing:
# Test rate limiting under load (ab sends GETs by default; -p/-T make it
# POST JSON; login.json is a sample credentials payload file)
ab -n 150 -c 10 -p login.json -T application/json https://api.example.com/auth/login
Results:
Total requests: 150
Successful (200): 10
Rate limited (429): 140
Average response time: 45ms
Rate limiter overhead: <5ms
Edge Case Testing:
- Distributed attacks - Multiple IPs from same attacker
- Shared IPs - Corporate NAT/proxy scenarios
- Legitimate bursts - Mobile app reconnection spikes
- Clock skew - Redis TTL accuracy
We adjusted limits based on real traffic patterns:
- Auth endpoints: 10 req/min (prevents brute force)
- Content endpoints: 200 req/min (supports normal browsing)
- Default endpoints: 100 req/min (balanced protection)
Results
Security Improvements
Brute-Force Protection: Before rate limiting, an attacker could attempt 10,000 passwords in 10 minutes. After rate limiting:
- 10 login attempts per minute maximum
- 600 attempts per hour (vs. unlimited)
- Brute-force attacks become impractical
For a 6-digit PIN (1 million combinations):
- Without rate limiting (at the 10,000 attempts/min attack pace): ~1.7 hours to exhaust every combination
- With rate limiting (10 attempts/min): ~69 days
- Effectiveness: Attack becomes infeasible
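As a quick check of the arithmetic, using the simulated attack pace and the auth endpoint limit:

```python
# Back-of-the-envelope brute-force math for a 6-digit PIN
combinations = 10 ** 6          # 000000-999999

unthrottled_rate = 10_000       # attempts/min, the simulated attack pace
throttled_rate = 10             # attempts/min, the auth endpoint limit

hours_unthrottled = combinations / unthrottled_rate / 60
days_throttled = combinations / throttled_rate / 60 / 24
# hours_unthrottled -> ~1.7 hours; days_throttled -> ~69 days
```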
DoS Protection: Tested with simulated attack:
Attack Pattern: 10,000 requests/minute from single IP
Protection:
- First 100 requests: Processed (1 minute)
- Remaining 9,900: Blocked immediately
- Database impact: Zero (requests never reach DB)
- Legitimate users: Unaffected
Cost Optimization
AWS Cost Reduction:
Before Rate Limiting:
- Malicious traffic: 500,000 requests/day
- Lambda invocations: 500,000/day
- Lambda cost: $100/day = $3,000/month
- RDS connections: Frequently exhausted
- RDS cost: $800/month (over-provisioned to handle abuse)
After Rate Limiting:
- Malicious traffic: Blocked at middleware
- Lambda invocations: 50,000/day (legitimate only)
- Lambda cost: $10/day = $300/month
- RDS connections: Normal utilization
- RDS cost: $400/month (right-sized)
- Redis cost: $14/month
Savings: $3,086/month (an ~81% reduction in abuse-related costs: $3,800 before vs. $714 after)
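Summing the monthly line items above gives the savings figure:

```python
# Monthly cost check using the line items above
before = 3_000 + 800        # Lambda + RDS before rate limiting
after = 300 + 400 + 14      # Lambda + RDS + Redis after
savings = before - after
reduction = savings / before
# savings -> 3086; reduction -> ~0.81
```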
Operational Metrics
30-Day Post-Implementation:
- Total requests: 15.2 million
- Rate-limited requests: 342,000 (2.25%)
- False positives: 0 (no legitimate users blocked)
- Blocked attacks: 47 distinct attack attempts
- Largest blocked attack: 125,000 requests from single IP
- Average rate limiter latency: 4.2ms
Response Time Impact:
Endpoint: /api/content/lessons
Before rate limiting: 145ms average
After rate limiting: 149ms average
Overhead: +4ms (2.7% increase)
Minimal performance impact for significant security benefit.
Incident Prevention
Prevented Incidents (30 days):
- Credential stuffing attack - 15,000 login attempts blocked
- API scraping bot - 50,000 content requests blocked
- Misconfigured mobile client - 8,000 polling requests blocked
- Competitor reconnaissance - 3,000 enumeration requests blocked
Each incident would have caused service degradation without rate limiting.
Lessons Learned
What Worked
- Per-Endpoint Limits - Different endpoints need different thresholds (auth vs. content)
- Redis Persistence - Stateful rate limiting survives Lambda cold starts
- Response Headers - X-RateLimit-* headers help developers debug client issues
- Gradual Rollout - Started with high limits, tuned based on real traffic
What Didn't Work
- Initial Limits Too Aggressive - First deployment set auth limit to 5/min, blocked legitimate password resets
- IP Extraction Logic - Early version didn't handle X-Forwarded-For properly, blocked entire corporate offices
- No Allowlist - Internal monitoring tools got rate-limited, required IP allowlist
Adjustments Made
IP Allowlist for Internal Tools:
import ipaddress

INTERNAL_NETWORKS = [
    ipaddress.ip_network('10.0.0.0/8'),    # Internal network
    ipaddress.ip_network('52.1.2.3/32'),   # CI/CD server
    ipaddress.ip_network('54.5.6.7/32'),   # Monitoring service
]

def is_internal_ip(ip):
    """Check if IP is in the allowlist (real network membership,
    not string-prefix matching, which breaks on CIDR ranges)."""
    return any(ipaddress.ip_address(ip) in net for net in INTERNAL_NETWORKS)
Dynamic Limit Adjustment:
# Increase limits for authenticated users
if user_authenticated:
    max_requests *= 2  # 200 req/min for logged-in users
Better Error Messages:
{
  "error": "Rate limit exceeded",
  "message": "You have made too many requests. Please wait 45 seconds.",
  "limit": 100,
  "window": "60s",
  "retry_after": 45,
  "documentation": "https://docs.example.com/rate-limiting"
}
Key Takeaways
Rate limiting is essential for production APIs. Our implementation blocks 2.25% of requests (342,000 in 30 days), preventing service degradation and reducing costs by $3,086/month.
Critical implementation factors:
- Per-IP granularity - Prevents single attacker from affecting all users
- Redis persistence - State survives across Lambda invocations
- Endpoint-specific limits - Auth endpoints need stricter limits than content
- Transparent responses - Clear error messages help developers fix clients
Recommended approach:
- Start with conservative limits (high thresholds)
- Monitor rate limit metrics for 1 week
- Adjust limits based on 99th percentile legitimate usage
- Add allowlist for internal tools
- Implement graduated limits (higher for authenticated users)
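The graduated-limit step can be sketched as a small tier lookup (effective_limit is a hypothetical helper, not part of our middleware):

```python
# Per-tier request ceilings (requests/minute); None means unlimited.
# Unknown tiers fall back to the anonymous limit.
BASE_LIMITS = {'anonymous': 100, 'authenticated': 200, 'internal': None}

def effective_limit(tier):
    """Return the requests/minute ceiling for a caller tier."""
    return BASE_LIMITS.get(tier, BASE_LIMITS['anonymous'])
```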
Redis vs. In-Memory Tradeoffs:
- Redis adds 4ms latency per request
- In-memory has no latency but loses state on cold starts
- For serverless architectures: Redis is worth the small overhead
Rate limiting transforms security from reactive (responding to incidents) to proactive (preventing incidents). The 4ms overhead per request prevents 12-minute outages and roughly $3,000/month in abuse costs.
Implementation time: 3 days (middleware + testing + deployment)
Cost: $14/month (Redis)
ROI: $3,086/month savings + prevented outages
Production APIs without rate limiting are vulnerable to abuse. Implement rate limiting before you need it—attacks happen without warning.