API Gateway Cost Reduction: From $4,300 to $50/month
Context
Our AWS bill showed API Gateway costs at $4,300/month, more than Lambda and RDS combined. This wasn't a sudden spike—costs had been climbing steadily as our mobile user base grew. The problem wasn't just the absolute cost but the scaling trajectory: at our growth rate, API Gateway would hit $10,000/month within six months.
We needed to understand why API Gateway costs were so high and find architectural solutions that would scale better.
The Investigation
API Gateway pricing is straightforward: $3.50 per million API calls, plus $0.09/GB for data transfer. Our costs suggested we were making an enormous number of API calls.
Monthly API Call Volume:
API Gateway Bill: $4,300/month
Cost per million calls: $3.50
Calculation:
$4,300 ÷ $3.50 per million ≈ 1,229 million calls
≈ 1.2 billion API calls per month
≈ 40 million calls per day
≈ 1.7 million calls per hour
≈ 28,000 calls per minute
With 1,000 daily active users, this meant each user was generating 40,000 API calls per day—clearly something was wrong.
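The back-of-envelope estimate above can be reproduced in a few lines. This is a minimal sketch that assumes the whole bill is request charges at the flat $3.50 per million rate (ignoring data transfer and any tiered pricing) and a 30-day month; the function name is illustrative. The results differ slightly from the rounded figures above because nothing is rounded along the way.

```python
PRICE_PER_MILLION_CALLS = 3.50  # USD, flat rate quoted above


def calls_from_bill(monthly_bill_usd: float, days_in_month: int = 30) -> dict:
    """Estimate the call volume implied by a monthly bill (request charges only)."""
    per_month = monthly_bill_usd / PRICE_PER_MILLION_CALLS * 1_000_000
    per_day = per_month / days_in_month
    return {
        "per_month": per_month,
        "per_day": per_day,
        "per_hour": per_day / 24,
        "per_minute": per_day / (24 * 60),
    }


estimate = calls_from_bill(4300)
for period, calls in estimate.items():
    print(f"{period}: {calls:,.0f}")

# Spread across 1,000 daily active users
print(f"per user per day: {estimate['per_day'] / 1000:,.0f}")
```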
Root Cause Analysis
We instrumented the mobile app with detailed API call logging and discovered the culprit:
Problematic Request Pattern:
Mobile App Polling Pattern
┌──────────────────────────────────────┐
│ User opens app                       │
│ ├─ Start polling: /api/user/status   │
│ │  └─ Every 1 second                 │
│ ├─ Start polling: /api/content/new   │
│ │  └─ Every 1 second                 │
│ ├─ Start polling: /api/progress      │
│ │  └─ Every 1 second                 │
│ └─ User keeps app open for 30 min    │
│                                      │
│ Result per user session:             │
│  3 endpoints × 1 req/s × 1800s       │
│  = 5,400 API calls per 30-min session│
└──────────────────────────────────────┘
Daily Active User Impact:
1,000 DAU × 5,400 calls/session × 2 sessions/day
= 10.8 million calls/day from polling alone
Monthly: 324 million calls
Cost: $1,134/month just for polling
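The polling estimate can be checked with a short script. This is a sketch under the same assumptions as the figures above (3 endpoints, one request per second, two 30-minute sessions per user per day, a 30-day month, $3.50 per million calls); the helper names are made up for illustration.

```python
PRICE_PER_MILLION_CALLS = 3.50  # USD


def polling_calls_per_session(endpoints: int, interval_s: float, session_s: int) -> int:
    """Calls made in one session when every endpoint is polled at a fixed interval."""
    return int(endpoints * session_s / interval_s)


def monthly_cost(calls_per_day: float, days: int = 30) -> float:
    """Request charges for a month at the flat per-million rate."""
    return calls_per_day * days / 1_000_000 * PRICE_PER_MILLION_CALLS


per_session = polling_calls_per_session(endpoints=3, interval_s=1, session_s=1800)
per_day = 1000 * per_session * 2  # 1,000 DAU, 2 sessions each

print(f"calls per session: {per_session:,}")
print(f"calls per day:     {per_day:,}")
print(f"monthly cost:      ${monthly_cost(per_day):,.2f}")
```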
But the actual costs were $4,300/month, indicating polling was only part of the problem. Further investigation revealed:
Additional Cost Drivers
- Content Prefetching
Problem: App prefetched next 10 lessons on
every screen transition
Impact: 5 transitions × 10 lessons × 1000 users
= 50,000 unnecessary API calls/day
= $175/month
- Analytics Events
Problem: Each user action sent individual
analytics event via API
Impact: 100 events/user/day × 1000 users
= 100,000 calls/day
= $350/month
- Redundant Progress Saves
Problem: App saved progress after every
question (5-10 per lesson)
Impact: 8 questions × 10 lessons × 1000 users
= 80,000 saves/day
= $280/month
- Health Check Pings
Problem: App pinged /health every 30s to
check server status
Impact: 2 pings/min × 1440 min × 1000 users
= 2.88 million pings/day
= $1,008/month
Solution Architecture
We redesigned the communication pattern around three principles:
- Push over Pull - Use WebSocket for real-time updates instead of polling
- Batch over Individual - Aggregate multiple operations into single requests
- Client-Side Intelligence - Cache aggressively and only sync when needed
Architecture Redesign
Before: Polling-Based
Mobile App                 API Gateway                Lambda
┌──────────────────┐       ┌──────────────────┐       ┌──────────────────┐
│ Poll /status     │──────>│ Invoke Lambda    │──────>│ Query DB         │
│ (every 1s)       │       │ (1 req/s)        │       │ (1 query/s)      │
│                  │       │                  │       │                  │
│ Poll /content    │──────>│ Invoke Lambda    │──────>│ Query DB         │
│ (every 1s)       │       │ (1 req/s)        │       │ (1 query/s)      │
│                  │       │                  │       │                  │
│ Poll /progress   │──────>│ Invoke Lambda    │──────>│ Query DB         │
│ (every 1s)       │       │ (1 req/s)        │       │ (1 query/s)      │
└──────────────────┘       └──────────────────┘       └──────────────────┘
Worst-case cost per user per day (app open around the clock):
3 endpoints × 86,400 seconds = 259,200 API calls
Cost: $0.90/user/day
After: WebSocket + Batching
Mobile App                 API Gateway                Lambda
┌──────────────────┐       ┌──────────────────┐       ┌──────────────────┐
│ WebSocket        │◄─────►│ WebSocket API    │◄─────►│ Push updates     │
│ connection       │       │ (persistent)     │       │ (when changes)   │
│                  │       │                  │       │                  │
│ Batch API calls  │──────>│ REST API         │──────>│ Handle batch     │
│ (5 min buffer)   │       │ (1 req/5min)     │       │ (1 query batch)  │
└──────────────────┘       └──────────────────┘       └──────────────────┘
Cost per user per day:
1 WebSocket connection + 12 batch calls = 13 API calls
Cost: $0.00005/user/day
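To make the before/after comparison concrete, here is a small sketch that prices both call counts at the REST rate of $3.50 per million. Counting the WebSocket connection as a single call follows the simplification in the diagram above; connection minutes and messages are billed separately and are not modelled here.

```python
PRICE_PER_CALL = 3.50 / 1_000_000  # USD per REST API call


def daily_request_cost(calls_per_day: int) -> float:
    """Request charges for one user for one day."""
    return calls_per_day * PRICE_PER_CALL


polling_calls = 3 * 86_400  # 3 endpoints, 1 req/s, app open all day
batched_calls = 1 + 12      # 1 WebSocket connect + 12 batch calls

before = daily_request_cost(polling_calls)
after = daily_request_cost(batched_calls)

print(f"before: {polling_calls:,} calls, ${before:.4f}/user/day")
print(f"after:  {batched_calls} calls, ${after:.7f}/user/day")
print(f"calls reduced by a factor of {polling_calls / batched_calls:,.0f}")
```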
Implementation Details
1. WebSocket Migration
We added WebSocket API support to API Gateway for real-time updates:
WebSocket Routes:
$connect → Authenticate user, store connection ID
$disconnect → Clean up connection state
$default → Handle client messages
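A Lambda handler for these three routes might look roughly like the sketch below. It relies on the `routeKey` and `connectionId` fields that API Gateway places in `requestContext` for WebSocket events. Everything else is an assumption for illustration: the in-memory `CONNECTIONS` dict stands in for a shared store (Lambda instances do not share memory, so a real deployment would need a database table or similar), and `authenticate` is a placeholder that simply reads a `userId` query parameter.

```python
from typing import Dict, Optional

# Hypothetical store: user_id -> connection_id. Placeholder for a shared table.
CONNECTIONS: Dict[str, str] = {}


def authenticate(event: dict) -> Optional[str]:
    """Placeholder auth: trust a userId query parameter (assumption, not secure)."""
    params = event.get("queryStringParameters") or {}
    return params.get("userId")


def handler(event: dict, context: object = None) -> dict:
    ctx = event["requestContext"]
    route = ctx["routeKey"]
    connection_id = ctx["connectionId"]

    if route == "$connect":
        user_id = authenticate(event)
        if user_id is None:
            # A non-2xx response is intended to refuse the connection
            return {"statusCode": 401}
        CONNECTIONS[user_id] = connection_id
    elif route == "$disconnect":
        # Remove whichever user owned this connection
        for user_id, conn in list(CONNECTIONS.items()):
            if conn == connection_id:
                del CONNECTIONS[user_id]
    else:
        # $default: client messages would be dispatched here
        pass

    return {"statusCode": 200}
```

The stored connection ID is what a lookup such as `get_connection_id` would return when the server later pushes an update.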
Server-Side Push:
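import json
import time
# When user data changes, push the update over the user's open WebSocket.
# api_gateway_client is assumed to be a boto3 'apigatewaymanagementapi' client;
# get_connection_id looks up the ID stored by the $connect route.
def notify_user_update(user_id, update_type, payload):
    connection_id = get_connection_id(user_id)
    if not connection_id:
        return  # user has no open connection; nothing to push
    try:
        api_gateway_client.post_to_connection(
            ConnectionId=connection_id,
            Data=json.dumps({
                'type': update_type,
                'payload': payload,
                'timestamp': int(time.time())
            }).encode('utf-8')
        )
    except api_gateway_client.exceptions.GoneException:
        pass  # connection already closed; $disconnect cleans up its state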
Client-Side Handler:
// React Native WebSocket client
const ws = new WebSocket(WS_URL);
ws.onmessage = (event) => {
  const update = JSON.parse(event.data);
  switch (update.type) {
    case 'progress_updated':
      updateLocalProgress(update.payload);
      break;
    case 'new_content':
      refreshContentList();
      break;
    case 'status_changed':
      updateUserStatus(update.payload);
      break;
  }
};
Cost Impact:
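- Replaced up to 259,200 polling calls per user per day with 1 WebSocket connection
- WebSocket pricing: $0.25 per million connection minutes (messages are billed separately at $1.00 per million)
- Average session: 30 minutes
- Connection cost: $0.0000075 per session, versus $0.0189 for the 5,400 polling calls it replaces (over 99.9% cheaper)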
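The per-session comparison can be worked through as follows. This sketch covers connection-minute charges only and compares them with the request charges for polling 3 endpoints once per second; per-message WebSocket charges are deliberately left out, so the real saving depends on how many updates are pushed.

```python
CONNECTION_MINUTE_PRICE = 0.25 / 1_000_000  # USD per WebSocket connection minute
REST_CALL_PRICE = 3.50 / 1_000_000          # USD per REST API call


def websocket_session_cost(minutes: float) -> float:
    """Connection-minute charge only; message charges are not included."""
    return minutes * CONNECTION_MINUTE_PRICE


def polling_session_cost(endpoints: int, interval_s: float, minutes: float) -> float:
    """Request charges for polling every endpoint at a fixed interval."""
    calls = endpoints * (minutes * 60) / interval_s
    return calls * REST_CALL_PRICE


ws_cost = websocket_session_cost(30)
poll_cost = polling_session_cost(endpoints=3, interval_s=1, minutes=30)

print(f"WebSocket session: ${ws_cost:.7f}")
print(f"Polling session:   ${poll_cost:.4f}")
print(f"Saving:            {1 - ws_cost / poll_cost:.2%}")
```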
2. Request Batching
For operations that couldn't use WebSocket (analytics, progress saves), we implemented client-side batching:
Batch Queue:
class APIBatchQueue {
  constructor(flushInterval = 300000) { // 5 minutes
    this.queue = [];
    this.flushInterval = flushInterval;
    this.startPeriodicFlush();
  }
  enqueue(endpoint, data) {
    this.queue.push({ endpoint, data, timestamp: Date.now() });
    // Flush if queue is full (100 items)
    if (this.queue.length >= 100) {
      this.flush();
    }
  }
  async flush() {
    if (this.queue.length === 0) return;
    // Take at most 100 items off the front of the queue
    const batch = this.queue.splice(0, 100);
    try {
      await api.post('/api/batch', { operations: batch });
    } catch (error) {
      // Re-queue failed items at the front so they are retried first
      this.queue.unshift(...batch);
    }
  }
  startPeriodicFlush() {
    setInterval(() => this.flush(), this.flushInterval);
  }
}
Server-Side Batch Handler:
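from flask import request, jsonify
# app is the Flask application; db and the record/save helpers are
# project code defined elsewhere.
@app.route('/api/batch', methods=['POST'])
def handle_batch():
    operations = request.json.get('operations', [])
    results = []
    # Process all operations in one transaction so the batch
    # commits or rolls back as a unit
    with db.begin():
        for op in operations:
            endpoint = op['endpoint']
            data = op['data']
            if endpoint == '/api/analytics/event':
                record_analytics_event(data)
            elif endpoint == '/api/progress/save':
                save_user_progress(data)
            elif endpoint == '/api/content/view':
                record_content_view(data)
            else:
                # Unknown endpoint: report it instead of claiming success
                results.append({'success': False, 'error': 'unknown endpoint'})
                continue
            results.append({'success': True})
    return jsonify({'results': results})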
Cost Impact:
- Analytics: 100 events/day → 12 batch calls/day (88% reduction)
- Progress saves: 80 saves/day → 12 batch calls/day (85% reduction)
- Total: 180 calls → 24 calls per user per day
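The batch counts follow from simple arithmetic. As a rough sketch, assuming about 60 minutes of active use per day (two 30-minute sessions, as in the polling estimate) and one timed flush per 5-minute interval, each operation type is sent 12 times a day. The function names are illustrative.

```python
import math


def timed_flushes(active_minutes: float, flush_interval_min: float) -> int:
    """Number of periodic flushes while the app is in use."""
    return math.ceil(active_minutes / flush_interval_min)


def reduction(before: int, after: int) -> float:
    """Fraction of requests eliminated."""
    return 1 - after / before


flushes = timed_flushes(active_minutes=60, flush_interval_min=5)
calls_before = 100 + 80    # analytics events + progress saves per user per day
calls_after = 2 * flushes  # analytics and progress counted separately, as above

print(f"flushes per day: {flushes}")
print(f"calls per user per day: {calls_before} -> {calls_after}")
print(f"overall reduction: {reduction(calls_before, calls_after):.1%}")
```

If both operation types share one queue, as the APIBatchQueue class allows, the combined count could be closer to 12 than 24.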
3. Client-Side Caching
We implemented aggressive caching for static and semi-static content:
Cache Strategy:
class ContentCache {
  constructor() {
    this.cache = new Map();
    this.ttl = new Map(); // key -> expiry timestamp (ms)
  }
  async get(key, fetchFn, ttl = 3600000) { // 1 hour default
    // Return the cached value if present and not expired
    if (this.cache.has(key)) {
      const expiry = this.ttl.get(key);
      if (Date.now() < expiry) {
        return this.cache.get(key);
      }
    }
    // Fetch and cache
    const data = await fetchFn();
    this.cache.set(key, data);
    this.ttl.set(key, Date.now() + ttl);
    return data;
  }
  // Drop an entry so the next get() refetches it
  invalidate(key) {
    this.cache.delete(key);
    this.ttl.delete(key);
  }
}
// Usage
const cache = new ContentCache();
const content = await cache.get(
  `lesson_${lessonId}`,
  () => api.get(`/api/lessons/${lessonId}`),
  3600000 // Cache for 1 hour
);
Cache Invalidation: WebSocket messages trigger cache invalidation:
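// addEventListener adds this handler alongside the main ws.onmessage
// handler; assigning ws.onmessage again would replace it
ws.addEventListener('message', (event) => {
  const update = JSON.parse(event.data);
  if (update.type === 'content_updated') {
    cache.invalidate(`lesson_${update.lessonId}`);
  }
});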
Cost Impact:
- Lesson fetches: 50 calls/day → 5 calls/day (90% reduction)
- User profile: 20 calls/day → 2 calls/day (90% reduction)
- Subject tree: 30 calls/day → 1 call/day (97% reduction)
4. Eliminate Health Checks
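The mobile app was pinging /health every 30 seconds to detect server outages. This was wasteful: every ping is a billed API call, and API Gateway already reports error rates and latency to CloudWatch, so server health does not need to be probed from each client.
Solution: Remove health checks entirely and handle failures when real API calls fail: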
// Before: Proactive health checks
setInterval(() => {
  api.get('/health').catch(() => showOfflineMessage());
}, 30000);
// After: Reactive error handling
api.interceptors.response.use(
  response => response,
  error => {
    // No response at all means the request never reached the server
    // (network failure); a 5xx status means the server itself failed
    if (!error.response || error.response.status >= 500) {
      showOfflineMessage();
    }
    return Promise.reject(error);
  }
);
Cost Impact:
- Health checks: 2,880 calls/day → 0 calls/day (100% reduction)
- Cost savings: $1,008/month
Results
Cost Reduction:
API Gateway Costs (Before → After)
┌──────────────────────────────────────┐
│ Before:        $4,300/month          │
│ After:         $50/month             │
│ ────────────────────────────────     │
│ Savings:       $4,250/month          │
│ Reduction:     99%                   │
│ Annual Impact: $51,000/year          │
└──────────────────────────────────────┘
API Call Volume:
Monthly API Calls (Before → After)
┌──────────────────────────────────────┐
│ Before:    1.2 billion calls/month   │
│ After:     14 million calls/month    │
│ ────────────────────────────────     │
│ Reduction: 98.8%                     │
└──────────────────────────────────────┘
Breakdown of Savings:
- Polling elimination (WebSocket): $3,000/month
- Health check removal: $1,008/month
- Request batching: $150/month
- Client-side caching: $92/month
User Experience Impact:
Latency & Responsiveness (Before → After)
┌──────────────────────────────────────┐
│ Update latency:  1s → 50ms           │
│ Battery drain:   High → Low          │
│ Offline support: None → Full         │
│ Data usage:      120MB → 15MB/day    │
└──────────────────────────────────────┘
Surprisingly, the new architecture improved user experience:
- Lower latency: Push updates arrive in 50ms vs 1s polling delay
- Better battery life: No continuous polling in background
- Offline resilience: Batching allows app to queue operations when offline
- Reduced data usage: 87% reduction in mobile data consumption
Lessons Learned
1. Polling is Almost Always Wrong
Unless you have a specific real-time requirement that WebSocket can't handle, polling is wasteful. Even long-polling is more efficient than short-interval polling.
2. Batch Everything
Batching reduced 180 API calls to 24 while improving atomicity (operations succeed or fail together).
3. Push State Changes, Don't Poll for Them
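WebSocket connections cost $0.25 per million connection minutes. Polling costs $3.50 per million requests. On unit price alone that makes WebSocket 14× cheaper, and the real gap is wider: a client polling once per second makes 60 requests in every minute that a connection would be billed once.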
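How much cheaper a connection is depends on the polling rate, because requests are billed per call while connections are billed per minute. The sketch below compares one minute of polling with one connection minute, using only the two unit prices quoted above; message charges on the WebSocket side are ignored.

```python
REST_PRICE_PER_MILLION = 3.50    # USD per million REST requests
WS_PRICE_PER_MILLION_MIN = 0.25  # USD per million connection minutes


def cost_ratio(requests_per_minute: float) -> float:
    """Cost of one minute of polling divided by the cost of one connection minute."""
    return requests_per_minute * REST_PRICE_PER_MILLION / WS_PRICE_PER_MILLION_MIN


for label, rpm in [("1 request/min", 1), ("1 endpoint at 1 req/s", 60), ("3 endpoints at 1 req/s", 180)]:
    print(f"{label}: polling costs {cost_ratio(rpm):,.0f}x a connection minute")
```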
4. Client-Side Caching Has Massive Impact
90% of our content fetches were redundant. Aggressive caching with smart invalidation eliminated unnecessary round-trips.
5. Monitor Third-Party API Usage
Our analytics SDK was sending one API call per event. Switching to batch mode (100 events per call) reduced analytics costs by 99%.
Rollout Strategy
We couldn't migrate all users overnight. Our phased rollout:
Week 1: Beta Users (5% traffic)
- Deploy WebSocket support
- Enable batching for beta users only
- Monitor error rates and WebSocket connection stability
Week 2: Gradual Rollout (25% traffic)
- Increase rollout to 25% of users
- Monitor API Gateway costs (should drop 25%)
- Validate battery and data usage improvements
Week 3: Full Rollout (100% traffic)
- Deploy to all users
- Monitor for 7 days
- Measure final cost savings
Week 4: Cleanup
- Remove polling code from mobile app
- Delete unused REST endpoints
- Update documentation
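One common way to gate a percentage rollout like this is to hash each user ID into a stable bucket. The sketch below is an assumption about how such gating could work, not a description of the mechanism we used; `rollout_bucket`, `in_rollout`, and `expected_monthly_cost` are illustrative names, and the cost projection assumes spend scales linearly with the share of migrated users.

```python
import hashlib


def rollout_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_rollout(user_id: str, percent: int) -> bool:
    """True if the user falls inside the current rollout percentage."""
    return rollout_bucket(user_id) < percent


def expected_monthly_cost(percent: int, before: float = 4300.0, after: float = 50.0) -> float:
    """Projected bill when `percent` of users are on the new architecture."""
    share = percent / 100
    return before * (1 - share) + after * share


for pct in (5, 25, 100):
    print(f"{pct:>3}% rollout: projected ${expected_monthly_cost(pct):,.2f}/month")
```

Because the bucket depends only on the user ID, a user admitted at 5% stays in the rollout at 25% and 100%, which keeps each user's experience stable as the percentage rises.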
Alternative Solutions Considered
1. GraphQL
Pros: Single endpoint, client-driven queries, batching built-in
Cons: Learning curve, requires server refactor, doesn't solve polling problem
We decided GraphQL wouldn't address the root cause (polling) and would add complexity.
2. Server-Sent Events (SSE)
Pros: Simpler than WebSocket, built into HTTP
Cons: One-way only (server → client), poor mobile support
WebSocket's bidirectional communication and better mobile support made it the winner.
3. Provisioned Concurrency
Pros: Keeps Lambda warm, reduces latency
Cons: Doesn't reduce API Gateway costs, adds $200/month
This would have made the cost problem worse.
Conclusion
We reduced API Gateway costs by 99% by replacing polling with WebSocket push notifications and batching API calls. The optimization required mobile app changes but delivered $51,000/year in savings and improved user experience.
Key Takeaways:
- Polling is expensive—use WebSocket for real-time updates
- Batch API calls to reduce request count by 90%+
- Client-side caching eliminates redundant fetches
- Monitor API call patterns to identify waste
- User experience and cost savings can align
Final Metrics:
- Cost reduction: $4,250/month ($51,000/year)
- API calls: 1.2B → 14M/month (98.8% reduction)
- User experience: Improved across all metrics
- Engineering effort: 40 hours over 4 weeks
Related Plan: docs/plans/implemented/high/2026-01-16-cost-savings-apigateway-plan.md
Related Posts:
- Cost Post 8.2 (Lambda Cost Investigation)
- Performance Post 5.3 (Batch API Calls)