alqosh

API Gateway Cost Reduction: From $4,300 to $50/month

February 1, 2026·cost-optimization

Context

Our AWS bill showed API Gateway costs at $4,300/month, more than Lambda and RDS combined. This wasn't a sudden spike—costs had been climbing steadily as our mobile user base grew. The problem wasn't just the absolute cost but the scaling trajectory: at our growth rate, API Gateway would hit $10,000/month within six months.

We needed to understand why API Gateway costs were so high and find architectural solutions that would scale better.

The Investigation

API Gateway pricing is straightforward: $3.50 per million API calls, plus $0.09/GB for data transfer. Our costs suggested we were making an enormous number of API calls.

Monthly API Call Volume:

API Gateway Bill: $4,300/month
Cost per million calls: $3.50

Calculation (assuming a 30-day month and ignoring data transfer):
$4,300 ÷ $3.50 per million ≈ 1,229 million calls
≈ 1.2 billion API calls per month
≈ 41 million calls per day
≈ 1.7 million calls per hour
≈ 28,000 calls per minute

With 1,000 daily active users, this meant each user was generating roughly 40,000 API calls per day. Clearly something was wrong.
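The same back-of-the-envelope estimate can be reproduced in a few lines. This is only a sketch: it assumes the whole bill is request charges at $3.50 per million and a 30-day month, and the function name is illustrative.

```python
def calls_from_bill(monthly_bill_usd, price_per_million_usd=3.50, days=30):
    """Estimate API call volume from a monthly bill, ignoring data transfer."""
    monthly_calls = monthly_bill_usd / price_per_million_usd * 1_000_000
    daily_calls = monthly_calls / days
    return monthly_calls, daily_calls


monthly, daily = calls_from_bill(4300)
per_user = daily / 1000  # 1,000 daily active users

print(f"{monthly / 1e9:.2f} billion calls/month")
print(f"{daily / 1e6:.1f} million calls/day")
print(f"{per_user:,.0f} calls per user per day")
```

Because data transfer is ignored, the result is an upper bound on request volume, which is enough to show that per-user traffic was orders of magnitude too high.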

Root Cause Analysis

We instrumented the mobile app with detailed API call logging and discovered the culprit:

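Problematic Request Pattern: while the app was open, it polled three endpoints (/api/user/status, /api/content/new, /api/progress) once per second, which comes to 3 × 1,800 = 5,400 calls over a typical 30-minute session.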

Daily Active User Impact:

1,000 DAU × 5,400 calls/session × 2 sessions/day
= 10.8 million calls/day from polling alone

Monthly: 324 million calls
Cost: $1,134/month just for polling

But the actual costs were $4,300/month, indicating polling was only part of the problem. Further investigation revealed:

Additional Cost Drivers

Content Prefetching

Problem: App prefetched the next 10 lessons on every screen transition

Impact: 5 transitions × 10 lessons × 1,000 users
      = 50,000 unnecessary API calls/day
      = 1.5 million calls/month ≈ $5.25/month

Analytics Events

Problem: Each user action sent an individual analytics event via the API

Impact: 100 events/user/day × 1,000 users
      = 100,000 calls/day
      = 3 million calls/month ≈ $10.50/month

Redundant Progress Saves

Problem: App saved progress after every question (5-10 per lesson)

Impact: 8 questions × 10 lessons × 1,000 users
      = 80,000 saves/day
      = 2.4 million calls/month ≈ $8.40/month

Health Check Pings

Problem: App pinged /health every 30s to check server status

Impact: 2 pings/min × 1,440 min × 1,000 users
      = 2.88 million pings/day
      = 86.4 million calls/month ≈ $302/month
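Putting the five request patterns side by side shows where the traffic comes from. The sketch below uses only the per-user figures quoted in this section; the dictionary keys are illustrative labels, not names from the app.

```python
DAU = 1000

# Calls per user per day for each pattern described in this section.
calls_per_user_per_day = {
    "polling": 5400 * 2,        # 5,400 calls/session x 2 sessions
    "health_checks": 2 * 1440,  # 2 pings/min around the clock
    "analytics": 100,           # one call per event
    "progress_saves": 8 * 10,   # 8 questions x 10 lessons
    "prefetch": 5 * 10,         # 5 transitions x 10 lessons
}

total = sum(calls_per_user_per_day.values())
ranked = sorted(calls_per_user_per_day.items(), key=lambda kv: -kv[1])
for name, calls in ranked:
    share = calls / total
    print(f"{name:15s} {calls * DAU:>12,} calls/day  {share:6.1%}")
```

Under these figures, polling and health checks together account for over 98% of the modeled calls, which is consistent with the redesign targeting them first.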

Mobile App Polling Pattern
┌──────────────────────────────────────┐
│ User opens app                       │
│ ├─ Start polling: /api/user/status  │
│ │   └─ Every 1 second                │
│ ├─ Start polling: /api/content/new  │
│ │   └─ Every 1 second                │
│ ├─ Start polling: /api/progress     │
│ │   └─ Every 1 second                │
│ └─ User keeps app open for 30 min   │
│                                      │
│ Result per user session:             │
│ 3 endpoints × 1 req/s × 1800s        │
│ = 5,400 API calls per 30-min session│
└──────────────────────────────────────┘

Solution Architecture

We redesigned the communication pattern around three principles:

Push over Pull - Use WebSocket for real-time updates instead of polling
Batch over Individual - Aggregate multiple operations into single requests
Client-Side Intelligence - Cache aggressively and only sync when needed

Architecture Redesign

Before: Polling-Based

Calls per user per day (worst case, app open for 24 hours): 3 endpoints × 86,400 seconds = 259,200 API calls
Cost: $0.90/user/day

After: WebSocket + Batching

Calls per user per day: 1 WebSocket connection + 12 batch calls = 13 API calls
Cost: $0.00005/user/day

Implementation Details

1. WebSocket Migration

We added WebSocket API support to API Gateway for real-time updates:

WebSocket Routes:

$connect    → Authenticate user, store connection ID
$disconnect → Clean up connection state
$default    → Handle client messages
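As a rough illustration of how those three routes might be handled, here is a minimal Lambda-style sketch. It relies on the `routeKey` and `connectionId` fields that API Gateway places in `requestContext` for WebSocket events. Everything else is an assumption for the example: the in-memory `CONNECTIONS` dict stands in for a persistent store, and `authenticate` is a hypothetical placeholder (a real app would verify a signed token, not trust a query parameter).

```python
# In-memory stand-in for a persistent connection store (assumption).
CONNECTIONS = {}


def authenticate(event):
    """Hypothetical auth check: reads a user ID from the query string."""
    params = event.get("queryStringParameters") or {}
    return params.get("user_id")


def handler(event, context=None):
    """Dispatch on the WebSocket route key that API Gateway passes in."""
    ctx = event["requestContext"]
    route, connection_id = ctx["routeKey"], ctx["connectionId"]

    if route == "$connect":
        user_id = authenticate(event)
        if user_id is None:
            # A non-2xx response to $connect refuses the connection.
            return {"statusCode": 401}
        CONNECTIONS[user_id] = connection_id
    elif route == "$disconnect":
        # Drop whichever user owned this connection.
        for user_id, conn in list(CONNECTIONS.items()):
            if conn == connection_id:
                del CONNECTIONS[user_id]
    else:
        pass  # "$default": client messages would be handled here

    return {"statusCode": 200}
```

Keying the store by user ID matches the lookup direction the push code needs: given a user, find the connection to send to.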

Server-Side Push:

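import json
import time

# api_gateway_client is assumed to be a boto3 'apigatewaymanagementapi' client
# created with the WebSocket stage's endpoint_url.

# When user data changes
def notify_user_update(user_id, update_type, payload):
    connection_id = get_connection_id(user_id)
    if connection_id:
        try:
            api_gateway_client.post_to_connection(
                ConnectionId=connection_id,
                Data=json.dumps({
                    'type': update_type,
                    'payload': payload,
                    'timestamp': int(time.time())
                }).encode('utf-8')
            )
        except api_gateway_client.exceptions.GoneException:
            pass  # client already disconnected; nothing to deliver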

Client-Side Handler:

// React Native WebSocket client
const ws = new WebSocket(WS_URL);

ws.onmessage = (event) => {
  const update = JSON.parse(event.data);
  switch (update.type) {
    case 'progress_updated':
      updateLocalProgress(update.payload);
      break;
    case 'new_content':
      refreshContentList();
      break;
    case 'status_changed':
      updateUserStatus(update.payload);
      break;
  }
};

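Cost Impact:

Eliminated up to 259,200 polling calls per user per day → 1 WebSocket connection
WebSocket pricing: $0.25 per million connection minutes (messages are billed separately)
Average session: 30 minutes
Cost: $0.0000075 per session in connection charges, versus $0.0189 for the 5,400 polling calls it replaces (over 99.9% cheaper)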
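To put the two pricing models on the same footing, the sketch below prices one 30-minute session both ways. The $3.50 and $0.25 rates come from this post; the per-message rate of $1.00 per million is an assumption based on AWS list pricing and should be checked for your region. Function names are illustrative.

```python
def polling_cost(endpoints, seconds, price_per_million=3.50):
    """Cost of polling each endpoint once per second for `seconds`."""
    return endpoints * seconds * price_per_million / 1_000_000


def websocket_cost(minutes, messages=0, minute_price=0.25, message_price=1.00):
    """Connection-minute charge plus an assumed per-message charge."""
    return (minutes * minute_price + messages * message_price) / 1_000_000


poll = polling_cost(endpoints=3, seconds=1800)
ws = websocket_cost(minutes=30)

print(f"polling:   ${poll:.4f} per session")
print(f"websocket: ${ws:.7f} per session")
```

Passing a non-zero `messages` count shows how much push traffic a session can carry before message charges start to matter.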

2. Request Batching

For operations that couldn't use WebSocket (analytics, progress saves), we implemented client-side batching:

Batch Queue:

class APIBatchQueue {
  constructor(flushInterval = 300000) { // 5 minutes
    this.queue = [];
    this.flushInterval = flushInterval;
    this.startPeriodicFlush();
  }

  enqueue(endpoint, data) {
    this.queue.push({ endpoint, data, timestamp: Date.now() });

    // Flush if queue is full (100 items)
    if (this.queue.length >= 100) {
      this.flush();
    }
  }

  async flush() {
    if (this.queue.length === 0) return;

    const batch = this.queue.splice(0, 100);
    try {
      await api.post('/api/batch', { operations: batch });
    } catch (error) {
      // Re-queue failed items
      this.queue.unshift(...batch);
    }
  }

  startPeriodicFlush() {
    setInterval(() => this.flush(), this.flushInterval);
  }
}

Server-Side Batch Handler:

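from flask import jsonify, request

@app.route('/api/batch', methods=['POST'])
def handle_batch():
    # Tolerate a missing or non-JSON body instead of raising
    body = request.get_json(silent=True) or {}
    operations = body.get('operations', [])
    results = []

    # Process all operations in one transaction so they commit or roll back together
    with db.begin():
        for op in operations:
            endpoint = op['endpoint']
            data = op['data']

            if endpoint == '/api/analytics/event':
                record_analytics_event(data)
            elif endpoint == '/api/progress/save':
                save_user_progress(data)
            elif endpoint == '/api/content/view':
                record_content_view(data)
            else:
                # Unknown endpoint: report it rather than claiming success
                results.append({'success': False, 'error': 'unknown endpoint'})
                continue

            results.append({'success': True})

    return jsonify({'results': results})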
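One alternative to the if/elif chain is a dispatch table that is validated before anything is applied, so a batch containing an unknown endpoint is rejected as a whole. This is a framework-free sketch, not the handler used in production; `process_batch` and the stand-in handlers are hypothetical names.

```python
def process_batch(operations, handlers):
    """Apply every operation, or none if any endpoint is unknown."""
    # Validate first so a bad operation cannot leave the batch half-applied.
    unknown = [op["endpoint"] for op in operations if op["endpoint"] not in handlers]
    if unknown:
        return {"ok": False, "unknown_endpoints": unknown}

    for op in operations:
        handlers[op["endpoint"]](op["data"])
    return {"ok": True, "processed": len(operations)}


# Stand-in handlers that just record what they were given.
saved = []
handlers = {
    "/api/analytics/event": saved.append,
    "/api/progress/save": saved.append,
    "/api/content/view": saved.append,
}

ops = [
    {"endpoint": "/api/progress/save", "data": {"lesson": 1}},
    {"endpoint": "/api/unknown", "data": {}},
]
print(process_batch(ops, handlers))      # rejected as a whole
print(process_batch(ops[:1], handlers))  # applied
```

Up-front validation only covers unknown endpoints; failures inside a handler still need the database transaction to roll back.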

Cost Impact:

Analytics: 100 events/day → 12 batch calls/day (88% reduction)
Progress saves: 80 saves/day → 12 batch calls/day (85% reduction)
Total: 180 calls → 24 calls per user per day
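The figure of 12 batch calls per day lines up with the 5-minute flush interval and about an hour of daily use (two 30-minute sessions). That link is an inference from the numbers in this post rather than something measured; a small sketch of the arithmetic, with an illustrative function name:

```python
import math


def batch_calls_per_day(active_minutes_per_day, flush_interval_minutes=5):
    """Timer-driven flushes per day, assuming the queue never hits its size cap."""
    return math.ceil(active_minutes_per_day / flush_interval_minutes)


active_minutes = 2 * 30  # two 30-minute sessions per day
print(batch_calls_per_day(active_minutes))
```

A longer flush interval lowers the count further, at the cost of more unsent data being held on the device at any moment.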

3. Client-Side Caching

We implemented aggressive caching for static and semi-static content:

Cache Strategy:

class ContentCache {
  constructor() {
    this.cache = new Map();
    this.ttl = new Map(); // Expiry timestamp (ms since epoch) per key
  }

  async get(key, fetchFn, ttl = 3600000) { // 1 hour default
    // Check if cached and not expired
    if (this.cache.has(key)) {
      const expiry = this.ttl.get(key);
      if (Date.now() < expiry) {
        return this.cache.get(key);
      }
    }

    // Fetch and cache
    const data = await fetchFn();
    this.cache.set(key, data);
    this.ttl.set(key, Date.now() + ttl);
    return data;
  }

  // Drop an entry so the next get() refetches it
  invalidate(key) {
    this.cache.delete(key);
    this.ttl.delete(key);
  }
}

// Usage
const cache = new ContentCache();
const content = await cache.get(
  `lesson_${lessonId}`,
  () => api.get(`/api/lessons/${lessonId}`),
  3600000 // Cache for 1 hour
);

Cache Invalidation: WebSocket messages trigger cache invalidation:

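// Registered with addEventListener so it does not replace the main onmessage handler
ws.addEventListener('message', (event) => {
  const update = JSON.parse(event.data);
  if (update.type === 'content_updated') {
    cache.invalidate(`lesson_${update.lessonId}`);
  }
});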

Cost Impact:

Lesson fetches: 50 calls/day → 5 calls/day (90% reduction)
User profile: 20 calls/day → 2 calls/day (90% reduction)
Subject tree: 30 calls/day → 1 call/day (97% reduction)

4. Eliminate Health Checks

The mobile app was pinging /health every 30 seconds to detect server outages. This was wasteful: API Gateway already reports error rates and latency to CloudWatch, so server health can be monitored without any client traffic.

Solution: Remove health checks entirely. If API calls fail, handle errors gracefully:

// Before: Proactive health checks
setInterval(() => {
  api.get('/health').catch(() => showOfflineMessage());
}, 30000);

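// After: Reactive error handling
api.interceptors.response.use(
  response => response,
  error => {
    // No response means the request never completed (offline, DNS failure,
    // refused connection); a 5xx status means the server itself failed
    if (!error.response || error.response.status >= 500) {
      showOfflineMessage();
    }
    return Promise.reject(error);
  }
);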

Cost Impact:

Health checks: 2,880 calls/day per user → 0 calls/day (100% reduction)
Cost savings: about $302/month at $3.50 per million requests (86.4 million calls/month across 1,000 users)

Results

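Cost Reduction: $4,300/month → $50/month (99% lower, $51,000/year)

API Call Volume: 1.2 billion → 14 million calls/month (98.8% lower)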

Breakdown of Savings:

Polling elimination (WebSocket): $3,000/month
Health check removal: $1,008/month
Request batching: $150/month
Client-side caching: $92/month

User Experience Impact:

Surprisingly, the new architecture improved user experience:

Lower latency: Push updates arrive in 50ms vs 1s polling delay
Better battery life: No continuous polling in background
Offline resilience: Batching allows app to queue operations when offline
Reduced data usage: 87% reduction in mobile data consumption

API Gateway Costs (Before → After)
┌──────────────────────────────────────┐
│ Before:         $4,300/month         │
│ After:          $50/month            │
│ ────────────────────────────────     │
│ Savings:        $4,250/month         │
│ Reduction:      99%                  │
│ Annual Impact:  $51,000/year         │
└──────────────────────────────────────┘

Monthly API Calls (Before → After)
┌──────────────────────────────────────┐
│ Before:  1.2 billion calls/month     │
│ After:   14 million calls/month      │
│ ────────────────────────────────     │
│ Reduction: 98.8%                     │
└──────────────────────────────────────┘

Latency & Responsiveness (Before → After)
┌──────────────────────────────────────┐
│ Update latency:    1s → 50ms         │
│ Battery drain:     High → Low        │
│ Offline support:   None → Full       │
│ Data usage:        120MB → 15MB/day  │
└──────────────────────────────────────┘

Lessons Learned

1. Polling is Almost Always Wrong

Unless you have a specific real-time requirement that WebSocket can't handle, polling is wasteful. Even long-polling is more efficient than short-interval polling.

2. Batch Everything

Batching reduced 180 API calls to 24 while improving atomicity (operations succeed or fail together).

3. Push State Changes, Don't Poll for Them

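WebSocket connections cost $0.25 per million connection minutes. Polling costs $3.50 per million requests. The unit price alone is 14× lower, and in our app one connection minute replaced 180 polling requests (3 endpoints × 60 seconds), so for long-lived connections the real gap is far larger.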

4. Client-Side Caching Has Massive Impact

90% of our content fetches were redundant. Aggressive caching with smart invalidation eliminated unnecessary round-trips.

5. Monitor Third-Party API Usage

Our analytics SDK was sending one API call per event. Switching to batch mode (up to 100 events per call) cut analytics requests from 100 to 12 per user per day.

Rollout Strategy

We couldn't migrate all users overnight. Our phased rollout:

Week 1: Beta Users (5% traffic)

Deploy WebSocket support
Enable batching for beta users only
Monitor error rates and WebSocket connection stability

Week 2: Gradual Rollout (25% traffic)

Increase rollout to 25% of users
Monitor API Gateway costs (should drop 25%)
Validate battery and data usage improvements

Week 3: Full Rollout (100% traffic)

Deploy to all users
Monitor for 7 days
Measure final cost savings

Week 4: Cleanup

Remove polling code from mobile app
Delete unused REST endpoints
Update documentation
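The post does not say how users were assigned to each rollout stage. One common approach, shown here purely as an assumption, is to hash the user ID into one of 100 buckets, so that anyone enabled at 5% stays enabled at 25% and 100%. The function name and user ID format are illustrative.

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of 100 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


users = [f"user-{i}" for i in range(10_000)]
for percent in (5, 25, 100):
    enabled = sum(in_rollout(u, percent) for u in users)
    print(f"{percent:3d}% target -> {enabled / len(users):.1%} of users enabled")
```

Because the bucket depends only on the user ID, the assignment is stable across app restarts and needs no server-side state.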

Alternative Solutions Considered

1. GraphQL

Pros: Single endpoint, client-driven queries, batching built-in
Cons: Learning curve, requires server refactor, doesn't solve polling problem

We decided GraphQL wouldn't address the root cause (polling) and would add complexity.

2. Server-Sent Events (SSE)

Pros: Simpler than WebSocket, built into HTTP
Cons: One-way only (server → client), poor mobile support

WebSocket's bidirectional communication and better mobile support made it the winner.

3. Provisioned Concurrency

Pros: Keeps Lambda warm, reduces latency
Cons: Doesn't reduce API Gateway costs, adds $200/month

This would have made the cost problem worse.

Conclusion

We reduced API Gateway costs by 99% by replacing polling with WebSocket push notifications and batching API calls. The optimization required mobile app changes but delivered $51,000/year in savings and improved user experience.

Key Takeaways:

Polling is expensive—use WebSocket for real-time updates
Batch API calls to cut request count (180 → 24 calls per user per day in our case)
Client-side caching eliminates redundant fetches
Monitor API call patterns to identify waste
User experience and cost savings can align

Final Metrics:

Cost reduction: $4,250/month ($51,000/year)
API calls: 1.2B → 14M/month (98.8% reduction)
User experience: Improved across all metrics
Engineering effort: 40 hours over 4 weeks

Related Plan: docs/plans/implemented/high/2026-01-16-cost-savings-apigateway-plan.md

Related Posts:

Cost Post 8.2 (Lambda Cost Investigation)
Performance Post 5.3 (Batch API Calls)
