108 Unit Tests for Service Layer: Isolated Business Logic Testing
Service layer bugs caused cascading failures across multiple API endpoints. A bug in DripService broke email campaigns for all users. A change to AmplitudeService silently stopped sending analytics events. Integration tests caught some issues, but they were slow (seconds per test, minutes per run), flaky (external API dependencies), and incomplete (missing edge cases).
The service layer—business logic that orchestrates external services—had minimal unit test coverage. We relied on integration tests that required real databases, mock HTTP servers, and complex setup. Changes were risky. We couldn't isolate bugs quickly. We wrote 108 unit tests with mocked dependencies to test service logic in isolation.
The Problem: Integration Tests for Unit Logic
Before implementation, our test pyramid was inverted: heavy on integration tests, light on unit tests.
Before: Inverted Test Pyramid
 ________________________
 \                      /
  \  Integration Tests /    Slow, flaky, expensive
   \__________________/
    \                /
     \  Unit Tests  /       Few, incomplete
      \____________/
       \          /
        \________/
Service Layer Testing
┌──────────────────────────────────────┐
│ src/services/                        │
│  - 20+ service modules               │
│  - Sparse test coverage              │
│  - Integration tests only            │
│  - Seconds per test, minutes per run │
│  - Flaky (external dependencies)     │
└──────────────────────────────────────┘
Integration tests have value—they verify end-to-end behavior. But they're slow, brittle, and hard to debug. When an integration test fails, the problem could be in the service layer, the HTTP client, the mock server, database state, or test setup. Debugging requires tracing through multiple layers.
Integration Test Example:
def test_send_email_campaign_integration():
    # 1. Set up test database
    db = create_test_database()
    user = db.create_user(email="test@example.com")

    # 2. Mock external HTTP API
    with responses.RequestsMock() as rsps:
        rsps.add(responses.POST, 'https://api.drip.com/v2/campaigns',
                 json={'success': True}, status=200)

        # 3. Execute service method
        drip_service = DripService(db_session=db.session)
        result = drip_service.send_campaign(user_id=user.id, campaign_id=123)

        # 4. Verify database state
        db.refresh(user)
        assert user.last_email_sent is not None

        # 5. Verify HTTP call
        assert len(rsps.calls) == 1
This test takes 2 seconds to run due to database setup and teardown. If it fails, the issue could be in:
- send_campaign() business logic
- Database query construction
- HTTP client configuration
- Mock server setup
- Test data generation
Debugging requires stepping through all layers.
The Solution: Unit Tests with Mocked Dependencies
We wrote 108 unit tests that isolate service logic from dependencies. Instead of real databases and HTTP clients, we use mocks that return controlled values. Tests run in milliseconds and fail for a single, clear reason.
After: Proper Test Pyramid
        /\
       /  \       Integration Tests (Selected critical paths)
      /____\
     |      |
     |      |     Unit Tests (Fast, focused, comprehensive)
     |      |
     |______|
Service Layer Testing
┌──────────────────────────────────────┐
│ src/services/                        │
│  - 20+ service modules               │
│  - 108 unit tests                    │
│  - Isolated service testing          │
│  - <100ms per test                   │
│  - Mocked dependencies               │
│  - No flakiness                      │
└──────────────────────────────────────┘
Unit tests verify business logic without involving external systems. They're fast, deterministic, and easy to debug.
Unit Test Example:
def test_send_email_campaign_unit(mocker):
    # Mock dependencies
    mock_db = mocker.Mock()
    mock_http_client = mocker.Mock()

    # Set up mock return values
    mock_db.get_user.return_value = User(id=1, email="test@example.com")
    mock_http_client.post.return_value = {'success': True}

    # Create service with mocked dependencies
    drip_service = DripService(db=mock_db, http_client=mock_http_client)

    # Execute method
    result = drip_service.send_campaign(user_id=1, campaign_id=123)

    # Verify business logic
    assert result is True
    mock_http_client.post.assert_called_once_with(
        'https://api.drip.com/v2/campaigns',
        json={'email': 'test@example.com', 'campaign_id': 123}
    )
This test runs in 5ms and fails only if send_campaign() logic is wrong. No database, no HTTP, no complexity.
Implementation Details
We organized tests by service module:
Test File Structure:
src/tests/unit/services/
├── test_drip_service.py # 24 tests
├── test_amplitude_service.py # 19 tests
├── test_twilio_service.py # 18 tests
├── test_content_duo_service.py # 16 tests
├── test_tts_service.py # 15 tests
├── test_auth_service.py # 12 tests
└── test_notification_service.py # 4 tests
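Shared mock setup lives naturally in a conftest.py next to these files. A minimal sketch, assuming pytest-style fixtures; the fixture names and default responses here are illustrative, not lifted from our codebase:

```python
# conftest.py -- shared fixtures for service unit tests
# (fixture names and default values are illustrative)
import pytest
from unittest.mock import Mock

@pytest.fixture
def mock_http_client():
    """HTTP client mock with a successful default response;
    individual tests override return_value or side_effect."""
    client = Mock()
    client.post.return_value = {'success': True}
    return client

@pytest.fixture
def mock_db():
    """Bare database mock; tests stub only the queries they need."""
    return Mock()
```

With these in place, a test asks for mock_http_client as an argument and gets a fresh, independent mock per test.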
Service 1: DripService (24 Tests)
DripService handles email campaign automation via Drip's API. Tests cover:
- Campaign creation and sending
- Subscriber management
- Event tracking
- Error handling (API failures, invalid data)
- Batch operations
Example Tests:
def test_create_subscriber_success(mocker):
    """Test successful subscriber creation."""
    mock_http = mocker.Mock()
    mock_http.post.return_value = {'subscribers': [{'id': '123'}]}

    service = DripService(http_client=mock_http)
    result = service.create_subscriber(
        email="test@example.com",
        custom_fields={'name': 'Test User'}
    )

    assert result['id'] == '123'
    mock_http.post.assert_called_once()

def test_create_subscriber_duplicate_email(mocker):
    """Test handling of duplicate email addresses."""
    mock_http = mocker.Mock()
    mock_http.post.side_effect = DripAPIError("Email already exists")

    service = DripService(http_client=mock_http)
    with pytest.raises(DuplicateSubscriberError):
        service.create_subscriber(email="duplicate@example.com")

def test_send_campaign_batch(mocker):
    """Test batch campaign sending."""
    mock_http = mocker.Mock()
    mock_http.post.return_value = {'success': True}

    service = DripService(http_client=mock_http)
    subscribers = [
        {'email': 'user1@example.com'},
        {'email': 'user2@example.com'},
        {'email': 'user3@example.com'}
    ]
    result = service.send_campaign_batch(
        campaign_id=456,
        subscribers=subscribers
    )

    assert result['sent'] == 3
    # Verify batching behavior
    assert mock_http.post.call_count == 1  # Single batch call
The mocked HTTP client allows us to test error handling without triggering actual API failures.
Service 2: AmplitudeService (19 Tests)
AmplitudeService sends analytics events to Amplitude. Tests cover:
- Event validation and transformation
- Batch event sending
- Internal domain filtering
- Error handling and retries
Example Tests:
def test_track_event_filters_internal_emails(mocker):
    """Ensure internal emails are excluded from analytics."""
    mock_http = mocker.Mock()
    service = AmplitudeService(http_client=mock_http)

    service.track_event(
        user_id=1,
        email="internal@alphazed.app",  # Internal domain
        event_name="button_clicked"
    )

    # Verify no HTTP call was made
    mock_http.post.assert_not_called()

def test_track_event_batches_correctly(mocker):
    """Test batching of multiple events."""
    mock_http = mocker.Mock()
    mock_http.post.return_value = {'success': True}
    service = AmplitudeService(http_client=mock_http, batch_size=3)

    # Send 5 events (should create 2 batches: 3 + 2)
    for i in range(5):
        service.track_event(
            user_id=i,
            email=f"user{i}@example.com",
            event_name="test_event"
        )

    # Force flush remaining events
    service.flush()

    # Verify 2 HTTP calls (batch 1: 3 events, batch 2: 2 events)
    assert mock_http.post.call_count == 2
These tests verified the internal domain filtering logic, catching a bug where the check used string equality instead of domain suffix matching.
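The fix was small once the test exposed it. A hedged sketch of the before/after check; the helper name and domain constant are ours for illustration, not the actual service code:

```python
INTERNAL_DOMAIN = "alphazed.app"

def is_internal_email(email: str) -> bool:
    """True when the address belongs to the internal domain.
    The buggy version compared the whole address to the domain
    (email == INTERNAL_DOMAIN), which never matched a real address;
    the fix compares only the part after the '@'."""
    return email.lower().rsplit("@", 1)[-1] == INTERNAL_DOMAIN
```

Note that a plain endswith check would reintroduce a subtler bug: eve@notalphazed.app ends with the internal domain string but is external, so the exact match on the domain part is the safer form.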
Service 3: TwilioService (18 Tests)
TwilioService sends SMS notifications via Twilio. Tests cover:
- Phone number validation
- SMS sending
- Bounce handling
- Rate limiting
Example Tests:
def test_send_sms_invalid_phone_number(mocker):
    """Reject invalid phone numbers."""
    mock_twilio = mocker.Mock()
    service = TwilioService(client=mock_twilio)

    with pytest.raises(InvalidPhoneNumberError):
        service.send_sms(phone="not-a-phone", message="Test")

def test_send_sms_respects_suppressions(mocker):
    """Don't send SMS to suppressed phone numbers."""
    mock_db = mocker.Mock()
    mock_twilio = mocker.Mock()
    # Phone number is in suppression list
    mock_db.is_suppressed.return_value = True

    service = TwilioService(client=mock_twilio, db=mock_db)
    result = service.send_sms(phone="+1234567890", message="Test")

    assert result['suppressed'] is True
    mock_twilio.messages.create.assert_not_called()
The suppression tests caught an edge case where temporary failures were added to the suppression list, preventing retries.
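The fix classified failures before suppressing anything. A minimal sketch; the failure categories below are illustrative, not Twilio's actual error taxonomy:

```python
# Illustrative failure categories; real Twilio error codes differ.
PERMANENT_FAILURES = {"invalid_number", "unsubscribed", "carrier_blocked"}
TEMPORARY_FAILURES = {"carrier_timeout", "rate_limited", "queue_full"}

def should_suppress(failure_reason: str) -> bool:
    """Only permanent delivery failures join the suppression list;
    temporary failures stay eligible for retry."""
    return failure_reason in PERMANENT_FAILURES
```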
Service 4: ContentDuoService (16 Tests)
ContentDuoService implements adaptive learning logic. Tests cover:
- Persona detection
- Lesson selection
- HLR memory calculations
- Slot distribution
Example Tests:
def test_detect_persona_beginner(mocker):
    """Classify users as beginners based on performance."""
    mock_db = mocker.Mock()
    # User with low accuracy and few attempts
    mock_db.get_user_stats.return_value = {
        'accuracy': 45,  # Below 60% threshold
        'attempts': 8    # Below 10 attempt threshold
    }

    service = ContentDuoService(db=mock_db)
    persona = service.detect_persona(user_id=1)

    assert persona == 'beginner'

def test_select_lesson_respects_slot_distribution(mocker):
    """Ensure lessons match configured slot percentages."""
    mock_db = mocker.Mock()
    mock_config = mocker.Mock()
    # Config: 40% new, 30% review, 30% challenge
    mock_config.get_slots.return_value = {
        'new': 0.40,
        'review': 0.30,
        'challenge': 0.30
    }

    service = ContentDuoService(db=mock_db, config=mock_config)

    # Generate 100 lessons and verify distribution
    lessons = [service.select_lesson(user_id=1) for _ in range(100)]
    new_count = sum(1 for lesson in lessons if lesson['type'] == 'new')
    review_count = sum(1 for lesson in lessons if lesson['type'] == 'review')
    challenge_count = sum(1 for lesson in lessons if lesson['type'] == 'challenge')

    # Allow ±10% variance for randomness
    assert 30 <= new_count <= 50
    assert 20 <= review_count <= 40
    assert 20 <= challenge_count <= 40
These statistical tests verify slot distribution without requiring exact percentages, accounting for randomness.
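The HLR memory tests work the same way. A sketch of one, assuming the standard half-life regression recall curve p = 2^(-delta/h); the service's actual model and parameters may differ:

```python
def recall_probability(delta_days: float, half_life_days: float) -> float:
    """Half-life regression: recall probability halves for every
    half_life_days elapsed since the last practice."""
    return 2 ** (-delta_days / half_life_days)

def test_recall_probability_decays_by_half_life():
    # Immediately after practice, recall is certain
    assert recall_probability(0.0, 7.0) == 1.0
    # At exactly one half-life, recall probability is 0.5
    assert abs(recall_probability(7.0, 7.0) - 0.5) < 1e-9
    # Two half-lives quarter it
    assert abs(recall_probability(14.0, 7.0) - 0.25) < 1e-9
```

Like the slot-distribution test, this pins down the shape of the curve with a few exact points rather than asserting on floating-point internals.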
Mocking Patterns
We used consistent mocking patterns across all services:
1. Dependency Injection:
class DripService:
    def __init__(self, http_client, db_session):
        self.http_client = http_client
        self.db_session = db_session
Services accept dependencies via constructor, making them easy to mock in tests.
2. pytest-mock for Mocking:
def test_service_method(mocker):
    mock_dep = mocker.Mock()
    mock_dep.method.return_value = "expected value"

    service = Service(dependency=mock_dep)
    result = service.method()

    assert result == "expected value"
pytest-mock provides a clean API for creating and verifying mocks.
3. side_effect for Exception Testing:
def test_handles_api_error(mocker):
    mock_http = mocker.Mock()
    mock_http.post.side_effect = HTTPError("Connection failed")

    service = Service(http_client=mock_http)
    with pytest.raises(ServiceUnavailableError):
        service.call_api()
side_effect allows testing error handling without triggering real errors.
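side_effect also accepts a sequence: each call consumes the next item, and any exception in the list is raised rather than returned. That makes retry logic directly testable. A self-contained sketch with a hypothetical one-retry service (unittest.mock here, but pytest-mock's mocker.Mock behaves identically):

```python
from unittest.mock import Mock

class HTTPError(Exception):
    """Stand-in for the real HTTP client's error type."""

class RetryingService:
    """Hypothetical service that retries a failed call once."""
    def __init__(self, http_client):
        self.http_client = http_client

    def call_api_with_retry(self):
        try:
            return self.http_client.post('/endpoint')
        except HTTPError:
            # Single retry on a transient failure
            return self.http_client.post('/endpoint')

def test_retries_after_transient_error():
    mock_http = Mock()
    # First call raises, second call succeeds
    mock_http.post.side_effect = [HTTPError("timeout"), {'success': True}]
    service = RetryingService(http_client=mock_http)

    assert service.call_api_with_retry() == {'success': True}
    assert mock_http.post.call_count == 2
```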
Running Tests
Unit tests run separately from integration tests:
# Run all service unit tests
pytest src/tests/unit/services/
# Run specific service tests
pytest src/tests/unit/services/test_drip_service.py
# Run with coverage
pytest --cov=src/services --cov-report=term-missing src/tests/unit/services/
# Run tests in parallel (faster)
pytest -n auto src/tests/unit/services/
108 tests complete in <5 seconds, compared to roughly five minutes for the equivalent integration tests.
Real-World Impact
Before Implementation:
- Minimal service unit tests
- Heavy reliance on integration tests (minutes per run)
- Flaky tests (external API dependencies)
- Slow debugging (multiple failure points)
- Developers avoided refactoring services
After Implementation:
- 108 comprehensive unit tests
- Fast test execution (<5 seconds)
- Zero flakiness (no external dependencies)
- Instant bug localization
- Confident service refactoring
Test Execution Time Comparison:
Integration Tests (Before):
- DripService: 24 tests × 5s = 120s
- AmplitudeService: 19 tests × 4s = 76s
- TwilioService: 18 tests × 6s = 108s
Total: 304s (~5 minutes)
Unit Tests (After):
- DripService: 24 tests × 20ms = 480ms
- AmplitudeService: 19 tests × 15ms = 285ms
- TwilioService: 18 tests × 25ms = 450ms
Total: 1.2s
Speed improvement: 253× faster
Faster tests mean developers run them more frequently, catching bugs earlier.
Bugs Caught During Development:
- DripService batch logic - Batch size calculation off by one, causing extra API calls
- AmplitudeService domain filtering - Used string equality instead of suffix matching
- TwilioService suppression - Temporary failures added to permanent suppression list
- ContentDuoService persona - Edge case where accuracy=60% classified incorrectly
- TTSService caching - Cache key collision between different users
Each bug was caught in unit tests before reaching integration testing or production.
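The batch off-by-one, for instance, turned into a permanent regression test. A sketch, with a hypothetical helper standing in for the real batch logic:

```python
import math

def batch_count(n_subscribers: int, batch_size: int) -> int:
    """Number of API calls needed to send to n subscribers.
    The buggy version used n // batch_size + 1, which made an
    extra call whenever n was an exact multiple of batch_size."""
    return math.ceil(n_subscribers / batch_size)

def test_batch_count_exact_multiple():
    # 100 subscribers at batch size 50 is exactly 2 calls, not 3
    assert batch_count(100, 50) == 2
    assert batch_count(101, 50) == 3
    assert batch_count(1, 50) == 1
```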
Developer Feedback
"I used to avoid changing service layer code because tests took forever to run. Now I refactor confidently—if I break something, tests fail in seconds."
"Unit tests are executable documentation. When I onboarded, I read the tests to understand how services work."
"The mocks show exactly what dependencies each service uses. It makes the architecture explicit."
Key Takeaways
- Unit tests are for business logic - Test services in isolation from databases and APIs
- Mocks provide control - Simulate errors, edge cases, and unusual states
- Fast tests get run - <5s execution means developers run tests constantly
- Dependency injection enables testing - Design services to accept mocked dependencies
- Complement, don't replace integration tests - Both have value; use each appropriately
Writing 108 unit tests for service layer code transformed our testing strategy. We righted the inverted test pyramid, shifting weight from slow integration tests to fast unit tests. Services became easier to understand, refactor, and extend. The investment—two weeks of test development—paid off immediately through faster development cycles and prevented production bugs.
Unit tests aren't overhead. They're infrastructure that enables confident, rapid development.