108 Unit Tests for Service Layer: Isolated Business Logic Testing
Service layer bugs caused cascading failures across multiple API endpoints. A bug in DripService broke email campaigns for all users. A change to AmplitudeService silently stopped sending analytics events. Integration tests caught some issues, but they were slow (seconds per test, minutes per run), flaky (external API dependencies), and incomplete (missing edge cases).
The service layer—business logic that orchestrates external services—had minimal unit test coverage. We relied on integration tests that required real databases, mock HTTP servers, and complex setup. Changes were risky. We couldn't isolate bugs quickly. We wrote 108 unit tests with mocked dependencies to test service logic in isolation.
The Problem: Integration Tests for Unit Logic
Before implementation, our test pyramid was inverted: heavy on integration tests, light on unit tests.
Before: Inverted Test Pyramid
 ________________________
 \                      /
  \  Integration Tests /    Slow, flaky, expensive
   \__________________/
    \                /
     \  Unit Tests  /       Few, incomplete
      \____________/
       \          /
        \________/
Service Layer Testing
┌──────────────────────────────────────┐
│ src/services/                        │
│  - 20+ service modules               │
│  - Sparse test coverage              │
│  - Integration tests only            │
│  - Seconds per test, minutes per run │
│  - Flaky (external dependencies)     │
└──────────────────────────────────────┘
Integration tests have value—they verify end-to-end behavior. But they're slow, brittle, and hard to debug. When an integration test fails, the problem could be in the service layer, the HTTP client, the mock server, database state, or test setup. Debugging requires tracing through multiple layers.
Integration Test Example:
def test_send_email_campaign_integration():
    # 1. Set up test database
    db = create_test_database()
    user = db.create_user(email="test@example.com")

    # 2. Mock external HTTP API
    with responses.RequestsMock() as rsps:
        rsps.add(responses.POST, 'https://api.drip.com/v2/campaigns',
                 json={'success': True}, status=200)

        # 3. Execute service method
        drip_service = DripService(db_session=db.session)
        result = drip_service.send_campaign(user_id=user.id, campaign_id=123)

        # 4. Verify database state
        db.refresh(user)
        assert user.last_email_sent is not None

        # 5. Verify HTTP call
        assert len(rsps.calls) == 1
This test takes 2 seconds to run due to database setup and teardown. If it fails, the issue could be in:
- send_campaign() business logic
- Database query construction
- HTTP client configuration
- Mock server setup
- Test data generation
Debugging requires stepping through all layers.
The Solution: Unit Tests with Mocked Dependencies
We wrote 108 unit tests that isolate service logic from dependencies. Instead of real databases and HTTP clients, we use mocks that return controlled values. Tests run in milliseconds and fail for a single, clear reason.
After: Proper Test Pyramid
        /\
       /  \       Integration Tests (Selected critical paths)
      /____\
     |      |
     |      |     Unit Tests (Fast, focused, comprehensive)
     |      |
     |______|
Service Layer Testing
┌──────────────────────────────────────┐
│ src/services/                        │
│  - 20+ service modules               │
│  - 108 unit tests                    │
│  - Isolated service testing          │
│  - <100ms per test                   │
│  - Mocked dependencies               │
│  - No flakiness                      │
└──────────────────────────────────────┘
Unit tests verify business logic without involving external systems. They're fast, deterministic, and easy to debug.
Unit Test Example:
def test_send_email_campaign_unit(mocker):
    # Mock dependencies
    mock_db = mocker.Mock()
    mock_http_client = mocker.Mock()

    # Set up mock return values
    mock_db.get_user.return_value = User(id=1, email="test@example.com")
    mock_http_client.post.return_value = {'success': True}

    # Create service with mocked dependencies
    drip_service = DripService(db=mock_db, http_client=mock_http_client)

    # Execute method
    result = drip_service.send_campaign(user_id=1, campaign_id=123)

    # Verify business logic
    assert result is True
    mock_http_client.post.assert_called_once_with(
        'https://api.drip.com/v2/campaigns',
        json={'email': 'test@example.com', 'campaign_id': 123}
    )
This test runs in 5ms and fails only if send_campaign() logic is wrong. No database, no HTTP, no complexity.
Implementation Details
We organized tests by service module:
Test File Structure:
src/tests/unit/services/
├── test_drip_service.py # 24 tests
├── test_amplitude_service.py # 19 tests
├── test_twilio_service.py # 18 tests
├── test_content_duo_service.py # 16 tests
├── test_tts_service.py # 15 tests
├── test_auth_service.py # 12 tests
└── test_notification_service.py # 4 tests
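Shared mock setup lives naturally in a conftest.py next to these files. A minimal sketch, assuming pytest-style fixtures; the fixture names and default responses here are illustrative, not lifted from our codebase:

```python
# conftest.py -- shared fixtures for service unit tests
# (fixture names and default values are illustrative)
import pytest
from unittest.mock import Mock

@pytest.fixture
def mock_http_client():
    """HTTP client mock with a successful default response;
    individual tests override return_value or side_effect."""
    client = Mock()
    client.post.return_value = {'success': True}
    return client

@pytest.fixture
def mock_db():
    """Bare database mock; tests stub only the queries they need."""
    return Mock()
```

With these in place, a test asks for mock_http_client as an argument and gets a fresh, independent mock per test.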
Service 1: DripService (24 Tests)
DripService handles email campaign automation via Drip's API. Tests cover:
- Campaign creation and sending
- Subscriber management
- Event tracking
- Error handling (API failures, invalid data)
- Batch operations
Example Tests:
def test_create_subscriber_success(mocker):
    """Test successful subscriber creation."""
    mock_http = mocker.Mock()
    mock_http.post.return_value = {'subscribers': [{'id': '123'}]}

    service = DripService(http_client=mock_http)
    result = service.create_subscriber(
        email="test@example.com",
        custom_fields={'name': 'Test User'}
    )

    assert result['id'] == '123'
    mock_http.post.assert_called_once()

def test_create_subscriber_duplicate_email(mocker):
    """Test handling of duplicate email addresses."""
    mock_http = mocker.Mock()
    mock_http.post.side_effect = DripAPIError("Email already exists")

    service = DripService(http_client=mock_http)
    with pytest.raises(DuplicateSubscriberError):
        service.create_subscriber(email="duplicate@example.com")

def test_send_campaign_batch(mocker):
    """Test batch campaign sending."""
    mock_http = mocker.Mock()
    mock_http.post.return_value = {'success': True}

    service = DripService(http_client=mock_http)
    subscribers = [
        {'email': 'user1@example.com'},
        {'email': 'user2@example.com'},
        {'email': 'user3@example.com'}
    ]
    result = service.send_campaign_batch(
        campaign_id=456,
        subscribers=subscribers
    )

    assert result['sent'] == 3
    # Verify batching behavior
    assert mock_http.post.call_count == 1  # Single batch call
The mocked HTTP client allows us to test error handling without triggering actual API failures.
Service 2: AmplitudeService (19 Tests)
AmplitudeService sends analytics events to Amplitude. Tests cover:
- Event validation and transformation
- Batch event sending
- Internal domain filtering
- Error handling and retries
Example Tests:
def test_track_event_filters_internal_emails(mocker):
    """Ensure internal emails are excluded from analytics."""
    mock_http = mocker.Mock()
    service = AmplitudeService(http_client=mock_http)

    service.track_event(
        user_id=1,
        email="internal@alphazed.app",  # Internal domain
        event_name="button_clicked"
    )

    # Verify no HTTP call was made
    mock_http.post.assert_not_called()

def test_track_event_batches_correctly(mocker):
    """Test batching of multiple events."""
    mock_http = mocker.Mock()
    mock_http.post.return_value = {'success': True}
    service = AmplitudeService(http_client=mock_http, batch_size=3)

    # Send 5 events (should create 2 batches: 3 + 2)
    for i in range(5):
        service.track_event(
            user_id=i,
            email=f"user{i}@example.com",
            event_name="test_event"
        )

    # Force flush remaining events
    service.flush()

    # Verify 2 HTTP calls (batch 1: 3 events, batch 2: 2 events)
    assert mock_http.post.call_count == 2
These tests verified the internal domain filtering logic, catching a bug where the check used string equality instead of domain suffix matching.
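The fix was small once the test exposed it. A hedged sketch of the before/after check; the helper name and domain constant are ours for illustration, not the actual service code:

```python
INTERNAL_DOMAIN = "alphazed.app"

def is_internal_email(email: str) -> bool:
    """True when the address belongs to the internal domain.
    The buggy version compared the whole address to the domain
    (email == INTERNAL_DOMAIN), which never matched a real address;
    the fix compares only the part after the '@'."""
    return email.lower().rsplit("@", 1)[-1] == INTERNAL_DOMAIN
```

Note that a plain endswith check would reintroduce a subtler bug: eve@notalphazed.app ends with the internal domain string but is external, so the exact match on the domain part is the safer form.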
Service 3: TwilioService (18 Tests)
TwilioService sends SMS notifications via Twilio. Tests cover:
- Phone number validation
- SMS sending
- Bounce handling
- Rate limiting
Example Tests:
def test_send_sms_invalid_phone_number(mocker):
    """Reject invalid phone numbers."""
    mock_twilio = mocker.Mock()
    service = TwilioService(client=mock_twilio)

    with pytest.raises(InvalidPhoneNumberError):
        service.send_sms(phone="not-a-phone", message="Test")

def test_send_sms_respects_suppressions(mocker):
    """Don't send SMS to suppressed phone numbers."""
    mock_db = mocker.Mock()
    mock_twilio = mocker.Mock()
    # Phone number is in suppression list
    mock_db.is_suppressed.return_value = True

    service = TwilioService(client=mock_twilio, db=mock_db)
    result = service.send_sms(phone="+1234567890", message="Test")

    assert result['suppressed'] is True
    mock_twilio.messages.create.assert_not_called()
The suppression tests caught an edge case where temporary failures were added to the suppression list, preventing retries.
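The fix classified failures before suppressing anything. A minimal sketch; the failure categories below are illustrative, not Twilio's actual error taxonomy:

```python
# Illustrative failure categories; real Twilio error codes differ.
PERMANENT_FAILURES = {"invalid_number", "unsubscribed", "carrier_blocked"}
TEMPORARY_FAILURES = {"carrier_timeout", "rate_limited", "queue_full"}

def should_suppress(failure_reason: str) -> bool:
    """Only permanent delivery failures join the suppression list;
    temporary failures stay eligible for retry."""
    return failure_reason in PERMANENT_FAILURES
```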
Service 4: ContentDuoService (16 Tests)
ContentDuoService implements adaptive learning logic. Tests cover:
- Persona detection
- Lesson selection
- HLR memory calculations
- Slot distribution
Example Tests:
def test_detect_persona_beginner(mocker):
    """Classify users as beginners based on performance."""
    mock_db = mocker.Mock()
    # User with low accuracy and few attempts
    mock_db.get_user_stats.return_value = {
        'accuracy': 45,  # Below 60% threshold
        'attempts': 8    # Below 10 attempt threshold
    }

    service = ContentDuoService(db=mock_db)
    persona = service.detect_persona(user_id=1)

    assert persona == 'beginner'

def test_select_lesson_respects_slot_distribution(mocker):
    """Ensure lessons match configured slot percentages."""
    mock_db = mocker.Mock()
    mock_config = mocker.Mock()
    # Config: 40% new, 30% review, 30% challenge
    mock_config.get_slots.return_value = {
        'new': 0.40,
        'review': 0.30,
        'challenge': 0.30
    }

    service = ContentDuoService(db=mock_db, config=mock_config)

    # Generate 100 lessons and verify distribution
    lessons = [service.select_lesson(user_id=1) for _ in range(100)]
    new_count = sum(1 for lesson in lessons if lesson['type'] == 'new')
    review_count = sum(1 for lesson in lessons if lesson['type'] == 'review')
    challenge_count = sum(1 for lesson in lessons if lesson['type'] == 'challenge')

    # Allow ±10% variance for randomness
    assert 30 <= new_count <= 50
    assert 20 <= review_count <= 40
    assert 20 <= challenge_count <= 40
These statistical tests verify slot distribution without requiring exact percentages, accounting for randomness.
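The HLR memory tests work the same way. A sketch of one, assuming the standard half-life regression recall curve p = 2^(-delta/h); the service's actual model and parameters may differ:

```python
def recall_probability(delta_days: float, half_life_days: float) -> float:
    """Half-life regression: recall probability halves for every
    half_life_days elapsed since the last practice."""
    return 2 ** (-delta_days / half_life_days)

def test_recall_probability_decays_by_half_life():
    # Immediately after practice, recall is certain
    assert recall_probability(0.0, 7.0) == 1.0
    # At exactly one half-life, recall probability is 0.5
    assert abs(recall_probability(7.0, 7.0) - 0.5) < 1e-9
    # Two half-lives quarter it
    assert abs(recall_probability(14.0, 7.0) - 0.25) < 1e-9
```

Like the slot-distribution test, this pins down the shape of the curve with a few exact points rather than asserting on floating-point internals.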
Mocking Patterns
We used consistent mocking patterns across all services:
1. Dependency Injection:
class DripService:
    def __init__(self, http_client, db_session):
        self.http_client = http_client
        self.db_session = db_session
Services accept dependencies via constructor, making them easy to mock in tests.
2. pytest-mock for Mocking:
def test_service_method(mocker):
    mock_dep = mocker.Mock()
    mock_dep.method.return_value = "expected value"

    service = Service(dependency=mock_dep)
    result = service.method()

    assert result == "expected value"
pytest-mock provides a clean API for creating and verifying mocks.
3. side_effect for Exception Testing:
def test_handles_api_error(mocker):
    mock_http = mocker.Mock()
    mock_http.post.side_effect = HTTPError("Connection failed")

    service = Service(http_client=mock_http)
    with pytest.raises(ServiceUnavailableError):
        service.call_api()
side_effect allows testing error handling without triggering real errors.
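side_effect also accepts a sequence: each call consumes the next item, and any exception in the list is raised rather than returned. That makes retry logic directly testable. A self-contained sketch with a hypothetical one-retry service (unittest.mock here, but pytest-mock's mocker.Mock behaves identically):

```python
from unittest.mock import Mock

class HTTPError(Exception):
    """Stand-in for the real HTTP client's error type."""

class RetryingService:
    """Hypothetical service that retries a failed call once."""
    def __init__(self, http_client):
        self.http_client = http_client

    def call_api_with_retry(self):
        try:
            return self.http_client.post('/endpoint')
        except HTTPError:
            # Single retry on a transient failure
            return self.http_client.post('/endpoint')

def test_retries_after_transient_error():
    mock_http = Mock()
    # First call raises, second call succeeds
    mock_http.post.side_effect = [HTTPError("timeout"), {'success': True}]
    service = RetryingService(http_client=mock_http)

    assert service.call_api_with_retry() == {'success': True}
    assert mock_http.post.call_count == 2
```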
Running Tests
Unit tests run separately from integration tests:
# Run all service unit tests
pytest src/tests/unit/services/
# Run specific service tests
pytest src/tests/unit/services/test_drip_service.py
# Run with coverage
pytest --cov=src/services --cov-report=term-missing src/tests/unit/services/
# Run tests in parallel (faster)
pytest -n auto src/tests/unit/services/
108 tests complete in <5 seconds, compared to roughly five minutes for the equivalent integration tests.
Real-World Impact
Before Implementation:
- Minimal service unit tests
- Heavy reliance on integration tests (minutes per run)
- Flaky tests (external API dependencies)
- Slow debugging (multiple failure points)
- Developers avoided refactoring services
After Implementation:
- 108 comprehensive unit tests
- Fast test execution (<5 seconds)
- Zero flakiness (no external dependencies)
- Instant bug localization
- Confident service refactoring
Test Execution Time Comparison:
Integration Tests (Before):
- DripService: 24 tests × 5s = 120s
- AmplitudeService: 19 tests × 4s = 76s
- TwilioService: 18 tests × 6s = 108s
Total: 304s (~5 minutes)
Unit Tests (After):
- DripService: 24 tests × 20ms = 480ms
- AmplitudeService: 19 tests × 15ms = 285ms
- TwilioService: 18 tests × 25ms = 450ms
Total: 1.2s
Speed improvement: 253× faster
Faster tests mean developers run them more frequently, catching bugs earlier.
Bugs Caught During Development:
- DripService batch logic - Batch size calculation off by one, causing extra API calls
- AmplitudeService domain filtering - Used string equality instead of suffix matching
- TwilioService suppression - Temporary failures added to permanent suppression list
- ContentDuoService persona - Edge case where accuracy=60% classified incorrectly
- TTSService caching - Cache key collision between different users
Each bug was caught in unit tests before reaching integration testing or production.
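The batch off-by-one, for instance, turned into a permanent regression test. A sketch, with a hypothetical helper standing in for the real batch logic:

```python
import math

def batch_count(n_subscribers: int, batch_size: int) -> int:
    """Number of API calls needed to send to n subscribers.
    The buggy version used n // batch_size + 1, which made an
    extra call whenever n was an exact multiple of batch_size."""
    return math.ceil(n_subscribers / batch_size)

def test_batch_count_exact_multiple():
    # 100 subscribers at batch size 50 is exactly 2 calls, not 3
    assert batch_count(100, 50) == 2
    assert batch_count(101, 50) == 3
    assert batch_count(1, 50) == 1
```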
Developer Feedback
"I used to avoid changing service layer code because tests took forever to run. Now I refactor confidently—if I break something, tests fail in seconds."
"Unit tests are executable documentation. When I onboarded, I read the tests to understand how services work."
"The mocks show exactly what dependencies each service uses. It makes the architecture explicit."
Key Takeaways
- Unit tests are for business logic - Test services in isolation from databases and APIs
- Mocks provide control - Simulate errors, edge cases, and unusual states
- Fast tests get run - <5s execution means developers run tests constantly
- Dependency injection enables testing - Design services to accept mocked dependencies
- Complement, don't replace integration tests - Both have value; use each appropriately
Writing 108 unit tests for service layer code transformed our testing strategy. We righted the inverted test pyramid, shifting weight from slow integration tests to fast unit tests. Services became easier to understand, refactor, and extend. The investment—two weeks of test development—paid off immediately through faster development cycles and prevented production bugs.
Unit tests aren't overhead. They're infrastructure that enables confident, rapid development.