Test Configuration Refactor: IntegrationBase Pattern
Integration tests were flaky, slow to write, and inconsistent across the codebase. Each test file implemented its own database setup, teardown, and fixture management. Some tests created a fresh database per test. Others shared state across tests, causing intermittent failures when run in different orders. Test setup code outnumbered actual test logic 3:1.
A developer writing a new integration test copy-pasted setup code from an existing test, modified it slightly, and introduced a subtle bug—the test didn't properly rollback transactions, leaking state into subsequent tests. The bug only manifested when running the full test suite, not when running the test file in isolation. Debugging took two hours.
We refactored integration test infrastructure around a shared IntegrationBase class that standardizes setup, teardown, and fixture management. Writing new tests became trivial—inherit from IntegrationBase, write test logic, done.
The Problem: Inconsistent Test Infrastructure
Integration tests interact with databases, external services, and complex application state. Each test needs:
- Database setup - Create tables, load fixtures
- Transaction management - Isolate tests from each other
- Cleanup - Reset state after tests
- Common fixtures - Authenticated users, test data
- Test client - HTTP client for API requests
Without shared infrastructure, every test file reimplemented these concerns differently.
Before: Inconsistent Test Infrastructure
Integration Tests
┌──────────────────────────────────────┐
│ test_api_1.py │
│ - Custom DB setup │
│ - Manual teardown │
│ - Reinvented fixtures │
├──────────────────────────────────────┤
│ test_api_2.py │
│ - Different DB setup │
│ - Inconsistent fixtures │
│ - Missing cleanup │
├──────────────────────────────────────┤
│ test_api_3.py │
│ - Copy-pasted setup │
│ - Slightly modified │
│ - Introduced bugs │
└──────────────────────────────────────┘
Inconsistent, error-prone
Example of inconsistent setup:
test_auth.py:
class TestAuth:
def setup_method(self):
# Custom database setup
self.db = create_test_db()
self.db.execute("CREATE TABLE users (...)")
self.client = TestClient(app)
def teardown_method(self):
# Manual cleanup
self.db.execute("DROP TABLE users")
self.db.close()
def test_login(self):
# Create user manually
self.db.execute("INSERT INTO users VALUES (...)")
response = self.client.post('/login', ...)
assert response.status_code == 200
test_content.py:
class TestContent:
@classmethod
def setup_class(cls):
# Different setup pattern (class-level, not method-level)
cls.db = Database()
cls.db.init_schema()
cls.app = create_app()
def test_get_content(self):
# Different fixture creation
user = User(email="test@example.com")
self.db.session.add(user)
self.db.session.commit()
# No cleanup - state leaks to next test!
response = self.app.get('/content')
assert response.status_code == 200
These tests use different patterns:
test_auth.pyusessetup_method(per-test setup)test_content.pyusessetup_class(shared across tests)- Different database initialization strategies
- Inconsistent fixture creation
test_content.pydoesn't rollback transactions—state leaks between tests
The result was flaky tests that passed individually but failed in suite runs.
The Solution: IntegrationBase Pattern
We created a base class that handles all common integration test concerns:
src/tests/conftest.py:
import pytest
from flask import Flask
from flask.testing import FlaskClient
from sqlalchemy import create_engine
from sqlalchemy.orm import Session, scoped_session, sessionmaker
from src import create_app
from src.models.base import Base
from src.models import User, Content, Attempt # Import all models
class IntegrationBase:
"""
Base class for integration tests.
Provides:
- Automatic database setup/teardown
- Transaction isolation between tests
- Common fixtures (auth, user, content)
- Test client
- Helper methods
"""
@pytest.fixture(autouse=True)
def setup(self, tmpdir):
"""
Set up test environment before each test.
Runs automatically for all tests inheriting from IntegrationBase.
"""
# 1. Create test database
db_path = tmpdir.join("test.db")
self.engine = create_engine(f"sqlite:///{db_path}")
# 2. Create all tables
Base.metadata.create_all(self.engine)
# 3. Create session factory
self.session_factory = sessionmaker(bind=self.engine)
self.Session = scoped_session(self.session_factory)
self.session = self.Session()
# 4. Create Flask app with test config
self.app = create_app(config={
'TESTING': True,
'DATABASE_URI': f"sqlite:///{db_path}",
'WTF_CSRF_ENABLED': False
})
# 5. Create test client
self.client: FlaskClient = self.app.test_client()
# 6. Start transaction for test isolation
self.transaction = self.session.begin_nested()
yield # Test runs here
# 7. Rollback transaction (cleanup)
self.transaction.rollback()
self.session.close()
self.Session.remove()
# Common fixtures
def create_user(self, email="test@example.com", **kwargs):
"""Create and return a test user."""
user = User(email=email, **kwargs)
self.session.add(user)
self.session.commit()
return user
def create_auth_headers(self, user):
"""Generate authentication headers for a user."""
token = generate_test_token(user)
return {'Authorization': f'Bearer {token}'}
def create_content(self, **kwargs):
"""Create and return test content."""
content = Content(**kwargs)
self.session.add(content)
self.session.commit()
return content
# Helper methods
def assert_json_response(self, response, expected_status=200):
"""Assert response is JSON with expected status."""
assert response.status_code == expected_status
assert response.content_type == 'application/json'
return response.get_json()
Tests inherit from IntegrationBase and get all infrastructure automatically:
test_auth_refactored.py:
from src.tests.conftest import IntegrationBase
class TestAuth(IntegrationBase):
def test_login_success(self):
# Setup - use inherited helper
user = self.create_user(email="test@example.com", password="password123")
# Execute - use inherited client
response = self.client.post('/login', json={
'email': 'test@example.com',
'password': 'password123'
})
# Assert - use inherited helper
data = self.assert_json_response(response, expected_status=200)
assert 'token' in data
def test_login_invalid_password(self):
user = self.create_user(email="test@example.com", password="correct")
response = self.client.post('/login', json={
'email': 'test@example.com',
'password': 'wrong'
})
self.assert_json_response(response, expected_status=401)
No setup code, no teardown code, no manual fixture creation. Just test logic.
Key Design Decisions
Decision 1: Use pytest fixtures with autouse=True
The @pytest.fixture(autouse=True) decorator runs setup automatically for every test. Developers don't need to remember to call setup methods—inheritance is enough.
Alternative Considered:
# Explicit fixture (requires remembering to use it)
@pytest.fixture
def integration_setup(self):
# setup code...
yield
# teardown code...
class TestAuth:
def test_login(self, integration_setup): # Must remember to add fixture
...
We rejected this because it's easy to forget the fixture parameter, breaking test isolation.
Decision 2: Transaction-based isolation
Each test runs in a nested transaction that rolls back after the test completes. This is faster than truncating tables or recreating the database.
Performance comparison:
Per-test database reset strategies:
1. Recreate database: ~500ms per test
2. Truncate all tables: ~100ms per test
3. Transaction rollback: ~5ms per test
We chose option 3 (100× faster than recreation).
Transaction rollback guarantees isolation—changes in one test never affect others.
Decision 3: Shared helper methods
Common operations (creating users, generating auth tokens) are methods on the base class. This reduces code duplication and ensures consistency.
Before (duplicated helper code):
# test_auth.py
def _create_test_user(self):
user = User(email="test@example.com")
self.session.add(user)
self.session.commit()
return user
# test_content.py
def _create_user(self): # Different name, same logic
u = User(email="test@example.com")
self.db.add(u)
self.db.commit()
return u
# test_api.py
def make_user(self): # Yet another variation
user = User(email="test@example.com")
self.db_session.add(user)
self.db_session.flush()
return user
After (shared helper):
# IntegrationBase (one implementation)
def create_user(self, email="test@example.com", **kwargs):
user = User(email=email, **kwargs)
self.session.add(user)
self.session.commit()
return user
# All tests use the same helper
user = self.create_user(email="custom@example.com")
Decision 4: Scoped sessions
We use scoped_session to ensure thread safety and proper cleanup:
self.Session = scoped_session(self.session_factory)
self.session = self.Session()
# After test
self.Session.remove() # Clears session registry
This prevents session leakage between tests.
Migration Process
We migrated 47 integration test files to use IntegrationBase:
Migration steps per file:
-
Add IntegrationBase inheritance:
# Before class TestAuth: ... # After from src.tests.conftest import IntegrationBase class TestAuth(IntegrationBase): ... -
Remove setup/teardown methods:
# Before def setup_method(self): self.db = create_test_db() ... def teardown_method(self): self.db.close() ... # After # (deleted - handled by IntegrationBase) -
Replace manual fixture creation with helpers:
# Before user = User(email="test@example.com") self.session.add(user) self.session.commit() # After user = self.create_user(email="test@example.com") -
Update client references:
# Before client = TestClient(app) response = client.post(...) # After response = self.client.post(...) # Use inherited client -
Run tests to verify:
pytest test_auth_refactored.py -v
Migration statistics:
- Files migrated: 47
- Lines of code removed: 1,847 (setup/teardown boilerplate)
- Lines of code added: 94 (IntegrationBase implementation)
- Net reduction: 1,753 lines (94.9% reduction in infrastructure code)
Extended Functionality
As tests matured, we added more helpers to IntegrationBase:
Helper: Authenticated requests
class IntegrationBase:
def post_authenticated(self, url, user, **kwargs):
"""Make authenticated POST request."""
headers = self.create_auth_headers(user)
return self.client.post(url, headers=headers, **kwargs)
def get_authenticated(self, url, user, **kwargs):
"""Make authenticated GET request."""
headers = self.create_auth_headers(user)
return self.client.get(url, headers=headers, **kwargs)
# Usage
def test_protected_endpoint(self):
user = self.create_user()
response = self.get_authenticated('/api/profile', user)
assert response.status_code == 200
Helper: Bulk data creation
class IntegrationBase:
def create_users(self, count=5, **kwargs):
"""Create multiple test users."""
users = []
for i in range(count):
user = self.create_user(email=f"user{i}@example.com", **kwargs)
users.append(user)
return users
# Usage
def test_list_users(self):
users = self.create_users(count=10)
response = self.client.get('/api/users')
data = self.assert_json_response(response)
assert len(data['users']) == 10
Helper: Time manipulation
from unittest.mock import patch
from datetime import datetime, timedelta
class IntegrationBase:
def freeze_time(self, frozen_time):
"""Context manager to freeze time for testing."""
return patch('src.utils.datetime.now', return_value=frozen_time)
# Usage
def test_expired_token(self):
user = self.create_user()
# Create token that expires in 1 hour
token = create_token(user, expires_in=3600)
# Fast-forward 2 hours
future_time = datetime.now() + timedelta(hours=2)
with self.freeze_time(future_time):
headers = {'Authorization': f'Bearer {token}'}
response = self.client.get('/api/profile', headers=headers)
assert response.status_code == 401 # Token expired
Real-World Impact
Before Implementation:
- 47 test files with custom setup/teardown
- 1,847 lines of duplicated infrastructure code
- Inconsistent patterns across tests
- Flaky tests due to state leakage
- High barrier to writing new tests (copy-paste required)
After Implementation:
- 47 test files using
IntegrationBase - 94 lines of shared infrastructure code (98% reduction)
- Consistent patterns across all tests
- Zero flaky tests (proper transaction isolation)
- Low barrier to writing tests (inherit and write)
Developer Experience:
Before (writing a new integration test):
- Find similar test file to copy from (~2 min)
- Copy setup/teardown code (~3 min)
- Modify for specific use case (~5 min)
- Debug subtle setup bugs (~10 min if unlucky)
- Write actual test logic (~5 min)
Total: 15-25 minutes per test
After (writing a new integration test):
- Create test class inheriting from
IntegrationBase(~30 sec) - Write test logic (~5 min)
Total: 5-6 minutes per test
Time savings: 10-20 minutes per test
Over 6 months, we wrote 89 new integration tests. Time saved:
89 tests × 15 min average savings = 1,335 minutes = 22 hours
Test Reliability:
Before IntegrationBase, we had 3-5 flaky test failures per week due to state leakage. After implementation: zero flaky failures in 6 months.
Flaky test debugging time saved:
5 failures/week × 30 min debugging = 150 min/week
Over 26 weeks: 3,900 minutes = 65 hours saved
Adoption Curve
Migration happened gradually:
Week 1: Created IntegrationBase, migrated 5 test files as proof of concept
Week 2: Migrated 15 more files, added common helpers
Week 3: Migrated remaining 27 files
Week 4: Documentation, team training, established as standard practice
New tests automatically use IntegrationBase. Old tests migrate opportunistically during updates.
Common Pitfalls and Solutions
Pitfall 1: Forgetting to commit fixtures
Developers created fixtures but forgot to commit:
def test_content_list(self):
user = self.create_user() # Creates user
content = Content(title="Test")
self.session.add(content)
# FORGOT: self.session.commit()
response = self.get_authenticated('/api/content', user)
# content not in database - test fails
Solution: We updated helpers to auto-commit:
def create_content(self, **kwargs):
content = Content(**kwargs)
self.session.add(content)
self.session.commit() # Auto-commit
return content
Pitfall 2: Accessing session after rollback
Attempting to access objects after transaction rollback caused errors:
def test_user_update(self):
user = self.create_user()
user_id = user.id
# Test ends, transaction rolls back
# In next test
def test_user_login(self):
# user object from previous test is now detached - error!
Solution: Documentation emphasized creating fixtures per test, not reusing across tests.
Pitfall 3: Slow tests due to excessive fixtures
Some tests created unnecessary fixtures:
def test_simple_endpoint(self):
# Creates 100 users, 500 content items
self.create_users(count=100)
self.create_bulk_content(count=500)
# Test only needs empty database
response = self.client.get('/api/health')
assert response.status_code == 200
Solution: Code review caught these cases. We added documentation on creating minimal fixtures.
Key Takeaways
- Shared infrastructure reduces duplication - One implementation beats 47 copies
- Automatic setup prevents errors -
autouse=Truemeans developers can't forget - Transaction isolation prevents flakiness - Proper cleanup guarantees test independence
- Helper methods accelerate development - Common operations become one-liners
- Gradual migration is feasible - Refactor opportunistically, not all at once
The IntegrationBase pattern transformed integration testing from a tedious, error-prone process into a straightforward, reliable practice. Tests became easier to write, faster to run, and more trustworthy. The upfront investment—one day of refactoring—paid immediate dividends and continues delivering value with every new test.
Test infrastructure is often neglected, but it's high-leverage work. Better infrastructure means developers write more tests, write them faster, and trust them more. That trust enables confident refactoring, faster development, and fewer production bugs. The foundation matters.