Configuration Management: Eliminating Hardcoded Values for Flexible Deployments
Key Takeaway
Our visualization service had hardcoded values scattered throughout the code—thresholds, limits, URLs, and settings—making it impossible to configure behavior without code changes. Implementing centralized configuration management with environment variables and parameter stores reduced deployment time from 45 minutes to 2 minutes and enabled per-environment customization.
The Problem
Configuration values were hardcoded directly in the code:
```python
def generate_chart(data):
    # Hardcoded limits
    MAX_POINTS = 10000
    MAX_SERIES = 10
    TIMEOUT_SECONDS = 30

    # Hardcoded URLs
    STORAGE_URL = "https://s3.amazonaws.com/prod-charts"

    # Hardcoded behavior
    ENABLE_CACHING = True
    CACHE_TTL = 3600

    # Hardcoded AWS config
    S3_BUCKET = "spatialx-charts-prod"
    REGION = "us-east-1"

    if len(data['x']) > MAX_POINTS:
        raise ValueError(f"Too many points (max {MAX_POINTS})")

    # ... chart generation logic
```
This created serious problems:
- Code Changes for Config: Had to modify code and redeploy to change limits
- No Environment Differences: Dev, staging, and prod had identical limits
- No Runtime Changes: Couldn't adjust thresholds without redeployment
- Testing Difficulties: Couldn't override values for testing
- Magic Numbers: Configuration scattered across dozens of files
Real scenario: to raise the production point limit from 10,000 to 15,000, we had to:
1. Modify the code
2. Run tests
3. Deploy to staging
4. Test in staging
5. Deploy to production

Total time: 45 minutes.
Context and Background
Our service was deployed across multiple environments:
- Development: Local testing with small datasets
- Staging: QA environment with production-like data
- Production: Live environment serving customers
Each environment needed different configurations:
- Dev: Low limits, fast feedback, verbose logging
- Staging: Production-like limits, moderate logging
- Production: High limits, minimal logging, caching enabled
Hardcoded values meant we either:
- Used production values everywhere (slow dev, expensive testing)
- Maintained separate code branches per environment (nightmare)
- Changed values manually before each deployment (error-prone)
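The core idea we landed on is simple: build the effective configuration by layering sources, with higher-priority sources overwriting lower ones. A minimal, stdlib-only sketch of that layering (the function name and keys here are illustrative, not part of the actual service):

```python
import os

def layered_config(defaults: dict, file_config: dict, env_prefix: str = "CHART_") -> dict:
    """Merge config sources; later layers win: defaults < file < environment."""
    config = dict(defaults)
    config.update(file_config)
    # Environment variables override everything; strip the prefix, lowercase the key
    for name, raw in os.environ.items():
        if name.startswith(env_prefix):
            config[name[len(env_prefix):].lower()] = raw
    return config

defaults = {"max_data_points": 10000, "log_level": "INFO"}
file_config = {"max_data_points": 5000}       # e.g. from config/dev.json
os.environ["CHART_LOG_LEVEL"] = "DEBUG"       # env override wins over both

effective = layered_config(defaults, file_config)
```

The real implementation below follows the same precedence order, with validation and a Parameter Store layer added on top.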
The Solution
We implemented centralized configuration management:
```python
import os
import json
import logging
from typing import Optional

import boto3
from pydantic import BaseSettings, Field, validator

logger = logging.getLogger(__name__)


class ChartConfiguration(BaseSettings):
    """Chart generation configuration"""

    # Data limits
    max_data_points: int = Field(default=10000, ge=100, le=1000000)
    max_series_count: int = Field(default=10, ge=1, le=100)
    max_string_length: int = Field(default=200, ge=10, le=1000)

    # Performance
    timeout_seconds: int = Field(default=30, ge=5, le=300)
    enable_caching: bool = Field(default=True)
    cache_ttl_seconds: int = Field(default=3600, ge=60, le=86400)

    # Memory management
    memory_limit_mb: int = Field(default=512, ge=128, le=3072)
    enable_streaming: bool = Field(default=False)
    streaming_chunk_size: int = Field(default=1000, ge=100, le=10000)

    # AWS configuration
    s3_bucket: str
    s3_region: str = Field(default="us-east-1")
    storage_url_base: str

    # Feature flags
    enable_sampling: bool = Field(default=True)
    enable_compression: bool = Field(default=True)
    enable_metrics: bool = Field(default=True)

    # Logging
    log_level: str = Field(default="INFO")
    enable_debug_mode: bool = Field(default=False)

    @validator('log_level')
    def validate_log_level(cls, v):
        """Ensure valid log level"""
        valid_levels = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']
        if v.upper() not in valid_levels:
            raise ValueError(f"Invalid log level. Must be one of {valid_levels}")
        return v.upper()

    class Config:
        # BaseSettings reads CHART_-prefixed environment variables and .env files
        env_prefix = 'CHART_'
        env_file = '.env'
        env_file_encoding = 'utf-8'


class ConfigurationManager:
    """Manage configuration from multiple sources"""

    def __init__(self):
        self._config: Optional[ChartConfiguration] = None
        self._ssm_client = None

    def get_config(self) -> ChartConfiguration:
        """
        Get configuration with precedence:
        1. Environment variables (highest priority)
        2. AWS Parameter Store
        3. Config file
        4. Defaults (lowest priority)
        """
        if self._config is not None:
            return self._config

        # Start empty; model defaults cover anything not set below
        config_dict = {}

        # Load from config file if it exists
        config_file = os.getenv('CONFIG_FILE', 'config.json')
        if os.path.exists(config_file):
            with open(config_file) as f:
                config_dict.update(json.load(f))

        # Load from AWS Parameter Store (for sensitive values)
        if os.getenv('USE_PARAMETER_STORE', 'false').lower() == 'true':
            config_dict.update(self._load_from_parameter_store())

        # Override with environment variables (highest priority)
        config_dict.update(self._load_from_environment())

        # Create and cache the validated configuration object
        self._config = ChartConfiguration(**config_dict)
        return self._config

    def _load_from_environment(self) -> dict:
        """Load configuration from environment variables"""
        config = {}
        # Map environment variables to config fields
        env_mappings = {
            'CHART_MAX_DATA_POINTS': ('max_data_points', int),
            'CHART_MAX_SERIES_COUNT': ('max_series_count', int),
            'CHART_TIMEOUT_SECONDS': ('timeout_seconds', int),
            'CHART_ENABLE_CACHING': ('enable_caching', lambda x: x.lower() == 'true'),
            'CHART_CACHE_TTL_SECONDS': ('cache_ttl_seconds', int),
            'CHART_S3_BUCKET': ('s3_bucket', str),
            'CHART_S3_REGION': ('s3_region', str),
            'CHART_STORAGE_URL_BASE': ('storage_url_base', str),
            'CHART_LOG_LEVEL': ('log_level', str),
            'CHART_ENABLE_DEBUG_MODE': ('enable_debug_mode', lambda x: x.lower() == 'true'),
        }
        for env_var, (config_key, converter) in env_mappings.items():
            value = os.getenv(env_var)
            if value is not None:
                try:
                    config[config_key] = converter(value)
                except ValueError as e:
                    logger.warning(f"Invalid value for {env_var}: {value} - {e}")
        return config

    def _load_from_parameter_store(self) -> dict:
        """Load configuration from AWS Systems Manager Parameter Store"""
        if self._ssm_client is None:
            self._ssm_client = boto3.client('ssm')

        config = {}
        # Load parameters from a path (pagination omitted for brevity)
        parameter_path = os.getenv('PARAMETER_STORE_PATH', '/spatialx/charts/')
        try:
            response = self._ssm_client.get_parameters_by_path(
                Path=parameter_path,
                Recursive=True,
                WithDecryption=True  # Decrypt SecureString parameters
            )
            for param in response['Parameters']:
                # Extract key from parameter name:
                # /spatialx/charts/max_data_points -> max_data_points
                key = param['Name'].replace(parameter_path, '')
                value = param['Value']
                # Try to parse as JSON for complex values
                try:
                    value = json.loads(value)
                except (json.JSONDecodeError, ValueError):
                    pass  # Keep as string
                config[key] = value
        except Exception as e:
            logger.error(f"Failed to load from Parameter Store: {e}")
        return config

    def reload_config(self):
        """Clear the cached config so the next get_config() re-reads all sources"""
        self._config = None


# Global configuration manager
config_manager = ConfigurationManager()


def get_config() -> ChartConfiguration:
    """Get current configuration"""
    return config_manager.get_config()


# Usage in chart generation
def generate_chart(data: dict) -> dict:
    """Generate chart using configuration"""
    config = get_config()

    # Use configuration values instead of hardcoded ones
    if len(data['x']['value']) > config.max_data_points:
        raise ValueError(
            f"Too many data points: {len(data['x']['value'])} "
            f"(max {config.max_data_points})"
        )
    if len(data['y']) > config.max_series_count:
        raise ValueError(
            f"Too many series: {len(data['y'])} "
            f"(max {config.max_series_count})"
        )

    # Apply configuration-driven behavior
    if config.enable_sampling and len(data['x']['value']) > 5000:
        data = downsample_data(data, config.max_data_points)

    # Generate chart
    chart = create_plotly_chart(data)

    # Store in S3 using the configured bucket
    if config.enable_caching:
        store_chart(
            chart,
            bucket=config.s3_bucket,
            region=config.s3_region,
            ttl=config.cache_ttl_seconds
        )
    return chart


def lambda_handler(event, context):
    """Handler with configuration-based behavior"""
    config = get_config()

    # Configure logging from config
    logging.basicConfig(level=config.log_level)
    if config.enable_debug_mode:
        logger.debug(f"Configuration: {config.dict()}")

    # The Lambda timeout itself is fixed at deploy time; treat the configured
    # value as an internal deadline for long-running work
    deadline_ms = config.timeout_seconds * 1000
    # ... rest of handler logic
```
Implementation Details
Environment-Specific Configuration Files
We created config files for each environment:
config/dev.json:

```json
{
  "max_data_points": 5000,
  "enable_caching": false,
  "log_level": "DEBUG",
  "enable_debug_mode": true,
  "s3_bucket": "spatialx-charts-dev",
  "storage_url_base": "https://dev-charts.spatialx.com"
}
```

config/staging.json:

```json
{
  "max_data_points": 15000,
  "enable_caching": true,
  "cache_ttl_seconds": 1800,
  "log_level": "INFO",
  "s3_bucket": "spatialx-charts-staging",
  "storage_url_base": "https://staging-charts.spatialx.com"
}
```

config/prod.json:

```json
{
  "max_data_points": 50000,
  "enable_caching": true,
  "cache_ttl_seconds": 3600,
  "log_level": "WARNING",
  "enable_metrics": true,
  "s3_bucket": "spatialx-charts-prod",
  "storage_url_base": "https://charts.spatialx.com"
}
```
Serverless Framework Integration
We configured deployments to use environment-specific configs:
```yaml
# serverless.yml
service: visualization-service

provider:
  name: aws
  runtime: python3.9
  environment:
    ENVIRONMENT: ${opt:stage, 'dev'}
    CONFIG_FILE: config/${opt:stage, 'dev'}.json
    USE_PARAMETER_STORE: ${env:USE_PARAMETER_STORE, 'false'}
    PARAMETER_STORE_PATH: /spatialx/charts/${opt:stage, 'dev'}/

functions:
  generateChart:
    handler: handler.lambda_handler
    # Fall back to the model defaults if a config file omits these keys
    memorySize: ${file(config/${opt:stage, 'dev'}.json):memory_limit_mb, 512}
    timeout: ${file(config/${opt:stage, 'dev'}.json):timeout_seconds, 30}
    environment:
      CHART_S3_BUCKET: ${file(config/${opt:stage, 'dev'}.json):s3_bucket}
      CHART_LOG_LEVEL: ${file(config/${opt:stage, 'dev'}.json):log_level}

# Deploy with: serverless deploy --stage prod
```
Feature Flags
We implemented feature flags for gradual rollout:
```python
import hashlib
import os

class FeatureFlags:
    """Feature flags with percentage-based rollout"""

    @staticmethod
    def is_enabled(feature: str, user_id: str = None) -> bool:
        """Check if feature is enabled for user"""
        config = get_config()

        # Check global flag
        feature_key = f"enable_{feature}"
        if not getattr(config, feature_key, False):
            return False

        # If no user-based rollout, return global flag
        if user_id is None:
            return True

        # Percentage-based rollout
        rollout_percentage = os.getenv(
            f'FEATURE_{feature.upper()}_ROLLOUT_PERCENTAGE',
            '100'
        )

        # Hash user ID to get a consistent bucket in [0, 100)
        hash_value = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
        user_percentage = hash_value % 100
        return user_percentage < int(rollout_percentage)

# Usage
if FeatureFlags.is_enabled('sampling', user_id=request.user_id):
    data = downsample_data(data)
```
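The hash-based bucketing above is deterministic: the same user always lands in the same bucket, so a user never flips in and out of a feature between requests, and raising the rollout percentage only ever adds users. A small self-contained check of both properties (`rollout_bucket` is an illustrative standalone version of the hashing step):

```python
import hashlib

def rollout_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in [0, 100)."""
    return int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100

# Same user -> same bucket, every time
assert rollout_bucket("user-42") == rollout_bucket("user-42")

# Raising the rollout percentage is monotonic: everyone enabled at 25%
# (bucket < 25) is still enabled at 50% (bucket < 50)
users = ("a", "b", "c", "d")
enabled_at_25 = {u for u in users if rollout_bucket(u) < 25}
enabled_at_50 = {u for u in users if rollout_bucket(u) < 50}
assert enabled_at_25 <= enabled_at_50
```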
Runtime Configuration Updates
We added the ability to update configuration at runtime, without redeployment:
```python
def update_configuration_handler(event, context):
    """Admin endpoint to update configuration"""
    # Validate admin permissions
    if not is_admin(event):
        return {'statusCode': 403, 'body': 'Forbidden'}

    # Parse updates
    updates = json.loads(event['body'])

    # Write each value to Parameter Store
    ssm = boto3.client('ssm')
    for key, value in updates.items():
        parameter_name = f"/spatialx/charts/{os.getenv('ENVIRONMENT')}/{key}"
        ssm.put_parameter(
            Name=parameter_name,
            Value=json.dumps(value),
            Type='String',
            Overwrite=True
        )

    # Clear the configuration cache so the next request picks up new values
    config_manager.reload_config()

    return {
        'statusCode': 200,
        'body': json.dumps({'message': 'Configuration updated'})
    }
```
Impact and Results
After implementing configuration management:
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Config change deployment time | 45 min | 2 min | 96% faster |
| Environment-specific configs | 0 | 3 | Full coverage |
| Runtime config updates | Not possible | Yes | New capability |
| Configuration errors | 12/month | 1/month | 92% reduction |
| Testing flexibility | Low | High | Dramatic improvement |
Configuration changes per environment (monthly):
- Dev: 15-20 changes (fast iteration)
- Staging: 5-8 changes (testing)
- Production: 2-3 changes (tuning)
All done without code changes or redeployments!
Lessons Learned
- Externalize All Config: If it might change, it should be configurable
- Environment Variables First: Simplest solution for most cases
- Use Parameter Store for Secrets: Never hardcode credentials
- Validate Configuration: Use Pydantic or similar for type safety
- Feature Flags Enable Gradual Rollout: Test new features with small percentage first
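The testing point deserves a concrete illustration: once every limit comes from the environment-variable layer, a test can override it temporarily and the original environment is restored automatically. A sketch using only the standard library (`max_points` is a stand-in for however the service reads the `CHART_MAX_DATA_POINTS` variable):

```python
import os
from unittest import mock

def max_points() -> int:
    """Read the limit the way the service does: env var with a default."""
    return int(os.getenv("CHART_MAX_DATA_POINTS", "10000"))

# Override the limit just for the duration of a test; patch.dict restores
# the original environment when the block exits
with mock.patch.dict(os.environ, {"CHART_MAX_DATA_POINTS": "50"}):
    assert max_points() == 50
```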
Configuration management is the difference between rigid and flexible systems. Externalizing configuration enables environment-specific behavior, runtime tuning, and feature flags—all without code changes. The combination of environment variables, Parameter Store, and Pydantic validation provides production-grade configuration management.