Safe Environment Variable Handling with Feature Flags
Key Takeaway
Missing or malformed environment variables caused Lambda functions to crash during initialization. Implementing safe defaults, explicit validation, and feature flags enabled graceful degradation and staged rollouts, raising the initialization success rate from 85% to 99.5%.
The Problem
Environment variables accessed without validation:
import os

# Raises KeyError if SLACK_WEBHOOK is not set
webhook = os.environ['SLACK_WEBHOOK']
# Raises KeyError if S3_MAX_SIZE is not set, ValueError if it is not a valid float
max_size = float(os.environ['S3_MAX_SIZE'])
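Both failure modes can be reproduced locally. The sketch below is illustrative only: the environment values are made up to trigger each error, and describe_failure is a hypothetical helper, not part of the original code.

```python
import os

# Hypothetical values chosen to trigger each failure; not from a real deployment.
os.environ.pop("SLACK_WEBHOOK", None)   # simulate a missing variable
os.environ["S3_MAX_SIZE"] = "ten"       # simulate a malformed value


def describe_failure(action):
    """Run action and return the name of the exception it raises, or None."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return None


missing = describe_failure(lambda: os.environ["SLACK_WEBHOOK"])
malformed = describe_failure(lambda: float(os.environ["S3_MAX_SIZE"]))
print(missing, malformed)  # KeyError ValueError
```

Neither exception message says which setting the operator needs to fix or what a valid value looks like, which is the gap the validation in the next section closes.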
The Solution
Safe environment variable access:
import json
import logging
import os

logger = logging.getLogger(__name__)


class ConfigurationError(Exception):
    """Raised when required configuration is missing or invalid."""


# monitor_s3 and send_notification are the application's own functions, defined elsewhere.
def lambda_handler(event, context):
    # Required variables with validation
    bucket_name = os.environ.get('S3_BUCKET_NAME')
    if not bucket_name:
        raise ConfigurationError("S3_BUCKET_NAME required")
    # Optional variables with defaults; a malformed value gets a clear error
    try:
        max_size = float(os.environ.get('S3_MAX_SIZE_GB', '1.0'))
    except ValueError:
        raise ConfigurationError("S3_MAX_SIZE_GB must be a number")
    # Feature flags
    enable_monitoring = os.environ.get('ENABLE_S3_MONITORING', 'true').lower() == 'true'
    enable_notifications = os.environ.get('ENABLE_NOTIFICATIONS', 'true').lower() == 'true'
    # Graceful degradation
    if not enable_monitoring:
        logger.info('S3 monitoring disabled')
        return {'statusCode': 200, 'body': 'Monitoring disabled'}
    # Execute with notification fallback
    result = monitor_s3(bucket_name, max_size)
    if enable_notifications:
        try:
            send_notification(result)
        except Exception as e:
            # Don't fail monitoring if notification fails
            logger.error(f'Notification failed: {e}')
    return {'statusCode': 200, 'body': json.dumps(result)}
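The notification fallback in the handler can be factored into a small reusable helper. This is a minimal sketch under stated assumptions: run_with_optional_step, fake_monitor, and failing_notifier are hypothetical names used for illustration, with the last two standing in for monitor_s3 and send_notification.

```python
import logging

logger = logging.getLogger(__name__)


def run_with_optional_step(primary, optional_step, enabled=True):
    """Run primary(), then optional_step(result) if enabled.

    A failure in the optional step is logged and swallowed so the
    primary result is still returned.
    """
    result = primary()
    if enabled:
        try:
            optional_step(result)
        except Exception:
            # logger.exception records the traceback along with the message
            logger.exception("Optional step failed; continuing")
    return result


# Hypothetical stand-ins for monitor_s3 and send_notification
def fake_monitor():
    return {"bucket": "example-bucket", "size_gb": 0.4}


def failing_notifier(result):
    raise RuntimeError("webhook unreachable")


# The notifier fails, but the monitoring result is still returned.
outcome = run_with_optional_step(fake_monitor, failing_notifier)
```

Keeping the try/except around the optional step only, rather than around the whole handler, means failures in the primary work still surface as errors.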
Configuration helper:
import os


# ConfigurationError is the application's own exception for missing or invalid configuration.
class EnvironmentConfig:
    """Safe environment variable access"""

    @staticmethod
    def get_required(key):
        value = os.environ.get(key)
        # Treat an empty string the same as a missing variable
        if not value:
            raise ConfigurationError(f"Required environment variable missing: {key}")
        return value

    @staticmethod
    def get_optional(key, default=None):
        return os.environ.get(key, default)

    @staticmethod
    def get_int(key, default=0):
        value = os.environ.get(key, str(default))
        try:
            return int(value)
        except ValueError:
            raise ConfigurationError(f"Invalid integer for {key}: {value}")

    @staticmethod
    def get_float(key, default=0.0):
        value = os.environ.get(key, str(default))
        try:
            return float(value)
        except ValueError:
            raise ConfigurationError(f"Invalid float for {key}: {value}")

    @staticmethod
    def get_bool(key, default=False):
        # Any value other than true/1/yes (case-insensitive) is treated as False
        value = os.environ.get(key, str(default)).strip().lower()
        return value in ('true', '1', 'yes')
Usage:
config = {
    'bucket': EnvironmentConfig.get_required('S3_BUCKET_NAME'),
    'max_size': EnvironmentConfig.get_float('S3_MAX_SIZE_GB', 1.0),
    'enable_alerts': EnvironmentConfig.get_bool('ENABLE_ALERTS', True)
}
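One possible variation on the usage above is to validate every setting in a single pass and report all problems together, so an operator does not fix one variable only to hit the next error on redeploy. The sketch below is an assumption-laden illustration: MonitorConfig and load_config are hypothetical names, and the env parameter exists only so the function can be exercised without touching the real environment.

```python
import os
from dataclasses import dataclass


class ConfigurationError(Exception):
    """Raised when required configuration is missing or invalid."""


@dataclass(frozen=True)
class MonitorConfig:
    bucket: str
    max_size_gb: float
    enable_alerts: bool


def load_config(env=None):
    """Validate all settings at once and return an immutable config object."""
    env = os.environ if env is None else env
    errors = []

    bucket = env.get("S3_BUCKET_NAME", "")
    if not bucket:
        errors.append("S3_BUCKET_NAME is required")

    raw_size = env.get("S3_MAX_SIZE_GB", "1.0")
    try:
        max_size = float(raw_size)
    except ValueError:
        errors.append(f"S3_MAX_SIZE_GB is not a number: {raw_size!r}")
        max_size = 0.0  # placeholder; never returned because errors is non-empty

    raw_alerts = env.get("ENABLE_ALERTS", "true")
    enable_alerts = raw_alerts.strip().lower() in ("true", "1", "yes")

    if errors:
        # Report every problem in one message
        raise ConfigurationError("; ".join(errors))
    return MonitorConfig(bucket, max_size, enable_alerts)


# Example with an explicit mapping instead of the real environment
config = load_config({"S3_BUCKET_NAME": "example-bucket", "S3_MAX_SIZE_GB": "2.5"})
```

Calling load_config at module import time, outside the handler, would surface configuration errors on cold start rather than partway through an invocation.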
Implementation Details
Feature flag pattern:
import os


class FeatureFlags:
    S3_MONITORING = 'ENABLE_S3_MONITORING'
    SQS_MONITORING = 'ENABLE_SQS_MONITORING'
    FARGATE_MONITORING = 'ENABLE_FARGATE_MONITORING'
    NOTIFICATIONS = 'ENABLE_NOTIFICATIONS'

    @staticmethod
    def is_enabled(flag_name):
        # Flags default to enabled; only the literal 'true' (any case) counts as on
        return os.environ.get(flag_name, 'true').lower() == 'true'


# Usage
if FeatureFlags.is_enabled(FeatureFlags.S3_MONITORING):
    monitor_s3()
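With the pattern above, any value other than 'true' disables the flag, so a typo such as "ture" silently turns a feature off. A stricter parser is one way to catch that. The following is a sketch, not the original implementation: flag_enabled, the accepted value sets, and the env parameter are all assumptions made for illustration.

```python
import os

TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def flag_enabled(name, default=True, env=None):
    """Return the flag's boolean value, rejecting unrecognized strings."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        # Unset or empty: fall back to the default
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"Unrecognized value for feature flag {name}: {raw!r}")


# Example with an explicit mapping instead of the real environment
monitoring_on = flag_enabled("ENABLE_S3_MONITORING", env={"ENABLE_S3_MONITORING": "off"})
```

Whether an unrecognized value should raise or fall back to the default is a design choice; raising favors the fail-fast approach, while falling back favors availability.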
Deployment configuration:
# serverless.yml
functions:
  s3Monitor:
    handler: handlers.s3_monitor
    environment:
      S3_BUCKET_NAME: ${self:custom.s3Bucket}
      S3_MAX_SIZE_GB: ${self:custom.s3MaxSize.${self:provider.stage}}
      ENABLE_S3_MONITORING: ${self:custom.featureFlags.s3Monitoring.${self:provider.stage}}
custom:
  s3MaxSize:
    dev: "1.0"
    staging: "5.0"
    prod: "10.0"
  featureFlags:
    s3Monitoring:
      dev: "true"
      staging: "true"
      prod: "true"
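Because each value is looked up by stage name, a stage missing from one of these maps could leave a variable unresolved at deploy time. A pre-deploy check could flag such gaps. The sketch below assumes the custom section has already been parsed into a plain dictionary; find_missing_stage_values and the STAGES tuple are hypothetical and not part of the Serverless Framework.

```python
STAGES = ("dev", "staging", "prod")


def find_missing_stage_values(node, stages=STAGES, path=""):
    """Return dotted paths of per-stage settings that lack a value for some stage."""
    missing = []
    if not isinstance(node, dict):
        # Plain values (for example a bucket name) are not stage-keyed
        return missing
    if any(stage in node for stage in stages):
        # This dict is keyed by stage name: every stage should be present
        for stage in stages:
            if stage not in node:
                missing.append(f"{path}.{stage}")
        return missing
    for key, child in node.items():
        child_path = f"{path}.{key}" if path else key
        missing.extend(find_missing_stage_values(child, stages, child_path))
    return missing


# Mirrors the custom section above, with prod deliberately left out of one flag
custom = {
    "s3MaxSize": {"dev": "1.0", "staging": "5.0", "prod": "10.0"},
    "featureFlags": {"s3Monitoring": {"dev": "true", "staging": "true"}},
}
gaps = find_missing_stage_values(custom)
print(gaps)  # ['featureFlags.s3Monitoring.prod']
```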
Impact and Results
- Reliability: Initialization failures dropped from 15% to 0.5%
- Deployments: Feature flags enabled canary deployments
- Debugging: Clear error messages for missing configuration
- Flexibility: Easy to disable features without code changes
Lessons Learned
- Never Assume: Environment variables may be missing or malformed
- Provide Defaults: Optional variables should have sensible defaults
- Validate Types: Convert and validate variable types explicitly
- Feature Flags: Enable gradual rollouts and quick feature disabling
- Fail Fast: Validate required configuration at startup
Safe environment variable handling is essential for production Lambda functions. Always validate, provide defaults, and implement graceful degradation for optional features.