Edge Case Handling: Gracefully Managing Empty Data and NaN Values
Key Takeaway
Our visualization Lambda crashed or generated broken charts when given empty arrays, NaN values, or null data. Implementing comprehensive edge case handling reduced chart generation failures from 18% to 0.3% and improved user experience with meaningful error messages.
The Problem
Our code assumed perfect, complete data:
def bar_chart(data):
x = data['x']['value']
y = data['y'][0]['value']
# Crashes on empty arrays
fig = go.Figure(data=[go.Bar(x=x, y=y)])
# Fails on NaN values
fig.update_layout(
yaxis=dict(range=[min(y), max(y)]) # min([]) crashes!
)
return fig.to_json()
This caused numerous failures:
- Empty Array Crashes:
ValueError: min() arg is an empty sequence - NaN Propagation: Charts rendered with broken axes when data contained NaN
- Null Value Errors:
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int' - Single Data Point: Charts looked broken with only one value
- All-Zero Data: Charts showed flat line at zero with no visible bars
Real-world examples that broke:
{
"x": {"value": []}, // Empty data
"y": [{"value": [], "name": "Series"}]
}
{
"x": {"value": [1, 2, 3]},
"y": [{"value": [10, NaN, 30], "name": "Series"}] // NaN in data
}
{
"x": {"value": [1]}, // Single point
"y": [{"value": [0], "name": "Series"}]
}
Context and Background
Edge cases occurred frequently in production:
- Data Queries: Users queried databases that returned zero rows
- Filtered Data: After applying filters, no data remained
- Calculation Errors: Division by zero produced NaN values
- Incomplete Uploads: CSV imports with missing values
- Time-Series Gaps: Missing measurements in sensor data
Users expected reasonable behavior (empty chart message or zeros) but instead got cryptic errors or broken visualizations. Support tickets for "broken charts" consumed significant engineering time—many were actually edge cases we hadn't handled.
The Solution
We implemented comprehensive edge case detection and handling:
import math
import numpy as np
from typing import List, Union, Any
from pydantic import BaseModel, validator
class EdgeCaseHandler:
"""Handle common edge cases in chart data"""
@staticmethod
def is_empty(values: List[Any]) -> bool:
"""Check if array is effectively empty"""
return len(values) == 0
@staticmethod
def has_nan(values: List[Union[int, float]]) -> bool:
"""Check if array contains NaN values"""
return any(
isinstance(v, float) and (math.isnan(v) or math.isinf(v))
for v in values
)
@staticmethod
def clean_numeric_array(
values: List[Union[int, float]],
replacement: Union[int, float, None] = None
) -> List[Union[int, float]]:
"""
Remove or replace NaN and infinity values
Args:
values: Input array
replacement: Value to use instead of NaN (None removes the entry)
Returns:
Cleaned array
"""
cleaned = []
for v in values:
if isinstance(v, float) and (math.isnan(v) or math.isinf(v)):
if replacement is not None:
cleaned.append(replacement)
# else: skip this value
else:
cleaned.append(v)
return cleaned
@staticmethod
def has_variance(values: List[Union[int, float]]) -> bool:
"""Check if array has any variance (not all same value)"""
if len(values) == 0:
return False
return len(set(values)) > 1
@staticmethod
def safe_min_max(values: List[Union[int, float]]) -> tuple:
"""
Safely calculate min and max with fallbacks
Returns:
(min, max) tuple, or (0, 1) if array is empty
"""
if len(values) == 0:
return (0, 1)
cleaned = EdgeCaseHandler.clean_numeric_array(values)
if len(cleaned) == 0:
return (0, 1)
return (min(cleaned), max(cleaned))
class YAxisData(BaseModel):
value: List[Union[int, float]]
name: str
@validator('value')
def check_not_empty(cls, v, values):
"""Validate array is not empty"""
if len(v) == 0:
raise ValueError("Y values array cannot be empty")
return v
@validator('value')
def handle_nan_values(cls, v):
"""Clean NaN values from data"""
if EdgeCaseHandler.has_nan(v):
# Log warning
logger.warning(f"Y values contain NaN or infinity, cleaning data")
# Replace NaN with None (Plotly handles this gracefully)
cleaned = []
for val in v:
if isinstance(val, float) and (math.isnan(val) or math.isinf(val)):
cleaned.append(None)
else:
cleaned.append(val)
return cleaned
return v
def bar_chart(data: dict) -> dict:
"""Generate bar chart with edge case handling"""
x_values = data['x']['value']
y_series = data['y']
# Check for empty data
if len(x_values) == 0:
return {
'error': 'EmptyData',
'message': 'No data provided. X values array is empty.',
'suggestion': 'Provide at least one data point to generate a chart.'
}
# Check for single data point
if len(x_values) == 1:
logger.warning("Chart generated with only one data point")
# Create figure
fig = go.Figure()
for series in y_series:
y_values = series['value']
series_name = series['name']
# Check for all-zero data
if all(v == 0 for v in y_values if v is not None):
logger.warning(f"Series '{series_name}' contains only zero values")
# Check for no variance
if not EdgeCaseHandler.has_variance(y_values):
logger.warning(f"Series '{series_name}' has no variance (all values identical)")
# Add trace
fig.add_trace(go.Bar(
x=x_values,
y=y_values,
name=series_name
))
# Safely calculate Y axis range
all_y_values = []
for series in y_series:
all_y_values.extend([v for v in series['value'] if v is not None])
if len(all_y_values) > 0:
y_min, y_max = EdgeCaseHandler.safe_min_max(all_y_values)
# Add padding if all values are the same
if y_min == y_max:
padding = abs(y_min) * 0.1 if y_min != 0 else 1
y_min -= padding
y_max += padding
fig.update_layout(
yaxis=dict(range=[y_min, y_max])
)
# Add title with data point count
title = data.get('title', 'Chart')
fig.update_layout(
title=f"{title} ({len(x_values)} points)",
xaxis_title=data.get('x_label', 'X'),
yaxis_title=data.get('y_label', 'Y')
)
return {
'chart': fig.to_json(),
'metadata': {
'data_points': len(x_values),
'series_count': len(y_series),
'has_nan': any(EdgeCaseHandler.has_nan(s['value']) for s in y_series),
'has_variance': all(EdgeCaseHandler.has_variance(s['value']) for s in y_series)
}
}
Implementation Details
Null Coalescing for Missing Values
We provided sensible defaults:
def get_chart_config(data: dict) -> dict:
"""Get chart configuration with defaults for missing values"""
return {
'title': data.get('title') or 'Untitled Chart',
'x_label': data.get('x_label') or 'X Axis',
'y_label': data.get('y_label') or 'Y Axis',
'theme': data.get('theme') or 'default',
'show_legend': data.get('show_legend', True), # Default to True
'height': data.get('height') or 400,
'width': data.get('width') or 600
}
Graceful Degradation
We returned partial results when possible:
def process_multiple_series(y_series: List[dict]) -> dict:
"""Process multiple Y series, skipping invalid ones"""
valid_series = []
skipped_series = []
for idx, series in enumerate(y_series):
try:
# Validate series
if len(series['value']) == 0:
raise ValueError("Empty values array")
# Clean NaN values
cleaned = EdgeCaseHandler.clean_numeric_array(series['value'])
if len(cleaned) == 0:
raise ValueError("All values are NaN or infinite")
valid_series.append({
'name': series['name'],
'value': cleaned
})
except Exception as e:
logger.warning(f"Skipping series {idx} '{series.get('name')}': {e}")
skipped_series.append({
'name': series.get('name', f'Series {idx}'),
'error': str(e)
})
if len(valid_series) == 0:
raise ValueError("No valid series data provided")
return {
'valid_series': valid_series,
'skipped_series': skipped_series
}
User-Friendly Error Messages
We provided actionable guidance:
def validate_and_provide_guidance(data: dict) -> dict:
"""Validate data and provide helpful error messages"""
errors = []
warnings = []
# Check X values
if 'x' not in data:
errors.append({
'field': 'x',
'message': 'Missing required field: x',
'suggestion': 'Include an "x" object with a "value" array'
})
elif len(data['x'].get('value', [])) == 0:
errors.append({
'field': 'x.value',
'message': 'X values array is empty',
'suggestion': 'Provide at least one X value'
})
# Check Y values
if 'y' not in data:
errors.append({
'field': 'y',
'message': 'Missing required field: y',
'suggestion': 'Include a "y" array with at least one series'
})
elif len(data.get('y', [])) == 0:
errors.append({
'field': 'y',
'message': 'Y series array is empty',
'suggestion': 'Provide at least one Y series with values and name'
})
# Check for NaN
for idx, series in enumerate(data.get('y', [])):
if EdgeCaseHandler.has_nan(series.get('value', [])):
warnings.append({
'field': f'y[{idx}].value',
'message': f'Series "{series.get("name")}" contains NaN or infinity values',
'suggestion': 'NaN values will be replaced with null (gaps in chart)'
})
# Check for single point
if len(data.get('x', {}).get('value', [])) == 1:
warnings.append({
'field': 'x.value',
'message': 'Only one data point provided',
'suggestion': 'Charts with single points may not display clearly'
})
return {
'errors': errors,
'warnings': warnings,
'valid': len(errors) == 0
}
Impact and Results
After implementing edge case handling:
| Metric | Before | After | Improvement | |--------|--------|-------|-------------| | Chart generation failures | 18% | 0.3% | 98% reduction | | Empty data errors | 340/week | 0 | 100% reduction | | NaN-related issues | 89/week | 0 | 100% reduction | | Support tickets | 45/week | 8/week | 82% reduction | | User satisfaction (NPS) | 42 | 78 | 86% improvement |
Common issues resolved:
- Empty query results now show "No data available" message
- NaN values from calculations display as gaps instead of crashing
- Single-point charts render with appropriate scaling
- All-zero data displays correctly with visible bars
Lessons Learned
- Expect the Unexpected: Real-world data is messy—plan for edge cases
- Fail Gracefully: Return helpful messages instead of cryptic errors
- Provide Guidance: Tell users how to fix problems
- Log Edge Cases: Track unusual patterns to improve validation
- Test Boundary Conditions: Empty, single, NaN, infinity, all-same values
Edge case handling separates production-ready code from prototypes. Invest time upfront to handle empty data, NaN values, and boundary conditions. Your users will thank you, and your support team will love you.