Type Validation: Catching Runtime Type Errors Before They Crash Production

Key Takeaway

Our visualization service crashed with TypeError when clients sent numbers as strings or wrong data structures. Implementing runtime type validation with Pydantic reduced type-related crashes by 95% and improved API contract clarity for client teams.

The Problem

Our code assumed data types without validation:

import plotly.graph_objects as go

def bar_chart(data):
    x = data['x']['value']  # Assumes a list of x values
    y = data['y'][0]['value']  # Assumes a list of numbers

    # No type checks: strings, None, or other wrong types fail inside Plotly
    fig = go.Figure(data=[go.Bar(x=x, y=y)])
    return fig.to_json()

This caused multiple failures:

  1. Runtime TypeErrors: TypeError: 'str' object is not iterable when iterating over non-lists
  2. Silent Failures: Plotly silently failed on incompatible types
  3. Confusing Errors: "Cannot convert string to float" deep in Plotly stack
  4. No Type Contract: Clients didn't know what types to send
  5. Debugging Hell: Stack traces pointed to Plotly internals, not our code

Common failing inputs:

{
  "x": {"value": "1,2,3,4,5"},  // String instead of array
  "y": [{"value": "[10, 20, 30]", "name": "Series1"}]  // Stringified array
}
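
Part of what makes these payloads dangerous is that a Python string is itself iterable, so code expecting a list can run without raising and quietly produce nonsense. A minimal stdlib-only sketch using the payload above (variable names are illustrative, not from the service):

```python
import json

# The failing payload from above, without the explanatory comments.
raw = '{"x": {"value": "1,2,3,4,5"}, "y": [{"value": "[10, 20, 30]", "name": "Series1"}]}'
payload = json.loads(raw)

x = payload["x"]["value"]
y = payload["y"][0]["value"]

# Both fields arrive as plain strings, not lists.
print(type(x).__name__, type(y).__name__)

# Iterating a string yields single characters, so the x "list"
# turns into one-character items, commas included.
x_items = list(x)
print(len(x_items), x_items[:3])

# The stringified array only becomes a list after a second, explicit parse.
y_items = json.loads(y)
print(y_items)
```

Nothing here raises, which is why the failure only surfaces later, far from the code that accepted the input.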

Context and Background

Different clients integrated with our API:

  • JavaScript/TypeScript frontends (type-aware)
  • Python backends (dynamic typing)
  • Excel/CSV imports (everything as strings)
  • Third-party integrations (unknown)

Without explicit type validation, clients made incorrect assumptions about our API contract. CSV-based imports were particularly problematic: Excel exports often converted arrays to comma-separated strings, causing silent failures that users attributed to "broken charts."

The Solution

We implemented Pydantic models for comprehensive type validation:

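# Written against the Pydantic v1 API (validator, min_items, regex, schema_extra)
import json
import logging
from typing import List, Optional, Union

import numpy as np
from pydantic import BaseModel, Field, ValidationError, validator

logger = logging.getLogger(__name__)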

class XAxisData(BaseModel):
    """X-axis data structure"""
    value: List[Union[int, float, str]] = Field(..., min_items=1, max_items=10000)
    label: Optional[str] = None

    @validator('value')
    def validate_value_types(cls, v):
        """Ensure all values are valid types"""
        for idx, item in enumerate(v):
            if not isinstance(item, (int, float, str)):
                raise ValueError(
                    f"X value at index {idx} must be number or string, got {type(item).__name__}"
                )
        return v

    @validator('value')
    def check_no_nan(cls, v):
        """Check for NaN or infinity"""
        for idx, item in enumerate(v):
            if isinstance(item, float):
                if np.isnan(item):
                    raise ValueError(f"X value at index {idx} is NaN")
                if np.isinf(item):
                    raise ValueError(f"X value at index {idx} is infinity")
        return v

class YAxisData(BaseModel):
    """Y-axis data structure"""
    value: List[Union[int, float]] = Field(..., min_items=1, max_items=10000)
    name: str = Field(..., min_length=1, max_length=100)
    color: Optional[str] = None

    @validator('value')
    def validate_numeric_only(cls, v):
        """Y values must be numeric"""
        for idx, item in enumerate(v):
            if not isinstance(item, (int, float)):
                raise ValueError(
                    f"Y value at index {idx} must be numeric, got {type(item).__name__}: {item}"
                )
            if isinstance(item, float):
                if np.isnan(item):
                    raise ValueError(f"Y value at index {idx} is NaN")
                if np.isinf(item):
                    raise ValueError(f"Y value at index {idx} is infinity")
        return v

class BarChartRequest(BaseModel):
    """Complete bar chart request validation"""
    x: XAxisData
    y: List[YAxisData] = Field(..., min_items=1, max_items=10)
    title: Optional[str] = Field(None, max_length=200)
    theme: Optional[str] = Field('default', regex='^(default|dark|light)$')

    @validator('y')
    def validate_array_lengths(cls, y_data, values):
        """Ensure all Y arrays match X length"""
        if 'x' not in values:
            return y_data

        x_length = len(values['x'].value)

        for idx, y_item in enumerate(y_data):
            if len(y_item.value) != x_length:
                raise ValueError(
                    f"Y series '{y_item.name}' has {len(y_item.value)} values, "
                    f"but X has {x_length} values. Arrays must be same length."
                )

        return y_data

    class Config:
        # Provide example in schema
        schema_extra = {
            "example": {
                "x": {"value": [1, 2, 3, 4, 5], "label": "X Axis"},
                "y": [{"value": [10, 20, 30, 40, 50], "name": "Series 1"}],
                "title": "Sample Chart",
                "theme": "default"
            }
        }

def lambda_handler(event, context):
    try:
        # Parse JSON body
        body = json.loads(event['body'])

        # Validate types - Pydantic does all the work
        request = BarChartRequest(**body)

        # Generate chart with validated data
        chart = generate_bar_chart(request)

        return {
            'statusCode': 200,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({'chart': chart})
        }

    except ValidationError as e:
        # Pydantic provides detailed error information
        errors = []
        for error in e.errors():
            errors.append({
                'field': '.'.join(str(x) for x in error['loc']),
                'message': error['msg'],
                'type': error['type']
            })

        return {
            'statusCode': 400,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({
                'error': 'ValidationError',
                'message': 'Request validation failed',
                'errors': errors
            })
        }

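    except Exception:
        # Anything else is a server-side bug: log it and hide details from the client
        logger.exception("Unexpected error")
        return {
            'statusCode': 500,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({'error': 'Internal server error'})
        }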
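
The NaN and infinity checks in the models matter because Python's `json.loads` accepts the non-standard literals `NaN`, `Infinity` and `-Infinity` by default. One possible complement, sketched here with hypothetical helper names (`reject_constant`, `parse_strict`), is to refuse those literals at parse time through the `parse_constant` hook:

```python
import json


def reject_constant(name):
    """Called by json.loads for the literals NaN, Infinity and -Infinity."""
    raise ValueError(f"Non-finite number not allowed in request body: {name}")


def parse_strict(body):
    """Parse a JSON string, refusing NaN and infinities at parse time."""
    return json.loads(body, parse_constant=reject_constant)


# A well-formed body parses as usual.
ok = parse_strict('{"y": [1, 2.5, 3]}')

# A body containing NaN is rejected before any model sees it.
try:
    parse_strict('{"y": [1, NaN, 3]}')
    rejected = False
except ValueError as exc:
    rejected = True
    reason = str(exc)
```

This does not replace the model-level checks: an overflowing literal such as `1e999` still parses to infinity without going through `parse_constant`. A handler using this approach would also need to map the resulting `ValueError` to a 400 response rather than letting it fall through to the generic 500 branch.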

Implementation Details

Auto-Coercion for Common Cases

Pydantic can automatically coerce types:

class CoerciveYAxisData(BaseModel):
    """Y-axis with automatic type coercion"""
    value: List[float]  # Automatically converts "10" -> 10.0
    name: str

    @validator('value', pre=True)
    def coerce_to_float_list(cls, v):
        """Handle common type conversion issues"""
        # Handle stringified JSON arrays
        if isinstance(v, str):
            try:
                v = json.loads(v)
            except json.JSONDecodeError:
                raise ValueError(f"Cannot parse Y values: {v}")

        # Handle single value instead of array
        if not isinstance(v, list):
            v = [v]

        # Convert each item to float
        result = []
        for idx, item in enumerate(v):
            try:
                result.append(float(item))
            except (ValueError, TypeError):
                raise ValueError(
                    f"Cannot convert Y value at index {idx} to number: {item}"
                )

        return result

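# Now accepts: "[10, 20, 30]" (stringified JSON), [10, 20, 30], or ["10", "20", "30"]
# A bare comma-separated string such as "10,20,30" is not valid JSON and is still rejected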
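
A bare comma-separated cell such as "10,20,30", the shape typical of CSV exports, is not valid JSON, so `json.loads` alone raises on it. If that input should be accepted, one option is a split-on-comma fallback that logs whenever it coerces. The following is a stdlib-only sketch, not the service's code, and `coerce_number_list` is a hypothetical name:

```python
import json
import logging

logger = logging.getLogger(__name__)


def coerce_number_list(raw):
    """Best-effort conversion of loosely typed input into a list of floats."""
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)  # handles "[10, 20, 30]" and "10"
        except json.JSONDecodeError:
            # Fallback for CSV-style cells such as "10,20,30".
            raw = [part.strip() for part in raw.split(",")]
        # Log every coercion so misbehaving clients stay visible.
        logger.warning("Coerced string input into a list of numbers")
    if not isinstance(raw, list):
        raw = [raw]  # single value instead of a list
    try:
        return [float(item) for item in raw]
    except (ValueError, TypeError) as exc:
        raise ValueError(f"Cannot convert values to numbers: {raw!r}") from exc
```

The same logic could live inside a `pre=True` validator. Keeping it as a plain function makes it easy to unit test on its own.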

OpenAPI Schema Generation

Pydantic models generate OpenAPI schemas automatically:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

@app.post("/chart/bar")
def create_bar_chart(request: BarChartRequest):
    """
    Generate a bar chart from provided data.

    The API automatically generates documentation from Pydantic models.
    """
    return generate_bar_chart(request)

# FastAPI automatically creates:
# - Interactive docs at /docs
# - OpenAPI schema at /openapi.json
# - Request validation
# - Response serialization

Custom Validators for Business Rules

We added domain-specific validation:

class ChartRequest(BaseModel):
    x: XAxisData
    y: List[YAxisData]

    @validator('x')
    def check_x_uniqueness(cls, v):
        """Warn about duplicate X values"""
        values = v.value
        if len(values) != len(set(values)):
            # Don't fail, but log warning
            logger.warning("X values contain duplicates")
        return v

    @validator('y')
    def check_y_variance(cls, y_data):
        """Warn if all Y values are identical"""
        for y in y_data:
            if len(set(y.value)) == 1:
                logger.warning(
                    f"Y series '{y.name}' has no variance (all values are {y.value[0]})"
                )
        return y_data

    @validator('y')
    def check_reasonable_scale(cls, y_data):
        """Warn about extreme value ranges"""
        for y in y_data:
            min_val = min(y.value)
            max_val = max(y.value)
            if max_val > 0 and min_val / max_val < 0.0001:
                logger.warning(
                    f"Y series '{y.name}' has extreme range: {min_val} to {max_val}"
                )
        return y_data
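
Note that the ratio test in `check_reasonable_scale` also fires whenever the minimum is zero or negative (for example a series running from 0 to 10), which may or may not be intended. One possible variant compares the magnitudes of the nonzero values instead. This is a sketch with a hypothetical helper name (`has_extreme_range`) and a default threshold assumed to match the 0.0001 used above:

```python
def has_extreme_range(values, ratio_threshold=1e-4):
    """Return True when the smallest nonzero magnitude is tiny next to the largest."""
    # Zeros are ignored so that a series starting at 0 is not flagged by itself.
    magnitudes = [abs(v) for v in values if v != 0]
    if len(magnitudes) < 2:
        return False
    return min(magnitudes) / max(magnitudes) < ratio_threshold


# A series from 0 to 10 is not flagged; one spanning seven orders of magnitude is.
ordinary = has_extreme_range([0, 5, 10])
extreme = has_extreme_range([0.0001, 1000])
```

Inside a validator, the boolean result would decide whether to emit the warning, leaving the request itself untouched.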

Type Documentation

We generated type definitions for clients:

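# Export TypeScript definitions
from pydantic2ts import generate_typescript_defs

# Module path and output file are placeholders; adjust them to the project layout
generate_typescript_defs("visualization.models", "./frontend/chartTypes.ts")

# Generates: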
# export interface XAxisData {
#   value: (number | string)[];
#   label?: string;
# }
#
# export interface YAxisData {
#   value: number[];
#   name: string;
#   color?: string;
# }
#
# export interface BarChartRequest {
#   x: XAxisData;
#   y: YAxisData[];
#   title?: string;
#   theme?: "default" | "dark" | "light";
# }

Impact and Results

After implementing type validation:

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Type-related errors | 340/week | 18/week | 95% reduction |
| Invalid request rate | 12% | 0.8% | 93% reduction |
| Client integration time | 3-5 days | 4-6 hours | 85% faster |
| Support: "Why isn't this working?" | 25/week | 2/week | 92% reduction |
| Time to identify bad data | 30 min | 0 sec | Instant feedback |

Clear error messages helped developers immediately:

Before:

TypeError: 'str' object is not iterable
  at plotly/graph_objs/_bar.py line 234

After:

{
  "error": "ValidationError",
  "message": "Request validation failed",
  "errors": [
    {
      "field": "y.0.value.3",
      "message": "Y value at index 3 must be numeric, got str: 'N/A'",
      "type": "value_error"
    }
  ]
}
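
Because the error body has a fixed shape, a client can turn it into readable messages with very little code. A minimal sketch of what a Python client might do, assuming the response format shown above (`summarize_validation_errors` is a hypothetical helper, not part of the service):

```python
import json


def summarize_validation_errors(response_body):
    """Turn the service's 400 response body into one readable line per error."""
    payload = json.loads(response_body)
    return [f"{err['field']}: {err['message']}" for err in payload.get("errors", [])]


# Sample body mirroring the "After" response above.
sample_body = json.dumps({
    "error": "ValidationError",
    "message": "Request validation failed",
    "errors": [
        {
            "field": "y.0.value.3",
            "message": "Y value at index 3 must be numeric, got str: 'N/A'",
            "type": "value_error",
        }
    ],
})

lines = summarize_validation_errors(sample_body)
for line in lines:
    print(line)
```

The dotted `field` path points at the exact series and index, so the client can highlight the offending cell instead of showing a generic failure.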

Lessons Learned

  1. Validate at the Boundary: Type-check all inputs before processing
  2. Use Type Libraries: Pydantic, marshmallow, or similar save enormous effort
  3. Fail Early with Context: Tell users exactly what's wrong and where
  4. Generate Documentation: Derive API docs from validation schemas
  5. Coerce Carefully: Auto-convert common mistakes, but log warnings

Runtime type validation is essential in dynamically typed languages like Python. Pydantic provides production-grade validation with minimal code and clear, field-level error messages. For us, the investment paid back immediately in fewer errors and faster client integration.