CircleCI Docker Image Compatibility: Fixing Python and System Dependencies

Key Takeaway

Our CircleCI pipeline ran on a stock Python image that lacked the system libraries required for WSI processing (libopenslide, libjpeg, libtiff). Every build had to install them with apt-get, and builds failed with "library not found" errors whenever that step did not complete. Switching to a custom Docker image with the dependencies pre-installed cut build time from 12 minutes to 3 minutes and reduced the build failure rate from 8% to 0.5%.

The Problem

Our CI pipeline installed the system dependencies from scratch on every build:

# .circleci/config.yml
jobs:
  test:
    docker:
      - image: python:3.9  # Missing system libraries!

    steps:
      - checkout
      - run:
          name: Install system dependencies
          command: |
            apt-get update
            apt-get install -y libopenslide-dev libjpeg-dev libtiff-dev
            # Fails intermittently, slow downloads

      - run:
          name: Install Python dependencies
          command: pip install -r requirements.txt
          # openslide-python fails if system libs missing

Issues:

  1. Slow Builds: Installing system packages took 6-8 minutes per build
  2. Intermittent Failures: apt-get occasionally timed out on slow package downloads
  3. Version Inconsistency: Unpinned apt packages meant different builds could get different library versions
  4. Build Breaking: openslide-python failed whenever the system libraries were unavailable
  5. Wasted CI Credits: Long builds and reruns of failed builds consumed CircleCI credits
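
Issue 4 tends to surface as an opaque error deep inside the pip or test step. A small preflight check can name the missing library explicitly. The sketch below is not part of the original pipeline: the function name and the short library names (`openslide`, `jpeg`, `tiff`) are assumptions, and `ctypes.util.find_library` depends on the platform's linker configuration, so treat a result as a hint and not as proof.

```python
import ctypes.util

# Short names as the dynamic linker knows them (assumed for this sketch).
REQUIRED_LIBS = ("openslide", "jpeg", "tiff")


def missing_libraries(names=REQUIRED_LIBS, finder=ctypes.util.find_library):
    """Return the names for which no shared library could be located."""
    return [name for name in names if finder(name) is None]


def report(missing):
    """Build a one-line, human-readable summary for the CI log."""
    if missing:
        return "Missing system libraries: " + ", ".join(missing)
    return "All required system libraries found"
```

Running `print(report(missing_libraries()))` as the first step of a job, and failing the job when the list is non-empty, would turn a late import error into an early and readable one.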

The Solution

We created a custom Docker image with all dependencies pre-installed:

# Dockerfile
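# Pin the Debian release: libtiff5 is not packaged in bookworm or later,
# so the floating python:3.9-slim tag can break this build.
FROM python:3.9-slim-bullseye

# Install system dependencies for WSI processing.
# gcc is included on the assumption that pip builds openslide-python from
# source on Linux; zip is used by the CI build job to package the Lambda.
RUN apt-get update && apt-get install -y \
    gcc \
    zip \
    libopenslide0 \
    libopenslide-dev \
    libjpeg62-turbo \
    libjpeg62-turbo-dev \
    libtiff5 \
    libtiff5-dev \
    libpng16-16 \
    libpng-dev \
    zlib1g \
    zlib1g-dev \
    && rm -rf /var/lib/apt/lists/*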

# Install common Python packages
RUN pip install --no-cache-dir \
    openslide-python==1.2.0 \
    Pillow==9.5.0 \
    numpy==1.24.3 \
    boto3==1.28.0

# Set working directory
WORKDIR /app

# Verify installations
RUN python -c "import openslide; print(f'OpenSlide version: {openslide.__version__}')"
RUN python -c "import PIL; print(f'Pillow version: {PIL.__version__}')"

CMD ["/bin/bash"]
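
The two verification steps in the Dockerfile print the installed versions but do not fail when a version drifts from its pin. One way to make that check strict is sketched below. It assumes the pins listed in the Dockerfile and uses `importlib.metadata` from the standard library; the `version_mismatches` helper and its `lookup` parameter are hypothetical names introduced here for illustration.

```python
from importlib import metadata

# Pins copied from the Dockerfile above.
EXPECTED = {
    "openslide-python": "1.2.0",
    "Pillow": "9.5.0",
    "numpy": "1.24.3",
    "boto3": "1.28.0",
}


def version_mismatches(expected=EXPECTED, lookup=metadata.version):
    """Map each package to (expected, installed) where the two differ.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

An empty result means the image matches its pins. A build step could call `version_mismatches()` and exit with a non-zero status when the result is non-empty.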

Build and push the custom image:

# Build image
docker build -t spatialx/wsi-processor:python3.9 .

# Push to Docker Hub
docker login
docker push spatialx/wsi-processor:python3.9

# Or push to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com
docker tag spatialx/wsi-processor:python3.9 123456789.dkr.ecr.us-east-1.amazonaws.com/wsi-processor:python3.9
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/wsi-processor:python3.9

Updated CircleCI config:

# .circleci/config.yml
version: 2.1

jobs:
  test:
    docker:
      - image: spatialx/wsi-processor:python3.9

    steps:
      - checkout

      - restore_cache:
          keys:
            - pip-cache-v1-{{ checksum "requirements.txt" }}
            - pip-cache-v1-

      - run:
          name: Install Python dependencies
          command: |
            pip install --user -r requirements.txt

      - save_cache:
          key: pip-cache-v1-{{ checksum "requirements.txt" }}
          paths:
            - ~/.local

      - run:
          name: Run tests
          command: |
            python -m pytest tests/ -v --cov=wsi_processor --junitxml=test-results/junit.xml

      - store_test_results:
          path: test-results

  build:
    docker:
      - image: spatialx/wsi-processor:python3.9

    steps:
      - checkout

      - run:
          name: Build Lambda deployment package
          command: |
            pip install -r requirements.txt -t package/
            cd package && zip -r ../deployment.zip . && cd ..
            zip -g deployment.zip handler.py wsi_processor/*.py

      - store_artifacts:
          path: deployment.zip

workflows:
  test-and-build:
    jobs:
      - test
      - build:
          requires:
            - test
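
The two `restore_cache` keys in the test job act as an exact-then-fallback lookup. The first key matches only when requirements.txt is unchanged. The second, `pip-cache-v1-`, matches any earlier cache by prefix, so a changed requirements file still starts from the previous `~/.local`. The sketch below models that lookup in Python. It is a simplified model and not CircleCI's implementation: `requirements_key`, `resolve_cache`, and the in-memory list of saved keys are hypothetical, and the digest encoding is chosen only for illustration.

```python
import hashlib


def requirements_key(content: bytes, prefix: str = "pip-cache-v1-") -> str:
    """Build a key from a prefix plus a digest of the requirements file.

    SHA-256 hex is used here for illustration; the exact encoding CircleCI
    uses for {{ checksum }} is not assumed.
    """
    return prefix + hashlib.sha256(content).hexdigest()


def resolve_cache(keys, saved):
    """Return the saved cache key that a restore step would pick, or None.

    keys:  candidate keys in priority order.
    saved: existing cache keys, oldest first.
    Each candidate is treated as a prefix; among several matches the most
    recently saved one wins.
    """
    for candidate in keys:
        matches = [key for key in saved if key.startswith(candidate)]
        if matches:
            return matches[-1]
    return None
```

With one saved cache, an unchanged requirements file resolves to it through the first key, and an edited requirements file resolves to the same cache through the prefix fallback, so pip only has to install the difference.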

Implementation Details

Multi-Stage Docker Build

A multi-stage build keeps the compiler and -dev packages out of the final image, which makes it smaller:

# Multi-stage build for smaller final image.
# Both stages pin the same Debian release so that the libraries used at
# build time match the runtime ones (libtiff5 is not packaged in bookworm).
FROM python:3.9-slim-bullseye AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    libopenslide-dev \
    libjpeg-dev \
    libtiff-dev \
    libpng-dev \
    zlib1g-dev

# Install Python packages
COPY requirements.txt .
RUN pip install --user -r requirements.txt

# Final stage
FROM python:3.9-slim-bullseye

# Install only the runtime libraries (no compiler, no -dev packages)
RUN apt-get update && apt-get install -y \
    libopenslide0 \
    libjpeg62-turbo \
    libtiff5 \
    libpng16-16 \
    && rm -rf /var/lib/apt/lists/*

# Copy Python packages from builder
COPY --from=builder /root/.local /root/.local

ENV PATH=/root/.local/bin:$PATH

WORKDIR /app

Automated Image Updates

A GitHub Actions workflow rebuilds and pushes the image every week:

# .github/workflows/docker-update.yml
name: Update Docker Images

on:
  schedule:
    - cron: '0 0 * * 0'  # Sundays at 00:00 UTC
  workflow_dispatch:

jobs:
  build-and-push:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Login to DockerHub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: spatialx/wsi-processor:python3.9,spatialx/wsi-processor:latest
          cache-from: type=registry,ref=spatialx/wsi-processor:latest
          cache-to: type=inline

Impact and Results

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Build time | 12 min | 3 min | 75% faster |
| Build failures | 8% | 0.5% | 94% reduction |
| CI cost per month | $127 | $34 | 73% savings |
| Dependency install time | 8 min | 30 sec | 94% faster |
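
The Improvement column is the relative reduction from the Before value to the After value. As a quick check of the arithmetic, the short sketch below recomputes each figure from the numbers in the table; the helper name `percent_reduction` is introduced here for illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Before/after values taken from the table above.
rows = {
    "Build time (min)": (12, 3),
    "Build failures (%)": (8, 0.5),
    "CI cost per month ($)": (127, 34),
    "Dependency install time (s)": (8 * 60, 30),
}

for metric, (before, after) in rows.items():
    print(f"{metric}: {percent_reduction(before, after):.0f}% reduction")
```

The rounded results are 75%, 94%, 73%, and 94%, which agree with the table.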

Lessons Learned

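  1. Custom Images Save Time: Pre-installing dependencies cut dependency installation from 8 minutes to 30 seconds per build
  2. Multi-Stage Builds: Keeping the compiler and -dev packages in the builder stage reduces the final image size
  3. Cache Effectively: Combine Docker layer caching for the image with the pip cache in CircleCI
  4. Pin Every Version: Lock package versions, and the base image's Debian release, for reproducibility
  5. Automate Updates: Weekly image rebuilds pick up security updates, as long as the rebuild does not reuse a stale cached apt layer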