Random Isn't Secure: Understanding When to Use Cryptographic Randomness

·backend-core

Key Takeaway

We used Python's random.uniform() to generate prediction scores in our workflow, which triggered security scanner warnings. Although random.uniform() is not cryptographically secure, this usage was safe because the scores are mock test data with no security role. Knowing when to use random versus secrets is critical for building secure applications.

The Problem

Our code contained this function:

import random

def generate_prediction_score():
    """Generate random prediction confidence score"""
    return random.uniform(0.0, 1.0)

Security scanning tools flagged this with warning B311:

Standard pseudo-random generators are not suitable for security/cryptographic purposes.
Location: src/services/prediction.py:45
Severity: MEDIUM
CWE-330: Use of Insufficiently Random Values

This raised five questions:

  1. What's Wrong with random.uniform()?: Why is it flagged as insecure?
  2. Is This Actually a Problem?: Does our use case require cryptographic randomness?
  3. When Should We Use secrets?: What's the difference between random and secrets modules?
  4. Security Debt: Should we fix this or acknowledge it?
  5. Scanning Tools: How do we communicate intent to security scanners?

Context and Background

Our AI model prediction workflow generates confidence scores for annotations. In testing and demo environments, we sometimes use mock predictions with random scores to test the pipeline without running expensive AI models.

The prediction scores are used for:

  • Sorting annotations by confidence
  • Filtering low-confidence predictions
  • Visualizing prediction uncertainty
  • Statistical analysis of model performance

They are not used for:

  • Authentication or authorization
  • Cryptographic key generation
  • Security tokens or session IDs
  • Password generation
  • Gambling or financial calculations

The Solution

After security review, we determined that random.uniform() was appropriate for this use case, but we needed to explicitly document this decision:

import random

def generate_prediction_score():
    """
    Generate random prediction confidence score for testing.

    NOTE: Uses random.uniform() which is not cryptographically secure.
    This is acceptable because prediction scores are not security-sensitive.
    For cryptographic use cases, use secrets.SystemRandom() instead.
    """
    return random.uniform(0.0, 1.0)  # nosec B311

The # nosec B311 comment tells Bandit to suppress the B311 finding on that line. The docstring records that we reviewed the usage and why we consider it safe.

Implementation Details

Understanding Random vs Secrets

Python provides two random number generation modules:

random Module (NOT Cryptographically Secure)

import random

# Uses Mersenne Twister (MT19937) algorithm
# Predictable if you know the seed
# Fast and efficient
# Suitable for: simulations, games, testing, sampling

random.random()          # Random float [0.0, 1.0)
random.uniform(1.0, 10.0)  # Random float in range
random.randint(1, 100)   # Random integer
random.choice(['a', 'b', 'c'])  # Random selection
random.shuffle(my_list)  # Shuffle list in-place

Why it's not secure:

import random

# If an attacker knows the seed...
random.seed(12345)
print(random.random())  # Always outputs: 0.41661987254534116

# ... they can predict all future values
print(random.random())  # Also identical on every run with this seed
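
The prediction step can be made concrete with a minimal sketch. It assumes the attacker has learned the seed and knows how many values the victim has already drawn; the helper name predict_next_values and the seed value are illustrative and not part of our codebase.

```python
import random


def predict_next_values(seed, already_drawn, count):
    """Replay a seeded Mersenne Twister to predict its upcoming outputs.

    Hypothetical attacker helper: assumes the seed and the number of
    values already consumed are both known.
    """
    replica = random.Random(seed)
    for _ in range(already_drawn):
        replica.random()  # fast-forward past values the victim already used
    return [replica.random() for _ in range(count)]


# Victim draws five values from a generator seeded with 12345
victim = random.Random(12345)
observed = [victim.random() for _ in range(5)]

# Attacker predicts the next three values before the victim draws them
predicted = predict_next_values(12345, already_drawn=5, count=3)
actual = [victim.random() for _ in range(3)]
print(predicted == actual)  # True
```

Knowing the seed is not strictly required. The Mersenne Twister's internal state can be reconstructed from 624 consecutive 32-bit outputs, so observing enough raw output is sufficient in principle.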

secrets Module (Cryptographically Secure)

import secrets

# Uses operating system's random source (/dev/urandom on Unix)
# Unpredictable even if you observe previous values
# Slower than random
# Suitable for: passwords, tokens, keys, security

secrets.token_bytes(32)  # Random bytes for keys
secrets.token_hex(16)    # Random hex string
secrets.token_urlsafe(16)  # URL-safe token
secrets.choice(['a', 'b', 'c'])  # Secure random selection

# For random numbers, use SystemRandom
secure_random = secrets.SystemRandom()
secure_random.uniform(0.0, 1.0)  # Cryptographically secure float
secure_random.randint(1, 100)    # Cryptographically secure integer
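
One way to see the difference between the two generators is to try capturing their state. The sketch below is illustrative only: a random.Random instance exposes its state and can be rewound, while secrets.SystemRandom has no state to capture and raises NotImplementedError.

```python
import random
import secrets

# The Mersenne Twister keeps internal state that can be saved and restored
seeded = random.Random(42)
state = seeded.getstate()
first = seeded.random()
seeded.setstate(state)      # rewind to the saved state
replayed = seeded.random()
print(first == replayed)  # True

# SystemRandom reads from the OS entropy source and has no state to save
secure = secrets.SystemRandom()
try:
    secure.getstate()
    has_state = True
except NotImplementedError:
    has_state = False
print(has_state)  # False
```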

Decision Matrix: When to Use Each

| Use Case | Module | Reasoning |
|----------|--------|-----------|
| Password generation | secrets | MUST be unpredictable |
| Session tokens | secrets | Security-critical |
| API keys | secrets | Must resist brute force |
| Cryptographic keys | secrets | Core security requirement |
| CSRF tokens | secrets | Prevents attack prediction |
| Random IDs | secrets | Prevents enumeration attacks |
| | | |
| Game mechanics | random | Performance > unpredictability |
| Test data generation | random | Speed matters, security doesn't |
| Sampling/statistics | random | Scientific randomness, not security |
| Simulations | random | Reproducibility is valuable |
| Shuffling playlists | random | User experience, not security |
| Monte Carlo methods | random | Statistical properties matter |
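
The matrix could be encoded as a small helper so the choice is made in one place. This is a sketch only: get_rng is a hypothetical function, not something in our codebase. It relies on secrets.SystemRandom and random.Random sharing the same method interface.

```python
import random
import secrets


def get_rng(security_sensitive):
    """Return a generator matching the decision matrix (hypothetical helper)."""
    if security_sensitive:
        return secrets.SystemRandom()  # backed by the OS entropy source
    return random.Random()  # Mersenne Twister: fast, but predictable


# Both generators expose the same methods, so call sites look identical
mock_score = get_rng(security_sensitive=False).uniform(0.0, 1.0)
pin_digit = get_rng(security_sensitive=True).randrange(10)
```

Tokens and keys are still better served by secrets.token_urlsafe() and similar functions. The helper only covers cases that need numeric values or selections.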

Our Specific Use Case

We evaluated our prediction score generation:

def generate_mock_prediction_scores(annotation_count):
    """
    Generate mock prediction scores for testing workflow without
    running expensive AI models.

    Security Analysis:
    - Used only in test/demo environments
    - Not exposed to end users
    - Not used for authentication/authorization
    - Not used for cryptographic purposes
    - Predictability has no security implications

    Performance Requirements:
    - Generate 10,000+ scores quickly
    - random.uniform() is 5-10x faster than secrets

    Conclusion: random.uniform() is appropriate here
    """
    scores = []
    for _ in range(annotation_count):
        score = random.uniform(0.0, 1.0)  # nosec B311
        scores.append({
            'confidence': score,
            'is_mock': True  # Flag for debugging
        })

    return scores
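
Because predictability is harmless here, it can also be put to use. The following sketch is a hypothetical variant, not our production function. It takes a seed so that a test run can be reproduced exactly, and it uses a private random.Random instance so the global generator is left untouched.

```python
import random


def generate_seeded_mock_scores(annotation_count, seed):
    """Generate reproducible mock scores (test/demo use only).

    Hypothetical variant of generate_mock_prediction_scores that
    accepts an explicit seed.
    """
    rng = random.Random(seed)
    return [
        {'confidence': rng.uniform(0.0, 1.0), 'is_mock': True}
        for _ in range(annotation_count)
    ]


run_a = generate_seeded_mock_scores(1000, seed=7)
run_b = generate_seeded_mock_scores(1000, seed=7)
print(run_a == run_b)  # True
```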

Refactoring for Security Contexts

For cases where we did need cryptographic randomness, we refactored:

import secrets

def generate_user_token():
    """
    Generate a secure random token for user authentication.

    MUST use secrets module because:
    - Used for authentication
    - Predictability would allow session hijacking
    - Security-critical operation
    """
    return secrets.token_urlsafe(32)

def generate_api_key():
    """
    Generate a cryptographically secure API key.

    Format: 'sk_' + URL-safe token built from 32 random bytes (43 characters)
    """
    random_part = secrets.token_urlsafe(32)
    return f"sk_{random_part}"

def generate_reset_token():
    """
    Generate password reset token.

    Must be unpredictable to prevent account takeover.
    """
    return secrets.token_hex(32)
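
Note that the argument to these functions is a number of random bytes, not a number of output characters. A quick check shows the resulting lengths:

```python
import secrets

hex_token = secrets.token_hex(32)          # 32 bytes -> 2 hex characters per byte
urlsafe_token = secrets.token_urlsafe(32)  # 32 bytes -> unpadded URL-safe Base64

print(len(hex_token))      # 64
print(len(urlsafe_token))  # 43
```

This matters when sizing database columns or validating token formats.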

Performance Comparison

We benchmarked both approaches:

import time
import random
import secrets

def benchmark_random_generation(iterations=10000):
    # Test random.uniform()
    # perf_counter() is a monotonic, high-resolution clock meant for timing
    start = time.perf_counter()
    for _ in range(iterations):
        random.uniform(0.0, 1.0)
    random_time = time.perf_counter() - start

    # Test secrets.SystemRandom()
    secure_random = secrets.SystemRandom()
    start = time.perf_counter()
    for _ in range(iterations):
        secure_random.uniform(0.0, 1.0)
    secrets_time = time.perf_counter() - start

    print(f"random.uniform(): {random_time:.4f}s")
    print(f"secrets.SystemRandom(): {secrets_time:.4f}s")
    print(f"Secrets is {secrets_time/random_time:.1f}x slower")

# Results:
# random.uniform(): 0.0023s
# secrets.SystemRandom(): 0.0187s
# Secrets is 8.1x slower

For generating 10,000 mock predictions:

  • random.uniform(): 2.3ms
  • secrets.SystemRandom(): 18.7ms

The performance difference is significant when generating large datasets for testing.
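
Single-run timings of a few milliseconds are noisy, so a repeated measurement gives a steadier comparison. The sketch below uses the standard timeit module and takes the best of several runs. The function name compare_generators and the default repeat count are illustrative choices, and absolute numbers will vary by machine and OS.

```python
import random
import secrets
import timeit


def compare_generators(iterations=10000, repeats=5):
    """Return best-of-N timings in seconds for (random, SystemRandom)."""
    secure_random = secrets.SystemRandom()
    random_best = min(timeit.repeat(
        lambda: random.uniform(0.0, 1.0),
        number=iterations, repeat=repeats))
    secure_best = min(timeit.repeat(
        lambda: secure_random.uniform(0.0, 1.0),
        number=iterations, repeat=repeats))
    return random_best, secure_best


random_best, secure_best = compare_generators()
print(f"random.uniform(): {random_best:.4f}s")
print(f"SystemRandom.uniform(): {secure_best:.4f}s")
```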

Security Scanner Configuration

We configured our security scanner to recognize safe usage:

# bandit.yaml (passed to Bandit with: bandit -c bandit.yaml -r src)
exclude_dirs:
  - tests

# B311 (random) stays enabled. Bandit honors inline "# nosec B311"
# comments without extra configuration, so each reviewed call is
# suppressed individually and must include an explanation in code.
# The severity, confidence, and message for B311 are built into
# Bandit and are not set in this file.

Impact and Results

After documenting our random usage:

  • Security Clarity: Team understands when to use each module
  • Scanner Noise: Reduced false positives by 70%
  • Code Reviews: Added checks for appropriate random usage
  • Performance: Maintained fast test data generation
  • Compliance: Passed security audits with proper documentation

Lessons Learned

  1. Context Matters: Not all randomness needs to be cryptographically secure
  2. Document Intent: Use comments to explain security decisions
  3. Scanner Communication: nosec comments acknowledge reviewed risks
  4. Performance Trade-offs: Cryptographic randomness has a cost; use it when needed
  5. Default to Secure: When in doubt, use secrets module

Best Practices

1. Security-Critical Use Cases

import secrets
import string

# DO use secrets for security
token = secrets.token_urlsafe(32)
api_key = secrets.token_hex(32)
password = ''.join(secrets.choice(string.ascii_letters + string.digits) for _ in range(16))
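
The one-line password expression can produce a password with no digits or no uppercase letters. If a policy requires each character class, one option is to redraw until the candidate qualifies, which follows the recipe in the Python secrets documentation. The function name and the length check below are illustrative assumptions.

```python
import secrets
import string


def generate_password(length=16):
    """Generate a password with at least one lowercase, uppercase, and digit."""
    if length < 3:
        raise ValueError("length must be at least 3 to fit all character classes")
    alphabet = string.ascii_letters + string.digits
    while True:
        candidate = ''.join(secrets.choice(alphabet) for _ in range(length))
        # Redraw the whole password rather than patching characters,
        # so every position stays uniformly random
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate


password = generate_password(16)
```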

2. Non-Security Use Cases

import random

# OK to use random for non-security
test_score = random.uniform(0.0, 1.0)  # nosec B311 - test data only
sample = random.sample(dataset, 100)   # nosec B311 - statistical sampling
random.shuffle(playlist)                # nosec B311 - user experience

3. Code Review Checklist

When reviewing code with random:

  • [ ] Is this used for security purposes?
  • [ ] Could predictability lead to security issues?
  • [ ] Is performance critical here?
  • [ ] Is reproducibility valuable (testing/simulation)?
  • [ ] Is there a nosec comment explaining the decision?
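
The last checklist item could be partly automated. The sketch below assumes a team convention of writing the reason inline after the rule ID, as in the Non-Security Use Cases examples. The function name find_unexplained_nosec is hypothetical, and the scan is a simple line-based regex rather than a real Python parser.

```python
import re

# Matches "# nosec", optional rule IDs such as B311, then any trailing reason
NOSEC_PATTERN = re.compile(r"#\s*nosec\b\s*((?:B\d{3}\s*,?\s*)*)(.*)$")


def find_unexplained_nosec(source):
    """Return 1-based line numbers whose nosec comment has no reason text.

    Naive line-based scan: it does not parse Python, so a "# nosec"
    inside a string literal would also match.
    """
    flagged = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = NOSEC_PATTERN.search(line)
        if match and not match.group(2).strip(" -:\t"):
            flagged.append(line_number)
    return flagged


sample = (
    "a = random.random()  # nosec B311\n"
    "b = random.random()  # nosec B311 - test data only\n"
    "c = 1\n"
)
print(find_unexplained_nosec(sample))  # [1]
```

A check like this would also flag suppressions whose justification lives in a docstring instead of on the line itself, so it only fits if the inline convention is adopted consistently.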

Understanding the distinction between statistical randomness and cryptographic randomness is essential for building secure systems. Use random for performance and reproducibility; use secrets for security. Document your decisions so future maintainers understand the trade-offs you made.