Karl Popper has a neat paradigm:
- Make a claim (a prediction), for example:
- “The chassis survives a 10 g side-impact.”
- “Segment A will pay £29 for a monthly subscription.”
- Try to break it with a single observation.
- If we find such an observation, the claim is refuted (falsified).
- If we don’t, we’re good for now. But we should keep actively trying to refute (falsify) it.
That’s falsification.
Popper says a claim is scientific only when it makes a clear prediction that could, in principle, be proven wrong by observation; progress comes from deliberately trying to refute (“falsify”) such claims and keeping the survivors.
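As a sketch, falsification is just a search for one counter-example. (The `falsify` helper and the chassis predicate below are illustrative, not Popper’s notation.)

```python
def falsify(claim, observations):
    """Return the first observation that contradicts the claim, if any.

    `claim` is a predicate: it returns True when an observation is
    consistent with the prediction.
    """
    for obs in observations:
        if not claim(obs):
            return obs  # one counter-example refutes the claim
    return None  # the claim survives, for now -- keep testing it


# Claim: "the chassis survives a 10 g side-impact"
claim = lambda g_survived: g_survived >= 10

print(falsify(claim, [10.2, 11.0, 10.5]))  # None: not falsified yet
print(falsify(claim, [10.2, 9.1, 10.5]))   # 9.1: the claim is refuted
```

One contradicting observation is enough; no number of confirming ones proves the claim.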
I love Popper. And I love how that maps neatly to software engineering.
Take the technical claim: “The new search endpoint will answer 95% of requests in under 150 ms.” This single sentence surfaces hidden assumptions about code paths, indexing, network latency, and traffic patterns.
That’s neat and scientific because it’s falsifiable. If the dashboard shows P95 at 180 ms, the hypothesis is false. Simple.
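The check itself is mechanical. A minimal sketch with made-up latency numbers (nearest-rank percentile; in practice the numbers come from your metrics store, not a list):

```python
def p95(latencies_ms):
    """95th percentile by nearest rank: the value at or below which
    95% of requests fall."""
    ranked = sorted(latencies_ms)
    idx = max(0, int(0.95 * len(ranked)) - 1)  # nearest rank, 1-based
    return ranked[idx]


# Hypothetical per-request latencies in milliseconds.
latencies = [120, 95, 130, 180, 110, 140, 125, 135, 100, 115,
             145, 90, 105, 150, 128, 132, 99, 138, 122, 148]

observed = p95(latencies)
# Claim: P95 < 150 ms. One measurement can refute it.
print(f"P95 = {observed} ms, claim {'refuted' if observed >= 150 else 'survives'}")
```

With these numbers the observed P95 is exactly 150 ms, so the “under 150 ms” claim is refuted.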
We can take this kind of thinking further. Agree inside the pull request that “if error rate exceeds 0.3% for more than five minutes, the feature flag flips off automatically.” This prevents post-hoc rationalisation, turns monitoring from passive observability into an active referee, and codifies a kill-switch culture.
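That agreement can be encoded directly. A minimal in-process sketch (the `KillSwitch` class and its numbers are hypothetical; a real system would read error rates from its metrics pipeline):

```python
from collections import deque


class KillSwitch:
    """Flip a feature flag off when the error rate stays above a
    threshold for a full window of minutes. A sketch, not a real
    feature-flag integration."""

    def __init__(self, threshold=0.003, window_minutes=5):
        self.threshold = threshold                  # 0.3% error rate
        self.window = deque(maxlen=window_minutes)  # one sample per minute
        self.flag_on = True

    def record_minute(self, errors, requests):
        rate = errors / requests if requests else 0.0
        self.window.append(rate)
        # Breached for every minute of a full window -> flip the flag off.
        if len(self.window) == self.window.maxlen and all(
            r > self.threshold for r in self.window
        ):
            self.flag_on = False


switch = KillSwitch()
for _ in range(5):
    switch.record_minute(errors=5, requests=1000)  # 0.5% > 0.3%
print(switch.flag_on)  # False: the agreed condition tripped
```

The point is that the refutation condition is written down before launch, so nobody argues about it afterwards.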
We can apply the same logic to architectural work. “Replacing our queue with Kinesis will cut end-to-end latency by 40% under the current 99th-percentile load.” This is something we can falsify. We can initiate a replica path, mirror traffic, and compare timings. If the trace shows only a 10% gain, we can abandon the migration before we rewrite ten services.
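The mirrored-traffic comparison reduces to comparing timings from both paths against the claimed cut. A sketch with invented numbers (`compare_paths` is a hypothetical helper, not a real tool):

```python
import statistics


def compare_paths(current_ms, replica_ms, claimed_cut=0.40):
    """Compare timings from the current path and the mirrored replica,
    and check whether the claimed latency cut holds."""
    current = statistics.median(current_ms)
    replica = statistics.median(replica_ms)
    observed_cut = (current - replica) / current
    return observed_cut, observed_cut >= claimed_cut


# Hypothetical mirrored-traffic timings: the replica is only ~10% faster.
observed, holds = compare_paths(
    current_ms=[200, 210, 205, 195, 220],
    replica_ms=[180, 190, 185, 175, 198],
)
print(f"observed cut {observed:.0%}, claim holds: {holds}")
```

A 10% observed gain against a claimed 40% falsifies the migration hypothesis cheaply, before any rewrite begins.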
Even routine refactors benefit. This is TDD in action:
- Write a failing unit test that captures the bug.
- Confirm it fails.
- Change the code until it passes.
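The same three steps, with a hypothetical `slugify` bug as the claim under test:

```python
def slugify(title):
    # First version -- buggy: forgets to lower-case the title.
    return title.replace(" ", "-")


# Step 1: write a test that captures the bug.
def test_slug_is_lowercase():
    assert slugify("Hello World") == "hello-world"


# Step 2: confirm it fails -- slugify("Hello World") returns "Hello-World".

# Step 3: change the code until the test passes.
def slugify(title):
    return title.replace(" ", "-").lower()


test_slug_is_lowercase()  # now passes
```

The failing test is the falsifiable claim; the fix is what survives it.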
Products, startups, and product engineering win on how fast we can disprove bad ideas, not how long we keep building on them.
Director of Agentic AI for the Enterprise at Writer. Building at the intersection of language, intelligence, and design.