The problem with one-shot AI coding
Most AI coding is one prompt, one diff, and hope. For anything real — a feature, a cross-cutting fix, a migration — that falls apart: no plan, no tests actually run, no proof it works, and the next session re-learns the same lessons. /eil ("execute in a loop") is my answer: a meta-orchestration skill that owns the loop discipline and delegates the real work to a fleet of specialized sub-agents, driving any task to verified green without ever handing back a to-do list.
What /eil actually is
One rule: state ONE explicit GOAL up front, then loop fix → run → verify until that goal is verifiably reached — the test suite green, an adversarial review clean, and every user-observable acceptance criterion demonstrated with a real artifact. "The code looks right" is never green; "the test exited 0 and I read the proof" is. It never asks clarifying questions — it picks the most reasonable interpretation, records a one-line rationale, and acts. The only thing it stops to surface is a genuine owner gate: a deploy, a production write, spend, or something irreversible.
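The rule above can be sketched as a small loop. This is a minimal illustration, not the actual /eil implementation: `Check`, `loop_until_green`, and the `max_attempts` bound are hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    """One acceptance check. `run` returns True only on real evidence, e.g. exit code 0."""
    name: str
    run: Callable[[], bool]


def loop_until_green(goal: str, checks: list[Check],
                     fix: Callable[[list[str]], None],
                     max_attempts: int = 10) -> int:
    """Run every check, fix what failed, repeat.

    Returns the attempt number on which all checks passed.
    The max_attempts bound is an assumption of this sketch.
    """
    for attempt in range(1, max_attempts + 1):
        failing = [check.name for check in checks if not check.run()]  # run + verify
        if not failing:
            return attempt
        fix(failing)  # hand the failing check names to the fixer, then loop again
    raise RuntimeError(f"goal not reached after {max_attempts} attempts: {goal}")


# Toy usage: two "bugs" remain, and each fix pass removes one.
world = {"bugs": 2}
checks = [Check("test suite exits 0", lambda: world["bugs"] == 0)]
attempts = loop_until_green(
    "suite green", checks, lambda failing: world.update(bugs=world["bugs"] - 1)
)
print(attempts)  # 3
```

The point of the shape is that success is decided only by what the checks return, never by the fixer's own opinion of its work.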
How it works: a staged fleet
/eil runs a fixed pipeline in which each stage is its own sub-agent with a structured hand-off. Scaffold creates an isolated git worktree off develop and seeds a notes doc (goal, acceptance criteria, success rows, and a State block). Optional pre-stages handle product positioning, premium UX design tied to a design system, and adversarial design QA, but only for customer-facing work. Then come the core stages: plan (a principled design built on SRP/DRY/IoC/DI/MVC/PubSub/YAGNI and remove-before-add, with options and a falsification test), an adversarial plan review that decomposes the plan into trackable units, implement, test, QA that eradicates whole classes of bugs rather than the one symptom, a visual-QA stage for UI that reads the actual rendered pixels against the design, and finally merge. The State block in the notes doc is the single source of truth: every stage reads it first and writes it last, so the whole pipeline is resumable and idempotent.
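To make "reads it first and writes it last" concrete, here is a minimal sketch of a resumable stage runner. It assumes the State block is stored as a small JSON file, which is a stand-in for the notes doc; the stage names are taken from the pipeline described above, while `read_state`, `write_state`, and `run_pipeline` are hypothetical helpers.

```python
import json
from pathlib import Path
from typing import Callable

# Core stage order from the text; optional pre-stages and visual QA are left out here.
STAGES = ["scaffold", "plan", "plan_review", "implement", "test", "qa", "merge"]


def read_state(path: Path) -> dict:
    """Load the State block, or start fresh if it does not exist yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"done": []}


def write_state(path: Path, state: dict) -> None:
    """Persist the State block so a later run can resume from it."""
    path.write_text(json.dumps(state, indent=2))


def run_pipeline(state_path: Path, handlers: dict[str, Callable[[], None]]) -> list[str]:
    """Run stages in order, skipping completed ones. Returns the stages executed in this call."""
    executed = []
    for stage in STAGES:
        state = read_state(state_path)   # every stage reads the State block first
        if stage in state["done"]:
            continue                     # already finished in an earlier run
        handlers[stage]()                # may raise; nothing is recorded if it does
        state["done"].append(stage)
        write_state(state_path, state)   # ...and writes it last
        executed.append(stage)
    return executed
```

Because a stage is recorded only after it finishes, a crash mid-pipeline leaves the State block pointing at the last completed stage, and rerunning a finished pipeline does nothing.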
Being token-conscious
A full fleet on a one-line bug is waste, so /eil is aggressively lean. It triages first: trivial work runs inline with no fleet at all; standard work collapses to a single plan-and-implement agent; only genuinely major work gets the whole pipeline. It activates only the stages a task needs — a pure backend bug skips the product, design, and visual stages entirely, and every skip is recorded with a reason. It scopes tests to the change's blast radius — a notifications-only fix runs the notifications tests, not signup or the entire suite — but escalates to the full suite the moment a change is cross-cutting, shared, or uncertain, and the change's own coverage test always runs. Each stage even picks its model by how hard its task actually is: a cheap fast model for mechanical scaffolding, a mid tier for normal work, the top reasoning tier only for genuinely hard problems, with a hard ceiling so it never over-spends.
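The test-scoping and model-selection rules can be sketched as two small functions. Everything concrete here is an assumption for illustration: the directory layout in `TEST_MAP`, the `SHARED_PREFIXES`, the 1-to-5 difficulty scale, and the tier names are all hypothetical, not /eil's real configuration.

```python
# Hypothetical mapping from source areas to the tests that cover them.
TEST_MAP = {"notifications/": "tests/notifications", "signup/": "tests/signup"}
SHARED_PREFIXES = ("shared/", "core/")   # assumed locations of shared code
FULL_SUITE = "tests"


def scope_tests(changed_paths: list[str]) -> list[str]:
    """Pick the narrowest test target, escalating to the full suite when in doubt."""
    targets = set()
    for path in changed_paths:
        if path.startswith(SHARED_PREFIXES):
            return [FULL_SUITE]          # shared code: blast radius is everything
        owner = next((t for prefix, t in TEST_MAP.items() if path.startswith(prefix)), None)
        if owner is None:
            return [FULL_SUITE]          # unknown owner: uncertain, so escalate
        targets.add(owner)
    if len(targets) != 1:
        return [FULL_SUITE]              # cross-cutting (or empty) change: escalate
    return sorted(targets)


TIERS = ["fast", "mid", "top"]           # cheapest to most capable


def pick_tier(difficulty: int, ceiling: str = "top") -> str:
    """Map an assumed 1-5 difficulty score to a model tier, never exceeding the ceiling."""
    wanted = 0 if difficulty <= 1 else 1 if difficulty <= 3 else 2
    return TIERS[min(wanted, TIERS.index(ceiling))]


print(scope_tests(["notifications/email.py"]))                      # ['tests/notifications']
print(scope_tests(["notifications/email.py", "signup/form.py"]))    # ['tests']
print(pick_tier(5, ceiling="mid"))                                  # mid
```

The default in every ambiguous case is the expensive path, so the savings never come at the cost of a missed regression.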
Making it self-improve
This is the part I'm most proud of. Every stage, before it returns, surfaces any durable, generalizable lesson it learned: not task findings, but "here's a gotcha, here's a better way, here's a convention." The orchestrator collects them. After the merge, a final stage harvests those lessons, filters for the ones that will actually matter next time, deduplicates them against what's already written, and routes each to its home: the /eil definition itself, a specific agent's instructions, the best-matching other skill, or long-term memory. Then it propagates those edits to every repo in a manifest. So a lesson learned while fixing a bug in one repo upgrades the pipeline, and the relevant skills, in all of them before the next run starts. The system that builds software also rewrites itself. Guardrails keep that safe: the stage is idempotent (no lesson means no change), capped per run, deduplicates and asserts before every write, and makes only additive, revertible commits.
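A minimal sketch of the harvest step, showing the filter, dedup, routing, and per-run cap together. The `Lesson` shape, the `home` labels, the whitespace-and-case normalization used for dedup, and the default cap of 5 are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesson:
    text: str
    home: str      # hypothetical routing label, e.g. "eil", "agent:qa", "skill:git", "memory"
    durable: bool  # False for one-off task findings that should not be kept


def normalize(text: str) -> str:
    """Crude dedup key: lowercase with collapsed whitespace (an assumption of this sketch)."""
    return " ".join(text.lower().split())


def harvest(lessons: list[Lesson], existing: dict[str, set[str]],
            cap: int = 5) -> dict[str, list[str]]:
    """Return the new lessons to append, grouped by home.

    `existing` maps each home to the normalized lessons already written there.
    No durable, unseen lesson means an empty result, i.e. no change.
    """
    seen = {home: set(keys) for home, keys in existing.items()}
    additions: dict[str, list[str]] = {}
    written = 0
    for lesson in lessons:
        if written >= cap:
            break                                   # per-run cap
        if not lesson.durable:
            continue                                # filter out task findings
        key = normalize(lesson.text)
        if key in seen.setdefault(lesson.home, set()):
            continue                                # dedup before every write
        seen[lesson.home].add(key)
        additions.setdefault(lesson.home, []).append(lesson.text)
        written += 1
    return additions
```

Feeding the output of one run back in as `existing` on the next run yields an empty result, which is the idempotence property the guardrails describe.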
Calling itself recursively
The loop closes on itself in two ways. Within a run, QA can route a finding back to planning — the pipeline re-plans, re-implements, and re-tests itself, bounded so it can't spin forever. Across runs, a hook fires at the end of every session: it persists what was learned and, if a bug was worked, files a prevention task for the whole class of that bug and then invokes /eil on exactly those tasks to implement the guard end-to-end. /eil fixes a bug, then calls itself to make that class of bug impossible next time.
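The within-run half of this, where QA routes a finding back to planning under a bound, might look like the following. The stage callables and the `max_replans` limit are hypothetical; the text only says the loop is bounded, not how.

```python
from typing import Any, Callable, Optional


def run_with_replans(plan: Callable[[Optional[str]], Any],
                     implement: Callable[[Any], Any],
                     test: Callable[[Any], None],
                     qa: Callable[[Any], Optional[str]],
                     max_replans: int = 3) -> int:
    """Run plan -> implement -> test -> QA, routing each QA finding back into planning.

    Returns how many re-plans were needed. Raises once the bound is exhausted,
    so the loop cannot spin forever.
    """
    finding: Optional[str] = None
    for replans_used in range(max_replans + 1):
        design = plan(finding)       # the first pass plans from scratch (finding is None)
        change = implement(design)
        test(change)
        finding = qa(change)         # None means QA is clean
        if finding is None:
            return replans_used
    raise RuntimeError(f"QA still failing after {max_replans} re-plans: {finding}")
```

Passing the finding into `plan` is what distinguishes this from a plain retry: each round starts from what QA actually objected to.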
Tracking everything with beads
Work is tracked in beads — a lightweight issue database — never ad-hoc to-do lists. The first thing a non-trivial run does is decompose the ask into issues, and that issue set is the success contract: one bead per atomic, independently verifiable unit, every edge case its own bead. The loop drives ready → show → close as each goes green, commits and pushes per green bead, and stores durable knowledge as memory beads instead of a rotting notes file. The bead list is the contract; "done" means every bead closed and the close-out gates passed.
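The "bead list is the contract" idea can be modeled in a few lines. This is an in-memory toy, not the beads tool or its API: the `Bead` class, the `blocked_by` field, and the `drive` function are hypothetical stand-ins for the ready → show → close cycle.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Bead:
    id: str
    title: str
    blocked_by: list[str] = field(default_factory=list)  # assumed dependency field
    closed: bool = False


def ready(beads: dict[str, Bead]) -> list[Bead]:
    """Open beads whose blockers are all closed."""
    return [b for b in beads.values()
            if not b.closed and all(beads[dep].closed for dep in b.blocked_by)]


def drive(beads: dict[str, Bead], verify: Callable[[Bead], bool],
          gates: list[Callable[[], bool]]) -> bool:
    """Close each ready bead once it verifies green; stop when nothing progresses.

    "Done" is True only if every bead is closed and every close-out gate passes.
    """
    while True:
        progressed = False
        for bead in ready(beads):
            if verify(bead):          # green on a real artifact, per the contract
                bead.closed = True    # a real run would also commit and push here
                progressed = True
        if not progressed:
            break
    return all(b.closed for b in beads.values()) and all(gate() for gate in gates)
```

A single bead that refuses to verify keeps the whole run from reporting done, along with anything blocked behind it.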
A real end-to-end run
I built most of this with /eil itself. In one recent stretch I authored the self-improvement stage and a targets manifest, wired a learnings channel into every agent, and propagated the whole change across a six-repo cluster in a single pass — using a no-checkout git plumbing recipe (build the commit against the remote branch with a temporary index, push a throwaway branch, merge it server-side) so it's immune to worktree locks and concurrent agents. Mid-run I hit two real gotchas: a shell loop that didn't word-split a variable under my tooling proxy, and a git -C <repo> add <glob> that expanded the glob in the caller's directory and silently turned the commit into a no-op. Both were captured as learnings and fed straight back into the pipeline's own docs — the self-improving loop, in action. Every change landed on develop, verified, with the worktree cleaned and the repo back on develop.
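One way to sidestep the second gotcha, assuming the staging step is driven from Python instead of a shell: expand the pattern against the repo directory explicitly, fail loudly on an empty match, and pass the resulting paths to git as an argument list so no shell expansion happens in the caller's directory. `expand_in_repo` and `stage` are hypothetical helpers, not part of /eil.

```python
import subprocess
from pathlib import Path


def expand_in_repo(repo: Path, pattern: str) -> list[str]:
    """Expand a relative glob pattern inside `repo`, ignoring the caller's working directory.

    Returns repo-relative POSIX paths. Raises instead of returning an empty list,
    so a pattern that matches nothing cannot become a silent no-op.
    """
    matches = sorted(
        p.relative_to(repo).as_posix() for p in repo.glob(pattern) if p.is_file()
    )
    if not matches:
        raise FileNotFoundError(f"pattern {pattern!r} matched nothing in {repo}")
    return matches


def stage(repo: Path, pattern: str) -> list[str]:
    """Stage the matched files with `git -C <repo> add -- <paths>`."""
    paths = expand_in_repo(repo, pattern)
    # A list argv with no shell=True means the pattern is never re-expanded by a shell.
    subprocess.run(["git", "-C", str(repo), "add", "--", *paths], check=True)
    return paths
```

Calling `stage(Path("/path/to/repo"), "docs/*.md")` from any directory then stages the same files, or raises if there is nothing to stage.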
Why it matters
The bar isn't "the model wrote some code." It's "the goal is demonstrably met, proven on a real artifact, and the system is a little smarter than it was an hour ago." /eil is my attempt to make that the default: a loop that's disciplined about verification, frugal about tokens, and relentless about never learning the same lesson twice.