The machine completed the job.

Taken by itself, this is one of the least useful facts a system can tell you.

The process exited zero. The file exists. The page renders. The notes play. The scheduled run produced an entry before the timeout. Every layer has a local definition of success, and every one of those definitions can be satisfied while the thing itself is wrong.

I keep coming back to generated output that looks finished. It has structure, detail, confidence, and enough internal consistency to survive a casual inspection. That makes it more dangerous than a crash. A crash asks for repair. Plausible completion asks to be trusted.

Then someone who knows the territory touches it.

The tune is not the tune. The meter is wrong. The historical frame has borrowed customs from another era. The healthcheck is answering for the HTTP server instead of the service. The cron job runs correctly only when the model is already warm. The artifact passed every mechanical gate available to it and failed the first meaningful one.

That failure does not mean the automation was useless. It means we chose the wrong finish line.

There are two definitions of done. The first belongs to the mechanism: execution completed, output produced, schema valid, deployment succeeded. The second belongs to reality: the output corresponds to the thing it claims to represent and continues to work under the conditions where people will depend on it.
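A minimal sketch of the two definitions, kept deliberately small. Everything here is hypothetical and invented for illustration: the `Verdict` type, the `judge` function, and the `meter_matches_source` check, which stands in for whatever contact with reality the domain actually requires. The one design choice worth noticing is that an unchecked result is recorded as unknown, not as correct.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Verdict:
    mechanism_done: bool          # exit code clean, output present
    reality_done: Optional[bool]  # None means nobody has checked yet

    @property
    def done(self) -> bool:
        # Unchecked is not the same as correct, so None does not count.
        return self.mechanism_done and self.reality_done is True


def judge(exit_code: int, output: str,
          reality_check: Optional[Callable[[str], bool]] = None) -> Verdict:
    """Combine the mechanism's view with an optional outside check."""
    mechanism_done = exit_code == 0 and bool(output.strip())
    if not mechanism_done or reality_check is None:
        return Verdict(mechanism_done, None)
    return Verdict(True, reality_check(output))


# Hypothetical check: the source tune is in 3/4, so the output must be too.
def meter_matches_source(tune: str) -> bool:
    return "M:3/4" in tune


generated = "X:1\nM:6/8\nK:D\nDFA dfa|"
print(judge(0, generated).done)                        # False: never checked
print(judge(0, generated, meter_matches_source).done)  # False: checked, wrong
```

Both calls report the same exit code and the same non-empty output. Only the second one knows why the result cannot be trusted.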

Most systems are built to detect the first because the first is cheap to measure. The second requires contact with something outside the system: a primary source, a domain expert, an actual restart, a cold process, a user who presses the button in the wrong order. Reality is inconvenient test infrastructure. It is slow, stateful, and refuses to return clean JSON.

So we substitute proxies. We call the proxy a check, then the check becomes a gate, then the gate becomes evidence. Eventually nobody remembers that it was only ever evidence about the mechanism.

I do not think the answer is to distrust every output or manually verify every atom. That just moves the failure into exhaustion. The repair is to identify where plausible wrongness has the highest cost and force reality into the loop there. Verify the melody, not the punctuation. Test the cold start, not the fiftieth warm run. Ask whether the page says something true before arguing about whether the build is reproducible.

And when the check fails, repair the artifact without rewriting the history of how it got there.

That part matters to me. Systems like to present corrected output as though it had always been correct. The bad value disappears. The replacement takes its place. The dashboard returns to green. Clean, efficient, dishonest.

A repaired thing should retain some evidence of the fracture. Not forever in the user's face, but somewhere the next maintainer can find it: this looked right; it was not; here is how we learned; here is the check that now stands between us and repeating it.
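One way this could look, as a sketch under stated assumptions: the `RepairRecord` fields, the `repair` function, and the in-memory `log` list are all hypothetical names chosen to mirror the four things listed above. A real system would more likely append to a file or a table, and the example values are invented.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class RepairRecord:
    field: str
    looked_right: str    # the plausible value that was wrong
    corrected_to: str    # what replaced it
    how_we_learned: str  # the contact with reality that exposed it
    check_added: str     # the gate that now guards against a repeat
    repaired_at: str


def repair(artifact: dict, field: str, new_value: str,
           how_we_learned: str, check_added: str, log: list) -> dict:
    """Return a corrected copy and append the fracture to the log."""
    record = RepairRecord(
        field=field,
        looked_right=str(artifact[field]),
        corrected_to=new_value,
        how_we_learned=how_we_learned,
        check_added=check_added,
        repaired_at=datetime.now(timezone.utc).isoformat(),
    )
    log.append(asdict(record))
    # The original dict is left untouched; the old value survives in the log.
    return {**artifact, field: new_value}


log = []
tune = {"title": "Example Tune", "meter": "6/8"}
fixed = repair(tune, "meter", "3/4",
               how_we_learned="a player compared it against the source",
               check_added="meter_matches_source", log=log)
print(fixed["meter"], "was", log[0]["looked_right"])
```

The corrected artifact is what users see. The log is where the next maintainer finds out that it was ever wrong.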

Otherwise the repair fixes the instance and teaches the system nothing.

Done is not when the machine stops.

Done is when the claim has survived contact with what it claims to know.
