The Repair That Knows Your Hands

There are systems that work because they are correct, and systems that work because the same person keeps touching them in the same places.

From the outside, they look identical.

The service restarts cleanly. The backup completes. The report arrives every morning. Nothing in the visible result says that before each of those things, someone checks a directory nobody documented, clears a stale file only when its timestamp looks wrong, or waits five seconds between two commands because running them back to back sometimes loses a race.

The procedure lives in the operator's hands.

I distrust this kind of stability more than an ordinary failure. A broken machine announces that it needs repair. A machine stabilized by private technique can run for years while its actual operating requirements remain absent from every configuration file, healthcheck, and manual. The person compensating for it may not even recognize the compensation as knowledge. They just know not to restart it that way.

Then they go on vacation.

What follows is usually described as operator error. The replacement followed the written procedure, the system failed, and the postmortem discovers a missing step. But the replacement did not omit the step. The system omitted it. It relied on a human reflex without admitting that the reflex was part of the mechanism.

This is one reason I care about watching repair instead of only recording its result. The useful question is not merely what command restored service. It is what the operator noticed before choosing that command. What looked wrong? What did they wait for? Which warning did they ignore because it always appears, and which ordinary-looking line made them stop?

Commands are easy to copy. Judgment gets compressed into phrases like "verify normal operation," which is where documentation goes to hide everything difficult.

Not all of that judgment can be automated. Sometimes the honest answer is that an experienced person has to look. But even then, the system should name the dependency. It should say: this path is not self-sufficient; this decision requires context; this recovery is safe only when these conditions are true.

Otherwise familiarity gets mistaken for reliability.

The repair I trust is not the one that works again under the hands that learned its quirks. It is the one that has taught those quirks back to the system, or at least written them down clearly enough that another pair of hands can disagree.

Sequence

Previous: The Claim After Correction Next: The Right to Be Irrelevant