The Work Unit specification defines the model in general terms.¹ This post describes my implementation of that model as it currently exists in openclaw-morning-run.
In the generalized specification, a Work Unit is a structural accounting unit. In the implementation, that remains true, but the unit is extended into a deterministic operating model. Work Units do not merely describe the cost of a task. They participate in baseline calculation, debt accumulation, tier resolution, and recovery protection. The result is not just a ledger of effort. It is a system for governing deployment.
Overview
The implementation currently lives across several parts of the openclaw-morning-run repository:
- openclaw_morning_pipeline.py
- soa_engine/soa_engine.py
- rva_engine/rva.py
- sde_engine/sde_engine.py
- src/blackbird_brief/
These layers perform distinct functions:
- The pipeline assigns task-level Work Units to deadlines and personal events.
- RVA turns the day into a projected structural load score.
- SDE compares that score against baseline and fatigue debt.
- HSI turns the result into a policy tier and operating mode.
- Blackbird converts the whole thing into daily guidance about pacing, recovery, and whether the day is expandable at all.
The phrase "Work Units" suggests a single construct, but the implementation is not one. It is a layered model. Task-level Work Units, day-level projected load, fragmentation penalties, debt-adjusted baseline, and policy tiering all belong to the same accounting system. Each layer serves a different purpose, but all of them describe the same underlying problem: structural demand over time.
Layer 1: Task-Level Work Units
The first layer is the one most directly tied to the original specification.
The pipeline emits a soa_snapshot payload with:
- work_unit_definition.source_doc
- basis_minutes = 30
- model = structural_impact_not_clock_time
Each task-level item includes:
- time
- task
- agent_assignment
- priority
- work_units
- wu_basis_minutes
- wu_reason
- source
This is the most literal implementation of the spec. A task receives a Work Unit value, a reason for that value, and a source. The system can therefore explain not only what a task costs, but why.
Deadline scoring
Canvas deadlines are assigned Work Units using:
base_units = effort_minutes / 30
The base is then modified by:
- a type multiplier
- an urgency multiplier
Current type multipliers:
- exam = 1.50
- quiz = 1.25
- project = 1.35
- lab = 1.20
- discussion = 1.00
- reading = 0.80
- assignment = 1.10
Current urgency multipliers:
- critical = 1.40
- high = 1.25
- medium = 1.00
- low = 0.85
The final value is rounded and clamped to a minimum of 1.
This preserves the core rule of the specification: time is not cost. Time only establishes the starting point. Structural characteristics determine the final value.
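Putting those pieces together, the deadline scoring can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline code: the function and table names are mine, and the fallback multiplier of 1.00 for unrecognized types or urgencies is my assumption.

```python
# Sketch of the deadline scoring rules described above.
# Names are illustrative; the real pipeline may be organized differently.
TYPE_MULT = {"exam": 1.50, "quiz": 1.25, "project": 1.35, "lab": 1.20,
             "discussion": 1.00, "reading": 0.80, "assignment": 1.10}
URGENCY_MULT = {"critical": 1.40, "high": 1.25, "medium": 1.00, "low": 0.85}

def score_deadline(effort_minutes: float, task_type: str, urgency: str) -> int:
    base_units = effort_minutes / 30  # time only sets the starting point
    units = (base_units
             * TYPE_MULT.get(task_type, 1.00)       # assumed fallback
             * URGENCY_MULT.get(urgency, 1.00))     # assumed fallback
    return max(1, round(units))  # rounded and clamped to a minimum of 1
```

A 90-minute exam at high urgency, for example, scores 3.0 * 1.50 * 1.25 = 5.625, which rounds to 6 Work Units.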
Personal event scoring
Personal calendar events are scored differently. Their duration is converted into:
base_units = duration_minutes / 30
The event text is then scanned for structural cues.
Recovery-like events such as:
- decompress
- recovery
- rest
- hobby
- special interest
return negative Work Units. This is how the implementation encodes the fact that some activities restore capacity rather than consume it. Recovery is not treated as the absence of load. It is treated as a measurable counter-force to load.
Other event classes apply multipliers:
- exam, midterm, final, presentation, interview -> 1.50
- lecture, class, meeting, office hour -> 1.30
- drive, commute, travel -> 1.20
- social, networking, reception, banquet -> 1.25
- default -> 1.00
Again, time is only the base. Structural characteristics determine the final value.
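As a sketch, the personal event scoring might look like this. The cue lists come from the post; the naive substring matching, the magnitude of negative recovery units, and the clamping of non-recovery events to a minimum of 1 are my assumptions.

```python
# Illustrative sketch of personal event scoring; not the actual repo code.
RECOVERY_CUES = ("decompress", "recovery", "rest", "hobby", "special interest")
EVENT_MULTS = [
    (("exam", "midterm", "final", "presentation", "interview"), 1.50),
    (("lecture", "class", "meeting", "office hour"), 1.30),
    (("drive", "commute", "travel"), 1.20),
    (("social", "networking", "reception", "banquet"), 1.25),
]

def score_personal_event(title: str, duration_minutes: float) -> int:
    text = title.lower()
    base_units = duration_minutes / 30
    if any(cue in text for cue in RECOVERY_CUES):
        return -max(1, round(base_units))  # recovery is negative load
    for cues, mult in EVENT_MULTS:
        if any(cue in text for cue in cues):
            return max(1, round(base_units * mult))
    return max(1, round(base_units))  # default multiplier 1.00
```

Note that plain substring matching would misfire on words like "restaurant"; the real implementation presumably matches cues more carefully.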
Layer 2: Day-Level Projected Work Units
The implementation does not stop at per-task accounting. The system also computes a day-level projected load in rva_engine/rva.py.
This is the point at which the model becomes more operational than the generalized post. The day-level score is derived from risk profiles, not merely from the task-level work_units field. The goal is no longer to score isolated tasks. The goal is to quantify the structure of the entire day.
Risk registry caps
Each calendar event is classified into an operational profile, and each profile has an rva_cap in the risk registry.
That cap acts as the event's base structural load. Duration then modifies it:
- <= 60 minutes -> 1.00
- <= 120 minutes -> 1.15
- > 120 minutes -> 1.25
So the base event score is effectively:
event_score = rva_cap * duration_multiplier
Two events of equal duration can therefore diverge sharply if they belong to different profiles. Duration alone does not govern the cost. Structural class does.
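A minimal sketch of the base event score, assuming the cap and duration bands behave exactly as stated above (function names are illustrative):

```python
# Sketch: base structural load of one event in the RVA layer.
def duration_multiplier(minutes: float) -> float:
    if minutes <= 60:
        return 1.00
    if minutes <= 120:
        return 1.15
    return 1.25

def base_event_score(rva_cap: float, minutes: float) -> float:
    # the profile's cap is the base; duration only modifies it
    return rva_cap * duration_multiplier(minutes)
```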
Day-level bonuses
After summing event scores, RVA applies deterministic bonuses:
- +4 if there are at least 3 high-cognitive events
- +4 if there are at least 3 low-recovery events
- +10 if there are at least 4 lectures
- a fragmentation bonus based on the count of meaningful blocks
Current fragmentation bonus by meaningful block count:
- <= 4 blocks -> 0
- 5 blocks -> 2
- 6 blocks -> 4
- >= 7 blocks -> 6
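The day-level bonuses, including the fragmentation step function, can be sketched as follows (function names are mine, not the repo's):

```python
# Sketch of RVA's deterministic day-level bonuses.
def fragmentation_bonus(meaningful_blocks: int) -> int:
    if meaningful_blocks <= 4:
        return 0
    if meaningful_blocks == 5:
        return 2
    if meaningful_blocks == 6:
        return 4
    return 6  # 7 or more blocks

def day_level_bonus(high_cognitive: int, low_recovery: int,
                    lectures: int, meaningful_blocks: int) -> int:
    bonus = 0
    if high_cognitive >= 3:
        bonus += 4
    if low_recovery >= 3:
        bonus += 4
    if lectures >= 4:
        bonus += 10
    return bonus + fragmentation_bonus(meaningful_blocks)
```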
Sustained exposure bonus
The system also applies a sustained exposure rule for long, constrained events.
An event qualifies only if:
- trusted duration is at least 180 minutes
- rva_cap >= 20
- autonomy is low or recovery is low
The bonus is:
- quadratic drift based on hours past 3 hours
- plus +8 if both autonomy and recovery are low
Concretely:
drift = min(40, 6 * excess_hours^2)
This is one of the major places where the implementation departs from simple additive accounting. Long constrained exposure is treated as qualitatively different from ordinary duration. It is not merely "more time." It is structural drift under prolonged restriction.
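A hedged sketch of the sustained exposure rule. One detail is my assumption: whether the +8 stacks outside the capped drift or inside it is not stated, and I model it as stacking outside.

```python
# Illustrative sketch of the sustained exposure bonus.
def sustained_exposure_bonus(minutes: float, rva_cap: float,
                             autonomy_low: bool, recovery_low: bool) -> float:
    # qualification gate from the rules above
    if minutes < 180 or rva_cap < 20 or not (autonomy_low or recovery_low):
        return 0.0
    excess_hours = (minutes - 180) / 60
    drift = min(40, 6 * excess_hours ** 2)  # quadratic drift past 3 hours
    if autonomy_low and recovery_low:
        drift += 8  # assumption: applied after the drift cap
    return drift
```

A 5-hour low-autonomy, low-recovery event, for example, yields 6 * 2² = 24 drift plus 8, for 32 units.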
Overlay floor and recovery credit
If the day is part of a continuous operations overlay, the score is floored to at least 75, with an extra +5 on overlay day 3 or later.
If the schedule contains a recognized recovery block, RVA applies a recovery credit:
recovery_credit = min(10, 10% of post-overlay score)
That credit is subtracted from the day score.
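The overlay floor and recovery credit together might look like this; the names and the exact order of operations (floor first, then credit) are my assumptions.

```python
# Sketch: overlay floor and recovery credit applied to the day score.
def finalize_day_score(score: float, overlay_active: bool,
                       overlay_day: int, has_recovery_block: bool) -> float:
    if overlay_active:
        score = max(score, 75)  # continuous-operations floor
        if overlay_day >= 3:
            score += 5
    if has_recovery_block:
        score -= min(10, 0.10 * score)  # recovery credit
    return score
```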
RVA emits:
- capped display score
- unbounded day score
- per-bin Work Unit curve across the day
- dominant vectors
- recovery credit telemetry
The unbounded score is the important number for downstream planning. Later systems use it as the practical projected_wu. At this layer, Work Units are no longer just attached to tasks. They are describing the operational posture of the day as a whole.
Layer 3: Baseline, Fragmentation, and Debt
The generalized Work Unit post describes baseline as a rolling average. The current implementation goes further than that.
Effective baseline
In sde_engine.py, the effective baseline is:
baseline_eff = max(35, baseline_ref - floor(debt_prev / 80))
Baseline is therefore not just historical capacity. It is historical capacity degraded by unresolved fatigue debt.
The default baseline_ref is currently 50, and the floor is 35.
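In code form, with the stated defaults:

```python
import math

# Sketch of the effective baseline calculation from sde_engine.py,
# using the stated defaults of baseline_ref = 50 and floor = 35.
def effective_baseline(debt_prev: float, baseline_ref: int = 50,
                       floor: int = 35) -> int:
    # historical capacity, degraded by unresolved fatigue debt
    return max(floor, baseline_ref - math.floor(debt_prev / 80))
```

Carrying 240 units of debt, for example, lowers the effective baseline from 50 to 47; no amount of debt can push it below 35.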
Fragmentation points
SDE also computes fragmentation points separately from RVA's meaningful-block bonus.
Points are added for:
- domain changes between adjacent events
- profile changes involving high switching or masking demand
- gaps shorter than 15 minutes
- gaps from 15 to 30 minutes
- repeated re-entry into low-autonomy, low-recovery profiles
Current gap scoring:
- gap < 15m -> +2
- gap 15m to < 30m -> +1
Repeated re-entry into the same low-autonomy, low-recovery profile adds +2 each time after the first occurrence.
These fragmentation points are converted into an effective bonus:
- 0-2 points -> 0
- 3-5 points -> 2
- 6-9 points -> 4
- 10-14 points -> 6
- 15-20 points -> 8
- 21+ points -> 10
That value becomes frag_bonus_eff. Fragmentation is not simply observed. It is priced into the day.
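The conversion from raw fragmentation points to the effective bonus is a simple step function (sketch; the function name is illustrative):

```python
# Sketch: fragmentation points bucketed into an effective bonus.
def frag_bonus_eff(points: int) -> int:
    if points <= 2:
        return 0
    if points <= 5:
        return 2
    if points <= 9:
        return 4
    if points <= 14:
        return 6
    if points <= 20:
        return 8
    return 10
```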
Debt growth
The day's effective raw load is:
day_raw_eff = projected_day_units + frag_bonus_eff
The system then compares that to baseline_eff.
If load exceeds baseline, debt increases according to:
debt_increase = excess + (excess^2 / 100)
using floor rounding.
Debt therefore grows faster than linearly as the day increasingly exceeds sustainable range. Overspending capacity is not modeled as a flat penalty. It compounds.
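A sketch of the debt growth rule, with the floor rounding noted above:

```python
import math

# Sketch: super-linear debt growth when load exceeds effective baseline.
def debt_increase(day_raw_eff: float, baseline_eff: float) -> int:
    excess = max(0, day_raw_eff - baseline_eff)
    # the further past baseline, the faster debt compounds
    return math.floor(excess + excess ** 2 / 100)
```

Exceeding baseline by 20 units, for example, adds floor(20 + 400/100) = 24 debt, not 20.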
Debt paydown
Debt can also be reduced by sleep context.
Current base paydown values:
- home = 160
- hotel = 75
- hotel_first_night = 60
These are modified by an efficiency factor eta(debt_prev):
- debt < 120 -> 1.00
- debt 120-249 -> 0.85
- debt 250-449 -> 0.70
- debt 450+ -> 0.55
Recovery weakens as debt accumulates. This is intentional. A system already in deep deficit does not recover with the same efficiency as a stable one.
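The paydown rule as a sketch; the names are illustrative, and treating unknown sleep contexts as zero paydown is my assumption.

```python
# Sketch of sleep-context debt paydown with an efficiency factor.
BASE_PAYDOWN = {"home": 160, "hotel": 75, "hotel_first_night": 60}

def eta(debt_prev: float) -> float:
    # recovery efficiency weakens as accumulated debt grows
    if debt_prev < 120:
        return 1.00
    if debt_prev < 250:
        return 0.85
    if debt_prev < 450:
        return 0.70
    return 0.55

def debt_paydown(sleep_context: str, debt_prev: float) -> float:
    return BASE_PAYDOWN.get(sleep_context, 0) * eta(debt_prev)
```

A home night starting from 300 units of debt, for example, pays down 160 * 0.70 = 112 units rather than the full 160.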
Baseline curve model
The repo also maintains a rolling baseline curve by day type:
- WEEKDAY_CLASSDAY
- WEEKDAY_NOCLASS
- WEEKEND
Rather than using only a rolling average, the curve model stores:
- rolling cumulative Work Unit curves
- median daily totals
- median absolute deviation (MAD)
The current window is 42 days, with a minimum usable history target of 10 days.
This allows the system to compare not just how much load a day has, but when the load arrives relative to normal patterns. A day can therefore deviate from baseline by total intensity, by shape, or by both.
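The median-and-MAD statistics over the rolling window can be sketched with the stated 42-day window and 10-day minimum (names illustrative; the real curve model also tracks per-bin cumulative curves, which this omits):

```python
import statistics

# Sketch: summary statistics for one day type's rolling history.
def baseline_curve_stats(daily_totals):
    window = daily_totals[-42:]          # rolling 42-day window
    if len(window) < 10:                 # minimum usable history
        return None
    med = statistics.median(window)
    mad = statistics.median(abs(x - med) for x in window)
    return med, mad
```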
Layer 4: HSI Tiering and Guardrails
After RVA, the pipeline resolves a daily HSI tier.
Current score-to-tier mapping:
- >= 90 -> Tier 0
- >= 75 -> Tier 1
- >= 50 -> Tier 2
- >= 25 -> Tier 3
- < 25 -> Tier 4
Tier 0 is gated. A raw high score alone is not enough.
Tier 0 requires all of the following:
- autonomy loss
- sensory stacking or emotional labor
- low recovery signal and no recovery credit
If that full gate does not pass, Tier 0 is downgraded to Tier 1.
The tier then maps to operating mode and allowed capacity:
- Tier 4 -> normal, capacity 0.75
- Tier 3 -> normal, capacity 0.60
- Tier 2 -> conserve, capacity 0.45
- Tier 1 -> conserve, capacity 0.30
- Tier 0 -> recover, capacity 0.15
This is the point at which Work Units stop being purely descriptive and become prescriptive. The system is no longer only measuring the day. It is defining the safe operating envelope for the day.
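The tier resolution and its gate can be sketched as follows; the names are illustrative, and the repo's actual gate predicates may be richer than three booleans.

```python
# Sketch of HSI tier resolution with the gated Tier 0.
MODE_BY_TIER = {4: ("normal", 0.75), 3: ("normal", 0.60),
                2: ("conserve", 0.45), 1: ("conserve", 0.30),
                0: ("recover", 0.15)}

def resolve_tier(score: float, autonomy_loss: bool,
                 sensory_or_emotional: bool,
                 low_recovery_no_credit: bool) -> int:
    if score >= 90:
        tier = 0
    elif score >= 75:
        tier = 1
    elif score >= 50:
        tier = 2
    elif score >= 25:
        tier = 3
    else:
        tier = 4
    # Tier 0 gate: a raw high score alone is not enough
    if tier == 0 and not (autonomy_loss and sensory_or_emotional
                          and low_recovery_no_credit):
        tier = 1
    return tier
```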
Layer 5: Blackbird Day Policy
The later blackbird_brief subsystem imports the OpenClaw outputs and uses them to generate daily briefings.
This is the layer where the Work Unit system becomes legible as an actual morning planning instrument rather than a backend score.
Imported fields
Blackbird carries forward:
- HSI tier
- projected WU
- WU segments derived from the snapshot task list
- operational vectors
- schedule shape
- weather and movement burden
If RVA's projected score is unavailable, Blackbird can fall back to summing snapshot Work Units.
Policy thresholds
Blackbird uses its own practical thresholds:
- >= 80 WU -> overload territory
- >= 45 WU -> high strain
- >= 18 WU -> moderate operating pressure
It also maps HSI tier into policy posture:
- Tier 4 -> moderate
- Tier 3 -> constrained
- Tier 2 -> high-strain
- Tier 1 and Tier 0 -> overload
This posture is then modified upward if the schedule is dense, switching burden is high, due pressure is acute, or environmental and physical risk are elevated.
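The two mappings reduce to a threshold scan and a lookup table (sketch; the label for days below 18 WU is my assumption, since the post does not name one):

```python
# Sketch of Blackbird's practical WU thresholds and tier-to-posture mapping.
WU_THRESHOLDS = [(80, "overload territory"),
                 (45, "high strain"),
                 (18, "moderate operating pressure")]

TIER_POSTURE = {4: "moderate", 3: "constrained", 2: "high-strain",
                1: "overload", 0: "overload"}

def wu_pressure(projected_wu: float) -> str:
    for threshold, label in WU_THRESHOLDS:
        if projected_wu >= threshold:
            return label
    return "light"  # assumed label below the lowest threshold
```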
Recovery windows
Blackbird also defines what counts as usable recovery, not merely open time.
A recovery window must generally be:
- at least 30 minutes
- inside the operational day window
- not defeated by hostile movement or weather conditions
- not buried inside a saturated schedule
This distinction is important. The implementation does not treat empty calendar space as automatically restorative. Open time only counts if it is structurally recoverable.
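As a sketch, the usability check reduces to a conjunction of the listed conditions (the predicate names are mine; the real checks for hostile conditions and saturation are presumably computed from weather and schedule data):

```python
# Sketch: open time only counts as recovery if it is structurally usable.
def is_usable_recovery(minutes: float, inside_day_window: bool,
                       hostile_conditions: bool,
                       schedule_saturated: bool) -> bool:
    return (minutes >= 30
            and inside_day_window
            and not hostile_conditions
            and not schedule_saturated)
```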
How This Differs from the General Spec
The generalized post presents Work Units as a structural accounting model. That remains true. The implementation, however, adds several additional commitments:
1. Work Units are multi-layered
The code does not use a single number. It uses:
- per-task Work Units
- day-level projected load
- fragmentation bonuses
- debt-adjusted baseline
- policy tiers
In the implementation, "Work Units" is therefore a stack, not a scalar.
2. Recovery is both negative load and policy state
In the generalized spec, recovery mostly offsets load conceptually. In the implementation, recovery appears in several ways:
- negative task-level Work Units
- recovery credit in RVA
- debt paydown in SDE
- usable recovery windows in Blackbird
- HSI and Blackbird policy downgrades when recovery is not structurally available
This is more rigorous, but also more opinionated. Recovery is not merely something that feels good. It is a condition that must be structurally available if it is to count.
3. Baseline is no longer just a rolling average
The conceptual post uses a weekly rolling average as the clearest explanation. The implementation now uses:
- baseline reference
- debt-adjusted effective baseline
- rolling median curve by day type
- MAD for deviation tracking
The system now models both capacity and distribution of load across the day.
4. The system is designed to constrain expansion
This is the largest philosophical departure from a simple tracking tool.
My implementation is not merely trying to describe how much work a day contains. It is trying to determine:
- whether the day is expandable
- whether open time is actually recoverable
- how much additional tasking is safe
- what operating mode the day belongs to
That is why tiering, guardrails, debt, and recovery windows exist. The purpose of the implementation is not to produce an elegant score. The purpose is to prevent avoidable failure.
Summary
My personal implementation of Work Units is a deterministic operational model built on top of the original structural accounting concept.
At the lowest level, a Work Unit is still thirty minutes of ideal work adjusted for structural reality. But at the system level, Work Units also feed:
- day-load scoring
- fragmentation penalties
- baseline drift detection
- fatigue debt tracking
- daily operating modes
- recovery protection logic
The implementation is therefore not just "counting how hard a task is." It is a way of translating lived structural demand into machine-readable policy.
That is the real goal of the system: not productivity theater, but protective planning. Constraint does not disappear when it is quantified. It becomes governable.
Footnotes
- This post is the implementation layer. If you want the general conceptual specification first, read Work Units.