Workflow State Channels v0

Status: exploratory design record

Issue: burin-labs/harn#2219

Decision

Adopt LangGraph-style typed state channels as an explicit workflow-runtime extension, but do not make them the default state model and do not replace artifacts, transcripts, sessions, or reduce nodes.

The v0 shape should be small:

  • workflows may declare named state channels
  • each channel has a schema, reducer, initial value, and visibility policy
  • stages may declare which channels they read and write
  • stage results may include a state_updates dict
  • the runtime applies deterministic reducers at safe node boundaries
  • persisted run records include the channel snapshot and the ordered update log

This gives LangGraph users the typed fan-out/reduce primitive they expect while preserving Harn's current model: artifacts carry evidence and provenance, transcripts remain session-owned, and workflow graphs stay explicit.

Current Runtime Survey

Harn already has the pieces that overlap with typed channels:

Surface | Current behavior | Gap
--- | --- | ---
map nodes | Produce branch artifacts, support max_concurrent, and retain partial failures. | Branch outputs are artifacts, not updates to named typed state slots.
join nodes | Gate readiness with join_policy.strategy (all, first, quorum). | Join readiness does not expose a typed merge target.
reduce nodes | Concatenate selected artifact text by default and emit one output artifact. | Reducer behavior is node-local and mostly text-oriented.
context_policy / input_contract | Select artifacts for a stage by kind, freshness, budget, and contract. | They select context; they do not define persistent workflow state.
run records | Persist stages, artifacts, transitions, lineage, replay fixtures, and status. | There is no first-class channel snapshot or channel-update audit log.
sessions/transcripts | Own conversation continuity independently of workflows. | They are intentionally not a general workflow state dict.

The gap is real, but it is narrower than "Harn needs LangGraph state." Harn needs deterministic, typed merge slots for workflows whose branches produce structured partial updates that should be replayed as state, not as prompt context.

Channel Definition

Add an optional workflow graph field:

state_channels: {
  messages: {
    schema: {type: "array", items: {type: "object"}},
    reducer: "append",
    initial: [],
    visibility: "public"
  },
  plan: {
    schema: task_plan_schema(),
    reducer: "last",
    initial: nil,
    visibility: "private"
  },
  score: {
    schema: {type: "number"},
    reducer: "sum",
    initial: 0.0,
    visibility: "public"
  }
}

The normalized graph stores this as:

{
  "state_channels": {
    "plan": {
      "schema": {},
      "reducer": "last",
      "initial": null,
      "visibility": "private"
    }
  }
}

Channel Fields

Field | Required | Meaning
--- | --- | ---
schema | yes | Harn schema or JSON Schema accepted by schema_check.
reducer | yes | Built-in reducer name.
initial | no | Initial channel value. Defaults to nil for last, [] for list reducers, {} for object reducers, 0 for numeric reducers.
visibility | no | "public" or "private". Private channel values stay out of visible host summaries.
description | no | Human-readable purpose for docs and planning prompts.
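The defaulting rules in the table above can be sketched as a small normalizer. This is an illustrative Python sketch, not Harn's implementation: the function names, the grouping of reducers into list, object, and numeric sets, and the choice of "public" as the default visibility are all assumptions made for the example.

```python
# Hypothetical grouping of the built-in reducers. The design record names the
# groups ("list", "object", "numeric") but does not list their members.
LIST_REDUCERS = {"append", "extend", "set_union"}
OBJECT_REDUCERS = {"merge"}
NUMERIC_REDUCERS = {"sum", "min", "max"}
KNOWN_REDUCERS = {"last"} | LIST_REDUCERS | OBJECT_REDUCERS | NUMERIC_REDUCERS


def default_initial(reducer):
    """Return the default initial value for a reducer name."""
    if reducer in LIST_REDUCERS:
        return []
    if reducer in OBJECT_REDUCERS:
        return {}
    if reducer in NUMERIC_REDUCERS:
        return 0
    return None  # "last" defaults to nil


def normalize_channel(name, raw):
    """Validate one channel definition and fill in its optional fields."""
    for field in ("schema", "reducer"):
        if field not in raw:
            raise ValueError(f"channel {name!r}: missing required field {field!r}")
    reducer = raw["reducer"]
    if reducer not in KNOWN_REDUCERS:
        raise ValueError(f"channel {name!r}: unknown reducer {reducer!r}")
    # Assumption: an omitted visibility is treated as "public".
    visibility = raw.get("visibility", "public")
    if visibility not in ("public", "private"):
        raise ValueError(f"channel {name!r}: invalid visibility {visibility!r}")
    normalized = {
        "schema": raw["schema"],
        "reducer": reducer,
        # An explicit nil initial is kept; only a missing key gets the default.
        "initial": raw["initial"] if "initial" in raw else default_initial(reducer),
        "visibility": visibility,
    }
    if "description" in raw:
        normalized["description"] = raw["description"]
    return normalized
```

Rejecting unknown reducer names at normalization time keeps later stages (execution, replay) free of that check.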

Reducers

v0 should support only deterministic built-ins:

Reducer | Input | Result
--- | --- | ---
last | any schema-compatible value | Replace with the update.
append | item or list | Append to a list channel.
extend | list | Extend a list channel with the update list.
merge | object | Shallow-merge object fields.
set_union | item or list | Append missing values using structural equality.
sum | number | Add numeric updates.
min | number | Keep the smaller value.
max | number | Keep the larger value.

Custom closures are deliberately out of v0. They are difficult to persist, replay, and expose over protocol surfaces. If custom reducers become necessary, they should be named deterministic Harn functions stored in a module path, not captured closures embedded in a graph record.
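To make the built-in semantics concrete, here is a minimal Python sketch of the eight reducers as pure functions. It is not the Harn runtime. Two points are assumptions: a list passed to append or set_union is treated as several items rather than one nested item, and Python's == stands in for structural equality.

```python
def _as_items(update):
    """Treat a list update as several items and anything else as one item."""
    return list(update) if isinstance(update, list) else [update]


def _extend(current, update):
    if not isinstance(update, list):
        raise TypeError("extend requires a list update")
    return list(current) + update


def _set_union(current, update):
    result = list(current)
    for item in _as_items(update):
        # Python's == compares dicts and lists by content. Note that it also
        # treats 1 and 1.0 as equal, which a real runtime may not.
        if item not in result:
            result.append(item)
    return result


# Every reducer is a pure function (current, update) -> next value and never
# mutates its inputs, which keeps replay from a log straightforward.
REDUCERS = {
    "last": lambda current, update: update,
    "append": lambda current, update: list(current) + _as_items(update),
    "extend": _extend,
    "merge": lambda current, update: {**current, **update},
    "set_union": _set_union,
    "sum": lambda current, update: current + update,
    "min": lambda current, update: min(current, update),
    "max": lambda current, update: max(current, update),
}


def apply_reducer(name, current, update):
    """Look up a built-in reducer by name and apply it once."""
    if name not in REDUCERS:
        raise ValueError(f"unknown reducer: {name}")
    return REDUCERS[name](current, update)
```

For example, `apply_reducer("merge", {"a": 1, "b": 2}, {"b": 3})` yields `{"a": 1, "b": 3}`: the merge is shallow, so a nested object in the update replaces the old one wholesale.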

Node Contract

Nodes may declare channel access:

nodes: {
  plan_branch: {
    kind: "map",
    reads: ["plan"],
    writes: ["findings"],
    map_policy: {item_artifact_kind: "workspace_file"}
  },
  summarize: {
    kind: "stage",
    reads: ["findings", "score"],
    writes: ["plan"]
  }
}

Access declarations are used for validation and prompt assembly. A node may still produce artifacts; the writes list only declares which channels the node's state_updates may target.
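The validation half of this contract could look like the following Python sketch. The function name and error wording are hypothetical; the only rule taken from this record is that reads and writes must name declared channels.

```python
def validate_node_access(state_channels, nodes):
    """Return a list of errors for reads/writes that name undeclared channels."""
    errors = []
    for node_id in sorted(nodes):  # sorted so the error order is stable
        node = nodes[node_id]
        for field in ("reads", "writes"):
            for channel in node.get(field, []):
                if channel not in state_channels:
                    errors.append(
                        f"node {node_id!r}: {field} names undeclared channel {channel!r}"
                    )
    return errors
```

Returning all errors at once, rather than raising on the first, lets graph authors fix every bad declaration in one pass.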

Update Envelope

Any stage result may include:

{
  "status": "completed",
  "text": "summary for the user",
  "state_updates": {
    "findings": [{"path": "src/auth.rs", "note": "rate limiter lives here"}],
    "score": 0.25
  }
}

The runtime should:

  1. reject updates to undeclared channels
  2. reject updates to channels not listed in the node's writes
  3. validate each update against the channel schema before reduction
  4. apply reducers in deterministic order: node completion order, then channel name sort order inside the update dict
  5. persist a workflow_state_update record with node id, attempt, previous value hash, update value hash, next value hash, reducer, and visibility
  6. store the final state snapshot on the run record
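The six steps above can be sketched in Python as follows. This is a sketch under stated assumptions, not the runtime's code: hashes are SHA-256 over canonical JSON, type_only_schema_ok is a toy stand-in for schema_check, only three reducers are included, the record field names are invented, and a rejected envelope is assumed to leave the snapshot untouched.

```python
import hashlib
import json


def value_hash(value):
    """Hash a JSON-like value via canonical JSON (an assumed encoding)."""
    encoded = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def type_only_schema_ok(schema, value):
    """Stand-in for schema_check: inspects only the top-level "type" keyword."""
    checks = {
        "array": lambda v: isinstance(v, list),
        "object": lambda v: isinstance(v, dict),
        "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    }
    expected = schema.get("type")
    return expected not in checks or checks[expected](value)


# Subset of the built-ins, enough for the example.
REDUCERS = {
    "last": lambda current, update: update,
    "extend": lambda current, update: list(current) + list(update),
    "sum": lambda current, update: current + update,
}


def apply_state_updates(channels, snapshot, log, node_id, attempt, writes, updates):
    """Apply one stage's state_updates; return the next snapshot."""
    # Steps 1-3: validate the whole envelope before touching any state.
    for name in sorted(updates):
        if name not in channels:
            raise ValueError(f"update to undeclared channel {name!r}")
        if name not in writes:
            raise ValueError(f"node {node_id!r} may not write channel {name!r}")
        if not type_only_schema_ok(channels[name]["schema"], updates[name]):
            raise ValueError(f"update for channel {name!r} fails its schema")
    next_snapshot = dict(snapshot)
    # Step 4: channel-name sort order inside one update dict.
    for name in sorted(updates):
        channel = channels[name]
        previous = next_snapshot[name]
        new_value = REDUCERS[channel["reducer"]](previous, updates[name])
        # Step 5: one workflow_state_update record per channel update.
        log.append({
            "node_id": node_id,
            "attempt": attempt,
            "channel": name,
            "reducer": channel["reducer"],
            "visibility": channel["visibility"],
            "previous_hash": value_hash(previous),
            "update_hash": value_hash(updates[name]),
            "next_hash": value_hash(new_value),
        })
        next_snapshot[name] = new_value
    # Step 6: the caller stores the returned snapshot on the run record.
    return next_snapshot
```

Calling this once per completed node, in node completion order, gives the outer ordering from step 4.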

For map nodes, branch updates are reduced in branch item index order for deterministic replay, regardless of wall-clock completion order.
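A short Python sketch of the map-node rule: branch results are collected in whatever order they finish, then sorted by item index before reduction. The function name and the (item_index, update) pair shape are illustrative assumptions.

```python
def reduce_branch_updates(initial, branch_results, reducer):
    """Fold branch updates into a channel value in item-index order.

    branch_results is a list of (item_index, update) pairs in arrival order.
    """
    value = initial
    for _, update in sorted(branch_results, key=lambda pair: pair[0]):
        value = reducer(value, update)
    return value


def append_item(current, update):
    """Append one item without mutating the current list."""
    return list(current) + [update]


# Branch 2 finished first, but the reduced value follows item order.
arrival_order = [(2, "c"), (0, "a"), (1, "b")]
merged = reduce_branch_updates([], arrival_order, append_item)
```

Here `merged` is `["a", "b", "c"]` for any arrival order of the same three branches, which is what makes the result replayable for order-sensitive reducers such as append and last.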

Prompt Assembly

State channels are not automatically dumped into every prompt. A stage sees only channels listed in reads, and the prompt renderer should include them in a separate state block from artifacts:

<workflow_state>
<channel name="plan" visibility="private">
...
</channel>
</workflow_state>

This prevents channels from becoming a second unbounded context stream.
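A renderer for this block might look like the Python sketch below. The record leaves the channel body format open ("..."), so serializing values as JSON is an assumption, as are the function name and the choice to emit nothing when a stage reads no channels.

```python
import json
from xml.sax.saxutils import escape, quoteattr


def render_workflow_state(channels, snapshot, reads):
    """Render only the channels a stage declared in reads."""
    if not reads:
        return ""  # assumption: stages with no reads get no state block
    lines = ["<workflow_state>"]
    for name in reads:
        visibility = channels[name]["visibility"]
        lines.append(
            f"<channel name={quoteattr(name)} visibility={quoteattr(visibility)}>"
        )
        # Escape &, <, and > so channel values cannot break the block's markup.
        lines.append(escape(json.dumps(snapshot[name], sort_keys=True)))
        lines.append("</channel>")
    lines.append("</workflow_state>")
    return "\n".join(lines)
```

Because the loop iterates over reads rather than over the snapshot, a channel the stage did not declare can never leak into its prompt.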

Replay And Resume

Channel state is replayed from the persisted update log. Deterministic replay may either:

  • load the saved final snapshot for speed, then verify update hashes when audit: true; or
  • replay each update through the reducer for strict audit mode.

Resume starts from the saved snapshot and appends new update records. The runtime must not re-apply updates from already-completed stages.
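Both replay modes can be sketched in Python. The sketch assumes each log record carries the update value itself in addition to the hashes listed under Update Envelope, since strict replay cannot re-run a reducer from a hash alone. The hash encoding (SHA-256 over canonical JSON), the record field names, and the three-reducer subset are also assumptions.

```python
import hashlib
import json


def value_hash(value):
    """Hash a JSON-like value via canonical JSON (an assumed encoding)."""
    encoded = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


# Subset of the built-ins, enough for the example.
REDUCERS = {
    "last": lambda current, update: update,
    "extend": lambda current, update: list(current) + list(update),
    "sum": lambda current, update: current + update,
}


def strict_replay(channels, update_log):
    """Strict audit mode: re-run every update and check the hash chain."""
    snapshot = {name: channel["initial"] for name, channel in channels.items()}
    for index, record in enumerate(update_log):
        name = record["channel"]
        if value_hash(snapshot[name]) != record["previous_hash"]:
            raise ValueError(f"record {index}: previous value hash mismatch")
        next_value = REDUCERS[record["reducer"]](snapshot[name], record["update"])
        if value_hash(next_value) != record["next_hash"]:
            raise ValueError(f"record {index}: next value hash mismatch")
        snapshot[name] = next_value
    return snapshot


def verify_snapshot(snapshot, update_log):
    """Fast path: compare each channel's last logged hash with the snapshot."""
    last_hash = {}
    for record in update_log:
        last_hash[record["channel"]] = record["next_hash"]
    return all(value_hash(snapshot[name]) == h for name, h in last_hash.items())
```

On resume, a runtime would load the snapshot, optionally call something like verify_snapshot, and then append new records only for stages that have not yet completed.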

Why Not Replace Artifacts

Artifacts remain the right abstraction for evidence:

  • they have source, freshness, lineage, token estimates, and relevance
  • they can be selected under context_policy
  • they can represent files, diffs, tests, verification bundles, and host state

Channels are for durable workflow state that many nodes update. Artifacts are for evidence and handoff material. A stage may emit both.

Migration Path

  1. Add graph normalization and validation for state_channels, reads, and writes; no execution semantics yet.
  2. Persist empty initial channel snapshots in run records.
  3. Accept state_updates from deterministic/static stages and map branches.
  4. Add stage prompt rendering for explicitly-read channels.
  5. Add protocol/portal display for public channel snapshots.
  6. Promote to an experimental stdlib helper only after replay and resume tests prove deterministic behavior.

Existing reduce nodes should continue to work unchanged. For simple text aggregation, reduce is still simpler. Use channels when fan-out branches produce structured updates that later stages need to read by name.

Open Questions

  • Should channel definitions live directly on workflow_graph, or under metadata.state_channels until the field graduates?
  • Should writes be required for all channel updates, or can a capability ceiling allow broad channel writes?
  • Should private channel values be redacted from run records or only from host summaries?
  • Should state schemas use Harn schema only, JSON Schema only, or accept both with normalization?
  • Do workflow queries need first-class access to channel snapshots, or should workflow.query stay explicitly published by user code?