# Harn

Harn is a pipeline-oriented programming language for orchestrating AI agents. LLM calls, tool use, concurrency, and error recovery are built into the language – no libraries or SDKs needed.

```harn
let response = llm_call(
    "Explain quicksort in two sentences.",
    "You are a computer science tutor."
)
println(response)
```

Harn files can contain top-level code like the above (an implicit pipeline), or organize logic into named pipelines for larger programs:

```harn
pipeline default(task) {
    let files = ["src/main.rs", "src/lib.rs"]
    let reviews = parallel each files { file ->
        let content = read_file(file)
        llm_call("Review this code:\n${content}", "You are a code reviewer.")
    }
    for review in reviews {
        println(review)
    }
}
```
## Get started
The fastest way to start is the Getting Started guide: install Harn, write a program, and run it in under five minutes.
## What’s in this guide
- Getting started – Install and run your first program
- Why Harn? – What problems Harn solves and how it compares
- Language basics – Syntax, types, control flow, functions, structs, enums
- Error handling – try/catch, Result type, the `?` operator, retry
- Modules and imports – Splitting code across files, standard library
- Concurrency – spawn/await, parallel, channels, mutexes, deadlines
- Language specification – Formal grammar and runtime semantics
- LLM calls and agent loops – Calling models, agent loops, tool use
- Transcript architecture – How Harn stores and replays agent conversations
- Workflow runtime – Workflow graphs, artifacts, run records, replay, evals
- Cookbook – Practical recipes and patterns
- Host boundary – How Harn integrates with host applications
- Bridge protocol – JSON-RPC contract for host bridges
- MCP and ACP integration – MCP client/server, ACP, and A2A protocols
- Harn portal – Local observability UI for runs and transcripts
- CLI reference – All CLI commands and flags
- Builtin functions – Complete reference for all built-in functions
- Editor integration – LSP, tree-sitter, and formatter support
- Testing – Running user tests and the conformance suite
# Getting started
This page gets you from zero to running your first Harn program.
## Prerequisites

- Rust 1.70 or later – install with `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
- Git
## Installation

### From crates.io

```sh
cargo install harn-cli
```

### From source

```sh
git clone https://github.com/burin-labs/harn
cd harn
./scripts/dev_setup.sh   # installs dev tooling, portal deps/build, git hooks, sccache
cargo build --release
cp target/release/harn ~/.local/bin/
```
Verify the installation:

```sh
harn version
```
## Your first program

Create a file called hello.harn:

```harn
println("Hello, world!")
```

Run it:

```sh
harn run hello.harn
```
That’s it. Harn files can contain top-level code without any boilerplate. The above is an implicit pipeline – the runtime wraps your top-level statements automatically.
## Adding a pipeline

For larger programs, organize code into named pipelines. The runtime executes the default pipeline (or the first one declared):

```harn
pipeline default(task) {
    let name = "Harn"
    println("Hello from ${name}!")
}
```
The task parameter is injected by the host runtime. It carries the
user’s request when Harn is used as an agent backend.
## Calling an LLM

Harn has native LLM support. Set your API key and call a model directly:

```sh
export ANTHROPIC_API_KEY=sk-ant-...
```

```harn
let response = llm_call(
    "Explain quicksort in two sentences.",
    "You are a computer science tutor."
)
println(response)
```
No imports, no SDK initialization, no response parsing. Harn ships with built-in configs for Anthropic, OpenAI, OpenRouter, Ollama, HuggingFace, and local OpenAI-compatible servers.
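Switching providers means changing the `provider` and `model` fields in the options dict and nothing else. A hedged sketch – the model names and the `"ollama"` provider key below are illustrative, not guaranteed defaults:

```harn
// Same prompt, two backends; only the options dict differs.
let cloud = llm_call("Say hi", nil, {provider: "anthropic", model: "claude-sonnet-4-20250514"})
let local = llm_call("Say hi", nil, {provider: "ollama", model: "llama3"})
```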
## The REPL

Start an interactive session:

```sh
harn repl
```

The REPL evaluates expressions as you type and displays results immediately. It keeps a persistent history in `~/.harn/repl_history` and supports multi-line blocks until delimiters are balanced, which makes it useful for experimenting with builtins and small snippets.
## Project setup

Scaffold a new project with `harn init` or pick a starter with `harn new`:

```sh
harn new my-agent --template agent
cd my-agent
harn doctor --no-network
```

This creates a directory with harn.toml (project config) and starter files for the selected template. Run it with:

```sh
harn run main.harn
```
## Remote MCP quick start

If you want to use a cloud MCP server such as Notion, authorize it once with the CLI and then reference it from harn.toml:

```sh
harn mcp redirect-uri
harn mcp login https://mcp.notion.com/mcp --scope "read write"
```
## Next steps
- Why Harn? – What problems Harn solves
- Language basics – Syntax, types, control flow
- LLM calls and agent loops – Calling models and building agents
- Cookbook – Practical recipes and patterns
# Scripting Cheatsheet
A compact, prose-friendly tour of everything you need to write real
Harn scripts. The companion one-page LLM reference is at
docs/llm/harn-quickref.md
(outside the mdBook; served as raw Markdown) — they cover the same
ground with different shapes, and should stay in lockstep. Agents that
can fetch URLs should prefer the quickref.
## Strings

Use standard double-quoted strings with `\n` escapes for short literals, and triple-quoted `"""..."""` for multiline prose like system prompts:

```harn
let greeting = "Hello, ${name}!"
let prompt = """
You are a strict grader.
Emit exactly one verdict.
"""
```
Heredoc-style <<TAG ... TAG is only valid inside LLM tool-call
argument JSON — in source code, the parser points you at triple
quotes.
## Slicing

End-exclusive slicing works on strings and lists:

```harn
let head = content[0:400]
let tail = content[len(content) - 400:len(content)]
let sub = xs[1:4]
```
substring(s, start, length) exists too, but the third argument is a
length, not an end index. Prefer the slice syntax to avoid that
footgun.
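To make the footgun concrete, a small contrast – the same substring, extracted two ways:

```harn
let s = "abcdef"
let a = s[1:4]             // "bcd" – 4 is an exclusive end index
let b = substring(s, 1, 3) // "bcd" – 3 is a length, not an end index
```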
## `if` is an expression

`if` / `else` produces a value. Drop it straight into a `let`, an argument, or a `return`:

```harn
let body = if len(content) > 2400 {
    content[0:400] + "..." + content[len(content) - 400:len(content)]
} else {
    content
}
```
## Module scope

Top-level `let` / `var` and `fn` declarations are visible inside functions defined in the same file – no wrapping in a getter `fn` needed:

```harn
let GRADER_SYSTEM = """
You are a strict grader...
"""

pub fn grade(path) {
    return llm_call(read_file(path), GRADER_SYSTEM, {
        provider: "auto",
        model: "local:gemma-4-e4b-it",
    })
}
```
(Module-level mutable var cross-function mutation is not fully
supported yet. If you need shared mutable state across functions, use
atomics: atomic(0), atomic_add(a, 1), atomic_get(a).)
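A minimal sketch of the atomics workaround, using only the builtins named above – shared state that concurrent workers can safely increment:

```harn
// Shared counter incremented from concurrent workers.
let hits = atomic(0)
parallel each ["a", "b", "c"] { item ->
    atomic_add(hits, 1)
}
println(atomic_get(hits)) // 3
```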
Results and error handling
let r = try { llm_call(prompt, nil, opts) }
// Optional chaining short-circuits on Result.Err.
let text = r?.prose ?? "no response"
// Explicit error inspection.
if unwrap_err(r) != "" {
log("failed")
}
// `try/catch` also works as an expression — the whole form evaluates to
// the try body's tail value on success or the catch handler's tail value
// on a caught throw, so simple fallbacks don't need Result gymnastics.
let prose = try { llm_call(prompt, nil, opts).prose } catch (e) { "fallback" }
## Concurrency

```harn
// Spawn a task, collect its result.
let h = spawn { long_work() }
let value = await(h)

// parallel each: concurrent map over a list.
let doubled = parallel each xs { x -> x * 2 }

// parallel settle: concurrent map that collects per-item Ok/Err.
let outcome = parallel settle paths { p -> grade(p) }
println(outcome.succeeded)

// Cap in-flight workers so you don't overwhelm the backend.
let results = parallel settle paths with { max_concurrent: 4 } { p ->
    llm_call(p, nil, opts)
}
```

`max_concurrent: 0` (or a missing `with` clause) means unlimited. See concurrency.md for the RPM rate limiter, channels, `select`, `deadline`, and `defer`.
## CLI: argv

```sh
harn run my_script.harn -- file1.md file2.md
```

Inside the script:

```harn
fn grade_file(path) {
    println(path)
}

for path in argv {
    grade_file(path)
}
```

`argv` is always defined as `list<string>`; it is empty when no positional args were given.
## Regex

```harn
let matches = regex_match("[0-9]+", "abc 42 def 7")
let swapped = regex_replace("(\\w+)\\s(\\w+)", "$2 $1", "hello world")
let same = regex_replace_all("(\\w+)\\s(\\w+)", "$2 $1", "hello world")
let captures = regex_captures("(?P<day>[A-Z][a-z]+)", "Mon Tue")
```

Both `regex_replace` and `regex_replace_all` replace every match; both support `$1`, `$2`, `${name}` backrefs from the regex crate.
## LLM calls

```harn
let r = llm_call(prompt, system, {
    provider: "auto",              // infers from model prefix
    model: "local:gemma-4-e4b-it",
    output_schema: schema,
    output_validation: "error",
    schema_retries: 2,             // retry with corrective nudge on schema mismatch
    response_format: "json",
})

println(r.prose)        // unwrapped prose (preferred for "the answer")
println(r.data.verdict) // parsed structured output
```
Key options:

| Option | Default | Notes |
|---|---|---|
| `provider` | `"auto"` | `"auto"` infers from the model prefix (e.g. `local:` / `claude-*` / `gpt-*`). |
| `llm_retries` | `2` | Transient error retries (HTTP 5xx, timeout, rate limit). Set `0` to fail fast. |
| `llm_backoff_ms` | `2000` | Base for exponential backoff. |
| `schema_retries` | `1` | Re-prompt on `output_schema` validation failure. Requires `output_validation: "error"` to kick in. |
| `schema_retry_nudge` | auto | String (verbatim), `true` (auto), or `false` (bare retry). |
| `output_validation` | `"off"` | `"error"` throws on mismatch; `"warn"` logs. |
See docs/src/llm-and-agents.md for agent_loop, tool dispatch, and
the full option surface.
## Rate limiting
max_concurrent bounds simultaneous in-flight tasks on the caller
side. Providers can also be rate-limited at the throughput layer via
rpm: in providers.toml / harn.toml or
HARN_RATE_LIMIT_<PROVIDER>=N env vars. The two compose: use
max_concurrent to prevent bursts, and rpm to shape sustained
throughput.
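As a sketch, a provider-level rpm cap might be declared like this – the exact table layout in harn.toml is an assumption here; check the providers.toml documentation for the authoritative schema:

```toml
# Hypothetical harn.toml fragment: cap the anthropic provider
# at 60 requests per minute at the throughput layer.
[providers.anthropic]
rpm = 60
```

Equivalently, `HARN_RATE_LIMIT_ANTHROPIC=60` would set the same cap via the environment.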
## More

- LLM-friendly one-pager: `docs/llm/harn-quickref.md` (loaded automatically by the `harn-scripting` Claude skill when present).
- Full mdBook: `docs/src/` (`introduction.md`, `language-basics.md`, `concurrency.md`, `error-handling.md`, `llm-and-agents.md`).
- Language spec: `spec/HARN_SPEC.md`.
- Conformance examples: `conformance/tests/*.harn`.
# Why Harn?

## The problem
Building AI agents is complex. A typical agent needs to call LLMs, execute tools, handle errors and retries, run tasks concurrently, maintain conversation state, and coordinate multiple sub-agents. In most languages, this means assembling a tower of libraries:
- An LLM SDK (LangChain, OpenAI SDK, Anthropic SDK)
- An async runtime (asyncio, Tokio, goroutines)
- Retry and timeout logic (tenacity, custom decorators)
- Tool registration and dispatch (custom JSON Schema plumbing)
- Structured logging and tracing (separate packages)
- A test framework (pytest, Jest)
Each layer adds configuration, boilerplate, and failure modes. The orchestration logic – the part that actually matters – gets buried under infrastructure code.
## What Harn does differently
Harn is a programming language where agent orchestration primitives are built into the syntax, not bolted on as libraries.
In practice that means Harn aims to be the long-term orchestration boundary between product code and provider/runtime code. Product integrations should mainly declare workflows, policies, capabilities, and UI hooks rather than rebuilding transcript logic, tool queues, replay fixtures, or provider response normalization.
### Native LLM calls

`llm_call` and `agent_loop` are language primitives. No SDK imports, no client initialization, no response parsing. Set an environment variable and call a model:

```harn
let answer = llm_call("Summarize this code", "You are a code reviewer.")
```
Harn ships with built-in configs for Anthropic, OpenAI, OpenRouter, HuggingFace, Ollama, and local OpenAI-compatible servers. Switching providers is a one-field change in the options dict.
### Pipeline composition

Pipelines are the unit of composition. They can extend each other, override steps, and be imported across files. This gives you a natural way to structure multi-stage agent workflows:

```harn
pipeline analyze(task) {
    let context = read_file("README.md")
    let plan = llm_call("${task}\n\nContext:\n${context}", "Break this into steps.")
    let steps = json_parse(plan)
    let results = parallel each steps { step ->
        agent_loop(step, "You are a coding assistant.", {persistent: true})
    }
    write_file("results.json", json_stringify(results))
}
```
Files can also contain top-level code without a pipeline block (implicit pipeline), making Harn work well for scripts and quick experiments.
### MCP and ACP integration

Harn has built-in support for the Model Context Protocol. Connect to any MCP server, or expose your Harn pipeline as one. ACP integration lets editors use Harn as an agent backend. That includes remote HTTP MCP servers with standalone OAuth handled by the CLI, so cloud MCP integrations can be treated as normal runtime dependencies instead of host-specific glue.

```harn
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let tools = mcp_list_tools(client)
let content = mcp_call(client, "read_file", {path: "/tmp/data.txt"})
mcp_disconnect(client)
```
### Concurrency without async/await

`parallel each`, `parallel`, `spawn`/`await`, and channels are keywords, not library functions. No callback chains, no promise combinators, no `async def` annotations:

```harn
let results = parallel each files { file ->
    llm_call(read_file(file), "Review this file for security issues")
}
```
### Retry and error recovery

`retry` and `try`/`catch` are control flow constructs. Wrapping an unreliable LLM call in retries is a one-liner:

```harn
retry 3 {
    let result = llm_call(prompt, system)
    json_parse(result)
}
```
### Gradual typing

Type annotations are optional. Add them where they help, leave them off where they don’t. Structural shape types let you describe expected dict fields:

```harn
fn score(text: string) -> int {
    let result = llm_call(text, "Rate 1-10. Respond with just the number.")
    return to_int(result)
}
```
### Embeddable
Harn compiles to a WASM target for browser embedding and ships with LSP and DAP servers for IDE integration. Agent pipelines can run inside editors, CI systems, or web applications.
## Who Harn is for
- Developers building AI agents who want orchestration logic to be readable and concise, not buried under framework boilerplate.
- IDE authors who want a scriptable, embeddable language for agent pipelines with built-in LSP support.
- Researchers prototyping agent architectures who need fast iteration without setting up infrastructure.
## Comparison

Here is what a “fetch three URLs in parallel, summarize each with an LLM, and retry failures” pattern looks like across approaches:

Python (LangChain + asyncio):

```python
import asyncio

import aiohttp
from langchain_anthropic import ChatAnthropic
from tenacity import retry, stop_after_attempt

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

@retry(stop=stop_after_attempt(3))
async def summarize(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            text = await resp.text()
    result = await llm.ainvoke(f"Summarize:\n{text}")
    return result.content

async def main():
    urls = ["https://a.com", "https://b.com", "https://c.com"]
    results = await asyncio.gather(*[summarize(u) for u in urls])
    for r in results:
        print(r)

asyncio.run(main())
```
Harn:

```harn
pipeline default(task) {
    let urls = ["https://a.com", "https://b.com", "https://c.com"]
    let results = parallel each urls { url ->
        retry 3 {
            let page = http_get(url)
            llm_call("Summarize:\n${page}", "Be concise.")
        }
    }
    for r in results {
        println(r)
    }
}
```
The Harn version has no imports, no decorators, no client initialization, no async annotations, and no runtime setup. The orchestration logic is all that remains.
## Getting started
See the Getting started guide to install Harn and run your first program, or jump to the cookbook for practical patterns.
# Language basics
This guide covers the core syntax and semantics of Harn.
## Implicit pipeline

Harn files can contain top-level code without a pipeline block. The runtime wraps it in an implicit pipeline automatically:

```harn
let x = 1 + 2
println(x)

fn double(n) {
    return n * 2
}
println(double(5))
```
This is convenient for scripts, experiments, and small programs.
## Pipelines

For larger programs, organize code into named pipelines. The runtime executes the pipeline named `default`, or the first one declared.

```harn
pipeline default(task) {
    println("Hello from the default pipeline")
}

pipeline other(task) {
    println("This only runs if called or if there's no default")
}
```

Pipeline parameters `task` and `project` are injected by the host runtime. A context dict with keys `task`, `project_root`, and `task_type` is always available.
## Variables

`let` creates immutable bindings. `var` creates mutable ones.

```harn
let name = "Alice"
var counter = 0
counter = counter + 1 // ok
name = "Bob"          // error: immutable assignment
```
Bindings are lexically scoped. Each `if` branch, loop body, `catch` body, and explicit `{ ... }` block gets its own scope, so inner bindings can shadow outer names without colliding:

```harn
let status = "outer"
if true {
    let status = "inner"
    println(status) // inner
}
println(status)     // outer
```
If you want to update an outer binding from inside a block, declare it with
var outside the block and assign to it inside the branch or loop body.
## Types and values

Harn is dynamically typed with optional type annotations.

| Type | Example | Notes |
|---|---|---|
| `int` | `42` | Platform-width integer |
| `float` | `3.14` | Double-precision |
| `string` | `"hello"` | UTF-8, supports interpolation |
| `bool` | `true`, `false` | |
| `nil` | `nil` | Null value |
| `list` | `[1, 2, 3]` | Heterogeneous, ordered |
| `dict` | `{name: "Alice"}` | String-keyed map |
| `closure` | `{ x -> x + 1 }` | First-class function |
| `duration` | `5s`, `100ms` | Time duration |
### Type annotations

Annotations are optional and checked at compile time:

```harn
let x: int = 42
let name: string = "hello"
let nums: list<int> = [1, 2, 3]

fn add(a: int, b: int) -> int {
    return a + b
}
```

Supported type expressions: `int`, `float`, `string`, `bool`, `nil`, `list`, `list<T>`, `dict`, `dict<K, V>`, union types (`string | nil`), and structural shape types (`{name: string, age: int}`).
Parameter type annotations for primitive types (`int`, `float`, `string`, `bool`, `list`, `dict`, `set`, `nil`, `closure`) are enforced at runtime. Calling a function with the wrong type produces a TypeError:

```harn
fn add(a: int, b: int) -> int {
    return a + b
}

add("hello", "world")
// TypeError: parameter 'a' expected int, got string (hello)
```
### Structural types (shapes)

Shape types describe the expected fields of a dict. The type checker verifies that required fields are present with compatible types. Extra fields are allowed (width subtyping).

```harn
let user: {name: string, age: int} = {name: "Alice", age: 30}
let config: {host: string, port?: int} = {host: "localhost"}

fn greet(u: {name: string}) -> string {
    return "hi ${u["name"]}"
}
greet({name: "Bob", age: 25})
```

Use type aliases for reusable shape definitions:

```harn
type Config = {model: string, max_tokens: int}
let cfg: Config = {model: "gpt-4", max_tokens: 100}
```
### Truthiness
These values are falsy: false, nil, 0, 0.0, "", [], {}. Everything else is truthy.
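A quick sketch of how the falsy values behave in a condition – all seven print "falsy":

```harn
for v in [false, nil, 0, 0.0, "", [], {}] {
    if v {
        println("truthy")
    } else {
        println("falsy") // taken for every value in the list
    }
}
```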
## Strings

### Interpolation

```harn
let name = "world"
println("Hello, ${name}!")
println("2 + 2 = ${2 + 2}")
```

Any expression works inside `${}`.
### Raw strings

Raw strings use the `r"..."` prefix. No escape processing or interpolation is performed – backslashes and dollar signs are taken literally. Useful for regex patterns and file paths:

```harn
let pattern = r"\d+\.\d+"
let path = r"C:\Users\alice\docs"
```

Raw strings cannot span multiple lines.
### Multi-line strings

```harn
let doc = """
This is a multi-line string.
Common leading whitespace is stripped.
"""
```

Multi-line strings support `${expression}` interpolation with automatic indent stripping:

```harn
let name = "world"
let greeting = """
Hello, ${name}!
Welcome to Harn.
"""
```
### Escape sequences

`\n` (newline), `\t` (tab), `\\` (backslash), `\"` (quote), `\$` (dollar sign).
### String methods

```harn
"hello".count                 // 5
"hello".empty                 // false
"hello".contains("ell")       // true
"hello".replace("l", "r")     // "herro"
"a,b,c".split(",")            // ["a", "b", "c"]
"  hello  ".trim()            // "hello"
"hello".starts_with("he")     // true
"hello".ends_with("lo")       // true
"hello".uppercase()           // "HELLO"
"hello".lowercase()           // "hello"
"hello world".substring(0, 5) // "hello"
```
## Operators
Ordered by precedence (lowest to highest):
| Precedence | Operators | Description |
|---|---|---|
| 1 | |> | Pipe |
| 2 | ? : | Ternary conditional |
| 3 | ?? | Nil coalescing |
| 4 | || | Logical OR (short-circuit) |
| 5 | && | Logical AND (short-circuit) |
| 6 | == != | Equality |
| 7 | < > <= >= in not in | Comparison, membership |
| 8 | + - | Add, subtract, string/list concat |
| 9 | * / | Multiply, divide |
| 10 | ! - | Unary not, negate |
| 11 | . ?. [] [:] () ? | Member access, optional chaining, subscript, slice, call, try |
Division by zero returns nil. Integer division truncates.
Arithmetic operators are strictly typed — mismatched operands (e.g.
"hello" + 5) produce a TypeError. Use to_string() or string
interpolation ("value=${x}") for explicit conversion.
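A minimal example of the failure and both conversion fixes:

```harn
let x = 5
// let bad = "hello" + x        // TypeError: mismatched operand types
let ok = "hello " + to_string(x) // explicit conversion
let ok2 = "value=${x}"           // or string interpolation
```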
### Optional chaining (`?.`)

Access properties or call methods on values that might be nil. Returns nil instead of erroring when the receiver is nil:

```harn
let user = nil
println(user?.name)        // nil (no error)
println(user?.greet("hi")) // nil (method not called)

let d = {name: "Alice"}
println(d?.name)           // Alice
```

Chains propagate nil: `a?.b?.c` returns nil if any step is nil.
### List and string slicing (`[start:end]`)

Extract sublists or substrings using slice syntax:

```harn
let items = [10, 20, 30, 40, 50]
println(items[1:3]) // [20, 30]
println(items[:2])  // [10, 20]
println(items[3:])  // [40, 50]
println(items[-2:]) // [40, 50]

let s = "hello world"
println(s[0:5])     // hello
println(s[-5:])     // world
```

Negative indices count from the end. Omit `start` for 0, omit `end` for the length.
### Try operator (`?`)

The postfix `?` operator works with Result values (`Ok` / `Err`). It unwraps `Ok` values and propagates `Err` values by returning early from the enclosing function:

```harn
fn divide(a, b) {
    if b == 0 {
        return Err("division by zero")
    }
    return Ok(a / b)
}

fn compute(x) {
    let result = divide(x, 2)? // unwraps Ok, or returns Err early
    return Ok(result + 10)
}

fn compute_zero(x) {
    let result = divide(x, 0)? // divide returns Err, ? propagates it
    return Ok(result + 10)
}

println(compute(20))      // Result.Ok(20)
println(compute_zero(20)) // Result.Err(division by zero)
```
Multiple ? calls can be chained in a single function to build
pipelines that short-circuit on the first error.
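Building on the `divide` function above, a sketch of such a pipeline – whichever `?` hits an `Err` first ends the function:

```harn
fn quarter(x) {
    let half = divide(x, 2)?     // an Err here returns immediately...
    let result = divide(half, 2)? // ...and so does one here
    return Ok(result)
}
println(quarter(40)) // Result.Ok(10)
```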
### Membership operators (`in`, `not in`)

Test whether a value is contained in a collection:

```harn
// Lists
println(3 in [1, 2, 3])     // true
println(6 not in [1, 2, 3]) // true

// Strings (substring containment)
println("world" in "hello world") // true
println("xyz" not in "hello")     // true

// Dicts (key membership)
let data = {name: "Alice", age: 30}
println("name" in data)      // true
println("email" not in data) // true

// Sets
let s = set(1, 2, 3)
println(2 in s)     // true
println(5 not in s) // true
```
## Control flow

### if/else

```harn
if score > 90 {
    println("A")
} else if score > 80 {
    println("B")
} else {
    println("C")
}
```

`if` can be used as an expression: `let grade = if score > 90 { "A" } else { "B" }`
### for/in

```harn
for item in [1, 2, 3] {
    println(item)
}

// Dict iteration yields {key, value} entries sorted by key
for entry in {a: 1, b: 2} {
    println("${entry.key}: ${entry.value}")
}
```
### while

```harn
var i = 0
while i < 10 {
    println(i)
    i = i + 1
}
```

While loops have a safety limit of 10,000 iterations.
### match

```harn
match status {
    "active" -> { println("Running") }
    "stopped" -> { println("Halted") }
}
```

Patterns are expressions compared by equality. The first match wins. No match returns nil.
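Since an unmatched `match` yields nil, it pairs naturally with `??` for a default. A sketch, assuming `match` can be used as an expression the way `if` can:

```harn
let status = "paused"
let label = match status {
    "active" -> { "Running" }
    "stopped" -> { "Halted" }
}
println(label ?? "Unknown") // no arm matched, so the default is used
```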
### guard

Early exit if a condition isn’t met:

```harn
guard x > 0 else {
    return "invalid"
}
// x is guaranteed > 0 here
```
### Ranges

Harn has a single range keyword: `to`. Ranges are inclusive by default – `1 to 5` is `[1, 2, 3, 4, 5]` – because that matches how the expression reads aloud. Add the trailing `exclusive` modifier when you want the half-open form.

```harn
for i in 1 to 5 {           // inclusive: 1, 2, 3, 4, 5
    println(i)
}

for i in 0 to 3 exclusive { // half-open: 0, 1, 2
    println(i)
}
```

For Python-compatible 0-indexed iteration there is also a `range()` stdlib builtin. `range(n)` is equivalent to `0 to n exclusive`; `range(a, b)` is `a to b exclusive`. Both forms always produce half-open integer ranges.

```harn
for i in range(5) { println(i) }    // 0, 1, 2, 3, 4
for i in range(3, 7) { println(i) } // 3, 4, 5, 6
```
### Iteration patterns

Prefer destructuring and stdlib helpers over integer-indexed loops – they read better and avoid off-by-one bugs.

```harn
// enumerate(): yields a list of {index, value} dicts.
for {index, value} in ["a", "b", "c"].enumerate() {
    println("${index}: ${value}")
}

// zip(): yields [a, b] pairs — use list destructuring.
for [name, score] in names.zip(scores) {
    println("${name}: ${score}")
}

// Dict iteration yields {key, value} entries sorted by key.
for {key, value} in {a: 1, b: 2}.entries() {
    println("${key} -> ${value}")
}
```
for heads currently accept a bare name, a list pattern [a, b], or a dict
pattern {name1, name2}. Tuple patterns written with parentheses
(for (a, b) in ...) are not yet supported — use the list pattern when the
iterable yields pair-lists (zip), and the dict pattern when the iterable
yields shaped dicts (enumerate, entries).
## Functions and closures

### Named functions

```harn
fn double(x) {
    return x * 2
}

fn greet(name: string) -> string {
    return "Hello, ${name}!"
}
```
Functions can be declared at the top level (for library files) or inside pipelines.
### Rest parameters

Use `...name` as the last parameter to collect any remaining arguments into a list:

```harn
fn sum(...nums) {
    var total = 0
    for n in nums {
        total = total + n
    }
    return total
}
println(sum(1, 2, 3)) // 6

fn log(level, ...parts) {
    println("[${level}] ${join(parts, " ")}")
}
log("INFO", "server", "started") // [INFO] server started
```
If no extra arguments are provided, the rest parameter is an empty list.
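Using the `sum` function above, the empty-list case looks like this:

```harn
println(sum())   // 0  — nums is []
println(sum(10)) // 10 — nums is [10]
```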
### Closures

```harn
let square = { x -> x * x }
let add = { a, b -> a + b }

println(square(4)) // 16
println(add(2, 3)) // 5
```
Closures capture their lexical environment at definition time. Parameters are immutable.
### Higher-order functions

```harn
let nums = [1, 2, 3, 4, 5]

nums.map({ x -> x * 2 })              // [2, 4, 6, 8, 10]
nums.filter({ x -> x > 3 })           // [4, 5]
nums.reduce(0, { acc, x -> acc + x }) // 15
nums.find({ x -> x == 3 })            // 3
nums.any({ x -> x > 4 })              // true
nums.all({ x -> x > 0 })              // true
nums.flat_map({ x -> [x, x] })        // [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```
### Lazy iterators

Collection methods like `.map` and `.filter` above are eager – each call allocates a new list and walks the whole input. That’s fine for small inputs, but wastes work when you only need the first few results, or when you want to compose several transforms.

Harn also ships a lazy iterator protocol. Call `.iter()` on any iterable source (list, dict, set, string, generator, channel) to lift it into an `Iter<T>` – a single-pass, fused iterator. Combinators on an `Iter` return a new `Iter` without running any work. Sinks drain the iter and return an eager value.

```harn
let xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
let first_three_doubled_evens = xs
    .iter()
    .filter({ x -> x % 2 == 0 })
    .map({ x -> x * 2 })
    .take(3)
    .to_list()
println(first_three_doubled_evens) // [4, 8, 12]
```
Use `.enumerate()` to get `(index, value)` pairs in a for-loop:

```harn
let items = ["a", "b", "c"]
for (i, x) in items.iter().enumerate() {
    println("${i}: ${x}")
}
```
`.iter()` on a dict yields `Pair(key, value)` values – destructure them in a for-loop:

```harn
for (k, v) in {a: 1, b: 2}.iter() {
    println("${k}: ${v}")
}
```

A direct `for entry in some_dict` still yields the usual `{key, value}` dicts (back-compat). `pair(a, b)` also exists as a builtin for constructing pairs explicitly.
Lazy combinators (return a new Iter): .map, .filter,
.flat_map, .take(n), .skip(n), .take_while, .skip_while,
.zip, .enumerate, .chain, .chunks(n), .windows(n).
Sinks (drain the iter, return a value): .to_list(), .to_set(),
.to_dict() (requires Pair items), .count(), .sum(), .min(),
.max(), .reduce(init, f), .first(), .last(), .any(p),
.all(p), .find(p), .for_each(f).
When to use which: reach for eager list/dict/set methods for
simple one-shot transforms where you want a collection back. Reach
for .iter() when you’re composing multiple transforms, taking the
first N results of a large input, consuming a generator lazily, or
driving a for-loop over combined sources.
Iterators are single-pass and fused — once exhausted, they stay
exhausted. Iteration takes a snapshot of the backing collection,
so mutating the source after .iter() does not affect the iter.
Printing an iter renders <iter> without draining it.
Numeric ranges (a to b, range(n)) participate in the lazy iter
protocol directly: .map / .filter / .take / .zip / .enumerate / ...
on a Range return a lazy iter with no upfront allocation, so
(1 to 10_000_000).map(fn(x) { return x * 2 }).take(5).to_list()
finishes instantly. Range still keeps its O(1) fast paths for
.len / .first / .last / .contains(x) and r[k] subscript — those
don’t round-trip through iter.
### Pipe operator

The pipe operator `|>` passes the left side as the argument to the right side:

```harn
let result = data
    |> { list -> list.filter({ x -> x > 0 }) }
    |> { list -> list.map({ x -> x * 2 }) }
    |> json_stringify
```
### Pipe placeholder (`_`)

Use `_` to control where the piped value is placed in the call:

```harn
"hello world" |> split(_, " ")        // ["hello", "world"]
[3, 1, 2] |> _.sort()                 // [1, 2, 3]
items |> len(_)                       // length of items
"world" |> replace("hello _", "_", _) // "hello world"
```

Without `_`, the value is passed as the sole argument to a closure or function name.
### Multiline expressions

Binary operators, method chains, and pipes can span multiple lines:

```harn
let message = "hello"
    + " "
    + "world"

let result = items
    .filter({ x -> x > 0 })
    .map({ x -> x * 2 })

let valid = check_a()
    && check_b()
    || fallback()
```

Note: `-` does not continue across lines because it doubles as unary negation.

A backslash at the end of a line forces the next line to continue the current expression, even when no operator is present:

```harn
let long_value = some_function( \
    arg1, arg2, arg3 \
)
```
## Destructuring

Destructuring extracts values from dicts and lists into local variables.
### Dict destructuring

```harn
let person = {name: "Alice", age: 30}
let {name, age} = person
println(name) // "Alice"
println(age)  // 30
```
### List destructuring

```harn
let items = [1, 2, 3, 4, 5]
let [first, ...rest] = items
println(first) // 1
println(rest)  // [2, 3, 4, 5]
```
### Renaming

Use `:` to bind a dict field to a different variable name:

```harn
let data = {name: "Alice"}
let {name: user_name} = data
println(user_name) // "Alice"
```
### Destructuring in for-in loops

```harn
let entries = [{key: "a", value: 1}, {key: "b", value: 2}]
for {key, value} in entries {
    println("${key}: ${value}")
}
```
### Default values

Pattern fields can specify defaults with `= expr`. The default is used when the value would otherwise be nil:

```harn
let { name = "anon", role = "user" } = { name: "Alice" }
println(name) // Alice
println(role) // user

let [a = 0, b = 0, c = 0] = [1, 2]
println(c) // 0

// Combine with renaming
let { name: display = "Unknown" } = {}
println(display) // Unknown
```
### Missing keys and empty rest

Missing keys destructure to nil (unless a default is specified). A rest pattern with no remaining items gives an empty collection:

```harn
let {name, email} = {name: "Alice"}
println(email) // nil

let [only, ...rest] = [42]
println(rest) // []
```
## Collections

### Lists

```harn
let nums = [1, 2, 3]
nums.count // 3
nums.first // 1
nums.last  // 3
nums.empty // false
nums[0]    // 1 (subscript access)
```

Lists support `+` for concatenation: `[1, 2] + [3, 4]` yields `[1, 2, 3, 4]`. Assigning to an out-of-bounds index throws an error.
### Dicts

```harn
let user = {name: "Alice", age: 30}
user.name         // "Alice" (property access)
user["age"]       // 30 (subscript access)
user.missing      // nil (missing keys return nil)
user.has("email") // false
user.keys()       // ["age", "name"] (sorted)
user.values()     // [30, "Alice"]
user.entries()    // [{key: "age", value: 30}, ...]
user.merge({role: "admin"}) // new dict with merged keys
user.map_values({ v -> to_string(v) })
user.filter({ v -> type_of(v) == "int" })
```
Computed keys use bracket syntax: {[dynamic_key]: value}.
Quoted string keys are also supported for JSON compatibility:
{"content-type": "json"}. The formatter normalizes simple quoted keys
to unquoted form and non-identifier keys to computed key syntax.
Keywords can be used as dict keys and property names: {type: "read"},
op.type.
Dicts iterate in sorted key order (alphabetical). This means
for k in dict is deterministic and reproducible, but does not preserve
insertion order.
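Putting the key forms and the iteration guarantee together (a small sketch; the header names are arbitrary):

```harn
let kind = "content-type"
let headers = {[kind]: "json", "x-request-id": "abc"}   // computed key + quoted key
println(headers["content-type"])   // json
// Iteration follows sorted key order, not insertion order:
for k in {b: 2, a: 1, c: 3} {
    println(k)   // a, then b, then c
}
```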
Sets
Sets are unordered collections of unique values. Duplicates are automatically removed.
let s = set(1, 2, 3) // create from individual values
let s2 = set([4, 5, 5, 6]) // create from a list (deduplicates)
let tags = set("a", "b", "c") // works with any value type
Set operations are provided as builtin functions:
let a = set(1, 2, 3)
let b = set(3, 4, 5)
set_contains(a, 2) // true
set_contains(a, 99) // false
set_union(a, b) // set(1, 2, 3, 4, 5)
set_intersect(a, b) // set(3)
set_difference(a, b) // set(1, 2) -- items in a but not in b
set_add(a, 4) // set(1, 2, 3, 4)
set_remove(a, 2) // set(1, 3)
Sets support iteration with for..in:
var sum = 0
for item in set(10, 20, 30) {
sum = sum + item
}
println(sum) // 60
Convert a set to a list with to_list():
let items = to_list(set(10, 20))
type_of(items) // "list"
Enums and structs
Enums
enum Status {
Active
Inactive
Pending(reason)
Failed(code, message)
}
let s = Status.Pending("waiting")
match s.variant {
"Pending" -> { println(s.fields[0]) }
"Active" -> { println("ok") }
}
Structs
struct Point {
x: int
y: int
}
let p = {x: 10, y: 20}
println(p.x)
Structs can also be constructed with the struct name as a constructor, using named fields directly:
let p = Point { x: 10, y: 20 }
println(p.x) // 10
Structs can declare type parameters when fields should stay connected:
struct Pair<A, B> {
first: A
second: B
}
let pair: Pair<int, string> = Pair { first: 1, second: "two" }
println(pair.second) // two
Impl blocks
Add methods to a struct with impl:
struct Point {
x: int
y: int
}
impl Point {
fn distance(self) {
return sqrt(self.x * self.x + self.y * self.y)
}
fn translate(self, dx, dy) {
return Point { x: self.x + dx, y: self.y + dy }
}
}
let p = Point { x: 3, y: 4 }
println(p.distance()) // 5.0
println(p.translate(10, 20)) // Point({x: 13, y: 24})
The first parameter must be self, which receives the struct instance.
Methods are called with dot syntax on values constructed with the struct
constructor.
Interfaces
Interfaces let you define a contract: a set of methods that a type must
have. Harn uses implicit satisfaction, just like Go. A struct satisfies
an interface automatically if its impl block has all the required methods.
You never write implements or impl Interface for Type.
Step 1: Define an interface
An interface lists method signatures without bodies:
interface Displayable {
fn display(self) -> string
}
This says: any type that has a display(self) -> string method counts as
Displayable.
Interfaces can also be generic, and individual interface methods may declare their own type parameters when the contract needs them:
interface Repository<T> {
fn get(id: string) -> T
fn map<U>(value: T, f: fn(T) -> U) -> U
}
Interfaces may also declare associated types when the contract needs to name an implementation-defined type without making the whole interface generic:
interface Collection {
type Item
fn get(self, index: int) -> Item
}
Step 2: Create structs with matching methods
struct Dog {
name: string
breed: string
}
impl Dog {
fn display(self) -> string {
return "${self.name} the ${self.breed}"
}
}
struct Cat {
name: string
indoor: bool
}
impl Cat {
fn display(self) -> string {
let status = if self.indoor { "indoor" } else { "outdoor" }
return "${self.name} (${status} cat)"
}
}
Both Dog and Cat have a display(self) -> string method, so they
both satisfy Displayable. No extra annotation is needed.
Step 3: Use the interface as a type
Now you can write a function that accepts any Displayable:
fn introduce(animal: Displayable) {
println("Meet: ${animal.display()}")
}
let d = Dog({name: "Rex", breed: "Labrador"})
let c = Cat({name: "Whiskers", indoor: true})
introduce(d) // Meet: Rex the Labrador
introduce(c) // Meet: Whiskers (indoor cat)
The type checker verifies at compile time that Dog and Cat satisfy
Displayable. If a struct is missing a required method, you get a
clear error at the call site.
Interfaces with multiple methods
Interfaces can require more than one method:
interface Serializable {
fn serialize(self) -> string
fn byte_size(self) -> int
}
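As a minimal sketch (the User struct and its serialization format are hypothetical), a type satisfies Serializable once its impl block provides both methods:

```harn
struct User {
    name: string
}
impl User {
    fn serialize(self) -> string {
        return "{\"name\": \"${self.name}\"}"
    }
    fn byte_size(self) -> int {
        return len(self.serialize())
    }
}
// User now satisfies Serializable implicitly -- no annotation needed.
```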
A struct must implement all listed methods to satisfy the interface.
guard, require, and assert
These three forms serve different jobs:
- guard condition else { ... } handles expected control flow and narrows types after the guard.
- require condition, "message" enforces runtime invariants in normal code and throws on failure.
- assert, assert_eq, and assert_ne are for test pipelines. The linter warns when you use them in non-test code, and it nudges test pipelines away from require.
guard user != nil else {
return "missing user"
}
require len(user.name) > 0, "user name cannot be empty"
Generic constraints
You can also use interfaces as constraints on generic type parameters:
fn log_item<T>(item: T) where T: Displayable {
println("[LOG] ${item.display()}")
}
The where T: Displayable clause tells the type checker to verify that
whatever concrete type is passed for T satisfies Displayable. If it
does not, a compile-time error is produced. Generic parameters must also bind
consistently across arguments, so fn<T>(a: T, b: T) cannot be called with
mixed concrete types such as (int, string). Container bindings like
list<T> preserve and validate their element type at call sites too.
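Calling log_item with the Dog type from earlier illustrates the constraint check (a sketch):

```harn
let rex = Dog({name: "Rex", breed: "Labrador"})
log_item(rex)        // [LOG] Rex the Labrador
// log_item(42)      // compile-time error: int does not satisfy Displayable
```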
Variance: in T and out T
Type parameters on user-defined generics may be marked in (the
parameter is contravariant — it appears only in input positions) or
out (covariant — only in output positions). Unannotated parameters
default to invariant: Box<int> and Box<float> are unrelated
unless Box declares out T and uses T only covariantly.
type Reader<out T> = fn() -> T // T is produced
interface Sink<in T> { fn accept(v: T) -> int } // T is consumed
Built-in containers carry sensible variance: iter<T> is covariant
(read-only), but list<T> and dict<K, V> are invariant (mutable).
Function types are contravariant in their parameters and covariant in
their return type — fn(float) can stand in for fn(int), but not
the other way around. The full variance table lives in the spec under
“Subtyping and variance”.
Declarations are checked at the definition site: a type Box<out T> = fn(T) -> int is rejected because T appears in a contravariant
position despite the out annotation.
Spread in function calls
The spread operator ... expands a list into individual function
arguments:
fn add(a, b, c) {
return a + b + c
}
let nums = [1, 2, 3]
println(add(...nums)) // 6
You can mix regular arguments and spread arguments:
fn add(a, b, c) {
return a + b + c
}
let rest = [2, 3]
println(add(1, ...rest)) // 6
Spread works in method calls too:
let point = Point({x: 0, y: 0})
let deltas = [10, 20]
let moved = point.translate(...deltas)
Try-expression
The try keyword without a catch block is a try-expression. It
evaluates its body and wraps the outcome in a Result:
let result = try { json_parse(raw_input) }
// Result.Ok(parsed_data) -- if parsing succeeds
// Result.Err("invalid JSON: ...") -- if parsing throws
This is the complement of the ? operator. Use try to enter
Result-land (catching errors into Result.Err), and ? to exit
Result-land (propagating errors upward):
fn safe_divide(a, b) {
return try { a / b }
}
fn compute(x) {
let half = safe_divide(x, 2)? // unwrap Ok or propagate Err
return Ok(half + 10)
}
No catch or finally is needed. If a catch follows try, it is
parsed as the traditional try/catch statement instead.
Ask expression
The ask expression is syntactic sugar for making an LLM call. It takes
a set of key-value fields and returns the LLM response as a string:
let answer = ask {
system: "You are a helpful assistant.",
user: "What is 2 + 2?"
}
println(answer)
Common fields include system (system prompt), user (user message),
model, max_tokens, and provider. The ask expression is equivalent
to building a dict and passing it to llm_call.
Duration literals
let d1 = 500ms // 500 milliseconds
let d2 = 5s // 5 seconds
let d3 = 2m // 2 minutes
let d4 = 1h // 1 hour
Durations can be passed to sleep() and used in deadline blocks.
Math constants
pi and e are global constants (not functions):
println(pi) // 3.141592653589793
println(e) // 2.718281828459045
let area = pi * r * r
Named format placeholders
The format builtin supports both positional {} placeholders and named
{key} placeholders when the second argument is a dict:
// Positional
println(format("Hello, {}!", "world"))
// Named
println(format("Hello {name}, you are {age}.", {name: "Alice", age: 30}))
For simple cases, string interpolation with ${} is usually more
convenient:
let name = "Alice"
println("Hello, ${name}!")
Comments
// Line comment
/** HarnDoc comment for a public API.
Use a `/** ... */` block directly above `pub fn`. */
pub fn greet(name: string) -> string {
return "Hello, ${name}"
}
pub pipeline deploy(task) {
return
}
pub enum Result {
Ok(value: string)
Err(message: string)
}
pub struct Config {
host: string
port?: int
}
/* Block comment
/* Nested block comments are supported */
Still inside the outer comment */
Error handling
Harn provides try/catch/throw for error handling and retry for automatic recovery.
throw
Any value can be thrown as an error:
throw "something went wrong"
throw {code: 404, message: "not found"}
throw 42
try/catch
Catch errors with an optional error binding:
try {
let data = json_parse(raw_input)
} catch (e) {
println("Parse failed: ${e}")
}
The error variable is optional:
fn risky_operation() { throw "boom" }
try {
risky_operation()
} catch {
println("Something failed, moving on")
}
What gets bound to the error variable
- If the error was created with throw: e is the thrown value directly (string, dict, etc.)
- If the error is an internal runtime error: e is the error’s description as a string
return inside try
A return statement inside a try block is not caught. It propagates
out of the enclosing pipeline or function as expected.
fn find_user(id) {
try {
let user = lookup(id)
return user // this returns from find_user, not caught
} catch (e) {
return nil
}
}
Typed catch
Catch specific error types using enum-based error hierarchies:
enum AppError {
NotFound(resource)
Unauthorized(reason)
Internal(message)
}
try {
throw AppError.NotFound("user:123")
} catch (e: AppError) {
match e.variant {
"NotFound" -> { println("Missing: ${e.fields[0]}") }
"Unauthorized" -> { println("Access denied") }
"Internal" -> { println("Internal: ${e.fields[0]}") }
}
}
Errors that don’t match the typed catch propagate up the call stack.
require
The require statement checks a condition and throws an error if it is
false. An optional second argument provides the error message:
require len(items) > 0, "items list must not be empty"
require user != nil, "user is required"
require score >= 0 // throws a generic error if false
require is useful at the top of a function to validate preconditions
before proceeding. If the condition is falsy, execution stops with a
thrown error that can be caught by try/catch or will surface as a
runtime error.
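For example, a failed require can be caught like any other thrown error (the bound error's exact shape is runtime-defined):

```harn
fn set_score(score) {
    require score >= 0, "score must be non-negative"
    return score
}
try {
    set_score(-1)
} catch (e) {
    println("caught: ${e}")
}
```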
guard
The guard statement provides an early-return pattern. If the condition
is false, the else block executes. The else block must exit the
current scope (typically via return or throw):
fn process(input) {
guard input != nil else {
return "no input"
}
guard type_of(input) == "string" else {
throw "expected string, got ${type_of(input)}"
}
// input is guaranteed non-nil and a string here
return input.uppercase()
}
After a guard statement, the type checker narrows the variable’s type
based on the condition. For example, guard x != nil ensures x is
non-nil in subsequent code.
retry
Automatically retry a block up to N times:
retry 3 {
let response = http_post(url, payload)
let parsed = json_parse(response)
parsed
}
- If the body succeeds on any attempt, returns that result immediately
- If all attempts fail, returns nil
- return inside a retry block propagates out (not retried)
Try-expression
The try keyword without a catch block acts as a try-expression. It
evaluates the body and returns a Result:
- On success: Result.Ok(value)
- On error: Result.Err(error)
let result = try { json_parse(raw_input) }
This is useful when you want to capture an error as a value rather than
crashing or needing a full try/catch:
let parsed = try { json_parse(input) }
if is_err(parsed) {
println("Bad input, using defaults")
parsed = Ok({})
}
let data = unwrap(parsed)
Try/catch expression
try { ... } catch (e) { ... } is also usable as an expression — the whole
form evaluates to the try body’s tail value on success, or the catch
handler’s tail value on a caught throw. The lub of the two branch types is
inferred automatically, and an explicit type annotation on the let binds
the result:
let parsed: dict = try { json_parse(input) } catch (e) { default_config() }
Typed catches work identically in expression position; when the thrown
error’s type does not match the catch’s type filter, the throw propagates
past the expression and the let binding is never established:
let user: User = try {
fetch_user(id)
} catch (e: NetworkError) {
cached_user(id)
}
// Any non-`NetworkError` throw surfaces out of this block unchanged.
A finally { ... } tail is optional on either form and runs once for
side-effect only — its value is discarded. The expression’s value still
comes from the try body or the catch handler.
The try-expression pairs naturally with the ? operator. Use try to
enter Result-land and ? to propagate within it:
fn fetch_json(url) {
let body = try { http_get(url) }
let text = unwrap(body)?
let data = try { json_parse(text) }
return data
}
When catch or finally follows try, the form is the handled
expression described above; only the bare try { body } form wraps in
Result.
Runtime shape validation errors
When a function parameter has a structural type annotation (a shape like
{name: string, age: int}), Harn validates the argument at runtime. If
the argument is missing a required field or a field has the wrong type,
a clear error is produced:
fn process(user: {name: string, age: int}) {
println("${user.name} is ${user.age}")
}
process({name: "Alice"})
// Error: parameter 'user': missing field 'age' (int)
process({name: "Alice", age: "old"})
// Error: parameter 'user': field 'age' expected int, got string
Shape validation works with both plain dicts and struct instances. Extra fields beyond those listed in the shape are allowed (width subtyping).
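Reusing process from above, extra fields pass validation:

```harn
process({name: "Bob", age: 45, role: "admin"})
// prints: Bob is 45 -- the extra 'role' field is simply ignored by the shape check
```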
This catches a common class of bugs where a dict is passed with missing or mistyped fields, giving you precise feedback about exactly which field is wrong.
Result type
The built-in Result enum provides an alternative to try/catch for
representing success and failure as values. A Result is either
Ok(value) or Err(error). Statically, Result is generic:
Result<T, E>.
let ok = Ok(42)
let err = Err("something failed")
let typed_ok: Result<int, string> = ok
let typed_err: Result<int, string> = err
println(ok) // Result.Ok(42)
println(err) // Result.Err(something failed)
The shorthand constructors Ok(value) and Err(value) are equivalent to
Result.Ok(value) and Result.Err(value).
Result helper functions
| Function | Description |
|---|---|
is_ok(r) | Returns true if r is Result.Ok |
is_err(r) | Returns true if r is Result.Err |
unwrap(r) | Returns the Ok value, throws if r is Err |
unwrap_or(r, default) | Returns the Ok value, or default if r is Err |
unwrap_err(r) | Returns the Err value, throws if r is Ok |
let r = Ok(42)
println(is_ok(r)) // true
println(is_err(r)) // false
println(unwrap(r)) // 42
println(unwrap_or(Err("x"), "default")) // default
Pattern matching on Result
Result values can be destructured with match:
fn fetch_data(url) {
// ... returns Ok(data) or Err(message)
}
match fetch_data("/api/users") {
Result.Ok(data) -> { println("Got ${len(data)} users") }
Result.Err(err) -> { println("Failed: ${err}") }
}
The ? operator
The postfix ? operator provides concise error propagation. Applied to a
Result value, it unwraps Ok and returns the value, or immediately
returns the Err from the enclosing function.
fn divide(a, b) {
if b == 0 {
return Err("division by zero")
}
return Ok(a / b)
}
fn compute(x) {
let result = divide(x, 2)? // unwraps Ok, or returns Err early
return Ok(result + 10)
}
let r1 = compute(20) // Result.Ok(20)
let r2 = compute(0) // Result.Err(division by zero)
The ? operator has the same precedence as ., [], and (), so it
chains naturally:
fn fetch_and_parse(url) {
let response = http_get(url)?
let data = json_parse(response)?
return Ok(data)
}
Applying ? to a non-Result value produces a runtime type error.
Result vs. try/catch
Use Result and ? when errors are expected outcomes that callers should
handle (validation failures, missing data, parse errors). Use try/catch
for unexpected errors or when you want to recover from failures in-place
without propagating them through return values.
The two patterns can be combined:
fn transform(data) { return data }
fn safe_parse(input) {
try {
let data = json_parse(input)
return Ok(data)
} catch (e) {
return Err("parse error: ${e}")
}
}
fn process(raw) {
let data = safe_parse(raw)? // propagate Err if parse fails
return Ok(transform(data))
}
Stack traces
When a runtime error occurs, Harn displays a stack trace showing the call chain that led to the error. The trace includes file location, source context, and the sequence of function calls.
error: division by zero
--> example.harn:3:14
|
3 | let x = a / b
| ^
= note: called from compute at example.harn:8
= note: called from pipeline at example.harn:12
The error format shows:
- Error message: what went wrong
- Source location: file, line, and column where the error occurred
- Source context: the relevant source line with a caret (^) pointing to the exact position
- Call chain: each function in the call stack, from innermost to outermost, with file and line numbers
Stack traces are captured at the point of the error, before try/catch unwinding, so the full call chain is preserved even when errors are caught at a higher level.
Combining patterns
retry 3 {
try {
let result = llm_call(prompt, system)
let parsed = json_parse(result)
return parsed
} catch (e) {
println("Attempt failed: ${e}")
throw e // re-throw to trigger retry
}
}
Modules and imports
Harn supports splitting code across files using import and top-level fn declarations.
Importing files
import "lib/helpers.harn"
The extension is optional — these are equivalent:
import "lib/helpers.harn"
import "lib/helpers"
Import paths are resolved relative to the current file’s directory.
If main.harn imports "lib/helpers", it looks for lib/helpers.harn
next to main.harn.
Writing a library file
Library files contain top-level fn declarations:
// lib/math.harn
fn double(x) {
return x * 2
}
fn clamp(value, low, high) {
if value < low { return low }
if value > high { return high }
return value
}
When imported, these functions become available in the importing file’s scope.
Using imported functions
import "lib/math"
pipeline default(task) {
println(double(21)) // 42
println(clamp(150, 0, 100)) // 100
}
Importing pipelines
Imported files can also contain pipelines, which are registered globally by name:
// lib/analysis.harn
pipeline analyze(task) {
println("Analyzing: ${task}")
}
// main.harn
import "lib/analysis"
pipeline default(task) {
// the "analyze" pipeline is now registered and available
}
What needs an import
Most Harn builtins — println, log, read_file, write_file, llm_call,
agent_loop, http_get, parallel, workflow_*, transcript_*,
mcp_*, and the rest of the runtime surface — are registered globally and
require no import statement. You can call them directly from top-level
code or inside any pipeline.
import "std/..." is only needed for the Harn-written helper modules
described below (std/text, std/json, std/math, std/collections,
std/path, std/context, std/agent_state, std/agents, std/runtime,
std/project, std/worktree, std/checkpoint). These add layered
utilities on top of the core builtins; the core builtins themselves are
always available.
Standard library modules
Harn includes built-in modules that are compiled into the interpreter.
Import them with the std/ prefix:
import "std/text"
import "std/collections"
import "std/math"
import "std/path"
import "std/json"
import "std/context"
import "std/agent_state"
import "std/agents"
std/text
Text processing utilities for LLM output and code analysis:
| Function | Description |
|---|---|
int_to_string(value) | Convert an integer-compatible value to a decimal string |
float_to_string(value) | Convert a float-compatible value to a string |
parse_int_or(value, fallback) | Parse an integer, returning fallback on failure |
parse_float_or(value, fallback) | Parse a float, returning fallback on failure |
extract_paths(text) | Extract file paths from text, filtering comments and validating extensions |
parse_cells(response) | Parse fenced code blocks from LLM output. Returns [{type, lang, code}] |
filter_test_cells(cells, target_file?) | Filter cells to keep code blocks and write_file calls |
truncate_head_tail(text, n) | Keep first/last n lines with omission marker |
detect_compile_error(output) | Check for compile error patterns (SyntaxError, etc.) |
has_got_want(output) | Check for got/want test failure patterns |
format_test_errors(output) | Extract error-relevant lines (max 20) |
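Like the other modules, std/text is used after an import. A small sketch of the parse helpers:

```harn
import "std/text"
println(parse_int_or("42", 0))       // 42
println(parse_int_or("n/a", -1))     // -1
println(parse_float_or("2.5", 0.0))  // 2.5
```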
std/collections
Collection utilities and store helpers:
| Function | Description |
|---|---|
filter_nil(dict) | Remove entries where value is nil, empty string, or “null” |
store_stale(key, max_age_seconds) | Check if a store key’s timestamp is stale |
store_refresh(key) | Update a store key’s timestamp to now |
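For example (per the table above, filter_nil drops nil and empty-string values):

```harn
import "std/collections"
let cleaned = filter_nil({name: "Alice", email: nil, bio: ""})
println(cleaned)   // {name: "Alice"}
```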
std/math
Extended math utilities:
| Function | Description |
|---|---|
clamp(value, lo, hi) | Clamp a value between min and max |
lerp(a, b, t) | Linear interpolation between a and b by t (0..1) |
map_range(value, in_lo, in_hi, out_lo, out_hi) | Map a value from one range to another |
deg_to_rad(degrees) | Convert degrees to radians |
rad_to_deg(radians) | Convert radians to degrees |
sum(items) | Sum a list of numbers |
avg(items) | Average of a list of numbers (returns 0 for empty lists) |
mean(items) | Arithmetic mean of a list of numbers |
median(items) | Median of a non-empty numeric list |
percentile(items, p) | R-7 percentile interpolation for p in [0, 100] |
argsort(items, score_fn?) | Indices that would sort a list ascending, optionally by score |
top_k(items, k, score_fn?) | Highest-scoring k items, descending |
variance(items, sample?) | Population variance, or sample variance when sample = true |
stddev(items, sample?) | Population standard deviation, or sample mode when sample = true |
minmax_scale(items) | Scale a numeric list into [0, 1], or all zeros for a constant list |
zscore(items, sample?) | Standardize a numeric list, or all zeros for a constant list |
weighted_mean(items, weights) | Weighted arithmetic mean |
weighted_choice(items, weights?) | Randomly choose one item by non-negative weights |
softmax(items, temperature?) | Convert numeric scores into probabilities |
normal_pdf(x, mean?, stddev?) | Normal density with defaults mean = 0, stddev = 1 |
normal_cdf(x, mean?, stddev?) | Normal cumulative distribution with defaults mean = 0, stddev = 1 |
normal_quantile(prob, mean?, stddev?) | Inverse normal CDF for 0 < prob < 1 |
dot(a, b) | Dot product of two equal-length numeric vectors |
vector_norm(v) | Euclidean norm of a numeric vector |
vector_normalize(v) | Unit-length version of a non-zero numeric vector |
cosine_similarity(a, b) | Cosine similarity of two non-zero equal-length vectors |
euclidean_distance(a, b) | Euclidean distance between two equal-length vectors |
manhattan_distance(a, b) | Manhattan distance between two equal-length vectors |
chebyshev_distance(a, b) | Chebyshev distance between two equal-length vectors |
covariance(xs, ys, sample?) | Population or sample covariance between two numeric lists |
correlation(xs, ys, sample?) | Pearson correlation between two numeric lists |
moving_avg(items, window) | Sliding-window moving average |
ema(items, alpha) | Exponential moving average over a numeric list |
kmeans(points, k, options?) | Deterministic k-means over list<list<number>>, returns {centroids, assignments, counts, iterations, converged, inertia} |
import "std/math"
println(clamp(150, 0, 100)) // 100
println(lerp(0, 10, 0.5)) // 5
println(map_range(50, 0, 100, 0, 1)) // 0.5
println(sum([1, 2, 3, 4])) // 10
println(avg([10, 20, 30])) // 20
println(percentile([1, 2, 3, 4], 75)) // 3.25
println(top_k(["a", "bbbb", "cc"], 2, { x -> len(x) })) // ["bbbb", "cc"]
println(softmax([1, 2, 3])) // probabilities summing to 1
println(cosine_similarity([1, 0], [1, 1])) // ~0.707
println(moving_avg([1, 2, 3, 4, 5], 3)) // [2.0, 3.0, 4.0]
let grouped = kmeans([[0, 0], [0, 1], [10, 10], [10, 11]], 2)
println(grouped.centroids) // [[0.0, 0.5], [10.0, 10.5]]
std/path
Path manipulation utilities:
| Function | Description |
|---|---|
ext(path) | Get the file extension without the dot |
stem(path) | Get the filename without extension |
normalize(path) | Normalize path separators (backslash to forward slash) |
is_absolute(path) | Check if a path is absolute |
workspace_info(path, workspace_root?) | Classify a path at the workspace boundary |
workspace_normalize(path, workspace_root?) | Normalize a path into workspace-relative form when safe |
list_files(dir) | List files in a directory (one level) |
list_dirs(dir) | List subdirectories in a directory |
import "std/path"
println(ext("main.harn")) // "harn"
println(stem("/src/main.harn")) // "main"
println(is_absolute("/usr/bin")) // true
println(workspace_normalize("/packages/app/SKILL.md", cwd())) // "packages/app/SKILL.md"
let files = list_files("src")
let dirs = list_dirs(".")
std/json
JSON utility patterns:
| Function | Description |
|---|---|
pretty(value) | Pretty-print a value as indented JSON |
safe_parse(text) | Safely parse JSON, returning nil on failure instead of throwing |
merge(a, b) | Shallow-merge two dicts (keys in b override keys in a) |
pick(data, keys) | Pick specific keys from a dict |
omit(data, keys) | Omit specific keys from a dict |
import "std/json"
let data = safe_parse("{\"x\": 1}") // {x: 1}, or nil on bad input
let merged = merge({a: 1}, {b: 2}) // {a: 1, b: 2}
let subset = pick({a: 1, b: 2, c: 3}, ["a", "c"]) // {a: 1, c: 3}
let rest = omit({a: 1, b: 2, c: 3}, ["b"]) // {a: 1, c: 3}
std/context
Structured prompt/context assembly helpers:
| Function | Description |
|---|---|
section(name, content, options?) | Create a named context section |
context_attach(name, path, content, options?) | Attach file/path-oriented context |
context(sections, options?) | Build a context object |
context_render(ctx, options?) | Render a context into prompt text |
prompt_compose(task, ctx, options?) | Compose {prompt, system, rendered_context} |
std/agent_state
Durable session-scoped state helpers built on the VM-side durable-state backend:
| Function | Description |
|---|---|
agent_state_init(root, options?) | Create or reopen a session-scoped durable state handle |
agent_state_resume(root, session_id, options?) | Reopen an existing durable state session |
agent_state_write(handle, key, content) | Atomically persist text content under a relative key |
agent_state_read(handle, key) | Read a key, returning nil when it is absent |
agent_state_list(handle) | Recursively list keys in deterministic order |
agent_state_delete(handle, key) | Delete a key |
agent_state_handoff(handle, summary) | Write a structured JSON handoff envelope to the reserved handoff key |
agent_state_handoff_key() | Return the reserved handoff key name |
See Agent state for the handle format, conflict policies, and backend details.
std/runtime
Generic host/runtime helpers that are useful across many hosts:
| Function | Description |
|---|---|
runtime_task() | Return the current runtime task string |
runtime_pipeline_input() | Return structured pipeline input from the host |
runtime_dry_run() | Return whether the current run is dry-run only |
runtime_approved_plan() | Return the host-approved plan text when available |
process_exec(command) | Execute a process through the typed host contract |
process_exec_with_timeout(command, timeout_ms) | Execute a process with an explicit timeout |
interaction_ask(question) | Ask the host/user a question through the typed interaction contract |
interaction_ask_with_kind(question, kind) | Ask the host/user a question with an explicit interaction kind |
record_run_metadata(run, workflow_name) | Persist normalized workflow run metadata through the runtime contract |
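A hedged sketch of how these host helpers combine (return shapes depend on the host bridge):

```harn
import "std/runtime"
if runtime_dry_run() {
    println("dry run for task: ${runtime_task()}")
} else {
    // Runs through the typed host contract; the host decides approval.
    let result = process_exec("git status")
    println(result)
}
```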
std/project
Project metadata helpers plus deterministic project evidence scanning:
| Function | Description |
|---|---|
metadata_namespace(dir, namespace) | Read resolved metadata for a namespace, defaulting to {} |
metadata_local_namespace(dir, namespace) | Read only the namespace data stored directly on a directory |
project_inventory(namespace?) | Return {entries, status} for metadata-backed project state |
project_root_package() | Infer the repository’s root package/module name from common manifests |
project_scan(path, options?) | Scan a directory for deterministic L0/L1 evidence |
project_enrich(path, options) | Run caller-owned L2 enrichment over bounded project context with schema validation and caching |
project_scan_tree(path, options?) | Walk subdirectories and return a {rel_path: evidence} map |
project_deep_scan(path, options?) | Build or refresh a cached per-directory evidence tree backed by metadata namespaces |
project_deep_scan_status(namespace, path?) | Return the last deep-scan status for a namespace/scope |
project_catalog() | Return the built-in anchor/lockfile catalog used by project_scan(...) |
project_scan_paths(path, options?) | Return only the keys from project_scan_tree(...) |
project_stale(namespace?) | Return the stale summary from metadata_status(...) |
project_stale_dirs(namespace?) | Return the tier1+tier2 stale directory list |
project_requires_refresh(namespace?) | Return true when stale or missing hashes require refresh |
Host-specific editor, git, diagnostics, learning, and filesystem/edit helpers
should live in host-side .harn libraries built on capability-aware
host_call(...), not in Harn’s shared stdlib.
std/agents
Workflow helpers built on transcripts and agent_loop:
| Function | Description |
|---|---|
workflow(config) | Create a workflow config |
action_graph(raw, options?) | Normalize planner output into a canonical action-graph envelope |
action_graph_batches(graph, completed?) | Compute dependency-ready action batches grouped by phase and tool class |
action_graph_render(graph) | Render a human-readable markdown summary of an action graph |
action_graph_flow(graph, config?) | Convert an action graph into a typed workflow graph |
action_graph_run(task, graph, config?, overrides?) | Execute an action graph through the shared workflow runtime |
task_run(task, flow, overrides?) | Run an act/verify/repair workflow |
workflow_result_text(result) | Extract a visible text result from an LLM call, workflow wrapper, or ad hoc payload |
workflow_result_run(task, workflow_name, result, artifacts?, options?) | Normalize an ad hoc result into a reusable run record |
workflow_result_persist(task, workflow_name, result, artifacts?, options?) | Persist an ad hoc result as a run record without going through workflow_execute |
workflow_session(prev) | Normalize a task result or transcript into a reusable session object |
workflow_session_new(metadata?) | Create a new empty workflow session |
workflow_session_restore(run_or_path) | Restore a session from a run record or persisted run path |
workflow_session_fork(prev) | Fork a session transcript and mark it forked |
workflow_session_archive(prev) | Archive a session transcript |
workflow_session_resume(prev) | Resume an archived session transcript |
workflow_session_compact(prev, options?) | Summarize/compact a session transcript in place |
workflow_session_reset(prev, carry_summary) | Reset a session transcript, optionally carrying summary |
workflow_session_persist(prev, path?) | Persist the session run record and attach the saved path |
workflow_continue(prev, task, flow, overrides?) | Continue from an existing transcript |
workflow_compact(prev, options?) | Summarize and compact a transcript |
workflow_reset(prev, carry_summary) | Reset or summarize-then-reset a workflow transcript |
worker_request(worker) | Return a worker handle’s immutable original request payload |
worker_result(worker) | Return a worker handle/result payload or worker-result artifact payload |
worker_provenance(worker) | Return normalized worker provenance fields |
worker_research_questions(worker) | Return the worker’s canonical research_questions list |
worker_action_items(worker) | Return the worker’s canonical action_items list |
worker_workflow_stages(worker) | Return the worker’s canonical workflow_stages list |
worker_verification_steps(worker) | Return the worker’s canonical verification_steps list |
workflow_session(...) returns a normalized session dict that includes the
current transcript, message count, summary, persisted run metadata, and a
usage object when the source run captured LLM totals:
{input_tokens, output_tokens, total_duration_ms, call_count}.
For background or delegated execution, use the worker lifecycle builtins
(spawn_agent, send_input, resume_agent, wait_agent, close_agent, list_agents)
directly from the runtime, or the worker_* helpers above when you need the
normalized request/provenance views.
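A minimal sketch of the session helpers above, assuming `task` and `flow` are defined elsewhere; the `max_messages` option and the `message_count` field name are illustrative assumptions, not confirmed API:

```harn
// Run a task, normalize the result into a session, compact, and persist.
let first = task_run(task, flow)
let session = workflow_session(first)             // normalized session dict
println(session.message_count)                    // transcript size (field name assumed)
let compacted = workflow_session_compact(session, {max_messages: 20})  // options assumed
workflow_session_persist(compacted)               // saves run record, attaches path
```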
std/worktree
Helpers for isolated git worktree execution built on exec_at(...) and
shell_at(...):
| Function | Description |
|---|---|
worktree_default_path(repo, name) | Return the default .harn/worktrees/<name> path |
worktree_create(repo, name, base_ref, path?) | Create or reset a worktree branch at a target path |
worktree_remove(repo, path, force) | Remove a worktree from the parent repo |
worktree_status(path) | Run git status --short --branch in the worktree |
worktree_diff(path, base_ref?) | Render diff output for the worktree |
worktree_shell(path, script) | Run an arbitrary shell command inside the worktree |
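A sketch of a typical lifecycle built from the table above — create an isolated worktree, run a command in it, and guarantee cleanup with `defer` (the branch name and test command are illustrative):

```harn
let repo = "."
let path = worktree_default_path(repo, "fix-123")
worktree_create(repo, "fix-123", "main", path)
defer { worktree_remove(repo, path, true) }   // force-remove on scope exit

println(worktree_status(path))                // git status --short --branch
println(worktree_shell(path, "cargo test"))   // arbitrary shell command in the worktree
```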
Selective imports
Import specific functions from any module:
import { extract_paths, parse_cells } from "std/text"
Import behavior
Import paths resolve in this order:
1. std/<module> from the embedded stdlib
2. Relative to the importing file, with an implicit .harn extension
3. Installed packages under the nearest ancestor .harn/packages/
4. Package manifest [exports] aliases
5. Package directories with lib.harn
Packages can publish stable module entry points in harn.toml:
[exports]
capabilities = "runtime/capabilities.harn"
providers = "runtime/providers.harn"
With that manifest, import "acme/capabilities" resolves to the
declared file inside .harn/packages/acme/, and nested package modules
can import sibling packages through the workspace-level .harn/packages
root instead of relying on brittle relative paths.
- The imported file is parsed and executed
- Pipelines in the imported file are registered by name
- Non-pipeline top-level statements (fn declarations, let bindings) are executed, making their values available
- Circular imports are detected and skipped (each file is imported at most once)
- The working directory is temporarily changed to the imported file’s directory, so nested imports resolve correctly
- Source-relative builtins like render(...) inside imported functions resolve paths relative to the imported module’s directory, not the entry pipeline
Static cross-module checking
harn check, harn run, harn bench, and the Harn LSP all build a
module graph from the entry file that follows import statements
transitively, so they share one consistent view of what names are
visible in each module.
When every import in a file resolves, the typechecker treats a call to an unknown name as an error (not a lint warning):
error: call target `helpr` is not defined or imported
Resolution is conservative: if any import in the file fails to resolve (missing file, parse error, nonexistent package), the stricter cross-module check is turned off for that file and only the normal builtin/local-declaration check applies. That way one broken import does not produce a flood of follow-on undefined-name errors.
Go-to-definition in the LSP uses the same graph, so navigation works across any chain of imports — not just direct ones.
Import collision detection
If two wildcard imports export a function with the same name, Harn will
report an error at both runtime and during harn check preflight:
Import collision: 'helper' is already defined when importing lib/b.harn.
Use selective imports to disambiguate: import { helper } from "..."
To resolve collisions, use selective imports to import only the names you need from each module:
import { parse_output } from "lib/a"
import { format_result } from "lib/b"
Pipeline inheritance
Pipelines can extend other pipelines:
pipeline base(task) {
println("Step 1: setup")
println("Step 2: execute")
println("Step 3: cleanup")
}
pipeline custom(task) extends base {
override setup() {
println("Custom setup")
}
}
If the child pipeline has override declarations, the parent’s body runs
with the overrides applied. If the child has no overrides, the child’s body
replaces the parent’s entirely.
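Conversely, a child with no override declarations discards the parent's steps entirely:

```harn
pipeline replace_all(task) extends base {
    // No `override` blocks here, so base's three steps never run.
    println("only this line runs")
}
```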
Organizing a project
A typical project structure:
my-project/
main.harn
lib/
context.harn # shared context-gathering functions
agent.harn # shared agent utility functions
helpers.harn # general-purpose utilities
// main.harn
import "lib/context"
import "lib/agent"
import "lib/helpers"
pipeline default(task, project) {
let ctx = gather_context(task, project)
let result = run_agent(ctx)
finalize(result)
}
Concurrency
Harn has built-in concurrency primitives that don’t require callbacks, promises, or async/await boilerplate.
spawn and await
Launch background tasks and collect results:
let handle = spawn {
sleep(1s)
"done"
}
let result = await(handle) // blocks until complete
println(result) // "done"
Cancel a task before it finishes:
let handle = spawn { sleep(10s) }
cancel(handle)
Each spawned task runs in an isolated interpreter instance.
parallel
Run N tasks concurrently and collect results in order:
let results = parallel(5) { i ->
i * 10
}
// [0, 10, 20, 30, 40]
The variable i is the zero-based task index. Results are always returned
in index order regardless of completion order.
parallel each
Map over a collection concurrently:
let files = ["a.txt", "b.txt", "c.txt"]
let contents = parallel each files { file ->
read_file(file)
}
Results preserve the original list order.
parallel settle
Like parallel each, but never throws. Instead, it collects both
successes and failures into a result object:
let items = [1, 2, 3]
let outcome = parallel settle items { item ->
if item == 2 {
throw "boom"
}
item * 10
}
println(outcome.succeeded) // 2
println(outcome.failed) // 1
for r in outcome.results {
if is_ok(r) {
println(unwrap(r))
} else {
println(unwrap_err(r))
}
}
The return value is a dict with:
| Field | Type | Description |
|---|---|---|
results | list | List of Result values (one per item), in order |
succeeded | int | Number of Ok results |
failed | int | Number of Err results |
This is useful when you want to process all items and handle failures after the fact, rather than aborting on the first error.
retry
Automatically retry a block that might fail:
retry 3 {
http_get("https://flaky-api.example.com/data")
}
Executes the body up to N times. If the body succeeds, returns immediately.
If all attempts fail, returns nil. Note that return statements inside
retry propagate out (they are not retried).
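Because a fully failed retry evaluates to nil, and retry is valid in expression position, it pairs naturally with the ?? operator for fallbacks (a small sketch; the URL is the same illustrative endpoint as above):

```harn
let data = retry 3 {
    http_get("https://flaky-api.example.com/data")
} ?? "{}"   // fall back to an empty payload if all three attempts fail
```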
Channels
Message-passing between concurrent tasks:
let ch = channel("events")
send(ch, {event: "start", timestamp: timestamp()})
let msg = receive(ch)
Channel iteration
You can iterate over a channel with a for loop. The loop receives
messages one at a time and exits when the channel is closed and fully
drained:
let ch = channel("stream")
spawn {
send(ch, "chunk 1")
send(ch, "chunk 2")
close_channel(ch)
}
for chunk in ch {
println(chunk)
}
// prints "chunk 1" then "chunk 2", then the loop ends
This is especially useful with llm_stream, which returns a channel
of response chunks:
let stream = llm_stream("Tell me a story", "You are a storyteller")
for chunk in stream {
print(chunk)
}
Use try_receive(ch) for non-blocking reads – it returns nil
immediately if no message is available. Use close_channel(ch) to
signal that no more messages will be sent.
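A sketch of a non-blocking polling loop using try_receive, for cases where the consumer has other work to interleave (the sleep intervals are illustrative):

```harn
let ch = channel("work")
spawn {
    sleep(100ms)
    send(ch, "ready")
    close_channel(ch)
}
while true {
    let msg = try_receive(ch)   // nil immediately if nothing is queued
    if msg != nil {
        println(msg)
        break
    }
    sleep(10ms)                 // nothing yet; back off and poll again
}
```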
Atomics
Thread-safe counters:
let counter = atomic(0)
println(atomic_get(counter)) // 0
let c2 = atomic_add(counter, 5)
println(atomic_get(c2)) // 5
let c3 = atomic_set(c2, 100)
println(atomic_get(c3)) // 100
Atomic operations return new atomic values (they don’t mutate in place).
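Since the operations return new atomic values rather than mutating, the idiomatic pattern is to rebind with var (a sketch of that rebinding style):

```harn
var counter = atomic(0)
for item in [1, 2, 3] {
    counter = atomic_add(counter, 1)   // rebind to the returned atomic
}
println(atomic_get(counter))           // 3
```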
Mutex
Mutual exclusion for critical sections:
var count = 0
mutex {
    // only one task executes this block at a time
    count = count + 1
}
Deadline
Set a timeout on a block of work:
deadline 30s {
// must complete within 30 seconds
agent_loop(task, system, {persistent: true})
}
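A deadline overrun aborts the block; the sketch below assumes the overrun surfaces as a catchable error so a fallback can be supplied — the error shape and try-as-expression usage here are assumptions, not confirmed behavior:

```harn
let summary = try {
    deadline 10s {
        llm_call("Summarize this log", "You are terse.")
    }
} catch (e) {
    "timed out: ${e}"   // assumption: deadline overruns throw a catchable error
}
```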
Defer
Register cleanup code that runs when the enclosing scope exits, whether by normal return or by a thrown error:
fn open(path) { return path }
fn close(f) { log("closed ${f}") }
let f = open("data.txt")
defer { close(f) }
// ... use f ...
// close(f) runs automatically on scope exit
Multiple defer blocks execute in LIFO (last-registered, first-executed)
order, similar to Go’s defer.
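The LIFO ordering can be seen directly with two defer blocks:

```harn
defer { println("registered first, runs last") }
defer { println("registered second, runs first") }
println("body")
// output:
// body
// registered second, runs first
// registered first, runs last
```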
Capping in-flight work with max_concurrent
parallel each, parallel settle, and parallel N all accept an
optional with { max_concurrent: N } clause that caps how many
workers are in flight at once. Tasks past the cap wait until a slot
frees up — fan-out stays bounded while the total work is unchanged.
// Without a cap: all 200 requests hit the server at once.
let results = parallel settle paths { p -> llm_call(p, nil, opts) }
// With max_concurrent=8: at most 8 in-flight calls at any moment.
let results = parallel settle paths with { max_concurrent: 8 } { p ->
llm_call(p, nil, opts)
}
max_concurrent: 0 (or a missing with clause) means unlimited.
Negative values are treated as unlimited. The cap applies to every
parallel mode, including the count form:
fn process(i) { log(i) }
parallel 100 with { max_concurrent: 4 } { i ->
process(i)
}
Rate limiting LLM providers
max_concurrent bounds simultaneous in-flight tasks on the caller’s
side. A provider can additionally be rate-limited at the throughput
layer (requests per minute). The RPM limiter is a sliding-window
budget enforced before each llm_call / llm_completion — requests
past the budget wait for the window to free up rather than error.
Configure RPM per provider via:
- rpm: 600 in the provider’s entry in providers.toml / harn.toml.
- HARN_RATE_LIMIT_<PROVIDER>=600 environment variable (e.g. HARN_RATE_LIMIT_TOGETHER=600, HARN_RATE_LIMIT_LOCAL=60). Env overrides config.
- llm_rate_limit("provider", 600) at runtime from a pipeline.
The two controls compose: max_concurrent prevents bursts from
saturating the server; RPM shapes sustained throughput. When batching
hundreds of LLM calls against a local single-GPU server, both are
worth setting — otherwise the RPM budget can be spent in a 2-second
burst that overwhelms the queue and drops requests.
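A sketch combining both controls for a local provider (the provider name, the `{provider: ...}` option key, and the `prompts` list are illustrative assumptions):

```harn
// Shape sustained throughput at 60 requests/minute...
llm_rate_limit("local", 60)

// ...and cap bursts at 4 in-flight calls while fanning out the batch.
let outcome = parallel settle prompts with { max_concurrent: 4 } { p ->
    llm_call(p, nil, {provider: "local"})
}
println("${outcome.succeeded} ok, ${outcome.failed} failed")
```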
Harn language specification
Version: 1.0 (derived from implementation, 2026-04-01)
Harn is a pipeline-oriented programming language for orchestrating AI agents. It is implemented as a Rust workspace with a lexer, parser, type checker, tree-walking VM, tree-sitter grammar, and CLI/runtime tooling. Programs consist of named pipelines containing imperative statements, expressions, and calls to registered builtins that perform I/O, LLM calls, and tool execution.
This file is the canonical language specification. The hosted docs page
docs/src/language-spec.md is generated from it by
scripts/sync_language_spec.sh.
Lexical rules
Whitespace
Spaces (' '), tabs ('\t'), and carriage returns ('\r') are insignificant and skipped
between tokens. Newlines ('\n') are significant tokens used as statement separators.
The parser skips newlines between statements but they are preserved in the token stream.
Backslash line continuation
A backslash (\) immediately before a newline joins the current line with the next.
Both the backslash and the newline are removed from the token stream, so the two
physical lines are treated as a single logical line by the lexer.
let total = 1 + 2 \
+ 3 + 4
// equivalent to: let total = 1 + 2 + 3 + 4
This is useful for breaking long expressions that do not involve a binary operator eligible for multiline continuation (see “Multiline expressions”).
Comments
// Line comment: everything until the next newline is ignored.
/* Block comment: can span multiple lines.
/* Nesting is supported. */
Still inside the outer comment. */
Block comments track nesting depth, so /* /* */ */ is valid. An unterminated block comment produces a lexer error.
Keywords
The following identifiers are reserved:
| Keyword | Token |
|---|---|
pipeline | .pipeline |
extends | .extends |
override | .overrideKw |
let | .letKw |
var | .varKw |
if | .ifKw |
else | .elseKw |
for | .forKw |
in | .inKw |
match | .matchKw |
retry | .retry |
parallel | .parallel |
defer | .defer |
return | .returnKw |
import | .importKw |
true | .trueKw |
false | .falseKw |
nil | .nilKw |
try | .tryKw |
catch | .catchKw |
throw | .throwKw |
finally | .finally |
fn | .fnKw |
spawn | .spawnKw |
while | .whileKw |
type | .typeKw |
enum | .enum |
struct | .struct |
interface | .interface |
pub | .pub |
from | .from |
to | .to |
tool | .tool |
exclusive | .exclusive |
guard | .guard |
require | .require |
each | .each |
settle | .settle |
deadline | .deadline |
yield | .yield |
mutex | .mutex |
break | .break |
continue | .continue |
select | .select |
impl | .impl |
Identifiers
An identifier starts with a letter or underscore, followed by zero or more letters, digits, or underscores:
identifier ::= [a-zA-Z_][a-zA-Z0-9_]*
Number literals
int_literal ::= digit+
float_literal ::= digit+ '.' digit+
A number followed by . where the next character is not a digit is lexed as an integer
followed by the . operator (enabling 42.method).
Duration literals
A duration literal is an integer followed immediately (no whitespace) by a time-unit suffix:
duration_literal ::= digit+ ('ms' | 's' | 'm' | 'h' | 'd' | 'w')
| Suffix | Unit | Equivalent |
|---|---|---|
ms | milliseconds | – |
s | seconds | 1000 ms |
m | minutes | 60 s |
h | hours | 60 m |
d | days | 24 h |
w | weeks | 7 d |
Duration literals evaluate to an integer number of milliseconds. They can be used anywhere an expression is expected:
sleep(500ms)
deadline 30s { /* ... */ }
let one_day = 1d // 86400000
let two_weeks = 2w // 1209600000
String literals
Single-line strings
string_literal ::= '"' (char | escape | interpolation)* '"'
escape ::= '\' ('n' | 't' | '\\' | '"' | '$')
interpolation ::= '${' expression '}'
A string cannot span multiple lines. An unescaped newline inside a string is a lexer error.
If the string contains at least one ${...} interpolation, it produces an
interpolatedString token containing a list of segments (literal text and expression
source strings). Otherwise it produces a plain stringLiteral token.
Escape sequences: \n (newline), \t (tab), \\ (backslash), \" (double quote),
\$ (dollar sign). Any other character after \ produces a literal backslash
followed by that character.
Raw string literals
raw_string_literal ::= 'r"' char* '"'
Raw strings use the r"..." prefix. No escape processing or interpolation is
performed inside a raw string – backslashes, dollar signs, and other characters
are taken literally. Raw strings cannot span multiple lines.
Raw strings are useful for regex patterns and file paths where backslashes are common:
let pattern = r"\d+\.\d+"
let path = r"C:\Users\alice\docs"
Multi-line strings
multi_line_string ::= '"""' newline? content '"""'
Triple-quoted strings can span multiple lines. The optional newline immediately after the
opening """ is consumed. Common leading whitespace is stripped from all non-empty lines.
A trailing newline before the closing """ is removed.
Multi-line strings support ${expression} interpolation with automatic indent
stripping. If at least one ${...} interpolation is present, the result is an
interpolatedString token; otherwise it is a plain stringLiteral token.
let name = "world"
let doc = """
Hello, ${name}!
Today is ${timestamp()}.
"""
Operators
Two-character operators (checked first)
| Operator | Token | Description |
|---|---|---|
== | .eq | Equality |
!= | .neq | Inequality |
&& | .and | Logical AND |
|| | .or | Logical OR |
|> | .pipe | Pipe |
?? | .nilCoal | Nil coalescing |
** | .pow | Exponentiation |
?. | .questionDot | Optional property/method chaining |
-> | .arrow | Arrow |
<= | .lte | Less than or equal |
>= | .gte | Greater than or equal |
+= | .plusAssign | Compound assignment |
-= | .minusAssign | Compound assignment |
*= | .starAssign | Compound assignment |
/= | .slashAssign | Compound assignment |
%= | .percentAssign | Compound assignment |
Single-character operators
| Operator | Token | Description |
|---|---|---|
= | .assign | Assignment |
! | .not | Logical NOT |
. | .dot | Member access |
+ | .plus | Addition / concatenation |
- | .minus | Subtraction / negation |
* | .star | Multiplication / string repetition |
/ | .slash | Division |
< | .lt | Less than |
> | .gt | Greater than |
% | .percent | Modulo |
? | .question | Ternary / Result propagation |
| | .bar | Union types |
Keyword operators
| Operator | Description |
|---|---|
in | Membership test (lists, dicts, strings, sets) |
not in | Negated membership test |
Delimiters
| Delimiter | Token |
|---|---|
{ | .lBrace |
} | .rBrace |
( | .lParen |
) | .rParen |
[ | .lBracket |
] | .rBracket |
, | .comma |
: | .colon |
; | .semicolon |
@ | .at (attribute prefix) |
Special tokens
| Token | Description |
|---|---|
.newline | Line break character |
.eof | End of input |
Grammar
The grammar is expressed in EBNF. Newlines between statements are implicit separators
(the parser skips them with skipNewlines()). The consume() helper also skips newlines
before checking the expected token.
Top-level
program ::= (top_level | NEWLINE)*
top_level ::= import_decl
| attributed_decl
| pipeline_decl
| statement
attributed_decl ::= attribute+ (pipeline_decl | fn_decl | tool_decl
| struct_decl | enum_decl | type_decl
| interface_decl | impl_block)
attribute ::= '@' IDENTIFIER ['(' attr_arg (',' attr_arg)* [','] ')']
attr_arg ::= [IDENTIFIER ':'] attr_value
attr_value ::= STRING_LITERAL | RAW_STRING | INT_LITERAL
| FLOAT_LITERAL | 'true' | 'false' | 'nil'
| IDENTIFIER | '-' INT_LITERAL | '-' FLOAT_LITERAL
import_decl ::= 'import' STRING_LITERAL
| 'import' '{' IDENTIFIER (',' IDENTIFIER)* '}'
'from' STRING_LITERAL
pipeline_decl ::= ['pub'] 'pipeline' IDENTIFIER '(' param_list ')'
['->' type_expr]
['extends' IDENTIFIER] '{' block '}'
param_list ::= (IDENTIFIER (',' IDENTIFIER)*)?
block ::= statement*
fn_decl ::= ['pub'] 'fn' IDENTIFIER [generic_params]
'(' fn_param_list ')' ['->' type_expr]
[where_clause] '{' block '}'
type_decl ::= 'type' IDENTIFIER '=' type_expr
enum_decl ::= ['pub'] 'enum' IDENTIFIER [generic_params] '{'
(enum_variant | ',' | NEWLINE)* '}'
enum_variant ::= IDENTIFIER ['(' fn_param_list ')']
struct_decl ::= ['pub'] 'struct' IDENTIFIER [generic_params]
'{' struct_field* '}'
struct_field ::= IDENTIFIER ['?'] ':' type_expr
impl_block ::= 'impl' IDENTIFIER '{' (fn_decl | NEWLINE)* '}'
interface_decl ::= 'interface' IDENTIFIER [generic_params] '{'
(interface_assoc_type | interface_method)* '}'
interface_assoc_type ::= 'type' IDENTIFIER ['=' type_expr]
interface_method ::= 'fn' IDENTIFIER [generic_params]
'(' fn_param_list ')' ['->' type_expr]
Standard library modules
Imports starting with std/ load embedded stdlib modules:
- import "std/text" — text processing (extract_paths, parse_cells, filter_test_cells, truncate_head_tail, detect_compile_error, has_got_want, format_test_errors, int_to_string, float_to_string, parse_int_or, parse_float_or)
- import "std/collections" — collection utilities (filter_nil, store_stale, store_refresh)
- import "std/agent_state" — durable session-scoped state helpers (agent_state_init, agent_state_resume, agent_state_write, agent_state_read, agent_state_list, agent_state_delete, agent_state_handoff)
These modules are compiled into the interpreter binary and require no filesystem access.
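For example, a selective import from std/text (the parse_int_or signature — value plus fallback — is inferred from the name and not confirmed by the source):

```harn
import { parse_int_or } from "std/text"

let n = parse_int_or("42", 0)     // parses successfully
let m = parse_int_or("oops", 0)   // assumed to fall back to the default
```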
Statements
statement ::= let_binding
| var_binding
| if_else
| for_in
| match_expr
| while_loop
| retry_block
| parallel_block
| parallel_each
| parallel_settle
| defer_block
| return_stmt
| throw_stmt
| override_decl
| try_catch
| fn_decl
| enum_decl
| struct_decl
| impl_block
| interface_decl
| type_decl
| guard_stmt
| require_stmt
| deadline_block
| mutex_block
| select_expr
| break_stmt
| continue_stmt
| expression_statement
let_binding ::= 'let' binding_pattern [':' type_expr] '=' expression
var_binding ::= 'var' binding_pattern [':' type_expr] '=' expression
if_else ::= 'if' expression '{' block '}'
['else' (if_else | '{' block '}')]
for_in ::= 'for' binding_pattern 'in' expression '{' block '}'
match_expr ::= 'match' expression '{' match_arm* '}'
match_arm ::= expression ['if' expression] '->' '{' block '}'
while_loop ::= 'while' expression '{' block '}'
retry_block ::= 'retry' ['(' expression ')'] expression? '{' block '}'
parallel_block ::= 'parallel' '(' expression ')' '{' [IDENTIFIER '->'] block '}'
parallel_each ::= 'parallel' 'each' expression '{' IDENTIFIER '->' block '}'
parallel_settle ::= 'parallel' 'settle' expression '{' IDENTIFIER '->' block '}'
defer_block ::= 'defer' '{' block '}'
return_stmt ::= 'return' [expression]
throw_stmt ::= 'throw' expression
override_decl ::= 'override' IDENTIFIER '(' param_list ')' '{' block '}'
try_catch ::= 'try' '{' block '}'
['catch' [('(' IDENTIFIER [':' type_expr] ')') | IDENTIFIER]
'{' block '}']
['finally' '{' block '}']
try_star_expr ::= 'try' '*' unary_expr
guard_stmt ::= 'guard' expression 'else' '{' block '}'
require_stmt ::= 'require' expression [',' expression]
deadline_block ::= 'deadline' primary '{' block '}'
mutex_block ::= 'mutex' '{' block '}'
select_expr ::= 'select' '{'
(IDENTIFIER 'from' expression '{' block '}'
| 'timeout' expression '{' block '}'
| 'default' '{' block '}')+
'}'
break_stmt ::= 'break'
continue_stmt ::= 'continue'
generic_params ::= '<' IDENTIFIER (',' IDENTIFIER)* '>'
where_clause ::= 'where' IDENTIFIER ':' IDENTIFIER
(',' IDENTIFIER ':' IDENTIFIER)*
fn_param_list ::= (fn_param (',' fn_param)*)? [',' rest_param]
| rest_param
fn_param ::= IDENTIFIER [':' type_expr] ['=' expression]
rest_param ::= '...' IDENTIFIER
A rest parameter (`...name`) must be the last parameter in the list. At call
time, any arguments beyond the positional parameters are collected into a list
and bound to the rest parameter name. If no extra arguments are provided, the
rest parameter is an empty list.
fn sum(...nums) {
var total = 0
for n in nums {
total = total + n
}
return total
}
sum(1, 2, 3) // 6
fn log(level, ...parts) {
println("[${level}] ${join(parts, " ")}")
}
log("INFO", "server", "started") // [INFO] server started
expression_statement ::= expression
| assignable '=' expression
| assignable ('+=' | '-=' | '*=' | '/=' | '%=') expression
assignable ::= IDENTIFIER
| postfix_property
| postfix_subscript
binding_pattern ::= IDENTIFIER
| '{' dict_pattern_fields '}'
| '[' list_pattern_elements ']'
dict_pattern_fields ::= dict_pattern_field (',' dict_pattern_field)*
dict_pattern_field ::= '...' IDENTIFIER
| IDENTIFIER [':' IDENTIFIER]
list_pattern_elements ::= list_pattern_element (',' list_pattern_element)*
list_pattern_element ::= '...' IDENTIFIER
| IDENTIFIER
The expression_statement rule handles both bare expressions (function calls, method calls)
and assignments. An assignment is recognized when the left-hand side is an assignable
target — an identifier, property access, or subscript — followed by = or a compound
assignment operator.
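Since property and subscript positions are assignable, compound assignment works on them as well (a small sketch; whether a let-bound dict would also permit interior mutation is not specified here, so var is used):

```harn
var counts = {a: 0}
counts["a"] += 1   // subscript target
counts.a += 2      // property target
println(counts.a)  // 3
```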
Expressions (by precedence, lowest to highest)
expression ::= pipe_expr
pipe_expr ::= range_expr ('|>' range_expr)*
range_expr ::= ternary_expr ['to' ternary_expr ['exclusive']]
ternary_expr ::= logical_or ['?' logical_or ':' logical_or]
logical_or ::= logical_and ('||' logical_and)*
logical_and ::= equality ('&&' equality)*
equality ::= comparison (('==' | '!=') comparison)*
comparison ::= additive
(('<' | '>' | '<=' | '>=' | 'in' | 'not in') additive)*
additive ::= nil_coal_expr (('+' | '-') nil_coal_expr)*
nil_coal_expr ::= multiplicative ('??' multiplicative)*
multiplicative ::= power_expr (('*' | '/' | '%') power_expr)*
power_expr ::= unary ['**' power_expr]
unary ::= ('!' | '-') unary | postfix
postfix ::= primary (member_access
| optional_member_access
| subscript_access
| slice_access
| call
| try_unwrap)*
member_access ::= '.' IDENTIFIER ['(' arg_list ')']
optional_member_access
::= '?.' IDENTIFIER ['(' arg_list ')']
subscript_access ::= '[' expression ']'
slice_access ::= '[' [expression] ':' [expression] ']'
call ::= '(' arg_list ')' (* only when postfix base is an identifier *)
try_unwrap ::= '?' (* expr? on Result *)
Primary expressions
primary ::= STRING_LITERAL
| INTERPOLATED_STRING
| INT_LITERAL
| FLOAT_LITERAL
| DURATION_LITERAL
| 'true' | 'false' | 'nil'
| IDENTIFIER
| '(' expression ')'
| list_literal
| dict_or_closure
| parallel_block
| parallel_each
| parallel_settle
| retry_block
| if_else
| match_expr
| deadline_block
| 'spawn' '{' block '}'
| 'fn' '(' fn_param_list ')' '{' block '}'
| 'try' '{' block '}'
list_literal ::= '[' (list_element (',' list_element)*)? ']'
list_element ::= '...' expression | expression
dict_or_closure ::= '{' '}'
| '{' closure_param_list '->' block '}'
| '{' dict_entries '}'
closure_param_list ::= fn_param_list
dict_entries ::= dict_entry (',' dict_entry)*
dict_entry ::= (IDENTIFIER | STRING_LITERAL | '[' expression ']')
':' expression
| '...' expression
arg_list ::= (arg_element (',' arg_element)*)?
arg_element ::= '...' expression | expression
Dict keys written as bare identifiers are converted to string literals
(e.g., {name: "x"} becomes {"name": "x"}).
Computed keys use bracket syntax: {[expr]: value}.
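Computed keys and spread entries can be combined in one literal (a sketch; the field names are illustrative):

```harn
let key = "status"
let base = {id: 7}
let merged = {...base, [key]: "ok"}   // spread base, then add a computed key
println(merged.status)                // "ok"
```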
Operator precedence table
From lowest to highest binding:
| Precedence | Operators | Associativity | Description |
|---|---|---|---|
| 1 | |> | Left | Pipe |
| 2 | ? : | Right | Ternary conditional |
| 3 | || | Left | Logical OR |
| 4 | && | Left | Logical AND |
| 5 | == != | Left | Equality |
| 6 | < > <= >= in not in | Left | Comparison / membership |
| 7 | + - | Left | Additive |
| 8 | ?? | Left | Nil coalescing |
| 9 | * / % | Left | Multiplicative |
| 10 | ** | Right | Exponentiation |
| 11 | ! - (unary) | Right (prefix) | Unary |
| 12 | . ?. [] [:] () ? | Left | Postfix |
Multiline expressions
Binary operators ||, &&, +, *, /, %, **, |> and the .
member
access operator can span multiple lines. The operator at the start of a
continuation line causes the parser to treat it as a continuation of the
previous expression rather than a new statement.
Note: - does not support multiline continuation because it is also a
unary negation prefix.
let result = items
.filter({ x -> x > 0 })
.map({ x -> x * 2 })
let msg = "hello"
+ " "
+ "world"
let ok = check_a()
&& check_b()
|| fallback()
Pipe placeholder (_)
When the right side of |> contains _ identifiers, the expression is
automatically wrapped in a closure where _ is replaced with the piped
value:
"hello world" |> split(_, " ") // desugars to: |> { __pipe -> split(__pipe, " ") }
[3, 1, 2] |> _.sort() // desugars to: |> { __pipe -> __pipe.sort() }
items |> len(_) // desugars to: |> { __pipe -> len(__pipe) }
Without _, the pipe passes the value as the first argument to a closure
or function.
Scope rules
Harn uses lexical scoping with a parent-chain environment model.
Environment
Each HarnEnvironment has:
- A
valuesdictionary mapping names toHarnValue - A
mutableset tracking which names were declared withvar - An optional
parentreference
Variable lookup
env.get(name) checks the current scope’s values first, then walks up the parent chain.
Returns nil (which becomes .nilValue) if not found anywhere.
Variable definition
- let name = value – defines name as immutable in the current scope.
- var name = value – defines name as mutable in the current scope.
Variable assignment
name = value walks up the scope chain to find the binding. If the binding is found but was
declared with let, throws HarnRuntimeError.immutableAssignment. If not found in any scope,
throws HarnRuntimeError.undefinedVariable.
Scope creation
New child scopes are created for:
- Pipeline bodies
- for loop bodies (loop variable is mutable)
- while loop iterations
- parallel, parallel each, and parallel settle task bodies (isolated interpreter per task)
- try/catch blocks (the catch body gets its own child scope with an optional error variable)
- Closure invocations (child of the captured environment, not the call site)
- block nodes
Control flow statements (if/else, match) execute in the current scope without creating a new child scope.
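One consequence of if/else sharing the enclosing scope: a let inside an if arm remains visible after the block (a sketch of the implied behavior):

```harn
if true {
    let greeting = "hi"
}
println(greeting)   // "hi" — the if body executed in the enclosing scope
```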
Destructuring patterns
Destructuring binds multiple variables from a dict or list in a single
let, var, or for-in statement.
Dict destructuring
let {name, age} = {name: "Alice", age: 30}
// name == "Alice", age == 30
Each field name in the pattern extracts the value for the matching key.
If the key is missing from the dict, the variable is bound to nil.
Default values
Pattern fields can specify default values with = expr syntax. The
default expression is evaluated when the extracted value is nil (i.e.
when the key is missing from the dict or the index is out of bounds for
a list):
let { name = "workflow", system = "" } = { name: "custom" }
// name == "custom" (key exists), system == "" (default applied)
let [a = 10, b = 20, c = 30] = [1, 2]
// a == 1, b == 2, c == 30 (default applied)
Defaults can be combined with field renaming:
let { name: displayName = "Unknown" } = {}
// displayName == "Unknown"
Default expressions are evaluated fresh each time the pattern is matched
(they are not memoized). Rest patterns (...rest) do not support
default values.
List destructuring
let [first, second, third] = [10, 20, 30]
// first == 10, second == 20, third == 30
Elements are bound positionally. If there are more bindings than elements
in the list, the excess bindings receive nil (unless a default value is
specified).
Field renaming
A dict pattern field can be renamed with key: alias syntax:
let {name: user_name} = {name: "Bob"}
// user_name == "Bob"
Rest patterns
A ...rest element collects remaining items into a new list or dict:
let [head, ...tail] = [1, 2, 3, 4]
// head == 1, tail == [2, 3, 4]
let {name, ...extras} = {name: "Carol", age: 25, role: "dev"}
// name == "Carol", extras == {age: 25, role: "dev"}
If there are no remaining items, the rest variable is bound to [] for
list patterns or {} for dict patterns. The rest element must appear
last in the pattern.
For-in destructuring
Destructuring patterns work in for-in loops to unpack each element:
let entries = [{name: "X", val: 1}, {name: "Y", val: 2}]
for {name, val} in entries {
println("${name}=${val}")
}
let pairs = [[1, 2], [3, 4]]
for [a, b] in pairs {
println("${a}+${b}")
}
Var destructuring
var destructuring creates mutable bindings that can be reassigned:
var {x, y} = {x: 1, y: 2}
x = 10
y = 20
Type errors
Destructuring a non-dict value with a dict pattern or a non-list value
with a list pattern produces a runtime error. For example,
let {a} = "hello" throws "dict destructuring requires a dict value".
Evaluation order
Program entry
1. All top-level nodes are scanned. Pipeline declarations are registered by name. Import declarations are processed (loaded and evaluated).
2. The entry pipeline is selected: the pipeline named "default" if it exists, otherwise the first pipeline in the file.
3. The entry pipeline’s body is executed.
If no pipeline is found in the file, all top-level statements are compiled and executed directly as an implicit entry point (script mode). This allows simple scripts to work without wrapping code in a pipeline block.
Pipeline parameters
If the pipeline parameter list includes task, it is bound to context.task.
If it includes project, it is bound to context.projectRoot.
A context dict is always injected with keys task, project_root, and task_type.
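A minimal sketch of the bindings (the concrete field values depend on how the host invokes the pipeline):
pipeline default(task, project) {
    // task and project are injected from the context dict
    println("task=${task} root=${project} type=${context.task_type}")
}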
Pipeline return type
Pipelines may declare a return type with the same -> TypeExpr syntax
as functions:
pipeline ghost_text(task) -> {text: string, code: int} {
return {text: "hello", code: 0}
}
The type checker verifies every return <expr> statement against the
declared type. Mismatches are reported as return type doesn't match
errors.
A declared return type is the typed contract that a host or bridge (ACP, A2A) can rely on when consuming the pipeline’s output.
Public pipelines (pub pipeline) without an explicit return type emit
the pipeline-return-type lint warning; explicit return types on the
Harn→ACP boundary will be required in a future release.
Pipeline inheritance
pipeline child(x) extends parent { ... }:
- If the child body contains override declarations, the resolved body is the parent’s body plus any non-override statements from the child. Override declarations are available for lookup by name.
- If the child body contains no override declarations, the child body entirely replaces the parent body.
Statement execution
Statements execute sequentially. The last expression value in a block is the block’s result, though this is mostly relevant for closures and parallel bodies.
Import resolution
import "path" resolves in this order:
- If path starts with std/, loads the embedded stdlib module (e.g. std/text)
- Relative to the current file’s directory; auto-adds the .harn extension
- .harn/packages/<path> directories rooted at the nearest ancestor package root (the search walks upward and stops at a .git boundary)
- Package manifest [exports] mappings under .harn/packages/<package>/harn.toml
- Package directories with a lib.harn entry point
Package manifests can publish stable module entry points without forcing consumers to import the on-disk file layout directly:
[exports]
capabilities = "runtime/capabilities.harn"
providers = "runtime/providers.harn"
With the example above, import "acme/capabilities" resolves to the
declared file inside the installed acme package.
Selective imports: import { name1, name2 } from "module" imports only
the specified functions. Functions marked pub are exported by default;
if no pub functions exist, all functions are exported.
Imported pipelines are registered for later invocation. Non-pipeline top-level statements (fn declarations, let bindings) are executed immediately.
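A minimal two-file sketch (file names are illustrative):
// math_utils.harn
pub fn double(x) { x * 2 }
fn internal_helper() { 0 }   // not pub, so not exported here

// main.harn
import { double } from "math_utils"
println(double(21))   // 42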
Static cross-module resolution
harn check, harn run, harn bench, and the LSP build a module graph
from the entry file that transitively loads every import-reachable
.harn module. The graph drives:
- Typechecker: when every import in a file resolves, call targets that are not builtins, not local declarations, not struct constructors, not callable variables, and not introduced by an import produce a call target ... is not defined or imported error (not a lint warning). This catches typos and stale imports before the VM loads.
- Linter: wildcard imports are resolved via the same graph; the undefined-function rule can now check against the actual exported name set of imported modules rather than silently disabling itself.
- LSP go-to-definition: cross-file navigation walks the graph’s definition_of lookup, so any reachable symbol (through any number of transitive imports) can be jumped to.
Resolution conservatively degrades to the pre-v0.7.12 behavior when any import in the file is unresolved (missing file, parse error, non-existent package directory), so a single broken import does not avalanche into a sea of false-positive undefined-name errors. The unresolved import itself still surfaces via the runtime loader.
Runtime values
| Type | Syntax | Description |
|---|---|---|
string | "text" | UTF-8 string |
int | 42 | Platform-width integer |
float | 3.14 | Double-precision float |
bool | true / false | Boolean |
nil | nil | Null value |
list | [1, 2, 3] | Ordered collection |
dict | {key: value} | String-keyed map |
set | set(1, 2, 3) | Unordered collection of unique values |
closure | { x -> x + 1 } | First-class function with captured environment |
enum | Color.Red | Enum variant, optionally with associated data |
struct | Point({x: 3, y: 4}) | Struct instance with named fields |
taskHandle | (from spawn) | Opaque handle to an async task |
Iter<T> | x.iter() / iter(x) | Lazy, single-pass, fused iterator. See Iterator protocol |
Pair<K, V> | pair(k, v) | Two-element value; access via .first / .second |
Truthiness
| Value | Truthy? |
|---|---|
bool(false) | No |
nil | No |
int(0) | No |
float(0) | No |
string("") | No |
list([]) | No |
dict([:]) | No |
set() (empty) | No |
| Everything else | Yes |
Equality
Values are equal if they have the same type and same contents, with these exceptions:
- int and float are compared by converting int to float
- Two closures are never equal
- Two task handles are equal if their IDs match
Comparison
Only int, float, and string support ordering (<, >, <=, >=).
Ordering comparisons between other types treat the operands as equal
(the internal comparison returns 0).
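For example:
1 < 2.5           // true: int compared as float
"apple" < "pear"  // true: lexicographic string ordering
[1] < [2]         // false: non-orderable types compare as equal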
Binary operator semantics
Arithmetic (+, -, *, /)
| Left | Right | + | - | * | / |
|---|---|---|---|---|---|
| int | int | int | int | int | int (truncating) |
| float | float | float | float | float | float |
| int | float | float | float | float | float |
| float | int | float | float | float | float |
| string | string | string (concatenation) | TypeError | TypeError | TypeError |
| string | int | TypeError | TypeError | string (repetition) | TypeError |
| int | string | TypeError | TypeError | string (repetition) | TypeError |
| list | list | list (concatenation) | TypeError | TypeError | TypeError |
| dict | dict | dict (merge, right wins) | TypeError | TypeError | TypeError |
| other | other | TypeError | TypeError | TypeError | TypeError |
Division by zero returns nil.
string * int repeats the string; negative or zero counts return "".
Type mismatches that are not listed as valid combinations above produce a
TypeError at runtime. The type checker reports these as compile-time errors
when operand types are statically known. Use to_string() or string
interpolation ("${expr}") for explicit type conversion.
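A few rows of the table in action:
7 / 2                  // 3 (int division truncates)
7.0 / 2                // 3.5
"ab" * 3               // "ababab"
[1] + [2, 3]           // [1, 2, 3]
{a: 1, b: 1} + {b: 2}  // {a: 1, b: 2} (right wins)
10 / 0                 // nil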
Modulo (%)
% is numeric-only. int % int returns int; any case involving a float
returns float. Modulo by zero behaves like division by zero and returns
nil.
Exponentiation (**)
** is numeric-only and right-associative, so 2 ** 3 ** 2 evaluates as
2 ** (3 ** 2).
- int ** int returns int for non-negative exponents that fit in u32, using wrapping integer exponentiation.
- Negative or very large integer exponents promote to float.
- Any case involving a float returns float.
- Non-numeric operands raise TypeError.
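For example:
2 ** 3 ** 2    // 512: right-associative, evaluated as 2 ** 9
2 ** 10        // 1024 (int)
2 ** -1        // 0.5 (negative exponent promotes to float)
9 ** 0.5       // 3.0 (any float operand yields float)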
Logical (&&, ||)
Short-circuit evaluation:
- &&: if left is falsy, returns false without evaluating right.
- ||: if left is truthy, returns true without evaluating right.
Nil coalescing (??)
Short-circuit: if left is not nil, returns left without evaluating right.
?? binds tighter than additive/comparison/logical operators but looser than
multiplicative operators, so xs?.count ?? 0 > 0 parses as
(xs?.count ?? 0) > 0.
Pipe (|>)
a |> f evaluates a, then:
- If f evaluates to a closure, invokes it with a as the single argument.
- If f is an identifier resolving to a builtin, calls the builtin with [a].
- If f is an identifier resolving to a closure variable, invokes it with a.
- Otherwise returns nil.
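A sketch of the first two cases (upper is assumed here as an available string builtin for illustration):
let shout = { s -> s + "!" }
"hi" |> shout              // "hi!"
"hi" |> upper |> shout     // assuming an upper builtin: "HI!"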
Ternary (? :)
condition ? trueExpr : falseExpr evaluates condition, then evaluates and returns
either trueExpr (if truthy) or falseExpr.
Ranges (to, to … exclusive)
a to b evaluates a and b (both must be integers) and produces a list of
consecutive integers. The form is inclusive by default — 1 to 5 is
[1, 2, 3, 4, 5] — because that matches how the expression reads aloud.
Add the trailing modifier exclusive to get the half-open form:
1 to 5 exclusive is [1, 2, 3, 4].
| Expression | Value | Shape |
|---|---|---|
1 to 5 | [1, 2, 3, 4, 5] | [a, b] |
1 to 5 exclusive | [1, 2, 3, 4] | [a, b) |
0 to 3 | [0, 1, 2, 3] | [a, b] |
0 to 3 exclusive | [0, 1, 2] | [a, b) |
If b < a, the result is the empty list. The range(n) / range(a, b) stdlib
builtins always produce the half-open form, for Python-compatible indexing.
Control flow
if/else
if condition {
// then
} else if other {
// else-if
} else {
// else
}
else if chains are parsed as a nested ifElse node in the else branch.
for/in
for item in iterable {
// body
}
If iterable is a list, iterates over elements. If iterable is a dict, iterates over
entries sorted by key, where each entry is {key: "...", value: ...}.
The loop variable is mutable within the loop body.
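For example, dict iteration visits entries in key order:
let scores = {beta: 2, alpha: 1}
for entry in scores {
    println("${entry.key}=${entry.value}")
}
// prints alpha=1, then beta=2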
while
while condition {
// body
}
Maximum 10,000 iterations (safety limit). Condition is re-evaluated each iteration.
match
match value {
pattern1 -> { body1 }
pattern2 if condition -> { body2 }
}
Patterns are expressions. Each pattern is evaluated and compared to the match value
using valuesEqual. An arm may include an if guard after the pattern; when
present, the arm only matches if the pattern matches and the guard expression
evaluates to a truthy value. The first matching arm executes.
If no arm matches, a runtime error is thrown (no matching arm in match expression).
This makes non-exhaustive matches a hard failure rather than a silent nil.
let x = 5
match x {
1 -> { "one" }
n if n > 3 -> { "big: ${n}" }
_ -> { "other" }
}
// -> "big: 5"
retry
retry 3 {
// body that may throw
}
Executes the body up to N times. If the body succeeds (no error), returns immediately.
If the body throws, catches the error and retries. return statements inside retry
propagate out (are not retried). After all attempts are exhausted, returns nil
(does not re-throw the last error).
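A sketch of the success-on-retry path, assuming retry yields the body’s value on success:
var attempts = 0
let result = retry 3 {
    attempts = attempts + 1
    if attempts < 3 { throw "flaky" }
    "ok"
}
// attempts == 3, result == "ok"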
Concurrency
parallel
parallel(count) { i ->
// body executed count times concurrently
}
Creates count concurrent tasks. Each task gets an isolated interpreter with a child
environment. The optional variable i is bound to the task index (0-based).
Returns a list of results in index order.
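For example:
let squares = parallel(4) { i -> i * i }
// squares == [0, 1, 4, 9]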
parallel each
parallel each list { item ->
// body for each item
}
Maps over a list concurrently. Each task gets an isolated interpreter. The variable is bound to the current list element. Returns a list of results in the original order.
parallel settle
parallel settle list { item ->
// body for each item
}
Like parallel each, but never throws. Instead, it collects both
successes and failures into a result object with fields:
| Field | Type | Description |
|---|---|---|
results | list | List of Result values (one per item), in order |
succeeded | int | Number of Ok results |
failed | int | Number of Err results |
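A sketch using the http_get builtin shown later in this guide (the URLs are illustrative):
let urls = ["https://a.example", "https://b.example"]
let outcome = parallel settle urls { url -> http_get(url) }
println("${outcome.succeeded} ok, ${outcome.failed} failed")
for r in outcome.results {
    if is_ok(r) { println(unwrap(r)) }
}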
defer
defer {
// cleanup body
}
Registers a block to run when the enclosing scope exits, whether by
normal return or by a thrown error. Multiple defer blocks in the same
scope execute in LIFO (last-registered, first-executed) order, similar
to Go’s defer. The deferred block runs in the scope where it was
declared.
fn open(path) { path }
fn close(f) { log("closing ${f}") }
let f = open("data.txt")
defer { close(f) }
// ... use f ...
// close(f) runs automatically on scope exit
spawn/await/cancel
let handle = spawn {
// async body
}
let result = await(handle)
cancel(handle)
spawn launches an async task and returns a taskHandle.
await (a built-in interpreter function, not a keyword) blocks until the task completes
and returns its result. cancel cancels the task.
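A sketch of fan-out with two concurrent LLM calls:
let a = spawn { llm_call("Summarize file A.") }
let b = spawn { llm_call("Summarize file B.") }
// both calls run concurrently; await joins them in order
let summaries = [await(a), await(b)]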
Channels
Channels provide typed message-passing between concurrent tasks.
let ch = channel("name", 10) // buffered channel with capacity 10
send(ch, "hello") // send a value
let msg = receive(ch) // blocking receive
Channel iteration
A for-in loop over a channel asynchronously receives values until the
channel is closed and drained:
let ch = channel("stream", 10)
spawn {
send(ch, "a")
send(ch, "b")
close_channel(ch)
}
for item in ch {
println(item) // prints "a", then "b"
}
// loop exits after channel is closed and all items are consumed
When the channel is closed, remaining buffered items are still delivered. The loop exits once all items have been consumed.
close_channel(ch)
Closes a channel. After closing, send returns false and no new values
are accepted. Buffered items can still be received.
try_receive(ch)
Non-blocking receive. Returns the next value from the channel, or nil if
the channel is empty (regardless of whether it is closed).
select
Multiplexes across multiple channels, executing the body of whichever channel receives a value first:
select {
msg from ch1 {
log("ch1: ${msg}")
}
msg from ch2 {
log("ch2: ${msg}")
}
}
Each case binds the received value to a variable (msg) and executes the
corresponding body. Only one case fires per select.
timeout case
fn handle(msg) { log(msg) }
let ch1 = channel(1)
select {
msg from ch1 { handle(msg) }
timeout 5s {
log("timed out")
}
}
If no channel produces a value within the duration, the timeout body runs.
default case (non-blocking)
fn handle(msg) { log(msg) }
let ch1 = channel(1)
select {
msg from ch1 { handle(msg) }
default {
log("nothing ready")
}
}
If no channel has a value immediately available, the default body runs
without blocking. timeout and default are mutually exclusive.
select() builtin
The statement form desugars to the select(ch1, ch2, ...) async builtin,
which returns {index, value, channel}. The builtin can be called directly
for dynamic channel lists.
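A sketch of the direct call form:
let ch1 = channel("a", 1)
let ch2 = channel("b", 1)
spawn { send(ch2, "hello") }
let r = select(ch1, ch2)
// r.index == 1, r.value == "hello"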
Error model
throw
throw expression
Evaluates the expression and throws it as HarnRuntimeError.thrownError(value).
Any value can be thrown (strings, dicts, etc.).
try/catch/finally
try {
// body
} catch (e) {
// handler
} finally {
// cleanup — always runs
}
If the body throws:
- A
thrownError(value):eis bound to the thrown value directly. - Any other runtime error:
eis bound to the error’slocalizedDescriptionstring.
return inside a try block propagates out of the enclosing pipeline (is not caught).
The error variable (e) is optional: catch { ... } is valid without it.
try { ... } catch (e) { ... } is also usable as an expression: the value of
the whole form is the tail value of the try body when it succeeds, and the tail
value of the catch handler when an error is caught. This means the natural
let v = try { risky() } catch (e) { fallback } binding is supported directly,
without needing to restructure through Result helpers. When a typed catch
(catch (e: AppError) { ... }) does not match the thrown error’s type, the
throw propagates past the expression unchanged — the surrounding let never
binds. See the Try-expression section below for the
Result-wrapping behavior when catch is omitted.
try* (rethrow-into-catch)
try* EXPR is a prefix operator that evaluates EXPR and rethrows any
thrown error so an enclosing try { ... } catch (e) { ... } can handle
it, instead of forcing the caller to manually convert thrown errors
into a Result and then guard is_ok / unwrap. The lowered form is:
{ let _r = try { EXPR }
guard is_ok(_r) else { throw unwrap_err(_r) }
unwrap(_r) }
On success try* EXPR evaluates to EXPR’s value with no Result
wrapping. The rethrow runs every finally block between the rethrow
site and the innermost catch handler exactly once, matching the
finally exactly-once guarantee for plain throw.
fn fetch(prompt) {
// Without try*: try { llm_call(prompt) } / guard is_ok / unwrap
let response = try* llm_call(prompt)
return parse(response)
}
let outcome = try {
let result = fetch(prompt)
Ok(result)
} catch (e: ApiError) {
Err(e.code)
}
try* requires an enclosing function (fn, tool, or pipeline) so
the rethrow has a body to live in — using it at module top level is a
compile error. The operand is parsed at unary-prefix precedence, so
try* foo.bar(1) parses as try* (foo.bar(1)) and try* a + b parses
as (try* a) + b. Use parentheses to combine try* with binary
operators on its operand. try* is distinct from the postfix ?
operator: ? early-returns Result.Err(...) from a Result-returning
function, while try* rethrows a thrown value into an enclosing catch.
finally
The finally block is optional and runs regardless of whether the try body
succeeds, throws, or the catch body re-throws. Supported forms:
try { ... } catch e { ... } finally { ... }
try { ... } finally { ... }
try { ... } catch e { ... }
return, break, and continue inside a try body with a finally block will
execute the finally block before the control flow transfer completes.
The finally block’s return value is discarded — the overall expression value comes from the try or catch body.
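For example, since the finally value is discarded:
let v = try { "result" } catch (e) { "fallback" } finally { "ignored" }
// v == "result"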
Functions and closures
fn declarations
fn name(param1, param2) {
return param1 + param2
}
Declares a named function. Equivalent to let name = { param1, param2 -> ... }.
The function captures the lexical scope at definition time.
Default parameters
Parameters may have default values using = expr. Required parameters must
come before optional (defaulted) parameters. Defaults are evaluated fresh at
each call site (not memoized at definition time). Any expression is valid as
a default — not just literals.
fn greet(name, greeting = "hello") {
log("${greeting}, ${name}!")
}
greet("world") // "hello, world!"
greet("world", "hi") // "hi, world!"
fn config(host = "localhost", port = 8080, debug = false) {
// all params optional
}
let add = { x, y = 10 -> x + y } // closures support defaults too
Explicit nil counts as a provided argument (does NOT trigger the default).
Arguments are positional — fill left to right, only trailing defaults can
be omitted.
tool declarations
tool read_file(path: string, encoding: string) -> string {
description "Read a file from the filesystem"
read_file(path)
}
tool search(query: string, file_glob: string = "*.py") -> string {
description "Search files matching an optional glob"
"..."
}
Declares a named tool and registers it with a tool registry. The body is
compiled as a closure and attached as the tool’s handler. An optional
description metadata string may appear as the first statement in the body.
Annotated tool parameter and return types are lowered into the same schema
model used by runtime validation and structured LLM I/O. Primitive types map to
their JSON Schema equivalents, while nested shapes, list<T>,
dict<string, V>, and unions produce nested schema objects. Parameters with
default values are emitted as optional schema fields (required: false) and
include their default value in the generated tool registry entry.
The result of a tool declaration is a tool registry dict (the return
value of tool_define). Multiple tool declarations accumulate into
separate registries; use tool_registry() and tool_define(...) for
multi-tool registries.
Like fn, tool may be prefixed with pub.
Deferred tool loading (defer_loading)
A tool registered through tool_define may set defer_loading: true
in its config dict. Deferred tools keep their schema out of the model’s
context on each LLM call until a tool-search call surfaces them.
fn admin(token) { log(token) }
let registry = tool_registry()
registry = tool_define(registry, "rare_admin_action", "...", {
parameters: {token: {type: "string"}},
defer_loading: true,
handler: { args -> admin(args.token) },
})
defer_loading is validated as a bool at registration time — typos like
defer_loading: "yes" raise at tool_define rather than silently
falling back to eager loading.
Deferred tools are only materialised on the wire when the call opts
into tool_search (see the llm_call option of the same name and
docs/src/llm-and-agents.md). Harn supports two native backends plus a
provider-agnostic client fallback:
- Anthropic Claude Opus/Sonnet 4.0+ and Haiku 4.5+ — Harn emits defer_loading: true on each deferred tool and prepends the tool_search_tool_{bm25,regex}_20251119 meta-tool. Anthropic keeps deferred schemas in the API prefix (prompt caching stays warm) but out of the model’s context.
- OpenAI GPT 5.4+ (Responses API) — Harn emits defer_loading: true on each deferred tool and prepends {"type": "tool_search", "mode": "hosted"} to the tools array. OpenRouter, Together, Groq, DeepSeek, Fireworks, HuggingFace, and local vLLM inherit the capability when their routed model matches gpt-5.4+.
- Everyone else (and any of the above on older models) — Harn injects a synthetic __harn_tool_search tool and runs the configured strategy (BM25, regex, semantic, or host-delegated) in-VM, promoting matching deferred tools into the next turn’s schema list.
Tool entries may also set namespace: "<label>" to group deferred tools
for the OpenAI meta-tool’s namespaces field. The field is a harmless
passthrough on Anthropic — ignored by the API, preserved in replay.
mode: "native" refuses to silently downgrade and errors when the
active (provider, model) pair is not natively capable; mode: "client"
forces the fallback everywhere; mode: "auto" (default) picks native
when available.
The per-provider / per-model capability table that gates native
tool_search, defer_loading, prompt caching, and extended thinking
is a shipped TOML matrix overridable per-project via
[[capabilities.provider.<name>]] in harn.toml. Scripts query the
effective matrix at runtime with:
let caps = provider_capabilities("anthropic", "claude-opus-4-7")
// {
// provider, model, native_tools, defer_loading,
// tool_search: [string], max_tools: int | nil,
// prompt_caching, thinking,
// }
The provider_capabilities_install(toml_src) and
provider_capabilities_clear() builtins let scripts install and
revert overrides in-process for cases where editing the manifest is
awkward (runtime proxy detection, conformance test setup). See
docs/src/llm-and-agents.md#capability-matrix--harntoml-overrides
for the rule schema.
skill declarations
pub skill deploy {
description "Deploy the application to production"
when_to_use "User says deploy/ship/release"
invocation "explicit"
paths ["infra/**", "Dockerfile"]
allowed_tools ["bash", "git"]
model "claude-opus-4-7"
effort "high"
prompt "Follow the deployment runbook."
on_activate fn() {
log("deploy skill activated")
}
on_deactivate fn() {
log("deploy skill deactivated")
}
}
Declares a named skill and registers it with a skill registry. A skill bundles metadata, tool references, MCP server lists, system-prompt fragments, and auto-activation rules into a typed unit that hosts can enumerate, select, and invoke.
Body entries are <field_name> <expression> pairs separated by
newlines. The field name is an ordinary identifier (no keyword is
reserved), and the value is any expression — string literal, list
literal, identifier reference, dict literal, or fn-literal (for
lifecycle hooks). The compiler lowers the decl to:
skill_define(skill_registry(), NAME, { field: value, ... })
and binds the resulting registry dict to NAME, parallel to how
tool NAME { ... } works.
skill_define performs light value-shape validation on known keys:
description, when_to_use, prompt, invocation, model, effort
must be strings; paths, allowed_tools, mcp must be lists.
Mistyped values fail at registration rather than at use. Unknown keys
pass through unchanged to support integrator metadata.
Like fn and tool, skill may be prefixed with pub to export it
from the module. The registry-dict value is bound as a module-level
variable.
Skill registry operations
let reg = skill_registry()
let reg = skill_define(reg, "review", {
description: "Code review",
invocation: "auto",
paths: ["src/**"],
})
skill_count(reg) // int
skill_find(reg, "review") // dict | nil
skill_list(reg) // list (closure hooks stripped)
skill_select(reg, ["review"])
skill_remove(reg, "review")
skill_describe(reg) // formatted string
skill_list strips closure-valued fields (lifecycle hooks) so its
output is safe to serialize. skill_find returns the full entry
including closures.
@acp_skill attribute
Functions can be promoted into skills via the @acp_skill attribute:
@acp_skill(name: "deploy", when_to_use: "User says deploy", invocation: "explicit")
pub fn deploy_run() { ... }
Attribute arguments populate the skill’s metadata dict, and the
annotated function is registered as the skill’s on_activate
lifecycle hook. Like @acp_tool, @acp_skill only applies to
function declarations; using it on other kinds of item is a compile
error.
Closures
let f = { x -> x * 2 }
let g = { a, b -> a + b }
First-class values. When invoked, a child environment is created from the captured environment (not the call-site environment), and parameters are bound as immutable bindings.
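For example, captures come from the definition scope, not the call site:
let factor = 3
let scale = { x -> x * factor }   // factor captured at definition
fn apply(f) {
    let factor = 100              // does not affect the closure
    return f(5)
}
apply(scale)   // 15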
Spread in function calls
The spread operator ... expands a list into individual function arguments.
It can be used in both function calls and method calls:
fn add(a, b, c) {
return a + b + c
}
let args = [1, 2, 3]
add(...args) // equivalent to add(1, 2, 3)
Spread arguments can be mixed with regular arguments:
fn add(a, b, c) { return a + b + c }
let rest = [2, 3]
add(1, ...rest) // equivalent to add(1, 2, 3)
Multiple spreads are allowed in a single call, and they can appear in any position:
fn add(a, b, c) { return a + b + c }
let first = [1]
let last = [3]
add(...first, 2, ...last) // equivalent to add(1, 2, 3)
At runtime the VM flattens all spread arguments into the argument list before invoking the function. If the total number of arguments does not match the function’s parameter count, the usual arity error is produced.
Return
return value inside a function/closure unwinds execution via
HarnRuntimeError.returnValue. The closure invocation catches this and returns the value.
return inside a pipeline terminates the pipeline.
Enums
Enums define a type with a fixed set of named variants, each optionally carrying associated data.
Enum declaration
enum Color {
Red,
Green,
Blue
}
enum Shape {
Circle(float),
Rectangle(float, float)
}
Variants without data are simple tags. Variants with data carry positional fields specified in parentheses.
Enum construction
Variants are constructed using dot syntax on the enum name:
let c = Color.Red
let s = Shape.Circle(5.0)
let r = Shape.Rectangle(3.0, 4.0)
Pattern matching on enums
Enum variants are matched using EnumName.Variant(binding) patterns in
match expressions:
match s {
Shape.Circle(radius) -> { log("circle r=${radius}") }
Shape.Rectangle(w, h) -> { log("rect ${w}x${h}") }
}
A match on an enum must be exhaustive: a missing variant is a hard
error, not a warning. Add the missing arm or end with a wildcard
_ -> { … } arm to opt out. if/else if/else chains stay intentionally
partial; opt into exhaustiveness by ending the chain with
unreachable("…").
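For example, a wildcard arm opts out of exhaustiveness checking:
enum Color { Red, Green, Blue }
fn label(c) {
    match c {
        Color.Red -> { "stop" }
        _ -> { "go" }   // covers Green and Blue
    }
}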
Built-in Result enum
Harn provides a built-in generic Result<T, E> enum with two variants:
- Result.Ok(value) – represents a successful result
- Result.Err(error) – represents an error
Shorthand constructor functions Ok(value) and Err(value) are available
as builtins, equivalent to Result.Ok(value) and Result.Err(value).
let ok = Ok(42)
let err = Err("something failed")
let typed_ok: Result<int, string> = ok
// Equivalent long form:
let ok2 = Result.Ok(42)
let err2 = Result.Err("oops")
Result helper functions
| Function | Description |
|---|---|
is_ok(r) | Returns true if r is Result.Ok |
is_err(r) | Returns true if r is Result.Err |
unwrap(r) | Returns the Ok value, throws if r is Err |
unwrap_or(r, default) | Returns the Ok value, or default if r is Err |
unwrap_err(r) | Returns the Err value, throws if r is Ok |
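For example:
let ok = Ok(5)
let err = Err("boom")
is_ok(ok)             // true
unwrap(ok)            // 5
unwrap_or(err, -1)    // -1
unwrap_err(err)       // "boom"
unwrap(err)           // throws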
The ? operator (Result propagation)
The postfix ? operator unwraps a Result.Ok value or propagates a
Result.Err from the current function. It is a postfix operator with the
same precedence as ., [], and ().
fn divide(a, b) {
if b == 0 {
return Err("division by zero")
}
return Ok(a / b)
}
fn compute(x) {
let result = divide(x, 2)? // unwraps Ok, or returns Err early
return Ok(result + 10)
}
let r1 = compute(20) // Result.Ok(20)
let r2 = compute(0) // would propagate Err from divide
The ? operator requires its operand to be a Result value. Applying ?
to a non-Result value produces a type error at runtime.
Disambiguation: when the parser sees expr?, it distinguishes between the
postfix ? (Result propagation) and the ternary ? : operator by checking
whether the token following ? could start a ternary branch expression.
Pattern matching on Result
match result {
Result.Ok(val) -> { log("success: ${val}") }
Result.Err(err) -> { log("error: ${err}") }
}
Try-expression
The try keyword used without a catch block acts as a try-expression.
It evaluates the body and wraps the result in a Result:
- If the body succeeds, returns
Result.Ok(value). - If the body throws an error, returns
Result.Err(error).
let result = try { json_parse(raw_input) }
// result is Result.Ok(parsed_data) or Result.Err("invalid JSON: ...")
The try-expression is the complement of the ? operator: try enters
Result-land by catching errors, while ? exits Result-land by propagating
errors. Together they form a complete error-handling pipeline:
fn safe_divide(a, b) {
let result = try { a / b }
return result
}
fn compute(x) {
let val = safe_divide(x, 2)? // unwrap Ok or propagate Err
return Ok(val + 10)
}
No catch or finally block is needed for the Result-wrapping form. When
catch or finally follow try, the form is a handled try/catch
expression whose value is the try or catch body’s tail value (see
try/catch/finally); only the bare try { ... } form wraps
in Result.
Result in pipelines
The ? operator works naturally in pipelines:
fn fetch_and_parse(url) {
let response = http_get(url)?
let data = json_parse(response)?
return Ok(data)
}
Structs
Structs define named record types with typed fields. Structs may also be generic.
Struct declaration
struct Point {
x: int
y: int
}
struct User {
name: string
age: int
}
struct Pair<A, B> {
first: A
second: B
}
Fields are declared with name: type syntax, one per line.
Struct construction
Struct instances can be constructed with the struct name followed by a named-field body:
let p = Point { x: 3, y: 4 }
let u = User { name: "Alice", age: 30 }
let pair: Pair<int, string> = Pair { first: 1, second: "two" }
Field access
Struct fields are accessed with dot syntax, the same as dict property access:
log(p.x) // 3
log(u.name) // "Alice"
Impl blocks
Impl blocks attach methods to a struct type.
Syntax
impl TypeName {
fn method_name(self, arg) {
// body -- self refers to the struct instance
}
}
The first parameter of each method must be self, which receives the
struct instance the method is called on.
Method calls
Methods are called using dot syntax on struct instances:
struct Point {
x: int
y: int
}
impl Point {
fn distance(self) {
return sqrt(self.x * self.x + self.y * self.y)
}
fn translate(self, dx, dy) {
return Point { x: self.x + dx, y: self.y + dy }
}
}
let p = Point { x: 3, y: 4 }
log(p.distance()) // 5.0
let p2 = p.translate(10, 20)
log(p2.x) // 13
When instance.method(args) is called, the VM looks up methods registered
by the impl block for the instance’s struct type. The instance is
automatically passed as the self argument.
Interfaces
Interfaces define a set of method signatures that a struct type must
implement. Harn uses Go-style implicit satisfaction: a struct satisfies
an interface if its impl block contains all the required methods with
compatible signatures. There is no implements keyword. Interfaces may
also declare associated types.
Interface declaration
interface Displayable {
fn display(self) -> string
}
interface Serializable {
fn serialize(self) -> string
fn byte_size(self) -> int
}
interface Collection {
type Item
fn get(self, index: int) -> Item
}
Each method signature lists parameters (the first must be self) and an
optional return type. Associated types name implementation-defined types
that methods can refer to. The body is omitted – interfaces only declare
the shape of the methods.
Implicit satisfaction
A struct satisfies an interface when its impl block has all the methods
declared by the interface, with matching parameter counts:
struct Dog {
name: string
}
impl Dog {
fn display(self) -> string {
return "Dog(${self.name})"
}
}
Dog satisfies Displayable because it has a display(self) -> string
method. No extra annotation is needed.
Using interfaces as type annotations
Interfaces can be used as parameter types. At compile time, the type checker verifies that any struct passed to such a parameter satisfies the interface:
fn show(item: Displayable) {
println(item.display())
}
let d = Dog { name: "Rex" }
show(d) // OK: Dog satisfies Displayable
Generic constraints with interfaces
Interfaces can be used as generic constraints via where clauses:
fn process<T>(item: T) where T: Displayable {
println(item.display())
}
The type checker verifies at call sites that the concrete type passed
for T satisfies Displayable. Passing a type that does not satisfy
the constraint produces a compile-time error. Generic parameters must bind
consistently across all arguments in the call, and container bindings such as
list<T> propagate the concrete element type instead of collapsing to an
unconstrained generic.
Subtyping and variance
Harn’s subtype relation is polarity-aware: each compound type has a
declared variance per slot that determines whether widening (e.g.
int <: float) is allowed in that slot, prohibited entirely, or
applied with the direction reversed.
Type parameters on user-defined generics may be marked with in or
out:
type Reader<out T> = fn() -> T // T appears only in output position
interface Sink<in T> { fn accept(v: T) -> int }
fn map<in A, out B>(value: A) -> B { ... }
| Marker | Meaning | Where T may appear |
|---|---|---|
| out T | covariant | output positions only |
| in T | contravariant | input positions only |
| (none) | invariant (default) | anywhere |
Unannotated parameters default to invariant. This is strictly
safer than implicit covariance — Box<int> does not flow into
Box<float> unless Box declares out T and the body uses T
only in covariant positions.
Built-in variance
| Constructor | Variance |
|---|---|
| iter<T> | covariant in T (read-only) |
| list<T> | invariant in T (mutable: push, index assignment) |
| dict<K, V> | invariant in both K and V (mutable) |
| Result<T, E> | covariant in both T and E |
| fn(P1, ...) -> R | parameters contravariant, return covariant |
| Shape { field: T, ... } | covariant per field (width subtyping) |
The numeric widening int <: float only applies in covariant
positions. In invariant or contravariant positions it is suppressed —
that is what makes list<int> to list<float> a type error.
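A minimal illustration of these rules (type spellings follow the table above; the error comment is illustrative, not the checker's exact wording):

```harn
let xs: list<int> = [1, 2, 3]

// Covariant, read-only position: int <: float applies per element.
let it: iter<float> = xs.iter()

// Invariant position: widening is suppressed.
let ys: list<float> = xs   // ERROR: list<int> does not flow into list<float>
```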
Function subtyping
For an actual fn(A) -> R' to be a subtype of an expected fn(B) -> R,
B must be a subtype of A (parameters are contravariant) and
R' must be a subtype of R (return is covariant). A callback that
accepts a wider input or produces a narrower output is always a valid
substitute.
let wide = fn(x: float) { return 0 }
let cb: fn(int) -> int = wide // OK: float-accepting closure stands in for int-accepting
let narrow = fn(x: int) { return 0 }
let bad: fn(float) -> int = narrow // ERROR: narrow cannot accept the float a caller may pass
Declaration-site checking
When a type parameter is marked in or out, the declaration body
is checked: each occurrence of the parameter must respect the
declared variance. Mismatches are caught at definition time, not at
each use:
type Box<out T> = fn(T) -> int
// ERROR: type parameter 'T' is declared 'out' (covariant) but appears
// in a contravariant position in type alias 'Box'
Attributes
Attributes are declarative metadata attached to a top-level declaration
with the @ prefix. They compile to side-effects (warnings, runtime
registrations) at the attached declaration, and they stack, so a single
declaration can carry several. Arguments are restricted to literal values
(strings, numbers, booleans, nil, bare identifiers) — no runtime
evaluation, no expressions.
Syntax
attribute ::= '@' IDENTIFIER ['(' attr_arg (',' attr_arg)* [','] ')']
attr_arg ::= [IDENTIFIER ':'] attr_value
attr_value ::= literal | IDENTIFIER
@deprecated(since: "0.8", use: "compute_v2")
@test
pub fn compute(x: int) -> int { return x + 1 }
Attributes attach to the immediately following declaration —
either pipeline, fn, tool, struct, enum, type, interface,
or impl. Attaching to anything else (a let, a statement) is a parse
error.
Standard attributes
@deprecated
@deprecated(since: "0.8", use: "new_fn")
pub fn old_fn() -> int { ... }
Emits a type-checker warning at every call site of the attributed function. Both arguments are optional; when present they are folded into the warning message.
| Argument | Type | Meaning |
|---|---|---|
| since | string | Version that introduced the deprecation |
| use | string | Replacement function name (rendered as a help line) |
@test
@test
pipeline test_smoke(task) { ... }
Marks a pipeline as a test entry point. The conformance / harn test
runner discovers attributed pipelines in addition to the legacy
test_* naming convention. Both forms continue to work.
@complexity(allow)
@complexity(allow)
pub fn classify(x: int) -> string {
if x == 1 { return "one" }
...
}
Suppresses the cyclomatic-complexity lint warning on the attached
function. The bare allow identifier is the only currently accepted
form. Use it for functions whose branching is intrinsic (parsers,
tier dispatchers, tree-sitter adapters) rather than accidental.
The rule fires when a function’s cyclomatic score exceeds the default
threshold of 25. Projects can override the threshold in
harn.toml:
[lint]
complexity_threshold = 15 # stricter for this project
Cyclomatic complexity counts each branching construct (if/else,
guard, match arm, for, while, try/catch, ternary,
select case, retry) and each short-circuit boolean operator
(&&, ||). Nesting, guard-vs-if, and De Morgan rewrites are all
score-preserving — the only way to reduce the count is to
extract helpers or mark the function @complexity(allow).
@acp_tool
@acp_tool(name: "edit", kind: "edit", side_effect_level: "mutation")
pub fn apply_edit(path: string, content: string) -> EditResult { ... }
Compiles to the same runtime registration as an imperative
tool_define(tool_registry(), name, "", { handler, annotations })
call, with the function bound as the tool’s handler and every named
attribute argument (other than name) lifted into the
annotations dict. name defaults to the function name when
omitted.
| Argument | Type | Meaning |
|---|---|---|
| name | string | Tool name (defaults to fn name) |
| kind | string | One of read, edit, delete, move, search, execute, think, fetch, other |
| side_effect_level | string | none, read, mutation, destructive |
Other named arguments pass through to the annotations dict unchanged,
so additional ToolAnnotations fields can be added without a parser
change.
Unknown attributes
Unknown attribute names produce a type-checker warning so that misspellings surface at check time. The attribute itself is otherwise ignored — code still compiles.
Type annotations
Harn has an optional, gradual type system. Type annotations are checked at compile time but do not affect runtime behavior. Omitting annotations is always valid.
Basic types
let name: string = "Alice"
let age: int = 30
let rate: float = 3.14
let ok: bool = true
let nothing: nil = nil
The never type
never is the bottom type — the type of expressions that never produce a
value. It is a subtype of all other types.
Expressions that infer to never:
- throw expr
- return expr
- break and continue
- A block where every control path exits
- An if/else where both branches infer to never
- Calls to unreachable()
never is removed from union types: never | string simplifies to
string. An empty union (all members removed by narrowing) becomes
never.
fn always_throws() -> never {
throw "this function never returns normally"
}
The any type
any is the top type and the explicit escape hatch. Every concrete
type is assignable to any, and any is assignable back to every
concrete type without narrowing. any disables type checking in both
directions for the values it flows through.
fn passthrough(x: any) -> any {
return x
}
let s: string = passthrough("hello") // any → string, no narrowing required
let n: int = passthrough(42)
Use any deliberately, when you want to opt out of checking — for
example, a generic dispatcher that forwards values through a runtime
protocol you don’t want to describe statically. Prefer unknown (see
below) for values from untrusted boundaries where callers should be
forced to narrow.
The unknown type
unknown is the safe top type. Every concrete type is assignable to
unknown, but an unknown value is not assignable to any
concrete type without narrowing. This is the correct annotation for
values arriving from untrusted boundaries (parsed JSON, LLM responses,
dynamic dicts) where callers should be forced to validate the shape
before use.
fn describe(v: unknown) -> string {
// Direct use of `v` as a concrete type is a compile-time error.
// Narrow via type_of/schema_is first.
if type_of(v) == "string" {
return "string: ${v.uppercase()}"
}
if type_of(v) == "int" {
return "int: ${v + 1}"
}
return "other"
}
Narrowing rules for unknown:
- type_of(x) == "T" narrows x to T on the truthy branch (where T is one of the type-of protocol names: string, int, float, bool, nil, list, dict, closure).
- schema_is(x, Shape) narrows x to Shape on the truthy branch.
- guard type_of(x) == "T" else { ... } narrows x to T in the surrounding scope after the guard.
- The falsy branch keeps unknown — subtracting one concrete type from an open top still leaves an open top. The checker still tracks which concrete type_of variants have been ruled out on the current flow path, so an exhaustive chain ending in unreachable() / throw can be validated; see the “Exhaustive narrowing on unknown” subsection of “Flow-sensitive type refinement”.
Interop between any and unknown:
- unknown is assignable to any (upward to the full escape hatch).
- any is assignable to unknown (downward — the any escape hatch lets it flow into anything, including unknown).
When to pick which:
- No annotation — “I haven’t annotated this.” Callers get no checking. Use for internal, unstable code.
- unknown — “this value could be anything; narrow before use.” Use at untrusted boundaries and in APIs that hand back open-ended data. This is the preferred annotation for LLM / JSON / dynamic dict values.
- any — “stop checking.” A last-resort escape hatch. Prefer unknown unless you have a specific reason to defeat checking bidirectionally.
Union types
let value: string | nil = nil
let id: int | string = "abc"
Union members may also be literal types — specific string or int values used to encode enum-like discriminated sets:
type Verdict = "pass" | "fail" | "unclear"
type RetryCount = 0 | 1 | 2 | 3
let v: Verdict = "pass"
Literal types are assignable to their base type ("pass" flows into
string), and a base-typed value flows into a literal union (string
into Verdict). Runtime schema_is / schema_expect guards and the
parameter-annotation runtime check reject values that violate the
literal set.
A match on a literal union must cover every literal or include a
wildcard _ arm — non-exhaustive match is a hard error.
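For example, using the Verdict alias above:

```harn
fn describe(v: Verdict) -> string {
    return match v {
        "pass" -> { "all good" }
        "fail" -> { "needs work" }
        "unclear" -> { "re-run" }
        // Removing any arm without adding a `_` arm is a compile-time error.
    }
}
```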
Tagged shape unions (discriminated unions)
A union of two or more dict shapes is a tagged shape union when the
shapes share a discriminant field. The discriminant is auto-detected:
the first field of the first variant that (a) is non-optional in every
member, (b) has a literal type (LitString or LitInt), and (c) takes
a distinct literal value per variant qualifies. The field can be named
anything — kind, type, op, t, etc. — there is no privileged
spelling.
type Msg =
{kind: "ping", ttl: int} |
{kind: "pong", latency_ms: int}
Matching on the discriminant narrows the value to the matching variant
inside each arm; the same narrowing fires under an
if obj.<tag> == "value" check and in its else branch:
fn handle(m: Msg) -> string {
match m.kind {
"ping" -> { return "ttl=" + to_string(m.ttl) }
"pong" -> { return to_string(m.latency_ms) + "ms" }
}
}
Such a match must cover every variant or include a wildcard _ arm
— non-exhaustive match is a hard error.
Distributive generic instantiation
Generic type aliases distribute over closed-union arguments. Writing
Container<A | B> is equivalent to Container<A> | Container<B> so
each instantiation independently fixes the type parameter. This is what
keeps process_action: fn("create") -> nil flowing into a list<ActionContainer<Action>> element instead of getting rejected by the
contravariance of the function-parameter slot:
type Action = "create" | "edit"
type ActionContainer<T> = {action: T, process_action: fn(T) -> nil}
ActionContainer<Action> resolves to ActionContainer<"create"> | ActionContainer<"edit">, and a literal-tagged shape on the right flows
into the matching branch.
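A sketch of how this plays out, using the aliases above (the handler bodies are illustrative):

```harn
let create_handler: ActionContainer<"create"> =
    {action: "create", process_action: fn(a) { log("creating") }}
let edit_handler: ActionContainer<"edit"> =
    {action: "edit", process_action: fn(a) { log("editing") }}

// Each element independently fixes T via the distributed union.
let handlers: list<ActionContainer<Action>> = [create_handler, edit_handler]
```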
Parameterized types
let numbers: list<int> = [1, 2, 3]
let headers: dict<string, string> = {content_type: "json"}
Structural types (shapes)
Dict shape types describe the expected fields of a dict value. The type checker verifies that dict literals have the required fields with compatible types.
let user: {name: string, age: int} = {name: "Alice", age: 30}
Optional fields use ? and need not be present:
let config: {host: string, port?: int} = {host: "localhost"}
Width subtyping: a dict with extra fields satisfies a shape that requires fewer fields.
fn greet(u: {name: string}) -> string {
return "hi ${u.name}"
}
greet({name: "Bob", age: 25}) // OK — extra field allowed
Nested shapes:
let data: {user: {name: string}, tags: list} = {user: {name: "X"}, tags: []}
Shapes are compatible with dict and dict<string, V> when all field values match V.
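For example:

```harn
fn send(headers: dict<string, string>) -> int {
    return headers.count
}

// The literal's shape {accept: string, host: string} is compatible with
// dict<string, string> because every field value matches V.
send({accept: "json", host: "example.com"})
```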
Type aliases
type Config = {model: string, max_tokens: int}
let cfg: Config = {model: "gpt-4", max_tokens: 100}
A type alias can also drive schema validation for structured LLM output
and runtime guards. schema_of(T) lowers an alias to a JSON-Schema
dict at compile time:
type GraderOut = {
verdict: "pass" | "fail" | "unclear",
summary: string,
findings: list<string>,
}
// Use the alias directly wherever a schema dict is expected.
let s = schema_of(GraderOut)
let ok = schema_is({verdict: "pass", summary: "x", findings: []}, GraderOut)
let r = llm_call(prompt, nil, {
provider: "openai",
output_schema: GraderOut, // alias in value position — compiled to schema_of(T)
schema_retries: 2,
})
The emitted schema follows canonical JSON-Schema conventions (objects
with properties/required, arrays with items, literal unions as
{type, enum}) so it is compatible with structured-output validators
and with ACP ToolAnnotations.args schemas. The compile-time lowering
applies when the alias identifier appears as:
- The argument of schema_of(T).
- The schema argument of schema_is, schema_expect, schema_parse, schema_check, is_type, json_validate.
- The value of an output_schema: entry in an llm_call options dict.
For aliases not known at compile time (e.g. let T = schema_of(Foo)
or dynamic construction), the runtime schema_of builtin passes
existing schema dicts through unchanged, so they keep working.
Generic inference via Schema<T>
Schema-driven builtins are typed with proper generics so user-defined wrappers pick up the same narrowing.
- llm_call<T>(prompt, system, options: {output_schema: Schema<T>, ...}) -> {data: T, text: string, ...}
- llm_completion<T> has the same signature.
- schema_parse<T>(value: unknown, schema: Schema<T>) -> Result<T, string>
- schema_check<T>(value: unknown, schema: Schema<T>) -> Result<T, string>
- schema_expect<T>(value: unknown, schema: Schema<T>) -> T
Schema<T> denotes a runtime schema value whose static shape is T.
In a parameter position, matching a Schema<T> against an argument
whose value resolves to a type alias (directly, via schema_of(T),
or via an inline JSON-Schema dict literal) binds the type parameter.
A user-defined wrapper such as
fn grade<T>(prompt: string, schema: Schema<T>) -> T {
let r = llm_call(prompt, nil,
{provider: "mock", output_schema: schema, output_validation: "error",
response_format: "json"})
return r.data
}
let out: GraderOut = grade("Grade this", schema_of(GraderOut))
println(out.verdict)
narrows out to GraderOut at the call site without any
schema_is / schema_expect guard, and without per-wrapper
typechecker support.
Schema<T> is a type-level construct. In value positions, the
runtime schema_of(T) builtin returns an idiomatic schema dict
whose static type is Schema<T>.
Function type annotations
Parameters and return types can be annotated:
fn add(a: int, b: int) -> int {
return a + b
}
Type checking behavior
- Annotations are optional (gradual typing). Untyped values are None and skip checks.
- int is assignable to float.
- Dict literals with string keys infer a structural shape type.
- Dict literals with computed keys infer as generic dict.
- Shape-to-shape: all required fields in the expected type must exist with compatible types.
- Shape-to-dict<K, V>: all field values must be compatible with V.
- Type errors are reported at compile time and halt execution.
Flow-sensitive type refinement
The type checker performs flow-sensitive type refinement (narrowing) on union types based on control flow conditions. Refinements are bidirectional — both the truthy and falsy paths of a condition are narrowed.
Nil checks
x != nil narrows to non-nil in the then-branch and to nil in the
else-branch. x == nil applies the inverse.
fn greet(name: string | nil) -> string {
if name != nil {
// name is `string` here
return "hello ${name}"
}
// name is `nil` here
return "hello stranger"
}
type_of() checks
type_of(x) == "typename" narrows to that type in the then-branch and
removes it from the union in the else-branch.
fn describe(x: string | int) {
if type_of(x) == "string" {
log(x) // x is `string`
} else {
log(x) // x is `int`
}
}
Truthiness
A bare identifier in condition position narrows by removing nil:
fn check(x: string | nil) {
if x {
log(x) // x is `string`
}
}
Logical operators
- a && b: combines both refinements on the truthy path.
- a || b: combines both refinements on the falsy path.
- !cond: inverts truthy and falsy refinements.
fn check(x: string | int | nil) {
if x != nil && type_of(x) == "string" {
log(x) // x is `string`
}
}
Guard statements
After a guard statement, the truthy refinements apply to the outer
scope (since the else-body must exit):
fn process(x: string | nil) {
guard x != nil else { return }
log(x) // x is `string` here
}
Early-exit narrowing
When one branch of an if/else definitely exits (via return,
throw, break, or continue), the opposite refinements apply after
the if:
fn process(x: string | nil) {
if x == nil { return }
log(x) // x is `string` — the nil path returned
}
While loops
The condition’s truthy refinements apply inside the loop body.
Ternary expressions
The condition’s refinements apply to the true and false branches respectively.
Match expressions
When matching a union-typed variable against literal patterns, the variable’s type is narrowed in each arm:
fn check(x: string | int) {
match x {
"hello" -> { log(x) } // x is `string`
42 -> { log(x) } // x is `int`
_ -> {}
}
}
Or-patterns (pat1 | pat2 -> body)
A match arm may list two or more alternative patterns separated by |;
the shared body runs when any alternative matches. Each alternative
contributes to exhaustiveness coverage independently, so an or-pattern
and a single-literal arm compose naturally:
fn verdict(v: "pass" | "fail" | "unclear") -> string {
return match v {
"pass" -> { "ok" }
"fail" | "unclear" -> { "not ok" }
}
}
Narrowing inside the or-arm refines the matched variable to the union of the alternatives’ single-literal narrowings. On a literal union this is a sub-union; on a tagged shape union it is a union of the matching shape variants:
type Msg =
{kind: "ping", ttl: int} |
{kind: "pong", latency_ms: int} |
{kind: "close", reason: string}
fn summarise(m: Msg) -> string {
return match m.kind {
"ping" | "pong" -> {
// m is narrowed to {kind:"ping",…} | {kind:"pong",…};
// the shared `kind` discriminant stays accessible.
"live:" + m.kind
}
"close" -> { "closed:" + m.reason }
}
}
Guards apply to the arm as a whole: 1 | 2 | 3 if n > 2 -> … runs the
body only when some alternative matched and the guard held. A guard
failure falls through to the next arm, exactly like a literal-pattern
arm.
Or-patterns are restricted to literal alternatives (string, int, float, bool, nil) in this release. Alternatives that introduce identifier bindings or destructuring patterns are a forward-compatible extension and currently rejected.
.has() on shapes
dict.has("key") narrows optional shape fields to required:
fn check(x: {name?: string, age: int}) {
if x.has("name") {
log(x) // x.name is now required (non-optional)
}
}
Exhaustiveness checking with unreachable()
The unreachable() builtin acts as a static exhaustiveness assertion.
When called with a variable argument, the type checker verifies that the
variable has been narrowed to never — meaning all possible types have
been handled. If not, a compile-time error reports the remaining types.
fn process(x: string | int | nil) -> string {
if type_of(x) == "string" { return "string: ${x}" }
if type_of(x) == "int" { return "int: ${x}" }
if x == nil { return "nil" }
unreachable(x) // compile-time verified: x is `never` here
}
At runtime, unreachable() throws "unreachable code was reached" as a
safety net. When called without arguments or with a non-variable argument,
no compile-time check is performed.
Exhaustive narrowing on unknown
The checker tracks the set of concrete type_of variants that have been
ruled out on the current flow path for every unknown-typed variable.
The falsy branch of type_of(v) == "T" still leaves v typed unknown
(subtracting one concrete type from an open top still leaves an open
top), but the coverage set for v gains "T".
When control flow reaches a never-returning site — unreachable(), a
throw statement, or a call to a user-defined function whose return
type is never — the checker verifies that the coverage set for every
still-unknown variable is either empty or complete. An incomplete
coverage set is treated as a failed exhaustiveness claim and triggers a
warning that names the uncovered concrete variants:
fn handle(v: unknown) -> string {
if type_of(v) == "string" { return "s:${v}" }
if type_of(v) == "int" { return "i:${v}" }
unreachable("unknown type_of variant")
// warning: `unreachable()` reached but `v: unknown` was not fully
// narrowed — uncovered concrete type(s): float, bool, nil, list,
// dict, closure
}
Covering all eight type_of variants (int, string, float, bool,
nil, list, dict, closure) silences the warning. Suppression via
an explicit fallthrough return is intentional: a plain return
doesn’t claim exhaustiveness, so partial narrowing followed by a normal
return stays silent. Reaching throw or unreachable() with no prior
type_of narrowing also stays silent — the coverage set must be
non-empty for the warning to fire, which avoids false positives on
unrelated error paths.
Reassigning the variable clears its coverage set, matching the way narrowing is already invalidated on reassignment.
Unreachable code warnings
The type checker warns about code after statements that definitely exit
(via return, throw, break, or continue), including composite
exits where both branches of an if/else exit:
fn foo(x: bool) {
if x { return 1 } else { throw "err" }
log("never reached") // warning: unreachable code
}
Reassignment invalidation
When a narrowed variable is reassigned, the narrowing is invalidated and the original declared type is restored.
Mutability
Variables declared with let are immutable. Assigning to a let
variable produces a compile-time warning (and a runtime error).
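For example:

```harn
let limit = 10
limit = 20   // compile-time warning, immutableAssignment error at runtime
```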
Runtime parameter type enforcement
In addition to compile-time checking, function parameters with type annotations
are enforced at runtime. When a function is called, the VM verifies that each
annotated parameter matches its declared type before executing the function body.
If the types do not match, a TypeError is thrown:
TypeError: parameter 'name' expected string, got int (42)
The following types are enforced at runtime: int, float, string, bool,
list, dict, set, nil, and closure. int and float are mutually
compatible (passing an int to a float parameter is allowed, and vice versa).
Union types, list<T>, dict<string, V>, and nested shapes are also checked at
runtime when the parameter annotation can be lowered into a runtime schema.
Runtime shape validation
Shape-annotated function parameters are validated at runtime. When a function
parameter has a structural type annotation (e.g., {name: string, age: int}),
the VM checks that the argument is a dict (or struct instance) with all
required fields and that each field has the expected type.
fn process(user: {name: string, age: int}) {
println("${user.name} is ${user.age}")
}
process({name: "Alice", age: 30}) // OK
process({name: "Alice"}) // Error: parameter 'user': missing field 'age' (int)
process({name: "Alice", age: "old"}) // Error: parameter 'user': field 'age' expected int, got string
Shape validation works with both plain dicts and struct instances. Extra
fields are allowed (width subtyping). Optional fields (declared with ?)
are not required to be present.
Built-in methods
String methods
| Method | Signature | Returns |
|---|---|---|
| count | .count (property) | int – character count |
| empty | .empty (property) | bool – true if empty |
| contains(sub) | string | bool |
| replace(old, new) | string, string | string |
| split(sep) | string | list of strings |
| trim() | (none) | string – whitespace stripped |
| starts_with(prefix) | string | bool |
| ends_with(suffix) | string | bool |
| lowercase() | (none) | string |
| uppercase() | (none) | string |
| substring(start, end?) | int, int? | string – character range |
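A few of these in action:

```harn
let s = "  Hello, Harn  "
let t = s.trim()                    // "Hello, Harn"
log(t.count)                        // 11
log(t.contains("Harn"))             // true
log(t.lowercase())                  // "hello, harn"
log(t.replace("Harn", "world"))     // "Hello, world"
log(t.split(", "))                  // ["Hello", "Harn"]
```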
List methods
| Method | Signature | Returns |
|---|---|---|
| count | (property) | int |
| empty | (property) | bool |
| first | (property) | value or nil |
| last | (property) | value or nil |
| map(closure) | closure(item) -> value | list |
| filter(closure) | closure(item) -> bool | list |
| reduce(init, closure) | value, closure(acc, item) -> value | value |
| find(closure) | closure(item) -> bool | value or nil |
| any(closure) | closure(item) -> bool | bool |
| all(closure) | closure(item) -> bool | bool |
| flat_map(closure) | closure(item) -> value/list | list (flattened) |
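A few of these in action:

```harn
let nums = [1, 2, 3, 4]
let doubled = nums.map(fn(n) { return n * 2 })             // [2, 4, 6, 8]
let big = nums.filter(fn(n) { return n > 2 })              // [3, 4]
let total = nums.reduce(0, fn(acc, n) { return acc + n })  // 10
log(nums.first)                          // 1
log(nums.any(fn(n) { return n > 3 }))    // true
```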
Dict methods
| Method | Signature | Returns |
|---|---|---|
| keys() | (none) | list of strings (sorted) |
| values() | (none) | list of values (sorted by key) |
| entries() | (none) | list of {key, value} dicts (sorted by key) |
| count | (property) | int |
| has(key) | string | bool |
| merge(other) | dict | dict (other wins on conflict) |
| map_values(closure) | closure(value) -> value | dict |
| filter(closure) | closure(value) -> bool | dict |
Dict property access
dict.name returns the value for key "name", or nil if absent.
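A short example combining dict methods and property access:

```harn
let cfg = {model: "gpt-4", retries: "2"}
log(cfg.model)            // "gpt-4"
log(cfg.missing)          // nil
log(cfg.keys())           // ["model", "retries"]
log(cfg.has("model"))     // true
let merged = cfg.merge({model: "gpt-5"})
log(merged.model)         // "gpt-5" (other wins on conflict)
```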
Set builtins
Sets are created with the set() builtin and are immutable – mutation
operations return a new set. Sets deduplicate values using structural
equality.
| Function | Signature | Returns |
|---|---|---|
| set(...) | values or a list | set – deduplicated |
| set_add(s, value) | set, value | set – with value added |
| set_remove(s, value) | set, value | set – with value removed |
| set_contains(s, value) | set, value | bool |
| set_union(a, b) | set, set | set – all items from both |
| set_intersect(a, b) | set, set | set – items in both |
| set_difference(a, b) | set, set | set – items in a but not b |
| to_list(s) | set | list – convert set to list |
Sets are iterable with for ... in and support len().
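For example:

```harn
let a = set(1, 2, 2, 3)             // deduplicated to three items
let b = set([3, 4])                 // a list works too
log(set_contains(a, 2))             // true
log(to_list(set_intersect(a, b)))   // [3]
let c = set_add(a, 4)               // new set; `a` is unchanged
log(len(a))                         // 3
```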
Encoding and hashing builtins
| Function | Description |
|---|---|
| base64_encode(str) | Returns the base64-encoded version of str |
| base64_decode(str) | Returns the decoded string from a base64-encoded str |
| sha256(str) | Returns the hex-encoded SHA-256 hash of str |
| md5(str) | Returns the hex-encoded MD5 hash of str |
let encoded = base64_encode("hello world") // "aGVsbG8gd29ybGQ="
let decoded = base64_decode(encoded) // "hello world"
let hash = sha256("hello") // hex string
let md5hash = md5("hello") // hex string
Regex builtins
| Function | Description |
|---|---|
| regex_match(pattern, str) | Returns match data if str matches pattern, or nil |
| regex_replace(pattern, str, replacement) | Replaces all matches of pattern in str |
| regex_captures(pattern, str) | Returns a list of capture group dicts for all matches |
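Two short sketches. The regex_replace behavior follows directly from the table; regex_match is used here only for its nil-on-no-match contract, without relying on the shape of its match data:

```harn
let masked = regex_replace("\\d+", "order 123 and 456", "#")
// masked == "order # and #"

if regex_match("^[a-z]+$", "hello") != nil {
    log("all lowercase")
}
```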
regex_captures
regex_captures(pattern, text) finds all matches of pattern in text
and returns a list of dicts, one per match. Each dict contains:
- match: the full match string
- groups: a list of positional capture group strings (from (...))
- Any named capture groups (from (?P<name>...)) as additional keys
let results = regex_captures("(\\w+)@(\\w+)", "alice@example bob@test")
// results == [
// {match: "alice@example", groups: ["alice", "example"]},
// {match: "bob@test", groups: ["bob", "test"]}
// ]
let named = regex_captures("(?P<user>\\w+):(?P<role>\\w+)", "alice:admin")
// named == [{match: "alice:admin", groups: ["alice", "admin"], user: "alice", role: "admin"}]
Returns an empty list if there are no matches.
Regex patterns are compiled and cached internally using a thread-local cache. Repeated calls with the same pattern string reuse the compiled regex, avoiding recompilation overhead. This is a performance optimization with no API-visible change.
Iterator protocol
Harn provides a lazy iterator protocol layered over the eager
collection methods. Eager methods (list.map, list.filter,
list.flat_map, dict.map_values, dict.filter, etc.) are
unchanged — they return eager collections. Lazy iteration is opt-in
via .iter() and the iter(x) builtin.
The Iter<T> type
Iter<T> is a runtime value representing a lazy, single-pass, fused
iterator over values of type T. It is produced by calling iter(x)
or x.iter() on an iterable source (list, dict, set, string,
generator, channel) or by chaining a combinator on an existing iter.
iter(x) / x.iter() on a value that is already an Iter<T> is a
no-op (returns the iter unchanged).
The Pair<K, V> type
Pair<K, V> is a two-element value used by the iterator protocol for
key/value and index/value yields.
- Construction: the pair(a, b) builtin. Combinators such as .zip and .enumerate, and dict iteration, produce pairs automatically.
- Access: .first and .second as properties.
- For-loop destructuring: for (k, v) in iter_expr { ... } binds the .first and .second of each Pair to k and v.
- Equality: structural (pair(1, 2) == pair(1, 2)).
- Printing: (a, b).
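A short example. The iteration-order comment assumes dict.iter() follows the same sorted-key order as the dict methods:

```harn
let p = pair("a", 1)
log(p.first)              // "a"
log(p.second)             // 1
log(p == pair("a", 1))    // true (structural equality)
log(p)                    // ("a", 1)

for (k, v) in {x: 1, y: 2}.iter() {
    log("${k}=${v}")      // "x=1" then "y=2"
}
```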
For-loop integration
for x in iter_expr pulls values one at a time from iter_expr until
the iter is exhausted.
for (a, b) in iter_expr destructures each yielded Pair into two
bindings. If a yielded value is not a Pair, a runtime error is
raised.
for entry in some_dict (no .iter()) continues to yield
{key, value} dicts in sorted-key order for back-compat. Only
some_dict.iter() yields Pair(key, value).
Semantics
- Lazy: combinators allocate a new Iter and perform no work; values are only produced when a sink (or for-loop) pulls them.
- Single-pass: once an item has been yielded, it cannot be re-read from the same iter.
- Fused: once exhausted, subsequent pulls continue to report exhaustion (never panic, never yield again). Re-call .iter() on the source collection to obtain a fresh iter.
- Snapshot: lifting a list/dict/set/string Rc-clones the backing storage into the iter, so mutating the source after .iter() does not affect iteration.
- String iteration: yields chars (Unicode scalar values), not graphemes.
- Printing: log(it) / to_string(it) renders <iter> or <iter (exhausted)> without draining the iter.
Combinators
Each combinator below is a method on Iter<T> and returns a new
Iter without consuming items eagerly.
| Method | Signature |
|---|---|
| .iter() | Iter<T> -> Iter<T> (no-op) |
| .map(f) | Iter<T>, (T) -> U -> Iter<U> |
| .filter(p) | Iter<T>, (T) -> bool -> Iter<T> |
| .flat_map(f) | Iter<T>, (T) -> Iter<U> \| list<U> -> Iter<U> |
| .take(n) | Iter<T>, int -> Iter<T> |
| .skip(n) | Iter<T>, int -> Iter<T> |
| .take_while(p) | Iter<T>, (T) -> bool -> Iter<T> |
| .skip_while(p) | Iter<T>, (T) -> bool -> Iter<T> |
| .zip(other) | Iter<T>, Iter<U> -> Iter<Pair<T, U>> |
| .enumerate() | Iter<T> -> Iter<Pair<int, T>> |
| .chain(other) | Iter<T>, Iter<T> -> Iter<T> |
| .chunks(n) | Iter<T>, int -> Iter<list<T>> |
| .windows(n) | Iter<T>, int -> Iter<list<T>> |
Sinks
Sinks drive the iter to completion (or until a short-circuit) and return an eager value.
| Method | Signature |
|---|---|
| .to_list() | Iter<T> -> list<T> |
| .to_set() | Iter<T> -> set<T> |
| .to_dict() | Iter<Pair<K, V>> -> dict<K, V> |
| .count() | Iter<T> -> int |
| .sum() | Iter<T> -> int \| float |
| .min() | Iter<T> -> T \| nil |
| .max() | Iter<T> -> T \| nil |
| .reduce(init, f) | Iter<T>, U, (U, T) -> U -> U |
| .first() | Iter<T> -> T \| nil |
| .last() | Iter<T> -> T \| nil |
| .any(p) | Iter<T>, (T) -> bool -> bool |
| .all(p) | Iter<T>, (T) -> bool -> bool |
| .find(p) | Iter<T>, (T) -> bool -> T \| nil |
| .for_each(f) | Iter<T>, (T) -> any -> nil |
Notes
- .to_dict() requires the iter to yield Pair values; a runtime error is raised otherwise.
- .min() / .max() return nil on an empty iter.
- .any / .all / .find short-circuit as soon as the result is determined.
- Numeric ranges (a to b, range(n)) participate in the lazy iter protocol directly; applying any combinator on a Range returns a lazy Iter without materializing the range.
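Putting combinators and sinks together, a lazy pipeline looks like the following sketch (the { x -> ... } lambda form is assumed from the handler examples elsewhere in this guide):

```harn
// Nothing runs until .to_list() pulls; take(3) stops the pull early,
// so only the first three matching values are ever computed
let squares = range(100)
    .filter({ x -> x % 2 == 0 })
    .map({ x -> x * x })
    .take(3)
    .to_list()
println(squares)
```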
Method-style builtins
If obj.method(args) is called and obj is an identifier, the interpreter first checks
for a registered builtin named "obj.method". If found, it is called with just args
(not obj). This enables namespaced builtins like experience_bank.save(...)
and negative_knowledge.record(...).
Runtime errors
| Error | Description |
|---|---|
| undefinedVariable(name) | Variable not found in any scope |
| undefinedBuiltin(name) | No registered builtin or user function with this name |
| immutableAssignment(name) | Attempted = on a let binding |
| typeMismatch(expected, got) | Type assertion failed |
| returnValue(value?) | Internal: used to implement return (not a user-facing error) |
| retryExhausted | All retry attempts failed |
| thrownError(value) | User-thrown error via throw |
Most undefinedBuiltin errors are now caught statically by the
cross-module typechecker (see Static cross-module
resolution) — harn check and
harn run refuse to start the VM when a file contains a call to a name
that is not a builtin, local declaration, struct constructor, callable
variable, or imported symbol. The runtime check remains as a backstop
for cases where imports could not be resolved at check time.
Stack traces
Runtime errors include a full call stack trace showing the chain of function calls that led to the error. The stack trace lists each frame with its function name, source file, line number, and column:
Error: division by zero
at divide (script.harn:3:5)
at compute (script.harn:8:18)
at default (script.harn:12:10)
Stack traces are captured at the point of the error before unwinding, so they accurately reflect the call chain at the time of failure.
Persistent store
Six builtins provide a persistent key-value store backed by the resolved Harn
state root (default .harn/store.json):
| Function | Description |
|---|---|
| store_get(key) | Retrieve value or nil |
| store_set(key, value) | Set key, auto-saves to disk |
| store_delete(key) | Remove key, auto-saves |
| store_list() | List all keys (sorted) |
| store_save() | Explicit flush to disk |
| store_clear() | Remove all keys, auto-saves |
The store file is created lazily on first mutation. In bridge mode, the
host can override these builtins via the bridge protocol. The state root can
be relocated with HARN_STATE_DIR.
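A minimal sketch of a cross-run counter built on these builtins:

```harn
// Survives process restarts: backed by .harn/store.json
var runs = store_get("run_count")
if runs == nil {
    runs = 0
}
store_set("run_count", runs + 1)   // auto-saves to disk
log("this is run #${runs + 1}")
```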
Checkpoint & resume
Checkpoints enable resilient, resumable pipelines. State is persisted to the
resolved Harn state root (default .harn/checkpoints/<pipeline>.json) and survives crashes, restarts, and
migration to another machine.
Core builtins
| Function | Description |
|---|---|
| checkpoint(key, value) | Save value at key; writes to disk immediately |
| checkpoint_get(key) | Retrieve saved value, or nil if absent |
| checkpoint_exists(key) | Return true if key is present (even if value is nil) |
| checkpoint_delete(key) | Remove a single key; no-op if absent |
| checkpoint_clear() | Remove all checkpoints for this pipeline |
| checkpoint_list() | Return sorted list of all checkpoint keys |
checkpoint_exists is preferable to checkpoint_get(key) == nil when nil
is a valid checkpoint value.
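For example, caching a lookup whose legitimate answer may be nil:

```harn
// nil is a valid cached result here, so presence must be tested separately
if checkpoint_exists("owner_lookup") {
    let owner = checkpoint_get("owner_lookup")   // may legitimately be nil
    log(owner)
} else {
    checkpoint("owner_lookup", nil)              // cache "looked up, found nothing"
}
```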
std/checkpoint module
import { checkpoint_stage, checkpoint_stage_retry } from "std/checkpoint"
checkpoint_stage(name, fn) -> value
Runs fn() and caches the result under name. On subsequent calls with the
same name, returns the cached result without running fn() again. This is the
primary primitive for building resumable pipelines.
import { checkpoint_stage } from "std/checkpoint"
fn fetch_dataset(url) { url }
fn clean(data) { data }
fn run_model(cleaned) { cleaned }
fn upload(result) { log(result) }
pipeline process(task) {
let url = "https://example.com/data.csv"
let data = checkpoint_stage("fetch", fn() { fetch_dataset(url) })
let cleaned = checkpoint_stage("clean", fn() { clean(data) })
let result = checkpoint_stage("process", fn() { run_model(cleaned) })
upload(result)
}
On first run all three stages execute. On a resumed run (pipeline restarted after a crash), completed stages are skipped automatically.
checkpoint_stage_retry(name, max_retries, fn) -> value
Like checkpoint_stage, but retries fn() up to max_retries times on
failure before propagating the error. Once successful, the result is cached so
retries are never needed on resume.
import { checkpoint_stage_retry } from "std/checkpoint"
fn fetch_with_timeout(url) { url }
let url = "https://example.com/data.csv"
let data = checkpoint_stage_retry("fetch", 3, fn() { fetch_with_timeout(url) })
log(data)
File location
Checkpoint files are stored at .harn/checkpoints/<pipeline>.json relative to
the project root (where harn.toml lives), or relative to the source file
directory if no project root is found. Files are plain JSON and can be copied
between machines to migrate pipeline state.
std/agent_state module
import "std/agent_state"
Provides a durable, session-scoped text/blob store rooted at a caller-supplied directory.
| Function | Notes |
|---|---|
| agent_state_init(root, options?) | Create or reopen a session root under root/<session_id>/ |
| agent_state_resume(root, session_id, options?) | Reopen an existing session; errors when absent |
| agent_state_write(handle, key, content) | Atomic temp-write plus rename |
| agent_state_read(handle, key) | Returns string or nil |
| agent_state_list(handle) | Deterministic recursive key listing |
| agent_state_delete(handle, key) | Deletes a key |
| agent_state_handoff(handle, summary) | Writes a JSON handoff envelope to __handoff.json |
Keys must be relative paths inside the session root. Absolute paths and parent-directory escapes are rejected.
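A sketch of a session round-trip (the root directory and key names are illustrative):

```harn
import "std/agent_state"

let state = agent_state_init(".harn/agent-state")
agent_state_write(state, "notes/plan.md", "1. fetch\n2. clean\n3. train")
log(agent_state_read(state, "notes/plan.md"))    // the text written above
log(agent_state_list(state))                     // deterministic key listing
agent_state_handoff(state, "plan drafted")       // writes __handoff.json
```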
Workspace manifest (harn.toml)
Harn projects declare a workspace manifest at the project root named
harn.toml. Tooling walks upward from a target .harn file looking
for the nearest ancestor manifest and stops at a .git boundary so a
stray manifest in a parent project or $HOME is never silently picked
up.
[check] — type-checker and preflight
[check]
host_capabilities_path = "./schemas/host-capabilities.json"
preflight_severity = "warning" # "error" (default), "warning", "off"
preflight_allow = ["mystery.*", "runtime.task"]
[check.host_capabilities]
project = ["ensure_enriched", "enrich"]
workspace = ["read_text", "write_text"]
- host_capabilities_path and [check.host_capabilities] declare the host-call surface that the preflight pass is allowed to assume exists at runtime. The CLI flag --host-capabilities <file> takes precedence for a single invocation. The external file is JSON or TOML with the namespaced shape { capability: [op, ...], ... }; nested { capabilities: { ... } } wrappers and per-op metadata dictionaries are accepted.
- preflight_severity downgrades preflight diagnostics to warnings or suppresses them entirely. Type-checker and lint diagnostics are unaffected; preflight failures are reported under the preflight category so IDEs and CI filters can route them separately.
- preflight_allow suppresses preflight diagnostics tagged with a specific host capability. Entries match an exact capability.operation pair, a capability.* wildcard, a bare capability name, or a blanket *.
Preflight capabilities in this section are a static check surface
for the Harn type-checker only. They are not the same thing as ACP’s
agent/client capability handshake (agentCapabilities /
clientCapabilities), which is runtime protocol-level negotiation and
lives outside harn.toml.
[workspace] — multi-file targets
[workspace]
pipelines = ["Sources/BurinCore/Resources/pipelines", "scripts"]
harn check --workspace resolves each path in pipelines relative to
the manifest directory and recursively checks every .harn file under
each. Positional targets remain additive. The manifest is discovered by
walking upward from the first positional target (or the current working
directory when none is supplied).
[exports] — stable package module entry points
[exports]
capabilities = "runtime/capabilities.harn"
providers = "runtime/providers.harn"
[exports] maps logical import suffixes to package-root-relative module
paths. After harn install, consumers import them as
"<package>/<export>" instead of coupling to the package’s internal
directory layout.
Exports are resolved after the direct .harn/packages/<path> lookup, so
packages can still expose raw file trees when they want that behavior.
[llm] — packaged provider extensions
[llm.providers.my_proxy]
base_url = "https://llm.example.com/v1"
chat_endpoint = "/chat/completions"
completion_endpoint = "/completions"
auth_style = "bearer"
auth_env = "MY_PROXY_API_KEY"
[llm.aliases]
my-fast = { id = "vendor/model-fast", provider = "my_proxy" }
The [llm] table accepts the same schema as providers.toml
(providers, aliases, inference_rules, tier_rules,
tier_defaults, model_defaults) but scopes it to the current run.
When Harn starts from a file inside a workspace, it merges:
- built-in defaults,
- the global provider file (HARN_PROVIDERS_CONFIG or ~/.config/harn/providers.toml),
- installed package [llm] tables from .harn/packages/*/harn.toml,
- the root project's [llm] table.
Later layers win on key collisions; rule lists are prepended so package and project inference/tier overrides run before the built-in defaults.
[lint] — lint configuration
[lint]
disabled = ["unused-import"]
require_file_header = false
complexity_threshold = 25
- disabled silences the listed rules for the whole project.
- require_file_header opts into the require-file-header rule, which checks that each source file begins with a /** */ HarnDoc block whose title matches the filename.
- complexity_threshold overrides the default cyclomatic-complexity warning threshold (default 25, chosen to match Clippy's cognitive_complexity default). Set lower to tighten, higher to loosen. Per-function escapes still go through @complexity(allow).
Sandbox mode
The harn run command supports sandbox flags that restrict which builtins
a program may call.
--deny
harn run --deny read_file,write_file,exec script.harn
Denies the listed builtins. Any call to a denied builtin produces a runtime error:
Permission denied: builtin 'read_file' is not allowed in sandbox mode
(use --allow read_file to permit)
--allow
harn run --allow llm,llm_stream script.harn
Allows only the listed builtins plus the core builtins (see below). All other builtins are denied.
--deny and --allow cannot be used together; specifying both is an error.
Core builtins
The following builtins are always allowed, even when using --allow:
println, print, log, type_of, to_string, to_int, to_float,
len, assert, assert_eq, assert_ne, json_parse, json_stringify
Propagation
Sandbox restrictions propagate to child VMs created by spawn,
parallel, and parallel each. A child VM inherits the same set of
denied builtins as its parent.
Test framework
Harn includes a built-in test runner invoked via harn test.
Running tests
harn test path/to/tests/ # run all test files in a directory
harn test path/to/test_file.harn # run tests in a single file
Test discovery
The test runner scans .harn files for pipelines whose names start with
test_. Each such pipeline is executed independently. A test passes if
it completes without error; it fails if it throws or an assertion fails.
pipeline test_addition() {
assert_eq(1 + 1, 2)
}
pipeline test_string_concat() {
let result = "hello" + " " + "world"
assert_eq(result, "hello world")
}
Assertions
Three assertion builtins are available. They can be called anywhere, but they are intended for test pipelines and the linter warns on non-test use:
| Function | Description |
|---|---|
| assert(condition) | Throws if condition is falsy |
| assert_eq(a, b) | Throws if a != b, showing both values |
| assert_ne(a, b) | Throws if a == b, showing both values |
Mock LLM provider
During harn test, the HARN_LLM_PROVIDER environment variable is
automatically set to "mock" unless explicitly overridden. The mock
provider returns deterministic placeholder responses, allowing tests
that call llm or llm_stream to run without API keys.
CLI options
| Flag | Description |
|---|---|
| --filter <pattern> | Only run tests whose names contain <pattern> |
| --verbose / -v | Show per-test timing and detailed failures |
| --timing | Show per-test timing and summary statistics |
| --timeout <ms> | Per-test timeout in milliseconds (default 30000) |
| --parallel | Run test files concurrently |
| --junit <path> | Write JUnit XML report to <path> |
| --record | Record LLM responses to .harn-fixtures/ |
| --replay | Replay LLM responses from .harn-fixtures/ |
Environment variables
The following environment variables configure runtime behavior:
| Variable | Description |
|---|---|
| HARN_LLM_PROVIDER | Override the default LLM provider. Any configured provider is accepted. Built-in names include anthropic (default), openai, openrouter, huggingface, ollama, local, and mock. |
| HARN_LLM_TIMEOUT | LLM request timeout in seconds. Default 120. |
| HARN_STATE_DIR | Override the runtime state root used for store, checkpoint, metadata, and default worktree state. Relative values resolve from the active project/runtime root. |
| HARN_RUN_DIR | Override the default persisted run directory. Relative values resolve from the active project/runtime root. |
| HARN_WORKTREE_DIR | Override the default worker worktree root. Relative values resolve from the active project/runtime root. |
| ANTHROPIC_API_KEY | API key for the Anthropic provider. |
| OPENAI_API_KEY | API key for the OpenAI provider. |
| OPENROUTER_API_KEY | API key for the OpenRouter provider. |
| HF_TOKEN | API key for the HuggingFace provider. |
| HUGGINGFACE_API_KEY | Alternate API key name for the HuggingFace provider. |
| OLLAMA_HOST | Override the Ollama host. Default http://localhost:11434. |
| LOCAL_LLM_BASE_URL | Base URL for a local OpenAI-compatible server. Default http://localhost:8000. |
| LOCAL_LLM_MODEL | Default model ID for the local OpenAI-compatible provider. |
Known limitations and future work
The following are known limitations in the current implementation that may be addressed in future versions.
Type system
- Definition-site generic checking: Inside a generic function body, type parameters are treated as compatible with any type. The checker does not yet restrict method calls on T to only those declared in the where clause interface.
- No runtime interface enforcement: Interface satisfaction is checked at compile time only. Passing an untyped value to an interface-typed parameter is not caught at runtime.
Runtime
Syntax limitations
- No impl Interface for Type syntax: Interface satisfaction is always implicit. There is no way to explicitly declare that a type implements an interface.
LLM calls and agent loops
Harn has built-in support for calling language models and running persistent agent loops. No libraries or SDKs needed.
Providers
Harn ships with built-in configs for Anthropic, OpenAI, OpenRouter, Ollama, HuggingFace, and a local OpenAI-compatible server. Set the appropriate environment variable to authenticate or point Harn at a local endpoint:
| Provider | Environment variable | Default model |
|---|---|---|
| Anthropic (default) | ANTHROPIC_API_KEY | claude-sonnet-4-20250514 |
| OpenAI | OPENAI_API_KEY | gpt-4o |
| OpenRouter | OPENROUTER_API_KEY | anthropic/claude-sonnet-4-20250514 |
| HuggingFace | HF_TOKEN or HUGGINGFACE_API_KEY | explicit model |
| Ollama | OLLAMA_HOST (optional) | llama3.2 |
| Local server | LOCAL_LLM_BASE_URL | LOCAL_LLM_MODEL or explicit model |
Ollama runs locally and doesn’t require an API key. The default host is
http://localhost:11434.
For a generic OpenAI-compatible local server, set LOCAL_LLM_BASE_URL to
something like http://192.168.86.250:8000 and either pass
{provider: "local", model: "qwen2.5-coder-32b"} or set
LOCAL_LLM_MODEL=qwen2.5-coder-32b.
llm_call
Make a single LLM request. Harn normalizes provider responses into a canonical dict so product code does not need to parse provider-native message shapes.
let result = llm_call("What is 2 + 2?")
println(result.text)
With a system message:
let result = llm_call(
"Explain quicksort",
"You are a computer science teacher. Be concise."
)
println(result.text)
With options:
let result = llm_call(
"Translate to French: Hello, world",
"You are a translator.",
{
provider: "openai",
model: "gpt-4o",
max_tokens: 1024
}
)
println(result.text)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | yes | The user message |
| system | string | no | System message for the model |
| options | dict | no | Provider, model, and generation settings |
Return value
llm_call always returns a dict:
| Field | Type | Description |
|---|---|---|
| text | string | The text content of the response |
| visible_text | string | Human-visible assistant output |
| model | string | The model used |
| provider | string | Canonical provider identifier |
| input_tokens | int | Input/prompt token count |
| output_tokens | int | Output/completion token count |
| cache_read_tokens | int | Prompt tokens served from provider-side cache when supported |
| cache_write_tokens | int | Prompt tokens written into provider-side cache when supported |
| data | any | Parsed JSON (when response_format: "json") |
| tool_calls | list | Tool calls (when model uses tools) |
| thinking | string | Reasoning trace (when thinking is enabled) |
| private_reasoning | string | Provider reasoning metadata kept separate from visible text |
| blocks | list | Canonical structured content blocks across providers |
| stop_reason | string | "end_turn", "max_tokens", "tool_use", "stop_sequence" |
| transcript | dict | Transcript carrying message history, events, summary, metadata, and id |
Options dict
| Key | Type | Default | Description |
|---|---|---|---|
| provider | string | "anthropic" | Any configured provider. Built-in names include "anthropic", "openai", "openrouter", "huggingface", "ollama", and "local" |
| model | string | varies by provider | Model identifier |
| max_tokens | int | 16384 | Maximum tokens in the response |
| temperature | float | provider default | Sampling temperature (0.0-2.0) |
| top_p | float | nil | Nucleus sampling |
| top_k | int | nil | Top-K sampling (Anthropic/Ollama only) |
| stop | list | nil | Stop sequences |
| seed | int | nil | Reproducibility seed (OpenAI/Ollama) |
| frequency_penalty | float | nil | Frequency penalty (OpenAI only) |
| presence_penalty | float | nil | Presence penalty (OpenAI only) |
| response_format | string | "text" | "text" or "json" |
| schema | dict | nil | JSON Schema, OpenAPI Schema Object, or canonical Harn schema dict for structured output |
| thinking | bool/dict | nil | Enable provider reasoning. true or {budget_tokens: N}. Anthropic maps this to thinking/adaptive thinking, OpenRouter maps it to reasoning, and Ollama maps it to think. |
| tools | list | nil | Tool definitions |
| tool_choice | string/dict | "auto" | "auto", "none", "required", or {name: "tool"} |
| tool_search | bool/string/dict | nil | Progressive tool disclosure. See Tool Vault |
| cache | bool | false | Enable prompt caching (Anthropic) |
| stream | bool | true | Use streaming SSE transport. Set false for synchronous request/response. Env: HARN_LLM_STREAM |
| timeout | int | 120 | Request timeout in seconds |
| messages | list | nil | Full message list (overrides prompt) |
| transcript | dict | nil | Continue from a previous transcript; prompt is appended as the next user turn |
| model_tier | string | nil | Resolve a configured tier alias such as "small", "mid", or "frontier" |
Provider-specific overrides can be passed as sub-dicts:
let result = llm_call("hello", nil, {
provider: "ollama",
ollama: {num_ctx: 32768}
})
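Structured output follows the same pattern: with response_format: "json" (optionally constrained by schema), the parsed value lands in result.data. A sketch:

```harn
let result = llm_call(
    "Extract the city and country from: 'I flew to Lyon, France.'",
    "Reply with JSON only.",
    {
        response_format: "json",
        schema: {
            type: "object",
            properties: {
                city: {type: "string"},
                country: {type: "string"}
            }
        }
    }
)
println(result.data.city)   // parsed JSON, no manual json_parse needed
```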
Tool Vault
Harn’s Tool Vault is the progressive-tool-disclosure primitive: tool definitions that stay out of the model’s context until they’re surfaced by a search call. This keeps context cheap for agents with hundreds of tools (coding agents, MCP-heavy setups) without requiring the integrator to hand-filter tools per turn.
Per-tool flag: defer_loading
Any tool registered via tool_define (or the tool { … } language
form) can opt out of eager loading:
var registry = tool_registry()
registry = tool_define(registry, "deploy", "Deploy to production", {
parameters: {env: {type: "string"}},
defer_loading: true,
handler: { args -> shell("deploy " + args.env) },
})
Deferred tools never appear in the model’s context unless a tool-search call surfaces them. They are sent to the provider (so prompt caching stays warm on Anthropic — the schemas live in the API prefix but not the model’s context).
Call-level option: tool_search
Turning progressive disclosure on is one option away:
let r = llm_call(prompt, sys, {
provider: "anthropic",
model: "claude-opus-4-7",
tools: registry,
tool_search: "bm25",
})
Accepted shapes:
| Shape | Meaning |
|---|---|
| tool_search: true | Default: bm25 variant, mode auto. |
| tool_search: "bm25" | Natural-language queries. |
| tool_search: "regex" | Python-regex queries. |
| tool_search: false | Explicit off (same as omitting). |
| tool_search: {variant, mode, strategy, always_loaded, budget_tokens, name, include_stub_listing} | Explicit dict form. |
mode options:
"auto"(default) — use native if the provider supports it, otherwise fall back to the client-executed path (no error)."native"— force the provider’s native mechanism. Errors if unsupported."client"— force the client-executed path even on providers with native support. Useful for A/B-ing strategies or pinning behavior across heterogeneous provider fleets.
Provider support
| Provider | Native tool_search | Variants / modes |
|---|---|---|
| Anthropic Claude Opus/Sonnet 4.0+, Haiku 4.5+ | ✓ | bm25, regex |
| Anthropic 3.x or earlier 4.x Haiku | ✗ (uses client fallback) | — |
| OpenAI Responses API — GPT 5.4+ | ✓ | hosted (default), client |
| OpenAI pre-5.4 (gpt-4o, gpt-4.1, …) | ✗ | client fallback works today |
| OpenRouter / Together / Groq / DeepSeek / Fireworks / HuggingFace / local | ✓ when routed model matches gpt-5.4+ upstream | hosted forwarded; escape hatch below for proxies |
| Gemini, Ollama, mock (default model) | ✗ | client fallback works today |
The OpenAI native path (harn#71) emits a flat {"type": "tool_search", "mode": "hosted"} meta-tool at the front of the tools array, alongside
defer_loading: true on the wrapper of each user tool. The server runs
the search and replies with tool_search_call / tool_search_output
entries that Harn parses into the same transcript event shape as the
Anthropic path (replays are indistinguishable across providers).
Namespace grouping
OpenAI’s tool_search can group deferred tools into namespaces; pass
namespace: "<label>" on tool_define(...) to tag a tool. Harn collects
the distinct set into the meta-tool’s namespaces field. Anthropic
ignores the label — harmless passthrough for replay fidelity.
tool_define(registry, "deploy_api", "Deploy the API", {
parameters: {env: {type: "string"}},
defer_loading: true,
namespace: "ops",
handler: { args -> shell("deploy api " + args.env) },
})
Escape hatch for proxied OpenAI-compat endpoints
Self-hosted routers and enterprise gateways sometimes advertise a model
ID Harn cannot parse (my-internal-gpt-clone-v2) yet forward the OpenAI
Responses payload unchanged. Opt into the hosted path with:
llm_call(prompt, sys, {
provider: "openrouter",
model: "my-custom/gpt-forward",
tools: registry,
tool_search: {mode: "native"},
openrouter: {force_native_tool_search: true},
})
The override is keyed by the provider name (the same dict you’d use for any provider-specific knob).
Capability matrix + harn.toml overrides
The provider support table above is not hard-coded: it’s the output
of a shipped data file (crates/harn-vm/src/llm/capabilities.toml)
matched against the (provider, model) pair at call time. Scripts
can query the effective capability surface without carrying
vendor-specific knowledge:
let caps = provider_capabilities("anthropic", "claude-opus-4-7")
// {
// native_tools: true, defer_loading: true,
// tool_search: ["bm25", "regex"], max_tools: 10000,
// prompt_caching: true, thinking: true,
// }
if "bm25" in caps.tool_search {
llm_call(prompt, sys, {
tools: registry,
tool_search: "bm25",
})
}
Projects override or extend the shipped table in harn.toml — useful
for flagging a proxied OpenAI-compat endpoint as supporting
tool_search ahead of a Harn release that knows about it natively:
# harn.toml
[[capabilities.provider.my-proxy]]
model_match = "*"
native_tools = true
defer_loading = true
tool_search = ["hosted"]
prompt_caching = true
# Shadow the built-in Anthropic rule to force client-executed
# fallback on every Opus call (e.g. while a regional outage is
# active):
[[capabilities.provider.anthropic]]
model_match = "claude-opus-*"
native_tools = true
defer_loading = false
tool_search = []
prompt_caching = true
thinking = true
Each [[capabilities.provider.<name>]] entry accepts these fields:
| Field | Type | Purpose |
|---|---|---|
| model_match | glob string | Required. Matched against the lowercased model ID. Leading/trailing * or a single middle * supported. |
| version_min | [major, minor] | Narrows the match to a parseable version (Anthropic / OpenAI extractors). Rules where version_min is set but the model ID won't parse are skipped. |
| native_tools | bool | Whether the provider accepts a native tool-call wire shape. |
| defer_loading | bool | Whether defer_loading: true on tool definitions is honored server-side. |
| tool_search | list of strings | Native tool_search variants, preferred first. Anthropic: ["bm25", "regex"]. OpenAI: ["hosted", "client"]. Empty = no native support (client fallback only). |
| max_tools | int | Cap on tool count. harn lint will warn if a registry exceeds the smallest cap any active provider advertises. |
| prompt_caching | bool | cache_control blocks honored. |
| thinking | bool | Extended or adaptive thinking available. |
First match wins. User rules for a given provider are consulted before the shipped rules — so the order inside the TOML file matters (place more specific patterns above wildcards).
[provider_family] declares sibling providers that inherit rules
from a canonical family. The shipped table routes OpenRouter,
Together, Groq, DeepSeek, Fireworks, HuggingFace, and local vLLM to
[[provider.openai]] by default.
Two programmatic helpers mirror the harn.toml path for cases where
editing the manifest is awkward:
- provider_capabilities_install(toml_src) — install overrides from a TOML string (same layout as capabilities.toml, without the capabilities. prefix: just [[provider.<name>]]). Useful when a script detects a proxied endpoint at runtime.
- provider_capabilities_clear() — revert to shipped defaults.
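For example, installing a runtime override for a proxy discovered mid-script (the provider name, model ID, and escaped-string TOML literal are illustrative):

```harn
// Same layout as capabilities.toml, minus the capabilities. prefix
provider_capabilities_install("[[provider.my-proxy]]\nmodel_match = \"*\"\nnative_tools = true\ntool_search = [\"hosted\"]")
log(provider_capabilities("my-proxy", "any-model").tool_search)
// ...and back to the shipped table when done
provider_capabilities_clear()
```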
Packaged provider adapters via [llm]
Projects and installed packages can also contribute provider definitions,
aliases, inference rules, and model defaults directly from harn.toml
under [llm]. The schema matches providers.toml, but the merge is
scoped to the current run:
[llm.providers.my_proxy]
base_url = "https://llm.example.com/v1"
chat_endpoint = "/chat/completions"
completion_endpoint = "/completions"
auth_style = "bearer"
auth_env = "MY_PROXY_API_KEY"
[llm.aliases]
my-fast = { id = "vendor/model-fast", provider = "my_proxy" }
Load order is:
- built-in defaults
HARN_PROVIDERS_CONFIGwhen set, otherwise~/.config/harn/providers.toml- installed package
[llm]tables from.harn/packages/*/harn.toml - the root project’s
[llm]table
That gives packages a stable, declarative way to ship provider adapters and model aliases without editing Rust-side registration code.
Client-executed fallback
On providers without native defer_loading, Harn falls back to an
in-VM execution path (landed in harn#70).
The fallback is identical to the native path from a script’s point of
view: same option surface, same transcript events, same promotion
behavior across turns. Internally, Harn injects a synthetic tool
called __harn_tool_search — when the model calls it, the loop runs
the configured strategy against the deferred-tool index, promotes the
matching tools into the next turn’s schema list, and emits the
same tool_search_query / tool_search_result transcript events as
native mode (tagged mode: "client" in metadata so replays can
distinguish paths).
Strategies (client mode only):
| strategy | Runs in | Notes |
|---|---|---|
| "bm25" (default) | VM | Tokenized BM25 over name + description + param text. Matches open_file from query open file. |
| "regex" | VM | Case-insensitive Rust-regex over the same corpus. No backreferences, no lookaround. |
| "semantic" | Host (bridge) | Delegated to the host via tool_search/query so integrators can wire embeddings without Harn pulling in ML crates. |
| "host" | Host (bridge) | Pure host-side; the VM round-trips the query and promotes whatever the host returns. |
Extra client-mode knobs:
- budget_tokens: N — soft cap on the total token footprint of promoted tool schemas. Oldest-first eviction when exceeded. Omit to keep every promoted schema for the life of the call.
- name: "find_tool" — override the synthetic tool's name. Handy when a skill's vocabulary suggests a more natural verb (discover, lookup, …).
- always_loaded: ["read_file", "run"] — pin tool names to the eager set even if defer_loading: true is set on their registry entries.
- include_stub_listing: true — append a short list of deferred tool names + one-line descriptions to the tool-contract prompt so the model can eyeball what's available without a search call. Off by default to match Anthropic's native ergonomics.
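All four knobs combined in one call (tool names and values are illustrative):

```harn
let r = llm_call(prompt, sys, {
    tools: registry,
    tool_search: {
        mode: "client",
        strategy: "bm25",
        name: "find_tool",                    // rename the synthetic search tool
        always_loaded: ["read_file", "run"],  // pinned to the eager set
        budget_tokens: 2000,                  // evict oldest promoted schemas past this
        include_stub_listing: true,           // list deferred names in the contract prompt
    },
})
```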
Pre-flight validation
- At least one user tool must be non-deferred. Harn errors before the API call is made, matching Anthropic’s documented 400.
- defer_loading must be a bool — typos like defer_loading: "yes" error at tool_define time rather than silently falling back to the "no defer" default.
Transcript events
Every native tool-search round-trip emits structured events in the run record:
- tool_search_query — the search tool's invocation (input query, search-tool id).
- tool_search_result — the references returned by the server (which deferred tools got promoted on this turn).
These are stable shapes; replay / eval can reconstruct which tools were available when without re-running the call.
llm_completion
Use llm_completion for text continuation and fill-in-the-middle generation.
It lives at the same abstraction level as llm_call.
let result = llm_completion("let total = ", ";", nil, {
provider: "ollama",
model_tier: "small"
})
println(result.text)
agent_loop
Run an agent that keeps working until it’s done. The agent maintains
conversation history across turns and loops until it outputs the
##DONE## sentinel. Returns a dict with canonical visible text,
tool usage, transcript state, and any deferred queued human messages.
let result = agent_loop(
"Write a function that sorts a list, then write tests for it.",
"You are a senior engineer.",
{persistent: true}
)
println(result.text) // the accumulated output
println(result.status) // "done", "stuck", "budget_exhausted", "idle", "watchdog", or "failed"
println(result.iterations) // number of LLM round-trips
How it works
- Sends the prompt to the model
- Reads the response
- If persistent: true:
  - Checks if the response contains ##DONE##
  - If yes, stops and returns the accumulated output
  - If no, sends a nudge message asking the agent to continue
  - Repeats until done or limits are hit
- If persistent: false (default): returns after the first response
agent_loop return value
agent_loop returns a dict with the following fields:
| Field | Type | Description |
|---|---|---|
status | string | Terminal state: "done" (natural completion), "stuck" (exceeded max_nudges consecutive text-only turns), "budget_exhausted" (hit max_iterations without any explicit break), "idle" (daemon yielded with no remaining wake source), "watchdog" (daemon idle-wait tripped the idle_watchdog_attempts limit), or "failed" (require_successful_tools not satisfied). |
text | string | Accumulated text output from all iterations |
visible_text | string | Human-visible accumulated output |
iterations | int | Number of LLM round-trips |
duration_ms | int | Total wall-clock time in milliseconds |
tools_used | list | Names of tools that were called |
rejected_tools | list | Tools rejected by policy/host ceiling |
deferred_user_messages | list | Queued human messages deferred until agent yield/completion |
daemon_state | string | Final daemon lifecycle state; mirrors status for daemon loops. |
daemon_snapshot_path | string or nil | Persisted snapshot path when daemon persistence is enabled |
transcript | dict | Transcript of the full conversation state |
agent_loop options
Same as llm_call, plus additional options:
| Key | Type | Default | Description |
|---|---|---|---|
persistent | bool | false | Keep looping until ##DONE## |
max_iterations | int | 50 | Maximum number of LLM round-trips |
max_nudges | int | 3 | Max consecutive text-only responses before stopping |
nudge | string | see below | Custom message to send when nudging the agent |
tool_retries | int | 0 | Number of retry attempts for failed tool calls |
tool_backoff_ms | int | 1000 | Base backoff delay in ms for tool retries (doubles each attempt) |
policy | dict | nil | Capability ceiling applied to this agent loop |
daemon | bool | false | Idle instead of terminating after text-only turns |
persist_path | string | nil | Persist daemon snapshots to this path on idle/finalize |
resume_path | string | nil | Restore daemon state from a previously persisted snapshot |
wake_interval_ms | int | nil | Fixed timer wake interval for daemon loops |
watch_paths | list/string | nil | Files to poll for mtime changes while idle |
consolidate_on_idle | bool | false | Run transcript auto-compaction before persisting an idle daemon snapshot |
idle_watchdog_attempts | int | nil (disabled) | Max consecutive idle-wait ticks that may return no wake reason before the daemon terminates with status = "watchdog". Guards against a misconfigured daemon (e.g. bridge never signals, no timer, no watch paths) hanging the session silently |
context_callback | closure | nil | Per-turn hook that can rewrite prompt-visible messages and/or the effective system prompt before the next LLM call |
context_filter | closure | nil | Alias for context_callback |
post_turn_callback | closure | nil | Hook called after each tool turn. Receives turn metadata and may inject a message, request an immediate stage stop, or both |
turn_policy | dict | nil | Turn-shape policy for action stages. Supports require_action_or_yield: bool, allow_done_sentinel: bool (default true; set to false in workflow-owned action stages so nudges stop advertising the done sentinel), and max_prose_chars: int |
stop_after_successful_tools | list<string> | nil | Stop after a tool-calling turn whose successful results include one of these tool names. Useful for workflow-owned verify loops such as ["edit", "scaffold"] |
require_successful_tools | list<string> | nil | Mark the loop status = "failed" unless at least one of these tool names succeeds at some point during the interaction. Keeps action stages honest when every attempted effect was rejected or errored |
loop_detect_warn | int | 2 | Consecutive identical tool calls before appending a redirection hint |
loop_detect_block | int | 3 | Consecutive identical tool calls before replacing the result with a hard redirect |
loop_detect_skip | int | 4 | Consecutive identical tool calls before skipping execution entirely |
skills | skill_registry or list | nil | Skill registry exposed to the match-and-activate lifecycle phase. See Skills lifecycle |
skill_match | dict | {strategy: "metadata", top_n: 1, sticky: true} | Match configuration — strategy ("metadata" | "host" | "embedding"), top_n, sticky |
working_files | list|string | [] | Paths that feed paths: glob auto-trigger in the metadata matcher and ride along as a hint to host-delegated matchers |
When daemon: true, the loop transitions active -> idle -> active instead of
terminating on a text-only turn. Idle daemons can be woken by queued human
messages, agent/resume bridge notifications, wake_interval_ms, or watched
file changes from watch_paths.
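A minimal daemon-mode sketch; the paths and intervals are illustrative:

```harn
let result = agent_loop(
    "Watch the repo and summarize changes.",
    "You are a repo watcher.",
    {
        daemon: true,
        wake_interval_ms: 60000,          // timer wake every minute
        watch_paths: ["src/"],            // also wake on mtime changes
        idle_watchdog_attempts: 10,       // terminate if idle waits keep returning no wake reason
        persist_path: ".harn/daemons/watcher"
    }
)
println(result.status)   // e.g. "idle" or "watchdog", depending on how the loop ended
```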
Default nudge message:
You have not output ##DONE## yet — the task is not complete. Use your tools to continue working. Only output ##DONE## when the task is fully complete and verified.
When persistent: true, the system prompt is automatically extended with:
IMPORTANT: You MUST keep working until the task is complete. Do NOT stop to explain or summarize — take action. Output ##DONE## only when the task is fully complete and verified.
Daemon stdlib wrappers
When you want a first-class daemon handle instead of wiring agent_loop
options manually, use the daemon builtins:
- `daemon_spawn(config)`
- `daemon_trigger(handle, event)`
- `daemon_snapshot(handle)`
- `daemon_stop(handle)`
- `daemon_resume(path)`
daemon_spawn accepts the same daemon-related options that agent_loop
understands (wake_interval_ms, watch_paths, idle_watchdog_attempts,
etc.) plus event_queue_capacity, which bounds the durable FIFO trigger queue
used by daemon_trigger.
let daemon = daemon_spawn({
name: "reviewer",
task: "Watch for trigger events and summarize the latest change.",
system: "You are a careful reviewer.",
provider: "mock",
persist_path: ".harn/daemons/reviewer",
event_queue_capacity: 256,
})
daemon_trigger(daemon, {kind: "file_changed", path: "src/lib.rs"})
let snap = daemon_snapshot(daemon)
println(snap.pending_event_count)
daemon_stop(daemon)
let resumed = daemon_resume(".harn/daemons/reviewer")
These wrappers preserve queued trigger events across stop/resume. If a daemon is stopped while a trigger is mid-flight, that trigger is re-queued and replayed on resume instead of being lost.
Context callback
context_callback lets you keep the full recorded transcript for replay and
debugging while showing the model a smaller or rewritten prompt-visible
history on each turn.
The callback receives one argument:
{
iteration: int,
system: string?,
messages: list,
visible_messages: list,
recorded_messages: list,
recent_visible_messages: list,
recent_recorded_messages: list,
latest_visible_user_message: string?,
latest_visible_assistant_message: string?,
latest_recorded_user_message: string?,
latest_recorded_assistant_message: string?,
latest_tool_result: string?,
latest_recorded_tool_result: string?
}
It may return:
- `nil` to leave the current prompt-visible context unchanged
- a `list` of messages to use as the next prompt-visible message list
- a `dict` with optional `messages` and `system` fields
Example: hide older assistant messages so the model mostly sees user intent, tool results, and the latest assistant turn.
fn hide_old_assistant_turns(ctx) {
var kept = []
var latest_assistant = nil
for msg in ctx.visible_messages {
if msg?.role == "assistant" {
latest_assistant = msg
} else {
kept = kept + [msg]
}
}
if latest_assistant != nil {
kept = kept + [latest_assistant]
}
return {messages: kept}
}
let result = agent_loop(task, "You are a coding assistant.", {
persistent: true,
context_callback: hide_old_assistant_turns
})
Post-turn callback
post_turn_callback runs after a tool-calling turn completes. Use it when the
workflow should react to the tool outcomes directly instead of waiting for the
model to emit another message.
The callback receives:
{
tool_names: list,
tool_results: list,
successful_tool_names: list,
tool_count: int,
iteration: int,
consecutive_single_tool_turns: int,
session_tools_used: list,
session_successful_tools: list,
}
Each tool_results entry has:
{tool_name: string, status: string, rejected: bool}
It may return:
- a `string` to inject as the next user-visible message
- a `bool` where `true` stops the current stage immediately after the turn
- a `dict` with optional `message` and `stop` fields
Example: stop after the first successful write turn, but still allow multiple edits in that same turn.
fn stop_after_successful_write(turn) {
if turn?.successful_tool_names?.contains("edit") {
return {stop: true}
}
return ""
}
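Wiring the callback above into a loop looks like this (a sketch; the task string is illustrative):

```harn
let result = agent_loop("Apply the fix.", "You are a coding assistant.", {
    persistent: true,
    post_turn_callback: stop_after_successful_write
})
```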
Example with retry
retry 3 {
let result = agent_loop(
task,
"You are a coding assistant.",
{
persistent: true,
max_iterations: 30,
max_nudges: 5,
provider: "anthropic",
model: "claude-sonnet-4-20250514"
}
)
println(result.text)
}
Skills lifecycle
Skills bundle metadata, a system-prompt fragment, scoped tools, and
lifecycle hooks into a typed unit. Declare them with the top-level
skill NAME { ... } language form (see the Harn spec)
or the imperative skill_define(...) builtin, then pass the resulting
skill_registry to agent_loop via the skills: option. The agent
loop matches, activates, and (optionally) deactivates skills across
turns automatically.
Matching strategies
skill_match: { strategy: ..., top_n: 1, sticky: true } controls how
the loop picks which skill(s) to activate:
"metadata"(default) — in-VM BM25-ish scoring overdescription+when_to_usecombined with glob matching against thepaths:list. Name-in-prompt mentions count as a strong boost. No host round-trip, so matching is fast and deterministic."host"— delegates scoring to the host via theskill/matchbridge RPC (see bridge-protocol.md). Useful for embedding-based or LLM-driven matchers. Failing RPC falls back to metadata scoring with a warning."embedding"— alias for"host"; accepted so the language matches Anthropic’s canonical terminology.
Activation lifecycle
- Match runs at the head of iteration 0 (always) and, when `sticky: false`, before every subsequent iteration (reassess).
- Activate: the skill's `on_activate` closure (if any) is called, its `prompt` body is woven into the effective system prompt, and `allowed_tools` narrows the tool surface for the next LLM call. Each activation emits `AgentEvent::SkillActivated` plus a `skill_activated` transcript event with the match score and reason.
- Deactivate (only in `sticky: false` mode) — when reassess picks a different top-N, the previously active skill's `on_deactivate` runs and the scoped tool filter is dropped. Emits `AgentEvent::SkillDeactivated` plus a `skill_deactivated` transcript event.
- Session resume: when `session_id:` is set, the set of active skills at the end of one run is persisted in the session store. The next `agent_loop` call on the same session rehydrates them before iteration-0 matching runs, so sticky re-entry stays hot without re-matching from a cold prompt.
Scoped tools
A skill’s allowed_tools list is the union across all active
skills; any tool outside that union is filtered out of both the
contract prompt and the native tool schemas the provider sees.
Runtime-internal tools like __harn_tool_search are never filtered
— scoping gates the user-declared surface, not the runtime’s own
scaffolding.
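A minimal sketch of the union rule, using hypothetical skill and tool names:

```harn
skill reader {
    description "Read-only exploration"
    allowed_tools ["read", "search"]
}

skill runner {
    description "Run verification commands"
    allowed_tools ["run"]
}

// With both skills active, the effective surface is the union
// {read, search, run}; any other user-declared tool is filtered
// out of both the contract prompt and the native schemas.
```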
Frontmatter honoured by the runtime
| Field | Type | Effect |
|---|---|---|
description | string | Primary ranking signal for metadata matching |
when_to_use | string | Secondary ranking signal |
paths | list<string> | Glob patterns for paths: auto-trigger |
allowed_tools | list<string> | Whitelist applied to the tool surface on activation |
prompt | string | Body woven into the active-skill system-prompt block |
disable-model-invocation | bool | When true, the matcher skips the skill entirely |
user-invocable | bool | Placeholder for host UI (not consumed by the runtime today) |
mcp | list<string> | MCP servers the skill wants booted (consumed by host integrations) |
on_activate / on_deactivate | fn | Closures invoked on transition |
Example
skill ship {
description "Ship a production release"
when_to_use "User says ship/release/deploy"
paths ["infra/**", "Dockerfile"]
allowed_tools ["deploy_service"]
prompt "Follow the deploy runbook. One command at a time."
}
let result = agent_loop(
"Ship the new release to production",
"You are a staff deploy engineer.",
{
provider: "anthropic",
tools: tools(),
skills: ship,
working_files: ["infra/terraform/cluster.tf"],
}
)
The loop emits one skill_matched event per match pass (including
zero-candidate passes so replayers see the boundary), one
skill_activated per activated skill, and one skill_scope_tools
event per activation whose allowed_tools narrowed the surface.
Streaming responses
llm_stream returns a channel that yields response chunks as they
arrive. Iterate over it with a for loop:
let stream = llm_stream("Tell me a story", "You are a storyteller")
for chunk in stream {
print(chunk)
}
llm_stream accepts the same options as llm_call (provider, model,
max_tokens). The channel closes automatically when the response is
complete.
Delegated workers
For long-running or parallel orchestration, Harn exposes a worker/task lifecycle directly in the runtime.
let worker = spawn_agent({
name: "research-pass",
task: "Draft a summary",
node: {
kind: "subagent",
mode: "llm",
model_policy: {provider: "mock"},
output_contract: {output_kinds: ["summary"]}
}
})
let done = wait_agent(worker)
println(done.status)
spawn_agent(...) accepts either:
- a `graph` plus optional `artifacts` and `options`, which runs a typed workflow in the background, or
- a `node` plus optional `artifacts` and `transcript`, which runs a single delegated stage and preserves transcript continuity across `send_input(...)`
Worker configs may also include policy to narrow the delegated worker to a
subset of the parent’s current execution ceiling, or a top-level
tools: ["name", ...] shorthand:
let worker = spawn_agent({
task: "Read project files only",
tools: ["read", "search"],
node: {
kind: "subagent",
mode: "llm",
model_policy: {provider: "mock"},
tools: repo_tools()
}
})
If neither is provided, the worker inherits the current execution policy as-is.
If either is provided, Harn intersects the requested worker scope with the
parent ceiling before the worker starts or is resumed. Permission denials are
returned to the agent loop as structured tool results:
{error: "permission_denied", tool, reason}.
Worker lifecycle builtins:
| Function | Description |
|---|---|
spawn_agent(config) | Start a worker from a workflow graph or delegated stage |
sub_agent_run(task, options?) | Run an isolated child agent loop and return a single clean result envelope to the parent |
send_input(handle, task) | Re-run a completed worker with a new task, carrying transcript/artifacts forward when applicable |
resume_agent(id_or_snapshot_path) | Restore a persisted worker snapshot and continue it in the current runtime |
wait_agent(handle_or_list) | Wait for one worker or a list of workers to finish |
close_agent(handle) | Cancel a worker and mark it terminal |
list_agents() | Return summaries for all known workers in the current runtime |
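For example, a hedged fan-out sketch — `wait_agent(...)` accepts a list of handles, and the per-handle result shape is assumed here to mirror the single-handle case:

```harn
// Spawn one delegated worker per task, then wait for all of them.
var workers = []
for t in ["Summarize README.md", "List TODO comments"] {
    workers = workers + [spawn_agent({
        task: t,
        node: {kind: "subagent", mode: "llm", model_policy: {provider: "mock"}}
    })]
}
let done = wait_agent(workers)   // waits for every handle in the list
for d in done {
    println(d.status)
}
```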
sub_agent_run
Use sub_agent_run(...) when you want a full child agent_loop with its own
session and narrowed capability scope, but you do not want the child transcript
to spill into the parent conversation history.
let result = sub_agent_run("Find the config entrypoints.", {
provider: "mock",
tools: repo_tools(),
allowed_tools: ["search", "read"],
token_budget: 1200,
returns: {
schema: {
type: "object",
properties: {
paths: {type: "array", items: {type: "string"}}
},
required: ["paths"]
}
}
})
if result.ok {
println(result.data.paths)
} else {
println(result.error.category)
}
The parent transcript only records the outer tool call and tool result. The
child keeps its own session and transcript, linked by session_id / parent
lineage metadata.
sub_agent_run(...) returns an envelope with:
- `ok`
- `summary`
- `artifacts`
- `evidence_added`
- `tokens_used`
- `budget_exceeded`
- `session_id`
- `data` when the child requests JSON mode or `returns.schema` succeeds
- `error: {category, message, tool?}` when the child fails or a narrowed tool policy rejects a call
Set background: true to get a normal worker handle back instead of waiting
inline. The resulting worker uses mode: "sub_agent" and can be resumed with
wait_agent(...), send_input(...), and close_agent(...).
Background handles retain the original structured request plus a normalized
provenance object, so parent pipelines can recover child questions, actions,
workflow stages, and verification steps directly from the handle/result.
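A background-mode sketch, assuming the same config surface as the inline call:

```harn
let worker = sub_agent_run("Audit the config files.", {
    provider: "mock",
    background: true        // returns a worker handle instead of waiting inline
})
let done = wait_agent(worker)
println(done.status)
```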
Workers can persist state and child run paths between sessions. Use carry
inside spawn_agent(...) when you want continuation to reset transcript state,
drop carried artifacts, or disable workflow resume against the previous child
run record. Worker configs may also include execution to pin delegated work
to an explicit cwd/env overlay or a managed git worktree:
let worker = spawn_agent({
task: "Run the repo-local verification pass",
graph: some_graph,
execution: {
worktree: {
repo: ".",
branch: "worker/research-pass",
cleanup: "preserve"
}
}
})
Transcript management
Harn includes transcript primitives for carrying context across calls, forks, repairs, and resumptions:
let first = llm_call("Plan the work", nil, {provider: "mock"})
let second = llm_call("Continue", nil, {
provider: "mock",
transcript: first.transcript
})
let compacted = transcript_compact(second.transcript, {
keep_last: 4,
summary: "Planning complete."
})
Use transcript_summarize() when you want Harn to create a fresh summary with
an LLM, or transcript_compact() when you want the runtime compaction engine
outside the agent_loop path.
Transcript helpers also expose the canonical event model:
let visible = transcript_render_visible(result.transcript)
let full = transcript_render_full(result.transcript)
let events = transcript_events(result.transcript)
Use these when a host app needs to render human-visible chat separately from internal execution history.
For chat/session lifecycle, std/agents now exposes a higher-level workflow
session contract on top of raw transcripts and run records:
import "std/agents"
let result = task_run("Write a note", some_flow, {provider: "mock"})
let session = workflow_session(result)
let forked = workflow_session_fork(session)
let archived = workflow_session_archive(forked)
let resumed = workflow_session_resume(archived)
let persisted = workflow_session_persist(result, ".harn-runs/chat.json")
let restored = workflow_session_restore(persisted.run.persisted_path)
Each workflow session also carries a normalized usage summary copied from the
underlying run record when available:
println(session?.usage?.input_tokens)
println(session?.usage?.output_tokens)
println(session?.usage?.total_duration_ms)
println(session?.usage?.call_count)
std/agents also exposes worker helpers for delegated/background orchestration:
worker_request(worker), worker_result(worker), worker_provenance(worker),
worker_research_questions(worker), worker_action_items(worker),
worker_workflow_stages(worker), and worker_verification_steps(worker).
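A short usage sketch, assuming `worker` is a handle returned by `spawn_agent(...)`:

```harn
import "std/agents"

let req = worker_request(worker)             // the original structured request
let prov = worker_provenance(worker)         // normalized provenance object
let questions = worker_research_questions(worker)
let steps = worker_verification_steps(worker)
```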
This is the intended host integration boundary:
- hosts persist chat tabs, titles, and durable asset files
- Harn persists transcript/run-record/session semantics
- hosts should prefer restoring a Harn session or transcript over inventing a parallel hidden memory format
Workflow runtime
For multi-stage orchestration, prefer the workflow runtime over product-side loop wiring. Define a helper that assembles the tools your agents will use:
fn review_tools() {
var tools = tool_registry()
tools = tool_define(tools, "read", "Read a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil
})
tools = tool_define(tools, "edit", "Edit a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil
})
tools = tool_define(tools, "run", "Run a command", {
parameters: {command: {type: "string"}},
returns: {type: "string"},
handler: nil
})
return tools
}
let graph = workflow_graph({
name: "review_and_repair",
entry: "act",
nodes: {
act: {kind: "stage", mode: "agent", tools: review_tools()},
verify: {kind: "verify", mode: "agent", tools: tool_select(review_tools(), ["run"])}
},
edges: [{from: "act", to: "verify"}]
})
let run = workflow_execute(
"Fix the failing test and verify the change.",
graph,
[],
{max_steps: 6}
)
This keeps orchestration structure, transcript policy, context policy, artifacts, and retries inside Harn instead of product code.
Cost tracking
Harn provides builtins for estimating and controlling LLM costs:
// Estimate cost for a specific call
let cost = llm_cost("claude-sonnet-4-20250514", 1000, 500)
println("Estimated cost: $${cost}")
// Check cumulative session costs
let session = llm_session_cost()
println("Total: $${session.total_cost}")
println("Calls: ${session.call_count}")
println("Input tokens: ${session.input_tokens}")
println("Output tokens: ${session.output_tokens}")
// Set a budget (LLM calls throw if exceeded)
llm_budget(1.00)
println("Remaining: $${llm_budget_remaining()}")
| Function | Description |
|---|---|
llm_cost(model, input_tokens, output_tokens) | Estimate USD cost from embedded pricing table |
llm_session_cost() | Session totals: {total_cost, input_tokens, output_tokens, call_count} |
llm_budget(max_cost) | Set session budget in USD. LLM calls throw if exceeded |
llm_budget_remaining() | Remaining budget (nil if no budget set) |
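Since LLM calls throw once the budget is exceeded, pairing a budget with `try`/`catch` lets an overrun fail gracefully. A sketch — the error payload shape is an assumption:

```harn
llm_budget(0.50)
try {
    let r = llm_call("Summarize the design doc.", nil, {provider: "anthropic"})
    println(r.text)
} catch err {
    println("Call failed or budget exhausted: ${err}")
}
```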
Provider API details
Anthropic
- Endpoint: `https://api.anthropic.com/v1/messages`
- Auth: `x-api-key` header
- API version: `2023-06-01`
- System message sent as a top-level `system` field
OpenAI
- Endpoint: `https://api.openai.com/v1/chat/completions`
- Auth: `Authorization: Bearer <key>`
- System message sent as a message with `role: "system"`
OpenRouter
- Endpoint: `https://openrouter.ai/api/v1/chat/completions`
- Auth: `Authorization: Bearer <key>`
- Same message format as OpenAI
HuggingFace
- Endpoint: `https://router.huggingface.co/v1/chat/completions`
- Auth: `Authorization: Bearer <key>`
- Use `HF_TOKEN` or `HUGGINGFACE_API_KEY`
- Same message format as OpenAI
Ollama
- Endpoint: `<OLLAMA_HOST>/v1/chat/completions`
- Default host: `http://localhost:11434`
- No authentication required
- Same message format as OpenAI
Local OpenAI-compatible server
- Endpoint: `<LOCAL_LLM_BASE_URL>/v1/chat/completions`
- Default host: `http://localhost:8000`
- No authentication required
- Same message format as OpenAI
Testing with mock LLM responses
The mock provider returns deterministic responses without API keys.
Use llm_mock() to queue specific responses — text, tool calls, or both:
// Queue a text response (consumed in FIFO order)
llm_mock({text: "The capital of France is Paris."})
let r = llm_call("What is the capital of France?", nil, {provider: "mock"})
assert_eq(r.text, "The capital of France is Paris.")
// Queue a response with tool calls
llm_mock({
text: "Let me read that file.",
tool_calls: [{name: "read_file", arguments: {path: "src/main.rs"}}],
})
// Pattern-matched mocks (reusable by default, matched in declaration order)
llm_mock({text: "I don't know.", match: "*unknown*"})
llm_mock({text: "step 1", match: "*planner*", consume_match: true})
llm_mock({text: "step 2", match: "*planner*", consume_match: true})
// Inspect what was sent to the mock provider
let calls = llm_mock_calls()
// Each entry: {messages: [...], system: "..." or nil, tools: [...] or nil}
// Clear all mocks and call log between tests
llm_mock_clear()
When no llm_mock() responses are queued, the mock provider falls back to
its default deterministic behavior (echoing prompt metadata). This means
existing tests using provider: "mock" without llm_mock() continue to
work unchanged.
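Queued mocks compose with `agent_loop` as well — a sketch, assuming each queued response is consumed by one loop iteration:

```harn
llm_mock({text: "Working on it."})            // first turn: no sentinel, so the loop nudges
llm_mock({text: "All finished. ##DONE##"})    // second turn: sentinel ends the loop
let result = agent_loop("Do the task", nil, {provider: "mock", persistent: true})
assert_eq(result.status, "done")
```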
Daemon stdlib
Harn’s daemon builtins wrap the existing agent_loop(..., {daemon: true})
runtime so scripts can manage long-lived assistants without hand-assembling
snapshot paths and resume options.
Builtins
daemon_spawn(config)
Start a daemon-mode agent and return a daemon handle dict.
Required config:
- `task` or `prompt`
- `persist_path` or `state_dir`
Useful optional config:
- `name`
- `system`
- `provider`, `model`, `tools`, `max_iterations`, and other `agent_loop` options
- `wake_interval_ms`
- `watch_paths`
- `idle_watchdog_attempts`
- `event_queue_capacity` (default `1024`)
Example:
let reviewer = daemon_spawn({
name: "reviewer",
task: "Watch for trigger events and summarize the change.",
system: "You are a careful code reviewer.",
provider: "mock",
persist_path: ".harn/daemons/reviewer",
watch_paths: ["src/"],
wake_interval_ms: 30000,
event_queue_capacity: 256,
})
daemon_trigger(handle, event)
Queue a trigger event for a running daemon. Events are delivered FIFO, one daemon wake at a time, and the queue is durably persisted in the daemon’s metadata so a stop/resume or crash/recovery cycle does not lose pending work.
If the queue is full, the builtin throws VmError::DaemonQueueFull.
daemon_trigger(reviewer, {
kind: "file_changed",
path: "src/lib.rs",
})
daemon_snapshot(handle)
Return the latest persisted daemon snapshot plus live queue metadata:
- `pending_events`
- `pending_event_count`
- `inflight_event`
- `queued_event_count`
- `event_queue_capacity`
The rest of the payload mirrors agent_loop daemon snapshots, including
daemon_state, recorded_messages, total_iterations, and saved_at.
daemon_stop(handle)
Stop a daemon and preserve its state on disk. The runtime waits briefly for an
idle boundary when possible; if the daemon is still mid-turn, the current
in-flight trigger is re-queued so daemon_resume(...) can replay it safely.
daemon_resume(path)
Resume a daemon from its persisted state directory. The path is the same root
directory you passed as persist_path / state_dir to daemon_spawn(...),
not the inner daemon.json snapshot file.
If the daemon stopped with queued or in-flight trigger events, they are restored and replayed after resume.
Delivery semantics
- Trigger events are FIFO.
- The queue is bounded by `event_queue_capacity`.
- Trigger payloads are handed to the daemon only from an idle boundary, so a persisted snapshot always reflects the pre-trigger or post-trigger state and never an ambiguous half-consumed queue.
- Forced stop/restart is intentionally at-least-once: an in-flight trigger is re-queued on stop/resume instead of being dropped silently.
Trigger stdlib
The trigger stdlib exposes the live runtime registry to Harn scripts. Use it to inspect installed bindings, register new bindings at runtime, fire synthetic events for tests/manual invocations, replay a recorded event by id, and inspect the current dead-letter queue (DLQ).
Import the shared types from std/triggers when you want typed handles and
payloads:
import "std/triggers"
Builtins
trigger_list()
Return the current live registry snapshot as list<TriggerBinding>.
Each binding includes:
- `id`
- `version`
- `source` (`"manifest"` or `"dynamic"`)
- `kind`
- `provider`
- `handler_kind`
- `state`
- `metrics`
metrics is a typed TriggerMetrics record with counters for received,
dispatched, failed, dlq, in_flight, and the cost snapshot fields.
trigger_register(config)
Register a trigger dynamically and return its TriggerHandle.
TriggerConfig uses the same broad shape as manifest-loaded bindings:
- `id`
- `kind`
- `provider`
- `handler`
- `when`
- `retry`
- `match` or `events`
- `dedupe_key`
- `filter`
- `budget`
- `manifest_path`
- `package_name`
The runtime currently accepts two handler forms:
- Local Harn closures / function references
- Remote URI strings with `a2a://...` or `worker://...`
retry is optional. The current stdlib surface accepts:
- `{max: N, backoff: "svix"}`
- `{max: N, backoff: "immediate"}`
Example:
import "std/triggers"
fn handle_issue(event: TriggerEvent) -> dict {
return {kind: event.kind, provider: event.provider}
}
let handle: TriggerHandle = trigger_register({
id: "github-new-issue",
kind: "issue.opened",
provider: "github",
handler: handle_issue,
when: nil,
match: {events: ["issue.opened"]},
events: nil,
dedupe_key: nil,
filter: nil,
budget: nil,
manifest_path: nil,
package_name: nil,
})
trigger_fire(handle, event)
Fire a synthetic TriggerEvent into a binding and return a
DispatchHandle.
The builtin accepts either:
- A `TriggerHandle` / `TriggerBinding` dict
- A plain trigger id string
If the event dict omits low-level envelope fields such as id,
received_at, trace_id, or provider_payload, the runtime fills them with
synthetic defaults.
Current behavior:
- Execution routes through the trigger dispatcher, so local handlers inherit dispatcher retries, lifecycle events, action-graph updates, and DLQ moves.
- `when` predicates execute before the handler and can still short-circuit a dispatch.
- `a2a://...` and `worker://...` handlers still return the dispatcher's explicit `NotImplemented` failure path.
trigger_replay(event_id)
Replay a previously recorded event from the EventLog by id and return a
DispatchHandle.
Current replay behavior:
- Fetch the prior event from the `triggers.events` topic
- Re-dispatch it through the trigger dispatcher using the recorded binding
- Preserve `replay_of_event_id` on the returned `DispatchHandle`
- Resolve the pending stdlib DLQ entry when a replay succeeds
trigger_replay(...) is still not the full deterministic T-14 replay engine.
It replays the recorded trigger event through the current dispatcher/runtime
state rather than a sandboxed drift-detecting environment.
trigger_inspect_dlq()
Return the current DLQ snapshot as list<DlqEntry>.
Each DlqEntry includes:
- The failed `event`
- Trigger identity (`binding_id`, `binding_version`)
- Current `state`
- Latest `error`
- `retry_history`

`retry_history` records every DLQ attempt, including replay attempts.
Example
import "std/triggers"
fn fail_handler(event: TriggerEvent) -> any {
throw("manual failure: " + event.kind)
}
let handle = trigger_register({
id: "manual-dlq",
kind: "issue.opened",
provider: "github",
handler: fail_handler,
when: nil,
retry: {max: 1, backoff: "immediate"},
match: nil,
events: ["issue.opened"],
dedupe_key: nil,
filter: nil,
budget: nil,
manifest_path: nil,
package_name: nil,
})
let fired = trigger_fire(handle, {provider: "github", kind: "issue.opened"})
let dlq = trigger_inspect_dlq().filter({ entry -> entry.binding_id == handle.id })
let replay = trigger_replay(fired.event_id)
println(fired.status) // "dlq"
println(len(dlq[0].retry_history)) // 1
println(replay.replay_of_event_id) // original event id
Notes
- Dynamic registrations are runtime-local. `trigger_register(...)` updates the live registry in the current process; it does not rewrite `harn.toml`.
- `trigger_fire(...)` and `trigger_replay(...)` need an active EventLog to persist `triggers.events` and `triggers.dlq`. If the runtime did not already install one, the stdlib wrapper falls back to an in-memory log for the current thread.
- When `workflow_execute(...)` runs inside a replayed trigger dispatch, the runtime carries the replay pointer into run metadata so derived observability can render a `replay_chain` edge back to the original event.
Skills
Harn discovers skills — bundled instructions, tool lists, and
activation rules — from the filesystem and from the host process. Every
skill is a directory containing a SKILL.md file with YAML
frontmatter plus a Markdown body; the format matches Anthropic’s
Agent Skills
and Claude Code specs, so
skills you author once work across both environments.
This page describes:
- the layered discovery hierarchy (CLI > env > project > manifest > user > package > system > host),
- the SKILL.md frontmatter Harn recognizes,
- the body substitution (`$ARGUMENTS`, `$N`, `${HARN_SKILL_DIR}`, `${HARN_SESSION_ID}`) that runs over SKILL.md before the model sees it,
- the `harn.toml` `[skills]` / `[[skill.source]]` tables, and
- the `harn doctor` output for diagnosing collisions / missing entries.
The companion language form — skill NAME { ... } — is documented in
Language basics and the skill builtins
(skill_registry, skill_define, skill_find, skill_list,
skill_render, skills_catalog_entries, render_always_on_catalog,
…) in Builtin functions.
Layered discovery
When harn run / harn test / harn check starts, every discovered
skill is merged into a single registry and exposed as the pre-populated
VM global skills. The layers — in order of highest to lowest
priority — are:
| # | Layer | Source | When |
|---|---|---|---|
| 1 | CLI | --skill-dir <path> (repeatable) | Ephemeral overrides, CI pinning |
| 2 | Env | $HARN_SKILLS_PATH (colon-separated on Unix, ; on Windows) | Deployment config, Docker, cloud agents |
| 3 | Project | .harn/skills/<name>/SKILL.md walking up from the script | Default for repo-scoped skills |
| 4 | Manifest | [skills] paths + [[skill.source]] in harn.toml | Multi-root, shared across siblings |
| 5 | User | ~/.harn/skills/<name>/SKILL.md | Personal skills across projects |
| 6 | Package | .harn/packages/**/skills/<name>/SKILL.md | Skills shipped via [dependencies] |
| 7 | System | /etc/harn/skills/ + $XDG_CONFIG_HOME/harn/skills/ | Managed / enterprise |
| 8 | Host | Registered via the bridge at runtime | Cloud / embedded hosts |
Name collisions: when two layers both expose a skill named deploy,
the higher layer wins. The shadowed entry is recorded so harn doctor
can surface it. Scripts that need both at once can register a
fully-qualified <namespace>/<skill> id via [[skill.source]] in the
manifest (see below).
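The layered merge with shadowing can be sketched in Python. This is an illustrative model rather than the actual resolver; it assumes each layer is just a map of skill name to metadata, with layers listed from highest to lowest priority:

```python
def merge_layers(layers):
    """layers: list of (layer_name, {skill_name: skill}), highest priority first.
    Earlier layers win a name collision; shadowed entries are recorded so a
    diagnostic pass (like harn doctor) can surface them."""
    resolved = {}
    shadowed = []
    for layer_name, skills in layers:
        for name, skill in skills.items():
            if name in resolved:
                # A higher-priority layer already claimed this name.
                shadowed.append({"name": name,
                                 "hidden_layer": layer_name,
                                 "winner_layer": resolved[name]["layer"]})
            else:
                resolved[name] = {"layer": layer_name, **skill}
    return resolved, shadowed

resolved, shadowed = merge_layers([
    ("cli", {"deploy": {"description": "CI-pinned deploy"}}),
    ("project", {"review": {"description": "Review a PR"}}),
    ("user", {"deploy": {"description": "Personal deploy"}}),
])
```

Here the `cli` copy of `deploy` wins, and the `user` copy lands in `shadowed` rather than disappearing silently.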
SKILL.md frontmatter
The frontmatter is YAML, delimited by --- on its own line above and
below. Unknown fields are not hard errors — harn doctor reports
them as warnings so newer spec fields roll out cleanly.
---
name: deploy
description: Deploy the application to production
when-to-use: User says deploy / ship / release
disable-model-invocation: false
user-invocable: true
allowed-tools: [bash, git]
paths:
- infra/**
- Dockerfile
context: fork
agent: ops-lead
model: claude-opus-4-7
effort: high
shell: bash
argument-hint: "<target-env>"
hooks:
on-activate: echo "starting deploy"
on-deactivate: echo "deploy ended"
---
# Deploy runbook
Ship it: `$ARGUMENTS`. Skill directory: `${HARN_SKILL_DIR}`.
Recognized fields (Harn normalizes hyphens to underscores, so
when-to-use and when_to_use are the same key):
| Field | Type | Purpose |
|---|---|---|
name | string | Required. Id the script looks up via skill_find. |
description | string | One-liner the model sees for auto-activation. |
when-to-use | string | Longer activation trigger. |
disable-model-invocation | bool | If true, never auto-activate — explicit use only. |
allowed-tools | list of string | Restrict tool surface while the skill is active. Entries accept three shapes: an exact tool name ("deploy_service"), a namespace tag ("namespace:read" — matches every tool declared with namespace: "read"), or "*" (escape hatch that keeps the full surface, useful for skills that only carry prompt context). |
user-invocable | bool | Expose the skill to end users via a slash menu. |
paths | list of glob | Files the skill expects to touch. |
context | string | "fork" runs in an isolated subcontext. |
agent | string | Sub-agent that owns the skill. |
hooks | map or list | Shell commands for lifecycle events. |
model | string | Preferred model alias. |
effort | string | low / medium / high. |
shell | string | Shell to run the body under when context is shell-ish. |
argument-hint | string | UI hint for $ARGUMENTS. |
Tool scoping with namespace:<tag>
Tool declarations that carry a namespace: field can be grouped into
one allowed-tools entry instead of enumerating names. Given
tool_define(reg, "read_file", "...", {namespace: "read", ...})
tool_define(reg, "list_files", "...", {namespace: "read", ...})
tool_define(reg, "write_file", "...", {namespace: "write", ...})
a skill with allowed-tools: ["namespace:read"] scopes the turn to
read_file + list_files and hides write_file. Exact tool names
and the wildcard "*" remain valid and can mix freely:
allowed-tools: ["namespace:read", "grep", "*"]
Malformed entries fail loudly at skill_define time — a bare ":"
without a tag or a colon-prefixed entry that isn’t namespace: raises
so authors don’t silently scope to an empty set.
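The three entry shapes and the fail-loud rule can be sketched as a small resolver. This is a hedged Python model, assuming the registry is a plain dict of tool name to declaration:

```python
def expand_allowed_tools(entries, registry):
    """registry: {tool_name: {"namespace": tag_or_None}}. Returns the visible
    tool set. Malformed entries raise, mirroring the fail-loud behavior."""
    visible = set()
    for entry in entries:
        if entry == "*":
            return set(registry)  # wildcard keeps the full surface
        if ":" in entry:
            prefix, _, tag = entry.partition(":")
            if prefix != "namespace" or not tag:
                raise ValueError(f"malformed allowed-tools entry: {entry!r}")
            visible |= {n for n, t in registry.items()
                        if t.get("namespace") == tag}
        elif entry in registry:
            visible.add(entry)  # exact tool name
    return visible

registry = {
    "read_file": {"namespace": "read"},
    "list_files": {"namespace": "read"},
    "write_file": {"namespace": "write"},
}
```

With this registry, `["namespace:read"]` resolves to `read_file` and `list_files`, while a bare `":"` raises instead of scoping to an empty set.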
Body substitution
When a skill is rendered (via the skill_render builtin, or by a host
before handing the body to the model), the following substitutions run
over the Markdown body:
- `$ARGUMENTS` → all positional args joined with spaces
- `$N` → the N-th positional arg (1-based). `$0` is reserved.
- `${HARN_SKILL_DIR}` → absolute path to the skill directory
- `${HARN_SESSION_ID}` → opaque session id threaded through the run
- `${OTHER_NAME}` → looks up `OTHER_NAME` in the process environment
- `$$` → literal `$`
Missing positional args ($3 when only $1 was supplied) pass
through unchanged so authors see what wasn’t supplied rather than a
silent empty substitution.
let deploy = skill_find(skills, "deploy")
let rendered = skill_render(deploy, ["prod", "us-east-1"])
// rendered now has $1 and $2 replaced with "prod" and "us-east-1".
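A minimal Python model of the substitution pass (the real implementation lives in the runtime; the pass-through behavior for a missing environment variable is an assumption here, only the missing-positional-arg pass-through is documented):

```python
import os
import re

# Matches, in order: literal $$, $ARGUMENTS, ${UPPER_NAME}, $N (1-based, so $0
# is never matched and passes through as the reserved token).
_TOKEN = re.compile(r"\$\$|\$ARGUMENTS|\$\{[A-Z_][A-Z0-9_]*\}|\$[1-9][0-9]*")

def render_body(body, args, skill_dir, session_id, env=None):
    env = os.environ if env is None else env
    def sub(m):
        tok = m.group(0)
        if tok == "$$":
            return "$"
        if tok == "$ARGUMENTS":
            return " ".join(args)
        if tok == "${HARN_SKILL_DIR}":
            return skill_dir
        if tok == "${HARN_SESSION_ID}":
            return session_id
        if tok.startswith("${"):
            # Assumed: unknown env names pass through unchanged.
            return env.get(tok[2:-1], tok)
        n = int(tok[1:])
        # Missing positional args pass through unchanged by design.
        return args[n - 1] if 1 <= n <= len(args) else tok
    return _TOKEN.sub(sub, body)
```

For example, with args `["prod", "us-east-1"]`, `$1` and `$2` substitute while `$3` survives verbatim so the author sees what was not supplied.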
Progressive disclosure with load_skill
When an agent loop receives a skill registry through skills:,
Harn automatically exposes a runtime-owned load_skill({ name }) tool.
The tool:
- resolves the requested skill id against the loop’s resolved skill registry,
- applies the same SKILL.md body substitution described above, and
- returns the substituted body as the tool result so it lands in the next turn’s transcript naturally.
If the target skill has disable-model-invocation: true,
load_skill returns a typed error instead of leaking the body.
Always-on catalog helper
The recommended harness convention is:
- Keep a compact catalog of available skills in the system prompt.
- Let the model call `load_skill` only when one of those entries looks relevant.
Harn ships two pure helpers for that pattern:
let entries = skills_catalog_entries(skills)
let catalog = render_always_on_catalog(entries, 2000)
skills_catalog_entries projects the resolved registry into compact
{name, description, when_to_use} cards (sorted deterministically by
skill id, using <namespace>/<name> when present). render_always_on_catalog
formats those cards into a stable prompt block and trims the list to the
requested character budget.
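A sketch of that budget trim in Python. The exact card format is an assumption; only the deterministic ordering and the character budget come from the description above:

```python
def render_always_on_catalog_sketch(entries, budget):
    """Formats catalog cards into a prompt block, dropping trailing entries
    once the character budget (including newlines) would be exceeded."""
    lines, used = [], 0
    for e in sorted(entries, key=lambda e: e["name"]):  # deterministic order
        line = f"- {e['name']}: {e['description']}"
        if used + len(line) + 1 > budget:
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)
```

With a generous budget every card renders; with a tight one the list is truncated from the end rather than reflowed.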
Copy-pasteable example:
let catalog = render_always_on_catalog(skills_catalog_entries(skills), 2000)
let result = agent_loop(
"Help me ship this release",
catalog,
{
provider: "mock",
model: "gpt-5.4",
persistent: true,
skills: skills,
},
)
On a later turn the model can emit:
load_skill({ name: "deploy" })
and the next turn will see the substituted SKILL.md body in the tool
result, while any allowed-tools declared by that skill narrow the
tool surface for subsequent turns.
harn.toml [skills] + [[skill.source]]
Projects that share skills across siblings or pull them from a remote tag use the manifest instead of a per-script flag:
[skills]
paths = ["packages/*/skills", "../shared-skills"]
lookup_order = ["cli", "project", "manifest", "user", "package", "system", "host"]
disable = ["system"]
[skills.defaults]
tool_search = "bm25"
always_loaded = ["look", "edit", "bash"]
[[skill.source]]
type = "fs"
path = "../shared"
[[skill.source]]
type = "git"
url = "https://github.com/acme/harn-skills"
tag = "v1.2.0"
[[skill.source]]
type = "registry" # reserved, inert until a marketplace exists
url = "https://skills.harnlang.com"
name = "acme/ops"
- `paths` is joined against the directory holding harn.toml and supports a single trailing `*` component (`packages/*/skills`).
- `lookup_order` lets you invert a layer’s priority — for example, to prefer `user` over `project` on a personal checkout without touching the repo.
- `disable` kicks entire layers out of discovery. Disabled layers are reported by `harn doctor`.
- `[[skill.source]]` entries of type `git` expect their materialized checkout to live under `.harn/packages/<name>/skills/` — run `harn install` to populate it.
- `registry` entries are accepted but inert until a Harn Skills marketplace exists (tracked by #73).
harn doctor
harn doctor reports the resolved skill catalog:
OK skills 3 loaded (1 cli, 1 project, 1 user)
WARN skill:deploy shadowed by cli layer; user version at /home/me/.harn/skills/deploy is hidden
WARN skill:review unknown frontmatter field(s) forwarded as metadata: future_field
SKIP skills-layer:system layer disabled by harn.toml [skills.disable]
CLI flags
- `harn run --skill-dir <path>` (repeatable) — highest-priority layer.
- `harn test --skill-dir <path>` — same semantics for user tests and conformance fixtures.
- `$HARN_SKILLS_PATH` — colon-separated list of directories, applied to every invocation.
Bridge protocol
Hosts expose their own managed skill store through three RPCs:
- `skills/list` (request) — response is an array of `{ id, name, description, source }` entries.
- `skills/fetch` (request) — payload `{ id: "<skill id>" }`; response is the full manifest + body shape so the CLI can hydrate a `SkillManifestRef` into a `Skill`.
- `skills/update` (notification, host → VM) — invalidates the VM’s cached catalog. The CLI re-runs discovery on the next boundary.
See Bridge protocol for wire-format details.
Managing skills
The harn skills CLI manages and inspects skills without running a
pipeline. Each subcommand resolves the layered catalog the same way
harn run does (--skill-dir, HARN_SKILLS_PATH, project, manifest,
user, packages, system, host), so what you see here is exactly what
pipelines see.
harn skills list
Prints every resolved skill with the layer it came from. Pass
--all to include shadowed entries; pass --json for machine output.
$ harn skills list
Resolved skills (3):
deploy [cli] Deploy to production with rollback support
review [project] Review a pull request
helpers/utils [package] Shared helpers from the acme/ops package
Shadowed skills (1):
deploy winner=[cli] hidden=[user] origin=/home/me/.harn/skills/deploy
harn skills inspect <name>
Dumps the resolved SKILL.md — frontmatter, bundled files under the
skill directory, and the full body — for a specific skill. Accepts
bare <name> or fully-qualified <namespace>/<name>:
$ harn skills inspect deploy
id: deploy
name: deploy
layer: cli
description: Deploy to production with rollback support
skill_dir: /repo/.harn/skills/deploy
Bundled files:
files/runbook.md
files/rollback.sh
---- SKILL.md body ----
Run the deploy. Confirm replicas and then flip traffic.
harn skills match "<query>"
Runs the built-in metadata matcher (same scorer the agent loop uses)
against a prompt and prints the ranked candidates with their scores.
Supports --working-file to simulate path-glob matches:
$ harn skills match "deploy the staging service" --top-n 3
Match results for: deploy the staging service
1. deploy score=2.400 [cli] prompt mentions 'deploy'; 1 keyword hit(s)
2. review score=0.400 [project] 1 keyword hit(s)
Useful when authoring a SKILL.md to confirm its description: and
when_to_use: frontmatter actually attracts the right prompts.
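The scorer's exact weights are internal to Harn; this Python toy mirrors only the sample output above (2.0 for a name mention, 0.4 per keyword hit) purely for illustration:

```python
def score_skill(skill, prompt):
    """Toy metadata matcher: +2.0 when the prompt mentions the skill name,
    +0.4 per keyword hit drawn from description/when_to_use. Illustrative
    weights, not the real scorer."""
    prompt_words = set(prompt.lower().split())
    score, reasons = 0.0, []
    if skill["name"].lower() in prompt_words:
        score += 2.0
        reasons.append(f"prompt mentions '{skill['name']}'")
    keywords = set((skill.get("description", "") + " "
                    + skill.get("when_to_use", "")).lower().split())
    hits = len(prompt_words & keywords)
    if hits:
        score += 0.4 * hits
        reasons.append(f"{hits} keyword hit(s)")
    return round(score, 3), reasons
```

Running it against "deploy the staging service" reproduces the 2.400 score shown in the `harn skills match` sample.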
harn skills install <spec>
Materializes a git ref or local path into .harn/skills-cache/ so
the filesystem package walker picks it up on the next run. The
.harn/skills-cache/ layout mirrors .harn/packages/:
$ harn skills install acme/harn-skills --tag v1.2.0
installing acme/harn-skills to .harn/skills-cache/harn-skills
installed — layer=package, path=.harn/skills-cache/harn-skills
<spec> accepts:
- A full git URL: `https://github.com/acme/harn-skills.git`
- `owner/repo` shorthand (expands to GitHub): `acme/harn-skills`
- A local filesystem path: `../shared/skills/deploy`
Pass --namespace <ns> to place the install under a subdirectory so
it shows up in the resolver as <ns>/<skill>. Pass --tag <ref> to
pin a git branch or tag. Every install rewrites
.harn/skills-cache/skills.lock with the resolved source + commit.
harn skills new <name>
Scaffolds a new SKILL.md and files/ directory under .harn/skills/:
$ harn skills new deploy --description "Deploy to production"
Scaffolded skill 'deploy' at .harn/skills/deploy
SKILL.md
files/README.md
Edit the SKILL.md frontmatter and body, then run `harn skills list`
to verify it's picked up.
Pass --dir <path> to target a different destination (for example
~/.harn/skills/deploy to scaffold under the user layer instead of
the project layer), or --force to overwrite an existing directory.
Portal observability
The Harn portal (harn portal) surfaces three skill-focused panels on
every run detail page:
- Skill timeline — horizontal bars showing which skills activated on which agent-loop iteration and when they deactivated. Hover a bar for the matcher score and the reason the skill was promoted.
- Tool-load waterfall — one row per `tool_search_query` event, pairing each query with its `tool_search_result` so you can see which deferred tools entered the LLM’s context in each turn.
- Matcher decisions — per-iteration expansions showing every candidate the matcher considered, its score, and the working-file snapshot it scored against.
The runs index page takes a skill=<name> filter so you can narrow
evals to runs where a specific skill was active. The same
skill=<name> query parameter works from a URL, making it easy to
link to “every run that used deploy”.
Sessions
A session is a first-class VM resource that owns three things for a given conversational agent run:
- Its transcript history (`messages`, `events`, `summary`, …).
- The closure subscribers registered against it via `agent_subscribe(session_id, cb)`.
- Its lifecycle — create, reset, fork, trim, compact, close.
Sessions replace the old transcript_policy config pattern. Lifecycle
used to be a side effect of dict fields (mode: "reset", mode: "fork"
quietly mutating state on stage entry); it is now expressed by
explicit, imperative builtins. Unknown inputs are hard errors.
Quick tour
pipeline main(task) {
// Open (or resume) a session. `nil` mints a UUIDv7.
let s = agent_session_open()
// Seed the conversation.
agent_session_inject(s, {role: "user", content: "Hello!"})
// Run an agent loop against the session — prior messages are
// automatically loaded as prefix, the final transcript is persisted
// back under `s`.
let first = agent_loop("continue the greeting", nil, {
session_id: s,
provider: "mock",
})
// A second call sees `first`'s assistant reply as prior history.
let second = agent_loop("what do you remember?", nil, {
session_id: s,
provider: "mock",
})
// Fork to explore a counterfactual without touching `s`.
let branch = agent_session_fork(s)
agent_session_inject(branch, {role: "user", content: "what if …"})
// Release a session immediately.
agent_session_close(branch)
}
If you don’t pass session_id to agent_loop, the loop mints an
anonymous id internally and does NOT persist anything. That preserves
the “one-shot” call shape.
Builtins
| Function | Returns | Notes |
|---|---|---|
agent_session_open(id?: string) | string | Idempotent. nil mints a UUIDv7. |
agent_session_exists(id) | bool | Safe on unknown ids. |
agent_session_length(id) | int | Message count. Errors if id doesn’t exist. |
agent_session_snapshot(id) | dict or nil | Read-only deep copy of the transcript. |
agent_session_reset(id) | nil | Wipes history; preserves id and subscribers. |
agent_session_fork(src, dst?) | string | Copies transcript; subscribers are NOT copied. |
agent_session_trim(id, keep_last) | int | Retains last keep_last messages. Returns kept count. |
agent_session_compact(id, opts) | int | Runs the LLM/truncate/observation-mask compactor. Unknown keys in opts error. |
agent_session_inject(id, message) | nil | Appends a {role, content, …} message. Missing role errors. |
agent_session_close(id) | nil | Evicts immediately. |
agent_session_compact options
Accepts any subset of these keys; anything else is a hard error:
- `keep_last` (int, default 12)
- `token_threshold` (int)
- `tool_output_max_chars` (int)
- `compact_strategy` ("llm" | "truncate" | "observation_mask" | "custom")
- `hard_limit_tokens` (int)
- `hard_limit_strategy` (same values as above)
- `custom_compactor` (closure)
- `mask_callback` (closure)
- `compress_callback` (closure)
Storage model
Sessions live in a per-thread HashMap<String, SessionState> in
crate::agent_sessions. Thread-local is correct because VmValue
wraps Rc and the agent loop runs on a pinned tokio LocalSet task.
An LRU cap (default 128 sessions per VM) evicts the least-recently
accessed session when a new one is opened over the cap.
agent_session_close evicts immediately regardless of the cap.
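The cap behavior can be modeled with an ordered map. A Python sketch of the store semantics, not the Rust implementation:

```python
from collections import OrderedDict

class SessionStore:
    """LRU-capped session store sketch (Harn's default cap is 128 per VM).
    Opening over the cap evicts the least-recently-accessed session;
    close() evicts immediately regardless of the cap."""
    def __init__(self, cap=128):
        self.cap = cap
        self.sessions = OrderedDict()

    def open(self, sid):
        if sid in self.sessions:
            self.sessions.move_to_end(sid)  # refresh recency; idempotent open
        else:
            if len(self.sessions) >= self.cap:
                self.sessions.popitem(last=False)  # evict least-recently used
            self.sessions[sid] = {"messages": []}
        return sid

    def close(self, sid):
        self.sessions.pop(sid, None)
```

Opening `a`, `b`, `a`, then `c` with a cap of 2 evicts `b`, since `a` was touched more recently.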
Subscribers
agent_subscribe(id, closure) appends closure to the session’s
subscribers list. The agent loop fires turn_end (and other)
events through every subscriber for that session id. Subscribers are
not copied by agent_session_fork — a fork is a conversation branch,
not an event fanout.
Interaction with workflows
Workflow stages pick up a session id from
model_policy.session_id on the node; if unset, each stage mints a
stable stage-scoped id. Two stages sharing a session_id share their
transcript automatically through the session store — no explicit
threading or policy dict required.
To branch a stage’s conversation before running it, call
agent_session_fork in the pipeline before workflow_execute and
wire the fork id into the relevant node’s model_policy.session_id.
Fail-loud
Unknown option keys on agent_session_compact, a missing role on
agent_session_inject, a negative keep_last, and any of the
lifecycle verbs (reset, fork, close, trim, inject,
length, compact) called against an unknown id all raise a
VmError::Thrown(string). exists, open, and snapshot are the
only calls that tolerate unknown ids by design.
Agent State
std/agent_state is Harn’s durable, session-scoped scratch space for
agent orchestration. It gives a caller-owned root directory plus a
session id a small set of predictable operations:
- write text blobs atomically
- read them back later
- list keys deterministically
- delete keys
- persist a machine-readable handoff document
- reopen the same session from a later process with
agent_state_resume
The important design point is that the primitive is generic. Harn owns the durable-state substrate; host apps own their schema and naming conventions layered on top of it.
Import
import "std/agent_state"
Functions
| Function | Returns | Notes |
|---|---|---|
agent_state_init(root, options?) | state_handle | Creates or reopens a session-scoped state root under root/<session_id>/ |
agent_state_resume(root, session_id, options?) | state_handle | Reopens an existing session; errors if it does not exist |
agent_state_write(handle, key, content) | nil | Atomic temp-write plus rename |
agent_state_read(handle, key) | string or nil | Returns nil for missing keys |
agent_state_list(handle) | list<string> | Lexicographically sorted, recursive, deterministic |
agent_state_delete(handle, key) | nil | Missing keys are ignored |
agent_state_handoff(handle, summary) | nil | Writes a structured JSON handoff envelope to __handoff.json |
agent_state_handoff_key() | string | Returns the reserved handoff key name ("__handoff.json") |
Handle shape
agent_state_init(...) and agent_state_resume(...) return a tagged
dict:
{
_type: "state_handle",
backend: "filesystem",
root: "/absolute/root",
session_id: "session-123",
handoff_key: "__handoff.json",
conflict_policy: "ignore",
writer: {
writer_id: "worker-a",
stage_id: "worker-a",
session_id: "session-123",
worker_id: "worker-a"
}
}
The exact fields are stable on purpose. Other runtime features can build on the same handle semantics without introducing a second durable-state model.
Session ids
agent_state_init(root, options?) looks for options.session_id first.
If it is absent, Harn defaults to the active agent/workflow session id
when one exists. Outside an active agent context, Harn mints a fresh
UUIDv7.
That means common agent code can usually say:
import "std/agent_state"
pipeline default() {
let state = agent_state_init(".harn/state", {writer_id: "planner"})
agent_state_write(state, "plan.md", "# Plan")
}
and get a session-specific namespace automatically.
Keys and layout
Keys are always relative to the session root. Nested paths are fine:
import "std/agent_state"
pipeline default() {
let state = agent_state_init(".harn/state", {writer_id: "planner"})
agent_state_write(state, "plan.md", "# Plan")
agent_state_write(state, "evidence/files.json", "{\"paths\":[]}")
}
Rejected key forms:
- absolute paths
- any path containing
.. - reserved internal metadata paths
The default filesystem backend stores user content under:
<root>/<session_id>/<key>
with internal writer metadata stored separately under a hidden backend
directory. agent_state_list(...) only returns user-visible keys.
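The rejection rules can be sketched as a key validator. The reserved internal prefix used here is hypothetical; the three rejected forms come from the list above:

```python
def validate_key(key):
    """Rejects absolute paths, any '..' component, and reserved internal
    metadata paths. The '.__harn' prefix is a hypothetical stand-in for
    whatever the backend reserves internally."""
    if key.startswith("/") or (len(key) > 1 and key[1] == ":"):
        raise ValueError("absolute paths are not allowed")
    parts = key.split("/")
    if ".." in parts:
        raise ValueError("'..' is not allowed in keys")
    if parts[0].startswith(".__harn"):  # hypothetical reserved prefix
        raise ValueError("reserved internal metadata path")
    return key
```

Nested relative keys like `evidence/files.json` pass; `/etc/passwd` and `a/../b` raise.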
Atomic writes
agent_state_write(...) writes to a temp file in the target directory,
syncs it, then renames it into place. If the process crashes before the
rename, the old file remains intact and the partially-written temp file
never becomes the visible key.
This guarantees “no partial file at the target path”, which is the durability property the primitive is designed to expose.
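The same temp-write, fsync, rename pattern, sketched in Python:

```python
import os
import tempfile

def atomic_write(path, content):
    """Write to a temp file in the target directory, sync it, then rename it
    into place. A crash before the rename leaves the old file intact and the
    temp file never becomes the visible key."""
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)  # same directory, so rename is atomic
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

The temp file must live in the same directory as the target: `os.replace` is only atomic within a filesystem.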
Handoff documents
agent_state_handoff(handle, summary) stores a JSON envelope at
__handoff.json:
{
"_type": "agent_state_handoff",
"version": 1,
"session_id": "session-123",
"key": "__handoff.json",
"summary": {
"status": "ready"
}
}
Callers own the shape of summary. Harn owns the outer envelope and the
well-known key.
Two-writer discipline
Each handle can carry a writer identity and conflict policy:
let state = agent_state_init(".harn/state", {
session_id: "demo",
writer_id: "planner",
conflict_policy: "error"
})
Supported policies:
- `"ignore"`: accept overlapping writes silently
- `"warn"`: accept the write and emit a runtime warning
- `"error"`: reject the write before replacing the existing content
Conflict detection compares the previous writer id for that key with the current writer id. This is intentionally simple and deterministic: it is a guard rail against accidental stage overlap, not a full distributed locking protocol.
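A Python model of the policy check; the `(allowed, warning)` return shape is illustrative, not the runtime's actual API:

```python
def check_conflict(prev_writer, current_writer, policy):
    """Compares the previous writer id for a key with the current writer id.
    Returns (allowed, warning_message_or_None)."""
    if prev_writer is None or prev_writer == current_writer:
        return True, None  # first write, or same writer: never a conflict
    if policy == "ignore":
        return True, None
    if policy == "warn":
        return True, (f"writer {current_writer!r} overwrote key last "
                      f"written by {prev_writer!r}")
    if policy == "error":
        return False, None
    raise ValueError(f"unknown conflict_policy: {policy!r}")
```

Only cross-writer overwrites trigger the policy, which is what makes it a guard rail against accidental stage overlap rather than a lock.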
Backend seam
The default implementation is a filesystem backend, but the storage
layer is split behind a backend trait in
crates/harn-vm/src/stdlib/agent_state/backend.rs.
That trait is designed around:
- scope creation/resume
- atomic blob read/write/delete
- deterministic list
- conflict metadata on write
so future backends such as in-memory, SQLite, or remote stores can plug in without changing the Harn-facing handle semantics.
Example
import "std/agent_state"
pipeline default() {
let state = agent_state_init(".harn/state", {
session_id: "review-42",
writer_id: "triage"
})
agent_state_write(state, "plan.md", "# Plan\n- inspect PR")
agent_state_handoff(state, {
status: "needs_review",
next_stage: "implement"
})
let resumed = agent_state_resume(".harn/state", "review-42", {
writer_id: "implement"
})
println(agent_state_read(resumed, "plan.md"))
}
Transcript architecture
Harn transcripts are now versioned runtime values with three distinct layers:
- `messages`: durable conversational turns used to continue model calls.
- `events`: normalized audit history derived from messages plus lifecycle/runtime events.
- `assets`: durable descriptors for large or non-text payloads that should not be inlined into prompt history.
The intended schema is:
{
"_type": "transcript",
"version": 2,
"id": "tr_...",
"state": "active",
"summary": "optional compacted summary",
"metadata": {},
"messages": [
{
"role": "user",
"content": [
{"type": "image", "asset_id": "asset_1", "visibility": "public"},
{"type": "text", "text": "Review this screenshot", "visibility": "public"}
]
}
],
"events": [
{
"kind": "message",
"role": "user",
"visibility": "public",
"text": "<image:screenshot.png> Review this screenshot",
"blocks": [...]
}
],
"assets": [
{
"_type": "transcript_asset",
"id": "asset_1",
"kind": "image",
"mime_type": "image/png",
"visibility": "internal",
"storage": {"path": ".harn/assets/asset_1.png"}
}
]
}
Rules:
- Put prompt-relevant turn content in `messages`.
- Put replay/audit/lifecycle facts in `events`.
- Put large media, file blobs, provider payload dumps, and durable attachments in `assets`.
- Message blocks should reference assets by `asset_id` instead of embedding base64 when persistence matters.
- Compaction should summarize archived text while retaining asset descriptors and recent multimodal turns.
Persistence split:
- Hosts should persist asset files and any product-level chat/session metadata needed to reopen a conversation in the app shell.
- Harn run records, worker snapshots, and transcript values should persist the structured transcript object, including asset descriptors and message/event links.
- Hosts should avoid inventing a parallel hidden memory model. If a chat needs continuity, reuse or restore the Harn transcript and run record state.
Workflow runtime
Harn’s workflow runtime is the layer above raw llm_call() and
agent_loop(). It gives host applications a typed, inspectable, replayable
orchestration boundary instead of pushing orchestration logic into app code.
Core concepts
Workflow graphs
Use workflow_graph(...) to normalize a workflow definition into a typed
graph with:
- named nodes
- explicit edges
- node kinds such as stage, verify, join, condition, fork, map, reduce, subagent, and escalation
- typed stage input/output contracts
- explicit branch semantics and typed run transitions
- per-node model, transcript, context, retry, and capability policies
- workflow-level capability ceiling
- mutation audit log entries
subagent nodes are now a real delegated execution boundary. They run through
the worker lifecycle, attach worker metadata to their stage records, and tag
their produced artifacts with delegated provenance so parent workflows can
inspect and reduce child results explicitly.
Start with a helper that registers the tools the workflow will expose to each node. Each tool carries its own capability policy so validation can enforce them automatically:
fn review_tools() {
var tools = tool_registry()
tools = tool_define(tools, "read", "Read a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {workspace: ["read_text"]},
side_effect_level: "read_only",
path_params: ["path"],
mutation_classification: "read_only"
}
})
tools = tool_define(tools, "edit", "Edit a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {workspace: ["write_text"]},
side_effect_level: "workspace_write",
path_params: ["path"],
mutation_classification: "apply_workspace"
}
})
tools = tool_define(tools, "run", "Run a command", {
parameters: {command: {type: "string"}},
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {process: ["exec"]},
side_effect_level: "process_exec",
mutation_classification: "ambient_side_effect"
}
})
return tools
}
let graph = workflow_graph({
name: "repair_loop",
entry: "act",
nodes: {
act: {kind: "stage", mode: "agent", tools: review_tools()},
verify: {kind: "verify", mode: "agent", tools: tool_select(review_tools(), ["run"])},
repair: {kind: "stage", mode: "agent", tools: tool_select(review_tools(), ["edit", "run"])}
},
edges: [
{from: "act", to: "verify"},
{from: "verify", to: "repair", branch: "failed"},
{from: "repair", to: "verify", branch: "retry"}
]
})
let report = workflow_validate(graph)
assert(report.valid)
When tool entries include policy, Harn folds that metadata into workflow
validation and execution automatically. That keeps the registry itself as the
source of truth for capability requirements instead of forcing products to
repeat the same information in both tool definitions and node policy blocks.
Action graphs
std/agents now exposes an action-graph layer above raw workflow graphs for
planner-driven orchestration:
- `action_graph(raw, options?)` canonicalizes planner output variants into a stable `{_type: "action_graph", actions: [...]}` envelope.
- `action_graph_batches(graph, completed?)` repairs missing cross-phase dependencies and groups ready work by phase plus tool class.
- `action_graph_flow(graph, config?)` turns that plan envelope into a typed workflow graph with one scheduled batch stage per ready batch.
- `action_graph_run(task, graph, config?, overrides?)` attaches a durable `plan` artifact and executes the generated workflow via `workflow_execute`.
This is the intended shared substrate for “research -> plan -> execute -> verify” style pipelines when the planner output is unstable but the executor should still see a canonical schedule.
import "std/agents"
let raw_plan = {
steps: [
{id: "inspect", kind: "research", title: "Inspect parser", tools: ["read", "search"]},
{id: "patch", title: "Patch diagnostics", tools: ["edit"]},
{id: "docs", title: "Update release notes", tools: ["edit"]}
]
}
let plan = action_graph(raw_plan, {task: "Fix parser diagnostics"})
let run = action_graph_run("Fix parser diagnostics", plan, {
research: {mode: "llm", model_policy: {provider: "mock"}},
execute: {mode: "llm", model_policy: {provider: "mock"}},
verify: {command: "cargo test --workspace --quiet", expect_status: 0}
})
println(run.status)
println(len(run.batches))
Artifacts and resources
Artifacts are the real context boundary. Instead of building context mostly by concatenating strings, Harn selects typed artifacts under policy and budget.
Core artifact kinds that ship in the runtime include:
`artifact`, `resource`, `summary`, `analysis_note`, `diff`, `test_result`, `verification_result`, `plan`
Artifacts carry provenance fields such as:
`source`, `created_at`, `freshness`, `lineage`, `relevance`, `estimated_tokens`, `metadata`
Example:
let selection = artifact({
kind: "resource",
title: "Selected code",
text: read_file("src/parser.rs"),
source: "workspace",
relevance: 0.95
})
let plan = artifact_derive(selection, "plan", {
text: "Update the parser diagnostic wording and preserve spans."
})
let context = artifact_context([selection, plan], {
include_kinds: ["resource", "plan"],
max_tokens: 1200
})
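One plausible reading of policy-plus-budget selection, sketched in Python. The relevance-first ordering is an assumption, not documented behavior; only kind filtering and the token budget come from the example above:

```python
def select_artifacts(artifacts, include_kinds, max_tokens):
    """Keep only included kinds, prefer higher relevance, and skip any
    artifact whose estimated_tokens would push past the budget."""
    pool = [a for a in artifacts if a["kind"] in include_kinds]
    pool.sort(key=lambda a: a.get("relevance", 0.0), reverse=True)
    chosen, used = [], 0
    for a in pool:
        cost = a.get("estimated_tokens", 0)
        if used + cost > max_tokens:
            continue  # over budget: drop this artifact, try smaller ones
        chosen.append(a)
        used += cost
    return chosen
```

With a 1200-token budget, a high-relevance 800-token `resource` crowds out a 500-token `plan`; raising the budget admits both.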
Executing workflows
workflow_execute(task, graph, artifacts?, options?) executes a typed
workflow and persists a structured run record.
let run = workflow_execute(
"Fix the diagnostic regression and verify the tests.",
graph,
[selection, plan],
{max_steps: 8}
)
println(run.status)
println(run.path)
println(run.run.stages)
verify nodes can also run deterministic checks without an LLM loop:
verify: {
kind: "verify",
verify: {
command: "cargo test --workspace --quiet",
expect_status: 0,
assert_text: "test result: ok"
}
}
Command-based verification records stdout, stderr, exit_status, and a
derived success flag on the stage result while still flowing through the same
workflow branch/outcome machinery as LLM-backed verification.
Verifier requirements can also be published as structured contract inputs for earlier planning and execution stages. Harn injects these contracts into the stage prompt automatically so the model sees exact verifier-owned identifiers, paths, and wiring text before it starts editing:
verify: {
kind: "verify",
verify: {
command: "python scripts/verify_rate_limit.py",
expect_status: 0,
required_identifiers: ["rateLimit"],
required_paths: ["src/middleware/rateLimit.ts"],
required_text: ["app.use(rateLimit)"],
notes: ["Use the verifier-exact symbol names. Do not rename them."]
}
}
When the verifier contract lives outside the workflow file, point contract_path
at a JSON file relative to the workflow execution context:
verify: {
kind: "verify",
verify: {
command: "python scripts/verify_rate_limit.py",
contract_path: "scripts/verify_rate_limit.contract.json",
expect_status: 0
}
}
Options currently include:
- `max_steps`
- `persist_path`
- `resume_path`
- `resume_run`
- `replay_path`
- `replay_run`
- `replay_mode: "deterministic"`
- `audit`
- `mutation_scope`
- `approval_policy`
Resuming is practical rather than magical: if a saved run has unfinished successor stages, Harn continues from persisted ready-node checkpoints with saved artifacts, transcript state, and traversed run-graph edges.
Deterministic replay is now a runtime mode rather than a CLI-only inspection
tool: passing a prior run via replay_run or replay_path replays saved stage
records and artifacts through the workflow engine without calling providers or
tools again.
Delegated runs surface child worker lineage in each delegated stage’s metadata.
This makes replay/eval and host timelines able to distinguish parent execution
from child execution without reconstructing that structure from plain text.
Persisted runs also retain explicit parent_run_id, root_run_id, and
child_runs lineage, and load_run_tree(path) materializes that hierarchy
recursively for inspection or host-side task views.
Map nodes can now execute branch work in parallel. node.join_policy.strategy
accepts:
- `"all"` to wait for every branch result
- `"first"` to return after the first completed branch
- `"quorum"` to return after `join_policy.min_completed` branches finish
node.map_policy.max_concurrent limits branch fan-out, and partial failures are
retained alongside successful branch artifacts instead of aborting the whole map
stage on the first error.
Runs may also include metadata.mutation_session, a normalized audit record
used to tie tool gates, workers, and artifacts back to one mutation boundary:
- `session_id`
- `parent_session_id`
- `run_id`
- `worker_id`
- `execution_kind`
- `mutation_scope`
- `approval_policy`
This is not an editor undo stack. It is the runtime-side provenance contract that hosts can map onto their own approval and undo/redo UX.
Transcripts and sessions
Stage transcripts are owned by the session store, not by
a per-node transcript_policy dict. Each node picks up a session id from
model_policy.session_id; two nodes that share an id share their
conversation automatically. Unset ids get a stable stage-scoped default.
To shape transcript behavior on a node, use the dedicated workflow setters plus the lifecycle builtins:
- `workflow_set_auto_compact(graph, node_id, policy)` — sets `auto_compact`, `compact_threshold`, `tool_output_max_chars`, `compact_strategy`, `hard_limit_tokens`, `hard_limit_strategy`.
- `workflow_set_output_visibility(graph, node_id, visibility)` — `"public" | "private" | nil`.
- `agent_session_reset(id)`, `agent_session_fork(src, dst?)`, `agent_session_trim(id, keep_last)`, `agent_session_compact(id, opts)` — call these in the pipeline before `workflow_execute` to branch, reset, or compact a stage’s conversation explicitly.
The old `transcript_policy` dict (with `mode: "continue" | "reset" | "fork"`) was removed in 0.7.0; see Sessions for migration.
Meta-orchestration builtins
Harn exposes typed workflow editing builtins so orchestration changes can be audited and validated against the workflow IR:
- `workflow_inspect(..., ceiling?)`
- `workflow_clone(...)`
- `workflow_insert_node(...)`
- `workflow_replace_node(...)`
- `workflow_rewire(...)`
- `workflow_set_model_policy(...)`
- `workflow_set_context_policy(...)`
- `workflow_set_auto_compact(...)`
- `workflow_set_output_visibility(...)`
- `workflow_diff(...)`
- `workflow_validate(..., ceiling?)`
- `workflow_policy_report(..., ceiling?)`
- `workflow_commit(...)`
These mutate structured workflow graphs, not free-form prompt text.
Capability ceilings
Workflows and sub-orchestration may narrow capabilities, but they must not exceed the host/runtime ceiling.
This is enforced explicitly by capability-policy intersection during validation and execution setup. If a node requests tools or host operations outside the ceiling, validation fails.
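The intersection rule itself is simple enough to state as a few lines of Python; the capability names here are placeholders, not Harn's real identifiers:

```python
# Sketch of capability-ceiling enforcement as set intersection.
def validate_node(requested: set, ceiling: set) -> set:
    """Fail validation if a node requests anything outside the ceiling;
    otherwise the effective capability set is the narrowed request."""
    excess = requested - ceiling
    if excess:
        raise ValueError(f"capabilities exceed ceiling: {sorted(excess)}")
    return requested & ceiling

ceiling = {"read_file", "llm_call"}
print(validate_node({"read_file"}, ceiling))  # narrowing is allowed
```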
Run records, replay, and evals
Workflow execution produces a persisted run record containing:
- workflow identity
- task
- stage records
- stage attempts, outcomes, and branch decisions
- traversed graph transitions
- ready-node checkpoints for resume
- stage transcripts
- visible output
- private reasoning metadata
- tool intent and tool execution events
- provider payload metadata kept separate from visible text
- verification outcomes
- artifacts
- policy metadata
- parent/root run lineage and delegated child runs
- a derived observability block summarizing planner rounds, research facts, action-graph nodes/edges, verification outcomes, and transcript pointers
- execution status
CLI support:
harn portal
harn runs inspect .harn-runs/<run>.json
harn runs inspect .harn-runs/<run>.json --compare baseline.json
harn replay .harn-runs/<run>.json
harn eval .harn-runs/<run>.json
harn eval .harn-runs/
harn eval evals/regression.json
The replay/eval surface is intentionally tied to saved typed run records so host applications do not need to build their own provenance layer.
For a local visual view over the same persisted data, harn portal reads the
run directory directly and renders stages, the derived action graph, trace
spans, transcript sections, and delegated child runs without introducing a
second storage format.
For host/runtime consumers that want the same logic inside Harn code, the VM also exposes:
- `run_record_fixture(...)`
- `run_record_eval(...)`
- `run_record_eval_suite(...)`
- `run_record_diff(...)`
- `eval_suite_manifest(...)`
- `eval_suite_run(...)`
Eval manifests group persisted runs, optional explicit replay fixtures, and optional baseline run comparisons under a single typed document. This lets hosts treat replay/eval suites as data rather than external scripts.
Host artifact handoff
Hosts and editor bridges should hand Harn typed artifacts instead of embedding their own orchestration rules in ad hoc prompt strings. The VM now exposes helpers for the most common host surfaces:
- `artifact_workspace_file(...)`
- `artifact_workspace_snapshot(...)`
- `artifact_editor_selection(...)`
- `artifact_verification_result(...)`
- `artifact_test_result(...)`
- `artifact_command_result(...)`
- `artifact_diff(...)`
- `artifact_git_diff(...)`
- `artifact_diff_review(...)`
- `artifact_review_decision(...)`
- `artifact_patch_proposal(...)`
- `artifact_verification_bundle(...)`
- `artifact_apply_intent(...)`
These helpers normalize kind names, token estimates, priority defaults, lineage, and metadata so host products can pass editor/test/diff state into Harn without recreating artifact taxonomy and provenance logic externally.
Trigger manifests
[[triggers]] extends harn.toml with declarative trigger registrations in the
same manifest-overlay family as [exports], [llm], and [[hooks]].
Each entry declares:
- a stable trigger `id`
- a trigger `kind` such as `webhook`, `cron`, or `a2a-push`
- a `provider` from the registered trigger provider catalog
- a delivery `handler`
- optional dedupe, retry, budget, secret, and predicate settings
Shape
[[triggers]]
id = "github-new-issue"
kind = "webhook"
provider = "github"
match = { events = ["issues.opened"] }
when = "handlers::should_handle"
handler = "handlers::on_new_issue"
dedupe_key = "event.dedupe_key"
retry = { max = 7, backoff = "svix", retention_days = 7 }
priority = "normal"
budget = { daily_cost_usd = 5.00, max_concurrent = 10 }
secrets = { signing_secret = "github/webhook-secret" }
filter = "event.kind"
Handler URI schemes
Harn currently accepts three handler forms:
- local function: `handler = "on_event"` or `handler = "handlers::on_event"`
- A2A dispatch: `handler = "a2a://reviewer.prod/triage"`
- worker queue dispatch: `handler = "worker://triage-queue"`
Unsupported URI schemes fail fast at load time.
Local handlers and predicates resolve through the same module-export plumbing as the manifest hook loader:
- bare names resolve against `lib.harn` next to the manifest
- `module::function` resolves either through the current manifest’s `[exports]` table or through package imports under `.harn/packages`
Validation
The manifest loader rejects invalid trigger declarations before execution:
- trigger ids must be unique across the loaded root manifest plus installed package manifests
- `provider` must exist in the registered trigger provider catalog
- `handler` must be a supported URI, and local handlers must resolve to exported functions
- `when` must resolve to a function with signature `fn(TriggerEvent) -> bool`
- `dedupe_key` and `filter` must parse as JMESPath expressions
- `retry.max` must be `<= 100`
- `retry.retention_days` defaults to `7` and must be `>= 1`
- `budget.daily_cost_usd` must be `>= 0`
- cron triggers must declare a parseable `schedule`
- cron `timezone` must be a valid IANA timezone name
- secret references must use `<namespace>/<name>` syntax and the namespace must match the trigger provider
Errors include the manifest path plus the [[triggers]] table index so the bad
entry is easy to locate.
Durable dedupe retention
Trigger dedupe now uses a durable inbox index backed by the shared EventLog
topic trigger.inbox. Each successful claim stores the binding id plus the
resolved dedupe_key, and duplicate deliveries are rejected until the claim’s
TTL expires.
- configure the TTL with `retry.retention_days`
- the default is `7` days
- shorter retention trims durable dedupe history sooner, which lowers storage cost but increases the chance that a late provider retry will be treated as a fresh event
Use a retention window at least as long as the provider’s maximum retry window. If a provider can redeliver for longer than your configured TTL, Harn may dispatch that late retry again once the durable claim has expired.
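The claim/TTL interaction can be sketched with an in-memory stand-in for the durable `trigger.inbox` index; the class and its API are hypothetical:

```python
# Sketch of TTL-based dedupe claims keyed by (binding id, dedupe_key).
import time

class DedupeIndex:
    def __init__(self, retention_days=7):
        self.ttl = retention_days * 86400
        self.claims = {}  # (binding_id, dedupe_key) -> claim timestamp

    def try_claim(self, binding_id, dedupe_key, now=None):
        now = time.time() if now is None else now
        key = (binding_id, dedupe_key)
        claimed_at = self.claims.get(key)
        if claimed_at is not None and now - claimed_at < self.ttl:
            return False  # duplicate delivery inside the TTL window
        self.claims[key] = now  # fresh claim, or an expired claim re-taken
        return True

idx = DedupeIndex(retention_days=7)
print(idx.try_claim("github-new-issue", "delivery-1", now=0))          # True
print(idx.try_claim("github-new-issue", "delivery-1", now=3600))       # False
print(idx.try_claim("github-new-issue", "delivery-1", now=8 * 86400))  # True: claim expired
```

The last line is the hazard the paragraph above warns about: a late retry after expiry looks like a fresh event.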
Doctor output
harn doctor now lists loaded triggers with:
- trigger id
- trigger kind
- provider
- handler kind (`local`, `a2a`, or `worker`)
- budget summary
Examples
See the example manifests under `examples/triggers`.
Trigger event schema
TriggerEvent is the normalized envelope every inbound trigger provider
converges on before dispatch. Connectors preserve provider-specific payload
fidelity inside provider_payload, but the orchestration layer always sees the
same outer shape:
import "std/triggers"
fn on_event(event: TriggerEvent) {
let payload = event.provider_payload
if payload.provider == "github" && payload.event == "issues" {
println(payload.issue.title ?? "unknown")
}
let signature = event.signature_status
if signature.state == "failed" {
println(signature.reason)
}
}
Envelope fields
TriggerEvent carries:
- `id`: runtime-assigned event id.
- `provider`: provider identity such as `"github"`, `"slack"`, `"cron"`, or `"webhook"`.
- `kind`: provider-specific event kind.
- `received_at`: RFC3339 timestamp captured by the runtime.
- `occurred_at`: provider-reported RFC3339 timestamp when available.
- `dedupe_key`: delivery id or equivalent idempotency key.
- `trace_id`: trace correlation id propagated through dispatch.
- `tenant_id`: optional orchestrator-assigned tenant namespace.
- `headers`: redacted provider headers retained for audit/debugging.
- `provider_payload`: provider-tagged payload union.
- `signature_status`: typed verification result.
Signature status
signature_status is a discriminated union:
- `{ state: "verified" }`
- `{ state: "unsigned" }`
- `{ state: "failed", reason: string }`
Unsigned events are valid for synthetic sources such as cron. Failed events can still be logged for audit purposes even if the dispatcher rejects them.
Provider payloads
The initial std/triggers payload aliases are intentionally small. Each
provider variant exposes a stable normalized surface plus raw: dict. GitHub’s
payload is already narrowed into the six MVP event families (issues,
pull_request, issue_comment, pull_request_review, push, and
workflow_run) with event-specific top-level fields such as issue,
pull_request, comment, review, commits, and workflow_run:
- `GitHubEventPayload`
- `SlackEventPayload`
- `LinearEventPayload`
- `NotionEventPayload`
- `CronEventPayload`
- `GenericWebhookPayload`
- `A2aPushPayload`
- `ExtensionProviderPayload`
The runtime registers these through a ProviderCatalog, so future connectors
can contribute new payload schemas without rewriting the top-level
TriggerEvent envelope.
Header redaction
The runtime keeps delivery, event, timestamp, request-id, signature, and
user-agent headers by default. It redacts sensitive headers such as
Authorization, Cookie, and names containing secret, token, or key
unless they are explicitly allow-listed as safe metadata.
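A rough Python approximation of that filter follows; the exact keep-list and allow-list mechanics are assumptions based on the description above:

```python
# Hypothetical sketch of the redaction rule: keep delivery/signature-style
# headers, redact Authorization/Cookie and secret-looking names unless
# explicitly allow-listed as safe metadata.
KEEP_PREFIXES = ("x-github-", "x-hub-", "webhook-")
KEEP_NAMES = {"user-agent", "x-request-id", "date"}
SENSITIVE_NAMES = {"authorization", "cookie"}
SENSITIVE_SUBSTRINGS = ("secret", "token", "key")

def redact_headers(headers, allow_list=frozenset()):
    out = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered in allow_list:
            out[name] = value  # explicit opt-in wins
        elif lowered in SENSITIVE_NAMES or any(s in lowered for s in SENSITIVE_SUBSTRINGS):
            out[name] = "<redacted>"
        elif lowered in KEEP_NAMES or lowered.startswith(KEEP_PREFIXES):
            out[name] = value
    return out

print(redact_headers({"Authorization": "Bearer abc", "X-GitHub-Delivery": "d1"}))
```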
Trigger Dispatcher
The trigger dispatcher is the runtime path that turns a normalized
TriggerEvent plus a live registry binding into actual handler work.
At MVP, the dispatcher fully wires the local-function path and keeps the
remote handler schemes (a2a://..., worker://...) as explicit stubs with
clear error messages pointing at their follow-up tickets.
Dispatch shape
Each dispatch goes through the same sequence:
- Append the inbound event to `trigger.inbox`.
- Match the event against active registry bindings for the provider + event kind.
- Evaluate the optional `when` predicate in the same VM/runtime surface as the handler.
- Invoke the resolved handler target.
- Record each attempt on `trigger.attempts`.
- Record successful handler results on `trigger.outbox`.
- Schedule retries from the manifest retry policy.
- Move exhausted deliveries into the in-memory DLQ and append a copy to `trigger.dlq`.
- When the dispatch is a replay, emit a `replay_chain` action-graph edge linking the new trigger node back to the original event id.
The dispatcher keeps per-thread stats for:
- in-flight dispatch count
- retry queue depth
- DLQ depth
harn doctor surfaces that snapshot next to the trigger registry view.
Handler URI resolution
Manifest handler URIs support three forms:
- bare/local function name: `handler = "on_issue"` or `handler = "handlers::on_issue"`
- remote A2A target: `handler = "a2a://reviewer.prod/triage"`
- worker queue target: `handler = "worker://triage-queue"`
By the time the dispatcher sees a manifest-installed binding, local function
handlers have already been resolved to concrete VmClosure values through the
same export-loading path used by manifest hooks and trigger predicates.
The dispatcher still re-normalizes those shapes internally so it can emit a stable handler kind and target URI in lifecycle logs and action-graph nodes.
Retry policy
Bindings carry a normalized TriggerRetryConfig:
- `Svix`
- `Linear { delay_ms }`
- `Exponential { base_ms, cap_ms }`
The default retry budget is 7 total attempts.
The Svix schedule is:
immediate -> 5s -> 5m -> 30m -> 2h -> 5h -> 10h -> 10h
The last slot saturates, so attempts beyond the published vector continue to wait 10 hours unless a future manifest surface narrows that policy.
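A saturating lookup like the following Python sketch reproduces that schedule; the delay values are transcribed from the vector above:

```python
# Saturating Svix-style retry schedule: attempts beyond the published
# vector keep reusing the last (10h) delay.
SVIX_DELAYS_SECS = [0, 5, 5 * 60, 30 * 60, 2 * 3600, 5 * 3600, 10 * 3600, 10 * 3600]

def retry_delay(attempt: int) -> int:
    """Delay in seconds before the given 1-based attempt."""
    index = min(attempt - 1, len(SVIX_DELAYS_SECS) - 1)
    return SVIX_DELAYS_SECS[index]

print(retry_delay(1))   # 0 (immediate)
print(retry_delay(4))   # 1800 (30m)
print(retry_delay(12))  # 36000 (saturates at 10h)
```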
Cancellation
Dispatcher shutdown is cooperative:
- a shutdown signal flips the active per-dispatch VM cancel tokens immediately
- sleeping retry waits listen for the shared shutdown broadcast and abort early
- local handlers observe cancellation through the existing VM
install_cancel_token(...)path and exit on the next instruction boundary
This keeps the trigger runtime aligned with the orchestrator shutdown model without inventing a second cancellation mechanism.
Event-log topics
The dispatcher uses the shared EventLog instead of a parallel queue layer:
- `trigger.inbox`
- `trigger.outbox`
- `trigger.attempts`
- `trigger.dlq`
- `triggers.lifecycle`
- `observability.action_graph`
triggers.lifecycle now includes dispatcher-specific lifecycle records:
- `DispatchStarted`
- `DispatchSucceeded`
- `DispatchFailed`
- `RetryScheduled`
- `DlqMoved`
Action-graph updates
Dispatcher streaming closes the local-handler portion of the trigger action-graph deferral:
- node kinds: `trigger`, `predicate`, `dispatch`, `retry`, `dlq`
- edge kinds: `trigger_dispatch`, `predicate_gate`, `retry`, `dlq_move`
Each update is appended to observability.action_graph using the shared
RunActionGraphNodeRecord / RunActionGraphEdgeRecord schema so the portal
and any other subscriber can consume dispatcher traces without special-casing a
separate payload format.
Replay dispatches add one more edge kind:
- `replay_chain`
The portal renders that edge as the visible link from the replayed trigger event back to the original event id.
Current MVP limits
- `a2a://...` returns `DispatchError::NotImplemented` and points at O-04 #181
- `worker://...` returns `DispatchError::NotImplemented` and points at O-05 #182
- DLQ storage is in-memory plus event-log append; durable replay remains follow-up work
Trigger Observability In The Action Graph
Harn now projects dispatcher-independent trigger activity into persisted run
observability. This lands the first half of issue #163: trigger and
predicate nodes, plus the matching trigger_dispatch and predicate_gate
edges.
What lands in this change
- A synthetic `trigger` node is added when a run carries a `trigger_event` envelope in `run.metadata`.
- Workflow `condition` stages render as `predicate` nodes in `observability.action_graph_nodes`.
- Entry edges from the trigger node into the workflow render as `trigger_dispatch`.
- Transitions leaving a predicate render as `predicate_gate`.
- `trace_id` propagates from the `TriggerEvent` onto the synthetic trigger node and every downstream action-graph node derived from that run.
The runtime also streams the derived graph onto the shared event-log topic
observability.action_graph whenever a run record is persisted. This reuses
the generalized EventLog infrastructure instead of a parallel observability
bus.
Current shape
This scoped change is intentionally limited to the dispatcher-independent surface:
- Landed here: `trigger` and `predicate` node kinds.
- Deferred to T-06: `dispatch`, `a2a_hop`, `worker_enqueue`, and `dlq`.
- Deferred to T-06: portal replay controls and dispatcher-coupled UI work.
- Deferred to T-06: A2A `trace_id` header propagation.
Example
When a workflow is started with a trigger_event option, the persisted run
record will include observability nodes like:
{
"kind": "trigger",
"label": "cron:tick",
"trace_id": "trace_123"
}
and:
{
"kind": "predicate",
"label": "gate",
"trace_id": "trace_123"
}
with edges such as:
{"kind": "trigger_dispatch", "from_id": "trigger:...", "to_id": "stage:..."}
{"kind": "predicate_gate", "label": "true"}
The portal does not yet render specialized UI for these nodes in this PR; it will consume the shared event-log topic in the dispatcher follow-up.
Connector authoring
Custom connectors implement the harn_vm::connectors::Connector trait and
plug into a ConnectorRegistry at orchestrator startup. The initial surface
lives in crates/harn-vm/src/connectors/ because the supporting abstractions
it depends on today already live in harn-vm:
- `EventLog` for audit and durable event plumbing
- `SecretProvider` for signing secrets and outbound tokens
- `TriggerEvent` for the normalized inbound envelope
If the connector ecosystem grows large enough, the module can be extracted into a dedicated crate later without changing the core trait contract.
Provider catalog
Connectors should treat the runtime ProviderCatalog as the authoritative
discovery surface for provider metadata. Each provider entry carries:
- the normalized payload schema name exposed through `std/triggers`
- supported trigger kinds such as `webhook` or `cron`
- outbound method names (empty today for the built-in providers)
- required secrets, including the namespace each secret must live under
- signature verification strategy metadata
- runtime connector metadata indicating whether the provider is backed by a built-in connector or a placeholder implementation
Harn also exposes that same catalog to scripts through
import "std/triggers" and list_providers(), so connector metadata has one
runtime-facing source instead of separate registry and docs tables.
Implementing a connector
A connector implementation owns two concerns:
- Inbound normalization: verify the provider request, preserve the raw bytes, and normalize into `TriggerEvent`.
- Outbound callbacks: expose provider APIs through a `ConnectorClient`.
The runtime-facing surface is:
use std::sync::Arc;

use async_trait::async_trait;
use harn_vm::connectors::{
    Connector, ConnectorClient, ConnectorCtx, ConnectorError, ProviderPayloadSchema,
    RawInbound, TriggerBinding, TriggerKind,
};
use harn_vm::{ProviderId, TriggerEvent};
use serde_json::Value as JsonValue;

struct ExampleConnector {
    provider_id: ProviderId,
    kinds: Vec<TriggerKind>,
    client: Arc<ExampleClient>,
}

struct ExampleClient;

#[async_trait]
impl ConnectorClient for ExampleClient {
    async fn call(
        &self,
        method: &str,
        args: JsonValue,
    ) -> Result<JsonValue, harn_vm::ClientError> {
        let _ = (method, args);
        Ok(JsonValue::Null)
    }
}

#[async_trait]
impl Connector for ExampleConnector {
    fn provider_id(&self) -> &ProviderId {
        &self.provider_id
    }

    fn kinds(&self) -> &[TriggerKind] {
        &self.kinds
    }

    async fn init(&mut self, _ctx: ConnectorCtx) -> Result<(), ConnectorError> {
        Ok(())
    }

    async fn activate(
        &self,
        _bindings: &[TriggerBinding],
    ) -> Result<harn_vm::ActivationHandle, ConnectorError> {
        Ok(harn_vm::ActivationHandle::new(self.provider_id.clone(), 0))
    }

    fn normalize_inbound(&self, raw: RawInbound) -> Result<TriggerEvent, ConnectorError> {
        let _payload = raw.json_body()?;
        todo!("map the provider request into TriggerEvent")
    }

    fn payload_schema(&self) -> ProviderPayloadSchema {
        ProviderPayloadSchema::named("ExamplePayload")
    }

    fn client(&self) -> Arc<dyn ConnectorClient> {
        self.client.clone()
    }
}
HMAC verification helper
Webhook-style connectors should reuse
harn_vm::connectors::verify_hmac_signed(...) instead of open-coding HMAC
checks. The helper enforces the non-negotiable rules from issue #167:
- verification happens against the raw request body bytes
- signature comparisons use constant-time equality
- timestamped schemes reject outside a caller-provided window
- rejection paths write an audit event to the `audit.signature_verify` topic
The helper currently supports the three MVP HMAC header styles needed by the planned connector tickets:
- GitHub: `X-Hub-Signature-256: sha256=<hex>`
- Stripe: `Stripe-Signature: t=<unix>,v1=<hex>[,v1=<hex>...]`
- Standard Webhooks: `webhook-id`, `webhook-timestamp`, and `webhook-signature: v1,<base64>`
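For intuition, here is an illustrative Python re-implementation of those rules for the GitHub-style header, plus the timestamp-window check used by the timestamped schemes. This is a sketch of the contract, not the actual `verify_hmac_signed` helper, and it omits the audit append:

```python
# Sketch of raw-body HMAC verification with constant-time comparison,
# shown for the GitHub-style "X-Hub-Signature-256: sha256=<hex>" header.
import hashlib
import hmac
import time

def verify_github_style(raw_body: bytes, header: str, secret: bytes) -> bool:
    if not header.startswith("sha256="):
        return False
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # constant-time equality, never `==`, to avoid timing side channels
    return hmac.compare_digest(header[len("sha256="):], expected)

def within_window(sent_at: int, tolerance_secs: int = 300, now=None) -> bool:
    # timestamped schemes (Stripe, Standard Webhooks) also reject stale deliveries
    now = int(time.time()) if now is None else now
    return abs(now - sent_at) <= tolerance_secs

body = b'{"action":"opened"}'
secret = b"webhook-secret"
sig = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_github_style(body, sig, secret))  # True
print(within_window(0, now=301))               # False: outside the 5-minute window
```

Note that verification runs against the raw bytes; re-serializing a parsed body would break signatures that depend on exact whitespace.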
Rate limiting
Connector clients should acquire outbound permits through the shared
RateLimiterFactory. The current implementation is intentionally small: a
process-local token bucket keyed by (provider_id, scope_key). That keeps the
first landing trait-pure while giving upcoming provider clients one place to
enforce per-installation or per-tenant quotas.
What is deliberately not here yet
This foundation PR does not define:
- outbound stdlib client wrappers for connector-specific APIs
- third-party manifest ABI for external connector packages
Those land in follow-up tickets once the shared trait, provider catalog, runtime registry, audit, and verification primitives are in place.
Trigger registry
The trigger registry is the runtime-owned binding table that turns
validated [[triggers]] manifest entries into live, versioned trigger
bindings inside a VM thread.
Ownership model
- The registry is thread-local, following the same pattern as the runtime hook table. Each VM thread owns its own bindings and does not share `Rc<VmClosure>` values across threads.
- Cross-thread coordination is pushed down to the event-log layer. The trigger registry only tracks the bindings that the current VM can execute.
- Manifest parsing and validation still live in `harn-cli`. Once handlers and predicates resolve, the CLI passes a compact binding spec into `harn-vm`, which owns lifecycle and metrics.
Binding shape
Each live binding stores:
- logical trigger id
- monotonically increasing version
- provider and trigger kind
- resolved handler target (`local`, `a2a`, or `worker`)
- optional resolved `when` predicate
- lifecycle state: `registering`, `active`, `draining`, `terminated`
- metrics snapshot: `received`, `dispatched`, `failed`, `dlq`, `in_flight`, and last-received timestamp
- manifest provenance for diagnostics
Hot reload keeps the logical id stable and bumps the binding version whenever the manifest definition fingerprint changes.
Lifecycle
Manifest install performs a reconcile step against the current thread-local registry:
- New trigger id: register version `1`, emit `registering`, then `active`.
- Existing trigger id with unchanged definition: keep the current active binding.
- Existing trigger id with changed definition: mark the old binding `draining`, register a new active version, and keep both bindings visible until the old version reaches `in_flight == 0`.
- Removed manifest trigger: mark the live binding `draining`. Once `in_flight == 0`, it transitions to `terminated`.
Dynamic registrations follow the same state machine, but they are not reconciled by manifest reload.
Metrics and draining
- `begin_in_flight(id, version)` increments `received` and `in_flight` and updates `last_received_ms`.
- `finish_in_flight(id, version, outcome)` decrements `in_flight` and increments one of `dispatched`, `failed`, or `dlq`.
- A draining binding becomes terminated only after the in-flight count returns to zero.
This keeps hot reload safe: events that started under version N
complete under version N, while new events route to version N+1.
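The version-draining rule can be modeled with a small Python sketch; the `Registry` and `Binding` classes are illustrative, not the runtime types:

```python
# Sketch of hot-reload draining: events finish under the version they
# started on, and a draining version terminates once in_flight hits zero.
class Binding:
    def __init__(self, version):
        self.version = version
        self.state = "active"
        self.in_flight = 0

class Registry:
    def __init__(self):
        self.versions = {1: Binding(1)}
        self.active = 1

    def reload(self):
        """Definition changed: drain version N, activate version N+1."""
        self.versions[self.active].state = "draining"
        self.active += 1
        self.versions[self.active] = Binding(self.active)

    def begin(self):
        binding = self.versions[self.active]  # new events route to the active version
        binding.in_flight += 1
        return binding.version

    def finish(self, version):
        binding = self.versions[version]
        binding.in_flight -= 1
        if binding.state == "draining" and binding.in_flight == 0:
            binding.state = "terminated"

reg = Registry()
v1 = reg.begin()              # event starts under version 1
reg.reload()                  # hot reload: v1 draining, v2 active
print(reg.begin())            # 2: new events route to version 2
reg.finish(v1)
print(reg.versions[1].state)  # terminated once v1 drains
```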
Event-log integration
When an active event log is installed for the VM thread, every lifecycle
transition appends a record to the triggers.lifecycle topic. The event
payload includes:
- logical trigger id
- `id@vN` binding key
- provider
- trigger kind
- handler kind
- transition `from_state` and `to_state`
harn doctor uses the installed registry snapshot to report the live
bindings it sees after manifest load, including state, version, and
zeroed metrics for newly installed triggers.
The trigger stdlib’s manual replay path also depends on the registry:
- `trigger_fire(...)` records the synthetic event on `triggers.events`
- `trigger_replay(...)` looks up that recorded envelope plus any pending stdlib DLQ summary entry on `triggers.dlq`
- the wrapper then re-enters the dispatcher against the resolved live binding version and threads `replay_of_event_id` through dispatch observability
Test Harness
harn_vm::triggers::test_util now provides the shared trigger-system
test harness used by both Rust unit tests and .harn conformance
fixtures. The harness owns:
- a reusable mock clock with wall-clock and monotonic hooks
- a recording connector sink/registry for emitted normalized events
- named fixture runners that cover cron, webhook verification, retry/backoff, DLQ/replay, dedupe, rate limiting, cost guards, crash recovery, hot reload, and dead-man alerts
The script-facing entrypoint is the trigger_test_harness(...) builtin,
which returns a structured report for the selected fixture instead of
requiring each conformance script to rebuild connector state by hand.
Cron connector
The cron connector is Harn’s in-process scheduler for time-triggered work. It
implements the shared Connector trait, evaluates cron expressions in an IANA
time zone, and persists the last-fired boundary for each trigger in the shared
EventLog.
Manifest shape
Cron triggers live under [[triggers]] and keep their schedule-specific
settings inline with the rest of the trigger manifest entry:
[[triggers]]
id = "daily-digest"
kind = "cron"
provider = "cron"
match = { events = ["cron.tick"] }
handler = "worker://digest-queue"
schedule = "0 9 * * *"
timezone = "America/New_York"
catchup_mode = "skip"
Supported fields:
- `schedule`: five-field cron expression parsed by `croner`
- `timezone`: IANA time zone name such as `America/New_York`
- `catchup_mode`: `skip` (default), `all`, or `latest`
Offset literals such as +02:00 and UTC-5 are rejected at manifest-load
time. Use a named zone instead so DST transitions can be evaluated correctly.
DST semantics
The cron connector intentionally favors stable wall-clock semantics over trying to synthesize impossible local times:
- Fall-back overlaps fire a matching wall-clock slot once, even though the local hour appears twice.
- Spring-forward gaps do not invent a firing for a missing local time. A schedule like `0 2 * * *` simply does not fire on the DST transition day when `02:00` is skipped.
- Named zones continue to track the intended local wall time across standard and daylight time. Midnight in `America/New_York` fires at `05:00Z` in winter and `04:00Z` in summer.
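The winter/summer offsets are easy to confirm with any IANA-aware library; this check uses Python's standard-library `zoneinfo`:

```python
# Midnight in America/New_York maps to 05:00Z under EST (winter) and
# 04:00Z under EDT (summer); the dates here are arbitrary examples.
from datetime import datetime
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")
winter = datetime(2024, 1, 15, 0, 0, tzinfo=ny)
summer = datetime(2024, 7, 15, 0, 0, tzinfo=ny)
print(winter.astimezone(ZoneInfo("UTC")).hour)  # 5
print(summer.astimezone(ZoneInfo("UTC")).hour)  # 4
```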
Durable state
Every successful firing appends the latest scheduled boundary for that trigger
to the EventLog topic connectors.cron.state. On restart, the connector reloads
the latest entry for each trigger_id and uses it to determine whether any
ticks were missed while the orchestrator was down.
The current implementation persists:
- `trigger_id`
- `last_fired_at`
This keeps recovery append-only and backend-agnostic across the memory, file, and SQLite EventLog implementations.
Catch-up modes
Catch-up behavior is evaluated from the persisted last_fired_at boundary to
the connector’s current clock on activation.
- `skip`: drop missed ticks and resume from “now”
- `all`: replay every missed scheduled tick in chronological order
- `latest`: replay only the most recent missed scheduled tick
Catch-up reuses the original scheduled boundary as occurred_at, so downstream
consumers can distinguish between when a job was due and when the process
actually resumed.
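Given a precomputed list of missed scheduled boundaries, the three modes reduce to a small selection function, sketched here in Python:

```python
# Sketch of catch-up selection over missed ticks; computing the missed
# boundaries from the cron expression is assumed to have happened already.
def catch_up(missed_ticks, mode="skip"):
    if mode == "skip":
        return []                 # drop missed ticks, resume from now
    if mode == "all":
        return missed_ticks       # replay every tick in order
    if mode == "latest":
        return missed_ticks[-1:]  # replay only the most recent tick
    raise ValueError(f"unknown catchup_mode: {mode}")

missed = ["2024-05-01T09:00Z", "2024-05-02T09:00Z", "2024-05-03T09:00Z"]
print(catch_up(missed, "skip"))    # []
print(catch_up(missed, "latest"))  # ['2024-05-03T09:00Z']
```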
Event output
Until the broader trigger dispatcher lands, cron firings are emitted as
serialized TriggerEvent envelopes on the EventLog topic connectors.cron.tick
with provider cron, kind tick, and a CronEventPayload that includes:
- `cron_id`
- `schedule`
- `tick_at`
- `raw.catchup`
- `raw.timezone`
This keeps the connector testable today and preserves a normalized event shape for the follow-up dispatcher work.
GitHub App connector
GitHubConnector is Harn’s built-in GitHub App integration for inbound webhook
events plus outbound GitHub REST calls authenticated as an installation.
The MVP scope in #170 is intentionally narrow:
- inbound GitHub webhook verification with `X-Hub-Signature-256`
- strongly typed payload narrowing for the six orchestration-relevant event families: `issues`, `pull_request`, `issue_comment`, `pull_request_review`, `push`, and `workflow_run`
- outbound installation-token lifecycle for GitHub App auth
- seven outbound helper methods exposed through `std/connectors/github`
Guided install / OAuth setup remains deferred to C-10. This landing supports the manual-config path now: provide the App id, installation id, private key, and webhook secret through the orchestrator config + secret providers.
Inbound webhook bindings
Configure GitHub as a provider = "github" webhook trigger:
[[triggers]]
id = "github-prs"
kind = "webhook"
provider = "github"
match = { path = "/hooks/github" }
handler = "handlers::on_github"
dedupe_key = "event.dedupe_key"
secrets = { signing_secret = "github/webhook-secret" }
The connector verifies X-Hub-Signature-256 against the raw request body using
the shared verify_hmac_signed(...) helper from the generic webhook path. It
does not duplicate HMAC logic. Successful deliveries normalize into
TriggerEvent with:
- `kind` from `X-GitHub-Event`
- `dedupe_key` from `X-GitHub-Delivery`
- `signature_status = { state: "verified" }`
- `provider_payload = GitHubEventPayload`
GitHubEventPayload is narrowed into the six MVP event families. For example,
an issues delivery exposes payload.issue, while pull_request_review
exposes both payload.review and payload.pull_request.
Outbound configuration
Outbound helpers authenticate as a GitHub App installation. Required config:
- `app_id`
- `installation_id`
- `private_key_pem` or `private_key_secret`
Optional config:
- `api_base_url` for GitHub Enterprise or tests; defaults to `https://api.github.com`
Recommended production shape:
import { configure } from "std/connectors/github"
configure({
app_id: 12345,
installation_id: 67890,
private_key_secret: "github/app-private-key",
})
For tests and local fixtures, private_key_pem can be passed inline.
Installation-token lifecycle
The connector follows the GitHub App installation flow:
- Mint a short-lived App JWT (`RS256`, `iss = app_id`) from the configured private key.
- Exchange it at `POST /app/installations/{installation_id}/access_tokens`.
- Cache the returned installation token per installation.
- Refresh lazily a little before expiry, or immediately after a `401`.
The in-process cache refreshes roughly every 55 minutes, comfortably ahead of GitHub's one-hour token validity. Token fetches still flow through the shared secret-provider-backed connector context, and outbound requests are scoped through the connector `RateLimiterFactory`.
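The refresh-before-expiry behavior can be sketched as a small cache; the 55-minute horizon comes from the text, while the class, `fetch` callable, and method names are hypothetical:

```python
# Sketch of a lazy refresh-before-expiry installation-token cache.
import time

class TokenCache:
    REFRESH_AFTER = 55 * 60  # refresh ~5 minutes before the 1h expiry

    def __init__(self, fetch):
        self.fetch = fetch  # callable minting a fresh installation token
        self.tokens = {}    # installation_id -> (token, fetched_at)

    def get(self, installation_id, now=None):
        now = time.time() if now is None else now
        cached = self.tokens.get(installation_id)
        if cached and now - cached[1] < self.REFRESH_AFTER:
            return cached[0]
        token = self.fetch(installation_id)  # lazy re-mint
        self.tokens[installation_id] = (token, now)
        return token

    def invalidate(self, installation_id):
        # called after a 401 so the next get() re-mints immediately
        self.tokens.pop(installation_id, None)

mints = []
cache = TokenCache(lambda i: mints.append(i) or f"tok-{len(mints)}")
print(cache.get(67890, now=0))     # tok-1
print(cache.get(67890, now=60))    # tok-1, still fresh
print(cache.get(67890, now=3400))  # tok-2, past the 55-minute horizon
```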
Outbound helpers
Import from std/connectors/github:
import {
add_labels,
comment,
create_issue,
get_pr_diff,
list_stale_prs,
merge_pr,
request_review,
} from "std/connectors/github"
Available methods:
- `comment(issue_url, body, options = nil)`
- `add_labels(issue_url, labels, options = nil)`
- `request_review(pr_url, reviewers, options = nil)`
- `merge_pr(pr_url, options = nil)`
- `list_stale_prs(repo, days, options = nil)`
- `get_pr_diff(pr_url, options = nil)`
- `create_issue(repo, title, body = nil, labels = nil, options = nil)`
All helpers accept the same auth/config fields through options, but
configure(...) is the intended shared setup path.
Example:
import {
comment,
configure,
list_stale_prs,
merge_pr,
} from "std/connectors/github"
pipeline default() {
configure({
app_id: 12345,
installation_id: 67890,
private_key_secret: "github/app-private-key",
})
let stale = list_stale_prs("acme/api", 14)
if stale.total_count > 0 {
let pr = stale.items[0]
comment("https://github.com/acme/api/issues/" + to_string(pr.number), "Taking a look.")
}
let merged = merge_pr(
"https://github.com/acme/api/pull/42",
{merge_method: "squash", admin_override: true},
)
println(merged.merged)
}
admin_override: true records that the caller requested an override and
annotates the returned JSON with admin_override_requested = true. GitHub’s
REST merge endpoint does not currently expose a distinct override flag, so the
connector still uses the standard merge call.
Rate limiting
The connector uses the shared RateLimiterFactory with a per-installation
scope key before each outbound request. It also reacts to GitHub rate-limit
responses:
- retries once after 429 using Retry-After or X-RateLimit-Reset
- invalidates cached tokens and re-mints on 401
- emits observations to the connectors.github.rate_limit event-log topic
This keeps the MVP aligned with the generic connector rate-limit contract without introducing a second bespoke limiter.
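The single-retry behavior on 429 can be sketched as follows. This is an illustrative sketch, not the connector's code; http_post, sleep, and the response shape are hypothetical stand-ins.

```harn
// Illustrative sketch of the retry-once-on-429 policy (hypothetical helpers).
fn github_request(url, body) {
    let resp = http_post(url, body)
    if resp.status == 429 {
        // Prefer Retry-After; fall back to X-RateLimit-Reset.
        let wait = resp.headers["Retry-After"] ?? resp.headers["X-RateLimit-Reset"]
        sleep(to_int(wait))
        return http_post(url, body)   // retry exactly once
    }
    return resp
}
```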
Generic webhook connector
GenericWebhookConnector is the first concrete inbound connector built on top
of the C-01 Connector trait. It accepts generic HTTP webhook deliveries,
verifies supported HMAC signature conventions against the raw request body, and
normalizes the delivery into a TriggerEvent with the built-in
GenericWebhookPayload shape.
The current implementation is intentionally small:
- activation-only; the O-02 HTTP listener wires up request routing later
- raw-body verification for Standard Webhooks, Stripe-style, and GitHub-style signatures
- TriggerEvent normalization with header redaction and provider payload preservation
- process-local dedupe stub, keyed by the manifest dedupe_key opt-in, until the durable trigger inbox lands
Manifest shape
[[triggers]]
id = "incoming-webhook"
kind = "webhook"
provider = "webhook"
match = { path = "/hooks/incoming" }
handler = "handlers::on_webhook"
dedupe_key = "event.dedupe_key"
secrets = { signing_secret = "webhook/incoming" }
[triggers.webhook]
signature_scheme = "standard" # "standard" | "stripe" | "github"
timestamp_tolerance_secs = 300
source = "incoming"
signature_scheme defaults to "standard" when omitted. Standard Webhooks and
Stripe-style signatures default to a 5-minute timestamp tolerance. GitHub-style
signatures are untimestamped and therefore ignore timestamp skew.
Supported signature conventions
The connector delegates signature checks to
harn_vm::connectors::verify_hmac_signed(...), so it inherits the shared
verification rules from C-01:
- verify against the raw inbound bytes, not a reparsed body
- compare signatures in constant time
- enforce a timestamp window for timestamped schemes
- append signature failures to the audit.signature_verify event-log topic
Supported variants:
- Standard Webhooks: webhook-id, webhook-timestamp, webhook-signature: v1,<base64>
- Stripe-style: Stripe-Signature: t=<unix>,v1=<hex>[,v1=<hex>...]
- GitHub-style: X-Hub-Signature-256: sha256=<hex>
Normalized event fields
For successful deliveries the connector produces:
- provider = "webhook"
- kind from RawInbound.kind, then X-GitHub-Event, then the payload's type/event, else "webhook"
- dedupe_key from the provider-native delivery identifier: webhook-id, Stripe event id, or X-GitHub-Delivery
- signature_status = { state: "verified" }
- provider_payload = GenericWebhookPayload
GenericWebhookPayload.raw keeps parsed JSON when the body is JSON. When the
payload is not valid JSON, the connector preserves the bytes as:
{
"raw_base64": "<base64-encoded body>",
"raw_utf8": "optional utf-8 view"
}
GenericWebhookPayload.source comes from X-Webhook-Source when present, or
from the binding’s optional webhook.source override.
Dedupe
If the trigger manifest declares dedupe_key, the connector records the
normalized event.dedupe_key in the current inbox dedupe stub and rejects
replays for the same binding. This is process-local today; durable inbox-backed
dedupe is still deferred to T-09.
Activation and listener integration
The connector’s activate() hook validates the binding config and reserves
unique match.path values across active bindings. Because O-02 is still
outstanding, request routing is not implemented here. Until the listener lands:
- a single active binding can call normalize_inbound(...) directly
- multiple active bindings must pass the selected binding_id in RawInbound.metadata.binding_id
Notes and follow-up
- Signature failures are audited even when normalization returns an error.
- Production TLS handling is owned by the eventual listener, not this connector.
- Streaming request bodies larger than 10 MiB is still a follow-up item.
Cookbook
Practical patterns for building agents and pipelines in Harn. Each recipe is self-contained with a short explanation and working code.
1. Basic LLM call
Single-shot prompt with a system message. Set ANTHROPIC_API_KEY (or the
appropriate key for your provider) before running.
pipeline default(task) {
let response = llm_call(
"Explain the builder pattern in three sentences.",
"You are a software engineering tutor. Be concise."
)
println(response)
}
To use a different provider or model, pass an options dict:
pipeline default(task) {
let response = llm_call(
"Explain the builder pattern in three sentences.",
"You are a software engineering tutor. Be concise.",
{provider: "openai", model: "gpt-4o", max_tokens: 512}
)
println(response)
}
2. Agent loop with tools
Register tools with JSON Schema-compatible definitions, generate a system prompt that describes them, then let the LLM call tools in a loop.
pipeline default(task) {
var tools = tool_registry()
tools = tool_define(tools, "read", "Read a file from disk", {
parameters: {path: {type: "string", description: "Path to read"}},
returns: {type: "string"},
handler: { path -> return read_file(path) }
})
tools = tool_define(tools, "search", "Search code for a pattern", {
parameters: {query: {type: "string", description: "Query to search"}},
returns: {type: "string"},
handler: { query ->
let result = shell("grep -r '${query}' src/ || true")
return result.stdout
}
})
let system = tool_prompt(tools)
var messages = task
var done = false
var iterations = 0
while !done && iterations < 10 {
let response = llm_call(messages, system)
let calls = tool_parse_call(response)
if calls.count() == 0 {
println(response)
done = true
} else {
var tool_output = ""
for call in calls {
let t = tool_find(tools, call.name)
let handler = t.handler
// Each tool in this recipe declares a single parameter, so pass
// the first (and only) argument value to the handler.
let result = handler(call.arguments[call.arguments.keys()[0]])
tool_output = tool_output + tool_format_result(call.name, result)
}
messages = tool_output
}
iterations = iterations + 1
}
}
3. Parallel tool execution
Run multiple independent operations concurrently with parallel each.
Results preserve the original list order.
pipeline default(task) {
let files = ["src/main.rs", "src/lib.rs", "src/utils.rs"]
let reviews = parallel each files { file ->
let content = read_file(file)
llm_call(
"Review this code for bugs and suggest fixes:\n\n${content}",
"You are a senior code reviewer. Be specific."
)
}
for i in 0 to files.count exclusive {
println("=== ${files[i]} ===")
println(reviews[i])
}
}
Use parallel when you need to run N indexed tasks rather than mapping
over a list:
pipeline default(task) {
let prompts = [
"Write a haiku about Rust",
"Write a haiku about concurrency",
"Write a haiku about debugging"
]
let results = parallel(prompts.count) { i ->
llm_call(prompts[i], "You are a poet.")
}
for r in results {
println(r)
}
}
4. MCP client integration
Connect to an MCP-compatible tool server, list available tools, and call them. This example uses the filesystem MCP server.
pipeline default(task) {
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
// Check connection
let info = mcp_server_info(client)
println("Connected to: ${info.name}")
// List available tools
let tools = mcp_list_tools(client)
for t in tools {
println("Tool: ${t.name} - ${t.description}")
}
// Write a file, then read it back
mcp_call(client, "write_file", {path: "/tmp/hello.txt", content: "Hello from Harn!"})
let content = mcp_call(client, "read_file", {path: "/tmp/hello.txt"})
println("File content: ${content}")
// List directory
let entries = mcp_call(client, "list_directory", {path: "/tmp"})
println(entries)
mcp_disconnect(client)
}
You can also declare MCP servers in harn.toml for automatic connection.
See MCP and ACP Integration for details.
For remote HTTP MCP servers, authorize once with the CLI and then reuse the
stored token automatically from harn.toml:
harn mcp redirect-uri
harn mcp login https://mcp.notion.com/mcp --scope "read write"
5. Filtering with in and not in
Use the in and not in operators to filter collections by membership.
pipeline default(task) {
let allowed_extensions = [".rs", ".harn", ".toml"]
let files = list_dir("src")
// Filter files to only allowed extensions
let relevant = files.filter({ f ->
let ext = extname(f)
ext in allowed_extensions
})
println("Relevant files: ${relevant}")
// Exclude specific keys from a config dict
let config = {host: "localhost", port: 8080, debug: true, secret: "abc"}
let sensitive = ["secret", "password"]
let safe = {}
for entry in config {
if entry.key not in sensitive {
println("${entry.key}: ${entry.value}")
}
}
}
The in operator works with lists, strings (substring test), dicts
(key membership), and sets.
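A few quick illustrations of those membership forms:

```harn
pipeline default(task) {
    println("rn" in "harn")            // string: substring test
    println(2 in [1, 2, 3])            // list: element membership
    let cfg = {host: "localhost", port: 8080}
    println("port" in cfg)             // dict: key membership
    println(8080 not in [80, 443])     // negated membership
}
```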
6. Pipeline composition
Split agent logic across files and compose pipelines using imports and inheritance.
lib/context.harn – shared context-gathering logic:
fn gather_context(task) {
let readme = read_file("README.md")
return {
task: task,
readme: readme,
timestamp: timestamp()
}
}
lib/review.harn – a reusable review pipeline:
import "lib/context"
pipeline review(task) {
let ctx = gather_context(task)
let prompt = "Review this project.\n\nREADME:\n${ctx.readme}\n\nTask: ${ctx.task}"
let result = llm_call(prompt, "You are a code reviewer.")
println(result)
}
main.harn – extend and customize:
import "lib/review"
pipeline default(task) extends review {
override setup() {
println("Starting custom review pipeline")
}
}
7. Error handling in agent loops
Wrap LLM calls in try/catch with retry to handle transient failures.
Use typed catch for structured error handling.
pipeline default(task) {
enum AgentError {
LlmFailure(message)
ParseFailure(raw)
Timeout(seconds)
}
fn safe_llm_call(prompt, system) {
retry 3 {
try {
let raw = llm_call(prompt, system)
let parsed = json_parse(raw)
return parsed
} catch (e) {
println("LLM call failed: ${e}")
throw AgentError.LlmFailure(to_string(e))
}
}
}
try {
let result = safe_llm_call(
"Return a JSON object with keys 'summary' and 'score'.",
"You are an evaluator. Always respond with valid JSON only."
)
println("Summary: ${result.summary}")
println("Score: ${result.score}")
} catch (e) {
// Harn supports a single catch per try; branch on the error type here.
if type_of(e) == "enum" {
match e.variant {
"LlmFailure" -> { println("LLM failed after retries: ${e.fields[0]}") }
"ParseFailure" -> { println("Could not parse LLM output: ${e.fields[0]}") }
"Timeout" -> { println("Timed out after ${e.fields[0]}s") }
}
} else {
println("Unexpected error: ${e}")
}
}
}
8. Channel-based coordination
Use channels to coordinate between spawned tasks. One task produces work, another consumes it.
pipeline default(task) {
let ch = channel("work", 10)
let results_ch = channel("results", 10)
// Producer: send work items
let producer = spawn {
let items = ["item_a", "item_b", "item_c"]
for item in items {
send(ch, item)
}
send(ch, "DONE")
}
// Consumer: process work items
let consumer = spawn {
var processed = 0
var running = true
while running {
let item = receive(ch)
if item == "DONE" {
running = false
} else {
let result = "processed: ${item}"
send(results_ch, result)
processed = processed + 1
}
}
send(results_ch, "COMPLETE:${processed}")
}
await(producer)
await(consumer)
// Collect results
var collecting = true
while collecting {
let msg = receive(results_ch)
if msg.starts_with("COMPLETE:") {
println(msg)
collecting = false
} else {
println(msg)
}
}
}
9. Context building pattern
Gather context from multiple sources, merge it into a single dict, and pass it to an LLM.
pipeline default(task) {
fn read_or_empty(path) {
try {
return read_file(path)
} catch (e) {
return ""
}
}
// Gather context from multiple sources in parallel
let sources = ["README.md", "CHANGELOG.md", "docs/architecture.md"]
let contents = parallel each sources { path ->
{path: path, content: read_or_empty(path)}
}
// Build a merged context dict
var context = {task: task, files: {}}
for item in contents {
if item.content != "" {
context = context.merge({files: context.files.merge({[item.path]: item.content})})
}
}
// Format context for the LLM
var prompt = "Task: ${task}\n\n"
for entry in context.files {
prompt += "=== ${entry.key} ===\n${entry.value}\n\n"
}
let result = llm_call(prompt, "You are a helpful assistant. Use the provided files as context.")
println(result)
}
10. Structured output parsing
Ask the LLM for JSON output, parse it with json_parse, and validate
the structure before using it.
pipeline default(task) {
let system = """
You are a task planner. Given a task description, break it into steps.
Respond with ONLY a JSON array of objects, each with "step" (string) and
"priority" (int 1-5). No other text.
"""
fn get_plan(task_desc) {
retry 3 {
let raw = llm_call(task_desc, system)
let parsed = json_parse(raw)
// Validate structure
guard type_of(parsed) == "list" else {
throw "Expected a JSON array, got: ${type_of(parsed)}"
}
for item in parsed {
guard item.has("step") && item.has("priority") else {
throw "Missing required fields in: ${json_stringify(item)}"
}
}
return parsed
}
}
let plan = get_plan("Build a REST API for a todo app")
if plan != nil {
let high_priority = plan.filter({ s -> s.priority <= 3 })
for step in high_priority {
println("[P${step.priority}] ${step.step}")
}
} else {
println("Failed to get a valid plan after retries")
}
}
11. Sets for deduplication and membership testing
Use sets to track processed items and avoid duplicates. Sets provide
O(1)-style membership testing via set_contains and are immutable –
operations like set_add return a new set.
pipeline default(task) {
let urls = [
"https://example.com/a",
"https://example.com/b",
"https://example.com/a",
"https://example.com/c",
"https://example.com/b"
]
// Deduplicate with set(), then convert back to a list
let unique_urls = to_list(set(urls))
println("${len(unique_urls)} unique URLs out of ${len(urls)} total")
// Track which URLs have been processed
var visited = set()
for url in unique_urls {
if !set_contains(visited, url) {
println("Processing: ${url}")
visited = set_add(visited, url)
}
}
// Set operations: find overlap between two batches
let batch_a = set("task-1", "task-2", "task-3")
let batch_b = set("task-2", "task-3", "task-4")
let already_done = set_intersect(batch_a, batch_b)
let new_work = set_difference(batch_b, batch_a)
println("Overlap: ${len(already_done)}, New: ${len(new_work)}")
}
12. Typed functions with runtime enforcement
Add type annotations to function parameters for automatic runtime
validation. When a caller passes a value of the wrong type, the VM
throws a TypeError before the function body executes.
pipeline default(task) {
fn summarize(text: string, max_words: int) -> string {
let words = text.split(" ")
if words.count <= max_words {
return text
}
let truncated = words.slice(0, max_words)
return "${join(truncated, " ")}..."
}
println(summarize("The quick brown fox jumps over the lazy dog", 5))
// Catch type errors gracefully. `harn check` rejects this call statically
// before the catch can run — the example is shown for illustration only.
try {
summarize(42, "not a number")
} catch (e) {
println("Caught: ${e}")
// -> TypeError: parameter 'text' expected string, got int (42)
}
// Works with all primitive types: string, int, float, bool, list, dict, set
fn process_batch(items: list, verbose: bool) {
for item in items {
if verbose {
println("Processing: ${item}")
}
}
println("Done: ${len(items)} items")
}
process_batch(["a", "b", "c"], true)
}
13. MCP client with agent loop
Connect to an MCP server and pass its tools to an agent_loop, letting
the LLM decide which tools to call.
pipeline default(task) {
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let mcp_tool_list = mcp_list_tools(client)
// Build a tool registry from MCP tools
var tools = tool_registry()
for t in mcp_tool_list {
tools = tool_define(tools, t.name, t.description, {
parameters: t.inputSchema?.properties ?? {},
returns: {type: "string"},
handler: { args -> return mcp_call(client, t.name, args) }
})
}
let result = agent_loop(
"List all files in /tmp and read the first one.",
"You are a helpful file assistant.",
{
tools: tools,
persistent: true,
max_iterations: 10
}
)
println(result.text)
mcp_disconnect(client)
}
14. Recursive agent with tail call optimization
Tail-recursive functions are optimized by the VM, so they do not overflow the stack even across thousands of iterations. This is an advanced pattern useful for processing a queue of work items one at a time.
pipeline default(task) {
let items = ["Refactor auth module", "Add input validation", "Write unit tests"]
fn process(remaining, results) {
if remaining.count == 0 {
return results
}
let item = remaining.first
let rest = remaining.slice(1)
let result = retry 3 {
llm_call(
"Plan how to: ${item}",
"You are a senior engineer. Output a numbered list of steps."
)
}
return process(rest, results + [{task: item, plan: result}])
}
let plans = process(items, [])
for p in plans {
println("=== ${p.task} ===")
println(p.plan)
}
}
For non-LLM workloads, TCO handles deep recursion without issues:
pipeline default(task) {
fn sum_to(n, acc) {
if n <= 0 {
return acc
}
return sum_to(n - 1, acc + n)
}
println(sum_to(10000, 0))
}
15. Multi-agent delegation
Spawn worker agents for different roles and collect their results in parallel.
// Spawn workers and collect results
let agents = ["research", "analyze", "summarize"]
let results = parallel each agents { role ->
let agent = spawn_agent({name: role, system: "You are a ${role} agent."})
send_input(agent, task)
wait_agent(agent)
}
16. Parallel LLM evaluation
Evaluate multiple prompts concurrently using parallel each.
// Evaluate multiple prompts in parallel
let prompts = ["Explain X", "Explain Y", "Explain Z"]
let responses = parallel each prompts { p ->
llm_call({prompt: p})
}
17. MCP client usage
Connect to an MCP server, list tools, call one, and disconnect.
// Connect to an MCP server and call tools
let client = mcp_connect({command: "npx", args: ["-y", "some-mcp-server"]})
let tools = mcp_list_tools(client)
log("Available: ${len(tools)} tools")
let result = mcp_call(client, "tool_name", {arg: "value"})
mcp_disconnect(client)
18. Eval metrics tracking
Track quality metrics during agent execution for later analysis.
// Track quality metrics during agent execution
eval_metric("accuracy", score, {model: model_name})
let usage = llm_usage()
eval_metric("cost_tokens", usage.input_tokens + usage.output_tokens)
Tutorial: Build a code review agent
This tutorial shows a small but realistic review pipeline. The goal is not to rebuild a full IDE integration. Instead, we want a deterministic Harn program that can review a patch, inspect context, and return a concise report.
Use the companion example as a starting point:
cargo run --bin harn -- run examples/code-reviewer.harn
1. Start with a tight review prompt
The simplest useful reviewer is just an LLM call with a strong system prompt. Keep the instructions short, specific, and opinionated:
pipeline default(task) {
let system = """
You are a senior code reviewer.
Review the patch for correctness, security, maintainability, and tests.
Return:
- must-fix issues
- suggestions
- missing tests
End with a short verdict.
"""
let review = llm_call(task, system, {
temperature: 0.2,
max_tokens: 1200,
})
println(review.text)
}
This is enough when the user pastes a diff directly into task.
2. Add file context when you need it
Real review agents usually need a bit of surrounding code. The simplest route is to read a small, explicit list of files and combine them with the patch. Keep the list short so the prompt stays focused.
pipeline default(task) {
let files = ["src/main.rs", "src/lib.rs"]
var context = ""
for file in files {
context = context + "\n\n=== " + file + " ===\n" + read_file(file)
}
let review = llm_call(
"Patch:\n" + task + "\n\nContext:\n" + context,
"""
You are a strict code reviewer.
Flag correctness bugs first, then test gaps, then maintainability issues.
Do not invent missing context. If the context is insufficient, say so.
""",
{temperature: 0.2, max_tokens: 1400}
)
println(review.text)
}
If you want to review a directory tree instead, use list_dir() and
parallel each to gather files concurrently, then trim the result to the most
relevant ones before calling the model.
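That directory-tree variant might look like the following sketch, reusing list_dir(), parallel each, and slice() from the recipes above; the five-file cutoff is an arbitrary illustration.

```harn
pipeline default(task) {
    // Gather file contents concurrently, then trim before prompting.
    let files = list_dir("src")
    let contents = parallel each files { f ->
        {path: f, content: read_file(f)}
    }
    let trimmed = contents.slice(0, 5)   // keep the prompt focused
    var context = ""
    for item in trimmed {
        context = context + "\n\n=== " + item.path + " ===\n" + item.content
    }
    let review = llm_call(
        "Patch:\n" + task + "\n\nContext:\n" + context,
        "You are a strict code reviewer.",
        {temperature: 0.2}
    )
    println(review.text)
}
```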
3. Make the review measurable
Good review agents should record something observable, even if it is only a
small heuristic. Use eval_metric() to track whether the agent found issues
and how often it asked for more context.
pipeline default(task) {
let review = llm_call(
task,
"You are a code reviewer. Return a concise bullet list.",
{temperature: 0.2}
)
let has_issue = review.text.contains("issue") || review.text.contains("bug")
eval_metric("review_has_issue", has_issue)
eval_metric("review_chars", review.text.count)
println(review.text)
}
That makes the output easier to compare in harn eval runs later.
4. When to stop
Use the agent loop when the review needs to gather context, but stop once the review itself is stable. For code review, that usually means:
- inspect a small, explicit file set
- keep the system prompt short
- request concrete fixes, not a long essay
- record metrics so you can compare review quality over time
If you need a richer workflow, combine this with the eval tutorial and the
debugging tools in docs/src/debugging.md.
Tutorial: Build an MCP server
This tutorial builds a small MCP server in Harn. The same program can expose tools, static resources, resource templates, and prompts over stdio.
Use the companion example as a baseline:
cargo run --bin harn -- mcp-serve examples/mcp_server.harn
1. Register tools
Start by creating a tool registry and attaching a few tools with explicit schemas:
pipeline main(task) {
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone by name", {
params: { name: "string" },
handler: { args -> "Hello, " + args.name + "!" },
annotations: {
title: "Greeting Tool",
readOnlyHint: true,
destructiveHint: false,
}
})
tools = tool_define(tools, "add", "Add two numbers", {
params: { a: "number", b: "number" },
handler: { args -> to_string(args.a + args.b) }
})
mcp_tools(tools)
}
Keep tool names short and descriptive. The description should be written for a model, not for a human reading source code.
2. Add resources and templates
Resources are good for static content, while resource templates are better for parameterized data.
pipeline main(task) {
mcp_resource({
uri: "docs://readme",
name: "README",
mime_type: "text/markdown",
text: "# Harn MCP Demo\n\nThis server is implemented in Harn."
})
mcp_resource_template({
uri_template: "config://{key}",
name: "Configuration values",
mime_type: "text/plain",
handler: { args ->
if args.key == "version" {
"0.6.0"
} else if args.key == "name" {
"harn-demo"
} else {
"unknown key: " + args.key
}
}
})
}
That pattern is useful for docs, policy data, generated summaries, and other state you want to expose without writing a dedicated tool for each lookup.
3. Add prompts
Prompts let the client ask the server for structured guidance:
pipeline main(task) {
mcp_prompt({
name: "code_review",
description: "Review code for correctness and maintainability",
arguments: [
{ name: "code", description: "The code to review", required: true },
{ name: "language", description: "Programming language" }
],
handler: { args ->
let lang = args.language ?? "unknown"
"Please review this " + lang + " code for correctness, bugs, and tests:\n\n" + args.code
}
})
}
Prompts are a good way to standardize a client workflow while still letting the client supply the final payload.
4. Run it over stdio
Once the pipeline calls mcp_tools(), mcp_resource(), or mcp_prompt(),
launch the server with:
harn mcp-serve examples/mcp_server.harn
All user-visible output goes to stderr; the MCP transport stays on stdout. That keeps the server compatible with Claude Desktop, Cursor, and other MCP clients.
5. Keep the surface small
A good MCP server has a narrow surface area:
- expose only the operations the client truly needs
- keep tool names and schemas stable
- prefer explicit resources over ad hoc text blobs
- use resource templates when one static resource is not enough
If you want the server to be consumable from a desktop client, add a short launch snippet in the client config and test the tool list before expanding the surface.
Tutorial: Build an eval pipeline
This tutorial builds a small evaluation loop that runs a set of examples, records metrics, and produces an auditable summary. The goal is to make quality visible, not to build an elaborate benchmark harness.
Use the companion example as a baseline:
cargo run --bin harn -- run examples/data-pipeline.harn
1. Define the dataset inline
Start with a tiny set of representative inputs. Keep the examples small enough that you can inspect failures by eye:
pipeline main(task) {
let cases = [
{id: "case-1", input: "What is 2 + 2?", expected: "4"},
{id: "case-2", input: "Capital of France?", expected: "Paris"},
{id: "case-3", input: "Color of grass?", expected: "green"},
]
println("Loaded ${cases.count} eval cases")
}
2. Run the cases in parallel
If each case is independent, use parallel each so the slow parts overlap.
pipeline main(task) {
let cases = [
{id: "case-1", input: "What is 2 + 2?", expected: "4"},
{id: "case-2", input: "Capital of France?", expected: "Paris"},
{id: "case-3", input: "Color of grass?", expected: "green"},
]
let results = parallel each cases { tc ->
let answer = llm_call(tc.input, "Answer in one word or short phrase.", {
temperature: 0.0,
max_tokens: 64,
})
{
id: tc.id,
expected: tc.expected,
actual: answer.text,
correct: answer.text.contains(tc.expected),
}
}
println(json_stringify(results))
}
For a real eval suite, replace the inline cases list with a manifest or a
dataset file that your pipeline reads with read_file().
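For example, a newline-delimited JSON dataset could be loaded like this. The file path and JSONL format are assumptions for illustration; split(), json_parse(), and list concatenation are shown elsewhere in this cookbook.

```harn
// Load eval cases from a JSONL file instead of an inline list.
var cases = []
for line in read_file("evals/cases.jsonl").split("\n") {
    if line != "" {
        cases = cases + [json_parse(line)]
    }
}
println("Loaded ${cases.count} eval cases")
```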
3. Record metrics
The important part of an eval pipeline is the metric trail. Use
eval_metric() to record per-case and aggregate results.
pipeline main(task) {
let cases = [
{id: "case-1", input: "What is 2 + 2?", expected: "4"},
{id: "case-2", input: "Capital of France?", expected: "Paris"},
]
var passed = 0
for tc in cases {
let answer = llm_call(tc.input, "Answer in one word.", {temperature: 0.0})
let correct = answer.text.contains(tc.expected)
if correct {
passed = passed + 1
}
eval_metric("case_correct", correct, {case_id: tc.id})
}
let accuracy = passed / cases.count
eval_metric("accuracy", accuracy, {passed: passed, total: cases.count})
eval_metric("run_id", uuid())
eval_metric("generated_at", timestamp())
}
4. Export a report
Once the metrics are recorded, write a compact report so a later run can diff the results.
pipeline main(task) {
let summary = {
run_id: uuid(),
generated_at: timestamp(),
accuracy: 0.83,
notes: "Replace the fixed accuracy with real case scoring",
}
write_file("eval-summary.json", json_stringify(summary))
println(json_stringify(summary))
}
5. How to use it
Run the pipeline, inspect the metrics, then compare runs over time:
harn run examples/eval-workflow.harn
harn eval .harn-runs/<run-id>.json
A good eval pipeline answers three questions:
- did the model improve?
- did latency or token usage regress?
- which cases failed, and why?
Best practices
This guide collects the habits that keep Harn programs small, testable, and easier to operate.
Keep prompts narrow
The best prompts are short and explicit. Tell the model exactly what shape of output you want, what to avoid, and when to say it does not know something. Prefer one task per call over one giant prompt that tries to do everything.
Use explicit context
Pass the minimum useful context into each model call. If the model only needs a few files or a short patch, read those directly instead of dumping the entire repository into the prompt.
Prefer typed boundaries
Use type annotations, shape types, and small helper functions where they make the interface clearer. A narrow typed boundary is easier to debug than a large pile of implicit dicts.
Make concurrency obvious
Use parallel each when the work is independent and order matters. Use
parallel when you need indexed fan-out. Keep the body of each worker short so
it is obvious what is happening concurrently.
Record metrics early
If a pipeline matters enough to keep, add eval_metric() calls sooner rather
than later. Track the numbers you will want during regressions: accuracy,
latency, token usage, and counts of failures or retries.
Fail fast on unclear inputs
Use require, guard, typed catches, and explicit validation when the pipeline
depends on a particular shape of data. It is cheaper to fail immediately than
to let a bad input travel through several stages.
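A minimal sketch of that validation style, using guard as in the structured-output recipe; the error messages are illustrative.

```harn
pipeline default(task) {
    fn score_case(case) {
        // Fail immediately on malformed input instead of letting it
        // travel through later stages.
        guard type_of(case) == "dict" else {
            throw "Expected a dict case, got: ${type_of(case)}"
        }
        guard case.has("input") && case.has("expected") else {
            throw "Case missing 'input' or 'expected': ${json_stringify(case)}"
        }
        return case.input == case.expected
    }
    println(score_case({input: "4", expected: "4"}))
}
```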
Keep operational surfaces small
For MCP servers, host integrations, and agent tools, expose only the minimum surface you need. Smaller tool surfaces are easier to document, secure, and debug.
Inspect before you scale
Use harn repl for quick experiments, harn viz for structural overviews,
harn doctor for environment checks, and cargo run --bin harn-dap through
the DAP adapter when you need line-level stepping.
Recommended workflow
For a new agent or pipeline:
- Prototype the prompt in harn repl.
- Turn it into a named pipeline.
- Add a small example under examples/.
- Add metrics or a conformance test.
- Use harn viz and the debugger when the control flow gets complicated.
That sequence is usually enough to keep the implementation honest without turning the repository into a framework project.
Playground
harn playground runs a pipeline against a Harn-native host module in the same
process. It is intended for fast pipeline iteration without wiring a JSON-RPC
host or booting a larger app shell.
Quick start
The repo ships with a minimal example:
harn playground \
--host examples/playground/host.harn \
--script examples/playground/echo.harn \
--task "Explain this repository in plain English"
--task is exposed to the script through the HARN_TASK environment variable,
so the example reads it with env_or("HARN_TASK", "").
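A minimal entry script in that shape might look like this sketch (the shipped example may differ); build_prompt is the host-module export shown under "Host modules" below.

```harn
// Minimal playground script: read the task from the environment
// and hand it to a host-provided prompt builder.
pipeline default(task) {
    let task_text = env_or("HARN_TASK", "")
    let prompt = build_prompt(task_text)   // exported by the host module
    println(prompt)
}
```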
If you want an offline smoke test, force the mock provider:
harn playground \
--host examples/playground/host.harn \
--script examples/playground/echo.harn \
--task "Say hello" \
--llm mock:mock
For deterministic end-to-end iteration, harn playground also accepts the
same JSONL fixture flags as harn run:
harn playground \
--host examples/playground/host.harn \
--script examples/playground/echo.harn \
--task "Explain this repository" \
--llm-mock fixtures/playground.jsonl
Use --llm-mock-record <path> once to capture a replayable fixture, then
switch back to --llm-mock <path> while you iterate on control flow.
Host modules
A playground host is just a .harn file that exports the functions your
pipeline expects:
pub fn build_prompt(task_text) {
return "Task: " + task_text + "\nWorkspace: " + cwd()
}
pub fn request_permission(tool_name, request_args) -> bool {
return true
}
The playground command loads those exported functions and makes them available to the entry script during execution. If the script calls a host function that the module does not export, the command fails with a pointed error naming the missing function and the caller location.
Watch mode
Use --watch to re-run when either the host module or the script changes:
harn playground --watch --task "Refine the prompt"
The watcher tracks the host and script parent directories recursively and debounces save bursts before re-running.
Starter project
Use the built-in scaffold when you want a dedicated scratchpad:
harn new pipeline-lab-demo --template pipeline-lab
cd pipeline-lab-demo
harn playground --task "Summarize this project"
Host boundary
Harn is the orchestration layer. Hosts supply facts and platform effects.
The boundary should stay narrow:
- Hosts expose typed capabilities such as project scan data, editor state, diagnostics, git facts, approval decisions, and persistence hooks.
- Harn owns orchestration policy: workflow topology, retries, verification, transcript lifecycle, context assembly, contract enforcement, replay, evals, and worker semantics.
What belongs in Harn std/* modules or the VM:
- Generic runtime wrappers like `runtime_task()`, `process_exec()`, or `interaction_ask()`
- Reusable metadata/scanner helpers and product-agnostic project-state normalization
- Transcript schemas, assets, compaction, and replay semantics
- Context/artifact assembly rules that are product-agnostic
- Structured contract enforcement and eval/replay helpers
- Test-time typed host mocks such as `host_mock(...)` when the behavior is a runtime fixture for host-backed flows rather than a product-specific bridge
- Mutation-session identity and audit provenance for write-capable workflows and delegated workers
What should stay in host-side .harn scripts:
- Product-specific prompts and instruction tone
- IDE-specific flows such as edit application, approval UX, repo enrichment, or bespoke tool choreography
- Host-owned filesystem and edit wrappers built on capability-aware `host_call(...)`
- Host-owned editor, diagnostics, git, learning, and project-context wrappers
- Concrete undo/redo stacks and editor-native mutation application
- Proprietary ranking, routing, or heuristics tied to one host product
- Features that depend on host-only commercial, account, or app lifecycle rules
Rule of thumb:
- If a behavior decides how an agent or workflow should think, continue, verify, compact, replay, or select context, it probably belongs in Harn.
- If a behavior fetches facts from a specific editor or app surface, asks the user for approval, or performs a host-only side effect, it belongs in the host.
Keep advanced host-side .harn modules local to the host when they encode
host-only UX, proprietary behavior, or app-specific heuristics. Move a helper
into Harn only when it is general enough to be useful across hosts.
Trust boundary
Harn should own the audit contract for mutations:
- mutation-session IDs
- workflow/worker/session lineage
- tool-gate mutation classification and declared scope
- artifact and run-record provenance
Hosts should own the concrete UX:
- apply/approve/deny flows
- patch previews
- editor undo/redo semantics
- trust UI around which worker or session produced a change
Contract surfaces
Harn now ships machine-readable contract exports so hosts do not need to reverse-engineer runtime assumptions:
- `harn contracts builtins` for the builtin registry and parser/runtime drift
- `harn contracts host-capabilities` for the effective host manifest used by preflight validation
- `harn contracts bundle` for entry modules, imported modules, prompt/template assets, explicit module-dependency edges, required host capabilities, literal execution directories, worker repo dependencies, and stable summary counts
Those surfaces are intended to be the generic boundary for embedded hosts such as editors or native apps. Product-specific packaging logic should build on top of them rather than re-implementing Harn’s import, asset, and host-capability resolution rules independently.
Bridge protocol
Harn’s stdio bridge uses JSON-RPC 2.0 notifications and requests for host/runtime coordination that sits below ACP session semantics.
Tool lifecycle observation
The tool/pre_use, tool/post_use, and tool/request_approval bridge
request/response methods have been retired in favor of the canonical
ACP surface:
- Tool lifecycle is now carried on the `session/update` notification stream as `tool_call` and `tool_call_update` variants (see the ACP schema at https://agentclientprotocol.com/protocol/schema). Hosts observe every dispatch via the session update stream — there is no host-side approve/deny/modify hook at dispatch time.
- Approvals route through canonical `session/request_permission`. When Harn's declarative `ToolApprovalPolicy` classifies a call as `RequiresHostApproval`, the agent loop issues a `session/request_permission` request to the host and fails closed if the host does not implement it (or returns an error).
Internally, the agent loop emits AgentEvent::ToolCall +
AgentEvent::ToolCallUpdate events; harn-cli’s ACP server translates
them into session/update notifications via an AgentEventSink it
registers per session.
session/request_permission
Request payload (harn-issued):
{
"sessionId": "session_123",
"toolCall": {
"toolCallId": "call_123",
"toolName": "edit_file",
"rawInput": {"path": "src/main.rs"}
},
"mutation": {
"session_id": "session_123",
"run_id": "run_123",
"worker_id": null,
"mutation_scope": "apply_workspace",
"approval_policy": {"require_approval": ["edit*"]}
},
"declaredPaths": ["src/main.rs"]
}
Response payload (host-issued):
- `{ "outcome": { "outcome": "selected" } }` (ACP canonical): granted
- `{ "granted": true }` (legacy shim): granted with original args
- `{ "granted": true, "args": {...} }`: granted with rewritten args
- `{ "granted": false, "reason": "..." }`: denied
Worker lifecycle notifications
Delegated workers emit session/update notifications with worker_update
content. Those payloads include lifecycle timing, child run/snapshot paths,
and audit-session metadata so hosts can render background work without
scraping plain-text logs.
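As a rough sketch, a `worker_update` notification might look like the following. The field names here are illustrative, not a schema — the paragraph above lists only the categories of metadata the real payloads carry:

```json
{
  "jsonrpc": "2.0",
  "method": "session/update",
  "params": {
    "sessionId": "session_123",
    "update": {
      "sessionUpdate": "worker_update",
      "workerId": "worker_7",
      "workerName": "background-review",
      "status": "completed",
      "startedAt": "2025-01-01T12:00:00Z",
      "finishedAt": "2025-01-01T12:00:42Z",
      "childRunPath": ".harn-runs/run_456",
      "snapshotPath": ".harn-runs/run_456/snapshot",
      "auditSessionId": "session_123"
    }
  }
}
```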
Daemon idle/resume notifications
Daemon agents stay alive after text-only turns and wait for host activity with adaptive
backoff: 100ms, 500ms, 1s, 2s, resetting to 100ms whenever activity arrives.
agent/idle
Sent as a bridge notification whenever the daemon enters or remains in the idle wait loop.
Payload:
{
"iteration": 3,
"backoff_ms": 1000
}
agent/resume
Hosts can send this notification to wake an idle daemon without injecting a user-visible message.
Payload:
{}
A host may also wake the daemon by sending a queued user_message, session/input, or
agent/user_message notification.
Client-executed tool search
When a Harn script opts into tool_search against a provider that lacks
native defer-loading support, the runtime switches to a client-executed
fallback (see the LLM and agents guide). For the
"bm25" and "regex" strategies everything stays in-VM; the
"semantic" and "host" strategies round-trip the query through the
bridge.
tool_search/query
Request payload (harn-issued, host response required):
{
"strategy": "semantic",
"query": "deploy a new service version",
"candidates": ["deploy_service", "rollback_service", "query_metrics", "..."]
}
- `strategy`: one of `"semantic"` or `"host"`. The in-tree strategies (`"bm25"`/`"regex"`) never hit the bridge.
- `query`: the raw query string the model passed to the synthetic search tool. For `strategy: "regex"`/`"bm25"` hosts don't see this; those strategies run inside the VM.
- `candidates`: full list of deferred tool names the host may choose from. The host should return a subset.
Response payload (host-issued):
{
"tool_names": ["deploy_service", "rollback_service"],
"diagnostic": "matched by vector similarity"
}
- `tool_names` (required): ordered list of tool names to promote. Unknown names are ignored by the runtime — they can't be surfaced because their schemas weren't registered. Return at most ~20 names per call; the runtime applies a soft per-turn cap on promotions regardless.
- `diagnostic` (optional): short explanation surfaced to the model in the tool result alongside `tool_names`. Useful for "no hits, try broader terms"-style feedback.
An ACP-style wrapper { "result": { "tool_names": [...] } } is also
accepted for hosts that re-wrap everything in a result envelope.
Errors: a JSON-RPC error response (standard shape) is surfaced to the
model as a tool_names: [] result with a diagnostic that includes the
host error message. The loop continues — the model can retry with a
different query.
Host tool discovery
Hosts can expose their own dynamic tool surface to scripts without
pre-registering every tool in the initial prompt. Harn discovers that
surface through one bridge RPC and then invokes individual tools
through the existing builtin_call request path.
host/tools/list
VM-issued request. No parameters (or an empty object). The host responds with a list of tool descriptors. Canonical response shape:
{
"tools": [
{
"name": "Read",
"description": "Read a file from the active workspace",
"schema": {
"type": "object",
"properties": {
"path": {"type": "string", "description": "File path to read"}
},
"required": ["path"]
},
"deprecated": false
},
{
"name": "open_file",
"description": "Reveal a file in the editor",
"schema": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
},
"deprecated": true
}
]
}
Accepted variants:
- a bare array `[{...}, {...}]`
- an ACP-style wrapper `{ "result": { "tools": [...] } }`
- compatibility field names `short_description`, `parameters`, or `input_schema`; Harn normalizes them to `description` and `schema`
Each normalized descriptor surfaced to scripts has exactly these keys:
- `name`: string, required
- `description`: string, defaults to `""`
- `schema`: JSON Schema object or `null`
- `deprecated`: boolean, defaults to `false`
Invocation:
- `host_tool_list()` returns the normalized list directly.
- `host_tool_call(name, args)` then dispatches that tool through the existing `builtin_call` bridge request using `name` as the builtin name and `args` as the single argument payload.
Skill registry (issue #73)
Hosts expose their own managed skill store to the VM through three RPCs.
Filesystem skill discovery works without the bridge (harn run walks
the seven non-host layers described in Skills); these
RPCs add a layer 8 so cloud hosts, enterprise deployments, and the
Burin Code IDE can serve skills the filesystem can’t see.
skills/list
VM-issued request. No parameters (or an empty object). The host
responds with an array of SkillManifestRef entries. Minimal shape:
[
{ "id": "deploy", "name": "deploy", "description": "Ship it", "source": "host" },
{ "id": "acme/ops/review", "name": "review", "description": "Code review", "source": "host" }
]
The VM also accepts { "skills": [ ... ] } for hosts that wrap
collections in an object.
skills/fetch
VM-issued request. Parameters: { "id": "<skill id>" }. Response is a
single skill object carrying enough metadata to populate a Skill:
{
"name": "deploy",
"description": "Ship it",
"body": "# Deploy runbook\n...",
"manifest": {
"when_to_use": "...",
"allowed_tools": ["bash", "git"],
"paths": ["infra/**"],
"model": "claude-opus-4-7"
}
}
Hosts may flatten the manifest fields into the top level instead — the CLI accepts either shape.
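For example, the flattened form of the same skill — manifest fields hoisted to the top level — would also be accepted:

```json
{
  "name": "deploy",
  "description": "Ship it",
  "body": "# Deploy runbook\n...",
  "when_to_use": "...",
  "allowed_tools": ["bash", "git"],
  "paths": ["infra/**"],
  "model": "claude-opus-4-7"
}
```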
skills/update
Host-issued notification. No parameters. Invalidates the VM’s cached
skill catalog; the CLI re-runs layered discovery (including another
skills/list call) on the next iteration boundary — for harn watch,
between file changes; for long-running agents, between turns. A VM
without an active bridge simply ignores the notification.
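On the wire, this notification is just the bare JSON-RPC envelope — per the description above it carries no parameters:

```json
{ "jsonrpc": "2.0", "method": "skills/update" }
```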
Host-delegated skill matching
Harn agents that opt into
skill_match: { strategy: "host" } (or the alias "embedding")
delegate skill ranking to the host via a single JSON-RPC request. The
host response is purely advisory — unknown skill names are ignored,
and an RPC error falls back to the in-VM metadata ranker with a
warning logged against agent.skill_match.
skill/match
Request payload (harn-issued, host response required):
{
"strategy": "host",
"prompt": "Ship the new release to production",
"working_files": ["infra/terraform/cluster.tf"],
"candidates": [
{
"name": "ship",
"description": "Ship a production release",
"when_to_use": "User says ship/release/deploy",
"paths": ["infra/**", "Dockerfile"]
},
{
"name": "review",
"description": "Review existing code for correctness",
"when_to_use": "User asks to review/audit",
"paths": []
}
]
}
Response payload (host-issued):
{
"matches": [
{"name": "ship", "score": 0.92, "reason": "matched by embedding similarity"}
]
}
- `matches[*].name` (required): the candidate's skill name. Names absent from the original `candidates` list are ignored.
- `matches[*].score` (optional): non-negative float; higher scores rank earlier. Defaults to `1.0` when omitted.
- `matches[*].reason` (optional): short diagnostic stored on the `skill_matched`/`skill_activated` transcript events. Defaults to `"host match"`.
Alternative shapes accepted for host convenience:
- a top-level array: `[{"name": ..., "score": ...}, ...]`
- a `{"skills": [...]}` wrapper
- a `{"result": {"matches": [...]}}` ACP envelope
Skill lifecycle session updates
Agents emit ACP session/update notifications for skill lifecycle
transitions so hosts can surface active-skill state in real time.
harn-cli’s ACP server translates the canonical AgentEvent
variants into:
- `sessionUpdate: "skill_activated"` — `{skillName, iteration, reason}`
- `sessionUpdate: "skill_deactivated"` — `{skillName, iteration}`
- `sessionUpdate: "skill_scope_tools"` — `{skillName, allowedTools}`
skill_matched stays internal to the VM transcript — the candidate
list can be large and host UIs typically only care about activation
transitions, not every ranking pass.
Host tools over the bridge
host_tool_list() and host_tool_call(name, args) are the host-side
mirror of Harn’s LLM-facing tool_search flow: the script can ask the
host what tools exist right now, inspect their schemas, and invoke the
one it actually needs.
This is useful when the host owns the real capabilities:
- Claude Code style tools such as `Read`, `Edit`, and `Bash`
- IDE actions such as `open_file`, `ide.panel.focus`, or `ide.git.worktree`
- product-specific actions that vary by project, session, or user role
Worked example
The script below discovers a readable tool at runtime, refuses to use a deprecated one, and then calls it with a single structured argument payload.
import { host_tool_available, host_tool_lookup } from "std/host"
pipeline inspect_readme(task) {
if !host_tool_available("Read") {
log("Host does not expose a Read tool in this session")
return nil
}
let read_tool = host_tool_lookup("Read")
assert(read_tool != nil, "Read tool metadata should be present")
assert(read_tool?.deprecated != true, "Read tool is deprecated on this host")
let result = host_tool_call("Read", {path: "README.md"})
log(result)
}
What happens at runtime:
- `host_tool_list()` sends `host/tools/list` to the active bridge host.
- The host replies with tool descriptors: `name`, `description`, `schema`, and `deprecated`.
- `host_tool_call("Read", {path: "README.md"})` reuses the bridge's existing `builtin_call` path, so the host receives the dynamic tool invocation without Harn needing a second bespoke call protocol.
Shape conventions
Harn normalizes each entry returned by host/tools/list to this form:
{
"name": "Read",
"description": "Read a file",
"schema": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
},
"deprecated": false
}
That means scripts can safely branch on tool.schema or
tool.deprecated without having to care whether the host originally
used compatibility field names such as short_description or
input_schema.
Notes
- Without a bridge host, `host_tool_list()` returns `[]`.
- `host_tool_call(...)` requires an attached bridge host and throws if none is active.
- Hosts remain authoritative: if a tool disappears between discovery and invocation, the host error is surfaced to the script normally.
MCP and ACP integration
Harn has built-in support for the Model Context Protocol (MCP), Agent Client Protocol (ACP), and Agent-to-Agent (A2A) protocol. This guide covers how to use each from both client and server perspectives.
MCP client (connecting to MCP servers)
Connect to any MCP-compatible tool server, list its capabilities, and call tools from within a Harn program. Harn supports both stdio MCP servers and remote HTTP MCP servers.
Connecting manually
Use mcp_connect to spawn an MCP server process and perform the
initialize handshake:
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let info = mcp_server_info(client)
println("Connected to: ${info.name}")
Listing and calling tools
let tools = mcp_list_tools(client)
for t in tools {
println("${t.name}: ${t.description}")
}
let content = mcp_call(client, "read_file", {path: "/tmp/data.txt"})
println(content)
mcp_call returns a string for single-text results, a list of content
dicts for multi-block results, or nil when empty. If the tool reports an
error, mcp_call throws.
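A defensive caller can branch on those three shapes. The sketch below assumes an expression-form `try`/`catch` and a hypothetical `type_of` builtin — substitute whatever type-inspection helper your runtime actually exposes:

```
let content = try {
    mcp_call(client, "read_file", {path: "/tmp/data.txt"})
} catch e {
    log("tool error: ${e}")
    nil
}
if content == nil {
    println("empty result")
} else if type_of(content) == "string" {
    // single-text result
    println(content)
} else {
    // multi-block result: list of content dicts
    for block in content { println(block) }
}
```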
Resources and prompts
let resources = mcp_list_resources(client)
let data = mcp_read_resource(client, "file:///tmp/config.json")
let prompts = mcp_list_prompts(client)
let prompt = mcp_get_prompt(client, "review", {code: "fn main() {}"})
Disconnecting
mcp_disconnect(client)
Auto-connection via harn.toml
Instead of calling mcp_connect manually, declare servers in harn.toml.
They connect automatically before the pipeline executes and are available
through the global mcp dict:
[[mcp]]
name = "filesystem"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
[[mcp]]
name = "github"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
[[mcp]]
name = "notion"
transport = "http"
url = "https://mcp.notion.com/mcp"
scopes = "read write"
Lazy boot (harn#75)
Servers marked lazy = true are NOT booted at pipeline startup. They
start on the first mcp_call, mcp_ensure_active("name"), or skill
activation that declares the server in requires_mcp. This keeps cold
starts fast when many servers are declared but only a few are needed
per run.
[[mcp]]
name = "github"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
lazy = true
keep_alive_ms = 30_000 # keep the process alive 30s after last release
[[mcp]]
name = "datadog"
command = "datadog-mcp"
lazy = true
Ref-counting: each skill activation or explicit
mcp_ensure_active(name) call bumps a binder count. On deactivation or
mcp_release(name), the count drops. When it reaches zero, Harn
disconnects the server — immediately if keep_alive_ms is absent, or
after the window elapses if set.
Explicit control from user code:
// Start the lazy server and hold it open.
let client = mcp_ensure_active("github")
let issues = mcp_call(client, "list_issues", {repo: "burin-labs/harn"})
// Release when done — lets the registry shut it down.
mcp_release("github")
// Inspect current state.
let status = mcp_registry_status()
for s in status {
println("${s.name}: lazy=${s.lazy} active=${s.active} refs=${s.ref_count}")
}
Server Cards (MCP v2.1)
A Server Card is a small JSON document that advertises a server’s identity, capabilities, and tool catalog without requiring a connection. Harn consumes cards for discoverability and can publish its own when running as an MCP server.
Declare a card source in harn.toml:
[[mcp]]
name = "notion"
transport = "http"
url = "https://mcp.notion.com/mcp"
card = "https://mcp.notion.com/.well-known/mcp-card"
[[mcp]]
name = "local-agent"
command = "my-agent"
lazy = true
card = "./agents/my-agent-card.json"
Fetch it from a pipeline:
// Look up by registered server name.
let card = mcp_server_card("notion")
println(card.description)
for t in card.tools {
println("- ${t.name}")
}
// Or pass a URL / path directly.
let card = mcp_server_card("./agents/my-agent-card.json")
Cards are cached in-process with a 5-minute TTL — repeated calls are free. Skill matchers can factor card metadata into scoring without paying connection cost.
Skill-scoped MCP binding
Skills can declare the MCP servers they need via requires_mcp (or the
equivalent mcp) frontmatter field. On activation, Harn ensures every
listed server is running; on deactivation, it releases them.
skill github_triage {
description: "Triage GitHub issues and cut fixes",
when_to_use: "User mentions a GitHub issue or PR by number",
requires_mcp: ["github"],
allowed_tools: ["list_issues", "create_pr", "add_comment"],
prompt: "You are a triage assistant...",
}
When agent_loop activates github_triage, the lazy github MCP
server boots (if configured that way) and its process stays alive for
as long as the skill is active. When the skill deactivates, the server
is released — and if no other skill holds it, the process shuts down
(respecting keep_alive_ms).
Transcript events emitted along the way: skill_mcp_bound,
skill_mcp_unbound, skill_mcp_bind_failed.
MCP tools in the tool-search index
When an LLM uses tool_search (progressive tool disclosure), MCP tools
are auto-tagged with both mcp:<server> and <server> in the BM25
corpus. That means a query like "github" or "mcp:github" surfaces
every tool from that server even when the tool’s own name and
description don’t contain the word. Tools returned by mcp_list_tools
carry an _mcp_server field that the indexer consumes automatically —
no extra wiring needed.
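Concretely, a descriptor returned by `mcp_list_tools` might carry (fields other than `_mcp_server`, which the text above names, are illustrative):

```json
{
  "name": "create_issue",
  "description": "Open a new issue in a repository",
  "_mcp_server": "github"
}
```

A query for `github` or `mcp:github` would then surface `create_issue` even though neither string appears in the tool's own name or description.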
Use them in your pipeline:
pipeline default(task) {
let tools = mcp_list_tools(mcp.filesystem)
let content = mcp_call(mcp.filesystem, "read_file", {path: "/tmp/data.txt"})
println(content)
}
If a server fails to connect, a warning is printed to stderr and that
server is omitted from the mcp dict. Other servers still connect
normally.
For HTTP MCP servers, Harn can reuse OAuth tokens stored with the CLI:
harn mcp redirect-uri
harn mcp login notion
If the server uses a pre-registered OAuth client, you can provide those
values in harn.toml or on the CLI:
[[mcp]]
name = "internal"
transport = "http"
url = "https://mcp.example.com"
client_id = "https://client.example.com/metadata.json"
client_secret = "super-secret"
scopes = "read:docs write:docs"
When no client_id is provided, Harn will attempt dynamic client
registration if the authorization server advertises it.
Example: filesystem MCP server
A complete example connecting to the filesystem MCP server, writing a file, and reading it back:
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
mcp_call(client, "write_file", {path: "/tmp/hello.txt", content: "Hello from Harn!"})
let content = mcp_call(client, "read_file", {path: "/tmp/hello.txt"})
println(content)
let entries = mcp_call(client, "list_directory", {path: "/tmp"})
println(entries)
mcp_disconnect(client)
MCP server (exposing Harn as an MCP server)
Harn pipelines can expose tools, resources, resource templates, and prompts as an MCP server. This lets Claude Desktop, Cursor, or any MCP client call into your Harn code.
Defining tools
Use tool_registry() and tool_define() to create tools, then register
them with mcp_tools():
pipeline main(task) {
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone", {
parameters: {name: "string"},
handler: { args -> "Hello, ${args.name}!" }
})
tools = tool_define(tools, "search", "Search files", {
parameters: {query: "string"},
handler: { args -> "results for ${args.query}" },
annotations: {
title: "File Search",
readOnlyHint: true,
destructiveHint: false
}
})
mcp_tools(tools)
}
Defining resources and prompts
pipeline main(task) {
// Static resource
mcp_resource({
uri: "docs://readme",
name: "README",
text: "# My Agent\nA demo MCP server."
})
// Dynamic resource template
mcp_resource_template({
uri_template: "config://{key}",
name: "Config Values",
handler: { args -> "value for ${args.key}" }
})
// Prompt
mcp_prompt({
name: "review",
description: "Code review prompt",
arguments: [{name: "code", required: true}],
handler: { args -> "Please review:\n${args.code}" }
})
}
Running as an MCP server
harn mcp-serve agent.harn
All print/println output goes to stderr (stdout is the MCP
transport). The server supports the 2025-11-25 MCP protocol version
over stdio.
Publishing a Server Card
Attach a Server Card so clients can discover your server’s identity and capabilities before connecting:
harn mcp-serve agent.harn --card ./card.json
The card JSON is embedded in the initialize response’s
serverInfo.card field and also exposed as a read-only resource at
well-known://mcp-card. Minimal shape:
{
"name": "my-agent",
"version": "1.0.0",
"description": "Short one-line summary shown in pickers.",
"protocolVersion": "2025-11-25",
"capabilities": { "tools": true, "resources": false, "prompts": false },
"tools": [
{"name": "greet", "description": "Greet someone by name"}
]
}
--card also accepts an inline JSON string for ad-hoc publishing:
--card '{"name":"demo","description":"…"}'.
Configuring in Claude Desktop
Add to claude_desktop_config.json:
{
"mcpServers": {
"my-agent": {
"command": "harn",
"args": ["mcp-serve", "agent.harn"]
}
}
}
ACP (Agent Client Protocol)
ACP lets host applications and local clients use Harn as a runtime backend. Communication is JSON-RPC 2.0 over stdin/stdout.
Bridge-level tool gates and daemon idle/resume notifications are documented in Bridge protocol.
Running the ACP server
harn acp # no pipeline, uses bridge mode
harn acp pipeline.harn # execute a specific pipeline per prompt
Protocol overview
The ACP server supports these JSON-RPC methods:
| Method | Description |
|---|---|
initialize | Handshake with capabilities |
session/new | Create a new session (returns session ID) |
session/prompt | Send a prompt to the agent for execution |
session/cancel | Cancel the currently running prompt |
Queued user messages during agent execution
ACP hosts can inject user follow-up messages while an agent is running. Harn owns the delivery semantics inside the runtime so product apps do not need to reimplement queue/orchestration logic.
Supported notification methods:
- `user_message`
- `session/input`
- `agent/user_message`
- `session/update` with `worker_update` content for delegated worker lifecycle events
Payload shape:
{
"content": "Please stop editing that file and explain first.",
"mode": "interrupt_immediate"
}
Supported mode values:
- `interrupt_immediate`
- `finish_step`
- `wait_for_completion`
Runtime behavior:
- `interrupt_immediate`: inject immediately, on the next agent loop boundary
- `finish_step`: inject after the current tool/operation completes
- `wait_for_completion`: defer until the current agent interaction yields
- Worker lifecycle updates are emitted as structured `session/update` payloads with worker id/name, status, lineage metadata, artifact counts, transcript presence, snapshot path, execution metadata, child run ids/paths, lifecycle summaries, and audit-session metadata when applicable. Hosts can render these as background task notifications instead of scraping stdout.
- Bridge-mode logs also stream boot timing records (`ACP_BOOT` with `compile_ms`, `vm_setup_ms`, and `execute_ms`) and live `span_end` duration events while a prompt is still running, so hosts do not need to wait for the final stdout flush to surface basic timing telemetry.
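Putting the pieces together, a complete queued-message notification might look like the following. Whether `params` also carries a session identifier alongside the documented payload is an assumption here:

```json
{
  "jsonrpc": "2.0",
  "method": "session/input",
  "params": {
    "content": "Please stop editing that file and explain first.",
    "mode": "finish_step"
  }
}
```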
Typed pipeline returns (Harn → ACP boundary)
Pipelines are what produce ACP events (agent_message_chunk,
tool_call, tool_call_update, plan, sessionUpdate). Declaring a
return type on a pipeline turns the Harn→ACP boundary into a
type-checked contract instead of an implicit shape that only the bridge
validates:
type PipelineResult = {
text: string | nil,
events: list<dict> | nil,
}
pub pipeline ghost_text(task) -> PipelineResult {
return {
text: "hello",
events: [],
}
}
The type checker verifies every return <expr> against the declared
type, so drift between pipeline output and bridge expectation is caught
before the Swift/TypeScript bridge ever sees the message.
Public pipelines without an explicit return type emit the
pipeline-return-type lint warning. Explicit return types on the
Harn→ACP boundary will be required in a future release; the warning is
a one-release deprecation window.
Well-known entry pipelines (default, main, auto, test) are
exempt from the warning because their return value is host-driven, not
consumed by a protocol bridge.
Canonical ACP envelope types are provided as Harn type aliases in
std/acp — SessionUpdate, AgentMessageChunk, ToolCall,
ToolCallUpdate, and Plan — and can be used directly as pipeline
return types so a pipeline’s contract matches the ACP schema
byte-for-byte.
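For instance, a pipeline that returns a canonical chunk can lean on those aliases directly. The field layout below is a sketch — the real `AgentMessageChunk` shape follows the ACP schema:

```
import { AgentMessageChunk } from "std/acp"

pub pipeline greet(task) -> AgentMessageChunk {
    // Type-checked against the std/acp alias at the Harn→ACP boundary
    return {
        sessionUpdate: "agent_message_chunk",
        content: {type: "text", text: "Hello from Harn"},
    }
}
```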
Security notes
Remote MCP OAuth
harn mcp login stores remote MCP OAuth tokens in the local OS keychain for
standalone CLI reuse. Treat that as durable delegated access:
- prefer the narrowest scopes the server supports
- treat configured
client_secretvalues as secrets - review remote MCP capabilities before wiring them into autonomous workflows
Safer write defaults
Harn now propagates mutation-session audit metadata through workflow runs, delegated workers, and bridge tool gates. Recommended host defaults remain:
- proposal-first application for direct workspace edits
- worktree-backed execution for autonomous/background workers
- explicit approval for destructive or broad-scope mutation tools
Bridge mode
ACP internally uses Harn’s host bridge so the host can retain control over tool execution while Harn still owns agent/runtime orchestration.
Unknown builtins are delegated to the host via builtin_call JSON-RPC
requests. This enables the host to provide filesystem access, editor
integration, or other capabilities that Harn code can call as regular
builtins.
A2A (Agent-to-Agent Protocol)
A2A exposes a Harn pipeline as an HTTP server that other agents can interact with. The server implements A2A protocol version 1.0.0.
Running the server
harn serve agent.harn # default port 8080
harn serve --port 3000 agent.harn # custom port
Agent card
The server publishes an agent card at GET /.well-known/agent.json
describing the agent’s capabilities. MCP clients and other A2A agents
use this to discover the agent.
Task submission
Submit a task with a POST request:
POST /message/send
Content-Type: application/json
{
"message": {
"role": "user",
"parts": [{"type": "text", "text": "Analyze this codebase"}]
}
}
Task status
Check the status of a submitted task:
GET /task/get?id=<task-id>
Task states follow the A2A protocol lifecycle: submitted, working,
completed, failed, cancelled.
Harn portal
harn portal launches a local observability UI for persisted Harn runs.
The portal frontend is now a Vite-built React application embedded into
harn-cli as static assets. Running harn portal does not require Node once
those built assets are present in the repository, but editing the portal UI
does.
The portal treats .harn-runs/ as the source of truth and gives you one place
to inspect:
- run history
- the derived action-graph / planner observability artifact
- workflow stages
- nested trace spans
- transcript/story sections
- delegated child runs
- token/call usage
Start the portal
harn portal
make portal
By default the portal:
- serves from `http://127.0.0.1:4721`
- watches `.harn-runs`
- opens a browser automatically
For a fresh source checkout, the simplest local setup is:
./scripts/dev_setup.sh
make portal
./scripts/dev_setup.sh also installs the portal’s Node dependencies and
builds crates/harn-cli/portal-dist up front, so the git hooks and
harn portal start from a ready state.
For portal frontend work specifically:
npm run portal:build
npm run portal:test
Useful flags:
harn portal --dir runs/archive
harn portal --host 0.0.0.0 --port 4900
harn portal --open false
For frontend development with Vite, npm run portal:dev starts:
- the Rust portal server on `http://127.0.0.1:4721`
- the Vite UI on `http://127.0.0.1:4723` with `/api` proxied to the Rust server
Quick demo
To generate a purpose-built demo dataset and launch the portal against it:
make portal-demo
That script creates .harn-runs/portal-demo/ with:
- a successful workflow-graph run
- a deterministic replay of that run
- a failed verification run with failure context in the run list
If you only want the data without launching the server:
./scripts/portal_demo.sh --generate-only
cargo run --bin harn -- portal --dir .harn-runs/portal-demo --open false
If you want to regenerate that dataset from scratch, pass --refresh.
How to read it
The UI is organized around a few simple ideas:
- `Launch` is a dedicated workspace for playground runs and script execution
- `Runs` is a dedicated paginated library for persisted run records
- `Run detail` is a separate inspector page for one run at a time
- the top of the detail view is the quick read
- the action-graph panel is the “debug this run from one artifact” view: planner rounds, research facts, worker lineage, verification outcomes, and transcript pointers all come from the same derived block in the saved run
- the policy panel shows the effective run ceiling plus saved validation output
- the replay panel shows whether a run already carries replay/eval assertions
- the flamegraph shows where time went
- the activity feed shows what the runtime actually did
- the transcript story shows the human-visible text that was preserved
- the stage detail drawers expose persisted per-stage policy, contracts, worker, prompt, and rendered-context metadata
The portal is intentionally generic. It does not assume a particular editor, client, or host integration. If Harn persisted the run, the portal can inspect it.
Live updates
The portal polls conservatively instead of hammering the run directory:
- the runs index refreshes on a slower cadence
- the selected run detail refreshes faster only while that run is still active
- hidden browser tabs do not poll
The portal also supports:
- deep-linking to a selected run via the URL
- manual refresh without waiting for the poll interval
- comparing a run against any other run of the same workflow, not just the latest earlier one
- surfacing action-graph, worker-lineage, transcript-pointer, and tool-result diffs alongside stage-level drift
Launch and playground
The portal can also launch Harn directly through a small control panel at the top of the page.
It supports three modes:
- existing .harn files from examples/ and conformance/tests/
- inline Harn source through the script editor
- a lightweight playground that turns a task plus provider/model selection into a real persisted workflow run
For local model servers, the launch UI also exposes the provider's endpoint-override environment variable when one exists, so you can point local or similar providers at another localhost or LAN address without editing config files first.
The portal now shows both roots explicitly in the launch panel:
- Workspace root: the directory where harn portal was started, and the current working directory for launches
- Run artifacts: the watched run directory passed via --dir
Inline and playground launches create a concrete per-job workspace under the watched run directory:
.harn-runs/playground/<job-id>/
workflow.harn
task.txt
launch.json
run.json
run-llm/llm_transcript.jsonl
That keeps the portal useful even before building a larger hosted playground: you get an inspectable source file, launch metadata, and a real run record that the debugger can reopen later.
Security and privacy constraints:
- env overrides are passed only to the child harn run process
- env overrides are validated as uppercase shell-style keys
- env values are not persisted in portal job state or run metadata
- launch file paths must stay inside the current workspace
- run inspection paths must stay inside the configured run directory
The transcript sidecar is only populated for runtime paths that currently emit
HARN_LLM_TRANSCRIPT_DIR output. Agent-loop traffic supports this today;
generic workflow-stage model calls may still only appear in the persisted run
record itself.
Saved model-turn detail
If a run has a sibling transcript sidecar directory named like:
.harn-runs/<run-id>.json
.harn-runs/<run-id>-llm/llm_transcript.jsonl
the portal will automatically render step-by-step model turns, including:
- kept vs newly added context
- saved request messages
- reply text
- tool calls
- token counts
- span ids
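The record-to-sidecar naming rule can be sketched as a small helper (hypothetical Python, not part of the Harn CLI):

```python
from pathlib import Path

def transcript_sidecar(run_record: str) -> Path:
    """Map a run record like .harn-runs/<run-id>.json to its sibling
    <run-id>-llm/llm_transcript.jsonl sidecar (naming rule from this guide)."""
    record = Path(run_record)
    return record.with_name(record.stem + "-llm") / "llm_transcript.jsonl"
```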
For richer live observability, Harn already exposes ACP session/update
notifications with:
- call_start
- call_progress
- call_end
- worker_update
Those can power a future streaming view without inventing a second provenance system alongside run records.
Skill observability
Each run detail page renders three skill-focused panels above the replay/eval section:
- Skill timeline — horizontal bars showing which skills activated on which agent-loop iteration and when they deactivated. Hover a bar for the matcher score and the reason the skill was promoted.
- Tool-load waterfall — one row per tool_search_query event, pairing each query with the tool_search_result that followed so you can see which deferred tools entered the LLM’s context in each turn.
- Matcher decisions — per-iteration expansions showing every candidate the matcher considered, its score, and the working-file snapshot it scored against.
The runs index also accepts a skill=<name> query parameter (and
exposes it as a filter input on the runs page), so you can narrow
evals to runs where a specific skill was active — useful when
validating that a new skill attracts the right prompts.
Orchestrator
harn orchestrator serve is the long-running process entry point for
manifest-driven trigger ingestion and connector activation.
Today, the command:
- loads harn.toml through the existing manifest loader
- boots the selected orchestrator role
- initializes the shared EventLog under --state-dir
- initializes the configured secret-provider chain
- resolves and registers manifest triggers
- activates connectors for the manifest’s providers
- binds an HTTP listener for webhook and a2a-push triggers
- writes a state snapshot and stays up until shutdown
Current limitations:
- multi-tenant returns a clear not-implemented error that points at O-12 #190
- inspect, replay, dlq, and queue are placeholders for O-08 #185
Command
harn orchestrator serve \
--config harn.toml \
--state-dir ./.harn/orchestrator \
--bind 0.0.0.0:8080 \
--cert certs/dev.pem \
--key certs/dev-key.pem \
--role single-tenant
Omit --cert and --key to serve plain HTTP. When both are present,
the listener serves HTTPS and terminates TLS with rustls.
On startup, the command logs the active secret-provider chain, loaded
triggers, registered connectors, and the actual bound listener URL. On
SIGTERM, it stops accepting new requests, lets in-flight requests drain,
appends lifecycle events to the EventLog, and persists a final
orchestrator-state.json snapshot under --state-dir.
--manifest is an alias for --config, and --listen is an alias for
--bind. Container deployments can also configure those through
HARN_ORCHESTRATOR_MANIFEST, HARN_ORCHESTRATOR_LISTEN,
HARN_ORCHESTRATOR_STATE_DIR, HARN_ORCHESTRATOR_CERT, and
HARN_ORCHESTRATOR_KEY.
On Unix, SIGHUP reloads manifest-backed HTTP trigger bindings without
rebinding the socket. The orchestrator reparses harn.toml,
re-collects manifest triggers, installs a new manifest binding version
for changed webhook / a2a-push entries, and swaps the live listener
route table in place. Requests already in flight keep the binding
version they started with; new requests route to the newest active
binding version. The orchestrator records reload_succeeded /
reload_failed events on orchestrator.manifest and refreshes
orchestrator-state.json after a successful reload.
Current reload scope is intentionally narrow: listener-wide settings
such as --bind, TLS files, allowed_origins, max_body_bytes, and
connector-managed trigger changes still require a full restart.
HTTP Listener
The orchestrator listener assembles routes from [[triggers]] entries
with kind = "webhook" or kind = "a2a-push".
- If a trigger declares path = "/github/issues", that path is used.
- Otherwise the route defaults to /triggers/<id>.
- /health, /healthz, and /readyz are reserved listener endpoints; use GET /health for container health checks.
Accepted deliveries are normalized into TriggerEvent records and
appended to the shared orchestrator.triggers.pending queue in the
event log for downstream dispatch.
Hot reload uses the trigger registry’s versioned manifest bindings. A modified trigger id drains the old binding version, activates a new version, and keeps terminated versions around for a short retention window so operators can inspect the handoff without the registry growing unbounded.
Listener controls
Listener-wide controls live under [orchestrator] in harn.toml.
[orchestrator]
allowed_origins = ["https://app.example.com"]
max_body_bytes = 10485760
- allowed_origins defaults to ["*"] semantics when omitted or empty. Requests with an Origin header outside the allowlist are rejected with 403 Forbidden.
- max_body_bytes defaults to 10485760 bytes (10 MiB). Larger requests are rejected with 413 Payload Too Large.
Listener auth
Health probes stay public:
- GET /health
- GET /healthz
- GET /readyz
Webhook routes keep using their provider-specific signature checks.
a2a-push routes require either a bearer API key or a shared-secret
HMAC authorization header.
Configure the auth material with environment variables:
export HARN_ORCHESTRATOR_API_KEYS="dev-key-1,dev-key-2"
export HARN_ORCHESTRATOR_HMAC_SECRET="replace-me"
Bearer requests use:
Authorization: Bearer <api-key>
HMAC requests use:
Authorization: HMAC-SHA256 timestamp=<unix>,signature=<base64>
The canonical string is:
METHOD
PATH
TIMESTAMP
SHA256(BODY)
METHOD is uppercased, PATH is the request path without the query
string, TIMESTAMP is a Unix epoch seconds value, and SHA256(BODY) is
the lowercase hex digest of the raw request body. Timestamps outside the
5-minute replay window are rejected with 401 Unauthorized.
Deployment
Release tags publish a distroless container image to
ghcr.io/burin-labs/harn for both linux/amd64 and linux/arm64.
docker run \
-p 8080:8080 \
-v "$PWD/triggers.toml:/etc/harn/triggers.toml:ro" \
-e HARN_ORCHESTRATOR_API_KEYS=xxx \
-e HARN_ORCHESTRATOR_HMAC_SECRET=replace-me \
-e RUST_LOG=info \
ghcr.io/burin-labs/harn
The image runs as UID 10001 and stores orchestrator state under
/var/lib/harn/state by default. Override the startup contract with
environment variables instead of replacing the entrypoint:
- HARN_ORCHESTRATOR_MANIFEST defaults to /etc/harn/triggers.toml
- HARN_ORCHESTRATOR_LISTEN defaults to 0.0.0.0:8080
- HARN_ORCHESTRATOR_STATE_DIR defaults to /var/lib/harn/state
- HARN_ORCHESTRATOR_API_KEYS supplies bearer credentials for authenticated a2a-push routes
- HARN_ORCHESTRATOR_HMAC_SECRET supplies the shared secret for canonical-request HMAC auth on a2a-push routes
- HARN_SECRET_*, provider API-key env vars, and deployment-specific HARN_PROVIDER_* values are passed through to connector/provider code
- RUST_LOG controls runtime log verbosity
The image healthcheck issues GET /health against the local listener, so
it works with Docker, BuildKit smoke tests, and most container platforms
without requiring curl inside the distroless runtime.
Trigger examples
[[triggers]]
id = "github-new-issue"
kind = "webhook"
provider = "github"
path = "/triggers/github-new-issue"
match = { events = ["issues.opened"] }
handler = "handlers::on_new_issue"
secrets = { signing_secret = "github/webhook-secret" }
[[triggers]]
id = "incoming-review-task"
kind = "a2a-push"
provider = "a2a-push"
path = "/a2a/review"
match = { events = ["a2a.task.received"] }
handler = "a2a://reviewer.prod/triage"
GitHub webhook triggers verify the X-Hub-Signature-256 HMAC against
secrets.signing_secret before enqueueing. Generic provider = "webhook"
triggers use the shared Standard Webhooks verifier. a2a-push routes
require either Authorization: Bearer <api-key> or a valid
Authorization: HMAC-SHA256 ... header before enqueueing.
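The GitHub-style check can be sketched in Python (this is the standard X-Hub-Signature-256 scheme, not Harn’s actual verifier code):

```python
import hashlib
import hmac

def verify_github_signature(signing_secret: str, body: bytes, header: str) -> bool:
    """Compare an X-Hub-Signature-256 header ("sha256=<hex>") against an
    HMAC of the raw body, using a constant-time comparison."""
    expected = "sha256=" + hmac.new(
        signing_secret.encode(), body, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, header)
```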
Orchestrator Secrets
Reactive Harn features need a single way to fetch secrets without
sprinkling provider-specific code across connectors, OAuth flows, and
future orchestrator runtime surfaces. The secret layer lives in
harn_vm::secrets and currently ships with two concrete providers:
- EnvSecretProvider
- KeyringSecretProvider
The default chain is:
env -> keyring
Use harn doctor --no-network to inspect the active chain and to verify
that the keyring backend is reachable on the current machine.
Secret model
Secrets are addressed by a structured SecretId:
use harn_vm::secrets::{SecretId, SecretVersion};

let id = SecretId::new(
    "harn.orchestrator.github",
    "installation-12345/private-key",
)
.with_version(SecretVersion::Latest);
Secret values are held in SecretBytes:
- bytes are zeroized on drop
- Debug is redacted
- Display is intentionally absent
- explicit duplication requires reborrow()
- callers expose bytes via with_exposed(|bytes| ...)
Successful get() calls also emit a structured audit event through the
existing VM event sink with the secret id, provider name, caller span,
mutation session id when present, and a timestamp. The event payload never
contains the secret bytes.
Provider chain configuration
The provider order is controlled with HARN_SECRET_PROVIDERS:
export HARN_SECRET_PROVIDERS=env,keyring
The doctor output also reports a namespace used for backend grouping. By
default Harn derives it as harn/<current-directory-name>. Override it
with:
export HARN_SECRET_NAMESPACE="harn/my-workspace"
Environment provider
EnvSecretProvider is first in the chain so CI, local shells, and
containers can override secrets without touching the OS credential store.
Environment variable names use:
HARN_SECRET_<NAMESPACE>_<NAME>
For example:
export HARN_SECRET_HARN_ORCHESTRATOR_GITHUB_INSTALLATION_12345_PRIVATE_KEY="$(cat github-app.pem)"
Non-alphanumeric characters are normalized to underscores and multiple separators collapse.
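The normalization rule can be expressed as a short helper (behavior inferred from the example above; this is an illustrative sketch, not the actual harn_vm code):

```python
import re

def secret_env_var(namespace: str, name: str) -> str:
    """Derive HARN_SECRET_<NAMESPACE>_<NAME>: uppercase, with runs of
    non-alphanumeric characters collapsed into single underscores."""
    def norm(part: str) -> str:
        return re.sub(r"[^A-Za-z0-9]+", "_", part).strip("_").upper()
    return f"HARN_SECRET_{norm(namespace)}_{norm(name)}"
```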
Keyring provider
KeyringSecretProvider uses the keyring
crate so the same code path works against:
- macOS Keychain
- Linux native keyring / Secret Service backends supported by keyring
- Windows Credential Manager
This is the default local-first provider. The CLI already uses it for MCP
OAuth token storage, and harn doctor probes it directly.
Recommended setups
Laptop development:
export HARN_SECRET_PROVIDERS=env,keyring
CI or containers:
export HARN_SECRET_PROVIDERS=env
Cloud deployments:
Today, use env for injected platform secrets. The SecretProvider
surface is intentionally ready for Vault / AWS / GCP implementations, but
those provider backends are not wired in yet.
CLI reference
All commands available in the harn CLI.
harn run
Execute a .harn file.
harn run <file.harn>
harn run --trace <file.harn>
harn run -e 'println("hello")'
harn run --deny shell,exec <file.harn>
harn run --allow read_file,write_file <file.harn>
| Flag | Description |
|---|---|
--trace | Print LLM trace summary after execution |
-e <code> | Evaluate inline code instead of a file |
--deny <builtins> | Deny specific builtins (comma-separated) |
--allow <builtins> | Allow only specific builtins (comma-separated) |
You can also run a file directly without the run subcommand:
harn main.harn
Before starting the VM, harn run <file> builds the cross-module
graph for the entry file. When all imports resolve, unknown call
targets produce a static error and the VM is never started — the same
call target ... is not defined or imported message you see from
harn check. The inline -e <code> form has no importing file and
therefore skips the cross-module check.
harn playground
Run a pipeline against a Harn-native host module for fast local iteration.
harn playground --host host.harn --script pipeline.harn --task "Explain this repo"
harn playground --watch --task "Refine the prompt"
harn playground --llm ollama:qwen2.5-coder:latest --task "Use a local model"
| Flag | Description |
|---|---|
--host <file> | Host module exporting the functions the script expects (default: host.harn) |
--script <file> | Pipeline entrypoint to execute (default: pipeline.harn) |
--task <text> | Task string exposed as HARN_TASK during the run |
--llm <provider:model> | Override the provider/model selection for this invocation |
--llm-mock <path> | Replay LLM responses from a JSONL fixture file instead of calling the provider |
--llm-mock-record <path> | Record executed LLM responses into a JSONL fixture file |
--watch | Re-run when the host module or script changes |
harn playground type-checks the host module, merges its exported function
names into the script’s static call-target validation, then executes the script
with an in-process host adapter. Missing host functions fail with a pointed
error naming the function and caller location.
harn test
Run tests.
harn test conformance # run conformance test suite
harn test conformance tests/language/arithmetic.harn # run one conformance file
harn test conformance tests/stdlib/ # run a conformance subtree
harn test tests/ # run user tests in directory
harn test tests/ --filter "auth*" # filter by pattern
harn test tests/ --parallel # run tests concurrently
harn test tests/ --watch # re-run on file changes
harn test conformance --verbose # show per-test timing
harn test conformance --timing # show timing summary without verbose failures
harn test tests/ --record # record LLM fixtures
harn test tests/ --replay # replay LLM fixtures
| Flag | Description |
|---|---|
--filter <pattern> | Only run tests matching pattern |
--parallel | Run tests concurrently |
--watch | Re-run tests on file changes |
--verbose / -v | Show per-test timing and detailed failures |
--timing | Show per-test timing plus summary statistics |
--junit <path> | Write JUnit XML report |
--timeout <ms> | Per-test timeout in milliseconds (default: 30000) |
--record | Record LLM responses to .harn-fixtures/ |
--replay | Replay recorded LLM responses |
When no path is given, harn test auto-discovers a tests/ directory
in the current folder. Conformance targets must resolve to a file or directory
inside conformance/; the CLI now errors instead of silently falling back to
the full suite when a requested target is missing.
harn repl
Start an interactive REPL with syntax highlighting, multiline editing, live
builtin completion, and persistent history in ~/.harn/repl_history.
harn repl
The REPL keeps incomplete blocks open until braces, brackets, parentheses, and quoted strings are balanced, so you can paste or type multi-line pipelines and control-flow blocks directly.
harn bench
Benchmark a .harn file over repeated runs.
harn bench main.harn
harn bench main.harn --iterations 25
harn bench parses and compiles the file once, executes it with a fresh VM for
each iteration, and reports wall time plus aggregated LLM token, call, and cost
metrics.
harn viz
Render a .harn file as a Mermaid flowchart.
harn viz main.harn
harn viz main.harn --output docs/graph.mmd
harn viz parses the file, walks the AST, and emits a Mermaid flowchart TD
graph showing pipelines, functions, branches, loops, and other workflow-shaped
control-flow nodes.
harn fmt
Format .harn source files. Accepts files or directories.
harn fmt main.harn
harn fmt src/
harn fmt --check main.harn # check mode (no changes, exit 1 if unformatted)
harn fmt --line-width 80 main.harn # custom line width
| Flag | Description |
|---|---|
--check | Check mode: exit 1 if any file would be reformatted, make no changes |
--line-width <N> | Maximum line width before wrapping (default: 100) |
The formatter enforces a 100-character line width by default (overridable with --line-width). When a line exceeds
this limit the formatter wraps it automatically:
- Comma-separated forms — function call arguments, function declaration parameters, list literals, dict literals, struct construction fields, enum constructor payloads, selective import names, interface method parameters, and enum variant fields all wrap with one item per line and trailing commas.
- Binary operator chains — long expressions like a + b + c + d break before the operator. Operators that the parser cannot resume across a bare newline (-, ==, !=, <, >, <=, >=, in, not in, ??) get an automatic backslash continuation (\); other operators (+, *, /, %, ||, &&, |>) break without one.
- Operator precedence parentheses — the formatter inserts parentheses to preserve semantics when the AST drops them (e.g. a * (b + c) stays parenthesised) and for clarity when mixing && / || (e.g. a && b || c becomes (a && b) || c).
harn lint
Lint one or more .harn files or directories for common issues (unused
variables, unused functions, unreachable code, empty blocks, missing
/** */ HarnDoc on public functions, etc.).
harn lint main.harn
harn lint src/ tests/
Pass --fix to automatically apply safe fixes (e.g., var → let for
never-reassigned bindings, boolean comparison simplification, unused import
removal, and string interpolation conversion):
harn lint --fix main.harn
harn check
Type-check one or more .harn files or directories and run preflight
validation without executing them. The preflight pass resolves imports, checks
literal render(...) / render_prompt(...) targets, detects import symbol collisions across
modules, validates host_call("capability.operation", ...) capability
contracts, and flags missing template resources, execution directories, and worker repos that would
otherwise fail only at runtime. Source-aware lint rules run as part of
check, including the missing-harndoc warning for undocumented pub fn
APIs.
check builds a cross-module graph from each entry file and follows
import statements recursively. When every import in a file resolves,
the typechecker knows the exact set of names that module brings into
scope and will emit a hard error for any call target that is neither a
builtin, a local declaration, a struct constructor, a callable
variable, nor an imported symbol:
error: call target `helpr` is not defined or imported
This catches typos and stale imports before the VM runs. If any import in the file is unresolved, the stricter check is turned off for that file so one broken import does not avalanche into spurious errors — the unresolved import itself still fails at runtime.
harn check main.harn
harn check src/ tests/
harn check --host-capabilities host-capabilities.json main.harn
harn check --bundle-root .bundle main.harn
harn check --workspace
harn check --preflight warning src/
| Flag | Description |
|---|---|
--host-capabilities <file> | Load a host capability manifest for preflight validation. Supports plain {capability: [ops...]} objects, nested {capabilities: ...} wrappers, and per-op metadata dictionaries. Overrides [check].host_capabilities_path in harn.toml. |
--bundle-root <dir> | Validate render(...), render_prompt(...), and template paths against an alternate bundled layout root |
--workspace | Walk every path listed in [workspace].pipelines of the nearest harn.toml. Positional targets remain additive. |
--preflight <severity> | Override preflight diagnostic severity: error (default, fails the check), warning (reports but does not fail), or off (suppresses all preflight diagnostics). Overrides [check].preflight_severity. |
--strict-types | Flag unvalidated boundary-API values used in field access. |
harn.toml — [check] and [workspace] sections
harn check walks upward from the target file (stopping at the first .git
directory) to find the nearest harn.toml. The following keys are honored:
[check]
# Load an external capability manifest. Path is resolved relative to
# harn.toml. Accepts JSON or TOML with the namespaced shape
# { workspace = [...], process = [...], project = [...], ... }.
host_capabilities_path = "./schemas/host-capabilities.json"
# Or declare inline:
[check.host_capabilities]
project = ["ensure_enriched", "enrich"]
workspace = ["read_text", "write_text"]
[check]
# Downgrade preflight errors to warnings (or suppress entirely with "off").
# Keeps type diagnostics visible while an external capability schema is
# still catching up to a host's live surface.
preflight_severity = "warning"
# Suppress preflight diagnostics for specific capabilities/operations.
# Entries match either an exact "capability.operation" pair, a
# "capability.*" wildcard, a bare "capability" name, or a blanket "*".
preflight_allow = ["mystery.*", "runtime.task"]
[workspace]
# Directories or files checked by `harn check --workspace`. Paths are
# resolved relative to harn.toml.
pipelines = ["Sources/BurinCore/Resources/pipelines", "scripts"]
Preflight diagnostics are reported under the preflight category so they
can be distinguished from type-checker errors in IDE output streams and
CI log filters.
harn contracts
Export machine-readable contracts for hosts, release tooling, and embedded bundles.
harn contracts builtins
harn contracts host-capabilities --host-capabilities host-capabilities.json
harn contracts bundle main.harn --verify
harn contracts bundle src/ --bundle-root .bundle --host-capabilities host-capabilities.json
harn contracts builtins
Print the parser/runtime builtin registry as JSON, including return-type hints and alignment status.
harn contracts host-capabilities
Print the effective host-capability manifest used by preflight validation after merging the built-in defaults with any external manifest file.
harn contracts bundle
Print a bundle manifest for one or more .harn targets. The manifest includes:
- explicit entry_modules, import_modules, and module_dependencies edges
- explicit prompt_assets and template_assets slices, plus a full assets table resolved through the same source-relative rules as render(...)
- required host capabilities discovered from literal host_call(...) sites
- literal execution directories and worker worktree repos
- a summary block with stable counts for packagers and release tooling
Use --verify to run normal Harn preflight validation before emitting the
bundle manifest and return a non-zero exit code if the selected targets are not
bundle-safe.
harn init
Scaffold a new project with harn.toml and main.harn.
harn init # create in current directory
harn init my-project # create in a new directory
harn init --template eval
harn new
Scaffold a new project from a starter template. Supported templates are
basic, agent, mcp-server, and eval.
harn new my-agent --template agent
harn new local-mcp --template mcp-server
harn new eval-suite --template eval
harn init and harn new share the same scaffolding engine. Use init for
the default quick-start flow and new when you want the template choice to be
explicit.
harn doctor
Inspect the local environment and report the current Harn setup, including the resolved secret-provider chain and keyring health.
harn doctor
harn doctor --no-network
harn watch
Watch a file for changes and re-run it automatically.
harn watch main.harn
harn watch --deny shell main.harn
harn acp
Start an ACP (Agent Client Protocol) server on stdio.
harn acp # bridge mode, no pipeline
harn acp pipeline.harn # execute a pipeline per prompt
See MCP and ACP Integration for protocol details.
harn portal
Launch the local Harn observability portal for persisted runs.
harn portal
harn portal --dir runs/archive
harn portal --host 0.0.0.0 --port 4900
harn portal --open false
See Harn Portal for the full guide.
harn runs
Inspect persisted workflow run records.
harn runs inspect .harn-runs/<run>.json
harn runs inspect .harn-runs/<run>.json --compare baseline.json
harn replay
Replay a persisted workflow run record from saved output.
harn replay .harn-runs/<run>.json
harn eval
Evaluate a persisted workflow run record as a regression fixture.
harn eval .harn-runs/<run>.json
harn eval .harn-runs/<run>.json --compare baseline.json
harn eval .harn-runs/
harn eval evals/regression.json
harn eval accepts three inputs:
- a single run record JSON file
- a directory of run record JSON files
- an eval suite manifest JSON file with grouped cases and optional baseline comparisons
harn serve
Start an A2A (Agent-to-Agent) HTTP server.
harn serve agent.harn # default port 8080
harn serve --port 3000 agent.harn # custom port
See MCP and ACP Integration for protocol details.
harn mcp-serve
Serve a Harn pipeline as an MCP server over stdio.
harn mcp-serve agent.harn
See MCP and ACP Integration for details on defining tools, resources, and prompts.
harn mcp
Manage standalone OAuth state for remote HTTP MCP servers.
harn mcp redirect-uri
harn mcp login notion
harn mcp login https://mcp.notion.com/mcp
harn mcp login my-server --url https://example.com/mcp --client-id <id> --client-secret <secret>
harn mcp status notion
harn mcp logout notion
harn mcp login resolves the server from the nearest harn.toml when you pass
an MCP server name, or uses the explicit URL when you pass --url or a raw
https://... target. The CLI:
- discovers OAuth protected resource and authorization server metadata
- prefers pre-registered client_id / client_secret values when supplied
- falls back to dynamic client registration when supported by the server
- stores tokens in the local OS keychain and refreshes them automatically
Relevant flags:
| Flag | Description |
|---|---|
--url <url> | Explicit MCP server URL when logging in/out by a custom name |
--client-id <id> | Use a pre-registered client ID instead of dynamic registration |
--client-secret <secret> | Optional client secret for client_secret_post / client_secret_basic servers |
--scope <scopes> | Override or provide requested OAuth scopes |
--redirect-uri <uri> | Override the default loopback redirect URI (default shown by harn mcp redirect-uri) |
Security guidance:
- prefer the narrowest scopes the remote MCP server supports
- treat configured client_secret values as secrets
- review remote MCP capabilities before using them in autonomous workflows
Release gate
For repo maintainers, the deterministic full-release path is:
./scripts/release_ship.sh --bump patch
This runs audit → dry-run publish → bump → commit → tag → push → cargo publish
→ GitHub release in that order. Pushing happens before cargo publish so
downstream consumers (GitHub release binary workflows, burin-code’s
fetch-harn) start in parallel with crates.io.
For piecewise work, the docs audit, verification gate, bump flow, and publish sequence are exposed individually:
./scripts/release_gate.sh audit
./scripts/release_gate.sh full --bump patch --dry-run
harn add
Add a dependency to harn.toml.
harn add my-lib --git https://github.com/user/my-lib
harn install
Install dependencies declared in harn.toml.
harn install
harn version
Show version information.
harn version
Builtin functions
Complete reference for all built-in functions available in Harn.
Output
| Function | Parameters | Returns | Description |
|---|---|---|---|
log(msg) | msg: any | nil | Print with [harn] prefix and newline |
print(msg) | msg: any | nil | Print without prefix or newline |
println(msg) | msg: any | nil | Print with newline, no prefix |
progress(phase, message, progress?, total?) | phase: string, message: string, optional numeric progress | nil | Emit standalone progress output. Dict options support mode: "spinner" with step, or mode: "bar" with current, total, and optional width |
color(text, name) | text: any, name: string | string | Wrap text with an ANSI foreground color code |
bold(text) | text: any | string | Wrap text with ANSI bold styling |
dim(text) | text: any | string | Wrap text with ANSI dim styling |
Type conversion
| Function | Parameters | Returns | Description |
|---|---|---|---|
type_of(value) | value: any | string | Returns type name: "int", "float", "string", "bool", "nil", "list", "dict", "closure", "taskHandle", "duration", "enum", "struct" |
to_string(value) | value: any | string | Convert to string representation |
to_int(value) | value: any | int or nil | Parse/convert to integer. Floats truncate, bools become 0/1 |
to_float(value) | value: any | float or nil | Parse/convert to float |
unreachable(value?) | value: any (optional) | never | Throws “unreachable code was reached” at runtime. When the argument is a variable, the type checker verifies it has been narrowed to never (exhaustiveness check) |
iter(x) | x: list, dict, set, string, generator, channel, or iter | Iter<T> | Lift an iterable source into a lazy, single-pass, fused iterator. No-op on an existing iter. Dict iters yield Pair(key, value); string iters yield chars. See Iterator methods |
pair(a, b) | a: any, b: any | Pair | Construct a two-element Pair value. Access via .first / .second, or destructure in a for-loop: for (k, v) in ... |
Runtime shape validation
Function parameters with structural type annotations (shapes) are validated at runtime. If a dict or struct argument is missing a required field or has the wrong field type, a descriptive error is thrown before the function body executes.
fn greet(u: {name: string, age: int}) {
println("${u.name} is ${u.age}")
}
greet({name: "Alice", age: 30}) // OK
greet({name: "Alice"}) // Error: parameter 'u': missing field 'age' (int)
See Error handling – Runtime shape validation errors for more details.
Result
Harn has a built-in Result type for representing success/failure values
without exceptions. Ok and Err create Result.Ok and Result.Err
enum variants respectively. When called on a non-Result value, unwrap
and unwrap_or pass the value through unchanged.
| Function | Parameters | Returns | Description |
|---|---|---|---|
Ok(value) | value: any | Result.Ok | Create a Result.Ok value |
Err(value) | value: any | Result.Err | Create a Result.Err value |
is_ok(result) | result: any | bool | Returns true if value is Result.Ok |
is_err(result) | result: any | bool | Returns true if value is Result.Err |
unwrap(result) | result: any | any | Extract Ok value. Throws on Err. Non-Result values pass through |
unwrap_or(result, default) | result: any, default: any | any | Extract Ok value. Returns default on Err. Non-Result values pass through |
unwrap_err(result) | result: any | any | Extract Err value. Throws on non-Err |
Example:
let good = Ok(42)
let bad = Err("something went wrong")
println(is_ok(good)) // true
println(is_err(bad)) // true
println(unwrap(good)) // 42
println(unwrap_or(bad, 0)) // 0
println(unwrap_err(bad)) // something went wrong
JSON
| Function | Parameters | Returns | Description |
|---|---|---|---|
json_parse(str) | str: string | value | Parse JSON string into Harn values. Throws on invalid JSON |
json_stringify(value) | value: any | string | Serialize Harn value to JSON. Closures and handles become null |
yaml_parse(str) | str: string | value | Parse YAML string into Harn values. Throws on invalid YAML |
yaml_stringify(value) | value: any | string | Serialize Harn value to YAML |
toml_parse(str) | str: string | value | Parse TOML string into Harn values. Throws on invalid TOML |
toml_stringify(value) | value: any | string | Serialize Harn value to TOML |
json_validate(data, schema) | data: any, schema: dict | bool | Validate data against a schema. Returns true if valid, throws with details if not |
schema_check(data, schema) | data: any, schema: dict | Result | Validate data against an extended schema and return Result.Ok(data) or Result.Err({message, errors, value?}) |
schema_parse(data, schema) | data: any, schema: dict | Result | Same as schema_check, but applies default values recursively |
schema_is(data, schema) | data: any, schema: dict | bool | Validate data against a schema and return true/false without throwing |
schema_expect(data, schema, apply_defaults?) | data: any, schema: dict, bool (optional) | any | Validate data and return the normalized value, throwing on failure |
schema_from_json_schema(schema) | schema: dict | dict | Convert a JSON Schema object into Harn’s canonical schema dict |
schema_from_openapi_schema(schema) | schema: dict | dict | Convert an OpenAPI Schema Object into Harn’s canonical schema dict |
schema_to_json_schema(schema) | schema: dict | dict | Convert an extended Harn schema into JSON Schema |
schema_to_openapi_schema(schema) | schema: dict | dict | Convert an extended Harn schema into an OpenAPI-friendly schema object |
schema_extend(base, overrides) | base: dict, overrides: dict | dict | Shallow-merge two schema dicts |
schema_partial(schema) | schema: dict | dict | Remove required recursively so properties become optional |
schema_pick(schema, keys) | schema: dict, keys: list | dict | Keep only selected top-level properties |
schema_omit(schema, keys) | schema: dict, keys: list | dict | Remove selected top-level properties |
json_extract(text, key?) | text: string, key: string (optional) | value | Extract JSON from text (strips markdown code fences). If key given, returns that key’s value |
Type mapping:
| JSON | Harn |
|---|---|
| string | string |
| integer | int |
| decimal/exponent | float |
| true/false | bool |
| null | nil |
| array | list |
| object | dict |
Canonical schema format
The canonical schema is a plain Harn dict. The validator also accepts compatible
JSON Schema / OpenAPI Schema Object spellings such as object, array,
integer, number, boolean, oneOf, allOf, minLength, maxLength,
minItems, maxItems, and additionalProperties, normalizing them into the
same internal form.
Supported canonical keys:
| Key | Type | Description |
|---|---|---|
type | string | Expected type: "string", "int", "float", "bool", "list", "dict", "any" |
required | list | List of required key names (for dicts) |
properties | dict | Dict mapping property names to sub-schemas (for dicts) |
items | dict | Schema to validate each item against (for lists) |
additional_properties | bool or dict | Whether unknown dict keys are allowed, or which schema they must satisfy |
Example:
let schema = {
type: "dict",
required: ["name", "age"],
properties: {
name: {type: "string"},
age: {type: "int"},
tags: {type: "list", items: {type: "string"}}
}
}
json_validate(data, schema) // throws if invalid
Extended schema constraints
The schema builtins support these additional keys:
| Key | Type | Description |
|---|---|---|
nullable | bool | Allow nil |
min / max | int or float | Numeric bounds |
min_length / max_length | int | String length bounds |
pattern | string | Regex pattern for strings |
enum | list | Allowed literal values |
const | any | Exact required literal value |
min_items / max_items | int | List length bounds |
union | list of schemas | Value must match one schema |
all_of | list of schemas | Value must satisfy every schema |
default | any | Default value applied by schema_parse |
Example:
let user_schema = {
type: "dict",
required: ["name", "age"],
properties: {
name: {type: "string", min_length: 1},
age: {type: "int", min: 0},
role: {type: "string", enum: ["admin", "user"], default: "user"}
}
}
let parsed = schema_parse({name: "Ada", age: 36}, user_schema)
println(is_ok(parsed))
println(unwrap(parsed).role)
println(schema_to_json_schema(user_schema).type)
schema_is(...) is useful for dynamic checks and can participate in static
type refinement when the schema is a literal (or a variable bound from a
literal schema).
The lazily loaded std/schema module provides ergonomic builders such as
schema_string(), schema_object(...), schema_union(...),
get_typed_result(...), get_typed_value(...), and is_type(...).
Composition helpers:
let public_user = schema_pick(user_schema, ["name", "role"])
let patch_schema = schema_partial(user_schema)
let admin_user = schema_extend(user_schema, {
properties: {
name: {type: "string", min_length: 1},
age: {type: "int", min: 0},
role: {type: "string", enum: ["admin"], default: "admin"}
}
})
json_extract
Extracts JSON from LLM responses that may contain markdown code fences
or surrounding prose. Handles ```json ... ```, ``` ... ```,
and bare JSON with surrounding text. Uses balanced bracket matching to
correctly extract nested objects and arrays from mixed prose.
let result = llm_call("Return JSON with name and age")
let data = json_extract(result.text) // parse, stripping fences
let name = json_extract(result.text, "name") // extract just one key
Math
| Function | Parameters | Returns | Description |
|---|---|---|---|
abs(n) | n: int or float | int or float | Absolute value |
ceil(n) | n: float | int | Ceiling (rounds up). Ints pass through unchanged |
floor(n) | n: float | int | Floor (rounds down). Ints pass through unchanged |
round(n) | n: float | int | Round to nearest integer. Ints pass through unchanged |
sqrt(n) | n: int or float | float | Square root |
pow(base, exp) | base: number, exp: number | int or float | Exponentiation. Returns int when both args are int and exp is non-negative |
min(a, b) | a: number, b: number | int or float | Minimum of two values. Returns float if either argument is float |
max(a, b) | a: number, b: number | int or float | Maximum of two values. Returns float if either argument is float |
random() | none | float | Random float in [0, 1) |
random_int(min, max) | min: int, max: int | int | Random integer in [min, max] inclusive |
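A short example combining the math builtins above (random_int is non-deterministic, so its result is only bounded):
println(abs(-5))    // 5
println(ceil(2.1))  // 3
println(floor(2.9)) // 2
println(pow(2, 10)) // 1024 (int result: both args int, exp non-negative)
let roll = random_int(1, 6)  // integer in [1, 6] inclusive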
Trigonometry
| Function | Parameters | Returns | Description |
|---|---|---|---|
sin(n) | n: float | float | Sine (radians) |
cos(n) | n: float | float | Cosine (radians) |
tan(n) | n: float | float | Tangent (radians) |
asin(n) | n: float | float | Inverse sine |
acos(n) | n: float | float | Inverse cosine |
atan(n) | n: float | float | Inverse tangent |
atan2(y, x) | y: float, x: float | float | Two-argument inverse tangent |
Logarithms and exponentials
| Function | Parameters | Returns | Description |
|---|---|---|---|
log2(n) | n: float | float | Base-2 logarithm |
log10(n) | n: float | float | Base-10 logarithm |
ln(n) | n: float | float | Natural logarithm |
exp(n) | n: float | float | Euler’s number raised to the power n |
Constants and utilities
| Function | Parameters | Returns | Description |
|---|---|---|---|
pi | — | float | The constant pi (3.14159…) |
e | — | float | Euler’s number (2.71828…) |
sign(n) | n: int or float | int | Sign of a number: -1, 0, or 1 |
is_nan(n) | n: float | bool | Check if value is NaN |
is_infinite(n) | n: float | bool | Check if value is infinite |
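A brief sketch using the trigonometric helpers and constants (exact float formatting may vary):
println(sign(-3))                  // -1
let right_angle = atan2(1.0, 0.0)  // pi/2 radians
println(right_angle < pi)          // true
println(is_nan(sqrt(4.0)))         // false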
Sets
| Function | Parameters | Returns | Description |
|---|---|---|---|
set(items?) | items: list (optional) | set | Create a new set, optionally from a list |
set_add(s, value) | s: set, value: any | set | Add a value to a set, returns new set |
set_remove(s, value) | s: set, value: any | set | Remove a value from a set, returns new set |
set_contains(s, value) | s: set, value: any | bool | Check if set contains a value |
set_union(a, b) | a: set, b: set | set | Union of two sets |
set_intersect(a, b) | a: set, b: set | set | Intersection of two sets |
set_difference(a, b) | a: set, b: set | set | Difference (elements in a but not b) |
set_symmetric_difference(a, b) | a: set, b: set | set | Elements in either but not both |
set_is_subset(a, b) | a: set, b: set | bool | True if all elements of a are in b |
set_is_superset(a, b) | a: set, b: set | bool | True if a contains all elements of b |
set_is_disjoint(a, b) | a: set, b: set | bool | True if a and b share no elements |
to_list(s) | s: set | list | Convert a set to a list |
Set methods (dot syntax)
Sets also support method syntax: my_set.union(other).
| Method | Parameters | Returns | Description |
|---|---|---|---|
.count() / .len() | none | int | Number of elements |
.empty() | none | bool | True if set is empty |
.contains(val) | val: any | bool | Check membership |
.add(val) | val: any | set | New set with val added |
.remove(val) | val: any | set | New set with val removed |
.union(other) | other: set | set | Union |
.intersect(other) | other: set | set | Intersection |
.difference(other) | other: set | set | Elements in self but not other |
.symmetric_difference(other) | other: set | set | Elements in either but not both |
.is_subset(other) | other: set | bool | True if self is a subset of other |
.is_superset(other) | other: set | bool | True if self is a superset of other |
.is_disjoint(other) | other: set | bool | True if no shared elements |
.to_list() | none | list | Convert to list |
.map(fn) | fn: closure | set | Transform elements (deduplicates) |
.filter(fn) | fn: closure | set | Keep elements matching predicate |
.any(fn) | fn: closure | bool | True if any element matches |
.all(fn) / .every(fn) | fn: closure | bool | True if all elements match |
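A sketch of set operations using both the function and method forms described above:
let a = set([1, 2, 3])
let b = set([3, 4])
println(a.contains(2))                 // true
println(a.union(b).count())            // 4
println(a.is_disjoint(b))              // false
println(set_difference(a, b).count())  // 2
println(set_is_subset(set([3]), b))    // true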
String functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
len(value) | value: string, list, or dict | int | Length of string (chars), list (items), or dict (keys) |
trim(str) | str: string | string | Remove leading and trailing whitespace |
lowercase(str) | str: string | string | Convert to lowercase |
uppercase(str) | str: string | string | Convert to uppercase |
split(str, sep) | str: string, sep: string | list | Split string by separator |
starts_with(str, prefix) | str: string, prefix: string | bool | Check if string starts with prefix |
ends_with(str, suffix) | str: string, suffix: string | bool | Check if string ends with suffix |
contains(str, substr) | str: string, substr: string | bool | Check if string contains substring. Also works on lists |
replace(str, old, new) | str: string, old: string, new: string | string | Replace all occurrences |
join(list, sep) | list: list, sep: string | string | Join list elements with separator |
substring(str, start, len?) | str: string, start: int, len: int (optional) | string | Extract substring from start position |
format(template, ...) | template: string, args: any | string | Format string with {} placeholders. With a dict as the second arg, supports named {key} placeholders |
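A quick sketch of the string functions, based on the descriptions above:
println(trim("  hi  "))                    // hi
println(uppercase("harn"))                 // HARN
println(len(split("a,b,c", ",")))          // 3
println(replace("2024-01-15", "-", "/"))   // 2024/01/15
println(starts_with("harn.toml", "harn"))  // true
println(join(["a", "b", "c"], "-"))        // a-b-c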
String methods (dot syntax)
These are called on string values with dot notation: "hello".uppercase().
| Method | Parameters | Returns | Description |
|---|---|---|---|
.trim() | none | string | Remove leading/trailing whitespace |
.trim_start() | none | string | Remove leading whitespace only |
.trim_end() | none | string | Remove trailing whitespace only |
.lines() | none | list | Split string by newlines |
.char_at(index) | index: int | string or nil | Character at index (nil if out of bounds) |
.index_of(substr) | substr: string | int | First character offset of substring (-1 if not found) |
.last_index_of(substr) | substr: string | int | Last character offset of substring (-1 if not found) |
.lower() / .to_lower() | none | string | Lowercase string |
.len() | none | int | Character count |
.upper() / .to_upper() | none | string | Uppercase string |
.chars() | none | list | List of single-character strings |
.reverse() | none | string | Reversed string |
.repeat(n) | n: int | string | Repeat n times |
.pad_left(width, char?) | width: int, char: string | string | Pad to width with char (default space) |
.pad_right(width, char?) | width: int, char: string | string | Pad to width with char (default space) |
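The same operations are available in chained method form:
println("harn".upper())           // HARN
println("ab".repeat(3))           // ababab
println("7".pad_left(3, "0"))     // 007
println("hello".index_of("l"))    // 2
println("hello".reverse())        // olleh
println(len("one\ntwo".lines()))  // 2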
List methods (dot syntax)
| Method | Parameters | Returns | Description |
|---|---|---|---|
.map(fn) | fn: closure | list | Transform each element |
.filter(fn) | fn: closure | list | Keep elements where fn returns truthy |
.reduce(init, fn) | init: any, fn: closure | any | Fold with accumulator |
.find(fn) | fn: closure | any or nil | First element matching predicate |
.find_index(fn) | fn: closure | int | Index of first match (-1 if not found) |
.any(fn) | fn: closure | bool | True if any element matches |
.all(fn) / .every(fn) | fn: closure | bool | True if all elements match |
.none(fn?) | fn: closure | bool | True if no elements match (no arg: checks emptiness) |
.first(n?) | n: int (optional) | any or list | First element, or first n elements |
.last(n?) | n: int (optional) | any or list | Last element, or last n elements |
.partition(fn) | fn: closure | list | Split into [[truthy], [falsy]] |
.group_by(fn) | fn: closure | dict | Group into dict keyed by fn result |
.sort() / .sort_by(fn) | fn: closure (optional) | list | Sort (natural or by key function) |
.min() / .max() | none | any | Minimum/maximum value |
.min_by(fn) / .max_by(fn) | fn: closure | any | Min/max by key function |
.chunk(size) | size: int | list | Split into chunks of size |
.window(size) | size: int | list | Sliding windows of size |
.each_cons(size) | size: int | list | Sliding windows of size |
.compact() | none | list | Remove nil values |
.unique() | none | list | Remove duplicates |
.flatten() | none | list | Flatten one level of nesting |
.flat_map(fn) | fn: closure | list | Map then flatten |
.tally() | none | dict | Frequency count: {value: count} |
.zip(other) | other: list | list | Pair elements from two lists |
.enumerate() | none | list | List of {index, value} dicts |
.take(n) / .skip(n) | n: int | list | First/remaining n elements |
.sum() | none | int or float | Sum of numeric values |
.join(sep?) | sep: string | string | Join to string |
.reverse() | none | list | Reversed list |
.push(item) / .pop() | item: any | list | New list with item added/removed (immutable) |
.contains(item) | item: any | bool | Check if list contains item |
.index_of(item) | item: any | int | Index of item (-1 if not found) |
.slice(start, end?) | start: int, end: int | list | Slice with negative index support |
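A sketch of chained list methods, assuming the { x -> ... } closure form used in the pipeline examples earlier in this guide:
let nums = [3, 1, 4, 1, 5]
let doubled = nums.map({ n -> n * 2 })
println(doubled.sum())        // 28
println(nums.sort().first())  // 1
println(len(nums.unique()))   // 4
println(nums.contains(4))     // true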
Iterator methods
Eager list/dict/set/string methods listed above are unchanged — they
still return eager collections. Lazy iteration is opt-in via
.iter(), which lifts a list, dict, set, string, generator, or
channel into an Iter<T> value. Iterators are single-pass, fused,
and snapshot — they Rc-clone the backing collection, so mutating
the source after .iter() does not affect the iter.
On a dict, .iter() yields Pair(key, value) values (use .first /
.second, or destructure in a for-loop). String iteration yields
chars (Unicode scalar values).
Printing with log(it) renders <iter> or <iter (exhausted)> and
does not drain the iterator.
Lazy combinators (return a new Iter)
| Method | Parameters | Returns | Description |
|---|---|---|---|
.iter() | none | Iter<T> | Lift a source into an iter; no-op on an existing iter |
.map(fn) | fn: closure | Iter<U> | Lazily transform each item |
.filter(fn) | fn: closure | Iter<T> | Lazily keep items where fn returns truthy |
.flat_map(fn) | fn: closure | Iter<U> | Map then flatten, lazily |
.take(n) | n: int | Iter<T> | First n items |
.skip(n) | n: int | Iter<T> | Drop first n items |
.take_while(fn) | fn: closure | Iter<T> | Items until predicate first returns falsy |
.skip_while(fn) | fn: closure | Iter<T> | Drop items while predicate is truthy |
.zip(other) | other: iter | Iter<Pair<T, U>> | Pair items from two iters, stops at shorter |
.enumerate() | none | Iter<Pair<int, T>> | Pair each item with a 0-based index |
.chain(other) | other: iter | Iter<T> | Yield items from self, then from other |
.chunks(n) | n: int | Iter<list<T>> | Non-overlapping fixed-size chunks |
.windows(n) | n: int | Iter<list<T>> | Sliding windows of size n |
Sinks (drain the iter, return an eager value)
| Method | Parameters | Returns | Description |
|---|---|---|---|
.to_list() | none | list | Collect all items into a list |
.to_set() | none | set | Collect all items into a set |
.to_dict() | none | dict | Collect Pair(key, value) items into a dict |
.count() | none | int | Count remaining items |
.sum() | none | int or float | Sum of numeric items |
.min() / .max() | none | any | Min/max item |
.reduce(init, fn) | init: any, fn: closure | any | Fold with accumulator |
.first() / .last() | none | any or nil | First/last item |
.any(fn) | fn: closure | bool | True if any remaining item matches |
.all(fn) | fn: closure | bool | True if all remaining items match |
.find(fn) | fn: closure | any or nil | First item matching predicate |
.for_each(fn) | fn: closure | nil | Invoke fn on each remaining item |
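Putting it together, a lazy pipeline that filters, maps, and then drains with a sink (closure syntax assumed from earlier examples):
let total = [1, 2, 3, 4, 5, 6]
    .iter()
    .filter({ n -> n % 2 == 0 })
    .map({ n -> n * n })
    .sum()
println(total)  // 56 (4 + 16 + 36)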
Path functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
dirname(path) | path: string | string | Directory component of path |
basename(path) | path: string | string | File name component of path |
extname(path) | path: string | string | File extension including dot (e.g., .harn) |
path_join(parts...) | parts: strings | string | Join path components |
path_workspace_info(path, workspace_root?) | path: string, workspace_root?: string | dict | Classify a path as workspace_relative, host_absolute, or invalid, and project both workspace-relative and host-absolute forms when known |
path_workspace_normalize(path, workspace_root?) | path: string, workspace_root?: string | string or nil | Normalize a path into workspace-relative form when it is safely inside the workspace (including common leading-slash drift like /packages/...) |
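A quick sketch of the path helpers (path_join is assumed to use the platform separator; the output below shows the Unix form):
let p = path_join("src", "main.harn")
println(basename(p))  // main.harn
println(extname(p))   // .harn
println(dirname(p))   // src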
File I/O
| Function | Parameters | Returns | Description |
|---|---|---|---|
read_file(path) | path: string | string | Read entire file as UTF-8 string. Throws on failure. Deprecated in favor of read_file_result for new code; the throwing form remains supported. |
read_file_result(path) | path: string | Result<string, string> | Non-throwing read: returns Result.Ok(content) on success or Result.Err(message) on failure. Shares read_file’s content cache |
write_file(path, content) | path: string, content: string | nil | Write string to file. Throws on failure |
append_file(path, content) | path: string, content: string | nil | Append string to file, creating it if it doesn’t exist. Throws on failure |
copy_file(src, dst) | src: string, dst: string | nil | Copy a file. Throws on failure |
delete_file(path) | path: string | nil | Delete a file or directory (recursive). Throws on failure |
file_exists(path) | path: string | bool | Check if a file or directory exists |
list_dir(path?) | path: string (default ".") | list | List directory contents as sorted list of file names. Throws on failure |
mkdir(path) | path: string | nil | Create directory and all parent directories. Throws on failure |
stat(path) | path: string | dict | File metadata: {size, is_file, is_dir, readonly, modified}. Throws on failure |
temp_dir() | none | string | System temporary directory path |
render(path, bindings?) | path: string, bindings: dict | string | Read a template file relative to the current module’s asset root and render it. The template language supports {{ name }} interpolation (with nested paths and filters), {{ if }} / {{ elif }} / {{ else }} / {{ end }}, {{ for item in xs }} ... {{ end }} (with {{ loop.index }} etc.), {{ include "..." }} partials, {{# comments #}}, {{ raw }} ... {{ endraw }} verbatim blocks, and {{- -}} whitespace trim markers. See the Prompt templating reference for the full grammar and filter list. When called from an imported module, resolves relative to that module’s directory, not the entry pipeline. Without bindings, just reads the file |
render_prompt(path, bindings?) | path: string, bindings: dict | string | Prompt-oriented alias of render(...). Use this for .harn.prompt / .prompt assets when you want the asset to be surfaced explicitly in bundle manifests and preflight output |
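A minimal file I/O round-trip, using temp_dir() so the sketch does not touch the working directory:
let path = path_join(temp_dir(), "harn_demo.txt")
write_file(path, "hello")
println(read_file(path))  // hello
println(unwrap_or(read_file_result("no/such/file"), "fallback"))  // fallback
delete_file(path)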
Environment and system
| Function | Parameters | Returns | Description |
|---|---|---|---|
env(name) | name: string | string or nil | Read environment variable |
env_or(name, default) | name: string, default: any | string or default | Read environment variable, or return default when unset. One-line replacement for the common let v = env(K); if v { v } else { default } pattern |
timestamp() | none | float | Unix timestamp in seconds with sub-second precision |
elapsed() | none | int | Milliseconds since VM startup |
exec(cmd, args...) | cmd: string, args: strings | dict | Execute external command. Returns {stdout, stderr, status, success} |
exec_at(dir, cmd, args...) | dir: string, cmd: string, args: strings | dict | Execute external command inside a specific directory |
shell(cmd) | cmd: string | dict | Execute command via shell. Returns {stdout, stderr, status, success} |
shell_at(dir, cmd) | dir: string, cmd: string | dict | Execute shell command inside a specific directory |
exit(code) | code: int (default 0) | never | Terminate the process |
username() | none | string | Current OS username |
hostname() | none | string | Machine hostname |
platform() | none | string | OS name: "darwin", "linux", or "windows" |
arch() | none | string | CPU architecture (e.g., "aarch64", "x86_64") |
uuid() | none | string | Generate a random v4 UUID |
home_dir() | none | string | User’s home directory path |
pid() | none | int | Current process ID |
cwd() | none | string | Current working directory |
execution_root() | none | string | Directory used for source-relative execution helpers such as exec_at(...) / shell_at(...) |
asset_root() | none | string | Directory used for source-relative asset helpers such as render(...) / render_prompt(...) |
source_dir() | none | string | Directory of the currently-executing .harn file (falls back to cwd) |
project_root() | none | string or nil | Nearest ancestor directory containing harn.toml |
runtime_paths() | none | dict | Resolved runtime path model: {execution_root, asset_root, state_root, run_root, worktree_root} |
date_iso() | none | string | Current UTC time in ISO 8601 format (e.g., "2026-03-29T14:30:00.123Z") |
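A short sketch of the environment and process helpers (outputs depend on the machine):
let editor = env_or("EDITOR", "vi")
println(platform())  // e.g. "linux"
let result = exec("echo", "hi")
if result.success {
    println(trim(result.stdout))  // hi
}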
Regular expressions
| Function | Parameters | Returns | Description |
|---|---|---|---|
regex_match(pattern, text) | pattern: string, text: string | list or nil | Find all non-overlapping matches. Returns nil if no matches |
regex_replace(pattern, replacement, text) | pattern: string, replacement: string, text: string | string | Replace all matches. Throws on invalid regex |
regex_captures(pattern, text) | pattern: string, text: string | list | Find all matches with capture group details |
regex_captures
Returns a list of dicts, one per match. Each dict contains:
- match – the full matched string
- groups – a list of positional capture group values (from (...))
- Named capture groups (from (?P<name>...)) appear as additional keys
let results = regex_captures("(\\w+)@(\\w+)", "alice@example bob@test")
// [
// {match: "alice@example", groups: ["alice", "example"]},
// {match: "bob@test", groups: ["bob", "test"]}
// ]
Named capture groups are added as top-level keys on each result dict:
let named = regex_captures("(?P<user>\\w+):(?P<role>\\w+)", "alice:admin")
// [{match: "alice:admin", groups: ["alice", "admin"], user: "alice", role: "admin"}]
Returns an empty list if there are no matches. Throws on invalid regex.
Encoding
| Function | Parameters | Returns | Description |
|---|---|---|---|
base64_encode(string) | string: string | string | Base64 encode a string (standard alphabet with padding) |
base64_decode(string) | string: string | string | Base64 decode a string. Throws on invalid input |
url_encode(string) | string: string | string | URL percent-encode a string. Unreserved characters (alphanumeric, -, _, ., ~) pass through unchanged |
url_decode(string) | string: string | string | Decode a URL-encoded string. Decodes %XX sequences and + as space |
Example:
let encoded = base64_encode("Hello, World!")
println(encoded) // SGVsbG8sIFdvcmxkIQ==
println(base64_decode(encoded)) // Hello, World!
println(url_encode("hello world")) // hello%20world
println(url_decode("hello%20world")) // hello world
println(url_encode("a=1&b=2")) // a%3D1%26b%3D2
println(url_decode("hello+world")) // hello world
Hashing
| Function | Parameters | Returns | Description |
|---|---|---|---|
sha256(string) | string: string | string | SHA-256 hash, returned as a lowercase hex-encoded string |
md5(string) | string: string | string | MD5 hash, returned as a lowercase hex-encoded string |
Example:
println(sha256("hello")) // 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
println(md5("hello")) // 5d41402abc4b2a76b9719d911017c592
Date/Time
| Function | Parameters | Returns | Description |
|---|---|---|---|
date_now() | none | dict | Current UTC datetime as dict with year, month, day, hour, minute, second, weekday, and timestamp fields |
date_parse(str) | str: string | float | Parse a datetime string (e.g., "2024-01-15 10:30:00") into a Unix timestamp. Extracts numeric components from the string. Throws if fewer than 3 parts (year, month, day). Validates month (1-12), day (1-31), hour (0-23), minute (0-59), second (0-59) |
date_format(dt, format?) | dt: float, int, or dict; format: string (default "%Y-%m-%d %H:%M:%S") | string | Format a timestamp or date dict as a string. Supports %Y, %m, %d, %H, %M, %S placeholders. Throws for negative timestamps |
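A round-trip example based on the formats described above:
let ts = date_parse("2024-01-15 10:30:00")
println(date_format(ts))               // 2024-01-15 10:30:00
println(date_format(ts, "%Y/%m/%d"))   // 2024/01/15
let now = date_now()
println(now.year >= 2024)              // true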
Testing
| Function | Parameters | Returns | Description |
|---|---|---|---|
assert(condition, msg?) | condition: any, msg: string (optional) | nil | Assert value is truthy. Throws with message on failure |
assert_eq(a, b, msg?) | a: any, b: any, msg: string (optional) | nil | Assert two values are equal. Throws with message on failure |
assert_ne(a, b, msg?) | a: any, b: any, msg: string (optional) | nil | Assert two values are not equal. Throws with message on failure |
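Example:
assert(1 < 2, "comparison should hold")
assert_eq(to_int("7"), 7)
assert_ne(uppercase("a"), "a")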
HTTP
| Function | Parameters | Returns | Description |
|---|---|---|---|
http_get(url, options?) | url: string, options: dict | dict | GET request |
http_post(url, body, options?) | url: string, body: string, options: dict | dict | POST request |
http_put(url, body, options?) | url: string, body: string, options: dict | dict | PUT request |
http_patch(url, body, options?) | url: string, body: string, options: dict | dict | PATCH request |
http_delete(url, options?) | url: string, options: dict | dict | DELETE request |
http_request(method, url, options?) | method: string, url: string, options: dict | dict | Generic HTTP request |
All HTTP functions return {status: int, headers: dict, body: string, ok: bool}.
Options: timeout (ms), retries, backoff (ms), headers (dict),
auth (string or {bearer: "token"} or {basic: {user, password}}),
follow_redirects (bool), max_redirects (int), body (string).
Throws on network errors.
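A sketch of a request with options (the URL and token here are illustrative placeholders):
let resp = http_get("https://api.example.com/items", {
    timeout: 5000,
    retries: 2,
    headers: {accept: "application/json"},
    auth: {bearer: "my-token"}
})
if resp.ok {
    println(resp.status)  // 200
}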
Mock HTTP
For testing pipelines that make HTTP calls without hitting real servers.
| Function | Parameters | Returns | Description |
|---|---|---|---|
http_mock(method, url_pattern, response) | method: string, url_pattern: string, response: dict | nil | Register a mock. Use * in url_pattern for glob matching (supports multiple * wildcards, e.g., https://api.example.com/*/items/*) |
http_mock_clear() | none | nil | Clear all mocks and recorded calls |
http_mock_calls() | none | list | Return list of {method, url, body} for all intercepted calls |
http_mock("GET", "https://api.example.com/users", {
status: 200,
body: "{\"users\": [\"alice\"]}",
headers: {}
})
let resp = http_get("https://api.example.com/users")
assert_eq(resp.status, 200)
Interactive input
| Function | Parameters | Returns | Description |
|---|---|---|---|
prompt_user(msg?) | msg: string (optional) | string | Display message, read line from stdin |
Host interop
| Function | Parameters | Returns | Description |
|---|---|---|---|
host_call(name, args) | name: string, args: any | any | Call a host capability operation using capability.operation naming |
host_capabilities() | — | dict | Typed host capability manifest |
host_has(capability, op?) | capability: string, op: string | bool | Check whether a typed host capability/operation exists |
host_tool_list() | — | list | List host-exposed bridge tools as {name, description, schema, deprecated} |
host_tool_call(name, args) | name: string, args: any | any | Invoke a bridge-exposed host tool by name using the existing builtin_call path |
host_mock(capability, op, response_or_config, params?) | capability: string, op: string, response_or_config: any or dict, params: dict | nil | Register a runtime mock for a typed host operation |
host_mock_clear() | — | nil | Clear registered typed host mocks and recorded mock invocations |
host_mock_calls() | — | list | Return recorded typed host mock invocations |
host_capabilities() returns the capability manifest surfaced by the active
host bridge. The local runtime exposes generic process, template, and
interaction capabilities. Product hosts can add capabilities such as
workspace, project, runtime, editor, git, or diagnostics.
Prefer host_call("capability.operation", args) in shared wrappers and
host-owned .harn modules so capability names stay consistent across the
runtime, host manifest, and preflight validation.
host_tool_list() is the discovery surface for host-native tools such as
Read, Edit, Bash, or IDE actions exposed by the active bridge host.
Without a bridge it returns []. host_tool_call(name, args) uses that same
bridge host’s existing dynamic builtin dispatch path, so scripts can discover a
tool at runtime and then call it by name without hard-coding it into the
initial prompt. Import std/host when you want small helpers such as
host_tool_lookup(name) or host_tool_available(name).
host_mock(...) is intended for tests and local conformance runs. The third
argument may be either a direct result value or a config dict containing
result, params, and/or error. Mock matching is last-write-wins and only
requires the declared params subset to match the actual host call.
Matched calls are recorded in host_mock_calls() as
{capability, operation, params} dictionaries.
For higher-level test helpers, import std/testing:
import {
assert_host_called,
clear_host_mocks,
mock_host_error,
mock_host_result,
} from "std/testing"
clear_host_mocks()
mock_host_result("project", "metadata_get", "hello", {dir: ".", namespace: "facts"})
assert_eq(host_call("project.metadata_get", {dir: ".", namespace: "facts"}), "hello")
assert_host_called("project", "metadata_get", {dir: ".", namespace: "facts"}, nil)
mock_host_error("project", "scan", "scan failed", nil)
let result = try { host_call("project.scan", {}) }
assert(is_err(result))
Async and timing
| Function | Parameters | Returns | Description |
|---|---|---|---|
sleep(duration) | duration: int (ms) or duration literal | nil | Pause execution |
Concurrency primitives
Channels
| Function | Parameters | Returns | Description |
|---|---|---|---|
channel(name?) | name: string (default "default") | dict | Create a channel with name, type, and messages fields |
send(ch, value) | ch: dict, value: any | nil | Send a value to a channel |
receive(ch) | ch: dict | any | Receive a value from a channel (blocks until data available) |
close_channel(ch) | ch: channel | nil | Close a channel, preventing further sends |
try_receive(ch) | ch: channel | any or nil | Non-blocking receive. Returns nil if no data available |
select(ch1, ch2, ...) | channels: channel | dict or nil | Wait for data on any channel. Returns {index, value, channel} for the first ready channel, or nil if all closed |
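Example (a sketch assuming FIFO delivery; in a real pipeline the sender usually runs in a spawned task):

```harn
let ch = channel("results")
send(ch, "first")
send(ch, "second")
// receive blocks until data is available; here the buffer already has items.
println(receive(ch))
// try_receive never blocks and returns nil when the channel is empty.
println(try_receive(ch))
close_channel(ch)
```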
Atomics
| Function | Parameters | Returns | Description |
|---|---|---|---|
atomic(initial?) | initial: any (default 0) | dict | Create an atomic value |
atomic_get(a) | a: dict | any | Read the current value |
atomic_set(a, value) | a: dict, value: any | int | Set value, returns previous value |
atomic_add(a, delta) | a: dict, delta: int | int | Add delta, returns previous value |
atomic_cas(a, expected, new) | a: dict, expected: int, new: int | bool | Compare-and-swap. Returns true if the swap succeeded |
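Example (a shared counter plus a compare-and-swap guard):

```harn
let counter = atomic(0)
atomic_add(counter, 1)
atomic_add(counter, 1)
println(atomic_get(counter)) // 2

// Compare-and-swap succeeds only when the current value equals expected.
if atomic_cas(counter, 2, 10) {
    println("swapped to 10")
}
```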
Persistent store
| Function | Parameters | Returns | Description |
|---|---|---|---|
store_get(key) | key: string | any | Retrieve value from store, nil if missing |
store_set(key, value) | key: string, value: any | nil | Store value, auto-saves to .harn/store.json |
store_delete(key) | key: string | nil | Remove key from store |
store_list() | none | list | List all keys (sorted) |
store_save() | none | nil | Explicitly flush store to disk |
store_clear() | none | nil | Remove all keys from store |
The store is backed by .harn/store.json relative to the script’s
directory. The file is created lazily on first store_set. In bridge mode,
the host can override these builtins.
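Example:

```harn
store_set("greeting", "hello")   // auto-saves to .harn/store.json
println(store_get("greeting"))   // "hello"
println(store_get("missing"))    // nil when the key is absent
println(store_list())            // sorted list of keys
store_delete("greeting")
```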
LLM
See LLM calls and agent loops for full documentation.
| Function | Parameters | Returns | Description |
|---|---|---|---|
llm_call(prompt, system?, options?) | prompt: string, system: string, options: dict | dict | Single LLM request. Returns {text, model, input_tokens, output_tokens}. Throws on transport / rate-limit / schema-validation failures |
llm_call_safe(prompt, system?, options?) | prompt: string, system: string, options: dict | dict | Non-throwing envelope around llm_call. Returns {ok: bool, response: dict or nil, error: {category, message} or nil}. error.category is one of ErrorCategory’s canonical strings ("rate_limit", "timeout", "overloaded", "server_error", "transient_network", "schema_validation", "auth", "not_found", "circuit_open", "tool_error", "tool_rejected", "cancelled", "generic") |
with_rate_limit(provider, fn, options?) | provider: string, fn: closure, options: dict | whatever fn returns | Acquire a permit from the provider’s sliding-window rate limiter, invoke fn, and retry with exponential backoff on retryable errors (rate_limit, overloaded, transient_network, timeout). Options: max_retries (default 5), backoff_ms (default 1000, capped at 30s after doubling) |
llm_completion(prefix, suffix?, system?, options?) | prefix: string, suffix: string, system: string, options: dict | dict | Text completion / fill-in-the-middle request. Returns {text, model, input_tokens, output_tokens} |
agent_loop(prompt, system?, options?) | prompt: string, system: string, options: dict | dict | Multi-turn agent loop with ##DONE## sentinel, daemon/idling support, and optional per-turn context filtering. Returns {status, text, iterations, duration_ms, tools_used} |
daemon_spawn(config) | config: dict | dict | Start a daemon-mode agent and return a daemon handle with persisted state + queue metadata |
daemon_trigger(handle, event) | handle: dict or string, event: any | dict | Enqueue a durable FIFO trigger event for a running daemon; throws VmError::DaemonQueueFull on overflow |
daemon_snapshot(handle) | handle: dict or string | dict | Return the latest daemon snapshot plus live queue state (pending_events, inflight_event, counts, capacity) |
daemon_stop(handle) | handle: dict or string | dict | Stop a daemon and preserve queued trigger state for resume |
daemon_resume(path) | path: string | dict | Resume a daemon from its persisted state directory |
trigger_list() | — | list | Return the live trigger registry snapshot as list<TriggerBinding> |
trigger_register(config) | config: dict | dict | Dynamically register a trigger and return its TriggerHandle |
trigger_fire(handle, event) | handle: dict or string, event: dict | dict | Fire a synthetic event into a trigger and return a DispatchHandle; execution routes through the trigger dispatcher |
trigger_replay(event_id) | event_id: string | dict | Fetch a historical event from triggers.events, re-dispatch it through the trigger dispatcher, and thread replay_of_event_id through the returned DispatchHandle |
trigger_inspect_dlq() | — | list | Return the current DLQ snapshot as list<DlqEntry> with retry history |
trigger_test_harness(fixture) | fixture: string or {fixture: string} | dict | Run a named trigger-system harness fixture and return a structured report. Intended for Rust/unit/conformance coverage of cron, webhook, retry, DLQ, dedupe, rate-limit, cost-guard, recovery, and dead-man-switch scenarios |
llm_info() | — | dict | Current LLM config: {provider, model, api_key_set} |
llm_usage() | — | dict | Cumulative usage: {input_tokens, output_tokens, total_duration_ms, call_count, total_calls} |
llm_resolve_model(alias) | alias: string | dict | Resolve model alias to {id, provider} via providers.toml |
llm_pick_model(target, options?) | target: string, options: dict | dict | Resolve a model alias or tier to {id, provider, tier} |
llm_infer_provider(model_id) | model_id: string | string | Infer provider from model ID (e.g. "claude-*" → "anthropic") |
llm_model_tier(model_id) | model_id: string | string | Get capability tier: "small", "mid", or "frontier" |
llm_healthcheck(provider?) | provider: string | dict | Validate API key. Returns {valid, message, metadata} |
llm_rate_limit(provider, options?) | provider: string, options: dict | int/nil/bool | Set ({rpm: N}), query, or clear ({rpm: 0}) per-provider rate limit |
llm_providers() | — | list | List all configured provider names |
llm_config(provider?) | provider: string | dict | Get provider config (base_url, auth_style, etc.) |
llm_cost(model, input_tokens, output_tokens) | model: string, input_tokens: int, output_tokens: int | float | Estimate USD cost from embedded pricing table |
llm_session_cost() | — | dict | Session totals: {total_cost, input_tokens, output_tokens, call_count} |
llm_budget(max_cost) | max_cost: float | nil | Set session budget in USD. LLM calls throw if exceeded |
llm_budget_remaining() | — | float or nil | Remaining budget (nil if no budget set) |
llm_mock(response) | response: dict | nil | Queue a mock LLM response. Dict supports text, tool_calls, match (glob), consume_match (consume a matched pattern instead of reusing it), input_tokens, output_tokens, thinking, stop_reason, model, error: {category, message} (short-circuits the call and surfaces as VmError::CategorizedError — useful for testing llm_call_safe envelopes and with_rate_limit retry loops) |
llm_mock_calls() | — | list | Return list of {messages, system, tools} for all calls made to the mock provider |
llm_mock_clear() | — | nil | Clear all queued mock responses and recorded calls |
FIFO mocks (no match field) are consumed in order. Pattern-matched mocks
(with match) are checked in declaration order against the request transcript
text using glob patterns. They persist by default; add consume_match: true
to advance through matching fixtures step by step. When no mocks match, the
default deterministic mock behavior is used.
See Trigger stdlib for the typed std/triggers aliases,
DLQ entry shapes, and the current shallow-path replay / manual-fire caveats.
// Queue specific responses for the mock provider
llm_mock({text: "The answer is 42."})
llm_mock({
text: "Let me check that.",
tool_calls: [{name: "read_file", arguments: {path: "main.rs"}}],
})
let r = llm_call("question", nil, {provider: "mock"})
assert_eq(r.text, "The answer is 42.")
// Pattern-matched mocks (reusable, not consumed)
llm_mock({text: "Hello!", match: "*greeting*"})
llm_mock({text: "step 1", match: "*planner*", consume_match: true})
llm_mock({text: "step 2", match: "*planner*", consume_match: true})
// Error injection for testing resilient code paths. The mock
// surfaces as a real `VmError::CategorizedError`, so `error_category`,
// `try { ... } catch`, `llm_call_safe`, and `with_rate_limit` all see
// it the same way they would a live provider failure.
llm_mock({error: {category: "rate_limit", message: "429 Too Many Requests"}})
// Inspect what was sent
let calls = llm_mock_calls()
llm_mock_clear()
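The error-injection mock composes with with_rate_limit for testing retry loops. A sketch (the zero-argument closure syntax here is illustrative):

```harn
// First call fails with a retryable category, second succeeds.
llm_mock({error: {category: "rate_limit", message: "429 Too Many Requests"}})
llm_mock({text: "recovered"})
let resp = with_rate_limit("mock", {
    -> llm_call("question", nil, {provider: "mock"})
}, {max_retries: 2, backoff_ms: 10})
assert_eq(resp.text, "recovered")
llm_mock_clear()
```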
Transcript helpers
| Function | Parameters | Returns | Description |
|---|---|---|---|
transcript(metadata?) | metadata: dict | dict | Create a new transcript |
transcript_from_messages(messages_or_transcript) | list or dict | dict | Normalize a message list into a transcript |
transcript_messages(transcript) | transcript: dict | list | Get transcript messages |
transcript_summary(transcript) | transcript: dict | string or nil | Get transcript summary |
transcript_id(transcript) | transcript: dict | string | Get transcript id |
transcript_export(transcript) | transcript: dict | string | Export transcript as JSON |
transcript_import(json_text) | json_text: string | dict | Import transcript JSON |
transcript_fork(transcript, options?) | transcript: dict, options: dict | dict | Fork transcript, optionally dropping messages or summary |
transcript_summarize(transcript, options?) | transcript: dict, options: dict | dict | Summarize and compact a transcript via llm_call |
transcript_compact(transcript, options?) | transcript: dict, options: dict | dict | Compact a transcript with the runtime compaction engine, preserving durable artifacts and compaction events |
transcript_auto_compact(messages, options?) | messages: list, options: dict | list | Apply the agent-loop compaction pipeline to a message list using llm, truncate, or custom strategy |
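Example (the {role, content} message shape mirrors the prompt-handler convention; exact transcript fields may vary):

```harn
let t = transcript_from_messages([
    {role: "user", content: "hi"},
    {role: "assistant", content: "hello"},
])
println(transcript_id(t))

// Round-trip through JSON.
let json = transcript_export(t)
let restored = transcript_import(json)
println(transcript_messages(restored))

// Fork without mutating the original.
let branch = transcript_fork(t, {})
```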
Provider configuration
LLM provider endpoints, model aliases, inference rules, and default parameters are configured via a TOML file. The VM searches for config in this order:
- Built-in defaults (Anthropic, OpenAI, OpenRouter, HuggingFace, Ollama, Local)
- HARN_PROVIDERS_CONFIG if set, otherwise ~/.config/harn/providers.toml
- Installed package [llm] tables in .harn/packages/*/harn.toml
- The nearest project harn.toml [llm] table
The [llm] section uses the same schema as providers.toml, so project and
package manifests can ship provider adapters declaratively:
[llm.providers.anthropic]
base_url = "https://api.anthropic.com/v1"
auth_style = "header"
auth_header = "x-api-key"
auth_env = "ANTHROPIC_API_KEY"
chat_endpoint = "/messages"
[llm.providers.local]
base_url = "http://localhost:8000"
base_url_env = "LOCAL_LLM_BASE_URL"
auth_style = "none"
chat_endpoint = "/v1/chat/completions"
completion_endpoint = "/v1/completions"
[llm.aliases]
sonnet = { id = "claude-sonnet-4-20250514", provider = "anthropic" }
[[llm.inference_rules]]
pattern = "claude-*"
provider = "anthropic"
[[llm.tier_rules]]
pattern = "claude-*"
tier = "frontier"
[llm.model_defaults."qwen/*"]
temperature = 0.3
Timers
| Function | Parameters | Returns | Description |
|---|---|---|---|
timer_start(name?) | name: string | dict | Start a named timer |
timer_end(timer) | timer: dict | int | Stop timer, prints elapsed, returns milliseconds |
elapsed() | — | int | Milliseconds since process start |
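Example:

```harn
let t = timer_start("fetch")
// ... do work ...
let ms = timer_end(t)   // prints elapsed, returns milliseconds
println("fetch took ${ms}ms; process uptime ${elapsed()}ms")
```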
Circuit breakers
Protect against cascading failures by tracking error counts and opening a circuit when a threshold is reached.
| Function | Parameters | Returns | Description |
|---|---|---|---|
circuit_breaker(name, threshold?, reset_ms?) | name: string, threshold: int (default 5), reset_ms: int (default 30000) | string | Create a named circuit breaker. Returns the name |
circuit_check(name) | name: string | string | Check state: "closed", "open", or "half_open" (after reset period) |
circuit_record_failure(name) | name: string | bool | Record a failure. Returns true if the circuit just opened |
circuit_record_success(name) | name: string | nil | Record a success, resetting failure count and closing the circuit |
circuit_reset(name) | name: string | nil | Manually reset the circuit to closed |
Example:
circuit_breaker("api", 3, 10000)
for i in 0 to 5 exclusive {
if circuit_check("api") == "open" {
println("circuit open, skipping call")
} else {
try {
let resp = http_get("https://api.example.com/data")
circuit_record_success("api")
} catch e {
circuit_record_failure("api")
}
}
}
Tracing
Distributed tracing primitives for instrumenting pipeline execution.
| Function | Parameters | Returns | Description |
|---|---|---|---|
trace_start(name) | name: string | dict | Start a trace span. Returns a span dict with trace_id, span_id, name, start_ms |
trace_end(span) | span: dict | nil | End a span and emit a structured log line with duration |
trace_id() | none | string or nil | Current trace ID from the span stack, or nil if no active span |
enable_tracing(enabled?) | enabled: bool (default true) | nil | Enable or disable pipeline-level tracing |
trace_spans() | none | list | Peek at recorded trace spans |
trace_summary() | none | string | Formatted summary of trace spans |
Example:
let span = trace_start("fetch_data")
// ... do work ...
trace_end(span)
println(trace_summary())
Agent trace events
Fine-grained agent loop trace events for observability and debugging.
Events are collected during agent_loop execution and can be inspected
after the loop completes.
| Function | Parameters | Returns | Description |
|---|---|---|---|
agent_trace() | none | list | Peek at collected agent trace events. Each event is a dict with a type field (llm_call, tool_execution, tool_rejected, loop_intervention, context_compaction, phase_change, loop_complete) and type-specific fields |
agent_trace_summary() | none | dict | Rolled-up summary of agent trace events with aggregated token counts, durations, tool usage, and iteration counts |
Example:
let result = agent_loop("summarize this file", tools: [read_file])
let summary = agent_trace_summary()
println("LLM calls: " + str(summary.llm_calls))
println("Tools used: " + str(summary.tools_used))
Error classification
Structured error throwing and classification for retry logic and error handling.
| Function | Parameters | Returns | Description |
|---|---|---|---|
throw_error(message, category?) | message: string, category: string | never | Throw a categorized error. The error is a dict with message and category fields |
error_category(err) | err: any | string | Extract category from a caught error. Returns "timeout", "auth", "rate_limit", "tool_error", "cancelled", "not_found", "circuit_open", or "generic" |
is_timeout(err) | err: any | bool | Check if error is a timeout |
is_rate_limited(err) | err: any | bool | Check if error is a rate limit |
Example:
try {
throw_error("request timed out", "timeout")
} catch e {
if is_timeout(e) {
println("will retry after backoff")
}
println(error_category(e)) // "timeout"
}
Tool registry (low-level)
Low-level tool management functions for building and inspecting tool
registries programmatically. For MCP serving, see the tool_define /
mcp_tools API above.
| Function | Parameters | Returns | Description |
|---|---|---|---|
tool_remove(registry, name) | registry, name: string | dict | Remove a tool by name |
tool_list(registry) | registry: dict | list | List tools as [{name, description, parameters}] |
tool_find(registry, name) | registry, name: string | dict or nil | Find a tool entry by name |
tool_select(registry, names) | registry: dict, names: list | dict | Return a registry containing only the named tools |
tool_count(registry) | registry: dict | int | Number of tools in the registry |
tool_describe(registry) | registry: dict | string | Human-readable summary of all tools |
tool_schema(registry, components?) | registry, components: dict | dict | Generate JSON Schema for all tools |
tool_prompt(registry) | registry: dict | string | Generate an LLM system prompt describing available tools |
tool_parse_call(text) | text: string | list | Parse <tool_call>...</tool_call> XML from LLM output |
tool_format_result(name, result) | name, result: string | string | Format a <tool_result> XML envelope |
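Example (building on tool_registry / tool_define from the MCP server section):

```harn
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone", {
    parameters: { name: {type: "string"} },
    handler: { args -> "Hello, ${args.name}!" }
})

println(tool_count(tools))           // 1
println(tool_find(tools, "greet"))   // the tool entry, or nil
let subset = tool_select(tools, ["greet"])
println(tool_describe(subset))       // human-readable summary
println(tool_prompt(subset))         // system prompt for an LLM
```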
Structured logging
| Function | Parameters | Returns | Description |
|---|---|---|---|
log_json(key, value) | key: string, value: any | nil | Emit a JSON log line with timestamp |
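Example:

```harn
// Emits a timestamped JSON log line keyed by "deploy".
log_json("deploy", {service: "api", status: "ok"})
```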
Metadata
Project metadata store backed by host-managed sharded JSON files.
Supports hierarchical namespace resolution (child directories inherit
from parents). The default filesystem backend persists namespace shards
under .harn/metadata/<namespace>/entries.json and still reads the legacy
monolithic root.json shard.
| Function | Parameters | Returns | Description |
|---|---|---|---|
metadata_get(dir, namespace?) | dir: string, namespace: string | dict or nil | Read metadata with inheritance |
metadata_resolve(dir, namespace?) | dir: string, namespace: string | dict or nil | Read resolved metadata while preserving namespaces |
metadata_entries(namespace?) | namespace: string | list | List stored directories with local and resolved metadata |
metadata_set(dir, namespace, data) | dir: string, namespace: string, data: dict | nil | Write metadata for directory/namespace |
metadata_save() | — | nil | Flush metadata to disk |
metadata_stale(project) | project: string | dict | Check staleness: {any_stale, tier1, tier2} |
metadata_status(namespace?) | namespace: string | dict | Summarize directory counts, namespaces, missing hashes, and stale state |
metadata_refresh_hashes() | — | nil | Recompute content hashes |
compute_content_hash(dir) | dir: string | string | Hash of directory contents |
invalidate_facts(dir) | dir: string | nil | Mark cached facts as stale |
scan_directory(path?, pattern_or_options?, options?) | path: string, pattern: string or options: dict | list | Enumerate files and directories with optional pattern, max_depth, include_hidden, include_dirs, include_files |
MCP (Model Context Protocol)
Connect to external tool servers using the Model Context Protocol. Harn supports stdio transport (spawns a child process) and HTTP transport for remote MCP servers.
| Function | Parameters | Returns | Description |
|---|---|---|---|
mcp_connect(command, args?) | command: string, args: list | mcp_client | Spawn an MCP server and perform the initialize handshake |
mcp_list_tools(client) | client: mcp_client | list | List available tools from the server |
mcp_call(client, name, arguments?) | client: mcp_client, name: string, arguments: dict | string or list | Call a tool and return the result |
mcp_list_resources(client) | client: mcp_client | list | List available resources from the server |
mcp_list_resource_templates(client) | client: mcp_client | list | List resource templates (URI templates) from the server |
mcp_read_resource(client, uri) | client: mcp_client, uri: string | string or list | Read a resource by URI |
mcp_list_prompts(client) | client: mcp_client | list | List available prompts from the server |
mcp_get_prompt(client, name, arguments?) | client: mcp_client, name: string, arguments: dict | dict | Get a prompt with optional arguments |
mcp_server_info(client) | client: mcp_client | dict | Get connection info (name, connected) |
mcp_disconnect(client) | client: mcp_client | nil | Kill the server process and release resources |
Example:
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let tools = mcp_list_tools(client)
println(tools)
let result = mcp_call(client, "read_file", {"path": "/tmp/hello.txt"})
println(result)
mcp_disconnect(client)
Notes:
mcp_callreturns a string when the tool produces a single text block, a list of content dicts for multi-block results, or nil when empty.- If the tool reports
isError: true,mcp_callthrows the error text. mcp_connectthrows if the command cannot be spawned or the initialize handshake fails.
Auto-connecting MCP servers via harn.toml
Instead of calling mcp_connect manually, you can declare MCP servers in
harn.toml. They will be connected automatically before the pipeline executes
and made available through the global mcp dict.
Add a [[mcp]] entry for each server:
[[mcp]]
name = "filesystem"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
[[mcp]]
name = "github"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
Each entry requires:
| Field | Type | Description |
|---|---|---|
name | string | Identifier used to access the client (e.g., mcp.filesystem) |
command | string | Executable to spawn for stdio transports |
args | list of strings | Command-line arguments for stdio transports (default: empty) |
transport | string | stdio (default) or http |
url | string | Remote MCP server URL for HTTP transports |
auth_token | string | Optional explicit bearer token for HTTP transports |
client_id | string | Optional pre-registered OAuth client ID for HTTP transports |
client_secret | string | Optional pre-registered OAuth client secret |
scopes | string | Optional OAuth scope string for login/consent |
protocol_version | string | Optional MCP protocol version override |
The connected clients are available as properties on the mcp global dict:
pipeline default() {
let tools = mcp_list_tools(mcp.filesystem)
println(tools)
let result = mcp_call(mcp.github, "list_issues", {repo: "harn"})
println(result)
}
If a server fails to connect, a warning is printed to stderr and that
server is omitted from the mcp dict. Other servers still connect
normally. The mcp global is only defined when at least one server
connects successfully.
For HTTP MCP servers, use the CLI to establish OAuth once and let Harn reuse the stored token automatically:
harn mcp redirect-uri
harn mcp login notion
MCP server mode
Harn pipelines can expose tools, resources, resource templates, and prompts
as an MCP server using harn mcp-serve. The CLI serves them over stdio
using the MCP protocol, making them callable by Claude Desktop, Cursor,
or any MCP client.
Declarative syntax (preferred):
tool greet(name: string) -> string {
description "Greet someone by name"
"Hello, " + name + "!"
}
The tool keyword declares a tool with typed parameters, an optional
description, and a body. Parameter types map to JSON Schema
(string -> "string", int -> "integer", float -> "number",
bool -> "boolean"). Parameters with default values are emitted as
optional schema fields (required: false) and carry their default
value into the generated tool registry entry. Each tool declaration produces its own
tool registry dict.
Programmatic API:
| Function | Parameters | Returns | Description |
|---|---|---|---|
tool_registry() | — | dict | Create an empty tool registry |
tool_define(registry, name, desc, config) | registry, name, desc: string, config: dict | dict | Add a tool (config: {parameters, handler, returns?, annotations?, ...}) |
mcp_tools(registry) | registry: dict | nil | Register tools for MCP serving |
mcp_resource(config) | config: dict | nil | Register a static resource ({uri, name, text, description?, mime_type?}) |
mcp_resource_template(config) | config: dict | nil | Register a resource template ({uri_template, name, handler, description?, mime_type?}) |
mcp_prompt(config) | config: dict | nil | Register a prompt ({name, handler, description?, arguments?}) |
Tool annotations (MCP spec annotations field) can be passed in the
tool_define config to describe tool behavior:
tools = tool_define(tools, "search", "Search files", {
parameters: { query: {type: "string"} },
returns: {type: "string"},
handler: { args -> "results for ${args.query}" },
annotations: {
title: "File Search",
readOnlyHint: true,
destructiveHint: false
}
})
Unknown tool_define config keys are preserved on the tool entry. Workflow
graphs use this to carry runtime policy metadata directly on a tool registry,
for example:
tools = tool_define(tools, "read", "Read files", {
parameters: { path: {type: "string"} },
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {workspace: ["read_text"]},
side_effect_level: "read_only",
path_params: ["path"],
mutation_classification: "read_only"
}
})
When a workflow node uses that registry, Harn intersects the declared tool policy with the graph, node, and host ceilings during validation and at execution time.
Declarative tool approval
agent_loop, workflow_execute, and workflow stage nodes accept an
approval_policy option that declaratively gates tool calls:
agent_loop("task", "system", {
approval_policy: {
auto_approve: ["read*", "list_*"],
auto_deny: ["shell*"],
require_approval: ["edit_*", "write_*"],
write_path_allowlist: ["/workspace/**"]
}
})
Evaluation order: auto_deny → write_path_allowlist → auto_approve →
require_approval. Tools that match no pattern default to AutoApproved.
require_approval calls the host via the canonical ACP
session/request_permission request and fails closed if the host
does not implement it. Policies compose
across nested scopes with most-restrictive intersection: auto-deny and
require-approval take the union, while auto_approve and
write_path_allowlist take the intersection.
Example (agent.harn):
pipeline main(task) {
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone", {
parameters: { name: {type: "string"} },
returns: {type: "string"},
handler: { args -> "Hello, ${args.name}!" }
})
mcp_tools(tools)
mcp_resource({
uri: "docs://readme",
name: "README",
text: "# My Agent\nA demo MCP server."
})
mcp_resource_template({
uri_template: "config://{key}",
name: "Config Values",
handler: { args -> "value for ${args.key}" }
})
mcp_prompt({
name: "review",
description: "Code review prompt",
arguments: [{ name: "code", required: true }],
handler: { args -> "Please review:\n${args.code}" }
})
}
Run as an MCP server:
harn mcp-serve agent.harn
Configure in Claude Desktop (claude_desktop_config.json):
{
"mcpServers": {
"my-agent": {
"command": "harn",
"args": ["mcp-serve", "agent.harn"]
}
}
}
Notes:
- mcp_tools(registry) (or the alias mcp_serve) must be called to register tools.
- Resources, resource templates, and prompts are registered individually.
- All print/println output goes to stderr (stdout is the MCP transport).
- The server supports the 2025-11-25 MCP protocol version over stdio.
- Tool handlers receive arguments as a dict and should return a string result.
- Prompt handlers receive arguments as a dict and return a string (single user message) or a list of {role, content} dicts.
- Resource template handlers receive URI template variables as a dict and return the resource text.
Workflow and orchestration builtins
These builtins expose Harn’s typed orchestration runtime.
Workflow graph and planning
| Function | Parameters | Returns | Description |
|---|---|---|---|
workflow_graph(config) | config: dict | workflow graph | Normalize a workflow definition into the typed workflow IR |
workflow_validate(graph, ceiling?) | graph: workflow, ceiling: dict (optional) | dict | Validate graph structure and capability ceilings |
workflow_inspect(graph, ceiling?) | graph: workflow, ceiling: dict (optional) | dict | Return graph plus validation summary |
workflow_clone(graph) | graph: workflow | workflow graph | Clone a workflow and append an audit entry |
workflow_insert_node(graph, node, edge?) | graph, node, edge | workflow graph | Insert a node and optional edge |
workflow_replace_node(graph, node_id, node) | graph, node_id, node | workflow graph | Replace a node definition |
workflow_rewire(graph, from, to, branch?) | graph, from, to, branch | workflow graph | Rewire an edge |
workflow_set_model_policy(graph, node_id, policy) | graph, node_id, policy | workflow graph | Set per-node model policy |
workflow_set_context_policy(graph, node_id, policy) | graph, node_id, policy | workflow graph | Set per-node context policy |
workflow_set_auto_compact(graph, node_id, policy) | graph, node_id, policy | workflow graph | Set per-node auto-compaction policy |
workflow_set_output_visibility(graph, node_id, visibility) | graph, node_id, visibility | workflow graph | Set per-node output-visibility filter ("public"/"public_only"/nil) |
workflow_policy_report(graph, ceiling?) | graph, ceiling: dict (optional) | dict | Inspect workflow/node policies against an explicit or builtin ceiling |
workflow_diff(left, right) | left, right | dict | Compare two workflow graphs |
workflow_commit(graph, reason?) | graph, reason | workflow graph | Validate and append a commit audit entry |
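A sketch of the build/validate/commit flow. The node and edge shapes below are assumptions for illustration only; the real field names are defined by the workflow IR (see Workflow runtime).

```harn
// Hypothetical minimal graph config; the nodes/edges field names are assumed.
let graph = workflow_graph({
    nodes: [
        {id: "plan", kind: "stage"},
        {id: "apply", kind: "stage"},
    ],
    edges: [{from: "plan", to: "apply"}],
})
let report = workflow_validate(graph)
println(report)

// Mutations return a new graph and append audit entries.
let committed = workflow_commit(workflow_clone(graph), "initial graph")
```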
Workflow execution and run records
| Function | Parameters | Returns | Description |
|---|---|---|---|
workflow_execute(task, graph, artifacts?, options?) | task, graph, artifacts, options | dict | Execute a workflow and persist a run record |
run_record(payload) | payload: dict | run record | Normalize a run record |
run_record_save(run, path?) | run, path | dict | Persist a run record |
run_record_load(path) | path: string | run record | Load a run record from disk |
load_run_tree(path) | path: string | dict | Load a persisted run with delegated child-run lineage |
run_record_fixture(run) | run | replay fixture | Derive a replay/eval fixture from a saved run |
run_record_eval(run, fixture?) | run, fixture | dict | Evaluate a run against an embedded or explicit fixture |
run_record_eval_suite(cases) | cases: list | dict | Evaluate a list of {run, fixture?, path?} cases as a regression suite |
run_record_diff(left, right) | left, right | dict | Compare two run records and summarize stage/status deltas |
eval_suite_manifest(payload) | payload: dict | dict | Normalize a grouped eval suite manifest |
eval_suite_run(manifest) | manifest: dict | dict | Evaluate a manifest of saved runs, fixtures, and optional baselines |
eval_metric(name, value, metadata?) | name: string, value: any, metadata: dict | nil | Record a named metric into the eval metric store |
eval_metrics() | — | list | Return all recorded eval metrics as {name, value, metadata?} dicts |
workflow_execute options currently include:
- max_steps
- persist_path
- resume_path
- resume_run
- replay_path
- replay_run
- replay_mode ("deterministic" currently replays saved stage fixtures)
- parent_run_id
- root_run_id
- execution ({cwd?, env?, worktree?} for isolated delegated execution)
- audit (seed mutation-session metadata for trust/audit grouping)
- mutation_scope
- approval_policy (declarative tool approval policy; see below)
verify nodes may also define execution checks inside node.verify, including:
- command to execute via the host shell in the current execution context
- assert_text to require visible output to contain a substring
- expect_status to require a specific exit status
Tool lifecycle hooks
| Function | Parameters | Returns | Description |
|---|---|---|---|
register_tool_hook(config) | config: dict | nil | Register a pre/post hook for tool calls matching pattern (glob). deny string blocks matching tools; max_output int truncates results |
clear_tool_hooks() | none | nil | Remove all registered tool hooks |
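Example:

```harn
// Block shell-like tools outright and cap every tool's output size.
register_tool_hook({pattern: "shell*", deny: "shell access is disabled"})
register_tool_hook({pattern: "*", max_output: 4000})

// ... run agent_loop(...) with the hooks active ...

clear_tool_hooks()
```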
Context and compaction utilities
| Function | Parameters | Returns | Description |
|---|---|---|---|
estimate_tokens(messages) | messages: list | int | Estimate token count for a message list (chars / 4 heuristic) |
microcompact(text, max_chars?) | text, max_chars (default 20000) | string | Snip oversized text, keeping head and tail with a marker |
select_artifacts_adaptive(artifacts, policy) | artifacts: list, policy: dict | list | Deduplicate, microcompact oversized artifacts, then select with token budget |
transcript_auto_compact(messages, options?) | messages: list, options: dict | list | Run the same transcript auto-compaction pipeline used by agent_loop |
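Example (the message shape is illustrative):

```harn
let messages = [
    {role: "user", content: "summarize the design doc"},
    {role: "assistant", content: "It describes the workflow runtime."},
]
println(estimate_tokens(messages))   // rough chars / 4 estimate

// Snip oversized text, keeping head and tail around a marker;
// text under the limit comes back unchanged.
let snipped = microcompact("short text", 2000)
```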
Delegated workers
| Function | Parameters | Returns | Description |
|---|---|---|---|
spawn_agent(config) | config: dict | dict | Start a worker from a workflow graph or delegated stage config |
sub_agent_run(task, options?) | task: string, options: dict | dict | Run an isolated child agent loop and return a clean envelope {summary, artifacts, evidence_added, tokens_used, budget_exceeded, ...} without leaking the child transcript into the parent |
send_input(handle, task) | handle, task | dict | Re-run a completed worker with a new task, carrying forward worker state where applicable |
resume_agent(id_or_snapshot_path) | id or path | dict | Restore a persisted worker snapshot into the current runtime |
wait_agent(handle_or_list) | handle or list | dict or list | Wait for one worker or a list of workers to finish |
close_agent(handle) | handle | dict | Cancel a worker and mark it terminal |
list_agents() | none | list | List worker summaries tracked by the current runtime |
spawn_agent(...) accepts either:
- `{task, graph, artifacts?, options?, name?, wait?}` for typed workflow runs
- `{task, node, artifacts?, transcript?, name?, wait?}` for delegated stage runs
- Either shape may also include `policy: <capability_policy>` to narrow the worker’s inherited execution ceiling.
- Either shape may also include `tools: ["name", ...]` as shorthand for a worker policy that only allows those tool names.
- Either shape may also include `execution: {cwd?, env?, worktree?}` where `worktree` accepts `{repo, path?, branch?, base_ref?, cleanup?}`.
- Either shape may also include `audit: {session_id?, parent_session_id?, mutation_scope?, approval_policy?}`.
Worker configs may also include carry to control continuation behavior:
- `carry: {artifacts: "inherit" | "none" | <context_policy>}`
- `carry: {resume_workflow?: bool, persist_state?: bool}`
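Putting the pieces together, a hedged sketch of spawning and awaiting a delegated worker (`review_node` stands in for a real delegated stage config built elsewhere):

```harn
let handle = spawn_agent({
    task: "Review the proposed diff",
    node: review_node,               // assumed stage config
    tools: ["read_file"],            // shorthand worker policy
    carry: {artifacts: "inherit"},
})

// Block until the worker finishes, then inspect the handle dict.
let done = wait_agent(handle)
println(done.status)
```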
To give a spawned worker prior conversation context, open a session
before spawning and set model_policy.session_id on the worker’s node.
Use agent_session_fork(parent) if the worker should start from a
branch of an existing conversation; agent_session_reset(id) before
the call if you want a fresh run with the same id.
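For example, a sketch of branching an existing conversation into a worker; the node shape is abbreviated to the one relevant field:

```harn
// Fork the parent conversation so the worker starts from a branch.
let forked = agent_session_fork(parent_session_id)

let handle = spawn_agent({
    task: "Continue the investigation with prior context",
    node: {model_policy: {session_id: forked}},   // illustrative node shape
})
```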
Workers return handle dicts with an id, lifecycle timestamps, status,
mode, result/error fields, transcript presence, produced artifact count,
snapshot/child-run paths, immutable original request metadata, normalized
provenance, and audit mutation-session metadata when available.
The request object preserves canonical research_questions,
action_items, workflow_stages, and verification_steps arrays when the
caller supplied them.
When a worker-scoped policy denies a tool call, the agent receives a structured
tool result payload: {error: "permission_denied", tool: "...", reason: "..."}.
sub_agent_run(task, options?) is the lighter-weight context-firewall primitive.
It starts a child session, runs a full agent_loop, and returns only a single
typed envelope to the parent:
- `summary`, `artifacts`, `evidence_added`, `tokens_used`, `budget_exceeded`, `session_id`, and optional `data`
- `ok: false` plus `error: {category, message, tool?}` when the child fails or hits a capability denial
- `background: true` returns a normal worker handle whose `mode` is `sub_agent`
Options mirror agent_loop where relevant (provider, model, tools,
tool_format, max_iterations, token_budget, policy, approval_policy,
session_id, system) and also accept:
- `allowed_tools: ["name", ...]` to narrow the child tool registry and capability ceiling
- `response_format: "json"` to parse structured child JSON into `data` from the final successful transcript when possible
- `returns: {schema: ...}` to validate that structured child JSON against a schema
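A minimal sketch of the context-firewall pattern, assuming the listed tool exists in the registry:

```harn
let child = sub_agent_run("Audit error handling in src/", {
    allowed_tools: ["read_file"],
    max_iterations: 8,
    response_format: "json",
})

if child.budget_exceeded {
    println("child hit its token budget")
} else {
    println(child.summary)   // only the envelope reaches the parent
}
```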
Artifacts and context
| Function | Parameters | Returns | Description |
|---|---|---|---|
artifact(payload) | payload: dict | artifact | Normalize a typed artifact/resource |
artifact_derive(parent, kind, extra?) | parent, kind, extra | artifact | Derive a new artifact from a prior one |
artifact_select(artifacts, policy?) | artifacts, policy | list | Select artifacts under context policy and budget |
artifact_context(artifacts, policy?) | artifacts, policy | string | Render selected artifacts into context |
artifact_workspace_file(path, content, extra?) | path, content, extra | artifact | Build a normalized workspace-file artifact with path provenance |
artifact_workspace_snapshot(paths, summary?, extra?) | paths, summary, extra | artifact | Build a workspace snapshot artifact for host/editor context |
artifact_editor_selection(path, text, extra?) | path, text, extra | artifact | Build an editor-selection artifact from host UI state |
artifact_verification_result(title, text, extra?) | title, text, extra | artifact | Build a verification-result artifact |
artifact_test_result(title, text, extra?) | title, text, extra | artifact | Build a test-result artifact |
artifact_command_result(command, output, extra?) | command, output, extra | artifact | Build a command-result artifact with structured output |
artifact_diff(path, before, after, extra?) | path, before, after, extra | artifact | Build a unified diff artifact from before/after text |
artifact_git_diff(diff_text, extra?) | diff_text, extra | artifact | Build a git-diff artifact from host/tool output |
artifact_diff_review(target, summary?, extra?) | target, summary, extra | artifact | Build a diff-review artifact linked to a diff/patch target |
artifact_review_decision(target, decision, extra?) | target, decision, extra | artifact | Build an accept/reject review-decision artifact linked by lineage |
artifact_patch_proposal(target, patch, extra?) | target, patch, extra | artifact | Build a proposed patch artifact linked to an existing target |
artifact_verification_bundle(title, checks, extra?) | title, checks, extra | artifact | Bundle structured verification checks into one review artifact |
artifact_apply_intent(target, intent, extra?) | target, intent, extra | artifact | Record an apply or merge intent linked to a reviewed artifact |
Core artifact kinds commonly used by the runtime include resource,
workspace_file, workspace_snapshot, editor_selection, summary,
transcript_summary, diff, git_diff, patch, patch_set,
patch_proposal, diff_review, review_decision, verification_bundle,
apply_intent, test_result, verification_result, command_result,
and plan.
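For instance, a diff can be built, reviewed, and rendered into context in a few calls (the path and text values are illustrative):

```harn
let diff = artifact_diff("src/lib.rs", before_text, after_text)
let decision = artifact_review_decision(diff, "accept")

// Select under the default context policy, then render for a prompt.
let selected = artifact_select([diff, decision])
let context = artifact_context(selected)
println(context)
```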
Sessions
Sessions are the first-class resource for agent-loop conversations. They own a transcript history, closure subscribers, and a lifecycle. See the Sessions chapter for the full model.
| Function | Parameters | Returns | Description |
|---|---|---|---|
agent_session_open(id?) | id: string or nil | string | Idempotent open; nil mints a UUIDv7 |
agent_session_exists(id) | id | bool | Safe on unknown ids |
agent_session_length(id) | id | int | Message count; errors on unknown id |
agent_session_snapshot(id) | id | dict or nil | Read-only deep copy of the transcript |
agent_session_reset(id) | id | nil | Wipes history; preserves id and subscribers |
agent_session_fork(src, dst?) | src, dst | string | Copies transcript; subscribers are not copied |
agent_session_trim(id, keep_last) | id, keep_last: int | int | Retain last keep_last messages; returns kept count |
agent_session_compact(id, opts) | id, opts: dict | int | Runs the LLM/truncate/observation-mask compactor |
agent_session_inject(id, message) | id, message: dict | nil | Appends {role, content, …}; missing role errors |
agent_session_close(id) | id | nil | Evicts immediately regardless of LRU cap |
Pair with agent_loop(..., {session_id: id, ...}): prior messages load
as prefix and the final transcript is persisted back on exit.
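A minimal session round-trip, assuming `agent_loop` takes the task followed by an options dict as elsewhere in this guide:

```harn
let id = agent_session_open(nil)   // nil mints a UUIDv7

// Seed prior context, then run a loop against the session.
agent_session_inject(id, {role: "user", content: "We already fixed the parser."})
let result = agent_loop("Now update the tests.", {session_id: id})

println(agent_session_length(id))  // transcript persisted back on exit
agent_session_close(id)
```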
Transcript lifecycle
Lower-level transcript primitives. Most callers should prefer sessions; these remain useful for building synthetic transcripts, replay fixtures, and offline analysis.
| Function | Parameters | Returns | Description |
|---|---|---|---|
transcript(metadata?) | metadata: any | transcript | Create an empty transcript |
transcript_messages(transcript) | transcript | list | Return transcript messages |
transcript_assets(transcript) | transcript | list | Return transcript asset descriptors |
transcript_add_asset(transcript, asset) | transcript, asset | transcript | Register a durable asset reference on a transcript |
transcript_events(transcript) | transcript | list | Return canonical transcript events |
transcript_events_by_kind(transcript, kind) | transcript, kind | list | Filter transcript events by their kind field |
transcript_stats(transcript) | transcript | dict | Count messages, tool calls, and visible events on a transcript |
transcript_summary(transcript) | transcript | string or nil | Return transcript summary |
transcript_fork(transcript, options?) | transcript, options | transcript | Fork transcript state |
transcript_reset(options?) | options | transcript | Start a fresh active transcript with optional metadata |
transcript_archive(transcript) | transcript | transcript | Mark transcript archived and append an internal lifecycle event |
transcript_abandon(transcript) | transcript | transcript | Mark transcript abandoned and append an internal lifecycle event |
transcript_resume(transcript) | transcript | transcript | Mark transcript active again and append an internal lifecycle event |
transcript_compact(transcript, options?) | transcript, options | transcript | Compact a transcript with the runtime compaction engine |
transcript_summarize(transcript, options?) | transcript, options | transcript | Compact via LLM-generated summary |
transcript_auto_compact(messages, options?) | messages, options | list | Apply the agent-loop compaction pipeline to a message list |
transcript_render_visible(transcript) | transcript | string | Render only public/human-visible messages |
transcript_render_full(transcript) | transcript | string | Render the full execution history |
Transcript messages may now carry structured block content instead of plain
text. Use add_user(...), add_assistant(...), or add_message(...) with a
list of blocks such as {type: "text", text: "..."},
{type: "image", asset_id: "..."}, {type: "file", asset_id: "..."}, and
{type: "tool_call", ...}, with per-block
visibility: "public" | "internal" | "private". Durable media belongs in
transcript.assets, while message/event blocks should reference those assets
by id or path.
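A hedged sketch of a block-content message; the `add_user` signature and the asset descriptor fields are assumptions based on the description above:

```harn
let t = transcript({title: "vision demo"})

// Durable media lives in transcript.assets...
let t2 = transcript_add_asset(t, {id: "img-1", path: "shots/home.png"})

// ...and message blocks reference it by id.
add_user(t2, [
    {type: "text", text: "What does this screenshot show?", visibility: "public"},
    {type: "image", asset_id: "img-1", visibility: "public"},
])
```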
Project scanning
The std/project module now includes a deterministic L0/L1 project scanner for
lightweight “what kind of project is this?” evidence without any LLM calls.
Import it with:
import "std/project"
What it returns
project_scan(path, options?) resolves path to a directory and returns a
dictionary describing exactly that directory:
let ev = project_scan(".", {tiers: ["ambient", "config"]})
Typical fields:
- `path`: absolute path to the scanned directory
- `languages`: stable, confidence-filtered language IDs such as `["rust"]`
- `frameworks`: coarse framework IDs when an anchor is obvious
- `build_systems`: coarse build systems such as `["cargo"]` or `["npm"]`
- `vcs`: currently `"git"` when the directory is inside a Git checkout
- `anchors`: anchor files or directories found at the project root
- `lockfiles`: lockfiles found at the project root
- `confidence`: coarse per-language/per-framework scores
- `package_name`: root package/module name when it can be parsed deterministically
When tiers includes "config", the scan also fills in:
- `build_commands`: default or discovered build/test commands
- `declared_scripts`: parsed `package.json` scripts
- `makefile_targets`: parsed Makefile targets
- `dockerfile_commands`: parsed `RUN`, `CMD`, and `ENTRYPOINT` commands
- `readme_code_fences`: fenced-language labels found in the README
Tiers
- `ambient`: anchor files, lockfiles, coarse build system detection, VCS, and confidence scoring. No config parsing.
- `config`: deterministic config reads for files already found by `ambient`.
If tiers is omitted, project_scan(...) defaults to ["ambient"].
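For example, a config-tier scan surfaces discovered build commands alongside the ambient evidence:

```harn
import "std/project"

let ev = project_scan(".", {tiers: ["ambient", "config"]})
println(ev.languages)        // e.g. ["rust"]
println(ev.build_commands)   // filled in by the config tier
```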
Polyglot repos
Single-directory scans stay leaf-scoped on purpose. For polyglot repos and
monorepos, use project_scan_tree(...) and let callers decide how to combine
sub-project evidence:
let tree = project_scan_tree(".", {tiers: ["ambient"], depth: 3})
// {".": {...}, "frontend": {...}, "backend": {...}}
project_scan_tree(...):
- always includes `"."` for the requested base directory
- walks subdirectories deterministically
- honors `.gitignore` by default
- skips standard vendor/build directories such as `node_modules/` and `target/` by default
You can override those defaults with:
- `respect_gitignore: false`
- `include_vendor: true`
- `include_hidden: true`
Enrichment
project_enrich(path, options) layers an L2, caller-owned enrichment pass on
top of deterministic project_scan(...) evidence. The caller supplies the
prompt template and the output schema; Harn owns prompt rendering, bounded file
selection, schema-retry plumbing, and content-hash caching.
Typical use:
let base = project_scan(".", {tiers: ["ambient", "config"]})
let enriched = project_enrich(".", {
base_evidence: base,
prompt: "Project: {{package_name}}\n{{ for file in files }}FILE {{file.path}}\n{{file.content}}\n{{ end }}\nReturn JSON.",
schema: {
type: "object",
required: ["framework", "indent_style"],
properties: {
framework: {type: "string"},
indent_style: {type: "string"},
},
},
budget_tokens: 4000,
model: "auto",
cache_key: "coding-enrichment-v1",
})
Bindings available to the template:
- `path`: absolute project path
- `base_evidence` / `evidence`: the supplied or auto-scanned L0/L1 evidence
- every top-level key from `base_evidence`
- `files`: deterministic bounded file context as `{path, content, truncated}`
Behavior:
- the cache key includes `cache_key`, path, schema, rendered prompt, and the content hash of the selected files
- cached hits surface `_provenance.cached == true`
- when the rendered prompt would exceed `budget_tokens`, the call returns the base evidence with `budget_exceeded: true` instead of failing
- schema-retry exhaustion returns an envelope with `validation_error` and `base_evidence` instead of raising
By default, cache entries live under .harn/cache/enrichment/ inside the
project root. Override that with cache_dir when a caller wants a different
location.
Cached deep scans
project_deep_scan(path, options?) layers a cached per-directory tree on top
of the metadata store. It is intended for repeated L2/L3 repo analysis where
callers want stable hierarchical evidence instead of re-running enrichment on
every turn.
Typical shape:
let tree = project_deep_scan(".", {
namespace: "coding-enrichment-v1",
tiers: ["ambient", "config", "enriched"],
incremental: true,
max_staleness_seconds: 86400,
depth: nil,
enrichment: {
prompt: "Return valid JSON only.",
schema: {purpose: "string", conventions: ["string"]},
provider: "mock",
budget_tokens_per_dir: 1024,
},
})
Notes:
- `namespace` is caller-owned, so multiple agents can keep separate trees for the same repo without collisions.
- `incremental: true` reuses cached directories whose local directory `structure_hash` and `content_hash` still match.
- `depth: nil` means unbounded traversal.
- The filesystem backend persists namespace shards under `.harn/metadata/<namespace>/entries.json`.
- `project_deep_scan_status(namespace, path?)` returns the last recorded scan summary for that scope: `{total_dirs, enriched_dirs, stale_dirs, cache_hits, last_refresh, ...}`.
project_enrich(path, options?) is the single-directory building block used by
deep scan when the enriched tier is requested.
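To check how fresh a cached tree is before reusing it:

```harn
let status = project_deep_scan_status("coding-enrichment-v1", ".")
println("enriched ${status.enriched_dirs} of ${status.total_dirs} dirs")
println("stale: ${status.stale_dirs}, cache hits: ${status.cache_hits}")
```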
Catalog
project_catalog() returns the authoritative built-in catalog that drives
ambient detection. Each entry includes:
- `id`
- `languages`
- `frameworks`
- `build_systems`
- `anchors`
- `lockfiles`
- `source_globs`
- `default_build_cmd`
- `default_test_cmd`
The catalog lives in
crates/harn-vm/src/stdlib/project_catalog.rs. Adding a new language should be
a table entry plus a test, not a new custom code path.
Existing helper
project_root_package() now delegates to the scanner’s config tier after
checking metadata enrichment, so existing callers keep the same package-name
surface while the manifest parsing logic stays centralized.
Prompt templating
Harn ships a small template language for rendering .harn.prompt and .prompt
asset files. It is invoked by the render(path, bindings?) and
render_prompt(path, bindings?) builtins (and, equivalently, via the
template.render host capability). The engine is intentionally minimal — a
rendering layer for prompts, not a scripting language — but it covers the
ergonomics most prompt authors reach for: conditionals with else/elif,
loops, includes, filters, comments, and whitespace control.
This page is the reference. The one-page quickref has a condensed version for agents writing Harn.
At a glance
{{ name }} interpolation
{{ user.name }} / {{ items[0] }} nested path access
{{ name | upper | default: "anon" }} filter pipeline
{{ if expr }} ... {{ elif expr }} ... {{ else }} ... {{ end }}
{{ for item in xs }} ... {{ else }} ... {{ end }} else = empty-iterable fallback
{{ for key, value in dict }} ... {{ end }}
{{ include "partial.harn.prompt" }}
{{ include "partial.harn.prompt" with { x: name } }}
{{# stripped at parse time #}}
{{ raw }} ... literal {{braces}} ... {{ endraw }}
{{- name -}} whitespace-trim markers
Interpolation
{{ path }} evaluates an expression and writes its string form into the
output. Paths support nested field access and integer/string indexing:
{{ user.name }} — field
{{ user.tags[0] }} — list index
{{ user.tags[-1] }} — negative index (counts from end)
{{ config["api-key"] }} — string key with non-identifier characters
Missing values render as the empty string, except for legacy bare
identifiers (e.g. {{ name }} with no dots/brackets/filters). For
back-compat, those render their source verbatim on a miss (the pre-v2
behavior), so existing templates that relied on “missing → literal passthrough”
keep working.
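For example, given a hypothetical greeting.harn.prompt whose body is shown in the comment:

```harn
// greeting.harn.prompt contains:
//   Hello {{ user.name | upper }}!{{ missing.field }}
let out = render("greeting.harn.prompt", {user: {name: "ada"}})
println(out)   // "Hello ADA!" (the dotted miss renders as empty string)
```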
Conditionals
{{ if role == "admin" }}
welcome, admin
{{ elif role == "user" and active }}
welcome back!
{{ else }}
please sign in
{{ end }}
Only {{ if expr }} ... {{ end }} is required; elif and else branches are
optional and can be combined. The expression grammar is:
| Category | Syntax |
|---|---|
| Literals | "str", 'str', 123, 1.5, true, false, nil |
| Paths | ident, a.b.c, a[0], a["key"] |
| Unary | not x, !x |
| Equality | ==, != |
| Comparison | <, <=, >, >= (numbers and strings) |
| Boolean (short-circuit) | and / &&, or / \|\| |
| Grouping | (expr) |
| Filters | expr \| filter, expr \| filter: arg1, arg2 |
String escapes inside quoted literals: \n, \t, \r, \\, \", \'.
Truthiness
Used both by if and by the short-circuit and/or:
| Value kind | Truthy? |
|---|---|
nil | false |
false | false |
0, 0.0 | false |
| empty/whitespace-only string | false |
| empty list / set / dict | false |
| everything else | true |
Loops
{{ for x in xs }}
- {{ loop.index }}. {{ x }}
{{ else }}
(no items)
{{ end }}
{{ else }} inside a for block renders when the iterable is empty — a
cleaner alternative to wrapping the loop in an {{ if }}.
Loop variables
Inside the loop body, a synthetic loop dict is in scope:
| Field | Type | Description |
|---|---|---|
loop.index | int | 1-based index of the current item |
loop.index0 | int | 0-based index |
loop.first | bool | true on the first iteration |
loop.last | bool | true on the final iteration |
loop.length | int | total number of items |
Dict iteration
{{ for key, value in my_dict }}
{{ key }} = {{ value }}
{{ end }}
Dicts iterate in their canonical (BTreeMap) order.
Includes
Include another template file. Paths resolve relative to the including file’s directory:
{{ include "partials/header.harn.prompt" }}
The included template inherits the parent’s scope by default. Pass explicit
bindings with with { ... } — these are merged into the parent scope for the
inner render only:
{{ include "partials/item.prompt" with { item: x, style: "bold" } }}
Safety:
- Circular includes are detected (e.g. `a.prompt` includes `b.prompt` which includes `a.prompt`) and produce a `circular include detected` error with the full chain.
- Include depth is capped at 32 levels.
- A missing included file fails with `failed to read included template <path>`.
Comments
Before{{# this never renders #}}After
Comments are stripped entirely at parse time. Use them to document a template without leaking the note into the final prompt.
Raw blocks
When a prompt needs to emit literal {{ / }} (say, the prompt includes
another template language, JSON with braces, etc.):
{{ raw }}
{{ this is output verbatim }}
{{ endraw }}
Everything between {{ raw }} and {{ endraw }} is passed through as-is,
no directive interpretation.
Whitespace control
Directives support {{- ... -}} trim markers (Jinja-style). A leading -
strips the preceding whitespace and one newline; a trailing - strips the
following whitespace and one newline. This is the idiomatic way to keep
templates readable without emitting extra blank lines:
Items:
{{- for x in xs -}}
{{ x }},
{{- end -}}
DONE
renders Items: a, b, c,DONE — no leading or trailing newlines introduced
by the control directives.
Filters
Apply transformations to a value via a pipeline. Filters can be chained and some accept arguments after a colon:
{{ items | join: ", " }}
{{ name | upper }}
{{ user.bio | default: "(no bio)" | indent: 4 }}
Built-in filters
| Filter | Args | Description |
|---|---|---|
upper | — | Uppercase the string form |
lower | — | Lowercase |
trim | — | Strip leading/trailing whitespace |
capitalize | — | First char upper, rest lower |
title | — | Title Case (capitalize each word) |
length | — | Number of items (string chars, list/set/dict entries, range size) |
first | — | First element (or char) |
last | — | Last element (or char) |
reverse | — | Reversed list or string |
join | sep: string | Join list items with sep |
default | fallback: any | Use fallback when the value is falsey |
json | pretty?: bool | Serialize as JSON (pass true for pretty) |
indent | width: int, first?: bool | Indent every line by width spaces; pass true to indent the first line too |
lines | — | Split string on \n into a list |
escape_md | — | Escape Markdown special characters |
replace | from: str, to: str | Replace all occurrences |
Unknown filters raise a clear error at render time.
Errors
On any parse or render error, the engine raises a thrown value (via
VmError::Thrown) with a message of the form:
<template-path> at <line>:<col>: <what went wrong>
Typical cases:
- `unterminated directive` — a `{{` without a matching `}}`.
- `unterminated comment` — a `{{#` without a matching `#}}`.
- `unterminated {{ raw }} block` — missing `{{ endraw }}`.
- `unknown filter foo` — the named filter isn’t registered.
- `circular include detected: a.prompt → b.prompt → a.prompt`.
- `include path must be a string` — the `{{ include }}` target wasn’t a string.
Preflight checks
harn check parses every template referenced by a literal render(...) /
render_prompt(...) call and surfaces syntax errors before you run the
pipeline. Catches things like an unterminated {{ for }} block at static
time rather than at first render.
Back-compat
The engine is a strict superset of the pre-v2 syntax:
- `{{ name }}` — interpolation; a missing bare identifier passes through verbatim
- `{{ if key }} ... {{ end }}` — truthy test
All pre-v2 templates render identically. Migrating awkward workarounds to the new forms is optional but usually shorter — see the migration guide.
Configuring LLM Providers
Harn supports multiple LLM providers out of the box. This page explains how provider and API key resolution works, and how to configure each one.
Provider resolution order
When you call llm_call() or start an agent_loop(), Harn resolves the
provider in this order:
1. Explicit option — `llm_call({provider: "openai", ...})` in your script
2. Environment variable — `HARN_LLM_PROVIDER`
3. Inferred from model name — e.g. `gpt-4o` → OpenAI, `claude-3` → Anthropic
4. Default — `anthropic`
5. Fallback — if the Anthropic key is missing, Harn tries `ollama` then `local`
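To pin the provider regardless of environment variables or model inference, pass it explicitly:

```harn
let answer = llm_call({
    provider: "openai",
    model: "gpt-4o",
    prompt: "Summarize this file in one sentence.",
})
println(answer)
```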
API key resolution
Each provider defines an auth_style and one or more environment variables:
| Provider | Environment Variable(s) | Auth Style |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | header |
| OpenAI | OPENAI_API_KEY | bearer |
| OpenRouter | OPENROUTER_API_KEY | bearer |
| HuggingFace | HF_TOKEN, HUGGINGFACE_API_KEY | bearer |
| Ollama | (none) | none |
| Local | (none) | none |
Model selection
Set the model explicitly or via environment:
// In code
llm_call({model: "claude-sonnet-4-5-20241022", prompt: "..."})
// Or via environment
// export HARN_LLM_MODEL=gpt-4o
The HARN_LLM_MODEL environment variable sets the default model when none
is specified in the script.
Rate limiting
Harn supports per-provider rate limiting (requests per minute):
# Set via environment
export HARN_RATE_LIMIT_ANTHROPIC=60
export HARN_RATE_LIMIT_OPENAI=120
Or in code:
llm_rate_limit("anthropic", 60)
The rate limiter uses a token-bucket algorithm and will pause before sending requests that would exceed the configured RPM.
Local LLM support
For local models (Ollama, llama.cpp, vLLM, etc.):
export LOCAL_LLM_BASE_URL=http://localhost:11434
export LOCAL_LLM_MODEL=llama3
Harn will automatically fall back to a local provider if no cloud API key is configured. This makes it easy to develop and test without incurring API costs.
Troubleshooting
- “No API key found” — check that the correct environment variable is set for your provider. Run `echo $ANTHROPIC_API_KEY` to verify.
- Wrong provider selected — set `HARN_LLM_PROVIDER` explicitly to override automatic detection.
- Rate limit errors — use `HARN_RATE_LIMIT_<PROVIDER>` to throttle requests below your plan’s limit.
- Debug message shapes — set `HARN_DEBUG_MESSAGE_SHAPES=1` to log the structure of messages sent to the LLM provider.
Debugging Agent Runs
Harn provides several tools for inspecting, replaying, and evaluating agent runs. This page walks through the debugging workflow.
Source-level debugging
For step-through debugging, start the Debug Adapter Protocol server:
cargo run --bin harn-dap
In VS Code, the Harn extension contributes a harn debug configuration
automatically. The equivalent launch.json entry is:
{
"type": "harn",
"request": "launch",
"name": "Debug Current Harn File",
"program": "${file}",
"cwd": "${workspaceFolder}"
}
This supports line breakpoints, variable inspection, stack traces, and step
in / over / out against .harn files.
Host-call bridge (harnHostCall)
The debug adapter advertises supportsHarnHostCall: true in its
Capabilities response. When a script calls host_call(capability, operation, params) and the VM has no built-in handler for the op, the
adapter forwards it to the DAP client as a reverse request named
harnHostCall — mirroring the DAP runInTerminal pattern:
{"seq": 17, "type": "request", "command": "harnHostCall",
"arguments": {"capability": "workspace", "operation": "project_root",
"params": {}}}
The client replies with a normal DAP response:
{"seq": 18, "type": "response", "request_seq": 17, "command": "harnHostCall",
"success": true, "body": {"value": "/Users/x/proj"}}
On success: true, the adapter returns the body’s value field (or the
whole body when value is absent) to the script. On success: false,
the adapter throws VmError::Thrown(message) so scripts can try /
catch the failure like any other Harn exception. Clients that do not
implement harnHostCall still work — the script just sees the
standalone fallbacks (workspace.project_root, workspace.cwd, etc.).
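Because a failed host call surfaces as a thrown value, scripts can guard it with ordinary error handling (the catch binding form follows the error-handling chapter):

```harn
try {
    let root = host_call("workspace", "project_root", {})
    println("project root: ${root}")
} catch (err) {
    // Client rejected the capability; fall back gracefully.
    println("host call failed: ${err}")
}
```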
LLM telemetry output events
During run / step-through, the adapter forwards every llm_call the
VM makes as a DAP output event with category: "telemetry" and a
JSON body:
{"category": "telemetry",
"output": "{\"call_id\":\"…\",\"model\":\"…\",\"prompt_tokens\":…,\"completion_tokens\":…,\"cache_tokens\":…,\"total_ms\":…,\"iteration\":…}"}
IDEs can parse these to show a live LLM-call ledger alongside the debug session.
Run records
Every agent_loop() or workflow_execute() call can produce a run record —
a JSON file in .harn-runs/ that captures the full execution trace including
LLM calls, tool invocations, and intermediate results.
# List recent runs
ls .harn-runs/
# Inspect a run record
harn runs inspect .harn-runs/<run-id>.json
The inspect command shows a structured summary: stages executed, tools called, token usage, timing, and final output.
Comparing runs
Compare a run against a baseline to identify regressions:
harn runs inspect .harn-runs/new.json --baseline .harn-runs/old.json
This highlights differences in tool calls, outputs, and token consumption.
Replay
Replay re-executes a recorded run, using the saved LLM responses instead of making live API calls. This is useful for deterministic debugging:
harn replay .harn-runs/<run-id>.json
Replay shows each stage transition and lets you verify that your pipeline produces the same results given the same LLM responses.
Visualizing a pipeline
When you want a quick structural view instead of a live debug session, render a Mermaid graph from the AST:
harn viz main.harn
harn viz main.harn --output docs/main.mmd
The generated graph is useful for reviewing branch-heavy pipelines, match arms, parallel blocks, and nested retries before you start stepping through them.
Evaluation
The harn eval command scores a run or set of runs against expected outcomes:
# Evaluate a single run
harn eval .harn-runs/<run-id>.json
# Evaluate all runs in a directory
harn eval .harn-runs/
# Evaluate using a manifest
harn eval eval-suite.json
Custom metrics
Use eval_metric() in your pipeline to record domain-specific metrics:
eval_metric("accuracy", 0.95, {dataset: "test-v2"})
eval_metric("latency_ms", 1200)
These metrics appear in run records and are aggregated by harn eval.
Token usage tracking
Track LLM costs during a run:
let usage = llm_usage()
log("Tokens used: ${usage.input_tokens + usage.output_tokens}")
log("LLM calls: ${usage.total_calls}")
Portal
The Harn portal is an interactive web UI for inspecting runs:
harn portal
This opens a dashboard showing all runs in .harn-runs/, with drill-down
into individual stages, tool calls, and transcript snapshots.
Tips
- Add `eval_metric()` calls to your pipelines early — they’re cheap to record and invaluable for tracking quality over time.
- Use replay for debugging non-deterministic failures: record the failing run, then replay it locally to step through the logic.
- Compare baselines when refactoring prompts or changing tool definitions to catch regressions before they ship.
Editor integration
Harn provides first-class editor support through an LSP server, a DAP debugger, and a tree-sitter grammar. These cover most modern editors and IDE workflows.
VS Code
The editors/vscode/ directory contains a VS Code extension that bundles
syntax highlighting (via tree-sitter) and automatic LSP/DAP client
configuration.
Install from the extension directory:
cd editors/vscode && npm install && npm run build
Then use Extensions: Install from VSIX or symlink into
~/.vscode/extensions/.
Language server (LSP)
Start the LSP server with:
cargo run --bin harn-lsp
Or use the compiled binary directly (harn-lsp). The server communicates
over stdin/stdout using the Language Server Protocol.
Supported capabilities
| Feature | Description |
|---|---|
| Diagnostics | Real-time parse errors, type errors (including cross-module undefined-call errors), and warnings. Shares the same module graph used by harn check and harn run. |
| Completions | Scope-aware: pipelines, functions, variables, parameters, enums, structs, interfaces. Dot-completions for methods plus inferred shape fields, struct members, and enum payload fields. Builtins and keywords. |
| Go-to-definition | Jump to the declaration of pipelines, functions, variables, enums, structs, and interfaces. Cross-file navigation walks the recursive module graph (relative paths and .harn/packages/), so symbols reachable through any number of transitive imports resolve. |
| Find references | Locate all usages of a symbol across the document |
| Hover | Shows type information and documentation for builtins |
| Signature help | Parameter hints while typing function arguments |
| Document symbols | Outline view of pipelines, functions, structs, enums |
| Workspace symbols | Cross-file search for pipelines and functions |
| Semantic tokens | Fine-grained syntax highlighting for keywords, types, functions, parameters, enums, and more |
| Code actions | Quick fixes for lint warnings (var→let, boolean simplification, unused import removal, string interpolation) and type errors |
| Rename | Rename symbols across the document |
| Document formatting | Delegates to harn-fmt for format-on-save support |
Configuration
Most editors auto-detect the LSP binary. For manual configuration, point
your editor’s LSP client at the harn-lsp binary with no arguments. The
server uses TextDocumentSyncKind::FULL and debounces full-document reparses
so diagnostics stay responsive while you are typing.
Debug adapter (DAP)
Start the debugger with:
cargo run --bin harn-dap
The DAP server communicates over stdin/stdout using the Debug Adapter Protocol. It supports:
- Breakpoints (line-based)
- Step in / step over / step out
- Variable inspection in scopes
- Stack frame navigation
- Continue / pause execution
VS Code launch configuration
The VS Code extension now contributes a harn debugger type and an initial
Debug Current Harn File launch configuration. You can also add it manually:
{
"type": "harn",
"request": "launch",
"name": "Debug Harn",
"program": "${file}",
"cwd": "${workspaceFolder}"
}
Set harn.dapPath if harn-dap is not on your PATH.
Tree-sitter grammar
The tree-sitter-harn/ directory contains a tree-sitter grammar for Harn.
This powers syntax highlighting in editors that support tree-sitter
(Neovim, Helix, Zed, etc.).
Build the grammar:
cd tree-sitter-harn && npx tree-sitter generate
Highlight queries are in tree-sitter-harn/queries/highlights.scm.
Formatter
Format Harn files from the command line or integrate with editor format-on-save:
harn fmt file.harn # format in place
harn fmt --check file.harn # check without modifying
Linter
Run the linter for static analysis:
harn lint file.harn
harn lint --fix file.harn # automatically apply safe fixes
The linter checks for: shadow variables, unused variables, unused types,
undefined functions, unreachable code, missing harndoc comments, naming
convention drift, branch-heavy functions, and prompt-injection risks such as
interpolated llm_call system prompts. With --fix, the linter automatically
rewrites fixable issues (e.g., var → let, boolean comparison
simplification, unused import removal).
Testing
Harn provides several layers of testing support: a conformance test runner, a standard library testing module, and host-mock helpers for isolating agent behavior from real host capabilities.
Conformance tests
Conformance tests are the primary executable specification for the Harn
language and runtime. They live under conformance/tests/ as paired files:
- test_name.harn — Harn source code
- test_name.expected — exact expected stdout output
Tests are grouped by area into subdirectories. ls conformance/tests/ gives
the current top-level map (examples: language/, control_flow/, types/,
collections/, concurrency/, stdlib/, templates/, modules/,
agents/, integration/, runtime/). The runner discovers .harn files
recursively, so new tests just need to be dropped into the appropriate
subdirectory.
Shared helpers live alongside the tests that use them:
conformance/tests/modules/lib/ holds import targets for the modules/
tests, and conformance/tests/templates/fixtures/ holds prompt-template
fixtures for the templates/ tests.
Error tests (Harn programs that are expected to fail) live under
conformance/errors/, similarly subdivided into syntax/, types/,
semantic/, and runtime/.
Running tests
# Run the full conformance suite
harn test conformance
# Filter by name (substring match)
harn test conformance --filter workflow_runtime
# Filter by tag (if test uses tags)
harn test conformance --tag agent
# Verbose output
harn test conformance --filter my_test -v
# Timing summary without verbose failure details
harn test conformance --timing --filter my_test
Writing a conformance test
Create a .harn file with a pipeline default(task) entry point and use
log() or println() to produce output:
// conformance/tests/<group>/my_feature.harn (e.g. stdlib/, types/)
pipeline default(task) {
let result = my_feature(42)
log(result)
}
Then create a .expected file with the exact output:
[harn] 84
The std/testing module
Import std/testing in your Harn tests for higher-level test helpers:
import { mock_host_result, assert_host_called, clear_host_mocks } from "std/testing"
Host mock helpers
| Function | Description |
|---|---|
| clear_host_mocks() | Remove all registered host mocks |
| mock_host_result(cap, op, result, params?) | Mock a host capability to return a value |
| mock_host_error(cap, op, message, params?) | Mock a host capability to return an error |
| mock_host_response(cap, op, config) | Mock with full response configuration |
Host call assertions
| Function | Description |
|---|---|
| host_calls() | Return all recorded host calls |
| host_calls_for(cap, op) | Return calls for a specific capability/operation |
| assert_host_called(cap, op, params?) | Assert a host call was made |
| assert_host_call_count(cap, op, expected_count) | Assert exact call count |
| assert_no_host_calls() | Assert no host calls were made |
Example
import { mock_host_result, assert_host_called, clear_host_mocks } from "std/testing"
pipeline default(task) {
clear_host_mocks()
// Mock the workspace.read_text capability
mock_host_result("workspace", "read_text", "file contents")
// Code under test calls host_call("workspace.read_text", ...)
let content = host_call("workspace.read_text", {path: "test.txt"})
log(content)
// Verify the call was made
assert_host_called("workspace", "read_text")
}
LLM mocking
For testing agent loops without real LLM calls, use llm_mock():
llm_mock({text: "The answer is 42"})
let result = llm_call([
{role: "user", content: "What is the answer?"}
])
log(result)
This queues a canned response that the next LLM call consumes.
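Because each queued mock is consumed by the next call, several llm_mock calls can script a short multi-turn test. A minimal sketch, assuming only the FIFO behavior described above (the prompts and canned texts are illustrative):

```harn
// Queue two canned responses; each llm_call consumes the next one in order.
llm_mock({text: "Step 1: read the file"})
llm_mock({text: "Step 2: summarize it"})

let first = llm_call([{role: "user", content: "What should I do first?"}])
let second = llm_call([{role: "user", content: "And then?"}])

log(first)
log(second)
```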
For end-to-end CLI runs, harn run and harn playground can preload the same mock
infrastructure from a JSONL fixture file:
{"text":"PLAN: find the middleware module first","model":"fixture-model"}
{"match":"*hello*","text":"matched","model":"fixture-model"}
{"match":"*","error":{"category":"rate_limit","message":"fake rate limit"}}
harn run script.harn --llm-mock fixtures.jsonl
harn playground --script pipeline.harn --llm-mock fixtures.jsonl
- A line without match is FIFO and is consumed on use.
- A line with match is checked in file order as a glob against the request transcript text.
- Add "consume_match": true when repeated matching prompts should advance through a scripted sequence instead of reusing the same line forever.
- When no fixture matches, harn run --llm-mock ... and harn playground --llm-mock ... fail with the first prompt snippet so you can add the missing case directly.
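For example, a fixture where two successive prompts hit the same glob but should get different answers, with a catch-all at the end (the globs and texts here are illustrative):

```jsonl
{"match":"*review*","text":"first pass: looks fine","consume_match":true}
{"match":"*review*","text":"second pass: found an off-by-one","consume_match":true}
{"match":"*","text":"fallback answer"}
```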
To capture a replayable fixture from a run, record once and then replay the saved JSONL:
harn run script.harn --llm-mock-record fixtures.jsonl
harn run script.harn --llm-mock fixtures.jsonl
harn playground --script pipeline.harn --llm-mock-record fixtures.jsonl
harn playground --script pipeline.harn --llm-mock fixtures.jsonl
Built-in assertions
Harn provides assert, assert_eq, and assert_ne builtins for test pipelines:
assert(x > 0, "x must be positive")
assert_eq(actual, expected)
assert_ne(actual, unexpected)
assert_eq(len(items), 3)
Failed assertions throw an error with a descriptive message including the expected and actual values.
Use require for runtime invariants in normal pipelines. The linter warns if
you use assert* outside test pipelines, and it suggests assert* instead of
require inside test pipelines.
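A sketch of the split, assuming require takes the same (condition, message) shape as assert (its exact signature is not shown here):

```harn
// Normal pipeline: runtime invariants use require.
pipeline default(task) {
    require(len(task) > 0, "task must not be empty")
    log(task)
}

// Test pipeline: expectations use the assert* builtins.
pipeline test_len(task) {
    let items = [1, 2, 3]
    assert_eq(len(items), 3)
    assert_ne(len(items), 0)
}
```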
Migrating from 0.6.x to 0.7.0
Harn 0.7.0 replaces the implicit transcript_policy dict with
first-class sessions. Session lifecycle is driven by
imperative builtins, and unknown inputs hard-error instead of silently
no-op’ing.
This guide lists every removed surface with a side-by-side rewrite.
transcript_policy on workflow nodes
The per-node policy dict is gone. Its fields moved to two dedicated setters plus lifecycle verbs.
Before (0.6)
workflow_set_transcript_policy(graph, "summarize", {
mode: "reset",
visibility: "public",
auto_compact: true,
compact_threshold: 8000,
compact_strategy: "truncate",
keep_last: 6,
})
After (0.7)
// Shape the node's compaction behavior:
workflow_set_auto_compact(graph, "summarize", {
auto_compact: true,
compact_threshold: 8000,
compact_strategy: "truncate",
keep_last: 6,
})
workflow_set_output_visibility(graph, "summarize", "public")
// To reset the stage's conversation explicitly before execution,
// open a caller-controlled session and wire it into the node's
// model_policy:
let sid = agent_session_open("summarize-v2")
workflow_set_model_policy(graph, "summarize", {session_id: sid})
agent_session_reset(sid)
mode: "fork" maps to agent_session_fork(src, dst?) called before
workflow_execute, wiring the fork id into the node’s
model_policy.session_id. mode: "continue" is the new default — two
stages sharing a session_id share a conversation automatically.
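Putting the fork mapping together, a migrated mode: "fork" node looks roughly like this sketch (graph construction omitted; the session and node names are illustrative):

```harn
// 0.6: workflow_set_transcript_policy(graph, "summarize", {mode: "fork", ...})
// 0.7: fork the source session before execution, wire the fork in.
let src = agent_session_open("main-thread")
let dst = agent_session_fork(src)
workflow_set_model_policy(graph, "summarize", {session_id: dst})
workflow_execute(graph)
```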
transcript_id / transcript_metadata on llm_call
Both keys were removed. Session id subsumes them.
Before
let result = llm_call("hi", {
transcript_id: "chat-42",
transcript_metadata: {user: "ada"},
})
After
// `session_id` is honored by `agent_loop`; `llm_call` is single-shot.
// For conversational continuity, move to agent_loop:
let sid = agent_session_open("chat-42")
let result = agent_loop("hi", nil, {session_id: sid})
If you relied on the transcript_metadata bag, attach it to the
session via your own store or pass per-call context through the
metadata field of injected messages. transcript_summary (per-call
summary injection for mid-loop compaction output) is unchanged.
transcript option on llm_call / agent_loop
Passing a raw transcript dict through the transcript option is now a
hard error.
Before
let t = transcript()
let result = agent_loop("task", nil, {transcript: t, provider: "mock"})
After
let sid = agent_session_open()
let result = agent_loop("task", nil, {session_id: sid, provider: "mock"})
// `agent_session_snapshot(sid)` if you want the transcript back as a dict.
The loop loads prior messages from the session store as a prefix before running and persists the final transcript back on exit.
Lifecycle via dict (mode: "reset" | "fork")
Previously some call sites accepted a lifecycle dict. That pattern is gone — call the verbs explicitly:
- mode: "reset" → agent_session_reset(id)
- mode: "fork" → let dst = agent_session_fork(src) (optionally with a caller-provided dst id)
- mode: "continue" → no-op; just reuse the same session_id
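As Harn code, the three rewrites look like the following sketch (session names illustrative):

```harn
let sid = agent_session_open("review")

// mode: "reset": wipe the conversation before reuse.
agent_session_reset(sid)

// mode: "fork": branch the conversation into a new session.
let branch = agent_session_fork(sid)

// mode: "continue": nothing to call; reuse the same id.
let result = agent_loop("next step", nil, {session_id: sid})
```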
Subscribers
CLOSURE_SUBSCRIBERS (thread-local in agent_events.rs) was removed.
Subscribers now live on SessionState.subscribers.
- agent_subscribe(id, cb) opens the session lazily and appends.
- agent_session_fork does not copy subscribers — a fork is a conversation branch, not an event fanout.
- clear_session_sinks only clears external ACP-style sinks now; it no longer evicts sessions.
Unknown-key / unknown-id behavior
A class of silent pass-throughs is now an error:
- Unknown agent_session_compact option keys.
- Missing role on agent_session_inject.
- Negative keep_last.
- reset/fork/close/trim/inject/length/compact called against an unknown session id.
exists, open, and snapshot remain tolerant of unknown ids by
design.
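A sketch of the contrast, using the session verbs shown above (the try/catch binding syntax follows the error-handling chapter; treat this as illustrative):

```harn
// Tolerant by design: snapshotting an unknown id does not throw.
let snap = agent_session_snapshot("no-such-session")

// Hard error in 0.7: lifecycle verbs against an unknown id throw.
try {
    agent_session_reset("no-such-session")
} catch e {
    log("reset failed as expected: ${e}")
}
```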
agent_loop terminal status
max_iterations reached without a natural break now reports
status = "budget_exhausted" (previously "done"). If your host keys
off "done" to detect “agent is finished,” add "budget_exhausted" to
the accept list — the loop ran out of rope, not out of work. Daemon
loops in the same condition no longer silently relabel to "idle".
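If your host inspects the loop result, the accept-list change is mechanical. A sketch, assuming the loop result exposes the status described above as result.status:

```harn
let sid = agent_session_open("worker")
let result = agent_loop("do the task", nil, {session_id: sid, max_iterations: 10})

if result.status == "done" {
    log("agent finished naturally")
} else if result.status == "budget_exhausted" {
    // New in 0.7: max_iterations hit with work remaining.
    // The session persists, so resuming with the same session_id continues it.
    log("out of iterations; consider resuming")
}
```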
See the Sessions chapter for the full model and the 0.7.0 entry in the changelog for the complete breaking-change list.
Prompt templates: v2 migration
The prompt-template engine used by render(...) / render_prompt(...) now
supports else/elif, loops, includes, filters, comments, raw blocks, and
whitespace trim markers. Existing templates keep rendering unchanged — this
is a strict superset. But many pre-v2 workarounds can now be simplified.
If / else
Before — mutually-exclusive {{ if }} blocks with inverted flags:
{{if expected_output}}
Expected: {{expected_output}}
{{end}}{{if no_expected_output}}
(no expected output provided)
{{end}}
After:
{{if expected_output}}
Expected: {{expected_output}}
{{else}}
(no expected output provided)
{{end}}
Loops instead of hand-rolled list concatenation
Before — build a string in .harn and inject it as a single variable:
let block = ""
for sample in samples {
block = "${block}### ${sample.path}\n\`\`\`\n${sample.content}\n\`\`\`\n\n"
}
let prompt = render("enrichment.prompt", {block: block, ...})
# enrichment.prompt
## Samples
{{block}}
After — iterate in the template:
let prompt = render("enrichment.prompt", {samples: samples, ...})
# enrichment.prompt
## Samples
{{for s in samples}}
### {{s.path}}
```
{{s.content}}
```
{{end}}
Shared prose → {{ include }}
When multiple repair-stage prompts share the same boilerplate (“self-verification instructions”, system rules, etc.), extract the shared text into a partial:
# lib/partials/self-verify.harn.prompt
Before responding, verify your answer against: {{verification_hint}}
Call it from each repair stage:
{{include "partials/self-verify.harn.prompt"}}
...stage-specific instructions...
Pass stage-specific overrides with with:
{{include "partials/self-verify.harn.prompt" with { verification_hint: "compile output" }}}
Filters instead of pre-processing
Before — uppercase, join lists, JSON-stringify in .harn before rendering:
let tags_str = join(map(tags, fn(t) { return uppercase(t) }), ", ")
render("x.prompt", {tags: tags_str})
After:
Tags: {{tags | join: ", " | upper}}
Comments and raw blocks
Add {{# authoring notes #}} to document a template without leaking the note
into the final prompt. Wrap literal {{ / }} (e.g. examples of another
template language embedded in a prompt) in a {{ raw }} ... {{ endraw }}
block.
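A small template showing both, using the {{# #}} and raw-block syntax described above (the embedded mustache-style snippet is arbitrary):

```
{{# Reviewer prompt, v3. Keep the rubric in sync with the grader template. #}}
Grade the answer below.

{{ raw }}
In Handlebars you would write {{user.name}} here; Harn leaves this verbatim.
{{ endraw }}
```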
Whitespace trim
{{- ... -}} markers strip whitespace and one newline on the respective
side. Use them to keep source templates readable without introducing blank
lines in the rendered output:
Items:
{{- for x in xs -}}
{{ x }},
{{- end -}}
DONE
See Prompt templating for the full reference.
Migration — schema-as-type (type aliases drive output_schema)
Prior to this change, Harn had two parallel representations for structured LLM output:
- Harn-native types — type Foo = {verdict: string, ...}.
- Raw JSON-Schema dicts — passed as output_schema: {type: "dict", properties: {...}, required: [...]} to llm_call, and consumed by schema_is, schema_expect, schema_parse, and friends.
The two representations drifted. A grader script that declared a type alias for documentation and a separate schema dict for validation had no compile-time check that the two agreed.
This release unifies them. A single type alias now feeds:
- Static type-checking on the values that flow through it.
- JSON-Schema emission for llm_call structured output.
- schema_is / schema_expect narrowing on runtime-typed values (unknown, unions, parsed JSON).
- ACP ToolAnnotations.args compatibility (same emitted schema).
Migrating a grader script
Before — duplicated surface, no cross-check:
let grader_schema = {
type: "object",
required: ["verdict", "summary"],
properties: {
verdict: {type: "string", enum: ["pass", "fail", "unclear"]},
summary: {type: "string"},
},
}
let r = llm_call(prompt, nil, {
model: routing.model,
output_schema: grader_schema,
schema_retries: 2,
})
// No compile-time check that r.data matches this shape.
log("verdict=${r.data.verdict}")
After — one alias, two uses:
type GraderOut = {
verdict: "pass" | "fail" | "unclear",
summary: string,
}
let r = llm_call(prompt, nil, {
model: routing.model,
output_schema: GraderOut, // compiled to the JSON-Schema dict
schema_retries: 2,
})
if schema_is(r.data, GraderOut) {
// r.data is narrowed to GraderOut here.
log("verdict=${r.data.verdict}")
}
What translates mechanically
| Old schema key | New type grammar |
|---|---|
| {type: "string"} | string |
| {type: "int"} / "integer" | int |
| {type: "bool"} / "boolean" | bool |
| {type: "list", items: T} | list<T> |
| {type: "dict", additional_properties: V} | dict<string, V> |
| {type: "string", enum: ["a","b"]} | "a" \| "b" |
| {type: "int", enum: [0,1,2]} | 0 \| 1 \| 2 |
| {properties, required} with additional_properties: false | type T = {field: type, optional?: type} |
| {union: [A, B]} / {oneOf: [A, B]} | A \| B |
| {nullable: true} wrapping T | T \| nil |
Staying with raw schema dicts
Nothing forces you to migrate. output_schema: dict_literal still
works and is still the right tool when you need schema features Harn’s
type grammar does not yet express (regex pattern, min_length,
numeric min/max, const, nested $ref, etc.). You can mix:
type Name = {first: string, last: string}
let r = llm_call(prompt, nil, {
output_schema: {
type: "dict",
properties: {
name: schema_of(Name), // alias → schema dict
email: {type: "string", pattern: "^[^@]+@[^@]+$"},
},
required: ["name", "email"],
},
})
Caveats
- schema_of(T) lowers at compile time. T must be a top-level type alias visible to the compiler. Dynamic construction (let T = ...) falls back to the runtime schema_of builtin, which is a dict-passthrough — it does not look up alias names at runtime.
- The compiler-level alias emitter handles shapes, lists, dict<string, V>, literal-string/int unions, and nested aliases. Shapes containing Applied<T> (generic containers) or fn types emit a best-effort {type: "closure"} placeholder; prefer raw schema dicts there.
- response.data of llm_call(..., {output_schema: T}) is not yet automatically narrowed to T by the type checker. Use if schema_is(r.data, T) { ... } in the interim — the narrowing there is exact.