# Harn

Harn is a pipeline-oriented programming language for orchestrating AI agents. LLM calls, tool use, concurrency, and error recovery are built into the language – no libraries or SDKs needed.

```harn
let response = llm_call(
    "Explain quicksort in two sentences.",
    "You are a computer science tutor."
)
println(response)
```

Harn files can contain top-level code like the above (an implicit pipeline), or organize logic into named pipelines for larger programs:

```harn
pipeline default(task) {
    let files = ["src/main.rs", "src/lib.rs"]
    let reviews = parallel each files { file ->
        let content = read_file(file)
        llm_call("Review this code:\n${content}", "You are a code reviewer.")
    }
    for review in reviews {
        println(review)
    }
}
```
## Get started
The fastest way to start is the Getting Started guide: install Harn, write a program, and run it in under five minutes.
## What’s in this guide
- Getting started – Install and run your first program
- Why Harn? – What problems Harn solves and how it compares
- Language basics – Syntax, types, control flow, functions, structs, enums
- Error handling – try/catch, Result type, the `?` operator, retry
- Modules and imports – Splitting code across files, standard library
- Concurrency – spawn/await, parallel, channels, mutexes, deadlines
- Language specification – Formal grammar and runtime semantics
- LLM calls and agent loops – Calling models, agent loops, tool use
- Transcript architecture – How Harn stores and replays agent conversations
- Workflow runtime – Workflow graphs, artifacts, run records, replay, evals
- Cookbook – Practical recipes and patterns
- Host boundary – How Harn integrates with host applications
- Bridge protocol – JSON-RPC contract for host bridges
- MCP and ACP integration – MCP client/server, ACP, and A2A protocols
- Harn portal – Local observability UI for runs and transcripts
- CLI reference – All CLI commands and flags
- Builtin functions – Complete reference for all built-in functions
- Editor integration – LSP, tree-sitter, and formatter support
- Testing – Running user tests and the conformance suite
# Getting started
This page gets you from zero to running your first Harn program.
## Prerequisites

- Rust 1.70 or later – install with `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
- Git
## Installation

### From crates.io

```sh
cargo install harn-cli
```

### From source

```sh
git clone https://github.com/burin-labs/harn
cd harn
./scripts/dev_setup.sh   # installs dev tooling, portal deps/build, git hooks, sccache
cargo build --release
cp target/release/harn ~/.local/bin/
```
Verify the installation:

```sh
harn version
```
## Your first program

Create a file called hello.harn:

```harn
println("Hello, world!")
```

Run it:

```sh
harn run hello.harn
```
That’s it. Harn files can contain top-level code without any boilerplate. The above is an implicit pipeline – the runtime wraps your top-level statements automatically.
## Adding a pipeline

For larger programs, organize code into named pipelines. The runtime executes the default pipeline (or the first one declared):

```harn
pipeline default(task) {
    let name = "Harn"
    println("Hello from ${name}!")
}
```
The task parameter is injected by the host runtime. It carries the
user’s request when Harn is used as an agent backend.
## Calling an LLM

Harn has native LLM support. Set your API key and call a model directly:

```sh
export ANTHROPIC_API_KEY=sk-ant-...
```

```harn
let response = llm_call(
    "Explain quicksort in two sentences.",
    "You are a computer science tutor."
)
println(response)
```
No imports, no SDK initialization, no response parsing. Harn ships with built-in configs for Anthropic, OpenAI, OpenRouter, Ollama, HuggingFace, and local OpenAI-compatible servers.
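Switching providers means changing the `provider` and `model` fields in the options dict and nothing else. A hedged sketch – the model names and the `"ollama"` provider key below are illustrative, not guaranteed defaults:

```harn
// Same prompt, two backends; only the options dict differs.
let cloud = llm_call("Say hi", nil, {provider: "anthropic", model: "claude-sonnet-4-20250514"})
let local = llm_call("Say hi", nil, {provider: "ollama", model: "llama3"})
```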
## The REPL

Start an interactive session:

```sh
harn repl
```

The REPL evaluates expressions as you type and displays results immediately. It keeps a persistent history in `~/.harn/repl_history` and supports multi-line blocks until delimiters are balanced, which makes it useful for experimenting with builtins and small snippets.
## Project setup

Scaffold a new project with `harn init` or pick a starter with `harn new`:

```sh
harn new my-agent --template agent
cd my-agent
harn doctor --no-network
```

This creates a directory with harn.toml (project config) and starter files for the selected template. Run it with:

```sh
harn run main.harn
```
## Remote MCP quick start

If you want to use a cloud MCP server such as Notion, authorize it once with the CLI and then reference it from harn.toml:

```sh
harn mcp redirect-uri
harn mcp login https://mcp.notion.com/mcp --scope "read write"
```
## Next steps
- Why Harn? – What problems Harn solves
- Language basics – Syntax, types, control flow
- LLM calls and agent loops – Calling models and building agents
- Cookbook – Practical recipes and patterns
# Scripting Cheatsheet
A compact, prose-friendly tour of everything you need to write real
Harn scripts. The companion one-page LLM reference is at
docs/llm/harn-quickref.md
(outside the mdBook; served as raw Markdown) — they cover the same
ground with different shapes, and should stay in lockstep. Agents that
can fetch URLs should prefer the quickref.
## Strings

Use standard double-quoted strings with `\n` escapes for short literals, and triple-quoted `"""..."""` for multiline prose like system prompts:

```harn
let greeting = "Hello, ${name}!"
let prompt = """
You are a strict grader.
Emit exactly one verdict.
"""
```
Heredoc-style <<TAG ... TAG is only valid inside LLM tool-call
argument JSON — in source code, the parser points you at triple
quotes.
## Slicing

End-exclusive slicing works on strings and lists:

```harn
let head = content[0:400]
let tail = content[len(content) - 400:len(content)]
let sub = xs[1:4]
```
substring(s, start, length) exists too, but the third argument is a
length, not an end index. Prefer the slice syntax to avoid that
footgun.
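To make the footgun concrete, a small contrast – the same substring, extracted two ways:

```harn
let s = "abcdef"
let a = s[1:4]             // "bcd" – 4 is an exclusive end index
let b = substring(s, 1, 3) // "bcd" – 3 is a length, not an end index
```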
## `if` is an expression

`if` / `else` produces a value. Drop it straight into a `let`, an argument, or a `return`:

```harn
let body = if len(content) > 2400 {
    content[0:400] + "..." + content[len(content) - 400:len(content)]
} else {
    content
}
```
## Module scope

Top-level `let` / `var` and `fn` declarations are visible inside functions defined in the same file – no wrapping in a getter `fn` needed:

```harn
let GRADER_SYSTEM = """
You are a strict grader...
"""

pub fn grade(path) {
    return llm_call(read_file(path), GRADER_SYSTEM, {
        provider: "auto",
        model: "local:gemma-4-e4b-it",
    })
}
```
(Module-level mutable var cross-function mutation is not fully
supported yet. If you need shared mutable state across functions, use
atomics: atomic(0), atomic_add(a, 1), atomic_get(a).)
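A minimal sketch of the atomics workaround, using only the builtins named above – shared state that concurrent workers can safely increment:

```harn
// Shared counter incremented from concurrent workers.
let hits = atomic(0)
parallel each ["a", "b", "c"] { item ->
    atomic_add(hits, 1)
}
println(atomic_get(hits)) // 3
```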
Results and error handling
let r = try { llm_call(prompt, nil, opts) }
// Optional chaining short-circuits on Result.Err.
let text = r?.prose ?? "no response"
// Explicit error inspection.
if unwrap_err(r) != "" {
log("failed")
}
// `try/catch` also works as an expression — the whole form evaluates to
// the try body's tail value on success or the catch handler's tail value
// on a caught throw, so simple fallbacks don't need Result gymnastics.
let prose = try { llm_call(prompt, nil, opts).prose } catch (e) { "fallback" }
## Concurrency

```harn
// Spawn a task, collect its result.
let h = spawn { long_work() }
let value = await(h)

// parallel each: concurrent map over a list.
let doubled = parallel each xs { x -> x * 2 }

// parallel settle: concurrent map that collects per-item Ok/Err.
let outcome = parallel settle paths { p -> grade(p) }
println(outcome.succeeded)

// Cap in-flight workers so you don't overwhelm the backend.
let results = parallel settle paths with { max_concurrent: 4 } { p ->
    llm_call(p, nil, opts)
}
```

`max_concurrent: 0` (or a missing `with` clause) means unlimited. See concurrency.md for the RPM rate limiter, channels, `select`, `deadline`, and `defer`.
## CLI: argv

```sh
harn run my_script.harn -- file1.md file2.md
```

Inside the script:

```harn
fn grade_file(path) {
    println(path)
}

for path in argv {
    grade_file(path)
}
```

`argv` is always defined as `list<string>`; it is empty when no positional args were given.
## Regex

```harn
let matches = regex_match("[0-9]+", "abc 42 def 7")
let swapped = regex_replace("(\\w+)\\s(\\w+)", "$2 $1", "hello world")
let same = regex_replace_all("(\\w+)\\s(\\w+)", "$2 $1", "hello world")
let captures = regex_captures("(?P<day>[A-Z][a-z]+)", "Mon Tue")
```

Both `regex_replace` and `regex_replace_all` replace every match; both support `$1`, `$2`, `${name}` backrefs from the regex crate.
## LLM calls

```harn
let r = llm_call(prompt, system, {
    provider: "auto",              // infers from model prefix
    model: "local:gemma-4-e4b-it",
    output_schema: schema,
    output_validation: "error",
    schema_retries: 2,             // retry with corrective nudge on schema mismatch
    response_format: "json",
})

println(r.prose)        // unwrapped prose (preferred for "the answer")
println(r.data.verdict) // parsed structured output
```
Key options:

| Option | Default | Notes |
|---|---|---|
| `provider` | `"auto"` | `"auto"` infers from the model prefix (e.g. `local:` / `claude-*` / `gpt-*`). |
| `llm_retries` | `2` | Transient error retries (HTTP 5xx, timeout, rate limit). Set `0` to fail fast. |
| `llm_backoff_ms` | `2000` | Base for exponential backoff. |
| `schema_retries` | `1` | Re-prompt on `output_schema` validation failure. Requires `output_validation: "error"` to kick in. |
| `schema_retry_nudge` | auto | String (verbatim), `true` (auto), or `false` (bare retry). |
| `output_validation` | `"off"` | `"error"` throws on mismatch; `"warn"` logs. |
See docs/src/llm-and-agents.md for agent_loop, tool dispatch, and
the full option surface.
## Rate limiting
max_concurrent bounds simultaneous in-flight tasks on the caller
side. Providers can also be rate-limited at the throughput layer via
rpm: in providers.toml / harn.toml or
HARN_RATE_LIMIT_<PROVIDER>=N env vars. The two compose: use
max_concurrent to prevent bursts, and rpm to shape sustained
throughput.
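As a sketch, a provider-level rpm cap might be declared like this – the exact table layout in harn.toml is an assumption here; check the providers.toml documentation for the authoritative schema:

```toml
# Hypothetical harn.toml fragment: cap the anthropic provider
# at 60 requests per minute at the throughput layer.
[providers.anthropic]
rpm = 60
```

Equivalently, `HARN_RATE_LIMIT_ANTHROPIC=60` would set the same cap via the environment.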
## More

- LLM-friendly one-pager: `docs/llm/harn-quickref.md` (loaded automatically by the `harn-scripting` Claude skill when present).
- Full mdBook: `docs/src/` (`introduction.md`, `language-basics.md`, `concurrency.md`, `error-handling.md`, `llm-and-agents.md`).
- Language spec: `spec/HARN_SPEC.md`.
- Conformance examples: `conformance/tests/*.harn`.
# Why Harn?

## The problem
Building AI agents is complex. A typical agent needs to call LLMs, execute tools, handle errors and retries, run tasks concurrently, maintain conversation state, and coordinate multiple sub-agents. In most languages, this means assembling a tower of libraries:
- An LLM SDK (LangChain, OpenAI SDK, Anthropic SDK)
- An async runtime (asyncio, Tokio, goroutines)
- Retry and timeout logic (tenacity, custom decorators)
- Tool registration and dispatch (custom JSON Schema plumbing)
- Structured logging and tracing (separate packages)
- A test framework (pytest, Jest)
Each layer adds configuration, boilerplate, and failure modes. The orchestration logic – the part that actually matters – gets buried under infrastructure code.
## What Harn does differently
Harn is a programming language where agent orchestration primitives are built into the syntax, not bolted on as libraries.
In practice that means Harn aims to be the long-term orchestration boundary between product code and provider/runtime code. Product integrations should mainly declare workflows, policies, capabilities, and UI hooks rather than rebuilding transcript logic, tool queues, replay fixtures, or provider response normalization.
### Native LLM calls

`llm_call` and `agent_loop` are language primitives. No SDK imports, no client initialization, no response parsing. Set an environment variable and call a model:

```harn
let answer = llm_call("Summarize this code", "You are a code reviewer.")
```
Harn ships with built-in configs for Anthropic, OpenAI, OpenRouter, HuggingFace, Ollama, and local OpenAI-compatible servers. Switching providers is a one-field change in the options dict.
### Pipeline composition

Pipelines are the unit of composition. They can extend each other, override steps, and be imported across files. This gives you a natural way to structure multi-stage agent workflows:

```harn
pipeline analyze(task) {
    let context = read_file("README.md")
    let plan = llm_call("${task}\n\nContext:\n${context}", "Break this into steps.")
    let steps = json_parse(plan)
    let results = parallel each steps { step ->
        agent_loop(step, "You are a coding assistant.", {persistent: true})
    }
    write_file("results.json", json_stringify(results))
}
```
Files can also contain top-level code without a pipeline block (implicit pipeline), making Harn work well for scripts and quick experiments.
### MCP and ACP integration

Harn has built-in support for the Model Context Protocol. Connect to any MCP server, or expose your Harn pipeline as one. ACP integration lets editors use Harn as an agent backend. That includes remote HTTP MCP servers with standalone OAuth handled by the CLI, so cloud MCP integrations can be treated as normal runtime dependencies instead of host-specific glue.

```harn
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let tools = mcp_list_tools(client)
let content = mcp_call(client, "read_file", {path: "/tmp/data.txt"})
mcp_disconnect(client)
```
### Concurrency without async/await

`parallel each`, `parallel`, `spawn`/`await`, and channels are keywords, not library functions. No callback chains, no promise combinators, no `async def` annotations:

```harn
let results = parallel each files { file ->
    llm_call(read_file(file), "Review this file for security issues")
}
```
### Retry and error recovery

`retry` and `try`/`catch` are control flow constructs. Wrapping an unreliable LLM call in retries is a one-liner:

```harn
retry 3 {
    let result = llm_call(prompt, system)
    json_parse(result)
}
```
### Gradual typing

Type annotations are optional. Add them where they help, leave them off where they don’t. Structural shape types let you describe expected dict fields:

```harn
fn score(text: string) -> int {
    let result = llm_call(text, "Rate 1-10. Respond with just the number.")
    return to_int(result)
}
```
### Embeddable
Harn compiles to a WASM target for browser embedding and ships with LSP and DAP servers for IDE integration. Agent pipelines can run inside editors, CI systems, or web applications.
## Who Harn is for
- Developers building AI agents who want orchestration logic to be readable and concise, not buried under framework boilerplate.
- IDE authors who want a scriptable, embeddable language for agent pipelines with built-in LSP support.
- Researchers prototyping agent architectures who need fast iteration without setting up infrastructure.
## Comparison

Here is what a “fetch three URLs in parallel, summarize each with an LLM, and retry failures” pattern looks like across approaches:

Python (LangChain + asyncio):

```python
import asyncio

import aiohttp
from langchain_anthropic import ChatAnthropic
from tenacity import retry, stop_after_attempt

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

@retry(stop=stop_after_attempt(3))
async def summarize(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            text = await resp.text()
    result = await llm.ainvoke(f"Summarize:\n{text}")
    return result.content

async def main():
    urls = ["https://a.com", "https://b.com", "https://c.com"]
    results = await asyncio.gather(*[summarize(u) for u in urls])
    for r in results:
        print(r)

asyncio.run(main())
```
Harn:

```harn
pipeline default(task) {
    let urls = ["https://a.com", "https://b.com", "https://c.com"]
    let results = parallel each urls { url ->
        retry 3 {
            let page = http_get(url)
            llm_call("Summarize:\n${page}", "Be concise.")
        }
    }
    for r in results {
        println(r)
    }
}
```
The Harn version has no imports, no decorators, no client initialization, no async annotations, and no runtime setup. The orchestration logic is all that remains.
## Getting started
See the Getting started guide to install Harn and run your first program, or jump to the cookbook for practical patterns.
# Language basics
This guide covers the core syntax and semantics of Harn.
## Implicit pipeline

Harn files can contain top-level code without a pipeline block. The runtime wraps it in an implicit pipeline automatically:

```harn
let x = 1 + 2
println(x)

fn double(n) {
    return n * 2
}
println(double(5))
```
This is convenient for scripts, experiments, and small programs.
## Pipelines

For larger programs, organize code into named pipelines. The runtime executes the pipeline named `default`, or the first one declared.

```harn
pipeline default(task) {
    println("Hello from the default pipeline")
}

pipeline other(task) {
    println("This only runs if called or if there's no default")
}
```

Pipeline parameters `task` and `project` are injected by the host runtime. A context dict with keys `task`, `project_root`, and `task_type` is always available.
## Variables

`let` creates immutable bindings. `var` creates mutable ones.

```harn
let name = "Alice"
var counter = 0
counter = counter + 1 // ok
name = "Bob"          // error: immutable assignment
```
Bindings are lexically scoped. Each `if` branch, loop body, `catch` body, and explicit `{ ... }` block gets its own scope, so inner bindings can shadow outer names without colliding:

```harn
let status = "outer"
if true {
    let status = "inner"
    println(status) // inner
}
println(status)     // outer
```
If you want to update an outer binding from inside a block, declare it with
var outside the block and assign to it inside the branch or loop body.
## Types and values

Harn is dynamically typed with optional type annotations.

| Type | Example | Notes |
|---|---|---|
| `int` | `42` | Platform-width integer |
| `float` | `3.14` | Double-precision |
| `string` | `"hello"` | UTF-8, supports interpolation |
| `bool` | `true`, `false` | |
| `nil` | `nil` | Null value |
| `list` | `[1, 2, 3]` | Heterogeneous, ordered |
| `dict` | `{name: "Alice"}` | String-keyed map |
| `closure` | `{ x -> x + 1 }` | First-class function |
| `duration` | `5s`, `100ms` | Time duration |
### Type annotations

Annotations are optional and checked at compile time:

```harn
let x: int = 42
let name: string = "hello"
let nums: list<int> = [1, 2, 3]

fn add(a: int, b: int) -> int {
    return a + b
}
```

Supported type expressions: `int`, `float`, `string`, `bool`, `nil`, `list`, `list<T>`, `dict`, `dict<K, V>`, union types (`string | nil`), and structural shape types (`{name: string, age: int}`).
Parameter type annotations for primitive types (`int`, `float`, `string`, `bool`, `list`, `dict`, `set`, `nil`, `closure`) are enforced at runtime. Calling a function with the wrong type produces a TypeError:

```harn
fn add(a: int, b: int) -> int {
    return a + b
}

add("hello", "world")
// TypeError: parameter 'a' expected int, got string (hello)
```
### Structural types (shapes)

Shape types describe the expected fields of a dict. The type checker verifies that required fields are present with compatible types. Extra fields are allowed (width subtyping).

```harn
let user: {name: string, age: int} = {name: "Alice", age: 30}
let config: {host: string, port?: int} = {host: "localhost"}

fn greet(u: {name: string}) -> string {
    return "hi ${u["name"]}"
}
greet({name: "Bob", age: 25})
```

Use type aliases for reusable shape definitions:

```harn
type Config = {model: string, max_tokens: int}
let cfg: Config = {model: "gpt-4", max_tokens: 100}
```
### Truthiness
These values are falsy: false, nil, 0, 0.0, "", [], {}. Everything else is truthy.
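A quick sketch of how the falsy values behave in a condition – all seven print "falsy":

```harn
for v in [false, nil, 0, 0.0, "", [], {}] {
    if v {
        println("truthy")
    } else {
        println("falsy") // taken for every value in the list
    }
}
```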
## Strings

### Interpolation

```harn
let name = "world"
println("Hello, ${name}!")
println("2 + 2 = ${2 + 2}")
```

Any expression works inside `${}`.
### Raw strings

Raw strings use the `r"..."` prefix. No escape processing or interpolation is performed – backslashes and dollar signs are taken literally. Useful for regex patterns and file paths:

```harn
let pattern = r"\d+\.\d+"
let path = r"C:\Users\alice\docs"
```

Raw strings cannot span multiple lines.
### Multi-line strings

```harn
let doc = """
This is a multi-line string.
Common leading whitespace is stripped.
"""
```

Multi-line strings support `${expression}` interpolation with automatic indent stripping:

```harn
let name = "world"
let greeting = """
Hello, ${name}!
Welcome to Harn.
"""
```
### Escape sequences

`\n` (newline), `\t` (tab), `\\` (backslash), `\"` (quote), `\$` (dollar sign).
### String methods

```harn
"hello".count                 // 5
"hello".empty                 // false
"hello".contains("ell")       // true
"hello".replace("l", "r")     // "herro"
"a,b,c".split(",")            // ["a", "b", "c"]
"  hello  ".trim()            // "hello"
"hello".starts_with("he")     // true
"hello".ends_with("lo")       // true
"hello".uppercase()           // "HELLO"
"hello".lowercase()           // "hello"
"hello world".substring(0, 5) // "hello"
```
## Operators
Ordered by precedence (lowest to highest):
| Precedence | Operators | Description |
|---|---|---|
| 1 | |> | Pipe |
| 2 | ? : | Ternary conditional |
| 3 | ?? | Nil coalescing |
| 4 | || | Logical OR (short-circuit) |
| 5 | && | Logical AND (short-circuit) |
| 6 | == != | Equality |
| 7 | < > <= >= in not in | Comparison, membership |
| 8 | + - | Add, subtract, string/list concat |
| 9 | * / | Multiply, divide |
| 10 | ! - | Unary not, negate |
| 11 | . ?. [] [:] () ? | Member access, optional chaining, subscript, slice, call, try |
Division by zero returns nil. Integer division truncates.
Arithmetic operators are strictly typed — mismatched operands (e.g.
"hello" + 5) produce a TypeError. Use to_string() or string
interpolation ("value=${x}") for explicit conversion.
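A minimal example of the failure and both conversion fixes:

```harn
let x = 5
// let bad = "hello" + x        // TypeError: mismatched operand types
let ok = "hello " + to_string(x) // explicit conversion
let ok2 = "value=${x}"           // or string interpolation
```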
### Optional chaining (`?.`)

Access properties or call methods on values that might be nil. Returns nil instead of erroring when the receiver is nil:

```harn
let user = nil
println(user?.name)        // nil (no error)
println(user?.greet("hi")) // nil (method not called)

let d = {name: "Alice"}
println(d?.name)           // Alice
```

Chains propagate nil: `a?.b?.c` returns nil if any step is nil.
### List and string slicing (`[start:end]`)

Extract sublists or substrings using slice syntax:

```harn
let items = [10, 20, 30, 40, 50]
println(items[1:3]) // [20, 30]
println(items[:2])  // [10, 20]
println(items[3:])  // [40, 50]
println(items[-2:]) // [40, 50]

let s = "hello world"
println(s[0:5])     // hello
println(s[-5:])     // world
```

Negative indices count from the end. Omit `start` for 0, omit `end` for the length.
### Try operator (`?`)

The postfix `?` operator works with Result values (`Ok` / `Err`). It unwraps `Ok` values and propagates `Err` values by returning early from the enclosing function:

```harn
fn divide(a, b) {
    if b == 0 {
        return Err("division by zero")
    }
    return Ok(a / b)
}

fn compute(x) {
    let result = divide(x, 2)? // unwraps Ok, or returns Err early
    return Ok(result + 10)
}

fn compute_zero(x) {
    let result = divide(x, 0)? // divide returns Err, ? propagates it
    return Ok(result + 10)
}

println(compute(20))      // Result.Ok(20)
println(compute_zero(20)) // Result.Err(division by zero)
```
Multiple ? calls can be chained in a single function to build
pipelines that short-circuit on the first error.
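Building on the `divide` function above, a sketch of such a pipeline – whichever `?` hits an `Err` first ends the function:

```harn
fn quarter(x) {
    let half = divide(x, 2)?     // an Err here returns immediately...
    let result = divide(half, 2)? // ...and so does one here
    return Ok(result)
}
println(quarter(40)) // Result.Ok(10)
```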
### Membership operators (`in`, `not in`)

Test whether a value is contained in a collection:

```harn
// Lists
println(3 in [1, 2, 3])     // true
println(6 not in [1, 2, 3]) // true

// Strings (substring containment)
println("world" in "hello world") // true
println("xyz" not in "hello")     // true

// Dicts (key membership)
let data = {name: "Alice", age: 30}
println("name" in data)      // true
println("email" not in data) // true

// Sets
let s = set(1, 2, 3)
println(2 in s)     // true
println(5 not in s) // true
```
## Control flow

### if/else

```harn
if score > 90 {
    println("A")
} else if score > 80 {
    println("B")
} else {
    println("C")
}
```

`if` can be used as an expression: `let grade = if score > 90 { "A" } else { "B" }`
### for/in

```harn
for item in [1, 2, 3] {
    println(item)
}

// Dict iteration yields {key, value} entries sorted by key
for entry in {a: 1, b: 2} {
    println("${entry.key}: ${entry.value}")
}
```
### while

```harn
var i = 0
while i < 10 {
    println(i)
    i = i + 1
}
```

While loops have a safety limit of 10,000 iterations.
### match

```harn
match status {
    "active" -> { println("Running") }
    "stopped" -> { println("Halted") }
}
```

Patterns are expressions compared by equality. The first match wins. No match returns nil.
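Since an unmatched `match` yields nil, it pairs naturally with `??` for a default. A sketch, assuming `match` can be used as an expression the way `if` can:

```harn
let status = "paused"
let label = match status {
    "active" -> { "Running" }
    "stopped" -> { "Halted" }
}
println(label ?? "Unknown") // no arm matched, so the default is used
```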
### guard

Early exit if a condition isn’t met:

```harn
guard x > 0 else {
    return "invalid"
}
// x is guaranteed > 0 here
```
### Ranges

Harn has a single range keyword: `to`. Ranges are inclusive by default – `1 to 5` is `[1, 2, 3, 4, 5]` – because that matches how the expression reads aloud. Add the trailing `exclusive` modifier when you want the half-open form.

```harn
for i in 1 to 5 {           // inclusive: 1, 2, 3, 4, 5
    println(i)
}

for i in 0 to 3 exclusive { // half-open: 0, 1, 2
    println(i)
}
```

For Python-compatible 0-indexed iteration there is also a `range()` stdlib builtin. `range(n)` is equivalent to `0 to n exclusive`; `range(a, b)` is `a to b exclusive`. Both forms always produce half-open integer ranges.

```harn
for i in range(5) { println(i) }    // 0, 1, 2, 3, 4
for i in range(3, 7) { println(i) } // 3, 4, 5, 6
```
### Iteration patterns

Prefer destructuring and stdlib helpers over integer-indexed loops – they read better and avoid off-by-one bugs.

```harn
// enumerate(): yields a list of {index, value} dicts.
for {index, value} in ["a", "b", "c"].enumerate() {
    println("${index}: ${value}")
}

// zip(): yields [a, b] pairs — use list destructuring.
for [name, score] in names.zip(scores) {
    println("${name}: ${score}")
}

// Dict iteration yields {key, value} entries sorted by key.
for {key, value} in {a: 1, b: 2}.entries() {
    println("${key} -> ${value}")
}
```
for heads currently accept a bare name, a list pattern [a, b], or a dict
pattern {name1, name2}. Tuple patterns written with parentheses
(for (a, b) in ...) are not yet supported — use the list pattern when the
iterable yields pair-lists (zip), and the dict pattern when the iterable
yields shaped dicts (enumerate, entries).
## Functions and closures

### Named functions

```harn
fn double(x) {
    return x * 2
}

fn greet(name: string) -> string {
    return "Hello, ${name}!"
}
```
Functions can be declared at the top level (for library files) or inside pipelines.
### Rest parameters

Use `...name` as the last parameter to collect any remaining arguments into a list:

```harn
fn sum(...nums) {
    var total = 0
    for n in nums {
        total = total + n
    }
    return total
}
println(sum(1, 2, 3)) // 6

fn log(level, ...parts) {
    println("[${level}] ${join(parts, " ")}")
}
log("INFO", "server", "started") // [INFO] server started
```
If no extra arguments are provided, the rest parameter is an empty list.
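Using the `sum` function above, the empty-list case looks like this:

```harn
println(sum())   // 0  — nums is []
println(sum(10)) // 10 — nums is [10]
```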
### Closures

```harn
let square = { x -> x * x }
let add = { a, b -> a + b }

println(square(4)) // 16
println(add(2, 3)) // 5
```
Closures capture their lexical environment at definition time. Parameters are immutable.
### Higher-order functions

```harn
let nums = [1, 2, 3, 4, 5]

nums.map({ x -> x * 2 })              // [2, 4, 6, 8, 10]
nums.filter({ x -> x > 3 })           // [4, 5]
nums.reduce(0, { acc, x -> acc + x }) // 15
nums.find({ x -> x == 3 })            // 3
nums.any({ x -> x > 4 })              // true
nums.all({ x -> x > 0 })              // true
nums.flat_map({ x -> [x, x] })        // [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```
### Lazy iterators

Collection methods like `.map` and `.filter` above are eager – each call allocates a new list and walks the whole input. That’s fine for small inputs, but wastes work when you only need the first few results, or when you want to compose several transforms.

Harn also ships a lazy iterator protocol. Call `.iter()` on any iterable source (list, dict, set, string, generator, channel) to lift it into an `Iter<T>` – a single-pass, fused iterator. Combinators on an `Iter` return a new `Iter` without running any work. Sinks drain the iter and return an eager value.

```harn
let xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
let first_three_doubled_evens = xs
    .iter()
    .filter({ x -> x % 2 == 0 })
    .map({ x -> x * 2 })
    .take(3)
    .to_list()
println(first_three_doubled_evens) // [4, 8, 12]
```
Use `.enumerate()` to get `(index, value)` pairs in a for-loop:

```harn
let items = ["a", "b", "c"]
for (i, x) in items.iter().enumerate() {
    println("${i}: ${x}")
}
```
`.iter()` on a dict yields `Pair(key, value)` values – destructure them in a for-loop:

```harn
for (k, v) in {a: 1, b: 2}.iter() {
    println("${k}: ${v}")
}
```

A direct `for entry in some_dict` still yields the usual `{key, value}` dicts (back-compat). `pair(a, b)` also exists as a builtin for constructing pairs explicitly.
Lazy combinators (return a new Iter): .map, .filter,
.flat_map, .take(n), .skip(n), .take_while, .skip_while,
.zip, .enumerate, .chain, .chunks(n), .windows(n).
Sinks (drain the iter, return a value): .to_list(), .to_set(),
.to_dict() (requires Pair items), .count(), .sum(), .min(),
.max(), .reduce(init, f), .first(), .last(), .any(p),
.all(p), .find(p), .for_each(f).
When to use which: reach for eager list/dict/set methods for
simple one-shot transforms where you want a collection back. Reach
for .iter() when you’re composing multiple transforms, taking the
first N results of a large input, consuming a generator lazily, or
driving a for-loop over combined sources.
Iterators are single-pass and fused — once exhausted, they stay
exhausted. Iteration takes a snapshot of the backing collection,
so mutating the source after .iter() does not affect the iter.
Printing an iter renders <iter> without draining it.
Numeric ranges (a to b, range(n)) participate in the lazy iter
protocol directly: .map / .filter / .take / .zip / .enumerate / ...
on a Range return a lazy iter with no upfront allocation, so
(1 to 10_000_000).map(fn(x) { return x * 2 }).take(5).to_list()
finishes instantly. Range still keeps its O(1) fast paths for
.len / .first / .last / .contains(x) and r[k] subscript — those
don’t round-trip through iter.
### Pipe operator

The pipe operator `|>` passes the left side as the argument to the right side:

```harn
let result = data
    |> { list -> list.filter({ x -> x > 0 }) }
    |> { list -> list.map({ x -> x * 2 }) }
    |> json_stringify
```
### Pipe placeholder (`_`)

Use `_` to control where the piped value is placed in the call:

```harn
"hello world" |> split(_, " ")        // ["hello", "world"]
[3, 1, 2] |> _.sort()                 // [1, 2, 3]
items |> len(_)                       // length of items
"world" |> replace("hello _", "_", _) // "hello world"
```

Without `_`, the value is passed as the sole argument to a closure or function name.
### Multiline expressions

Binary operators, method chains, and pipes can span multiple lines:

```harn
let message = "hello"
    + " "
    + "world"

let result = items
    .filter({ x -> x > 0 })
    .map({ x -> x * 2 })

let valid = check_a()
    && check_b()
    || fallback()
```

Note: `-` does not continue across lines because it doubles as unary negation.

A backslash at the end of a line forces the next line to continue the current expression, even when no operator is present:

```harn
let long_value = some_function( \
    arg1, arg2, arg3 \
)
```
## Destructuring

Destructuring extracts values from dicts and lists into local variables.
### Dict destructuring

```harn
let person = {name: "Alice", age: 30}
let {name, age} = person
println(name) // "Alice"
println(age)  // 30
```
### List destructuring

```harn
let items = [1, 2, 3, 4, 5]
let [first, ...rest] = items
println(first) // 1
println(rest)  // [2, 3, 4, 5]
```
### Renaming

Use `:` to bind a dict field to a different variable name:

```harn
let data = {name: "Alice"}
let {name: user_name} = data
println(user_name) // "Alice"
```
### Destructuring in for-in loops

```harn
let entries = [{key: "a", value: 1}, {key: "b", value: 2}]
for {key, value} in entries {
    println("${key}: ${value}")
}
```
### Default values

Pattern fields can specify defaults with `= expr`. The default is used when the value would otherwise be nil:

```harn
let { name = "anon", role = "user" } = { name: "Alice" }
println(name) // Alice
println(role) // user

let [a = 0, b = 0, c = 0] = [1, 2]
println(c) // 0

// Combine with renaming
let { name: display = "Unknown" } = {}
println(display) // Unknown
```
### Missing keys and empty rest

Missing keys destructure to nil (unless a default is specified). A rest pattern with no remaining items gives an empty collection:

```harn
let {name, email} = {name: "Alice"}
println(email) // nil

let [only, ...rest] = [42]
println(rest) // []
```
## Collections

### Lists

```harn
let nums = [1, 2, 3]
nums.count // 3
nums.first // 1
nums.last  // 3
nums.empty // false
nums[0]    // 1 (subscript access)
```

Lists support `+` for concatenation: `[1, 2] + [3, 4]` yields `[1, 2, 3, 4]`. Assigning to an out-of-bounds index throws an error.
### Dicts

```harn
let user = {name: "Alice", age: 30}
user.name         // "Alice" (property access)
user["age"]       // 30 (subscript access)
user.missing      // nil (missing keys return nil)
user.has("email") // false
user.keys()       // ["age", "name"] (sorted)
user.values()     // [30, "Alice"]
user.entries()    // [{key: "age", value: 30}, ...]
user.merge({role: "admin"}) // new dict with merged keys
user.map_values({ v -> to_string(v) })
user.filter({ v -> type_of(v) == "int" })
```
Computed keys use bracket syntax: {[dynamic_key]: value}.
Quoted string keys are also supported for JSON compatibility:
{"content-type": "json"}. The formatter normalizes simple quoted keys
to unquoted form and non-identifier keys to computed key syntax.
Keywords can be used as dict keys and property names: {type: "read"},
op.type.
Dicts iterate in sorted key order (alphabetical). This means
for k in dict is deterministic and reproducible, but does not preserve
insertion order.
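Putting the key forms and the iteration guarantee together (a small sketch; the header names are arbitrary):

```harn
let kind = "content-type"
let headers = {[kind]: "json", "x-request-id": "abc"}   // computed key + quoted key
println(headers["content-type"])   // json
// Iteration follows sorted key order, not insertion order:
for k in {b: 2, a: 1, c: 3} {
    println(k)   // a, then b, then c
}
```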
Sets
Sets are unordered collections of unique values. Duplicates are automatically removed.
let s = set(1, 2, 3) // create from individual values
let s2 = set([4, 5, 5, 6]) // create from a list (deduplicates)
let tags = set("a", "b", "c") // works with any value type
Set operations are provided as builtin functions:
let a = set(1, 2, 3)
let b = set(3, 4, 5)
set_contains(a, 2) // true
set_contains(a, 99) // false
set_union(a, b) // set(1, 2, 3, 4, 5)
set_intersect(a, b) // set(3)
set_difference(a, b) // set(1, 2) -- items in a but not in b
set_add(a, 4) // set(1, 2, 3, 4)
set_remove(a, 2) // set(1, 3)
Sets support iteration with for..in:
var sum = 0
for item in set(10, 20, 30) {
sum = sum + item
}
println(sum) // 60
Convert a set to a list with to_list():
let items = to_list(set(10, 20))
type_of(items) // "list"
Enums and structs
Enums
enum Status {
Active
Inactive
Pending(reason)
Failed(code, message)
}
let s = Status.Pending("waiting")
match s.variant {
"Pending" -> { println(s.fields[0]) }
"Active" -> { println("ok") }
}
Structs
struct Point {
x: int
y: int
}
let p = {x: 10, y: 20}
println(p.x)
Structs can also be constructed with the struct name as a constructor, using named fields directly:
let p = Point { x: 10, y: 20 }
println(p.x) // 10
Structs can declare type parameters when fields should stay connected:
struct Pair<A, B> {
first: A
second: B
}
let pair: Pair<int, string> = Pair { first: 1, second: "two" }
println(pair.second) // two
Impl blocks
Add methods to a struct with impl:
struct Point {
x: int
y: int
}
impl Point {
fn distance(self) {
return sqrt(self.x * self.x + self.y * self.y)
}
fn translate(self, dx, dy) {
return Point { x: self.x + dx, y: self.y + dy }
}
}
let p = Point { x: 3, y: 4 }
println(p.distance()) // 5.0
println(p.translate(10, 20)) // Point({x: 13, y: 24})
The first parameter must be self, which receives the struct instance.
Methods are called with dot syntax on values constructed with the struct
constructor.
Interfaces
Interfaces let you define a contract: a set of methods that a type must
have. Harn uses implicit satisfaction, just like Go. A struct satisfies
an interface automatically if its impl block has all the required methods.
You never write implements or impl Interface for Type.
Step 1: Define an interface
An interface lists method signatures without bodies:
interface Displayable {
fn display(self) -> string
}
This says: any type that has a display(self) -> string method counts as
Displayable.
Interfaces can also be generic, and individual interface methods may declare their own type parameters when the contract needs them:
interface Repository<T> {
fn get(id: string) -> T
fn map<U>(value: T, f: fn(T) -> U) -> U
}
Interfaces may also declare associated types when the contract needs to name an implementation-defined type without making the whole interface generic:
interface Collection {
type Item
fn get(self, index: int) -> Item
}
Step 2: Create structs with matching methods
struct Dog {
name: string
breed: string
}
impl Dog {
fn display(self) -> string {
return "${self.name} the ${self.breed}"
}
}
struct Cat {
name: string
indoor: bool
}
impl Cat {
fn display(self) -> string {
let status = if self.indoor { "indoor" } else { "outdoor" }
return "${self.name} (${status} cat)"
}
}
Both Dog and Cat have a display(self) -> string method, so they
both satisfy Displayable. No extra annotation is needed.
Step 3: Use the interface as a type
Now you can write a function that accepts any Displayable:
fn introduce(animal: Displayable) {
println("Meet: ${animal.display()}")
}
let d = Dog({name: "Rex", breed: "Labrador"})
let c = Cat({name: "Whiskers", indoor: true})
introduce(d) // Meet: Rex the Labrador
introduce(c) // Meet: Whiskers (indoor cat)
The type checker verifies at compile time that Dog and Cat satisfy
Displayable. If a struct is missing a required method, you get a
clear error at the call site.
Interfaces with multiple methods
Interfaces can require more than one method:
interface Serializable {
fn serialize(self) -> string
fn byte_size(self) -> int
}
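As a minimal sketch (the User struct and its serialization format are hypothetical), a type satisfies Serializable once its impl block provides both methods:

```harn
struct User {
    name: string
}
impl User {
    fn serialize(self) -> string {
        return "{\"name\": \"${self.name}\"}"
    }
    fn byte_size(self) -> int {
        return len(self.serialize())
    }
}
// User now satisfies Serializable implicitly -- no annotation needed.
```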
A struct must implement all listed methods to satisfy the interface.
guard, require, and assert
These three forms serve different jobs:
- guard condition else { ... } handles expected control flow and narrows types after the guard.
- require condition, "message" enforces runtime invariants in normal code and throws on failure.
- assert, assert_eq, and assert_ne are for test pipelines. The linter warns when you use them in non-test code, and it nudges test pipelines away from require.
guard user != nil else {
return "missing user"
}
require len(user.name) > 0, "user name cannot be empty"
Generic constraints
You can also use interfaces as constraints on generic type parameters:
fn log_item<T>(item: T) where T: Displayable {
println("[LOG] ${item.display()}")
}
The where T: Displayable clause tells the type checker to verify that
whatever concrete type is passed for T satisfies Displayable. If it
does not, a compile-time error is produced. Generic parameters must also bind
consistently across arguments, so fn<T>(a: T, b: T) cannot be called with
mixed concrete types such as (int, string). Container bindings like
list<T> preserve and validate their element type at call sites too.
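Calling log_item with the Dog type from earlier illustrates the constraint check (a sketch):

```harn
let rex = Dog({name: "Rex", breed: "Labrador"})
log_item(rex)        // [LOG] Rex the Labrador
// log_item(42)      // compile-time error: int does not satisfy Displayable
```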
Variance: in T and out T
Type parameters on user-defined generics may be marked in (the
parameter is contravariant — it appears only in input positions) or
out (covariant — only in output positions). Unannotated parameters
default to invariant: Box<int> and Box<float> are unrelated
unless Box declares out T and uses T only covariantly.
type Reader<out T> = fn() -> T // T is produced
interface Sink<in T> { fn accept(v: T) -> int } // T is consumed
Built-in containers carry sensible variance: iter<T> is covariant
(read-only), but list<T> and dict<K, V> are invariant (mutable).
Function types are contravariant in their parameters and covariant in
their return type — fn(float) can stand in for fn(int), but not
the other way around. The full variance table lives in the spec under
“Subtyping and variance”.
Declarations are checked at the definition site: a type Box<out T> = fn(T) -> int is rejected because T appears in a contravariant
position despite the out annotation.
Spread in function calls
The spread operator ... expands a list into individual function
arguments:
fn add(a, b, c) {
return a + b + c
}
let nums = [1, 2, 3]
println(add(...nums)) // 6
You can mix regular arguments and spread arguments:
fn add(a, b, c) {
return a + b + c
}
let rest = [2, 3]
println(add(1, ...rest)) // 6
Spread works in method calls too:
let point = Point({x: 0, y: 0})
let deltas = [10, 20]
let moved = point.translate(...deltas)
Try-expression
The try keyword without a catch block is a try-expression. It
evaluates its body and wraps the outcome in a Result:
let result = try { json_parse(raw_input) }
// Result.Ok(parsed_data) -- if parsing succeeds
// Result.Err("invalid JSON: ...") -- if parsing throws
This is the complement of the ? operator. Use try to enter
Result-land (catching errors into Result.Err), and ? to exit
Result-land (propagating errors upward):
fn safe_divide(a, b) {
return try { a / b }
}
fn compute(x) {
let half = safe_divide(x, 2)? // unwrap Ok or propagate Err
return Ok(half + 10)
}
No catch or finally is needed. If a catch follows try, it is
parsed as the traditional try/catch statement instead.
Ask expression
The ask expression is syntactic sugar for making an LLM call. It takes
a set of key-value fields and returns the LLM response as a string:
let answer = ask {
system: "You are a helpful assistant.",
user: "What is 2 + 2?"
}
println(answer)
Common fields include system (system prompt), user (user message),
model, max_tokens, and provider. The ask expression is equivalent
to building a dict and passing it to llm_call.
Duration literals
let d1 = 500ms // 500 milliseconds
let d2 = 5s // 5 seconds
let d3 = 2m // 2 minutes
let d4 = 1h // 1 hour
Durations can be passed to sleep() and used in deadline blocks.
Math constants
pi and e are global constants (not functions):
println(pi) // 3.141592653589793
println(e) // 2.718281828459045
let area = pi * r * r
Named format placeholders
The format builtin supports both positional {} placeholders and named
{key} placeholders when the second argument is a dict:
// Positional
println(format("Hello, {}!", "world"))
// Named
println(format("Hello {name}, you are {age}.", {name: "Alice", age: 30}))
For simple cases, string interpolation with ${} is usually more
convenient:
let name = "Alice"
println("Hello, ${name}!")
Comments
// Line comment
/** HarnDoc comment for a public API.
Use a `/** ... */` block directly above `pub fn`. */
pub fn greet(name: string) -> string {
return "Hello, ${name}"
}
pub pipeline deploy(task) {
return
}
pub enum Result {
Ok(value: string)
Err(message: string)
}
pub struct Config {
host: string
port?: int
}
/* Block comment
/* Nested block comments are supported */
Still inside the outer comment */
Error handling
Harn provides try/catch/throw for error handling and retry for automatic recovery.
throw
Any value can be thrown as an error:
throw "something went wrong"
throw {code: 404, message: "not found"}
throw 42
try/catch
Catch errors with an optional error binding:
try {
let data = json_parse(raw_input)
} catch (e) {
println("Parse failed: ${e}")
}
The error variable is optional:
fn risky_operation() { throw "boom" }
try {
risky_operation()
} catch {
println("Something failed, moving on")
}
What gets bound to the error variable
- If the error was created with throw: e is the thrown value directly (string, dict, etc.)
- If the error is an internal runtime error: e is the error’s description as a string
return inside try
A return statement inside a try block is not caught. It propagates
out of the enclosing pipeline or function as expected.
fn find_user(id) {
try {
let user = lookup(id)
return user // this returns from find_user, not caught
} catch (e) {
return nil
}
}
Typed catch
Catch specific error types using enum-based error hierarchies:
enum AppError {
NotFound(resource)
Unauthorized(reason)
Internal(message)
}
try {
throw AppError.NotFound("user:123")
} catch (e: AppError) {
match e.variant {
"NotFound" -> { println("Missing: ${e.fields[0]}") }
"Unauthorized" -> { println("Access denied") }
"Internal" -> { println("Internal: ${e.fields[0]}") }
}
}
Errors that don’t match the typed catch propagate up the call stack.
require
The require statement checks a condition and throws an error if it is
false. An optional second argument provides the error message:
require len(items) > 0, "items list must not be empty"
require user != nil, "user is required"
require score >= 0 // throws a generic error if false
require is useful at the top of a function to validate preconditions
before proceeding. If the condition is falsy, execution stops with a
thrown error that can be caught by try/catch or will surface as a
runtime error.
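For example, a failed require can be caught like any other thrown error (the bound error's exact shape is runtime-defined):

```harn
fn set_score(score) {
    require score >= 0, "score must be non-negative"
    return score
}
try {
    set_score(-1)
} catch (e) {
    println("caught: ${e}")
}
```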
guard
The guard statement provides an early-return pattern. If the condition
is false, the else block executes. The else block must exit the
current scope (typically via return or throw):
fn process(input) {
guard input != nil else {
return "no input"
}
guard type_of(input) == "string" else {
throw "expected string, got ${type_of(input)}"
}
// input is guaranteed non-nil and a string here
return input.uppercase()
}
After a guard statement, the type checker narrows the variable’s type
based on the condition. For example, guard x != nil ensures x is
non-nil in subsequent code.
retry
Automatically retry a block up to N times:
retry 3 {
let response = http_post(url, payload)
let parsed = json_parse(response)
parsed
}
- If the body succeeds on any attempt, returns that result immediately
- If all attempts fail, returns nil
- return inside a retry block propagates out (not retried)
Try-expression
The try keyword without a catch block acts as a try-expression. It
evaluates the body and returns a Result:
- On success: Result.Ok(value)
- On error: Result.Err(error)
let result = try { json_parse(raw_input) }
This is useful when you want to capture an error as a value rather than
crashing or needing a full try/catch:
let parsed = try { json_parse(input) }
if is_err(parsed) {
println("Bad input, using defaults")
parsed = Ok({})
}
let data = unwrap(parsed)
Try/catch expression
try { ... } catch (e) { ... } is also usable as an expression — the whole
form evaluates to the try body’s tail value on success, or the catch
handler’s tail value on a caught throw. The lub of the two branch types is
inferred automatically, and an explicit type annotation on the let binds
the result:
let parsed: dict = try { json_parse(input) } catch (e) { default_config() }
Typed catches work identically in expression position; when the thrown
error’s type does not match the catch’s type filter, the throw propagates
past the expression and the let binding is never established:
let user: User = try {
fetch_user(id)
} catch (e: NetworkError) {
cached_user(id)
}
// Any non-`NetworkError` throw surfaces out of this block unchanged.
A finally { ... } tail is optional on either form and runs once for
side-effect only — its value is discarded. The expression’s value still
comes from the try body or the catch handler.
The try-expression pairs naturally with the ? operator. Use try to
enter Result-land and ? to propagate within it:
fn fetch_json(url) {
let body = try { http_get(url) }
let text = unwrap(body)?
let data = try { json_parse(text) }
return data
}
When catch or finally follows try, the form is the handled
expression described above; only the bare try { body } form wraps in
Result.
Runtime shape validation errors
When a function parameter has a structural type annotation (a shape like
{name: string, age: int}), Harn validates the argument at runtime. If
the argument is missing a required field or a field has the wrong type,
a clear error is produced:
fn process(user: {name: string, age: int}) {
println("${user.name} is ${user.age}")
}
process({name: "Alice"})
// Error: parameter 'user': missing field 'age' (int)
process({name: "Alice", age: "old"})
// Error: parameter 'user': field 'age' expected int, got string
Shape validation works with both plain dicts and struct instances. Extra fields beyond those listed in the shape are allowed (width subtyping).
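Reusing process from above, extra fields pass validation:

```harn
process({name: "Bob", age: 45, role: "admin"})
// prints: Bob is 45 -- the extra 'role' field is simply ignored by the shape check
```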
This catches a common class of bugs where a dict is passed with missing or mistyped fields, giving you precise feedback about exactly which field is wrong.
Result type
The built-in Result enum provides an alternative to try/catch for
representing success and failure as values. A Result is either
Ok(value) or Err(error). Statically, Result is generic:
Result<T, E>.
let ok = Ok(42)
let err = Err("something failed")
let typed_ok: Result<int, string> = ok
let typed_err: Result<int, string> = err
println(ok) // Result.Ok(42)
println(err) // Result.Err(something failed)
The shorthand constructors Ok(value) and Err(value) are equivalent to
Result.Ok(value) and Result.Err(value).
Result helper functions
| Function | Description |
|---|---|
is_ok(r) | Returns true if r is Result.Ok |
is_err(r) | Returns true if r is Result.Err |
unwrap(r) | Returns the Ok value, throws if r is Err |
unwrap_or(r, default) | Returns the Ok value, or default if r is Err |
unwrap_err(r) | Returns the Err value, throws if r is Ok |
let r = Ok(42)
println(is_ok(r)) // true
println(is_err(r)) // false
println(unwrap(r)) // 42
println(unwrap_or(Err("x"), "default")) // default
Pattern matching on Result
Result values can be destructured with match:
fn fetch_data(url) {
// ... returns Ok(data) or Err(message)
}
match fetch_data("/api/users") {
Result.Ok(data) -> { println("Got ${len(data)} users") }
Result.Err(err) -> { println("Failed: ${err}") }
}
The ? operator
The postfix ? operator provides concise error propagation. Applied to a
Result value, it unwraps Ok and returns the value, or immediately
returns the Err from the enclosing function.
fn divide(a, b) {
if b == 0 {
return Err("division by zero")
}
return Ok(a / b)
}
fn compute(x) {
let result = divide(x, 2)? // unwraps Ok, or returns Err early
return Ok(result + 10)
}
let r1 = compute(20) // Result.Ok(20)
let r2 = compute(0) // Result.Err(division by zero)
The ? operator has the same precedence as ., [], and (), so it
chains naturally:
fn fetch_and_parse(url) {
let response = http_get(url)?
let data = json_parse(response)?
return Ok(data)
}
Applying ? to a non-Result value produces a runtime type error.
Result vs. try/catch
Use Result and ? when errors are expected outcomes that callers should
handle (validation failures, missing data, parse errors). Use try/catch
for unexpected errors or when you want to recover from failures in-place
without propagating them through return values.
The two patterns can be combined:
fn transform(data) { return data }
fn safe_parse(input) {
try {
let data = json_parse(input)
return Ok(data)
} catch (e) {
return Err("parse error: ${e}")
}
}
fn process(raw) {
let data = safe_parse(raw)? // propagate Err if parse fails
return Ok(transform(data))
}
Stack traces
When a runtime error occurs, Harn displays a stack trace showing the call chain that led to the error. The trace includes file location, source context, and the sequence of function calls.
error: division by zero
--> example.harn:3:14
|
3 | let x = a / b
| ^
= note: called from compute at example.harn:8
= note: called from pipeline at example.harn:12
The error format shows:
- Error message: what went wrong
- Source location: file, line, and column where the error occurred
- Source context: the relevant source line with a caret (^) pointing to the exact position
- Call chain: each function in the call stack, from innermost to outermost, with file and line numbers
Stack traces are captured at the point of the error, before try/catch unwinding, so the full call chain is preserved even when errors are caught at a higher level.
Combining patterns
retry 3 {
try {
let result = llm_call(prompt, system)
let parsed = json_parse(result)
return parsed
} catch (e) {
println("Attempt failed: ${e}")
throw e // re-throw to trigger retry
}
}
Modules and imports
Harn supports splitting code across files using import and top-level fn declarations.
Importing files
import "lib/helpers.harn"
The extension is optional — these are equivalent:
import "lib/helpers.harn"
import "lib/helpers"
Import paths are resolved relative to the current file’s directory.
If main.harn imports "lib/helpers", it looks for lib/helpers.harn
next to main.harn.
Writing a library file
Library files contain top-level fn declarations:
// lib/math.harn
fn double(x) {
return x * 2
}
fn clamp(value, low, high) {
if value < low { return low }
if value > high { return high }
return value
}
When imported, these functions become available in the importing file’s scope.
Using imported functions
import "lib/math"
pipeline default(task) {
println(double(21)) // 42
println(clamp(150, 0, 100)) // 100
}
Importing pipelines
Imported files can also contain pipelines, which are registered globally by name:
// lib/analysis.harn
pipeline analyze(task) {
println("Analyzing: ${task}")
}
// main.harn
import "lib/analysis"
pipeline default(task) {
// the "analyze" pipeline is now registered and available
}
What needs an import
Most Harn builtins — println, log, read_file, write_file, llm_call,
agent_loop, http_get, parallel, workflow_*, transcript_*,
mcp_*, and the rest of the runtime surface — are registered globally and
require no import statement. You can call them directly from top-level
code or inside any pipeline.
import "std/..." is only needed for the Harn-written helper modules
described below (std/text, std/json, std/math, std/collections,
std/path, std/context, std/agent_state, std/agents, std/runtime,
std/project, std/worktree, std/checkpoint). These add layered
utilities on top of the core builtins; the core builtins themselves are
always available.
Standard library modules
Harn includes built-in modules that are compiled into the interpreter.
Import them with the std/ prefix:
import "std/text"
import "std/collections"
import "std/math"
import "std/path"
import "std/json"
import "std/context"
import "std/agent_state"
import "std/agents"
std/text
Text processing utilities for LLM output and code analysis:
| Function | Description |
|---|---|
int_to_string(value) | Convert an integer-compatible value to a decimal string |
float_to_string(value) | Convert a float-compatible value to a string |
parse_int_or(value, fallback) | Parse an integer, returning fallback on failure |
parse_float_or(value, fallback) | Parse a float, returning fallback on failure |
extract_paths(text) | Extract file paths from text, filtering comments and validating extensions |
parse_cells(response) | Parse fenced code blocks from LLM output. Returns [{type, lang, code}] |
filter_test_cells(cells, target_file?) | Filter cells to keep code blocks and write_file calls |
truncate_head_tail(text, n) | Keep first/last n lines with omission marker |
detect_compile_error(output) | Check for compile error patterns (SyntaxError, etc.) |
has_got_want(output) | Check for got/want test failure patterns |
format_test_errors(output) | Extract error-relevant lines (max 20) |
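Like the other modules, std/text is used after an import. A small sketch of the parse helpers:

```harn
import "std/text"
println(parse_int_or("42", 0))       // 42
println(parse_int_or("n/a", -1))     // -1
println(parse_float_or("2.5", 0.0))  // 2.5
```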
std/collections
Collection utilities and store helpers:
| Function | Description |
|---|---|
filter_nil(dict) | Remove entries where value is nil, empty string, or “null” |
store_stale(key, max_age_seconds) | Check if a store key’s timestamp is stale |
store_refresh(key) | Update a store key’s timestamp to now |
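For example (per the table above, filter_nil drops nil and empty-string values):

```harn
import "std/collections"
let cleaned = filter_nil({name: "Alice", email: nil, bio: ""})
println(cleaned)   // {name: "Alice"}
```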
std/math
Extended math utilities:
| Function | Description |
|---|---|
clamp(value, lo, hi) | Clamp a value between min and max |
lerp(a, b, t) | Linear interpolation between a and b by t (0..1) |
map_range(value, in_lo, in_hi, out_lo, out_hi) | Map a value from one range to another |
deg_to_rad(degrees) | Convert degrees to radians |
rad_to_deg(radians) | Convert radians to degrees |
sum(items) | Sum a list of numbers |
avg(items) | Average of a list of numbers (returns 0 for empty lists) |
mean(items) | Arithmetic mean of a list of numbers |
median(items) | Median of a non-empty numeric list |
percentile(items, p) | R-7 percentile interpolation for p in [0, 100] |
argsort(items, score_fn?) | Indices that would sort a list ascending, optionally by score |
top_k(items, k, score_fn?) | Highest-scoring k items, descending |
variance(items, sample?) | Population variance, or sample variance when sample = true |
stddev(items, sample?) | Population standard deviation, or sample mode when sample = true |
minmax_scale(items) | Scale a numeric list into [0, 1], or all zeros for a constant list |
zscore(items, sample?) | Standardize a numeric list, or all zeros for a constant list |
weighted_mean(items, weights) | Weighted arithmetic mean |
weighted_choice(items, weights?) | Randomly choose one item by non-negative weights |
softmax(items, temperature?) | Convert numeric scores into probabilities |
normal_pdf(x, mean?, stddev?) | Normal density with defaults mean = 0, stddev = 1 |
normal_cdf(x, mean?, stddev?) | Normal cumulative distribution with defaults mean = 0, stddev = 1 |
normal_quantile(prob, mean?, stddev?) | Inverse normal CDF for 0 < prob < 1 |
dot(a, b) | Dot product of two equal-length numeric vectors |
vector_norm(v) | Euclidean norm of a numeric vector |
vector_normalize(v) | Unit-length version of a non-zero numeric vector |
cosine_similarity(a, b) | Cosine similarity of two non-zero equal-length vectors |
euclidean_distance(a, b) | Euclidean distance between two equal-length vectors |
manhattan_distance(a, b) | Manhattan distance between two equal-length vectors |
chebyshev_distance(a, b) | Chebyshev distance between two equal-length vectors |
covariance(xs, ys, sample?) | Population or sample covariance between two numeric lists |
correlation(xs, ys, sample?) | Pearson correlation between two numeric lists |
moving_avg(items, window) | Sliding-window moving average |
ema(items, alpha) | Exponential moving average over a numeric list |
kmeans(points, k, options?) | Deterministic k-means over list<list<number>>, returns {centroids, assignments, counts, iterations, converged, inertia} |
import "std/math"
println(clamp(150, 0, 100)) // 100
println(lerp(0, 10, 0.5)) // 5
println(map_range(50, 0, 100, 0, 1)) // 0.5
println(sum([1, 2, 3, 4])) // 10
println(avg([10, 20, 30])) // 20
println(percentile([1, 2, 3, 4], 75)) // 3.25
println(top_k(["a", "bbbb", "cc"], 2, { x -> len(x) })) // ["bbbb", "cc"]
println(softmax([1, 2, 3])) // probabilities summing to 1
println(cosine_similarity([1, 0], [1, 1])) // ~0.707
println(moving_avg([1, 2, 3, 4, 5], 3)) // [2.0, 3.0, 4.0]
let grouped = kmeans([[0, 0], [0, 1], [10, 10], [10, 11]], 2)
println(grouped.centroids) // [[0.0, 0.5], [10.0, 10.5]]
std/path
Path manipulation utilities:
| Function | Description |
|---|---|
ext(path) | Get the file extension without the dot |
stem(path) | Get the filename without extension |
normalize(path) | Normalize path separators (backslash to forward slash) |
is_absolute(path) | Check if a path is absolute |
workspace_info(path, workspace_root?) | Classify a path at the workspace boundary |
workspace_normalize(path, workspace_root?) | Normalize a path into workspace-relative form when safe |
list_files(dir) | List files in a directory (one level) |
list_dirs(dir) | List subdirectories in a directory |
import "std/path"
println(ext("main.harn")) // "harn"
println(stem("/src/main.harn")) // "main"
println(is_absolute("/usr/bin")) // true
println(workspace_normalize("/packages/app/SKILL.md", cwd())) // "packages/app/SKILL.md"
let files = list_files("src")
let dirs = list_dirs(".")
std/json
JSON utility patterns:
| Function | Description |
|---|---|
pretty(value) | Pretty-print a value as indented JSON |
safe_parse(text) | Safely parse JSON, returning nil on failure instead of throwing |
merge(a, b) | Shallow-merge two dicts (keys in b override keys in a) |
pick(data, keys) | Pick specific keys from a dict |
omit(data, keys) | Omit specific keys from a dict |
import "std/json"
let data = safe_parse("{\"x\": 1}") // {x: 1}, or nil on bad input
let merged = merge({a: 1}, {b: 2}) // {a: 1, b: 2}
let subset = pick({a: 1, b: 2, c: 3}, ["a", "c"]) // {a: 1, c: 3}
let rest = omit({a: 1, b: 2, c: 3}, ["b"]) // {a: 1, c: 3}
std/context
Structured prompt/context assembly helpers:
| Function | Description |
|---|---|
section(name, content, options?) | Create a named context section |
context_attach(name, path, content, options?) | Attach file/path-oriented context |
context(sections, options?) | Build a context object |
context_render(ctx, options?) | Render a context into prompt text |
prompt_compose(task, ctx, options?) | Compose {prompt, system, rendered_context} |
std/agent_state
Durable session-scoped state helpers built on the VM-side durable-state backend:
| Function | Description |
|---|---|
agent_state_init(root, options?) | Create or reopen a session-scoped durable state handle |
agent_state_resume(root, session_id, options?) | Reopen an existing durable state session |
agent_state_write(handle, key, content) | Atomically persist text content under a relative key |
agent_state_read(handle, key) | Read a key, returning nil when it is absent |
agent_state_list(handle) | Recursively list keys in deterministic order |
agent_state_delete(handle, key) | Delete a key |
agent_state_handoff(handle, summary) | Write a structured JSON handoff envelope to the reserved handoff key |
agent_state_handoff_key() | Return the reserved handoff key name |
See Agent state for the handle format, conflict policies, and backend details.
std/runtime
Generic host/runtime helpers that are useful across many hosts:
| Function | Description |
|---|---|
runtime_task() | Return the current runtime task string |
runtime_pipeline_input() | Return structured pipeline input from the host |
runtime_dry_run() | Return whether the current run is dry-run only |
runtime_approved_plan() | Return the host-approved plan text when available |
process_exec(command) | Execute a process through the typed host contract |
process_exec_with_timeout(command, timeout_ms) | Execute a process with an explicit timeout |
interaction_ask(question) | Ask the host/user a question through the typed interaction contract |
interaction_ask_with_kind(question, kind) | Ask the host/user a question with an explicit interaction kind |
record_run_metadata(run, workflow_name) | Persist normalized workflow run metadata through the runtime contract |
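A hedged sketch of how these host helpers combine (return shapes depend on the host bridge):

```harn
import "std/runtime"
if runtime_dry_run() {
    println("dry run for task: ${runtime_task()}")
} else {
    // Runs through the typed host contract; the host decides approval.
    let result = process_exec("git status")
    println(result)
}
```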
std/project
Project metadata helpers plus deterministic project evidence scanning:
| Function | Description |
|---|---|
metadata_namespace(dir, namespace) | Read resolved metadata for a namespace, defaulting to {} |
metadata_local_namespace(dir, namespace) | Read only the namespace data stored directly on a directory |
project_inventory(namespace?) | Return {entries, status} for metadata-backed project state |
project_root_package() | Infer the repository’s root package/module name from common manifests |
project_scan(path, options?) | Scan a directory for deterministic L0/L1 evidence |
project_enrich(path, options) | Run caller-owned L2 enrichment over bounded project context with schema validation and caching |
project_scan_tree(path, options?) | Walk subdirectories and return a {rel_path: evidence} map |
project_deep_scan(path, options?) | Build or refresh a cached per-directory evidence tree backed by metadata namespaces |
project_deep_scan_status(namespace, path?) | Return the last deep-scan status for a namespace/scope |
project_catalog() | Return the built-in anchor/lockfile catalog used by project_scan(...) |
project_scan_paths(path, options?) | Return only the keys from project_scan_tree(...) |
project_stale(namespace?) | Return the stale summary from metadata_status(...) |
project_stale_dirs(namespace?) | Return the tier1+tier2 stale directory list |
project_requires_refresh(namespace?) | Return true when stale or missing hashes require refresh |
Host-specific editor, git, diagnostics, learning, and filesystem/edit helpers
should live in host-side .harn libraries built on capability-aware
host_call(...), not in Harn’s shared stdlib.
std/agents
Workflow helpers built on transcripts and agent_loop:
| Function | Description |
|---|---|
workflow(config) | Create a workflow config |
action_graph(raw, options?) | Normalize planner output into a canonical action-graph envelope |
action_graph_batches(graph, completed?) | Compute dependency-ready action batches grouped by phase and tool class |
action_graph_render(graph) | Render a human-readable markdown summary of an action graph |
action_graph_flow(graph, config?) | Convert an action graph into a typed workflow graph |
action_graph_run(task, graph, config?, overrides?) | Execute an action graph through the shared workflow runtime |
task_run(task, flow, overrides?) | Run an act/verify/repair workflow |
workflow_result_text(result) | Extract a visible text result from an LLM call, workflow wrapper, or ad hoc payload |
workflow_result_run(task, workflow_name, result, artifacts?, options?) | Normalize an ad hoc result into a reusable run record |
workflow_result_persist(task, workflow_name, result, artifacts?, options?) | Persist an ad hoc result as a run record without going through workflow_execute |
workflow_session(prev) | Normalize a task result or transcript into a reusable session object |
workflow_session_new(metadata?) | Create a new empty workflow session |
workflow_session_restore(run_or_path) | Restore a session from a run record or persisted run path |
workflow_session_fork(prev) | Fork a session transcript and mark it forked |
workflow_session_archive(prev) | Archive a session transcript |
workflow_session_resume(prev) | Resume an archived session transcript |
workflow_session_compact(prev, options?) | Summarize/compact a session transcript in place |
workflow_session_reset(prev, carry_summary) | Reset a session transcript, optionally carrying summary |
workflow_session_persist(prev, path?) | Persist the session run record and attach the saved path |
workflow_continue(prev, task, flow, overrides?) | Continue from an existing transcript |
workflow_compact(prev, options?) | Summarize and compact a transcript |
workflow_reset(prev, carry_summary) | Reset or summarize-then-reset a workflow transcript |
worker_request(worker) | Return a worker handle’s immutable original request payload |
worker_result(worker) | Return a worker handle/result payload or worker-result artifact payload |
worker_provenance(worker) | Return normalized worker provenance fields |
worker_research_questions(worker) | Return the worker’s canonical research_questions list |
worker_action_items(worker) | Return the worker’s canonical action_items list |
worker_workflow_stages(worker) | Return the worker’s canonical workflow_stages list |
worker_verification_steps(worker) | Return the worker’s canonical verification_steps list |
workflow_session(...) returns a normalized session dict that includes the
current transcript, message count, summary, persisted run metadata, and a
usage object when the source run captured LLM totals:
{input_tokens, output_tokens, total_duration_ms, call_count}.
For background or delegated execution, use the worker lifecycle builtins
(spawn_agent, send_input, resume_agent, wait_agent, close_agent, list_agents)
directly from the runtime, or the worker_* helpers above when you need the
normalized request/provenance views.
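A minimal sketch of the session helpers above, assuming `task` and `flow` are defined elsewhere; the `max_messages` option and the `message_count` field name are illustrative assumptions, not confirmed API:

```harn
// Run a task, normalize the result into a session, compact, and persist.
let first = task_run(task, flow)
let session = workflow_session(first)             // normalized session dict
println(session.message_count)                    // transcript size (field name assumed)
let compacted = workflow_session_compact(session, {max_messages: 20})  // options assumed
workflow_session_persist(compacted)               // saves run record, attaches path
```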
std/worktree
Helpers for isolated git worktree execution built on exec_at(...) and
shell_at(...):
| Function | Description |
|---|---|
worktree_default_path(repo, name) | Return the default .harn/worktrees/<name> path |
worktree_create(repo, name, base_ref, path?) | Create or reset a worktree branch at a target path |
worktree_remove(repo, path, force) | Remove a worktree from the parent repo |
worktree_status(path) | Run git status --short --branch in the worktree |
worktree_diff(path, base_ref?) | Render diff output for the worktree |
worktree_shell(path, script) | Run an arbitrary shell command inside the worktree |
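A sketch of a typical lifecycle built from the table above — create an isolated worktree, run a command in it, and guarantee cleanup with `defer` (the branch name and test command are illustrative):

```harn
let repo = "."
let path = worktree_default_path(repo, "fix-123")
worktree_create(repo, "fix-123", "main", path)
defer { worktree_remove(repo, path, true) }   // force-remove on scope exit

println(worktree_status(path))                // git status --short --branch
println(worktree_shell(path, "cargo test"))   // arbitrary shell command in the worktree
```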
Selective imports
Import specific functions from any module:
import { extract_paths, parse_cells } from "std/text"
Import behavior
Import paths resolve in this order:
1. std/<module> from the embedded stdlib
2. Relative to the importing file, with an implicit .harn extension
3. Installed packages under the nearest ancestor .harn/packages/
4. Package manifest [exports] aliases
5. Package directories with lib.harn
Packages can publish stable module entry points in harn.toml:
[exports]
capabilities = "runtime/capabilities.harn"
providers = "runtime/providers.harn"
With that manifest, import "acme/capabilities" resolves to the
declared file inside .harn/packages/acme/, and nested package modules
can import sibling packages through the workspace-level .harn/packages
root instead of relying on brittle relative paths.
- The imported file is parsed and executed
- Pipelines in the imported file are registered by name
- Non-pipeline top-level statements (fn declarations, let bindings) are executed, making their values available
- Circular imports are detected and skipped (each file is imported at most once)
- The working directory is temporarily changed to the imported file’s directory, so nested imports resolve correctly
- Source-relative builtins like render(...) inside imported functions resolve paths relative to the imported module’s directory, not the entry pipeline
Static cross-module checking
harn check, harn run, harn bench, and the Harn LSP all build a
module graph from the entry file that follows import statements
transitively, so they share one consistent view of what names are
visible in each module.
When every import in a file resolves, the typechecker treats a call to an unknown name as an error (not a lint warning):
error: call target `helpr` is not defined or imported
Resolution is conservative: if any import in the file fails to resolve (missing file, parse error, nonexistent package), the stricter cross-module check is turned off for that file and only the normal builtin/local-declaration check applies. That way one broken import does not produce a flood of follow-on undefined-name errors.
Go-to-definition in the LSP uses the same graph, so navigation works across any chain of imports — not just direct ones.
Import collision detection
If two wildcard imports export a function with the same name, Harn will
report an error at both runtime and during harn check preflight:
Import collision: 'helper' is already defined when importing lib/b.harn.
Use selective imports to disambiguate: import { helper } from "..."
To resolve collisions, use selective imports to import only the names you need from each module:
import { parse_output } from "lib/a"
import { format_result } from "lib/b"
Pipeline inheritance
Pipelines can extend other pipelines:
pipeline base(task) {
println("Step 1: setup")
println("Step 2: execute")
println("Step 3: cleanup")
}
pipeline custom(task) extends base {
override setup() {
println("Custom setup")
}
}
If the child pipeline has override declarations, the parent’s body runs
with the overrides applied. If the child has no overrides, the child’s body
replaces the parent’s entirely.
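Conversely, a child with no override declarations discards the parent's steps entirely:

```harn
pipeline replace_all(task) extends base {
    // No `override` blocks here, so base's three steps never run.
    println("only this line runs")
}
```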
Organizing a project
A typical project structure:
my-project/
main.harn
lib/
context.harn # shared context-gathering functions
agent.harn # shared agent utility functions
helpers.harn # general-purpose utilities
// main.harn
import "lib/context"
import "lib/agent"
import "lib/helpers"
pipeline default(task, project) {
let ctx = gather_context(task, project)
let result = run_agent(ctx)
finalize(result)
}
Concurrency
Harn has built-in concurrency primitives that don’t require callbacks, promises, or async/await boilerplate.
spawn and await
Launch background tasks and collect results:
let handle = spawn {
sleep(1s)
"done"
}
let result = await(handle) // blocks until complete
println(result) // "done"
Cancel a task before it finishes:
let handle = spawn { sleep(10s) }
cancel(handle)
Each spawned task runs in an isolated interpreter instance.
parallel
Run N tasks concurrently and collect results in order:
let results = parallel(5) { i ->
i * 10
}
// [0, 10, 20, 30, 40]
The variable i is the zero-based task index. Results are always returned
in index order regardless of completion order.
parallel each
Map over a collection concurrently:
let files = ["a.txt", "b.txt", "c.txt"]
let contents = parallel each files { file ->
read_file(file)
}
Results preserve the original list order.
parallel settle
Like parallel each, but never throws. Instead, it collects both
successes and failures into a result object:
let items = [1, 2, 3]
let outcome = parallel settle items { item ->
if item == 2 {
throw "boom"
}
item * 10
}
println(outcome.succeeded) // 2
println(outcome.failed) // 1
for r in outcome.results {
if is_ok(r) {
println(unwrap(r))
} else {
println(unwrap_err(r))
}
}
The return value is a dict with:
| Field | Type | Description |
|---|---|---|
results | list | List of Result values (one per item), in order |
succeeded | int | Number of Ok results |
failed | int | Number of Err results |
This is useful when you want to process all items and handle failures after the fact, rather than aborting on the first error.
retry
Automatically retry a block that might fail:
retry 3 {
http_get("https://flaky-api.example.com/data")
}
Executes the body up to N times. If the body succeeds, returns immediately.
If all attempts fail, returns nil. Note that return statements inside
retry propagate out (they are not retried).
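Because a fully failed retry evaluates to nil, and retry is valid in expression position, it pairs naturally with the ?? operator for fallbacks (a small sketch; the URL is the same illustrative endpoint as above):

```harn
let data = retry 3 {
    http_get("https://flaky-api.example.com/data")
} ?? "{}"   // fall back to an empty payload if all three attempts fail
```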
Channels
Message-passing between concurrent tasks:
let ch = channel("events")
send(ch, {event: "start", timestamp: timestamp()})
let msg = receive(ch)
Channel iteration
You can iterate over a channel with a for loop. The loop receives
messages one at a time and exits when the channel is closed and fully
drained:
let ch = channel("stream")
spawn {
send(ch, "chunk 1")
send(ch, "chunk 2")
close_channel(ch)
}
for chunk in ch {
println(chunk)
}
// prints "chunk 1" then "chunk 2", then the loop ends
This is especially useful with llm_stream, which returns a channel
of response chunks:
let stream = llm_stream("Tell me a story", "You are a storyteller")
for chunk in stream {
print(chunk)
}
Use try_receive(ch) for non-blocking reads – it returns nil
immediately if no message is available. Use close_channel(ch) to
signal that no more messages will be sent.
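A sketch of a non-blocking polling loop using try_receive, for cases where the consumer has other work to interleave (the sleep intervals are illustrative):

```harn
let ch = channel("work")
spawn {
    sleep(100ms)
    send(ch, "ready")
    close_channel(ch)
}
while true {
    let msg = try_receive(ch)   // nil immediately if nothing is queued
    if msg != nil {
        println(msg)
        break
    }
    sleep(10ms)                 // nothing yet; back off and poll again
}
```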
Atomics
Thread-safe counters:
let counter = atomic(0)
println(atomic_get(counter)) // 0
let c2 = atomic_add(counter, 5)
println(atomic_get(c2)) // 5
let c3 = atomic_set(c2, 100)
println(atomic_get(c3)) // 100
Atomic operations return new atomic values (they don’t mutate in place).
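Since the operations return new atomic values rather than mutating, the idiomatic pattern is to rebind with var (a sketch of that rebinding style):

```harn
var counter = atomic(0)
for item in [1, 2, 3] {
    counter = atomic_add(counter, 1)   // rebind to the returned atomic
}
println(atomic_get(counter))           // 3
```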
Mutex
Mutual exclusion for critical sections:
var count = 0
mutex {
    // only one task executes this block at a time
    count = count + 1
}
Deadline
Set a timeout on a block of work:
deadline 30s {
// must complete within 30 seconds
agent_loop(task, system, {persistent: true})
}
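A deadline overrun aborts the block; the sketch below assumes the overrun surfaces as a catchable error so a fallback can be supplied — the error shape and try-as-expression usage here are assumptions, not confirmed behavior:

```harn
let summary = try {
    deadline 10s {
        llm_call("Summarize this log", "You are terse.")
    }
} catch (e) {
    "timed out: ${e}"   // assumption: deadline overruns throw a catchable error
}
```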
Defer
Register cleanup code that runs when the enclosing scope exits, whether by normal return or by a thrown error:
fn open(path) { return path }
fn close(f) { log("closed ${f}") }
let f = open("data.txt")
defer { close(f) }
// ... use f ...
// close(f) runs automatically on scope exit
Multiple defer blocks execute in LIFO (last-registered, first-executed)
order, similar to Go’s defer.
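The LIFO ordering can be seen directly with two defer blocks:

```harn
defer { println("registered first, runs last") }
defer { println("registered second, runs first") }
println("body")
// output:
// body
// registered second, runs first
// registered first, runs last
```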
Capping in-flight work with max_concurrent
parallel each, parallel settle, and parallel N all accept an
optional with { max_concurrent: N } clause that caps how many
workers are in flight at once. Tasks past the cap wait until a slot
frees up — fan-out stays bounded while the total work is unchanged.
// Without a cap: all 200 requests hit the server at once.
let results = parallel settle paths { p -> llm_call(p, nil, opts) }
// With max_concurrent=8: at most 8 in-flight calls at any moment.
let results = parallel settle paths with { max_concurrent: 8 } { p ->
llm_call(p, nil, opts)
}
max_concurrent: 0 (or a missing with clause) means unlimited.
Negative values are treated as unlimited. The cap applies to every
parallel mode, including the count form:
fn process(i) { log(i) }
parallel 100 with { max_concurrent: 4 } { i ->
process(i)
}
Rate limiting LLM providers
max_concurrent bounds simultaneous in-flight tasks on the caller’s
side. A provider can additionally be rate-limited at the throughput
layer (requests per minute). The RPM limiter is a sliding-window
budget enforced before each llm_call / llm_completion — requests
past the budget wait for the window to free up rather than error.
Configure RPM per provider via:
- rpm: 600 in the provider’s entry in providers.toml / harn.toml.
- HARN_RATE_LIMIT_<PROVIDER>=600 environment variable (e.g. HARN_RATE_LIMIT_TOGETHER=600, HARN_RATE_LIMIT_LOCAL=60). Env overrides config.
- llm_rate_limit("provider", 600) at runtime from a pipeline.
The two controls compose: max_concurrent prevents bursts from
saturating the server; RPM shapes sustained throughput. When batching
hundreds of LLM calls against a local single-GPU server, both are
worth setting — otherwise the RPM budget can be spent in a 2-second
burst that overwhelms the queue and drops requests.
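A sketch combining both controls for a local provider (the provider name, the `{provider: ...}` option key, and the `prompts` list are illustrative assumptions):

```harn
// Shape sustained throughput at 60 requests/minute...
llm_rate_limit("local", 60)

// ...and cap bursts at 4 in-flight calls while fanning out the batch.
let outcome = parallel settle prompts with { max_concurrent: 4 } { p ->
    llm_call(p, nil, {provider: "local"})
}
println("${outcome.succeeded} ok, ${outcome.failed} failed")
```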
Harn language specification
Version: 1.0 (derived from implementation, 2026-04-01)
Harn is a pipeline-oriented programming language for orchestrating AI agents. It is implemented as a Rust workspace with a lexer, parser, type checker, tree-walking VM, tree-sitter grammar, and CLI/runtime tooling. Programs consist of named pipelines containing imperative statements, expressions, and calls to registered builtins that perform I/O, LLM calls, and tool execution.
This file is the canonical language specification. The hosted docs page
docs/src/language-spec.md is generated from it by
scripts/sync_language_spec.sh.
Lexical rules
Whitespace
Spaces (' '), tabs ('\t'), and carriage returns ('\r') are insignificant and skipped
between tokens. Newlines ('\n') are significant tokens used as statement separators.
The parser skips newlines between statements but they are preserved in the token stream.
Backslash line continuation
A backslash (\) immediately before a newline joins the current line with the next.
Both the backslash and the newline are removed from the token stream, so the two
physical lines are treated as a single logical line by the lexer.
let total = 1 + 2 \
+ 3 + 4
// equivalent to: let total = 1 + 2 + 3 + 4
This is useful for breaking long expressions that do not involve a binary operator eligible for multiline continuation (see “Multiline expressions”).
Comments
// Line comment: everything until the next newline is ignored.
/* Block comment: can span multiple lines.
/* Nesting is supported. */
Still inside the outer comment. */
Block comments track nesting depth, so /* /* */ */ is valid. An unterminated block comment produces a lexer error.
Keywords
The following identifiers are reserved:
| Keyword | Token |
|---|---|
pipeline | .pipeline |
extends | .extends |
override | .overrideKw |
let | .letKw |
var | .varKw |
if | .ifKw |
else | .elseKw |
for | .forKw |
in | .inKw |
match | .matchKw |
retry | .retry |
parallel | .parallel |
defer | .defer |
return | .returnKw |
import | .importKw |
true | .trueKw |
false | .falseKw |
nil | .nilKw |
try | .tryKw |
catch | .catchKw |
throw | .throwKw |
finally | .finally |
fn | .fnKw |
spawn | .spawnKw |
while | .whileKw |
type | .typeKw |
enum | .enum |
struct | .struct |
interface | .interface |
pub | .pub |
from | .from |
to | .to |
tool | .tool |
exclusive | .exclusive |
guard | .guard |
require | .require |
each | .each |
settle | .settle |
deadline | .deadline |
yield | .yield |
mutex | .mutex |
break | .break |
continue | .continue |
select | .select |
impl | .impl |
Identifiers
An identifier starts with a letter or underscore, followed by zero or more letters, digits, or underscores:
identifier ::= [a-zA-Z_][a-zA-Z0-9_]*
Number literals
int_literal ::= digit+
float_literal ::= digit+ '.' digit+
A number followed by . where the next character is not a digit is lexed as an integer
followed by the . operator (enabling 42.method).
Duration literals
A duration literal is an integer followed immediately (no whitespace) by a time-unit suffix:
duration_literal ::= digit+ ('ms' | 's' | 'm' | 'h' | 'd' | 'w')
| Suffix | Unit | Equivalent |
|---|---|---|
ms | milliseconds | – |
s | seconds | 1000 ms |
m | minutes | 60 s |
h | hours | 60 m |
d | days | 24 h |
w | weeks | 7 d |
Duration literals evaluate to an integer number of milliseconds. They can be used anywhere an expression is expected:
sleep(500ms)
deadline 30s { /* ... */ }
let one_day = 1d // 86400000
let two_weeks = 2w // 1209600000
String literals
Single-line strings
string_literal ::= '"' (char | escape | interpolation)* '"'
escape ::= '\' ('n' | 't' | '\\' | '"' | '$')
interpolation ::= '${' expression '}'
A string cannot span multiple lines. An unescaped newline inside a string is a lexer error.
If the string contains at least one ${...} interpolation, it produces an
interpolatedString token containing a list of segments (literal text and expression
source strings). Otherwise it produces a plain stringLiteral token.
Escape sequences: \n (newline), \t (tab), \\ (backslash), \" (double quote),
\$ (dollar sign). Any other character after \ produces a literal backslash
followed by that character.
Raw string literals
raw_string_literal ::= 'r"' char* '"'
Raw strings use the r"..." prefix. No escape processing or interpolation is
performed inside a raw string – backslashes, dollar signs, and other characters
are taken literally. Raw strings cannot span multiple lines.
Raw strings are useful for regex patterns and file paths where backslashes are common:
let pattern = r"\d+\.\d+"
let path = r"C:\Users\alice\docs"
Multi-line strings
multi_line_string ::= '"""' newline? content '"""'
Triple-quoted strings can span multiple lines. The optional newline immediately after the
opening """ is consumed. Common leading whitespace is stripped from all non-empty lines.
A trailing newline before the closing """ is removed.
Multi-line strings support ${expression} interpolation with automatic indent
stripping. If at least one ${...} interpolation is present, the result is an
interpolatedString token; otherwise it is a plain stringLiteral token.
let name = "world"
let doc = """
Hello, ${name}!
Today is ${timestamp()}.
"""
Operators
Two-character operators (checked first)
| Operator | Token | Description |
|---|---|---|
== | .eq | Equality |
!= | .neq | Inequality |
&& | .and | Logical AND |
|| | .or | Logical OR |
|> | .pipe | Pipe |
?? | .nilCoal | Nil coalescing |
** | .pow | Exponentiation |
?. | .questionDot | Optional property/method chaining |
-> | .arrow | Arrow |
<= | .lte | Less than or equal |
>= | .gte | Greater than or equal |
+= | .plusAssign | Compound assignment |
-= | .minusAssign | Compound assignment |
*= | .starAssign | Compound assignment |
/= | .slashAssign | Compound assignment |
%= | .percentAssign | Compound assignment |
Single-character operators
| Operator | Token | Description |
|---|---|---|
= | .assign | Assignment |
! | .not | Logical NOT |
. | .dot | Member access |
+ | .plus | Addition / concatenation |
- | .minus | Subtraction / negation |
* | .star | Multiplication / string repetition |
/ | .slash | Division |
< | .lt | Less than |
> | .gt | Greater than |
% | .percent | Modulo |
? | .question | Ternary / Result propagation |
| | .bar | Union types |
Keyword operators
| Operator | Description |
|---|---|
in | Membership test (lists, dicts, strings, sets) |
not in | Negated membership test |
Delimiters
| Delimiter | Token |
|---|---|
{ | .lBrace |
} | .rBrace |
( | .lParen |
) | .rParen |
[ | .lBracket |
] | .rBracket |
, | .comma |
: | .colon |
; | .semicolon |
@ | .at (attribute prefix) |
Special tokens
| Token | Description |
|---|---|
.newline | Line break character |
.eof | End of input |
Grammar
The grammar is expressed in EBNF. Newlines between statements are implicit separators
(the parser skips them with skipNewlines()). The consume() helper also skips newlines
before checking the expected token.
Top-level
program ::= (top_level | NEWLINE)*
top_level ::= import_decl
| attributed_decl
| pipeline_decl
| statement
attributed_decl ::= attribute+ (pipeline_decl | fn_decl | tool_decl
| struct_decl | enum_decl | type_decl
| interface_decl | impl_block)
attribute ::= '@' IDENTIFIER ['(' attr_arg (',' attr_arg)* [','] ')']
attr_arg ::= [IDENTIFIER ':'] attr_value
attr_value ::= STRING_LITERAL | RAW_STRING | INT_LITERAL
| FLOAT_LITERAL | 'true' | 'false' | 'nil'
| IDENTIFIER | '-' INT_LITERAL | '-' FLOAT_LITERAL
import_decl ::= 'import' STRING_LITERAL
| 'import' '{' IDENTIFIER (',' IDENTIFIER)* '}'
'from' STRING_LITERAL
pipeline_decl ::= ['pub'] 'pipeline' IDENTIFIER '(' param_list ')'
['->' type_expr]
['extends' IDENTIFIER] '{' block '}'
param_list ::= (IDENTIFIER (',' IDENTIFIER)*)?
block ::= statement*
fn_decl ::= ['pub'] 'fn' IDENTIFIER [generic_params]
'(' fn_param_list ')' ['->' type_expr]
[where_clause] '{' block '}'
type_decl ::= 'type' IDENTIFIER '=' type_expr
enum_decl ::= ['pub'] 'enum' IDENTIFIER [generic_params] '{'
(enum_variant | ',' | NEWLINE)* '}'
enum_variant ::= IDENTIFIER ['(' fn_param_list ')']
struct_decl ::= ['pub'] 'struct' IDENTIFIER [generic_params]
'{' struct_field* '}'
struct_field ::= IDENTIFIER ['?'] ':' type_expr
impl_block ::= 'impl' IDENTIFIER '{' (fn_decl | NEWLINE)* '}'
interface_decl ::= 'interface' IDENTIFIER [generic_params] '{'
(interface_assoc_type | interface_method)* '}'
interface_assoc_type ::= 'type' IDENTIFIER ['=' type_expr]
interface_method ::= 'fn' IDENTIFIER [generic_params]
'(' fn_param_list ')' ['->' type_expr]
Standard library modules
Imports starting with std/ load embedded stdlib modules:
- import "std/text" — text processing (extract_paths, parse_cells, filter_test_cells, truncate_head_tail, detect_compile_error, has_got_want, format_test_errors, int_to_string, float_to_string, parse_int_or, parse_float_or)
- import "std/collections" — collection utilities (filter_nil, store_stale, store_refresh)
- import "std/agent_state" — durable session-scoped state helpers (agent_state_init, agent_state_resume, agent_state_write, agent_state_read, agent_state_list, agent_state_delete, agent_state_handoff)
These modules are compiled into the interpreter binary and require no filesystem access.
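For example, a selective import from std/text (the parse_int_or signature — value plus fallback — is inferred from the name and not confirmed by the source):

```harn
import { parse_int_or } from "std/text"

let n = parse_int_or("42", 0)     // parses successfully
let m = parse_int_or("oops", 0)   // assumed to fall back to the default
```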
Statements
statement ::= let_binding
| var_binding
| if_else
| for_in
| match_expr
| while_loop
| retry_block
| parallel_block
| parallel_each
| parallel_settle
| defer_block
| return_stmt
| throw_stmt
| override_decl
| try_catch
| fn_decl
| enum_decl
| struct_decl
| impl_block
| interface_decl
| type_decl
| guard_stmt
| require_stmt
| deadline_block
| mutex_block
| select_expr
| break_stmt
| continue_stmt
| expression_statement
let_binding ::= 'let' binding_pattern [':' type_expr] '=' expression
var_binding ::= 'var' binding_pattern [':' type_expr] '=' expression
if_else ::= 'if' expression '{' block '}'
['else' (if_else | '{' block '}')]
for_in ::= 'for' binding_pattern 'in' expression '{' block '}'
match_expr ::= 'match' expression '{' match_arm* '}'
match_arm ::= expression ['if' expression] '->' '{' block '}'
while_loop ::= 'while' expression '{' block '}'
retry_block ::= 'retry' ['(' expression ')'] expression? '{' block '}'
parallel_block ::= 'parallel' '(' expression ')' '{' [IDENTIFIER '->'] block '}'
parallel_each ::= 'parallel' 'each' expression '{' IDENTIFIER '->' block '}'
parallel_settle ::= 'parallel' 'settle' expression '{' IDENTIFIER '->' block '}'
defer_block ::= 'defer' '{' block '}'
return_stmt ::= 'return' [expression]
throw_stmt ::= 'throw' expression
override_decl ::= 'override' IDENTIFIER '(' param_list ')' '{' block '}'
try_catch ::= 'try' '{' block '}'
['catch' [('(' IDENTIFIER [':' type_expr] ')') | IDENTIFIER]
'{' block '}']
['finally' '{' block '}']
try_star_expr ::= 'try' '*' unary_expr
guard_stmt ::= 'guard' expression 'else' '{' block '}'
require_stmt ::= 'require' expression [',' expression]
deadline_block ::= 'deadline' primary '{' block '}'
mutex_block ::= 'mutex' '{' block '}'
select_expr ::= 'select' '{'
(IDENTIFIER 'from' expression '{' block '}'
| 'timeout' expression '{' block '}'
| 'default' '{' block '}')+
'}'
break_stmt ::= 'break'
continue_stmt ::= 'continue'
generic_params ::= '<' IDENTIFIER (',' IDENTIFIER)* '>'
where_clause ::= 'where' IDENTIFIER ':' IDENTIFIER
(',' IDENTIFIER ':' IDENTIFIER)*
fn_param_list ::= (fn_param (',' fn_param)*)? [',' rest_param]
| rest_param
fn_param ::= IDENTIFIER [':' type_expr] ['=' expression]
rest_param ::= '...' IDENTIFIER
A rest parameter (`...name`) must be the last parameter in the list. At call
time, any arguments beyond the positional parameters are collected into a list
and bound to the rest parameter name. If no extra arguments are provided, the
rest parameter is an empty list.
fn sum(...nums) {
var total = 0
for n in nums {
total = total + n
}
return total
}
sum(1, 2, 3) // 6
fn log(level, ...parts) {
println("[${level}] ${join(parts, " ")}")
}
log("INFO", "server", "started") // [INFO] server started
expression_statement ::= expression
| assignable '=' expression
| assignable ('+=' | '-=' | '*=' | '/=' | '%=') expression
assignable ::= IDENTIFIER
| postfix_property
| postfix_subscript
binding_pattern ::= IDENTIFIER
| '{' dict_pattern_fields '}'
| '[' list_pattern_elements ']'
dict_pattern_fields ::= dict_pattern_field (',' dict_pattern_field)*
dict_pattern_field ::= '...' IDENTIFIER
| IDENTIFIER [':' IDENTIFIER]
list_pattern_elements ::= list_pattern_element (',' list_pattern_element)*
list_pattern_element ::= '...' IDENTIFIER
| IDENTIFIER
The expression_statement rule handles both bare expressions (function calls, method calls)
and assignments. An assignment is recognized when the left-hand side is an assignable
target — an identifier, property access, or subscript — followed by = or a compound
assignment operator.
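Since property and subscript positions are assignable, compound assignment works on them as well (a small sketch; whether a let-bound dict would also permit interior mutation is not specified here, so var is used):

```harn
var counts = {a: 0}
counts["a"] += 1   // subscript target
counts.a += 2      // property target
println(counts.a)  // 3
```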
Expressions (by precedence, lowest to highest)
expression ::= pipe_expr
pipe_expr ::= range_expr ('|>' range_expr)*
range_expr ::= ternary_expr ['to' ternary_expr ['exclusive']]
ternary_expr ::= logical_or ['?' logical_or ':' logical_or]
logical_or ::= logical_and ('||' logical_and)*
logical_and ::= equality ('&&' equality)*
equality ::= comparison (('==' | '!=') comparison)*
comparison ::= additive
(('<' | '>' | '<=' | '>=' | 'in' | 'not in') additive)*
additive ::= nil_coal_expr (('+' | '-') nil_coal_expr)*
nil_coal_expr ::= multiplicative ('??' multiplicative)*
multiplicative ::= power_expr (('*' | '/' | '%') power_expr)*
power_expr ::= unary ['**' power_expr]
unary ::= ('!' | '-') unary | postfix
postfix ::= primary (member_access
| optional_member_access
| subscript_access
| slice_access
| call
| try_unwrap)*
member_access ::= '.' IDENTIFIER ['(' arg_list ')']
optional_member_access
::= '?.' IDENTIFIER ['(' arg_list ')']
subscript_access ::= '[' expression ']'
slice_access ::= '[' [expression] ':' [expression] ']'
call ::= '(' arg_list ')' (* only when postfix base is an identifier *)
try_unwrap ::= '?' (* expr? on Result *)
Primary expressions
primary ::= STRING_LITERAL
| INTERPOLATED_STRING
| INT_LITERAL
| FLOAT_LITERAL
| DURATION_LITERAL
| 'true' | 'false' | 'nil'
| IDENTIFIER
| '(' expression ')'
| list_literal
| dict_or_closure
| parallel_block
| parallel_each
| parallel_settle
| retry_block
| if_else
| match_expr
| deadline_block
| 'spawn' '{' block '}'
| 'fn' '(' fn_param_list ')' '{' block '}'
| 'try' '{' block '}'
list_literal ::= '[' (list_element (',' list_element)*)? ']'
list_element ::= '...' expression | expression
dict_or_closure ::= '{' '}'
| '{' closure_param_list '->' block '}'
| '{' dict_entries '}'
closure_param_list ::= fn_param_list
dict_entries ::= dict_entry (',' dict_entry)*
dict_entry ::= (IDENTIFIER | STRING_LITERAL | '[' expression ']')
':' expression
| '...' expression
arg_list ::= (arg_element (',' arg_element)*)?
arg_element ::= '...' expression | expression
Dict keys written as bare identifiers are converted to string literals
(e.g., {name: "x"} becomes {"name": "x"}).
Computed keys use bracket syntax: {[expr]: value}.
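Computed keys and spread entries can be combined in one literal (a sketch; the field names are illustrative):

```harn
let key = "status"
let base = {id: 7}
let merged = {...base, [key]: "ok"}   // spread base, then add a computed key
println(merged.status)                // "ok"
```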
Operator precedence table
From lowest to highest binding:
| Precedence | Operators | Associativity | Description |
|---|---|---|---|
| 1 | |> | Left | Pipe |
| 2 | ? : | Right | Ternary conditional |
| 3 | || | Left | Logical OR |
| 4 | && | Left | Logical AND |
| 5 | == != | Left | Equality |
| 6 | < > <= >= in not in | Left | Comparison / membership |
| 7 | + - | Left | Additive |
| 8 | ?? | Left | Nil coalescing |
| 9 | * / % | Left | Multiplicative |
| 10 | ** | Right | Exponentiation |
| 11 | ! - (unary) | Right (prefix) | Unary |
| 12 | . ?. [] [:] () ? | Left | Postfix |
Multiline expressions
Binary operators ||, &&, +, *, /, %, **, |> and the .
member
access operator can span multiple lines. The operator at the start of a
continuation line causes the parser to treat it as a continuation of the
previous expression rather than a new statement.
Note: - does not support multiline continuation because it is also a
unary negation prefix.
let result = items
.filter({ x -> x > 0 })
.map({ x -> x * 2 })
let msg = "hello"
+ " "
+ "world"
let ok = check_a()
&& check_b()
|| fallback()
Pipe placeholder (_)
When the right side of |> contains _ identifiers, the expression is
automatically wrapped in a closure where _ is replaced with the piped
value:
"hello world" |> split(_, " ") // desugars to: |> { __pipe -> split(__pipe, " ") }
[3, 1, 2] |> _.sort() // desugars to: |> { __pipe -> __pipe.sort() }
items |> len(_) // desugars to: |> { __pipe -> len(__pipe) }
Without _, the pipe passes the value as the first argument to a closure
or function.
Scope rules
Harn uses lexical scoping with a parent-chain environment model.
Environment
Each HarnEnvironment has:
- A
valuesdictionary mapping names toHarnValue - A
mutableset tracking which names were declared withvar - An optional
parentreference
Variable lookup
env.get(name) checks the current scope’s values first, then walks up the parent chain.
Returns nil (which becomes .nilValue) if not found anywhere.
Variable definition
- let name = value – defines name as immutable in the current scope.
- var name = value – defines name as mutable in the current scope.
Variable assignment
name = value walks up the scope chain to find the binding. If the binding is found but was
declared with let, throws HarnRuntimeError.immutableAssignment. If not found in any scope,
throws HarnRuntimeError.undefinedVariable.
Scope creation
New child scopes are created for:
- Pipeline bodies
- for loop bodies (loop variable is mutable)
- while loop iterations
- parallel, parallel each, and parallel settle task bodies (isolated interpreter per task)
- try/catch blocks (the catch body gets its own child scope with an optional error variable)
- Closure invocations (child of the captured environment, not the call site)
- block nodes
Control flow statements (if/else, match) execute in the current scope without creating a new child scope.
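One consequence of if/else sharing the enclosing scope: a let inside an if arm remains visible after the block (a sketch of the implied behavior):

```harn
if true {
    let greeting = "hi"
}
println(greeting)   // "hi" — the if body executed in the enclosing scope
```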
Destructuring patterns
Destructuring binds multiple variables from a dict or list in a single
let, var, or for-in statement.
Dict destructuring
let {name, age} = {name: "Alice", age: 30}
// name == "Alice", age == 30
Each field name in the pattern extracts the value for the matching key.
If the key is missing from the dict, the variable is bound to nil.
Default values
Pattern fields can specify default values with = expr syntax. The
default expression is evaluated when the extracted value is nil (i.e.
when the key is missing from the dict or the index is out of bounds for
a list):
let { name = "workflow", system = "" } = { name: "custom" }
// name == "custom" (key exists), system == "" (default applied)
let [a = 10, b = 20, c = 30] = [1, 2]
// a == 1, b == 2, c == 30 (default applied)
Defaults can be combined with field renaming:
let { name: displayName = "Unknown" } = {}
// displayName == "Unknown"
Default expressions are evaluated fresh each time the pattern is matched
(they are not memoized). Rest patterns (...rest) do not support
default values.
List destructuring
let [first, second, third] = [10, 20, 30]
// first == 10, second == 20, third == 30
Elements are bound positionally. If there are more bindings than elements
in the list, the excess bindings receive nil (unless a default value is
specified).
Field renaming
A dict pattern field can be renamed with key: alias syntax:
let {name: user_name} = {name: "Bob"}
// user_name == "Bob"
Rest patterns
A ...rest element collects remaining items into a new list or dict:
let [head, ...tail] = [1, 2, 3, 4]
// head == 1, tail == [2, 3, 4]
let {name, ...extras} = {name: "Carol", age: 25, role: "dev"}
// name == "Carol", extras == {age: 25, role: "dev"}
If there are no remaining items, the rest variable is bound to [] for
list patterns or {} for dict patterns. The rest element must appear
last in the pattern.
For-in destructuring
Destructuring patterns work in for-in loops to unpack each element:
let entries = [{name: "X", val: 1}, {name: "Y", val: 2}]
for {name, val} in entries {
println("${name}=${val}")
}
let pairs = [[1, 2], [3, 4]]
for [a, b] in pairs {
println("${a}+${b}")
}
Var destructuring
var destructuring creates mutable bindings that can be reassigned:
var {x, y} = {x: 1, y: 2}
x = 10
y = 20
Type errors
Destructuring a non-dict value with a dict pattern or a non-list value
with a list pattern produces a runtime error. For example,
let {a} = "hello" throws "dict destructuring requires a dict value".
Evaluation order
Program entry
1. All top-level nodes are scanned. Pipeline declarations are registered by name. Import declarations are processed (loaded and evaluated).
2. The entry pipeline is selected: the pipeline named "default" if it exists, otherwise the first pipeline in the file.
3. The entry pipeline’s body is executed.
If no pipeline is found in the file, all top-level statements are compiled and executed directly as an implicit entry point (script mode). This allows simple scripts to work without wrapping code in a pipeline block.
Pipeline parameters
If the pipeline parameter list includes task, it is bound to context.task.
If it includes project, it is bound to context.projectRoot.
A context dict is always injected with keys task, project_root, and task_type.
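A minimal sketch of the bindings (the concrete field values depend on how the host invokes the pipeline):
pipeline default(task, project) {
    // task and project are injected from the context dict
    println("task=${task} root=${project} type=${context.task_type}")
}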
Pipeline return type
Pipelines may declare a return type with the same -> TypeExpr syntax
as functions:
pipeline ghost_text(task) -> {text: string, code: int} {
return {text: "hello", code: 0}
}
The type checker verifies every return <expr> statement against the
declared type. Mismatches are reported as return type doesn't match
errors.
A declared return type is the typed contract that a host or bridge (ACP, A2A) can rely on when consuming the pipeline’s output.
Public pipelines (pub pipeline) without an explicit return type emit
the pipeline-return-type lint warning; explicit return types on the
Harn→ACP boundary will be required in a future release.
Pipeline inheritance
pipeline child(x) extends parent { ... }:
- If the child body contains override declarations, the resolved body is the parent’s body plus any non-override statements from the child. Override declarations are available for lookup by name.
- If the child body contains no override declarations, the child body entirely replaces the parent body.
Statement execution
Statements execute sequentially. The last expression value in a block is the block’s result, though this is mostly relevant for closures and parallel bodies.
Import resolution
import "path" resolves in this order:
- If path starts with std/, loads the embedded stdlib module (e.g. std/text)
- Relative to the current file’s directory; auto-adds the .harn extension
- .harn/packages/<path> directories rooted at the nearest ancestor package root (the search walks upward and stops at a .git boundary)
- Package manifest [exports] mappings under .harn/packages/<package>/harn.toml
- Package directories with a lib.harn entry point
Package manifests can publish stable module entry points without forcing consumers to import the on-disk file layout directly:
[exports]
capabilities = "runtime/capabilities.harn"
providers = "runtime/providers.harn"
With the example above, import "acme/capabilities" resolves to the
declared file inside the installed acme package.
Selective imports: import { name1, name2 } from "module" imports only
the specified functions. Functions marked pub are exported by default;
if no pub functions exist, all functions are exported.
Imported pipelines are registered for later invocation. Non-pipeline top-level statements (fn declarations, let bindings) are executed immediately.
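A minimal two-file sketch (file names are illustrative):
// math_utils.harn
pub fn double(x) { x * 2 }
fn internal_helper() { 0 }   // not pub, so not exported here

// main.harn
import { double } from "math_utils"
println(double(21))   // 42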
Static cross-module resolution
harn check, harn run, harn bench, and the LSP build a module graph
from the entry file that transitively loads every import-reachable
.harn module. The graph drives:
- Typechecker: when every import in a file resolves, call targets that are not builtins, not local declarations, not struct constructors, not callable variables, and not introduced by an import produce a call target ... is not defined or imported error (not a lint warning). This catches typos and stale imports before the VM loads.
- Linter: wildcard imports are resolved via the same graph; the undefined-function rule can now check against the actual exported name set of imported modules rather than silently disabling itself.
- LSP go-to-definition: cross-file navigation walks the graph’s definition_of lookup, so any reachable symbol (through any number of transitive imports) can be jumped to.
Resolution conservatively degrades to the pre-v0.7.12 behavior when any import in the file is unresolved (missing file, parse error, non-existent package directory), so a single broken import does not avalanche into a sea of false-positive undefined-name errors. The unresolved import itself still surfaces via the runtime loader.
Runtime values
| Type | Syntax | Description |
|---|---|---|
string | "text" | UTF-8 string |
int | 42 | Platform-width integer |
float | 3.14 | Double-precision float |
bool | true / false | Boolean |
nil | nil | Null value |
list | [1, 2, 3] | Ordered collection |
dict | {key: value} | String-keyed map |
set | set(1, 2, 3) | Unordered collection of unique values |
closure | { x -> x + 1 } | First-class function with captured environment |
enum | Color.Red | Enum variant, optionally with associated data |
struct | Point({x: 3, y: 4}) | Struct instance with named fields |
taskHandle | (from spawn) | Opaque handle to an async task |
Iter<T> | x.iter() / iter(x) | Lazy, single-pass, fused iterator. See Iterator protocol |
Pair<K, V> | pair(k, v) | Two-element value; access via .first / .second |
Truthiness
| Value | Truthy? |
|---|---|
bool(false) | No |
nil | No |
int(0) | No |
float(0) | No |
string("") | No |
list([]) | No |
dict([:]) | No |
set() (empty) | No |
| Everything else | Yes |
Equality
Values are equal if they have the same type and same contents, with these exceptions:
- int and float are compared by converting int to float
- Two closures are never equal
- Two task handles are equal if their IDs match
Comparison
Only int, float, and string support ordering (<, >, <=, >=).
Ordering comparisons between other types treat the operands as equal
(the internal comparison returns 0).
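For example:
1 < 2.5           // true: int compared as float
"apple" < "pear"  // true: lexicographic string ordering
[1] < [2]         // false: non-orderable types compare as equal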
Binary operator semantics
Arithmetic (+, -, *, /)
| Left | Right | + | - | * | / |
|---|---|---|---|---|---|
| int | int | int | int | int | int (truncating) |
| float | float | float | float | float | float |
| int | float | float | float | float | float |
| float | int | float | float | float | float |
| string | string | string (concatenation) | TypeError | TypeError | TypeError |
| string | int | TypeError | TypeError | string (repetition) | TypeError |
| int | string | TypeError | TypeError | string (repetition) | TypeError |
| list | list | list (concatenation) | TypeError | TypeError | TypeError |
| dict | dict | dict (merge, right wins) | TypeError | TypeError | TypeError |
| other | other | TypeError | TypeError | TypeError | TypeError |
Division by zero returns nil.
string * int repeats the string; negative or zero counts return "".
Type mismatches that are not listed as valid combinations above produce a
TypeError at runtime. The type checker reports these as compile-time errors
when operand types are statically known. Use to_string() or string
interpolation ("${expr}") for explicit type conversion.
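A few rows of the table in action:
7 / 2                  // 3 (int division truncates)
7.0 / 2                // 3.5
"ab" * 3               // "ababab"
[1] + [2, 3]           // [1, 2, 3]
{a: 1, b: 1} + {b: 2}  // {a: 1, b: 2} (right wins)
10 / 0                 // nil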
Modulo (%)
% is numeric-only. int % int returns int; any case involving a float
returns float. Modulo by zero behaves like division by zero and returns
nil.
Exponentiation (**)
** is numeric-only and right-associative, so 2 ** 3 ** 2 evaluates as
2 ** (3 ** 2).
- int ** int returns int for non-negative exponents that fit in u32, using wrapping integer exponentiation.
- Negative or very large integer exponents promote to float.
- Any case involving a float returns float.
- Non-numeric operands raise TypeError.
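For example:
2 ** 3 ** 2    // 512: right-associative, evaluated as 2 ** 9
2 ** 10        // 1024 (int)
2 ** -1        // 0.5 (negative exponent promotes to float)
9 ** 0.5       // 3.0 (any float operand yields float)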
Logical (&&, ||)
Short-circuit evaluation:
- &&: if left is falsy, returns false without evaluating right.
- ||: if left is truthy, returns true without evaluating right.
Nil coalescing (??)
Short-circuit: if left is not nil, returns left without evaluating right.
?? binds tighter than additive/comparison/logical operators but looser than
multiplicative operators, so xs?.count ?? 0 > 0 parses as
(xs?.count ?? 0) > 0.
Pipe (|>)
a |> f evaluates a, then:
- If f evaluates to a closure, invokes it with a as the single argument.
- If f is an identifier resolving to a builtin, calls the builtin with [a].
- If f is an identifier resolving to a closure variable, invokes it with a.
- Otherwise returns nil.
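A sketch of the first two cases (upper is assumed here as an available string builtin for illustration):
let shout = { s -> s + "!" }
"hi" |> shout              // "hi!"
"hi" |> upper |> shout     // assuming an upper builtin: "HI!"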
Ternary (? :)
condition ? trueExpr : falseExpr evaluates condition, then evaluates and returns
either trueExpr (if truthy) or falseExpr.
Ranges (to, to … exclusive)
a to b evaluates a and b (both must be integers) and produces a list of
consecutive integers. The form is inclusive by default — 1 to 5 is
[1, 2, 3, 4, 5] — because that matches how the expression reads aloud.
Add the trailing modifier exclusive to get the half-open form:
1 to 5 exclusive is [1, 2, 3, 4].
| Expression | Value | Shape |
|---|---|---|
1 to 5 | [1, 2, 3, 4, 5] | [a, b] |
1 to 5 exclusive | [1, 2, 3, 4] | [a, b) |
0 to 3 | [0, 1, 2, 3] | [a, b] |
0 to 3 exclusive | [0, 1, 2] | [a, b) |
If b < a, the result is the empty list. The range(n) / range(a, b) stdlib
builtins always produce the half-open form, for Python-compatible indexing.
Control flow
if/else
if condition {
// then
} else if other {
// else-if
} else {
// else
}
else if chains are parsed as a nested ifElse node in the else branch.
for/in
for item in iterable {
// body
}
If iterable is a list, iterates over elements. If iterable is a dict, iterates over
entries sorted by key, where each entry is {key: "...", value: ...}.
The loop variable is mutable within the loop body.
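For example, dict iteration visits entries in key order:
let scores = {beta: 2, alpha: 1}
for entry in scores {
    println("${entry.key}=${entry.value}")
}
// prints alpha=1, then beta=2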
while
while condition {
// body
}
Maximum 10,000 iterations (safety limit). Condition is re-evaluated each iteration.
match
match value {
pattern1 -> { body1 }
pattern2 if condition -> { body2 }
}
Patterns are expressions. Each pattern is evaluated and compared to the match value
using valuesEqual. An arm may include an if guard after the pattern; when
present, the arm only matches if the pattern matches and the guard expression
evaluates to a truthy value. The first matching arm executes.
If no arm matches, a runtime error is thrown (no matching arm in match expression).
This makes non-exhaustive matches a hard failure rather than a silent nil.
let x = 5
match x {
1 -> { "one" }
n if n > 3 -> { "big: ${n}" }
_ -> { "other" }
}
// -> "big: 5"
retry
retry 3 {
// body that may throw
}
Executes the body up to N times. If the body succeeds (no error), returns immediately.
If the body throws, catches the error and retries. return statements inside retry
propagate out (are not retried). After all attempts are exhausted, returns nil
(does not re-throw the last error).
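A sketch of the success-on-retry path, assuming retry yields the body’s value on success:
var attempts = 0
let result = retry 3 {
    attempts = attempts + 1
    if attempts < 3 { throw "flaky" }
    "ok"
}
// attempts == 3, result == "ok"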
Concurrency
parallel
parallel(count) { i ->
// body executed count times concurrently
}
Creates count concurrent tasks. Each task gets an isolated interpreter with a child
environment. The optional variable i is bound to the task index (0-based).
Returns a list of results in index order.
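For example:
let squares = parallel(4) { i -> i * i }
// squares == [0, 1, 4, 9]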
parallel each
parallel each list { item ->
// body for each item
}
Maps over a list concurrently. Each task gets an isolated interpreter. The variable is bound to the current list element. Returns a list of results in the original order.
parallel settle
parallel settle list { item ->
// body for each item
}
Like parallel each, but never throws. Instead, it collects both
successes and failures into a result object with fields:
| Field | Type | Description |
|---|---|---|
results | list | List of Result values (one per item), in order |
succeeded | int | Number of Ok results |
failed | int | Number of Err results |
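A sketch using the http_get builtin shown later in this guide (the URLs are illustrative):
let urls = ["https://a.example", "https://b.example"]
let outcome = parallel settle urls { url -> http_get(url) }
println("${outcome.succeeded} ok, ${outcome.failed} failed")
for r in outcome.results {
    if is_ok(r) { println(unwrap(r)) }
}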
defer
defer {
// cleanup body
}
Registers a block to run when the enclosing scope exits, whether by
normal return or by a thrown error. Multiple defer blocks in the same
scope execute in LIFO (last-registered, first-executed) order, similar
to Go’s defer. The deferred block runs in the scope where it was
declared.
fn open(path) { path }
fn close(f) { log("closing ${f}") }
let f = open("data.txt")
defer { close(f) }
// ... use f ...
// close(f) runs automatically on scope exit
spawn/await/cancel
let handle = spawn {
// async body
}
let result = await(handle)
cancel(handle)
spawn launches an async task and returns a taskHandle.
await (a built-in interpreter function, not a keyword) blocks until the task completes
and returns its result. cancel cancels the task.
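A sketch of fan-out with two concurrent LLM calls:
let a = spawn { llm_call("Summarize file A.") }
let b = spawn { llm_call("Summarize file B.") }
// both calls run concurrently; await joins them in order
let summaries = [await(a), await(b)]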
Channels
Channels provide typed message-passing between concurrent tasks.
let ch = channel("name", 10) // buffered channel with capacity 10
send(ch, "hello") // send a value
let msg = receive(ch) // blocking receive
Channel iteration
A for-in loop over a channel asynchronously receives values until the
channel is closed and drained:
let ch = channel("stream", 10)
spawn {
send(ch, "a")
send(ch, "b")
close_channel(ch)
}
for item in ch {
println(item) // prints "a", then "b"
}
// loop exits after channel is closed and all items are consumed
When the channel is closed, remaining buffered items are still delivered. The loop exits once all items have been consumed.
close_channel(ch)
Closes a channel. After closing, send returns false and no new values
are accepted. Buffered items can still be received.
try_receive(ch)
Non-blocking receive. Returns the next value from the channel, or nil if
the channel is empty (regardless of whether it is closed).
select
Multiplexes across multiple channels, executing the body of whichever channel receives a value first:
select {
msg from ch1 {
log("ch1: ${msg}")
}
msg from ch2 {
log("ch2: ${msg}")
}
}
Each case binds the received value to a variable (msg) and executes the
corresponding body. Only one case fires per select.
timeout case
fn handle(msg) { log(msg) }
let ch1 = channel(1)
select {
msg from ch1 { handle(msg) }
timeout 5s {
log("timed out")
}
}
If no channel produces a value within the duration, the timeout body runs.
default case (non-blocking)
fn handle(msg) { log(msg) }
let ch1 = channel(1)
select {
msg from ch1 { handle(msg) }
default {
log("nothing ready")
}
}
If no channel has a value immediately available, the default body runs
without blocking. timeout and default are mutually exclusive.
select() builtin
The statement form desugars to the select(ch1, ch2, ...) async builtin,
which returns {index, value, channel}. The builtin can be called directly
for dynamic channel lists.
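A sketch of the direct call form:
let ch1 = channel("a", 1)
let ch2 = channel("b", 1)
spawn { send(ch2, "hello") }
let r = select(ch1, ch2)
// r.index == 1, r.value == "hello"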
Error model
throw
throw expression
Evaluates the expression and throws it as HarnRuntimeError.thrownError(value).
Any value can be thrown (strings, dicts, etc.).
try/catch/finally
try {
// body
} catch (e) {
// handler
} finally {
// cleanup — always runs
}
If the body throws:
- A
thrownError(value):eis bound to the thrown value directly. - Any other runtime error:
eis bound to the error’slocalizedDescriptionstring.
return inside a try block propagates out of the enclosing pipeline (is not caught).
The error variable (e) is optional: catch { ... } is valid without it.
try { ... } catch (e) { ... } is also usable as an expression: the value of
the whole form is the tail value of the try body when it succeeds, and the tail
value of the catch handler when an error is caught. This means the natural
let v = try { risky() } catch (e) { fallback } binding is supported directly,
without needing to restructure through Result helpers. When a typed catch
(catch (e: AppError) { ... }) does not match the thrown error’s type, the
throw propagates past the expression unchanged — the surrounding let never
binds. See the Try-expression section below for the
Result-wrapping behavior when catch is omitted.
try* (rethrow-into-catch)
try* EXPR is a prefix operator that evaluates EXPR and rethrows any
thrown error so an enclosing try { ... } catch (e) { ... } can handle
it, instead of forcing the caller to manually convert thrown errors
into a Result and then guard is_ok / unwrap. The lowered form is:
{ let _r = try { EXPR }
guard is_ok(_r) else { throw unwrap_err(_r) }
unwrap(_r) }
On success try* EXPR evaluates to EXPR’s value with no Result
wrapping. The rethrow runs every finally block between the rethrow
site and the innermost catch handler exactly once, matching the
finally exactly-once guarantee for plain throw.
fn fetch(prompt) {
// Without try*: try { llm_call(prompt) } / guard is_ok / unwrap
let response = try* llm_call(prompt)
return parse(response)
}
let outcome = try {
let result = fetch(prompt)
Ok(result)
} catch (e: ApiError) {
Err(e.code)
}
try* requires an enclosing function (fn, tool, or pipeline) so
the rethrow has a body to live in — using it at module top level is a
compile error. The operand is parsed at unary-prefix precedence, so
try* foo.bar(1) parses as try* (foo.bar(1)) and try* a + b parses
as (try* a) + b. Use parentheses to combine try* with binary
operators on its operand. try* is distinct from the postfix ?
operator: ? early-returns Result.Err(...) from a Result-returning
function, while try* rethrows a thrown value into an enclosing catch.
finally
The finally block is optional and runs regardless of whether the try body
succeeds, throws, or the catch body re-throws. Supported forms:
try { ... } catch e { ... } finally { ... }
try { ... } finally { ... }
try { ... } catch e { ... }
return, break, and continue inside a try body with a finally block will
execute the finally block before the control flow transfer completes.
The finally block’s return value is discarded — the overall expression value comes from the try or catch body.
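For example, since the finally value is discarded:
let v = try { "result" } catch (e) { "fallback" } finally { "ignored" }
// v == "result"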
Functions and closures
fn declarations
fn name(param1, param2) {
return param1 + param2
}
Declares a named function. Equivalent to let name = { param1, param2 -> ... }.
The function captures the lexical scope at definition time.
Default parameters
Parameters may have default values using = expr. Required parameters must
come before optional (defaulted) parameters. Defaults are evaluated fresh at
each call site (not memoized at definition time). Any expression is valid as
a default — not just literals.
fn greet(name, greeting = "hello") {
log("${greeting}, ${name}!")
}
greet("world") // "hello, world!"
greet("world", "hi") // "hi, world!"
fn config(host = "localhost", port = 8080, debug = false) {
// all params optional
}
let add = { x, y = 10 -> x + y } // closures support defaults too
Explicit nil counts as a provided argument (does NOT trigger the default).
Arguments are positional — fill left to right, only trailing defaults can
be omitted.
tool declarations
tool read_file(path: string, encoding: string) -> string {
description "Read a file from the filesystem"
read_file(path)
}
tool search(query: string, file_glob: string = "*.py") -> string {
description "Search files matching an optional glob"
"..."
}
Declares a named tool and registers it with a tool registry. The body is
compiled as a closure and attached as the tool’s handler. An optional
description metadata string may appear as the first statement in the body.
Annotated tool parameter and return types are lowered into the same schema
model used by runtime validation and structured LLM I/O. Primitive types map to
their JSON Schema equivalents, while nested shapes, list<T>,
dict<string, V>, and unions produce nested schema objects. Parameters with
default values are emitted as optional schema fields (required: false) and
include their default value in the generated tool registry entry.
The result of a tool declaration is a tool registry dict (the return
value of tool_define). Multiple tool declarations accumulate into
separate registries; use tool_registry() and tool_define(...) for
multi-tool registries.
Like fn, tool may be prefixed with pub.
Deferred tool loading (defer_loading)
A tool registered through tool_define may set defer_loading: true
in its config dict. Deferred tools keep their schema out of the model’s
context on each LLM call until a tool-search call surfaces them.
fn admin(token) { log(token) }
let registry = tool_registry()
registry = tool_define(registry, "rare_admin_action", "...", {
parameters: {token: {type: "string"}},
defer_loading: true,
handler: { args -> admin(args.token) },
})
defer_loading is validated as a bool at registration time — typos like
defer_loading: "yes" raise at tool_define rather than silently
falling back to eager loading.
Deferred tools are only materialised on the wire when the call opts
into tool_search (see the llm_call option of the same name and
docs/src/llm-and-agents.md). Harn supports two native backends plus a
provider-agnostic client fallback:
- Anthropic Claude Opus/Sonnet 4.0+ and Haiku 4.5+ — Harn emits defer_loading: true on each deferred tool and prepends the tool_search_tool_{bm25,regex}_20251119 meta-tool. Anthropic keeps deferred schemas in the API prefix (prompt caching stays warm) but out of the model’s context.
- OpenAI GPT 5.4+ (Responses API) — Harn emits defer_loading: true on each deferred tool and prepends {"type": "tool_search", "mode": "hosted"} to the tools array. OpenRouter, Together, Groq, DeepSeek, Fireworks, HuggingFace, and local vLLM inherit the capability when their routed model matches gpt-5.4+.
- Everyone else (and any of the above on older models) — Harn injects a synthetic __harn_tool_search tool and runs the configured strategy (BM25, regex, semantic, or host-delegated) in-VM, promoting matching deferred tools into the next turn’s schema list.
Tool entries may also set namespace: "<label>" to group deferred tools
for the OpenAI meta-tool’s namespaces field. The field is a harmless
passthrough on Anthropic — ignored by the API, preserved in replay.
mode: "native" refuses to silently downgrade and errors when the
active (provider, model) pair is not natively capable; mode: "client"
forces the fallback everywhere; mode: "auto" (default) picks native
when available.
The per-provider / per-model capability table that gates native
tool_search, defer_loading, prompt caching, and extended thinking
is a shipped TOML matrix overridable per-project via
[[capabilities.provider.<name>]] in harn.toml. Scripts query the
effective matrix at runtime with:
let caps = provider_capabilities("anthropic", "claude-opus-4-7")
// {
// provider, model, native_tools, defer_loading,
// tool_search: [string], max_tools: int | nil,
// prompt_caching, thinking,
// }
The provider_capabilities_install(toml_src) and
provider_capabilities_clear() builtins let scripts install and
revert overrides in-process for cases where editing the manifest is
awkward (runtime proxy detection, conformance test setup). See
docs/src/llm-and-agents.md#capability-matrix--harntoml-overrides
for the rule schema.
skill declarations
pub skill deploy {
description "Deploy the application to production"
when_to_use "User says deploy/ship/release"
invocation "explicit"
paths ["infra/**", "Dockerfile"]
allowed_tools ["bash", "git"]
model "claude-opus-4-7"
effort "high"
prompt "Follow the deployment runbook."
on_activate fn() {
log("deploy skill activated")
}
on_deactivate fn() {
log("deploy skill deactivated")
}
}
Declares a named skill and registers it with a skill registry. A skill bundles metadata, tool references, MCP server lists, system-prompt fragments, and auto-activation rules into a typed unit that hosts can enumerate, select, and invoke.
Body entries are <field_name> <expression> pairs separated by
newlines. The field name is an ordinary identifier (no keyword is
reserved), and the value is any expression — string literal, list
literal, identifier reference, dict literal, or fn-literal (for
lifecycle hooks). The compiler lowers the decl to:
skill_define(skill_registry(), NAME, { field: value, ... })
and binds the resulting registry dict to NAME, parallel to how
tool NAME { ... } works.
skill_define performs light value-shape validation on known keys:
description, when_to_use, prompt, invocation, model, effort
must be strings; paths, allowed_tools, mcp must be lists.
Mistyped values fail at registration rather than at use. Unknown keys
pass through unchanged to support integrator metadata.
Like fn and tool, skill may be prefixed with pub to export it
from the module. The registry-dict value is bound as a module-level
variable.
Skill registry operations
let reg = skill_registry()
let reg = skill_define(reg, "review", {
description: "Code review",
invocation: "auto",
paths: ["src/**"],
})
skill_count(reg) // int
skill_find(reg, "review") // dict | nil
skill_list(reg) // list (closure hooks stripped)
skill_select(reg, ["review"])
skill_remove(reg, "review")
skill_describe(reg) // formatted string
skill_list strips closure-valued fields (lifecycle hooks) so its
output is safe to serialize. skill_find returns the full entry
including closures.
@acp_skill attribute
Functions can be promoted into skills via the @acp_skill attribute:
@acp_skill(name: "deploy", when_to_use: "User says deploy", invocation: "explicit")
pub fn deploy_run() { ... }
Attribute arguments populate the skill’s metadata dict, and the
annotated function is registered as the skill’s on_activate
lifecycle hook. Like @acp_tool, @acp_skill only applies to
function declarations; using it on other kinds of item is a compile
error.
Closures
let f = { x -> x * 2 }
let g = { a, b -> a + b }
First-class values. When invoked, a child environment is created from the captured environment (not the call-site environment), and parameters are bound as immutable bindings.
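For example, captures come from the definition scope, not the call site:
let factor = 3
let scale = { x -> x * factor }   // factor captured at definition
fn apply(f) {
    let factor = 100              // does not affect the closure
    return f(5)
}
apply(scale)   // 15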
Spread in function calls
The spread operator ... expands a list into individual function arguments.
It can be used in both function calls and method calls:
fn add(a, b, c) {
return a + b + c
}
let args = [1, 2, 3]
add(...args) // equivalent to add(1, 2, 3)
Spread arguments can be mixed with regular arguments:
fn add(a, b, c) { return a + b + c }
let rest = [2, 3]
add(1, ...rest) // equivalent to add(1, 2, 3)
Multiple spreads are allowed in a single call, and they can appear in any position:
fn add(a, b, c) { return a + b + c }
let first = [1]
let last = [3]
add(...first, 2, ...last) // equivalent to add(1, 2, 3)
At runtime the VM flattens all spread arguments into the argument list before invoking the function. If the total number of arguments does not match the function’s parameter count, the usual arity error is produced.
Return
return value inside a function/closure unwinds execution via
HarnRuntimeError.returnValue. The closure invocation catches this and returns the value.
return inside a pipeline terminates the pipeline.
Enums
Enums define a type with a fixed set of named variants, each optionally carrying associated data.
Enum declaration
enum Color {
Red,
Green,
Blue
}
enum Shape {
Circle(float),
Rectangle(float, float)
}
Variants without data are simple tags. Variants with data carry positional fields specified in parentheses.
Enum construction
Variants are constructed using dot syntax on the enum name:
let c = Color.Red
let s = Shape.Circle(5.0)
let r = Shape.Rectangle(3.0, 4.0)
Pattern matching on enums
Enum variants are matched using EnumName.Variant(binding) patterns in
match expressions:
match s {
Shape.Circle(radius) -> { log("circle r=${radius}") }
Shape.Rectangle(w, h) -> { log("rect ${w}x${h}") }
}
A match on an enum must be exhaustive: a missing variant is a hard
error, not a warning. Add the missing arm or end with a wildcard
_ -> { … } arm to opt out. if/else if/else chains stay intentionally
partial; opt into exhaustiveness by ending the chain with
unreachable("…").
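For example, a wildcard arm opts out of exhaustiveness checking:
enum Color { Red, Green, Blue }
fn label(c) {
    match c {
        Color.Red -> { "stop" }
        _ -> { "go" }   // covers Green and Blue
    }
}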
Built-in Result enum
Harn provides a built-in generic Result<T, E> enum with two variants:
- Result.Ok(value) – represents a successful result
- Result.Err(error) – represents an error
Shorthand constructor functions Ok(value) and Err(value) are available
as builtins, equivalent to Result.Ok(value) and Result.Err(value).
let ok = Ok(42)
let err = Err("something failed")
let typed_ok: Result<int, string> = ok
// Equivalent long form:
let ok2 = Result.Ok(42)
let err2 = Result.Err("oops")
Result helper functions
| Function | Description |
|---|---|
is_ok(r) | Returns true if r is Result.Ok |
is_err(r) | Returns true if r is Result.Err |
unwrap(r) | Returns the Ok value, throws if r is Err |
unwrap_or(r, default) | Returns the Ok value, or default if r is Err |
unwrap_err(r) | Returns the Err value, throws if r is Ok |
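For example:
let ok = Ok(5)
let err = Err("boom")
is_ok(ok)             // true
unwrap(ok)            // 5
unwrap_or(err, -1)    // -1
unwrap_err(err)       // "boom"
unwrap(err)           // throws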
The ? operator (Result propagation)
The postfix ? operator unwraps a Result.Ok value or propagates a
Result.Err from the current function. It is a postfix operator with the
same precedence as ., [], and ().
fn divide(a, b) {
if b == 0 {
return Err("division by zero")
}
return Ok(a / b)
}
fn compute(x) {
let result = divide(x, 2)? // unwraps Ok, or returns Err early
return Ok(result + 10)
}
let r1 = compute(20) // Result.Ok(20)
let r2 = compute(0) // would propagate Err from divide
The ? operator requires its operand to be a Result value. Applying ?
to a non-Result value produces a type error at runtime.
Disambiguation: when the parser sees expr?, it distinguishes between the
postfix ? (Result propagation) and the ternary ? : operator by checking
whether the token following ? could start a ternary branch expression.
Pattern matching on Result
match result {
Result.Ok(val) -> { log("success: ${val}") }
Result.Err(err) -> { log("error: ${err}") }
}
Try-expression
The try keyword used without a catch block acts as a try-expression.
It evaluates the body and wraps the result in a Result:
- If the body succeeds, returns
Result.Ok(value). - If the body throws an error, returns
Result.Err(error).
let result = try { json_parse(raw_input) }
// result is Result.Ok(parsed_data) or Result.Err("invalid JSON: ...")
The try-expression is the complement of the ? operator: try enters
Result-land by catching errors, while ? exits Result-land by propagating
errors. Together they form a complete error-handling pipeline:
fn safe_divide(a, b) {
let result = try { a / b }
return result
}
fn compute(x) {
let val = safe_divide(x, 2)? // unwrap Ok or propagate Err
return Ok(val + 10)
}
No catch or finally block is needed for the Result-wrapping form. When
catch or finally follow try, the form is a handled try/catch
expression whose value is the try or catch body’s tail value (see
try/catch/finally); only the bare try { ... } form wraps
in Result.
Result in pipelines
The ? operator works naturally in pipelines:
fn fetch_and_parse(url) {
let response = http_get(url)?
let data = json_parse(response)?
return Ok(data)
}
Structs
Structs define named record types with typed fields. Structs may also be generic.
Struct declaration
struct Point {
x: int
y: int
}
struct User {
name: string
age: int
}
struct Pair<A, B> {
first: A
second: B
}
Fields are declared with name: type syntax, one per line.
Struct construction
Struct instances can be constructed with the struct name followed by a named-field body:
let p = Point { x: 3, y: 4 }
let u = User { name: "Alice", age: 30 }
let pair: Pair<int, string> = Pair { first: 1, second: "two" }
Field access
Struct fields are accessed with dot syntax, the same as dict property access:
log(p.x) // 3
log(u.name) // "Alice"
Impl blocks
Impl blocks attach methods to a struct type.
Syntax
impl TypeName {
fn method_name(self, arg) {
// body -- self refers to the struct instance
}
}
The first parameter of each method must be self, which receives the
struct instance the method is called on.
Method calls
Methods are called using dot syntax on struct instances:
struct Point {
x: int
y: int
}
impl Point {
fn distance(self) {
return sqrt(self.x * self.x + self.y * self.y)
}
fn translate(self, dx, dy) {
return Point { x: self.x + dx, y: self.y + dy }
}
}
let p = Point { x: 3, y: 4 }
log(p.distance()) // 5.0
let p2 = p.translate(10, 20)
log(p2.x) // 13
When instance.method(args) is called, the VM looks up methods registered
by the impl block for the instance’s struct type. The instance is
automatically passed as the self argument.
Interfaces
Interfaces define a set of method signatures that a struct type must
implement. Harn uses Go-style implicit satisfaction: a struct satisfies
an interface if its impl block contains all the required methods with
compatible signatures. There is no implements keyword. Interfaces may
also declare associated types.
Interface declaration
interface Displayable {
fn display(self) -> string
}
interface Serializable {
fn serialize(self) -> string
fn byte_size(self) -> int
}
interface Collection {
type Item
fn get(self, index: int) -> Item
}
Each method signature lists parameters (the first must be self) and an
optional return type. Associated types name implementation-defined types
that methods can refer to. The body is omitted – interfaces only declare
the shape of the methods.
Implicit satisfaction
A struct satisfies an interface when its impl block has all the methods
declared by the interface, with matching parameter counts:
struct Dog {
name: string
}
impl Dog {
fn display(self) -> string {
return "Dog(${self.name})"
}
}
Dog satisfies Displayable because it has a display(self) -> string
method. No extra annotation is needed.
Using interfaces as type annotations
Interfaces can be used as parameter types. At compile time, the type checker verifies that any struct passed to such a parameter satisfies the interface:
fn show(item: Displayable) {
println(item.display())
}
let d = Dog { name: "Rex" }
show(d) // OK: Dog satisfies Displayable
Generic constraints with interfaces
Interfaces can be used as generic constraints via where clauses:
fn process<T>(item: T) where T: Displayable {
println(item.display())
}
The type checker verifies at call sites that the concrete type passed
for T satisfies Displayable. Passing a type that does not satisfy
the constraint produces a compile-time error. Generic parameters must bind
consistently across all arguments in the call, and container bindings such as
list<T> propagate the concrete element type instead of collapsing to an
unconstrained generic.
Subtyping and variance
Harn’s subtype relation is polarity-aware: each compound type has a
declared variance per slot that determines whether widening (e.g.
int <: float) is allowed in that slot, prohibited entirely, or
applied with the direction reversed.
Type parameters on user-defined generics may be marked with in or
out:
type Reader<out T> = fn() -> T // T appears only in output position
interface Sink<in T> { fn accept(v: T) -> int }
fn map<in A, out B>(value: A) -> B { ... }
| Marker | Meaning | Where T may appear |
|---|---|---|
| out T | covariant | output positions only |
| in T | contravariant | input positions only |
| (none) | invariant (default) | anywhere |
Unannotated parameters default to invariant. This is strictly
safer than implicit covariance — Box<int> does not flow into
Box<float> unless Box declares out T and the body uses T
only in covariant positions.
Built-in variance
| Constructor | Variance |
|---|---|
| iter<T> | covariant in T (read-only) |
| list<T> | invariant in T (mutable: push, index assignment) |
| dict<K, V> | invariant in both K and V (mutable) |
| Result<T, E> | covariant in both T and E |
| fn(P1, ...) -> R | parameters contravariant, return covariant |
| Shape { field: T, ... } | covariant per field (width subtyping) |
The numeric widening int <: float only applies in covariant
positions. In invariant or contravariant positions it is suppressed —
that is what makes list<int> to list<float> a type error.
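A minimal illustration of these rules (type spellings follow the table above; the error comment is illustrative, not the checker's exact wording):

```harn
let xs: list<int> = [1, 2, 3]

// Covariant, read-only position: int <: float applies per element.
let it: iter<float> = xs.iter()

// Invariant position: widening is suppressed.
let ys: list<float> = xs   // ERROR: list<int> does not flow into list<float>
```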
Function subtyping
For an actual fn(A) -> R' to be a subtype of an expected fn(B) -> R,
B must be a subtype of A (parameters are contravariant) and
R' must be a subtype of R (return is covariant). A callback that
accepts a wider input or produces a narrower output is always a valid
substitute.
let wide = fn(x: float) { return 0 }
let cb: fn(int) -> int = wide // OK: float-accepting closure stands in for int-accepting
let narrow = fn(x: int) { return 0 }
let bad: fn(float) -> int = narrow // ERROR: narrow cannot accept the float a caller may pass
Declaration-site checking
When a type parameter is marked in or out, the declaration body
is checked: each occurrence of the parameter must respect the
declared variance. Mismatches are caught at definition time, not at
each use:
type Box<out T> = fn(T) -> int
// ERROR: type parameter 'T' is declared 'out' (covariant) but appears
// in a contravariant position in type alias 'Box'
Attributes
Attributes are declarative metadata attached to a top-level declaration
with the @ prefix. They compile to side-effects (warnings, runtime
registrations) at the attached declaration, and they stack, so a single
declaration can carry several. Arguments are restricted to literal values
(strings, numbers, booleans, nil, bare identifiers) — no runtime
evaluation, no expressions.
Syntax
attribute ::= '@' IDENTIFIER ['(' attr_arg (',' attr_arg)* [','] ')']
attr_arg ::= [IDENTIFIER ':'] attr_value
attr_value ::= literal | IDENTIFIER
@deprecated(since: "0.8", use: "compute_v2")
@test
pub fn compute(x: int) -> int { return x + 1 }
Attributes attach to the immediately following declaration —
either pipeline, fn, tool, struct, enum, type, interface,
or impl. Attaching to anything else (a let, a statement) is a parse
error.
Standard attributes
@deprecated
@deprecated(since: "0.8", use: "new_fn")
pub fn old_fn() -> int { ... }
Emits a type-checker warning at every call site of the attributed function. Both arguments are optional; when present they are folded into the warning message.
| Argument | Type | Meaning |
|---|---|---|
| since | string | Version that introduced the deprecation |
| use | string | Replacement function name (rendered as a help line) |
@test
@test
pipeline test_smoke(task) { ... }
Marks a pipeline as a test entry point. The conformance / harn test
runner discovers attributed pipelines in addition to the legacy
test_* naming convention. Both forms continue to work.
@complexity(allow)
@complexity(allow)
pub fn classify(x: int) -> string {
if x == 1 { return "one" }
...
}
Suppresses the cyclomatic-complexity lint warning on the attached
function. The bare allow identifier is the only currently accepted
form. Use it for functions whose branching is intrinsic (parsers,
tier dispatchers, tree-sitter adapters) rather than accidental.
The rule fires when a function’s cyclomatic score exceeds the default
threshold of 25. Projects can override the threshold in
harn.toml:
[lint]
complexity_threshold = 15 # stricter for this project
Cyclomatic complexity counts each branching construct (if/else,
guard, match arm, for, while, try/catch, ternary,
select case, retry) and each short-circuit boolean operator
(&&, ||). Nesting, guard-vs-if, and De Morgan rewrites are all
score-preserving — the only way to reduce the count is to
extract helpers or mark the function @complexity(allow).
@acp_tool
@acp_tool(name: "edit", kind: "edit", side_effect_level: "mutation")
pub fn apply_edit(path: string, content: string) -> EditResult { ... }
Compiles to the same runtime registration as an imperative
tool_define(tool_registry(), name, "", { handler, annotations })
call, with the function bound as the tool’s handler and every named
attribute argument (other than name) lifted into the
annotations dict. name defaults to the function name when
omitted.
| Argument | Type | Meaning |
|---|---|---|
| name | string | Tool name (defaults to fn name) |
| kind | string | One of read, edit, delete, move, search, execute, think, fetch, other |
| side_effect_level | string | none, read, mutation, destructive |
Other named arguments pass through to the annotations dict unchanged,
so additional ToolAnnotations fields can be added without a parser
change.
Unknown attributes
Unknown attribute names produce a type-checker warning so that misspellings surface at check time. The attribute itself is otherwise ignored — code still compiles.
Type annotations
Harn has an optional, gradual type system. Type annotations are checked at compile time but do not affect runtime behavior. Omitting annotations is always valid.
Basic types
let name: string = "Alice"
let age: int = 30
let rate: float = 3.14
let ok: bool = true
let nothing: nil = nil
The never type
never is the bottom type — the type of expressions that never produce a
value. It is a subtype of all other types.
Expressions that infer to never:
- throw expr
- return expr
- break and continue
- A block where every control path exits
- An if/else where both branches infer to never
- Calls to unreachable()
never is removed from union types: never | string simplifies to
string. An empty union (all members removed by narrowing) becomes
never.
fn always_throws() -> never {
throw "this function never returns normally"
}
The any type
any is the top type and the explicit escape hatch. Every concrete
type is assignable to any, and any is assignable back to every
concrete type without narrowing. any disables type checking in both
directions for the values it flows through.
fn passthrough(x: any) -> any {
return x
}
let s: string = passthrough("hello") // any → string, no narrowing required
let n: int = passthrough(42)
Use any deliberately, when you want to opt out of checking — for
example, a generic dispatcher that forwards values through a runtime
protocol you don’t want to describe statically. Prefer unknown (see
below) for values from untrusted boundaries where callers should be
forced to narrow.
The unknown type
unknown is the safe top type. Every concrete type is assignable to
unknown, but an unknown value is not assignable to any
concrete type without narrowing. This is the correct annotation for
values arriving from untrusted boundaries (parsed JSON, LLM responses,
dynamic dicts) where callers should be forced to validate the shape
before use.
fn describe(v: unknown) -> string {
// Direct use of `v` as a concrete type is a compile-time error.
// Narrow via type_of/schema_is first.
if type_of(v) == "string" {
return "string: ${v.uppercase()}"
}
if type_of(v) == "int" {
return "int: ${v + 1}"
}
return "other"
}
Narrowing rules for unknown:
- type_of(x) == "T" narrows x to T on the truthy branch (where T is one of the type-of protocol names: string, int, float, bool, nil, list, dict, closure).
- schema_is(x, Shape) narrows x to Shape on the truthy branch.
- guard type_of(x) == "T" else { ... } narrows x to T in the surrounding scope after the guard.
- The falsy branch keeps unknown — subtracting one concrete type from an open top still leaves an open top. The checker still tracks which concrete type_of variants have been ruled out on the current flow path, so an exhaustive chain ending in unreachable() / throw can be validated; see the “Exhaustive narrowing on unknown” subsection of “Flow-sensitive type refinement”.
Interop between any and unknown:
- unknown is assignable to any (upward to the full escape hatch).
- any is assignable to unknown (downward — the any escape hatch lets it flow into anything, including unknown).
When to pick which:
- No annotation — “I haven’t annotated this.” Callers get no checking. Use for internal, unstable code.
- unknown — “this value could be anything; narrow before use.” Use at untrusted boundaries and in APIs that hand back open-ended data. This is the preferred annotation for LLM / JSON / dynamic dict values.
- any — “stop checking.” A last-resort escape hatch. Prefer unknown unless you have a specific reason to defeat checking bidirectionally.
Union types
let value: string | nil = nil
let id: int | string = "abc"
Union members may also be literal types — specific string or int values used to encode enum-like discriminated sets:
type Verdict = "pass" | "fail" | "unclear"
type RetryCount = 0 | 1 | 2 | 3
let v: Verdict = "pass"
Literal types are assignable to their base type ("pass" flows into
string), and a base-typed value flows into a literal union (string
into Verdict). Runtime schema_is / schema_expect guards and the
parameter-annotation runtime check reject values that violate the
literal set.
A match on a literal union must cover every literal or include a
wildcard _ arm — non-exhaustive match is a hard error.
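For example, using the Verdict alias above:

```harn
fn describe(v: Verdict) -> string {
    return match v {
        "pass" -> { "all good" }
        "fail" -> { "needs work" }
        "unclear" -> { "re-run" }
        // Removing any arm without adding a `_` arm is a compile-time error.
    }
}
```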
Tagged shape unions (discriminated unions)
A union of two or more dict shapes is a tagged shape union when the
shapes share a discriminant field. The discriminant is auto-detected:
the first field of the first variant that (a) is non-optional in every
member, (b) has a literal type (LitString or LitInt), and (c) takes
a distinct literal value per variant qualifies. The field can be named
anything — kind, type, op, t, etc. — there is no privileged
spelling.
type Msg =
{kind: "ping", ttl: int} |
{kind: "pong", latency_ms: int}
Matching on the discriminant narrows the value to the matching variant
inside each arm; the same narrowing fires under an
if obj.<tag> == "value" check and in its else branch:
fn handle(m: Msg) -> string {
match m.kind {
"ping" -> { return "ttl=" + to_string(m.ttl) }
"pong" -> { return to_string(m.latency_ms) + "ms" }
}
}
Such a match must cover every variant or include a wildcard _ arm
— non-exhaustive match is a hard error.
Distributive generic instantiation
Generic type aliases distribute over closed-union arguments. Writing
Container<A | B> is equivalent to Container<A> | Container<B> so
each instantiation independently fixes the type parameter. This is what
keeps process_action: fn("create") -> nil flowing into a list<ActionContainer<Action>> element instead of getting rejected by the
contravariance of the function-parameter slot:
type Action = "create" | "edit"
type ActionContainer<T> = {action: T, process_action: fn(T) -> nil}
ActionContainer<Action> resolves to ActionContainer<"create"> | ActionContainer<"edit">, and a literal-tagged shape on the right flows
into the matching branch.
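A sketch of how this plays out, using the aliases above (the handler bodies are illustrative):

```harn
let create_handler: ActionContainer<"create"> =
    {action: "create", process_action: fn(a) { log("creating") }}
let edit_handler: ActionContainer<"edit"> =
    {action: "edit", process_action: fn(a) { log("editing") }}

// Each element independently fixes T via the distributed union.
let handlers: list<ActionContainer<Action>> = [create_handler, edit_handler]
```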
Parameterized types
let numbers: list<int> = [1, 2, 3]
let headers: dict<string, string> = {content_type: "json"}
Structural types (shapes)
Dict shape types describe the expected fields of a dict value. The type checker verifies that dict literals have the required fields with compatible types.
let user: {name: string, age: int} = {name: "Alice", age: 30}
Optional fields use ? and need not be present:
let config: {host: string, port?: int} = {host: "localhost"}
Width subtyping: a dict with extra fields satisfies a shape that requires fewer fields.
fn greet(u: {name: string}) -> string {
return "hi ${u.name}"
}
greet({name: "Bob", age: 25}) // OK — extra field allowed
Nested shapes:
let data: {user: {name: string}, tags: list} = {user: {name: "X"}, tags: []}
Shapes are compatible with dict and dict<string, V> when all field values match V.
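For example:

```harn
fn send(headers: dict<string, string>) -> int {
    return headers.count
}

// The literal's shape {accept: string, host: string} is compatible with
// dict<string, string> because every field value matches V.
send({accept: "json", host: "example.com"})
```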
Type aliases
type Config = {model: string, max_tokens: int}
let cfg: Config = {model: "gpt-4", max_tokens: 100}
A type alias can also drive schema validation for structured LLM output
and runtime guards. schema_of(T) lowers an alias to a JSON-Schema
dict at compile time:
type GraderOut = {
verdict: "pass" | "fail" | "unclear",
summary: string,
findings: list<string>,
}
// Use the alias directly wherever a schema dict is expected.
let s = schema_of(GraderOut)
let ok = schema_is({verdict: "pass", summary: "x", findings: []}, GraderOut)
let r = llm_call(prompt, nil, {
provider: "openai",
output_schema: GraderOut, // alias in value position — compiled to schema_of(T)
schema_retries: 2,
})
The emitted schema follows canonical JSON-Schema conventions (objects
with properties/required, arrays with items, literal unions as
{type, enum}) so it is compatible with structured-output validators
and with ACP ToolAnnotations.args schemas. The compile-time lowering
applies when the alias identifier appears as:
- The argument of schema_of(T).
- The schema argument of schema_is, schema_expect, schema_parse, schema_check, is_type, json_validate.
- The value of an output_schema: entry in an llm_call options dict.
For aliases not known at compile time (e.g. let T = schema_of(Foo)
or dynamic construction), the runtime schema_of builtin passes
existing schema dicts through unchanged, so they keep working.
Generic inference via Schema<T>
Schema-driven builtins are typed with proper generics so user-defined wrappers pick up the same narrowing.
- llm_call<T>(prompt, system, options: {output_schema: Schema<T>, ...}) -> {data: T, text: string, ...}
- llm_completion<T> has the same signature.
- schema_parse<T>(value: unknown, schema: Schema<T>) -> Result<T, string>
- schema_check<T>(value: unknown, schema: Schema<T>) -> Result<T, string>
- schema_expect<T>(value: unknown, schema: Schema<T>) -> T
Schema<T> denotes a runtime schema value whose static shape is T.
In a parameter position, matching a Schema<T> against an argument
whose value resolves to a type alias (directly, via schema_of(T),
or via an inline JSON-Schema dict literal) binds the type parameter.
A user-defined wrapper such as
fn grade<T>(prompt: string, schema: Schema<T>) -> T {
let r = llm_call(prompt, nil,
{provider: "mock", output_schema: schema, output_validation: "error",
response_format: "json"})
return r.data
}
let out: GraderOut = grade("Grade this", schema_of(GraderOut))
println(out.verdict)
narrows out to GraderOut at the call site without any
schema_is / schema_expect guard, and without per-wrapper
typechecker support.
Schema<T> is a type-level construct. In value positions, the
runtime schema_of(T) builtin returns an idiomatic schema dict
whose static type is Schema<T>.
Function type annotations
Parameters and return types can be annotated:
fn add(a: int, b: int) -> int {
return a + b
}
Type checking behavior
- Annotations are optional (gradual typing). Untyped values are None and skip checks.
- int is assignable to float.
- Dict literals with string keys infer a structural shape type.
- Dict literals with computed keys infer as generic dict.
- Shape-to-shape: all required fields in the expected type must exist with compatible types.
- Shape-to-dict<K, V>: all field values must be compatible with V.
- Type errors are reported at compile time and halt execution.
Flow-sensitive type refinement
The type checker performs flow-sensitive type refinement (narrowing) on union types based on control flow conditions. Refinements are bidirectional — both the truthy and falsy paths of a condition are narrowed.
Nil checks
x != nil narrows to non-nil in the then-branch and to nil in the
else-branch. x == nil applies the inverse.
fn greet(name: string | nil) -> string {
if name != nil {
// name is `string` here
return "hello ${name}"
}
// name is `nil` here
return "hello stranger"
}
type_of() checks
type_of(x) == "typename" narrows to that type in the then-branch and
removes it from the union in the else-branch.
fn describe(x: string | int) {
if type_of(x) == "string" {
log(x) // x is `string`
} else {
log(x) // x is `int`
}
}
Truthiness
A bare identifier in condition position narrows by removing nil:
fn check(x: string | nil) {
if x {
log(x) // x is `string`
}
}
Logical operators
- a && b: combines both refinements on the truthy path.
- a || b: combines both refinements on the falsy path.
- !cond: inverts truthy and falsy refinements.
fn check(x: string | int | nil) {
if x != nil && type_of(x) == "string" {
log(x) // x is `string`
}
}
Guard statements
After a guard statement, the truthy refinements apply to the outer
scope (since the else-body must exit):
fn process(x: string | nil) {
guard x != nil else { return }
log(x) // x is `string` here
}
Early-exit narrowing
When one branch of an if/else definitely exits (via return,
throw, break, or continue), the opposite refinements apply after
the if:
fn process(x: string | nil) {
if x == nil { return }
log(x) // x is `string` — the nil path returned
}
While loops
The condition’s truthy refinements apply inside the loop body.
Ternary expressions
The condition’s refinements apply to the true and false branches respectively.
Match expressions
When matching a union-typed variable against literal patterns, the variable’s type is narrowed in each arm:
fn check(x: string | int) {
match x {
"hello" -> { log(x) } // x is `string`
42 -> { log(x) } // x is `int`
_ -> {}
}
}
Or-patterns (pat1 | pat2 -> body)
A match arm may list two or more alternative patterns separated by |;
the shared body runs when any alternative matches. Each alternative
contributes to exhaustiveness coverage independently, so an or-pattern
and a single-literal arm compose naturally:
fn verdict(v: "pass" | "fail" | "unclear") -> string {
return match v {
"pass" -> { "ok" }
"fail" | "unclear" -> { "not ok" }
}
}
Narrowing inside the or-arm refines the matched variable to the union of the alternatives’ single-literal narrowings. On a literal union this is a sub-union; on a tagged shape union it is a union of the matching shape variants:
type Msg =
{kind: "ping", ttl: int} |
{kind: "pong", latency_ms: int} |
{kind: "close", reason: string}
fn summarise(m: Msg) -> string {
return match m.kind {
"ping" | "pong" -> {
// m is narrowed to {kind:"ping",…} | {kind:"pong",…};
// the shared `kind` discriminant stays accessible.
"live:" + m.kind
}
"close" -> { "closed:" + m.reason }
}
}
Guards apply to the arm as a whole: 1 | 2 | 3 if n > 2 -> … runs the
body only when some alternative matched and the guard held. A guard
failure falls through to the next arm, exactly like a literal-pattern
arm.
Or-patterns are restricted to literal alternatives (string, int, float, bool, nil) in this release. Alternatives that introduce identifier bindings or destructuring patterns are a forward-compatible extension and currently rejected.
.has() on shapes
dict.has("key") narrows optional shape fields to required:
fn check(x: {name?: string, age: int}) {
if x.has("name") {
log(x) // x.name is now required (non-optional)
}
}
Exhaustiveness checking with unreachable()
The unreachable() builtin acts as a static exhaustiveness assertion.
When called with a variable argument, the type checker verifies that the
variable has been narrowed to never — meaning all possible types have
been handled. If not, a compile-time error reports the remaining types.
fn process(x: string | int | nil) -> string {
if type_of(x) == "string" { return "string: ${x}" }
if type_of(x) == "int" { return "int: ${x}" }
if x == nil { return "nil" }
unreachable(x) // compile-time verified: x is `never` here
}
At runtime, unreachable() throws "unreachable code was reached" as a
safety net. When called without arguments or with a non-variable argument,
no compile-time check is performed.
Exhaustive narrowing on unknown
The checker tracks the set of concrete type_of variants that have been
ruled out on the current flow path for every unknown-typed variable.
The falsy branch of type_of(v) == "T" still leaves v typed unknown
(subtracting one concrete type from an open top still leaves an open
top), but the coverage set for v gains "T".
When control flow reaches a never-returning site — unreachable(), a
throw statement, or a call to a user-defined function whose return
type is never — the checker verifies that the coverage set for every
still-unknown variable is either empty or complete. An incomplete
coverage set is treated as a failed exhaustiveness claim and triggers a
warning that names the uncovered concrete variants:
fn handle(v: unknown) -> string {
if type_of(v) == "string" { return "s:${v}" }
if type_of(v) == "int" { return "i:${v}" }
unreachable("unknown type_of variant")
// warning: `unreachable()` reached but `v: unknown` was not fully
// narrowed — uncovered concrete type(s): float, bool, nil, list,
// dict, closure
}
Covering all eight type_of variants (int, string, float, bool,
nil, list, dict, closure) silences the warning. Suppression via
an explicit fallthrough return is intentional: a plain return
doesn’t claim exhaustiveness, so partial narrowing followed by a normal
return stays silent. Reaching throw or unreachable() with no prior
type_of narrowing also stays silent — the coverage set must be
non-empty for the warning to fire, which avoids false positives on
unrelated error paths.
Reassigning the variable clears its coverage set, matching the way narrowing is already invalidated on reassignment.
Unreachable code warnings
The type checker warns about code after statements that definitely exit
(via return, throw, break, or continue), including composite
exits where both branches of an if/else exit:
fn foo(x: bool) {
if x { return 1 } else { throw "err" }
log("never reached") // warning: unreachable code
}
Reassignment invalidation
When a narrowed variable is reassigned, the narrowing is invalidated and the original declared type is restored.
Mutability
Variables declared with let are immutable. Assigning to a let
variable produces a compile-time warning (and a runtime error).
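For example:

```harn
let limit = 10
limit = 20   // compile-time warning, immutableAssignment error at runtime
```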
Runtime parameter type enforcement
In addition to compile-time checking, function parameters with type annotations
are enforced at runtime. When a function is called, the VM verifies that each
annotated parameter matches its declared type before executing the function body.
If the types do not match, a TypeError is thrown:
TypeError: parameter 'name' expected string, got int (42)
The following types are enforced at runtime: int, float, string, bool,
list, dict, set, nil, and closure. int and float are mutually
compatible (passing an int to a float parameter is allowed, and vice versa).
Union types, list<T>, dict<string, V>, and nested shapes are also checked at
runtime when the parameter annotation can be lowered into a runtime schema.
Runtime shape validation
Shape-annotated function parameters are validated at runtime. When a function
parameter has a structural type annotation (e.g., {name: string, age: int}),
the VM checks that the argument is a dict (or struct instance) with all
required fields and that each field has the expected type.
fn process(user: {name: string, age: int}) {
println("${user.name} is ${user.age}")
}
process({name: "Alice", age: 30}) // OK
process({name: "Alice"}) // Error: parameter 'user': missing field 'age' (int)
process({name: "Alice", age: "old"}) // Error: parameter 'user': field 'age' expected int, got string
Shape validation works with both plain dicts and struct instances. Extra
fields are allowed (width subtyping). Optional fields (declared with ?)
are not required to be present.
Built-in methods
String methods
| Method | Signature | Returns |
|---|---|---|
| count | .count (property) | int – character count |
| empty | .empty (property) | bool – true if empty |
| contains(sub) | string | bool |
| replace(old, new) | string, string | string |
| split(sep) | string | list of strings |
| trim() | (none) | string – whitespace stripped |
| starts_with(prefix) | string | bool |
| ends_with(suffix) | string | bool |
| lowercase() | (none) | string |
| uppercase() | (none) | string |
| substring(start, end?) | int, int? | string – character range |
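A few of these in action:

```harn
let s = "  Hello, Harn  "
let t = s.trim()                    // "Hello, Harn"
log(t.count)                        // 11
log(t.contains("Harn"))             // true
log(t.lowercase())                  // "hello, harn"
log(t.replace("Harn", "world"))     // "Hello, world"
log(t.split(", "))                  // ["Hello", "Harn"]
```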
List methods
| Method | Signature | Returns |
|---|---|---|
| count | (property) | int |
| empty | (property) | bool |
| first | (property) | value or nil |
| last | (property) | value or nil |
| map(closure) | closure(item) -> value | list |
| filter(closure) | closure(item) -> bool | list |
| reduce(init, closure) | value, closure(acc, item) -> value | value |
| find(closure) | closure(item) -> bool | value or nil |
| any(closure) | closure(item) -> bool | bool |
| all(closure) | closure(item) -> bool | bool |
| flat_map(closure) | closure(item) -> value/list | list (flattened) |
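A few of these in action:

```harn
let nums = [1, 2, 3, 4]
let doubled = nums.map(fn(n) { return n * 2 })             // [2, 4, 6, 8]
let big = nums.filter(fn(n) { return n > 2 })              // [3, 4]
let total = nums.reduce(0, fn(acc, n) { return acc + n })  // 10
log(nums.first)                          // 1
log(nums.any(fn(n) { return n > 3 }))    // true
```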
Dict methods
| Method | Signature | Returns |
|---|---|---|
| keys() | (none) | list of strings (sorted) |
| values() | (none) | list of values (sorted by key) |
| entries() | (none) | list of {key, value} dicts (sorted by key) |
| count | (property) | int |
| has(key) | string | bool |
| merge(other) | dict | dict (other wins on conflict) |
| map_values(closure) | closure(value) -> value | dict |
| filter(closure) | closure(value) -> bool | dict |
Dict property access
dict.name returns the value for key "name", or nil if absent.
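A short example combining dict methods and property access:

```harn
let cfg = {model: "gpt-4", retries: "2"}
log(cfg.model)            // "gpt-4"
log(cfg.missing)          // nil
log(cfg.keys())           // ["model", "retries"]
log(cfg.has("model"))     // true
let merged = cfg.merge({model: "gpt-5"})
log(merged.model)         // "gpt-5" (other wins on conflict)
```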
Set builtins
Sets are created with the set() builtin and are immutable – mutation
operations return a new set. Sets deduplicate values using structural
equality.
| Function | Signature | Returns |
|---|---|---|
| set(...) | values or a list | set – deduplicated |
| set_add(s, value) | set, value | set – with value added |
| set_remove(s, value) | set, value | set – with value removed |
| set_contains(s, value) | set, value | bool |
| set_union(a, b) | set, set | set – all items from both |
| set_intersect(a, b) | set, set | set – items in both |
| set_difference(a, b) | set, set | set – items in a but not b |
| to_list(s) | set | list – convert set to list |
Sets are iterable with for ... in and support len().
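For example:

```harn
let a = set(1, 2, 2, 3)             // deduplicated to three items
let b = set([3, 4])                 // a list works too
log(set_contains(a, 2))             // true
log(to_list(set_intersect(a, b)))   // [3]
let c = set_add(a, 4)               // new set; `a` is unchanged
log(len(a))                         // 3
```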
Encoding and hashing builtins
| Function | Description |
|---|---|
| base64_encode(str) | Returns the base64-encoded version of str |
| base64_decode(str) | Returns the decoded string from a base64-encoded str |
| sha256(str) | Returns the hex-encoded SHA-256 hash of str |
| md5(str) | Returns the hex-encoded MD5 hash of str |
let encoded = base64_encode("hello world") // "aGVsbG8gd29ybGQ="
let decoded = base64_decode(encoded) // "hello world"
let hash = sha256("hello") // hex string
let md5hash = md5("hello") // hex string
Regex builtins
| Function | Description |
|---|---|
| regex_match(pattern, str) | Returns match data if str matches pattern, or nil |
| regex_replace(pattern, str, replacement) | Replaces all matches of pattern in str |
| regex_captures(pattern, str) | Returns a list of capture group dicts for all matches |
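Two short sketches. The regex_replace behavior follows directly from the table; regex_match is used here only for its nil-on-no-match contract, without relying on the shape of its match data:

```harn
let masked = regex_replace("\\d+", "order 123 and 456", "#")
// masked == "order # and #"

if regex_match("^[a-z]+$", "hello") != nil {
    log("all lowercase")
}
```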
regex_captures
regex_captures(pattern, text) finds all matches of pattern in text
and returns a list of dicts, one per match. Each dict contains:
- match: the full match string
- groups: a list of positional capture group strings (from (...))
- Any named capture groups (from (?P<name>...)) as additional keys
let results = regex_captures("(\\w+)@(\\w+)", "alice@example bob@test")
// results == [
// {match: "alice@example", groups: ["alice", "example"]},
// {match: "bob@test", groups: ["bob", "test"]}
// ]
let named = regex_captures("(?P<user>\\w+):(?P<role>\\w+)", "alice:admin")
// named == [{match: "alice:admin", groups: ["alice", "admin"], user: "alice", role: "admin"}]
Returns an empty list if there are no matches.
Regex patterns are compiled and cached internally using a thread-local cache. Repeated calls with the same pattern string reuse the compiled regex, avoiding recompilation overhead. This is a performance optimization with no API-visible change.
Iterator protocol
Harn provides a lazy iterator protocol layered over the eager
collection methods. Eager methods (list.map, list.filter,
list.flat_map, dict.map_values, dict.filter, etc.) are
unchanged — they return eager collections. Lazy iteration is opt-in
via .iter() and the iter(x) builtin.
The Iter<T> type
Iter<T> is a runtime value representing a lazy, single-pass, fused
iterator over values of type T. It is produced by calling iter(x)
or x.iter() on an iterable source (list, dict, set, string,
generator, channel) or by chaining a combinator on an existing iter.
iter(x) / x.iter() on a value that is already an Iter<T> is a
no-op (returns the iter unchanged).
The Pair<K, V> type
Pair<K, V> is a two-element value used by the iterator protocol for
key/value and index/value yields.
- Construction: the pair(a, b) builtin. Combinators such as .zip and .enumerate, and dict iteration, produce pairs automatically.
- Access: .first and .second as properties.
- For-loop destructuring: for (k, v) in iter_expr { ... } binds the .first and .second of each Pair to k and v.
- Equality: structural (pair(1, 2) == pair(1, 2)).
- Printing: (a, b).
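A short example. The iteration-order comment assumes dict.iter() follows the same sorted-key order as the dict methods:

```harn
let p = pair("a", 1)
log(p.first)              // "a"
log(p.second)             // 1
log(p == pair("a", 1))    // true (structural equality)
log(p)                    // ("a", 1)

for (k, v) in {x: 1, y: 2}.iter() {
    log("${k}=${v}")      // "x=1" then "y=2"
}
```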
For-loop integration
for x in iter_expr pulls values one at a time from iter_expr until
the iter is exhausted.
for (a, b) in iter_expr destructures each yielded Pair into two
bindings. If a yielded value is not a Pair, a runtime error is
raised.
for entry in some_dict (no .iter()) continues to yield
{key, value} dicts in sorted-key order for back-compat. Only
some_dict.iter() yields Pair(key, value).
Semantics
- Lazy: combinators allocate a new Iter and perform no work; values are only produced when a sink (or for-loop) pulls them.
- Single-pass: once an item has been yielded, it cannot be re-read from the same iter.
- Fused: once exhausted, subsequent pulls continue to report exhaustion (never panic, never yield again). Re-call .iter() on the source collection to obtain a fresh iter.
- Snapshot: lifting a list/dict/set/string Rc-clones the backing storage into the iter, so mutating the source after .iter() does not affect iteration.
- String iteration: yields chars (Unicode scalar values), not graphemes.
- Printing: log(it) / to_string(it) renders <iter> or <iter (exhausted)> without draining the iter.
Combinators
Each combinator below is a method on Iter<T> and returns a new
Iter without consuming items eagerly.
| Method | Signature |
|---|---|
| .iter() | Iter<T> -> Iter<T> (no-op) |
| .map(f) | Iter<T>, (T) -> U -> Iter<U> |
| .filter(p) | Iter<T>, (T) -> bool -> Iter<T> |
| .flat_map(f) | Iter<T>, (T) -> Iter<U> \| list<U> -> Iter<U> |
| .take(n) | Iter<T>, int -> Iter<T> |
| .skip(n) | Iter<T>, int -> Iter<T> |
| .take_while(p) | Iter<T>, (T) -> bool -> Iter<T> |
| .skip_while(p) | Iter<T>, (T) -> bool -> Iter<T> |
| .zip(other) | Iter<T>, Iter<U> -> Iter<Pair<T, U>> |
| .enumerate() | Iter<T> -> Iter<Pair<int, T>> |
| .chain(other) | Iter<T>, Iter<T> -> Iter<T> |
| .chunks(n) | Iter<T>, int -> Iter<list<T>> |
| .windows(n) | Iter<T>, int -> Iter<list<T>> |
Sinks
Sinks drive the iter to completion (or until a short-circuit) and return an eager value.
| Method | Signature |
|---|---|
| .to_list() | Iter<T> -> list<T> |
| .to_set() | Iter<T> -> set<T> |
| .to_dict() | Iter<Pair<K, V>> -> dict<K, V> |
| .count() | Iter<T> -> int |
| .sum() | Iter<T> -> int \| float |
| .min() | Iter<T> -> T \| nil |
| .max() | Iter<T> -> T \| nil |
| .reduce(init, f) | Iter<T>, U, (U, T) -> U -> U |
| .first() | Iter<T> -> T \| nil |
| .last() | Iter<T> -> T \| nil |
| .any(p) | Iter<T>, (T) -> bool -> bool |
| .all(p) | Iter<T>, (T) -> bool -> bool |
| .find(p) | Iter<T>, (T) -> bool -> T \| nil |
| .for_each(f) | Iter<T>, (T) -> any -> nil |
Notes
- .to_dict() requires the iter to yield Pair values; a runtime error is raised otherwise.
- .min() / .max() return nil on an empty iter.
- .any / .all / .find short-circuit as soon as the result is determined.
- Numeric ranges (a to b, range(n)) participate in the lazy iter protocol directly; applying any combinator on a Range returns a lazy Iter without materializing the range.
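Putting combinators and sinks together, a lazy pipeline looks like the following sketch (the { x -> ... } lambda form is assumed from the handler examples elsewhere in this guide):

```harn
// Nothing runs until .to_list() pulls; take(3) stops the pull early,
// so only the first three matching values are ever computed
let squares = range(100)
    .filter({ x -> x % 2 == 0 })
    .map({ x -> x * x })
    .take(3)
    .to_list()
println(squares)
```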
Method-style builtins
If obj.method(args) is called and obj is an identifier, the interpreter first checks
for a registered builtin named "obj.method". If found, it is called with just args
(not obj). This enables namespaced builtins like experience_bank.save(...)
and negative_knowledge.record(...).
Runtime errors
| Error | Description |
|---|---|
| undefinedVariable(name) | Variable not found in any scope |
| undefinedBuiltin(name) | No registered builtin or user function with this name |
| immutableAssignment(name) | Attempted = on a let binding |
| typeMismatch(expected, got) | Type assertion failed |
| returnValue(value?) | Internal: used to implement return (not a user-facing error) |
| retryExhausted | All retry attempts failed |
| thrownError(value) | User-thrown error via throw |
Most undefinedBuiltin errors are now caught statically by the
cross-module typechecker (see Static cross-module
resolution) — harn check and
harn run refuse to start the VM when a file contains a call to a name
that is not a builtin, local declaration, struct constructor, callable
variable, or imported symbol. The runtime check remains as a backstop
for cases where imports could not be resolved at check time.
Stack traces
Runtime errors include a full call stack trace showing the chain of function calls that led to the error. The stack trace lists each frame with its function name, source file, line number, and column:
Error: division by zero
at divide (script.harn:3:5)
at compute (script.harn:8:18)
at default (script.harn:12:10)
Stack traces are captured at the point of the error before unwinding, so they accurately reflect the call chain at the time of failure.
Persistent store
Six builtins provide a persistent key-value store backed by the resolved Harn
state root (default .harn/store.json):
| Function | Description |
|---|---|
| store_get(key) | Retrieve value or nil |
| store_set(key, value) | Set key, auto-saves to disk |
| store_delete(key) | Remove key, auto-saves |
| store_list() | List all keys (sorted) |
| store_save() | Explicit flush to disk |
| store_clear() | Remove all keys, auto-saves |
The store file is created lazily on first mutation. In bridge mode, the
host can override these builtins via the bridge protocol. The state root can
be relocated with HARN_STATE_DIR.
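A minimal sketch of a cross-run counter built on these builtins:

```harn
// Survives process restarts: backed by .harn/store.json
var runs = store_get("run_count")
if runs == nil {
    runs = 0
}
store_set("run_count", runs + 1)   // auto-saves to disk
log("this is run #${runs + 1}")
```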
Checkpoint & resume
Checkpoints enable resilient, resumable pipelines. State is persisted to the
resolved Harn state root (default .harn/checkpoints/<pipeline>.json) and survives crashes, restarts, and
migration to another machine.
Core builtins
| Function | Description |
|---|---|
| checkpoint(key, value) | Save value at key; writes to disk immediately |
| checkpoint_get(key) | Retrieve saved value, or nil if absent |
| checkpoint_exists(key) | Return true if key is present (even if value is nil) |
| checkpoint_delete(key) | Remove a single key; no-op if absent |
| checkpoint_clear() | Remove all checkpoints for this pipeline |
| checkpoint_list() | Return sorted list of all checkpoint keys |
checkpoint_exists is preferable to checkpoint_get(key) == nil when nil
is a valid checkpoint value.
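For example, caching a lookup whose legitimate answer may be nil:

```harn
// nil is a valid cached result here, so presence must be tested separately
if checkpoint_exists("owner_lookup") {
    let owner = checkpoint_get("owner_lookup")   // may legitimately be nil
    log(owner)
} else {
    checkpoint("owner_lookup", nil)              // cache "looked up, found nothing"
}
```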
std/checkpoint module
import { checkpoint_stage, checkpoint_stage_retry } from "std/checkpoint"
checkpoint_stage(name, fn) -> value
Runs fn() and caches the result under name. On subsequent calls with the
same name, returns the cached result without running fn() again. This is the
primary primitive for building resumable pipelines.
import { checkpoint_stage } from "std/checkpoint"
fn fetch_dataset(url) { url }
fn clean(data) { data }
fn run_model(cleaned) { cleaned }
fn upload(result) { log(result) }
pipeline process(task) {
let url = "https://example.com/data.csv"
let data = checkpoint_stage("fetch", fn() { fetch_dataset(url) })
let cleaned = checkpoint_stage("clean", fn() { clean(data) })
let result = checkpoint_stage("process", fn() { run_model(cleaned) })
upload(result)
}
On first run all three stages execute. On a resumed run (pipeline restarted after a crash), completed stages are skipped automatically.
checkpoint_stage_retry(name, max_retries, fn) -> value
Like checkpoint_stage, but retries fn() up to max_retries times on
failure before propagating the error. Once successful, the result is cached so
retries are never needed on resume.
import { checkpoint_stage_retry } from "std/checkpoint"
fn fetch_with_timeout(url) { url }
let url = "https://example.com/data.csv"
let data = checkpoint_stage_retry("fetch", 3, fn() { fetch_with_timeout(url) })
log(data)
File location
Checkpoint files are stored at .harn/checkpoints/<pipeline>.json relative to
the project root (where harn.toml lives), or relative to the source file
directory if no project root is found. Files are plain JSON and can be copied
between machines to migrate pipeline state.
std/agent_state module
import "std/agent_state"
Provides a durable, session-scoped text/blob store rooted at a caller-supplied directory.
| Function | Notes |
|---|---|
| agent_state_init(root, options?) | Create or reopen a session root under root/<session_id>/ |
| agent_state_resume(root, session_id, options?) | Reopen an existing session; errors when absent |
| agent_state_write(handle, key, content) | Atomic temp-write plus rename |
| agent_state_read(handle, key) | Returns string or nil |
| agent_state_list(handle) | Deterministic recursive key listing |
| agent_state_delete(handle, key) | Deletes a key |
| agent_state_handoff(handle, summary) | Writes a JSON handoff envelope to __handoff.json |
Keys must be relative paths inside the session root. Absolute paths and parent-directory escapes are rejected.
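A sketch of a session round-trip (the root directory and key names are illustrative):

```harn
import "std/agent_state"

let state = agent_state_init(".harn/agent-state")
agent_state_write(state, "notes/plan.md", "1. fetch\n2. clean\n3. train")
log(agent_state_read(state, "notes/plan.md"))    // the text written above
log(agent_state_list(state))                     // deterministic key listing
agent_state_handoff(state, "plan drafted")       // writes __handoff.json
```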
Workspace manifest (harn.toml)
Harn projects declare a workspace manifest at the project root named
harn.toml. Tooling walks upward from a target .harn file looking
for the nearest ancestor manifest and stops at a .git boundary so a
stray manifest in a parent project or $HOME is never silently picked
up.
[check] — type-checker and preflight
[check]
host_capabilities_path = "./schemas/host-capabilities.json"
preflight_severity = "warning" # "error" (default), "warning", "off"
preflight_allow = ["mystery.*", "runtime.task"]
[check.host_capabilities]
project = ["ensure_enriched", "enrich"]
workspace = ["read_text", "write_text"]
- host_capabilities_path and [check.host_capabilities] declare the host-call surface that the preflight pass is allowed to assume exists at runtime. The CLI flag --host-capabilities <file> takes precedence for a single invocation. The external file is JSON or TOML with the namespaced shape { capability: [op, ...], ... }; nested { capabilities: { ... } } wrappers and per-op metadata dictionaries are accepted.
- preflight_severity downgrades preflight diagnostics to warnings or suppresses them entirely. Type-checker and lint diagnostics are unaffected; preflight failures are reported under the preflight category so IDEs and CI filters can route them separately.
- preflight_allow suppresses preflight diagnostics tagged with a specific host capability. Entries match an exact capability.operation pair, a capability.* wildcard, a bare capability name, or a blanket *.
Preflight capabilities in this section are a static check surface
for the Harn type-checker only. They are not the same thing as ACP’s
agent/client capability handshake (agentCapabilities /
clientCapabilities), which is runtime protocol-level negotiation and
lives outside harn.toml.
[workspace] — multi-file targets
[workspace]
pipelines = ["Sources/BurinCore/Resources/pipelines", "scripts"]
harn check --workspace resolves each path in pipelines relative to
the manifest directory and recursively checks every .harn file under
each. Positional targets remain additive. The manifest is discovered by
walking upward from the first positional target (or the current working
directory when none is supplied).
[exports] — stable package module entry points
[exports]
capabilities = "runtime/capabilities.harn"
providers = "runtime/providers.harn"
[exports] maps logical import suffixes to package-root-relative module
paths. After harn install, consumers import them as
"<package>/<export>" instead of coupling to the package’s internal
directory layout.
Exports are resolved after the direct .harn/packages/<path> lookup, so
packages can still expose raw file trees when they want that behavior.
[llm] — packaged provider extensions
[llm.providers.my_proxy]
base_url = "https://llm.example.com/v1"
chat_endpoint = "/chat/completions"
completion_endpoint = "/completions"
auth_style = "bearer"
auth_env = "MY_PROXY_API_KEY"
[llm.aliases]
my-fast = { id = "vendor/model-fast", provider = "my_proxy" }
The [llm] table accepts the same schema as providers.toml
(providers, aliases, inference_rules, tier_rules,
tier_defaults, model_defaults) but scopes it to the current run.
When Harn starts from a file inside a workspace, it merges:
- built-in defaults,
- the global provider file (HARN_PROVIDERS_CONFIG or ~/.config/harn/providers.toml),
- installed package [llm] tables from .harn/packages/*/harn.toml,
- the root project's [llm] table.
Later layers win on key collisions; rule lists are prepended so package and project inference/tier overrides run before the built-in defaults.
[lint] — lint configuration
[lint]
disabled = ["unused-import"]
require_file_header = false
complexity_threshold = 25
- disabled silences the listed rules for the whole project.
- require_file_header opts into the require-file-header rule, which checks that each source file begins with a /** */ HarnDoc block whose title matches the filename.
- complexity_threshold overrides the default cyclomatic-complexity warning threshold (default 25, chosen to match Clippy's cognitive_complexity default). Set lower to tighten, higher to loosen. Per-function escapes still go through @complexity(allow).
Sandbox mode
The harn run command supports sandbox flags that restrict which builtins
a program may call.
--deny
harn run --deny read_file,write_file,exec script.harn
Denies the listed builtins. Any call to a denied builtin produces a runtime error:
Permission denied: builtin 'read_file' is not allowed in sandbox mode
(use --allow read_file to permit)
--allow
harn run --allow llm,llm_stream script.harn
Allows only the listed builtins plus the core builtins (see below). All other builtins are denied.
--deny and --allow cannot be used together; specifying both is an error.
Core builtins
The following builtins are always allowed, even when using --allow:
println, print, log, type_of, to_string, to_int, to_float,
len, assert, assert_eq, assert_ne, json_parse, json_stringify
Propagation
Sandbox restrictions propagate to child VMs created by spawn,
parallel, and parallel each. A child VM inherits the same set of
denied builtins as its parent.
Test framework
Harn includes a built-in test runner invoked via harn test.
Running tests
harn test path/to/tests/ # run all test files in a directory
harn test path/to/test_file.harn # run tests in a single file
Test discovery
The test runner scans .harn files for pipelines whose names start with
test_. Each such pipeline is executed independently. A test passes if
it completes without error; it fails if it throws or an assertion fails.
pipeline test_addition() {
assert_eq(1 + 1, 2)
}
pipeline test_string_concat() {
let result = "hello" + " " + "world"
assert_eq(result, "hello world")
}
Assertions
Three assertion builtins are available. They can be called anywhere, but they are intended for test pipelines and the linter warns on non-test use:
| Function | Description |
|---|---|
| assert(condition) | Throws if condition is falsy |
| assert_eq(a, b) | Throws if a != b, showing both values |
| assert_ne(a, b) | Throws if a == b, showing both values |
Mock LLM provider
During harn test, the HARN_LLM_PROVIDER environment variable is
automatically set to "mock" unless explicitly overridden. The mock
provider returns deterministic placeholder responses, allowing tests
that call llm or llm_stream to run without API keys.
CLI options
| Flag | Description |
|---|---|
| --filter <pattern> | Only run tests whose names contain <pattern> |
| --verbose / -v | Show per-test timing and detailed failures |
| --timing | Show per-test timing and summary statistics |
| --timeout <ms> | Per-test timeout in milliseconds (default 30000) |
| --parallel | Run test files concurrently |
| --junit <path> | Write JUnit XML report to <path> |
| --record | Record LLM responses to .harn-fixtures/ |
| --replay | Replay LLM responses from .harn-fixtures/ |
Environment variables
The following environment variables configure runtime behavior:
| Variable | Description |
|---|---|
| HARN_LLM_PROVIDER | Override the default LLM provider. Any configured provider is accepted. Built-in names include anthropic (default), openai, openrouter, huggingface, ollama, local, and mock. |
| HARN_LLM_TIMEOUT | LLM request timeout in seconds. Default 120. |
| HARN_STATE_DIR | Override the runtime state root used for store, checkpoint, metadata, and default worktree state. Relative values resolve from the active project/runtime root. |
| HARN_RUN_DIR | Override the default persisted run directory. Relative values resolve from the active project/runtime root. |
| HARN_WORKTREE_DIR | Override the default worker worktree root. Relative values resolve from the active project/runtime root. |
| ANTHROPIC_API_KEY | API key for the Anthropic provider. |
| OPENAI_API_KEY | API key for the OpenAI provider. |
| OPENROUTER_API_KEY | API key for the OpenRouter provider. |
| HF_TOKEN | API key for the HuggingFace provider. |
| HUGGINGFACE_API_KEY | Alternate API key name for the HuggingFace provider. |
| OLLAMA_HOST | Override the Ollama host. Default http://localhost:11434. |
| LOCAL_LLM_BASE_URL | Base URL for a local OpenAI-compatible server. Default http://localhost:8000. |
| LOCAL_LLM_MODEL | Default model ID for the local OpenAI-compatible provider. |
Known limitations and future work
The following are known limitations in the current implementation that may be addressed in future versions.
Type system
- Definition-site generic checking: Inside a generic function body, type parameters are treated as compatible with any type. The checker does not yet restrict method calls on T to only those declared in the where clause interface.
- No runtime interface enforcement: Interface satisfaction is checked at compile time only. Passing an untyped value to an interface-typed parameter is not caught at runtime.
Runtime
Syntax limitations
- No impl Interface for Type syntax: Interface satisfaction is always implicit. There is no way to explicitly declare that a type implements an interface.
LLM calls and agent loops
Harn has built-in support for calling language models and running persistent agent loops. No libraries or SDKs needed.
Providers
Harn ships with built-in configs for Anthropic, OpenAI, OpenRouter, Ollama, HuggingFace, and a local OpenAI-compatible server. Set the appropriate environment variable to authenticate or point Harn at a local endpoint:
| Provider | Environment variable | Default model |
|---|---|---|
| Anthropic (default) | ANTHROPIC_API_KEY | claude-sonnet-4-20250514 |
| OpenAI | OPENAI_API_KEY | gpt-4o |
| OpenRouter | OPENROUTER_API_KEY | anthropic/claude-sonnet-4-20250514 |
| HuggingFace | HF_TOKEN or HUGGINGFACE_API_KEY | explicit model |
| Ollama | OLLAMA_HOST (optional) | llama3.2 |
| Local server | LOCAL_LLM_BASE_URL | LOCAL_LLM_MODEL or explicit model |
Ollama runs locally and doesn’t require an API key. The default host is
http://localhost:11434.
For a generic OpenAI-compatible local server, set LOCAL_LLM_BASE_URL to
something like http://192.168.86.250:8000 and either pass
{provider: "local", model: "qwen2.5-coder-32b"} or set
LOCAL_LLM_MODEL=qwen2.5-coder-32b.
llm_call
Make a single LLM request. Harn normalizes provider responses into a canonical dict so product code does not need to parse provider-native message shapes.
let result = llm_call("What is 2 + 2?")
println(result.text)
With a system message:
let result = llm_call(
"Explain quicksort",
"You are a computer science teacher. Be concise."
)
println(result.text)
With options:
let result = llm_call(
"Translate to French: Hello, world",
"You are a translator.",
{
provider: "openai",
model: "gpt-4o",
max_tokens: 1024
}
)
println(result.text)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | yes | The user message |
| system | string | no | System message for the model |
| options | dict | no | Provider, model, and generation settings |
Return value
llm_call always returns a dict:
| Field | Type | Description |
|---|---|---|
| text | string | The text content of the response |
| visible_text | string | Human-visible assistant output |
| model | string | The model used |
| provider | string | Canonical provider identifier |
| input_tokens | int | Input/prompt token count |
| output_tokens | int | Output/completion token count |
| cache_read_tokens | int | Prompt tokens served from provider-side cache when supported |
| cache_write_tokens | int | Prompt tokens written into provider-side cache when supported |
| data | any | Parsed JSON (when response_format: "json") |
| tool_calls | list | Tool calls (when model uses tools) |
| thinking | string | Reasoning trace (when thinking is enabled) |
| private_reasoning | string | Provider reasoning metadata kept separate from visible text |
| blocks | list | Canonical structured content blocks across providers |
| stop_reason | string | "end_turn", "max_tokens", "tool_use", "stop_sequence" |
| transcript | dict | Transcript carrying message history, events, summary, metadata, and id |
Options dict
| Key | Type | Default | Description |
|---|---|---|---|
| provider | string | "anthropic" | Any configured provider. Built-in names include "anthropic", "openai", "openrouter", "huggingface", "ollama", and "local" |
| model | string | varies by provider | Model identifier |
| max_tokens | int | 16384 | Maximum tokens in the response |
| temperature | float | provider default | Sampling temperature (0.0-2.0) |
| top_p | float | nil | Nucleus sampling |
| top_k | int | nil | Top-K sampling (Anthropic/Ollama only) |
| stop | list | nil | Stop sequences |
| seed | int | nil | Reproducibility seed (OpenAI/Ollama) |
| frequency_penalty | float | nil | Frequency penalty (OpenAI only) |
| presence_penalty | float | nil | Presence penalty (OpenAI only) |
| response_format | string | "text" | "text" or "json" |
| schema | dict | nil | JSON Schema, OpenAPI Schema Object, or canonical Harn schema dict for structured output |
| thinking | bool/dict | nil | Enable provider reasoning. true or {budget_tokens: N}. Anthropic maps this to thinking/adaptive thinking, OpenRouter maps it to reasoning, and Ollama maps it to think. |
| tools | list | nil | Tool definitions |
| tool_choice | string/dict | "auto" | "auto", "none", "required", or {name: "tool"} |
| tool_search | bool/string/dict | nil | Progressive tool disclosure. See Tool Vault |
| cache | bool | false | Enable prompt caching (Anthropic) |
| stream | bool | true | Use streaming SSE transport. Set false for synchronous request/response. Env: HARN_LLM_STREAM |
| timeout | int | 120 | Request timeout in seconds |
| messages | list | nil | Full message list (overrides prompt) |
| transcript | dict | nil | Continue from a previous transcript; prompt is appended as the next user turn |
| model_tier | string | nil | Resolve a configured tier alias such as "small", "mid", or "frontier" |
Provider-specific overrides can be passed as sub-dicts:
let result = llm_call("hello", nil, {
provider: "ollama",
ollama: {num_ctx: 32768}
})
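Structured output follows the same pattern: with response_format: "json" (optionally constrained by schema), the parsed value lands in result.data. A sketch:

```harn
let result = llm_call(
    "Extract the city and country from: 'I flew to Lyon, France.'",
    "Reply with JSON only.",
    {
        response_format: "json",
        schema: {
            type: "object",
            properties: {
                city: {type: "string"},
                country: {type: "string"}
            }
        }
    }
)
println(result.data.city)   // parsed JSON, no manual json_parse needed
```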
Tool Vault
Harn’s Tool Vault is the progressive-tool-disclosure primitive: tool definitions that stay out of the model’s context until they’re surfaced by a search call. This keeps context cheap for agents with hundreds of tools (coding agents, MCP-heavy setups) without requiring the integrator to hand-filter tools per turn.
Per-tool flag: defer_loading
Any tool registered via tool_define (or the tool { … } language
form) can opt out of eager loading:
var registry = tool_registry()
registry = tool_define(registry, "deploy", "Deploy to production", {
parameters: {env: {type: "string"}},
defer_loading: true,
handler: { args -> shell("deploy " + args.env) },
})
Deferred tools never appear in the model’s context unless a tool-search call surfaces them. They are sent to the provider (so prompt caching stays warm on Anthropic — the schemas live in the API prefix but not the model’s context).
Call-level option: tool_search
Turning progressive disclosure on is one option away:
let r = llm_call(prompt, sys, {
provider: "anthropic",
model: "claude-opus-4-7",
tools: registry,
tool_search: "bm25",
})
Accepted shapes:
| Shape | Meaning |
|---|---|
| tool_search: true | Default: bm25 variant, mode auto. |
| tool_search: "bm25" | Natural-language queries. |
| tool_search: "regex" | Python-regex queries. |
| tool_search: false | Explicit off (same as omitting). |
| tool_search: {variant, mode, strategy, always_loaded, budget_tokens, name, include_stub_listing} | Explicit dict form. |
mode options:
"auto"(default) — use native if the provider supports it, otherwise fall back to the client-executed path (no error)."native"— force the provider’s native mechanism. Errors if unsupported."client"— force the client-executed path even on providers with native support. Useful for A/B-ing strategies or pinning behavior across heterogeneous provider fleets.
Provider support
| Provider | Native tool_search | Variants / modes |
|---|---|---|
| Anthropic Claude Opus/Sonnet 4.0+, Haiku 4.5+ | ✓ | bm25, regex |
| Anthropic 3.x or earlier 4.x Haiku | ✗ (uses client fallback) | — |
| OpenAI Responses API — GPT 5.4+ | ✓ | hosted (default), client |
| OpenAI pre-5.4 (gpt-4o, gpt-4.1, …) | ✗ | client fallback works today |
| OpenRouter / Together / Groq / DeepSeek / Fireworks / HuggingFace / local | ✓ when routed model matches gpt-5.4+ upstream | hosted forwarded; escape hatch below for proxies |
| Gemini, Ollama, mock (default model) | ✗ | client fallback works today |
The OpenAI native path (harn#71) emits a flat {"type": "tool_search", "mode": "hosted"} meta-tool at the front of the tools array, alongside
defer_loading: true on the wrapper of each user tool. The server runs
the search and replies with tool_search_call / tool_search_output
entries that Harn parses into the same transcript event shape as the
Anthropic path (replays are indistinguishable across providers).
Namespace grouping
OpenAI’s tool_search can group deferred tools into namespaces; pass
namespace: "<label>" on tool_define(...) to tag a tool. Harn collects
the distinct set into the meta-tool’s namespaces field. Anthropic
ignores the label — harmless passthrough for replay fidelity.
tool_define(registry, "deploy_api", "Deploy the API", {
parameters: {env: {type: "string"}},
defer_loading: true,
namespace: "ops",
handler: { args -> shell("deploy api " + args.env) },
})
Escape hatch for proxied OpenAI-compat endpoints
Self-hosted routers and enterprise gateways sometimes advertise a model
ID Harn cannot parse (my-internal-gpt-clone-v2) yet forward the OpenAI
Responses payload unchanged. Opt into the hosted path with:
llm_call(prompt, sys, {
provider: "openrouter",
model: "my-custom/gpt-forward",
tools: registry,
tool_search: {mode: "native"},
openrouter: {force_native_tool_search: true},
})
The override is keyed by the provider name (the same dict you’d use for any provider-specific knob).
Capability matrix + harn.toml overrides
The provider support table above is not hard-coded: it’s the output
of a shipped data file (crates/harn-vm/src/llm/capabilities.toml)
matched against the (provider, model) pair at call time. Scripts
can query the effective capability surface without carrying
vendor-specific knowledge:
let caps = provider_capabilities("anthropic", "claude-opus-4-7")
// {
// native_tools: true, defer_loading: true,
// tool_search: ["bm25", "regex"], max_tools: 10000,
// prompt_caching: true, thinking: true,
// }
if "bm25" in caps.tool_search {
llm_call(prompt, sys, {
tools: registry,
tool_search: "bm25",
})
}
Projects override or extend the shipped table in harn.toml — useful
for flagging a proxied OpenAI-compat endpoint as supporting
tool_search ahead of a Harn release that knows about it natively:
# harn.toml
[[capabilities.provider.my-proxy]]
model_match = "*"
native_tools = true
defer_loading = true
tool_search = ["hosted"]
prompt_caching = true
# Shadow the built-in Anthropic rule to force client-executed
# fallback on every Opus call (e.g. while a regional outage is
# active):
[[capabilities.provider.anthropic]]
model_match = "claude-opus-*"
native_tools = true
defer_loading = false
tool_search = []
prompt_caching = true
thinking = true
Each [[capabilities.provider.<name>]] entry accepts these fields:
| Field | Type | Purpose |
|---|---|---|
| model_match | glob string | Required. Matched against the lowercased model ID. Leading/trailing * or a single middle * supported. |
| version_min | [major, minor] | Narrows the match to a parseable version (Anthropic / OpenAI extractors). Rules where version_min is set but the model ID won't parse are skipped. |
| native_tools | bool | Whether the provider accepts a native tool-call wire shape. |
| defer_loading | bool | Whether defer_loading: true on tool definitions is honored server-side. |
| tool_search | list of strings | Native tool_search variants, preferred first. Anthropic: ["bm25", "regex"]. OpenAI: ["hosted", "client"]. Empty = no native support (client fallback only). |
| max_tools | int | Cap on tool count. harn lint will warn if a registry exceeds the smallest cap any active provider advertises. |
| prompt_caching | bool | cache_control blocks honored. |
| thinking | bool | Extended or adaptive thinking available. |
First match wins. User rules for a given provider are consulted before the shipped rules — so the order inside the TOML file matters (place more specific patterns above wildcards).
[provider_family] declares sibling providers that inherit rules
from a canonical family. The shipped table routes OpenRouter,
Together, Groq, DeepSeek, Fireworks, HuggingFace, and local vLLM to
[[provider.openai]] by default.
Two programmatic helpers mirror the harn.toml path for cases where
editing the manifest is awkward:
- provider_capabilities_install(toml_src) — install overrides from a TOML string (same layout as capabilities.toml, without the capabilities. prefix: just [[provider.<name>]]). Useful when a script detects a proxied endpoint at runtime.
- provider_capabilities_clear() — revert to shipped defaults.
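For example, installing a runtime override for a proxy discovered mid-script (the provider name, model ID, and escaped-string TOML literal are illustrative):

```harn
// Same layout as capabilities.toml, minus the capabilities. prefix
provider_capabilities_install("[[provider.my-proxy]]\nmodel_match = \"*\"\nnative_tools = true\ntool_search = [\"hosted\"]")
log(provider_capabilities("my-proxy", "any-model").tool_search)
// ...and back to the shipped table when done
provider_capabilities_clear()
```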
Packaged provider adapters via [llm]
Projects and installed packages can also contribute provider definitions,
aliases, inference rules, and model defaults directly from harn.toml
under [llm]. The schema matches providers.toml, but the merge is
scoped to the current run:
[llm.providers.my_proxy]
base_url = "https://llm.example.com/v1"
chat_endpoint = "/chat/completions"
completion_endpoint = "/completions"
auth_style = "bearer"
auth_env = "MY_PROXY_API_KEY"
[llm.aliases]
my-fast = { id = "vendor/model-fast", provider = "my_proxy" }
Load order is:
- built-in defaults
HARN_PROVIDERS_CONFIGwhen set, otherwise~/.config/harn/providers.toml- installed package
[llm]tables from.harn/packages/*/harn.toml - the root project’s
[llm]table
That gives packages a stable, declarative way to ship provider adapters and model aliases without editing Rust-side registration code.
Client-executed fallback
On providers without native defer_loading, Harn falls back to an
in-VM execution path (landed in harn#70).
The fallback is identical to the native path from a script’s point of
view: same option surface, same transcript events, same promotion
behavior across turns. Internally, Harn injects a synthetic tool
called __harn_tool_search — when the model calls it, the loop runs
the configured strategy against the deferred-tool index, promotes the
matching tools into the next turn’s schema list, and emits the
same tool_search_query / tool_search_result transcript events as
native mode (tagged mode: "client" in metadata so replays can
distinguish paths).
Strategies (client mode only):
| strategy | Runs in | Notes |
|---|---|---|
| "bm25" (default) | VM | Tokenized BM25 over name + description + param text. Matches open_file from query open file. |
| "regex" | VM | Case-insensitive Rust-regex over the same corpus. No backreferences, no lookaround. |
| "semantic" | Host (bridge) | Delegated to the host via tool_search/query so integrators can wire embeddings without Harn pulling in ML crates. |
| "host" | Host (bridge) | Pure host-side; the VM round-trips the query and promotes whatever the host returns. |
Extra client-mode knobs:
- budget_tokens: N — soft cap on the total token footprint of promoted tool schemas. Oldest-first eviction when exceeded. Omit to keep every promoted schema for the life of the call.
- name: "find_tool" — override the synthetic tool's name. Handy when a skill's vocabulary suggests a more natural verb (discover, lookup, …).
- always_loaded: ["read_file", "run"] — pin tool names to the eager set even if defer_loading: true is set on their registry entries.
- include_stub_listing: true — append a short list of deferred tool names + one-line descriptions to the tool-contract prompt so the model can eyeball what's available without a search call. Off by default to match Anthropic's native ergonomics.
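All four knobs combined in one call (tool names and values are illustrative):

```harn
let r = llm_call(prompt, sys, {
    tools: registry,
    tool_search: {
        mode: "client",
        strategy: "bm25",
        name: "find_tool",                    // rename the synthetic search tool
        always_loaded: ["read_file", "run"],  // pinned to the eager set
        budget_tokens: 2000,                  // evict oldest promoted schemas past this
        include_stub_listing: true,           // list deferred names in the contract prompt
    },
})
```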
Pre-flight validation
- At least one user tool must be non-deferred. Harn errors before the API call is made, matching Anthropic’s documented 400.
- defer_loading must be a bool — typos like defer_loading: "yes" error at tool_define time rather than silently falling back to the "no defer" default.
Transcript events
Every native tool-search round-trip emits structured events in the run record:
- tool_search_query — the search tool's invocation (input query, search-tool id).
- tool_search_result — the references returned by the server (which deferred tools got promoted on this turn).
These are stable shapes; replay / eval can reconstruct which tools were available when without re-running the call.
llm_completion
Use llm_completion for text continuation and fill-in-the-middle generation.
It lives at the same abstraction level as llm_call.
let result = llm_completion("let total = ", ";", nil, {
provider: "ollama",
model_tier: "small"
})
println(result.text)
agent_loop
Run an agent that keeps working until it’s done. The agent maintains
conversation history across turns and loops until it outputs the
##DONE## sentinel. Returns a dict with canonical visible text,
tool usage, transcript state, and any deferred queued human messages.
let result = agent_loop(
"Write a function that sorts a list, then write tests for it.",
"You are a senior engineer.",
{persistent: true}
)
println(result.text) // the accumulated output
println(result.status) // "done", "stuck", "budget_exhausted", "idle", "watchdog", or "failed"
println(result.iterations) // number of LLM round-trips
How it works
- Sends the prompt to the model
- Reads the response
- If persistent: true:
  - Checks if the response contains ##DONE##
  - If yes, stops and returns the accumulated output
  - If no, sends a nudge message asking the agent to continue
  - Repeats until done or limits are hit
- If persistent: false (default): returns after the first response
agent_loop return value
agent_loop returns a dict with the following fields:
| Field | Type | Description |
|---|---|---|
status | string | Terminal state: "done" (natural completion), "stuck" (exceeded max_nudges consecutive text-only turns), "budget_exhausted" (hit max_iterations without any explicit break), "idle" (daemon yielded with no remaining wake source), "watchdog" (daemon idle-wait tripped the idle_watchdog_attempts limit), or "failed" (require_successful_tools not satisfied). |
text | string | Accumulated text output from all iterations |
visible_text | string | Human-visible accumulated output |
iterations | int | Number of LLM round-trips |
duration_ms | int | Total wall-clock time in milliseconds |
tools_used | list | Names of tools that were called |
rejected_tools | list | Tools rejected by policy/host ceiling |
deferred_user_messages | list | Queued human messages deferred until agent yield/completion |
daemon_state | string | Final daemon lifecycle state; mirrors status for daemon loops. |
daemon_snapshot_path | string or nil | Persisted snapshot path when daemon persistence is enabled |
transcript | dict | Transcript of the full conversation state |
agent_loop options
Same as llm_call, plus additional options:
| Key | Type | Default | Description |
|---|---|---|---|
persistent | bool | false | Keep looping until ##DONE## |
max_iterations | int | 50 | Maximum number of LLM round-trips |
max_nudges | int | 3 | Max consecutive text-only responses before stopping |
nudge | string | see below | Custom message to send when nudging the agent |
tool_retries | int | 0 | Number of retry attempts for failed tool calls |
tool_backoff_ms | int | 1000 | Base backoff delay in ms for tool retries (doubles each attempt) |
policy | dict | nil | Capability ceiling applied to this agent loop |
daemon | bool | false | Idle instead of terminating after text-only turns |
persist_path | string | nil | Persist daemon snapshots to this path on idle/finalize |
resume_path | string | nil | Restore daemon state from a previously persisted snapshot |
wake_interval_ms | int | nil | Fixed timer wake interval for daemon loops |
watch_paths | list/string | nil | Files to poll for mtime changes while idle |
consolidate_on_idle | bool | false | Run transcript auto-compaction before persisting an idle daemon snapshot |
idle_watchdog_attempts | int | nil (disabled) | Max consecutive idle-wait ticks that may return no wake reason before the daemon terminates with status = "watchdog". Guards against a misconfigured daemon (e.g. bridge never signals, no timer, no watch paths) hanging the session silently |
context_callback | closure | nil | Per-turn hook that can rewrite prompt-visible messages and/or the effective system prompt before the next LLM call |
context_filter | closure | nil | Alias for context_callback |
post_turn_callback | closure | nil | Hook called after each tool turn. Receives turn metadata and may inject a message, request an immediate stage stop, or both |
turn_policy | dict | nil | Turn-shape policy for action stages. Supports require_action_or_yield: bool, allow_done_sentinel: bool (default true; set to false in workflow-owned action stages so nudges stop advertising the done sentinel), and max_prose_chars: int |
stop_after_successful_tools | list<string> | nil | Stop after a tool-calling turn whose successful results include one of these tool names. Useful for workflow-owned verify loops such as ["edit", "scaffold"] |
require_successful_tools | list<string> | nil | Mark the loop status = "failed" unless at least one of these tool names succeeds at some point during the interaction. Keeps action stages honest when every attempted effect was rejected or errored |
loop_detect_warn | int | 2 | Consecutive identical tool calls before appending a redirection hint |
loop_detect_block | int | 3 | Consecutive identical tool calls before replacing the result with a hard redirect |
loop_detect_skip | int | 4 | Consecutive identical tool calls before skipping execution entirely |
skills | skill_registry or list | nil | Skill registry exposed to the match-and-activate lifecycle phase. See Skills lifecycle |
skill_match | dict | {strategy: "metadata", top_n: 1, sticky: true} | Match configuration — strategy ("metadata" | "host" | "embedding"), top_n, sticky |
working_files | list|string | [] | Paths that feed paths: glob auto-trigger in the metadata matcher and ride along as a hint to host-delegated matchers |
When daemon: true, the loop transitions active -> idle -> active instead of
terminating on a text-only turn. Idle daemons can be woken by queued human
messages, agent/resume bridge notifications, wake_interval_ms, or watched
file changes from watch_paths.
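A minimal daemon-mode sketch; the paths and intervals are illustrative:

```harn
let result = agent_loop(
    "Watch the repo and summarize changes.",
    "You are a repo watcher.",
    {
        daemon: true,
        wake_interval_ms: 60000,          // timer wake every minute
        watch_paths: ["src/"],            // also wake on mtime changes
        idle_watchdog_attempts: 10,       // terminate if idle waits keep returning no wake reason
        persist_path: ".harn/daemons/watcher"
    }
)
println(result.status)   // e.g. "idle" or "watchdog", depending on how the loop ended
```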
Default nudge message:
You have not output ##DONE## yet — the task is not complete. Use your tools to continue working. Only output ##DONE## when the task is fully complete and verified.
When persistent: true, the system prompt is automatically extended with:
IMPORTANT: You MUST keep working until the task is complete. Do NOT stop to explain or summarize — take action. Output ##DONE## only when the task is fully complete and verified.
Daemon stdlib wrappers
When you want a first-class daemon handle instead of wiring agent_loop
options manually, use the daemon builtins:
- `daemon_spawn(config)`
- `daemon_trigger(handle, event)`
- `daemon_snapshot(handle)`
- `daemon_stop(handle)`
- `daemon_resume(path)`
daemon_spawn accepts the same daemon-related options that agent_loop
understands (wake_interval_ms, watch_paths, idle_watchdog_attempts,
etc.) plus event_queue_capacity, which bounds the durable FIFO trigger queue
used by daemon_trigger.
let daemon = daemon_spawn({
name: "reviewer",
task: "Watch for trigger events and summarize the latest change.",
system: "You are a careful reviewer.",
provider: "mock",
persist_path: ".harn/daemons/reviewer",
event_queue_capacity: 256,
})
daemon_trigger(daemon, {kind: "file_changed", path: "src/lib.rs"})
let snap = daemon_snapshot(daemon)
println(snap.pending_event_count)
daemon_stop(daemon)
let resumed = daemon_resume(".harn/daemons/reviewer")
These wrappers preserve queued trigger events across stop/resume. If a daemon is stopped while a trigger is mid-flight, that trigger is re-queued and replayed on resume instead of being lost.
Context callback
context_callback lets you keep the full recorded transcript for replay and
debugging while showing the model a smaller or rewritten prompt-visible
history on each turn.
The callback receives one argument:
{
iteration: int,
system: string?,
messages: list,
visible_messages: list,
recorded_messages: list,
recent_visible_messages: list,
recent_recorded_messages: list,
latest_visible_user_message: string?,
latest_visible_assistant_message: string?,
latest_recorded_user_message: string?,
latest_recorded_assistant_message: string?,
latest_tool_result: string?,
latest_recorded_tool_result: string?
}
It may return:
- `nil` to leave the current prompt-visible context unchanged
- a `list` of messages to use as the next prompt-visible message list
- a `dict` with optional `messages` and `system` fields
Example: hide older assistant messages so the model mostly sees user intent, tool results, and the latest assistant turn.
fn hide_old_assistant_turns(ctx) {
var kept = []
var latest_assistant = nil
for msg in ctx.visible_messages {
if msg?.role == "assistant" {
latest_assistant = msg
} else {
kept = kept + [msg]
}
}
if latest_assistant != nil {
kept = kept + [latest_assistant]
}
return {messages: kept}
}
let result = agent_loop(task, "You are a coding assistant.", {
persistent: true,
context_callback: hide_old_assistant_turns
})
Post-turn callback
post_turn_callback runs after a tool-calling turn completes. Use it when the
workflow should react to the tool outcomes directly instead of waiting for the
model to emit another message.
The callback receives:
{
tool_names: list,
tool_results: list,
successful_tool_names: list,
tool_count: int,
iteration: int,
consecutive_single_tool_turns: int,
session_tools_used: list,
session_successful_tools: list,
}
Each tool_results entry has:
{tool_name: string, status: string, rejected: bool}
It may return:
- a `string` to inject as the next user-visible message
- a `bool` where `true` stops the current stage immediately after the turn
- a `dict` with optional `message` and `stop` fields
Example: stop after the first successful write turn, but still allow multiple edits in that same turn.
fn stop_after_successful_write(turn) {
if turn?.successful_tool_names?.contains("edit") {
return {stop: true}
}
return ""
}
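Wiring the callback above into a loop looks like this (a sketch; the task string is illustrative):

```harn
let result = agent_loop("Apply the fix.", "You are a coding assistant.", {
    persistent: true,
    post_turn_callback: stop_after_successful_write
})
```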
Example with retry
retry 3 {
let result = agent_loop(
task,
"You are a coding assistant.",
{
persistent: true,
max_iterations: 30,
max_nudges: 5,
provider: "anthropic",
model: "claude-sonnet-4-20250514"
}
)
println(result.text)
}
Skills lifecycle
Skills bundle metadata, a system-prompt fragment, scoped tools, and
lifecycle hooks into a typed unit. Declare them with the top-level
skill NAME { ... } language form (see the Harn spec)
or the imperative skill_define(...) builtin, then pass the resulting
skill_registry to agent_loop via the skills: option. The agent
loop matches, activates, and (optionally) deactivates skills across
turns automatically.
Matching strategies
skill_match: { strategy: ..., top_n: 1, sticky: true } controls how
the loop picks which skill(s) to activate:
"metadata"(default) — in-VM BM25-ish scoring overdescription+when_to_usecombined with glob matching against thepaths:list. Name-in-prompt mentions count as a strong boost. No host round-trip, so matching is fast and deterministic."host"— delegates scoring to the host via theskill/matchbridge RPC (see bridge-protocol.md). Useful for embedding-based or LLM-driven matchers. Failing RPC falls back to metadata scoring with a warning."embedding"— alias for"host"; accepted so the language matches Anthropic’s canonical terminology.
Activation lifecycle
- Match runs at the head of iteration 0 (always) and, when `sticky: false`, before every subsequent iteration (reassess).
- Activate: the skill's `on_activate` closure (if any) is called, its `prompt` body is woven into the effective system prompt, and `allowed_tools` narrows the tool surface for the next LLM call. Each activation emits `AgentEvent::SkillActivated` plus a `skill_activated` transcript event with the match score and reason.
- Deactivate (only in `sticky: false` mode) — when reassess picks a different top-N, the previously active skill's `on_deactivate` runs and the scoped tool filter is dropped. Emits `AgentEvent::SkillDeactivated` plus a `skill_deactivated` transcript event.
- Session resume: when `session_id:` is set, the set of active skills at the end of one run is persisted in the session store. The next `agent_loop` call on the same session rehydrates them before iteration-0 matching runs, so sticky re-entry stays hot without re-matching from a cold prompt.
Scoped tools
A skill’s allowed_tools list is the union across all active
skills; any tool outside that union is filtered out of both the
contract prompt and the native tool schemas the provider sees.
Runtime-internal tools like __harn_tool_search are never filtered
— scoping gates the user-declared surface, not the runtime’s own
scaffolding.
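A minimal sketch of the union rule, using hypothetical skill and tool names:

```harn
skill reader {
    description "Read-only exploration"
    allowed_tools ["read", "search"]
}

skill runner {
    description "Run verification commands"
    allowed_tools ["run"]
}

// With both skills active, the effective surface is the union
// {read, search, run}; any other user-declared tool is filtered
// out of both the contract prompt and the native schemas.
```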
Frontmatter honoured by the runtime
| Field | Type | Effect |
|---|---|---|
description | string | Primary ranking signal for metadata matching |
when_to_use | string | Secondary ranking signal |
paths | list<string> | Glob patterns for paths: auto-trigger |
allowed_tools | list<string> | Whitelist applied to the tool surface on activation |
prompt | string | Body woven into the active-skill system-prompt block |
disable-model-invocation | bool | When true, the matcher skips the skill entirely |
user-invocable | bool | Placeholder for host UI (not consumed by the runtime today) |
mcp | list<string> | MCP servers the skill wants booted (consumed by host integrations) |
on_activate / on_deactivate | fn | Closures invoked on transition |
Example
skill ship {
description "Ship a production release"
when_to_use "User says ship/release/deploy"
paths ["infra/**", "Dockerfile"]
allowed_tools ["deploy_service"]
prompt "Follow the deploy runbook. One command at a time."
}
let result = agent_loop(
"Ship the new release to production",
"You are a staff deploy engineer.",
{
provider: "anthropic",
tools: tools(),
skills: ship,
working_files: ["infra/terraform/cluster.tf"],
}
)
The loop emits one skill_matched event per match pass (including
zero-candidate passes so replayers see the boundary), one
skill_activated per activated skill, and one skill_scope_tools
event per activation whose allowed_tools narrowed the surface.
Streaming responses
llm_stream returns a channel that yields response chunks as they
arrive. Iterate over it with a for loop:
let stream = llm_stream("Tell me a story", "You are a storyteller")
for chunk in stream {
print(chunk)
}
llm_stream accepts the same options as llm_call (provider, model,
max_tokens). The channel closes automatically when the response is
complete.
Delegated workers
For long-running or parallel orchestration, Harn exposes a worker/task lifecycle directly in the runtime.
let worker = spawn_agent({
name: "research-pass",
task: "Draft a summary",
node: {
kind: "subagent",
mode: "llm",
model_policy: {provider: "mock"},
output_contract: {output_kinds: ["summary"]}
}
})
let done = wait_agent(worker)
println(done.status)
spawn_agent(...) accepts either:
- a `graph` plus optional `artifacts` and `options`, which runs a typed workflow in the background, or
- a `node` plus optional `artifacts` and `transcript`, which runs a single delegated stage and preserves transcript continuity across `send_input(...)`
Worker configs may also include policy to narrow the delegated worker to a
subset of the parent’s current execution ceiling, or a top-level
tools: ["name", ...] shorthand:
let worker = spawn_agent({
task: "Read project files only",
tools: ["read", "search"],
node: {
kind: "subagent",
mode: "llm",
model_policy: {provider: "mock"},
tools: repo_tools()
}
})
If neither is provided, the worker inherits the current execution policy as-is.
If either is provided, Harn intersects the requested worker scope with the
parent ceiling before the worker starts or is resumed. Permission denials are
returned to the agent loop as structured tool results:
{error: "permission_denied", tool, reason}.
Worker lifecycle builtins:
| Function | Description |
|---|---|
spawn_agent(config) | Start a worker from a workflow graph or delegated stage |
sub_agent_run(task, options?) | Run an isolated child agent loop and return a single clean result envelope to the parent |
send_input(handle, task) | Re-run a completed worker with a new task, carrying transcript/artifacts forward when applicable |
resume_agent(id_or_snapshot_path) | Restore a persisted worker snapshot and continue it in the current runtime |
wait_agent(handle_or_list) | Wait for one worker or a list of workers to finish |
close_agent(handle) | Cancel a worker and mark it terminal |
list_agents() | Return summaries for all known workers in the current runtime |
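For example, a hedged fan-out sketch — `wait_agent(...)` accepts a list of handles, and the per-handle result shape is assumed here to mirror the single-handle case:

```harn
// Spawn one delegated worker per task, then wait for all of them.
var workers = []
for t in ["Summarize README.md", "List TODO comments"] {
    workers = workers + [spawn_agent({
        task: t,
        node: {kind: "subagent", mode: "llm", model_policy: {provider: "mock"}}
    })]
}
let done = wait_agent(workers)   // waits for every handle in the list
for d in done {
    println(d.status)
}
```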
sub_agent_run
Use sub_agent_run(...) when you want a full child agent_loop with its own
session and narrowed capability scope, but you do not want the child transcript
to spill into the parent conversation history.
let result = sub_agent_run("Find the config entrypoints.", {
provider: "mock",
tools: repo_tools(),
allowed_tools: ["search", "read"],
token_budget: 1200,
returns: {
schema: {
type: "object",
properties: {
paths: {type: "array", items: {type: "string"}}
},
required: ["paths"]
}
}
})
if result.ok {
println(result.data.paths)
} else {
println(result.error.category)
}
The parent transcript only records the outer tool call and tool result. The
child keeps its own session and transcript, linked by session_id / parent
lineage metadata.
sub_agent_run(...) returns an envelope with:
- `ok`
- `summary`
- `artifacts`
- `evidence_added`
- `tokens_used`
- `budget_exceeded`
- `session_id`
- `data` when the child requests JSON mode or `returns.schema` succeeds
- `error: {category, message, tool?}` when the child fails or a narrowed tool policy rejects a call
Set background: true to get a normal worker handle back instead of waiting
inline. The resulting worker uses mode: "sub_agent" and can be resumed with
wait_agent(...), send_input(...), and close_agent(...).
Background handles retain the original structured request plus a normalized
provenance object, so parent pipelines can recover child questions, actions,
workflow stages, and verification steps directly from the handle/result.
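A background-mode sketch, assuming the same config surface as the inline call:

```harn
let worker = sub_agent_run("Audit the config files.", {
    provider: "mock",
    background: true        // returns a worker handle instead of waiting inline
})
let done = wait_agent(worker)
println(done.status)
```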
Workers can persist state and child run paths between sessions. Use carry
inside spawn_agent(...) when you want continuation to reset transcript state,
drop carried artifacts, or disable workflow resume against the previous child
run record. Worker configs may also include execution to pin delegated work
to an explicit cwd/env overlay or a managed git worktree:
let worker = spawn_agent({
task: "Run the repo-local verification pass",
graph: some_graph,
execution: {
worktree: {
repo: ".",
branch: "worker/research-pass",
cleanup: "preserve"
}
}
})
Transcript management
Harn includes transcript primitives for carrying context across calls, forks, repairs, and resumptions:
let first = llm_call("Plan the work", nil, {provider: "mock"})
let second = llm_call("Continue", nil, {
provider: "mock",
transcript: first.transcript
})
let compacted = transcript_compact(second.transcript, {
keep_last: 4,
summary: "Planning complete."
})
Use transcript_summarize() when you want Harn to create a fresh summary with
an LLM, or transcript_compact() when you want the runtime compaction engine
outside the agent_loop path.
Transcript helpers also expose the canonical event model:
let visible = transcript_render_visible(result.transcript)
let full = transcript_render_full(result.transcript)
let events = transcript_events(result.transcript)
Use these when a host app needs to render human-visible chat separately from internal execution history.
For chat/session lifecycle, std/agents now exposes a higher-level workflow
session contract on top of raw transcripts and run records:
import "std/agents"
let result = task_run("Write a note", some_flow, {provider: "mock"})
let session = workflow_session(result)
let forked = workflow_session_fork(session)
let archived = workflow_session_archive(forked)
let resumed = workflow_session_resume(archived)
let persisted = workflow_session_persist(result, ".harn-runs/chat.json")
let restored = workflow_session_restore(persisted.run.persisted_path)
Each workflow session also carries a normalized usage summary copied from the
underlying run record when available:
println(session?.usage?.input_tokens)
println(session?.usage?.output_tokens)
println(session?.usage?.total_duration_ms)
println(session?.usage?.call_count)
std/agents also exposes worker helpers for delegated/background orchestration:
worker_request(worker), worker_result(worker), worker_provenance(worker),
worker_research_questions(worker), worker_action_items(worker),
worker_workflow_stages(worker), and worker_verification_steps(worker).
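A short usage sketch, assuming `worker` is a handle returned by `spawn_agent(...)`:

```harn
import "std/agents"

let req = worker_request(worker)             // the original structured request
let prov = worker_provenance(worker)         // normalized provenance object
let questions = worker_research_questions(worker)
let steps = worker_verification_steps(worker)
```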
This is the intended host integration boundary:
- hosts persist chat tabs, titles, and durable asset files
- Harn persists transcript/run-record/session semantics
- hosts should prefer restoring a Harn session or transcript over inventing a parallel hidden memory format
Workflow runtime
For multi-stage orchestration, prefer the workflow runtime over product-side loop wiring. Define a helper that assembles the tools your agents will use:
fn review_tools() {
var tools = tool_registry()
tools = tool_define(tools, "read", "Read a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil
})
tools = tool_define(tools, "edit", "Edit a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil
})
tools = tool_define(tools, "run", "Run a command", {
parameters: {command: {type: "string"}},
returns: {type: "string"},
handler: nil
})
return tools
}
let graph = workflow_graph({
name: "review_and_repair",
entry: "act",
nodes: {
act: {kind: "stage", mode: "agent", tools: review_tools()},
verify: {kind: "verify", mode: "agent", tools: tool_select(review_tools(), ["run"])}
},
edges: [{from: "act", to: "verify"}]
})
let run = workflow_execute(
"Fix the failing test and verify the change.",
graph,
[],
{max_steps: 6}
)
This keeps orchestration structure, transcript policy, context policy, artifacts, and retries inside Harn instead of product code.
Cost tracking
Harn provides builtins for estimating and controlling LLM costs:
// Estimate cost for a specific call
let cost = llm_cost("claude-sonnet-4-20250514", 1000, 500)
println("Estimated cost: $${cost}")
// Check cumulative session costs
let session = llm_session_cost()
println("Total: $${session.total_cost}")
println("Calls: ${session.call_count}")
println("Input tokens: ${session.input_tokens}")
println("Output tokens: ${session.output_tokens}")
// Set a budget (LLM calls throw if exceeded)
llm_budget(1.00)
println("Remaining: $${llm_budget_remaining()}")
| Function | Description |
|---|---|
llm_cost(model, input_tokens, output_tokens) | Estimate USD cost from embedded pricing table |
llm_session_cost() | Session totals: {total_cost, input_tokens, output_tokens, call_count} |
llm_budget(max_cost) | Set session budget in USD. LLM calls throw if exceeded |
llm_budget_remaining() | Remaining budget (nil if no budget set) |
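Since LLM calls throw once the budget is exceeded, pairing a budget with `try`/`catch` lets an overrun fail gracefully. A sketch — the error payload shape is an assumption:

```harn
llm_budget(0.50)
try {
    let r = llm_call("Summarize the design doc.", nil, {provider: "anthropic"})
    println(r.text)
} catch err {
    println("Call failed or budget exhausted: ${err}")
}
```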
Provider API details
Anthropic
- Endpoint: `https://api.anthropic.com/v1/messages`
- Auth: `x-api-key` header
- API version: `2023-06-01`
- System message sent as a top-level `system` field
OpenAI
- Endpoint: `https://api.openai.com/v1/chat/completions`
- Auth: `Authorization: Bearer <key>`
- System message sent as a message with `role: "system"`
OpenRouter
- Endpoint: `https://openrouter.ai/api/v1/chat/completions`
- Auth: `Authorization: Bearer <key>`
- Same message format as OpenAI
HuggingFace
- Endpoint: `https://router.huggingface.co/v1/chat/completions`
- Auth: `Authorization: Bearer <key>`
- Use `HF_TOKEN` or `HUGGINGFACE_API_KEY`
- Same message format as OpenAI
Ollama
- Endpoint: `<OLLAMA_HOST>/v1/chat/completions`
- Default host: `http://localhost:11434`
- No authentication required
- Same message format as OpenAI
Local OpenAI-compatible server
- Endpoint: `<LOCAL_LLM_BASE_URL>/v1/chat/completions`
- Default host: `http://localhost:8000`
- No authentication required
- Same message format as OpenAI
Testing with mock LLM responses
The mock provider returns deterministic responses without API keys.
Use llm_mock() to queue specific responses — text, tool calls, or both:
// Queue a text response (consumed in FIFO order)
llm_mock({text: "The capital of France is Paris."})
let r = llm_call("What is the capital of France?", nil, {provider: "mock"})
assert_eq(r.text, "The capital of France is Paris.")
// Queue a response with tool calls
llm_mock({
text: "Let me read that file.",
tool_calls: [{name: "read_file", arguments: {path: "src/main.rs"}}],
})
// Pattern-matched mocks (reusable by default, matched in declaration order)
llm_mock({text: "I don't know.", match: "*unknown*"})
llm_mock({text: "step 1", match: "*planner*", consume_match: true})
llm_mock({text: "step 2", match: "*planner*", consume_match: true})
// Inspect what was sent to the mock provider
let calls = llm_mock_calls()
// Each entry: {messages: [...], system: "..." or nil, tools: [...] or nil}
// Clear all mocks and call log between tests
llm_mock_clear()
When no llm_mock() responses are queued, the mock provider falls back to
its default deterministic behavior (echoing prompt metadata). This means
existing tests using provider: "mock" without llm_mock() continue to
work unchanged.
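Queued mocks compose with `agent_loop` as well — a sketch, assuming each queued response is consumed by one loop iteration:

```harn
llm_mock({text: "Working on it."})            // first turn: no sentinel, so the loop nudges
llm_mock({text: "All finished. ##DONE##"})    // second turn: sentinel ends the loop
let result = agent_loop("Do the task", nil, {provider: "mock", persistent: true})
assert_eq(result.status, "done")
```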
Daemon stdlib
Harn’s daemon builtins wrap the existing agent_loop(..., {daemon: true})
runtime so scripts can manage long-lived assistants without hand-assembling
snapshot paths and resume options.
Builtins
daemon_spawn(config)
Start a daemon-mode agent and return a daemon handle dict.
Required config:
- `task` or `prompt`
- `persist_path` or `state_dir`
Useful optional config:
- `name`
- `system`
- `provider`, `model`, `tools`, `max_iterations`, and other `agent_loop` options
- `wake_interval_ms`
- `watch_paths`
- `idle_watchdog_attempts`
- `event_queue_capacity` (default `1024`)
Example:
let reviewer = daemon_spawn({
name: "reviewer",
task: "Watch for trigger events and summarize the change.",
system: "You are a careful code reviewer.",
provider: "mock",
persist_path: ".harn/daemons/reviewer",
watch_paths: ["src/"],
wake_interval_ms: 30000,
event_queue_capacity: 256,
})
daemon_trigger(handle, event)
Queue a trigger event for a running daemon. Events are delivered FIFO, one daemon wake at a time, and the queue is durably persisted in the daemon’s metadata so a stop/resume or crash/recovery cycle does not lose pending work.
If the queue is full, the builtin throws VmError::DaemonQueueFull.
daemon_trigger(reviewer, {
kind: "file_changed",
path: "src/lib.rs",
})
daemon_snapshot(handle)
Return the latest persisted daemon snapshot plus live queue metadata:
- `pending_events`
- `pending_event_count`
- `inflight_event`
- `queued_event_count`
- `event_queue_capacity`
The rest of the payload mirrors agent_loop daemon snapshots, including
daemon_state, recorded_messages, total_iterations, and saved_at.
daemon_stop(handle)
Stop a daemon and preserve its state on disk. The runtime waits briefly for an
idle boundary when possible; if the daemon is still mid-turn, the current
in-flight trigger is re-queued so daemon_resume(...) can replay it safely.
daemon_resume(path)
Resume a daemon from its persisted state directory. The path is the same root
directory you passed as persist_path / state_dir to daemon_spawn(...),
not the inner daemon.json snapshot file.
If the daemon stopped with queued or in-flight trigger events, they are restored and replayed after resume.
Delivery semantics
- Trigger events are FIFO.
- The queue is bounded by `event_queue_capacity`.
- Trigger payloads are handed to the daemon only from an idle boundary, so a persisted snapshot always reflects the pre-trigger or post-trigger state and never an ambiguous half-consumed queue.
- Forced stop/restart is intentionally at-least-once: an in-flight trigger is re-queued on stop/resume instead of being dropped silently.
Trigger stdlib
The trigger stdlib exposes the live runtime registry to Harn scripts. Use it to inspect installed bindings, register new bindings at runtime, fire synthetic events for tests/manual invocations, replay a recorded event by id, and inspect the current dead-letter queue (DLQ).
Import the shared types from std/triggers when you want typed handles and
payloads:
import "std/triggers"
Builtins
trigger_list()
Return the current live registry snapshot as list<TriggerBinding>.
Each binding includes:
- `id`
- `version`
- `source` (`"manifest"` or `"dynamic"`)
- `kind`
- `provider`
- `handler_kind`
- `state`
- `metrics`
metrics is a typed TriggerMetrics record with counters for received,
dispatched, failed, dlq, in_flight, and the cost snapshot fields.
trigger_register(config)
Register a trigger dynamically and return its TriggerHandle.
TriggerConfig uses the same broad shape as manifest-loaded bindings:
- `id`
- `kind`
- `provider`
- `handler`
- `when`
- `retry`
- `match` or `events`
- `dedupe_key`
- `filter`
- `budget`
- `manifest_path`
- `package_name`
The runtime currently accepts two handler forms:
- Local Harn closures / function references
- Remote URI strings with `a2a://...` or `worker://...`
retry is optional. The current stdlib surface accepts:
- `{max: N, backoff: "svix"}`
- `{max: N, backoff: "immediate"}`
Example:
import "std/triggers"
fn handle_issue(event: TriggerEvent) -> dict {
return {kind: event.kind, provider: event.provider}
}
let handle: TriggerHandle = trigger_register({
id: "github-new-issue",
kind: "issue.opened",
provider: "github",
handler: handle_issue,
when: nil,
match: {events: ["issue.opened"]},
events: nil,
dedupe_key: nil,
filter: nil,
budget: nil,
manifest_path: nil,
package_name: nil,
})
trigger_fire(handle, event)
Fire a synthetic TriggerEvent into a binding and return a
DispatchHandle.
The builtin accepts either:
- A `TriggerHandle` / `TriggerBinding` dict
- A plain trigger id string
If the event dict omits low-level envelope fields such as id,
received_at, trace_id, or provider_payload, the runtime fills them with
synthetic defaults.
Current behavior:
- Execution routes through the trigger dispatcher, so local handlers inherit dispatcher retries, lifecycle events, action-graph updates, and DLQ moves.
- `when` predicates execute before the handler and can still short-circuit a dispatch.
- `a2a://...` and `worker://...` handlers still return the dispatcher's explicit `NotImplemented` failure path.
trigger_replay(event_id)
Replay a previously recorded event from the EventLog by id and return a
DispatchHandle.
Current replay behavior:
- Fetch the prior event from the `triggers.events` topic
- Re-dispatch it through the trigger dispatcher using the recorded binding
- Preserve `replay_of_event_id` on the returned `DispatchHandle`
- Resolve the pending stdlib DLQ entry when a replay succeeds
trigger_replay(...) is still not the full deterministic T-14 replay engine.
It replays the recorded trigger event through the current dispatcher/runtime
state rather than a sandboxed drift-detecting environment.
trigger_inspect_dlq()
Return the current DLQ snapshot as list<DlqEntry>.
Each DlqEntry includes:
- The failed `event`
- Trigger identity (`binding_id`, `binding_version`)
- Current `state`
- Latest `error`
- `retry_history`

`retry_history` records every DLQ attempt, including replay attempts.
Example
import "std/triggers"
fn fail_handler(event: TriggerEvent) -> any {
throw("manual failure: " + event.kind)
}
let handle = trigger_register({
id: "manual-dlq",
kind: "issue.opened",
provider: "github",
handler: fail_handler,
when: nil,
retry: {max: 1, backoff: "immediate"},
match: nil,
events: ["issue.opened"],
dedupe_key: nil,
filter: nil,
budget: nil,
manifest_path: nil,
package_name: nil,
})
let fired = trigger_fire(handle, {provider: "github", kind: "issue.opened"})
let dlq = trigger_inspect_dlq().filter({ entry -> entry.binding_id == handle.id })
let replay = trigger_replay(fired.event_id)
println(fired.status) // "dlq"
println(len(dlq[0].retry_history)) // 1
println(replay.replay_of_event_id) // original event id
Notes
- Dynamic registrations are runtime-local. `trigger_register(...)` updates the live registry in the current process; it does not rewrite `harn.toml`.
- `trigger_fire(...)` and `trigger_replay(...)` need an active EventLog to persist `triggers.events` and `triggers.dlq`. If the runtime did not already install one, the stdlib wrapper falls back to an in-memory log for the current thread.
- When `workflow_execute(...)` runs inside a replayed trigger dispatch, the runtime carries the replay pointer into run metadata so derived observability can render a `replay_chain` edge back to the original event.
Skills
Harn discovers skills — bundled instructions, tool lists, and
activation rules — from the filesystem and from the host process. Every
skill is a directory containing a SKILL.md file with YAML
frontmatter plus a Markdown body; the format matches Anthropic’s
Agent Skills
and Claude Code specs, so
skills you author once work across both environments.
This page describes:
- the layered discovery hierarchy (CLI > env > project > manifest > user > package > system > host),
- the SKILL.md frontmatter Harn recognizes,
- the body substitution (`$ARGUMENTS`, `$N`, `${HARN_SKILL_DIR}`, `${HARN_SESSION_ID}`) that runs over SKILL.md before the model sees it,
- the `harn.toml` `[skills]` / `[[skill.source]]` tables, and
- the `harn doctor` output for diagnosing collisions / missing entries.
The companion language form — skill NAME { ... } — is documented in
Language basics and the skill builtins
(skill_registry, skill_define, skill_find, skill_list,
skill_render, skills_catalog_entries, render_always_on_catalog,
…) in Builtin functions.
Layered discovery
When harn run / harn test / harn check starts, every discovered
skill is merged into a single registry and exposed as the pre-populated
VM global skills. The layers — in order of highest to lowest
priority — are:
| # | Layer | Source | When |
|---|---|---|---|
| 1 | CLI | --skill-dir <path> (repeatable) | Ephemeral overrides, CI pinning |
| 2 | Env | $HARN_SKILLS_PATH (colon-separated on Unix, ; on Windows) | Deployment config, Docker, cloud agents |
| 3 | Project | .harn/skills/<name>/SKILL.md walking up from the script | Default for repo-scoped skills |
| 4 | Manifest | [skills] paths + [[skill.source]] in harn.toml | Multi-root, shared across siblings |
| 5 | User | ~/.harn/skills/<name>/SKILL.md | Personal skills across projects |
| 6 | Package | .harn/packages/**/skills/<name>/SKILL.md | Skills shipped via [dependencies] |
| 7 | System | /etc/harn/skills/ + $XDG_CONFIG_HOME/harn/skills/ | Managed / enterprise |
| 8 | Host | Registered via the bridge at runtime | Cloud / embedded hosts |
Name collisions: when two layers both expose a skill named deploy,
the higher layer wins. The shadowed entry is recorded so harn doctor
can surface it. Scripts that need both at once can register a
fully-qualified <namespace>/<skill> id via [[skill.source]] in the
manifest (see below).
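The layered merge with shadowing can be sketched in Python. This is an illustrative model rather than the actual resolver; it assumes each layer is just a map of skill name to metadata, with layers listed from highest to lowest priority:

```python
def merge_layers(layers):
    """layers: list of (layer_name, {skill_name: skill}), highest priority first.
    Earlier layers win a name collision; shadowed entries are recorded so a
    diagnostic pass (like harn doctor) can surface them."""
    resolved = {}
    shadowed = []
    for layer_name, skills in layers:
        for name, skill in skills.items():
            if name in resolved:
                # A higher-priority layer already claimed this name.
                shadowed.append({"name": name,
                                 "hidden_layer": layer_name,
                                 "winner_layer": resolved[name]["layer"]})
            else:
                resolved[name] = {"layer": layer_name, **skill}
    return resolved, shadowed

resolved, shadowed = merge_layers([
    ("cli", {"deploy": {"description": "CI-pinned deploy"}}),
    ("project", {"review": {"description": "Review a PR"}}),
    ("user", {"deploy": {"description": "Personal deploy"}}),
])
```

Here the `cli` copy of `deploy` wins, and the `user` copy lands in `shadowed` rather than disappearing silently.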
SKILL.md frontmatter
The frontmatter is YAML, delimited by --- on its own line above and
below. Unknown fields are not hard errors — harn doctor reports
them as warnings so newer spec fields roll out cleanly.
---
name: deploy
description: Deploy the application to production
when-to-use: User says deploy / ship / release
disable-model-invocation: false
user-invocable: true
allowed-tools: [bash, git]
paths:
- infra/**
- Dockerfile
context: fork
agent: ops-lead
model: claude-opus-4-7
effort: high
shell: bash
argument-hint: "<target-env>"
hooks:
on-activate: echo "starting deploy"
on-deactivate: echo "deploy ended"
---
# Deploy runbook
Ship it: `$ARGUMENTS`. Skill directory: `${HARN_SKILL_DIR}`.
Recognized fields (Harn normalizes hyphens to underscores, so
when-to-use and when_to_use are the same key):
| Field | Type | Purpose |
|---|---|---|
name | string | Required. Id the script looks up via skill_find. |
description | string | One-liner the model sees for auto-activation. |
when-to-use | string | Longer activation trigger. |
disable-model-invocation | bool | If true, never auto-activate — explicit use only. |
allowed-tools | list of string | Restrict tool surface while the skill is active. Entries accept three shapes: an exact tool name ("deploy_service"), a namespace tag ("namespace:read" — matches every tool declared with namespace: "read"), or "*" (escape hatch that keeps the full surface, useful for skills that only carry prompt context). |
user-invocable | bool | Expose the skill to end users via a slash menu. |
paths | list of glob | Files the skill expects to touch. |
context | string | "fork" runs in an isolated subcontext. |
agent | string | Sub-agent that owns the skill. |
hooks | map or list | Shell commands for lifecycle events. |
model | string | Preferred model alias. |
effort | string | low / medium / high. |
shell | string | Shell to run the body under when context is shell-ish. |
argument-hint | string | UI hint for $ARGUMENTS. |
Tool scoping with namespace:<tag>
Tool declarations that carry a namespace: field can be grouped into
one allowed-tools entry instead of enumerating names. Given
tool_define(reg, "read_file", "...", {namespace: "read", ...})
tool_define(reg, "list_files", "...", {namespace: "read", ...})
tool_define(reg, "write_file", "...", {namespace: "write", ...})
a skill with allowed-tools: ["namespace:read"] scopes the turn to
read_file + list_files and hides write_file. Exact tool names
and the wildcard "*" remain valid and can mix freely:
allowed-tools: ["namespace:read", "grep", "*"]
Malformed entries fail loudly at skill_define time — a bare ":"
without a tag or a colon-prefixed entry that isn’t namespace: raises
so authors don’t silently scope to an empty set.
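The three entry shapes and the fail-loud rule can be sketched as a small resolver. This is a hedged Python model, assuming the registry is a plain dict of tool name to declaration:

```python
def expand_allowed_tools(entries, registry):
    """registry: {tool_name: {"namespace": tag_or_None}}. Returns the visible
    tool set. Malformed entries raise, mirroring the fail-loud behavior."""
    visible = set()
    for entry in entries:
        if entry == "*":
            return set(registry)  # wildcard keeps the full surface
        if ":" in entry:
            prefix, _, tag = entry.partition(":")
            if prefix != "namespace" or not tag:
                raise ValueError(f"malformed allowed-tools entry: {entry!r}")
            visible |= {n for n, t in registry.items()
                        if t.get("namespace") == tag}
        elif entry in registry:
            visible.add(entry)  # exact tool name
    return visible

registry = {
    "read_file": {"namespace": "read"},
    "list_files": {"namespace": "read"},
    "write_file": {"namespace": "write"},
}
```

With this registry, `["namespace:read"]` resolves to `read_file` and `list_files`, while a bare `":"` raises instead of scoping to an empty set.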
Body substitution
When a skill is rendered (via the skill_render builtin, or by a host
before handing the body to the model), the following substitutions run
over the Markdown body:
- `$ARGUMENTS` → all positional args joined with spaces
- `$N` → the N-th positional arg (1-based). `$0` is reserved.
- `${HARN_SKILL_DIR}` → absolute path to the skill directory
- `${HARN_SESSION_ID}` → opaque session id threaded through the run
- `${OTHER_NAME}` → looks up `OTHER_NAME` in the process environment
- `$$` → literal `$`
Missing positional args ($3 when only $1 was supplied) pass
through unchanged so authors see what wasn’t supplied rather than a
silent empty substitution.
let deploy = skill_find(skills, "deploy")
let rendered = skill_render(deploy, ["prod", "us-east-1"])
// rendered now has $1 and $2 replaced with "prod" and "us-east-1".
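A minimal Python model of the substitution pass (the real implementation lives in the runtime; the pass-through behavior for a missing environment variable is an assumption here, only the missing-positional-arg pass-through is documented):

```python
import os
import re

# Matches, in order: literal $$, $ARGUMENTS, ${UPPER_NAME}, $N (1-based, so $0
# is never matched and passes through as the reserved token).
_TOKEN = re.compile(r"\$\$|\$ARGUMENTS|\$\{[A-Z_][A-Z0-9_]*\}|\$[1-9][0-9]*")

def render_body(body, args, skill_dir, session_id, env=None):
    env = os.environ if env is None else env
    def sub(m):
        tok = m.group(0)
        if tok == "$$":
            return "$"
        if tok == "$ARGUMENTS":
            return " ".join(args)
        if tok == "${HARN_SKILL_DIR}":
            return skill_dir
        if tok == "${HARN_SESSION_ID}":
            return session_id
        if tok.startswith("${"):
            # Assumed: unknown env names pass through unchanged.
            return env.get(tok[2:-1], tok)
        n = int(tok[1:])
        # Missing positional args pass through unchanged by design.
        return args[n - 1] if 1 <= n <= len(args) else tok
    return _TOKEN.sub(sub, body)
```

For example, with args `["prod", "us-east-1"]`, `$1` and `$2` substitute while `$3` survives verbatim so the author sees what was not supplied.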
Progressive disclosure with load_skill
When an agent loop receives a skill registry through skills:,
Harn automatically exposes a runtime-owned load_skill({ name }) tool.
The tool:
- resolves the requested skill id against the loop’s resolved skill registry,
- applies the same SKILL.md body substitution described above, and
- returns the substituted body as the tool result so it lands in the next turn’s transcript naturally.
If the target skill has disable-model-invocation: true,
load_skill returns a typed error instead of leaking the body.
Always-on catalog helper
The recommended harness convention is:
- Keep a compact catalog of available skills in the system prompt.
- Let the model call `load_skill` only when one of those entries looks relevant.
Harn ships two pure helpers for that pattern:
let entries = skills_catalog_entries(skills)
let catalog = render_always_on_catalog(entries, 2000)
skills_catalog_entries projects the resolved registry into compact
{name, description, when_to_use} cards (sorted deterministically by
skill id, using <namespace>/<name> when present). render_always_on_catalog
formats those cards into a stable prompt block and trims the list to the
requested character budget.
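A sketch of that budget trim in Python. The exact card format is an assumption; only the deterministic ordering and the character budget come from the description above:

```python
def render_always_on_catalog_sketch(entries, budget):
    """Formats catalog cards into a prompt block, dropping trailing entries
    once the character budget (including newlines) would be exceeded."""
    lines, used = [], 0
    for e in sorted(entries, key=lambda e: e["name"]):  # deterministic order
        line = f"- {e['name']}: {e['description']}"
        if used + len(line) + 1 > budget:
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)
```

With a generous budget every card renders; with a tight one the list is truncated from the end rather than reflowed.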
Copy-pasteable example:
let catalog = render_always_on_catalog(skills_catalog_entries(skills), 2000)
let result = agent_loop(
"Help me ship this release",
catalog,
{
provider: "mock",
model: "gpt-5.4",
persistent: true,
skills: skills,
},
)
On a later turn the model can emit:
load_skill({ name: "deploy" })
and the next turn will see the substituted SKILL.md body in the tool
result, while any allowed-tools declared by that skill narrow the
tool surface for subsequent turns.
harn.toml [skills] + [[skill.source]]
Projects that share skills across siblings or pull them from a remote tag use the manifest instead of a per-script flag:
[skills]
paths = ["packages/*/skills", "../shared-skills"]
lookup_order = ["cli", "project", "manifest", "user", "package", "system", "host"]
disable = ["system"]
[skills.defaults]
tool_search = "bm25"
always_loaded = ["look", "edit", "bash"]
[[skill.source]]
type = "fs"
path = "../shared"
[[skill.source]]
type = "git"
url = "https://github.com/acme/harn-skills"
tag = "v1.2.0"
[[skill.source]]
type = "registry" # reserved, inert until a marketplace exists
url = "https://skills.harnlang.com"
name = "acme/ops"
- `paths` is joined against the directory holding harn.toml and supports a single trailing `*` component (`packages/*/skills`).
- `lookup_order` lets you invert a layer’s priority — for example, to prefer `user` over `project` on a personal checkout without touching the repo.
- `disable` kicks entire layers out of discovery. Disabled layers are reported by `harn doctor`.
- `[[skill.source]]` entries of type `git` expect their materialized checkout to live under `.harn/packages/<name>/skills/` — run `harn install` to populate it.
- `registry` entries are accepted but inert until a Harn Skills marketplace exists (tracked by #73).
harn doctor
harn doctor reports the resolved skill catalog:
OK skills 3 loaded (1 cli, 1 project, 1 user)
WARN skill:deploy shadowed by cli layer; user version at /home/me/.harn/skills/deploy is hidden
WARN skill:review unknown frontmatter field(s) forwarded as metadata: future_field
SKIP skills-layer:system layer disabled by harn.toml [skills.disable]
CLI flags
- `harn run --skill-dir <path>` (repeatable) — highest-priority layer.
- `harn test --skill-dir <path>` — same semantics for user tests and conformance fixtures.
- `$HARN_SKILLS_PATH` — colon-separated list of directories, applied to every invocation.
Bridge protocol
Hosts expose their own managed skill store through three RPCs:
- `skills/list` (request) — response is an array of `{ id, name, description, source }` entries.
- `skills/fetch` (request) — payload `{ id: "<skill id>" }`; response is the full manifest + body shape so the CLI can hydrate a `SkillManifestRef` into a `Skill`.
- `skills/update` (notification, host → VM) — invalidates the VM’s cached catalog. The CLI re-runs discovery on the next boundary.
See Bridge protocol for wire-format details.
Managing skills
The harn skills CLI manages and inspects skills without running a
pipeline. Each subcommand resolves the layered catalog the same way
harn run does (--skill-dir, HARN_SKILLS_PATH, project, manifest,
user, packages, system, host), so what you see here is exactly what
pipelines see.
harn skills list
Prints every resolved skill with the layer it came from. Pass
--all to include shadowed entries; pass --json for machine output.
$ harn skills list
Resolved skills (3):
deploy [cli] Deploy to production with rollback support
review [project] Review a pull request
helpers/utils [package] Shared helpers from the acme/ops package
Shadowed skills (1):
deploy winner=[cli] hidden=[user] origin=/home/me/.harn/skills/deploy
harn skills inspect <name>
Dumps the resolved SKILL.md — frontmatter, bundled files under the
skill directory, and the full body — for a specific skill. Accepts
bare <name> or fully-qualified <namespace>/<name>:
$ harn skills inspect deploy
id: deploy
name: deploy
layer: cli
description: Deploy to production with rollback support
skill_dir: /repo/.harn/skills/deploy
Bundled files:
files/runbook.md
files/rollback.sh
---- SKILL.md body ----
Run the deploy. Confirm replicas and then flip traffic.
harn skills match "<query>"
Runs the built-in metadata matcher (same scorer the agent loop uses)
against a prompt and prints the ranked candidates with their scores.
Supports --working-file to simulate path-glob matches:
$ harn skills match "deploy the staging service" --top-n 3
Match results for: deploy the staging service
1. deploy score=2.400 [cli] prompt mentions 'deploy'; 1 keyword hit(s)
2. review score=0.400 [project] 1 keyword hit(s)
Useful when authoring a SKILL.md to confirm its description: and
when_to_use: frontmatter actually attracts the right prompts.
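The scorer's exact weights are internal to Harn; this Python toy mirrors only the sample output above (2.0 for a name mention, 0.4 per keyword hit) purely for illustration:

```python
def score_skill(skill, prompt):
    """Toy metadata matcher: +2.0 when the prompt mentions the skill name,
    +0.4 per keyword hit drawn from description/when_to_use. Illustrative
    weights, not the real scorer."""
    prompt_words = set(prompt.lower().split())
    score, reasons = 0.0, []
    if skill["name"].lower() in prompt_words:
        score += 2.0
        reasons.append(f"prompt mentions '{skill['name']}'")
    keywords = set((skill.get("description", "") + " "
                    + skill.get("when_to_use", "")).lower().split())
    hits = len(prompt_words & keywords)
    if hits:
        score += 0.4 * hits
        reasons.append(f"{hits} keyword hit(s)")
    return round(score, 3), reasons
```

Running it against "deploy the staging service" reproduces the 2.400 score shown in the `harn skills match` sample.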
harn skills install <spec>
Materializes a git ref or local path into .harn/skills-cache/ so
the filesystem package walker picks it up on the next run. The
.harn/skills-cache/ layout mirrors .harn/packages/:
$ harn skills install acme/harn-skills --tag v1.2.0
installing acme/harn-skills to .harn/skills-cache/harn-skills
installed — layer=package, path=.harn/skills-cache/harn-skills
<spec> accepts:
- A full git URL: `https://github.com/acme/harn-skills.git`
- `owner/repo` shorthand (expands to GitHub): `acme/harn-skills`
- A local filesystem path: `../shared/skills/deploy`
Pass --namespace <ns> to place the install under a subdirectory so
it shows up in the resolver as <ns>/<skill>. Pass --tag <ref> to
pin a git branch or tag. Every install rewrites
.harn/skills-cache/skills.lock with the resolved source + commit.
harn skills new <name>
Scaffolds a new SKILL.md and files/ directory under .harn/skills/:
$ harn skills new deploy --description "Deploy to production"
Scaffolded skill 'deploy' at .harn/skills/deploy
SKILL.md
files/README.md
Edit the SKILL.md frontmatter and body, then run `harn skills list`
to verify it's picked up.
Pass --dir <path> to target a different destination (for example
~/.harn/skills/deploy to scaffold under the user layer instead of
the project layer), or --force to overwrite an existing directory.
Portal observability
The Harn portal (harn portal) surfaces three skill-focused panels on
every run detail page:
- Skill timeline — horizontal bars showing which skills activated on which agent-loop iteration and when they deactivated. Hover a bar for the matcher score and the reason the skill was promoted.
- Tool-load waterfall — one row per `tool_search_query` event, pairing each query with its `tool_search_result` so you can see which deferred tools entered the LLM’s context in each turn.
- Matcher decisions — per-iteration expansions showing every candidate the matcher considered, its score, and the working-file snapshot it scored against.
The runs index page takes a skill=<name> filter so you can narrow
evals to runs where a specific skill was active. The same
skill=<name> query parameter works from a URL, making it easy to
link to “every run that used deploy”.
Sessions
A session is a first-class VM resource that owns three things for a given conversational agent run:
- Its transcript history (`messages`, `events`, `summary`, …).
- The closure subscribers registered against it via `agent_subscribe(session_id, cb)`.
- Its lifecycle — create, reset, fork, trim, compact, close.
Sessions replace the old transcript_policy config pattern. Lifecycle
used to be a side effect of dict fields (mode: "reset", mode: "fork"
quietly mutating state on stage entry); it is now expressed by
explicit, imperative builtins. Unknown inputs are hard errors.
Quick tour
pipeline main(task) {
// Open (or resume) a session. `nil` mints a UUIDv7.
let s = agent_session_open()
// Seed the conversation.
agent_session_inject(s, {role: "user", content: "Hello!"})
// Run an agent loop against the session — prior messages are
// automatically loaded as prefix, the final transcript is persisted
// back under `s`.
let first = agent_loop("continue the greeting", nil, {
session_id: s,
provider: "mock",
})
// A second call sees `first`'s assistant reply as prior history.
let second = agent_loop("what do you remember?", nil, {
session_id: s,
provider: "mock",
})
// Fork to explore a counterfactual without touching `s`.
let branch = agent_session_fork(s)
agent_session_inject(branch, {role: "user", content: "what if …"})
// Release a session immediately.
agent_session_close(branch)
}
If you don’t pass session_id to agent_loop, the loop mints an
anonymous id internally and does NOT persist anything. That preserves
the “one-shot” call shape.
Builtins
| Function | Returns | Notes |
|---|---|---|
agent_session_open(id?: string) | string | Idempotent. nil mints a UUIDv7. |
agent_session_exists(id) | bool | Safe on unknown ids. |
agent_session_length(id) | int | Message count. Errors if id doesn’t exist. |
agent_session_snapshot(id) | dict or nil | Read-only deep copy of the transcript. |
agent_session_reset(id) | nil | Wipes history; preserves id and subscribers. |
agent_session_fork(src, dst?) | string | Copies transcript; subscribers are NOT copied. |
agent_session_trim(id, keep_last) | int | Retains last keep_last messages. Returns kept count. |
agent_session_compact(id, opts) | int | Runs the LLM/truncate/observation-mask compactor. Unknown keys in opts error. |
agent_session_inject(id, message) | nil | Appends a {role, content, …} message. Missing role errors. |
agent_session_close(id) | nil | Evicts immediately. |
agent_session_compact options
Accepts any subset of these keys; anything else is a hard error:
- `keep_last` (int, default 12)
- `token_threshold` (int)
- `tool_output_max_chars` (int)
- `compact_strategy` ("llm" | "truncate" | "observation_mask" | "custom")
- `hard_limit_tokens` (int)
- `hard_limit_strategy` (same values as above)
- `custom_compactor` (closure)
- `mask_callback` (closure)
- `compress_callback` (closure)
Storage model
Sessions live in a per-thread HashMap<String, SessionState> in
crate::agent_sessions. Thread-local is correct because VmValue
wraps Rc and the agent loop runs on a pinned tokio LocalSet task.
An LRU cap (default 128 sessions per VM) evicts the least-recently
accessed session when a new one is opened over the cap.
agent_session_close evicts immediately regardless of the cap.
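The cap behavior can be modeled with an ordered map. A Python sketch of the store semantics, not the Rust implementation:

```python
from collections import OrderedDict

class SessionStore:
    """LRU-capped session store sketch (Harn's default cap is 128 per VM).
    Opening over the cap evicts the least-recently-accessed session;
    close() evicts immediately regardless of the cap."""
    def __init__(self, cap=128):
        self.cap = cap
        self.sessions = OrderedDict()

    def open(self, sid):
        if sid in self.sessions:
            self.sessions.move_to_end(sid)  # refresh recency; idempotent open
        else:
            if len(self.sessions) >= self.cap:
                self.sessions.popitem(last=False)  # evict least-recently used
            self.sessions[sid] = {"messages": []}
        return sid

    def close(self, sid):
        self.sessions.pop(sid, None)
```

Opening `a`, `b`, `a`, then `c` with a cap of 2 evicts `b`, since `a` was touched more recently.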
Subscribers
agent_subscribe(id, closure) appends closure to the session’s
subscribers list. The agent loop fires turn_end (and other)
events through every subscriber for that session id. Subscribers are
not copied by agent_session_fork — a fork is a conversation branch,
not an event fanout.
Interaction with workflows
Workflow stages pick up a session id from
model_policy.session_id on the node; if unset, each stage mints a
stable stage-scoped id. Two stages sharing a session_id share their
transcript automatically through the session store — no explicit
threading or policy dict required.
To branch a stage’s conversation before running it, call
agent_session_fork in the pipeline before workflow_execute and
wire the fork id into the relevant node’s model_policy.session_id.
Fail-loud
Unknown option keys on agent_session_compact, a missing role on
agent_session_inject, a negative keep_last, and any of the
lifecycle verbs (reset, fork, close, trim, inject,
length, compact) called against an unknown id all raise a
VmError::Thrown(string). exists, open, and snapshot are the
only calls that tolerate unknown ids by design.
Agent State
std/agent_state is Harn’s durable, session-scoped scratch space for
agent orchestration. It gives a caller-owned root directory plus a
session id a small set of predictable operations:
- write text blobs atomically
- read them back later
- list keys deterministically
- delete keys
- persist a machine-readable handoff document
- reopen the same session from a later process with
agent_state_resume
The important design point is that the primitive is generic. Harn owns the durable-state substrate; host apps own their schema and naming conventions layered on top of it.
Import
import "std/agent_state"
Functions
| Function | Returns | Notes |
|---|---|---|
agent_state_init(root, options?) | state_handle | Creates or reopens a session-scoped state root under root/<session_id>/ |
agent_state_resume(root, session_id, options?) | state_handle | Reopens an existing session; errors if it does not exist |
agent_state_write(handle, key, content) | nil | Atomic temp-write plus rename |
agent_state_read(handle, key) | string or nil | Returns nil for missing keys |
agent_state_list(handle) | list<string> | Lexicographically sorted, recursive, deterministic |
agent_state_delete(handle, key) | nil | Missing keys are ignored |
agent_state_handoff(handle, summary) | nil | Writes a structured JSON handoff envelope to __handoff.json |
agent_state_handoff_key() | string | Returns the reserved handoff key name ("__handoff.json") |
Handle shape
agent_state_init(...) and agent_state_resume(...) return a tagged
dict:
{
_type: "state_handle",
backend: "filesystem",
root: "/absolute/root",
session_id: "session-123",
handoff_key: "__handoff.json",
conflict_policy: "ignore",
writer: {
writer_id: "worker-a",
stage_id: "worker-a",
session_id: "session-123",
worker_id: "worker-a"
}
}
The exact fields are stable on purpose. Other runtime features can build on the same handle semantics without introducing a second durable-state model.
Session ids
agent_state_init(root, options?) looks for options.session_id first.
If it is absent, Harn defaults to the active agent/workflow session id
when one exists. Outside an active agent context, Harn mints a fresh
UUIDv7.
That means common agent code can usually say:
import "std/agent_state"
pipeline default() {
let state = agent_state_init(".harn/state", {writer_id: "planner"})
agent_state_write(state, "plan.md", "# Plan")
}
and get a session-specific namespace automatically.
Keys and layout
Keys are always relative to the session root. Nested paths are fine:
import "std/agent_state"
pipeline default() {
let state = agent_state_init(".harn/state", {writer_id: "planner"})
agent_state_write(state, "plan.md", "# Plan")
agent_state_write(state, "evidence/files.json", "{\"paths\":[]}")
}
Rejected key forms:
- absolute paths
- any path containing
.. - reserved internal metadata paths
The default filesystem backend stores user content under:
<root>/<session_id>/<key>
with internal writer metadata stored separately under a hidden backend
directory. agent_state_list(...) only returns user-visible keys.
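The rejection rules can be sketched as a key validator. The reserved internal prefix used here is hypothetical; the three rejected forms come from the list above:

```python
def validate_key(key):
    """Rejects absolute paths, any '..' component, and reserved internal
    metadata paths. The '.__harn' prefix is a hypothetical stand-in for
    whatever the backend reserves internally."""
    if key.startswith("/") or (len(key) > 1 and key[1] == ":"):
        raise ValueError("absolute paths are not allowed")
    parts = key.split("/")
    if ".." in parts:
        raise ValueError("'..' is not allowed in keys")
    if parts[0].startswith(".__harn"):  # hypothetical reserved prefix
        raise ValueError("reserved internal metadata path")
    return key
```

Nested relative keys like `evidence/files.json` pass; `/etc/passwd` and `a/../b` raise.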
Atomic writes
agent_state_write(...) writes to a temp file in the target directory,
syncs it, then renames it into place. If the process crashes before the
rename, the old file remains intact and the partially-written temp file
never becomes the visible key.
This guarantees “no partial file at the target path”, which is the durability property the primitive is designed to expose.
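The same temp-write, fsync, rename pattern, sketched in Python:

```python
import os
import tempfile

def atomic_write(path, content):
    """Write to a temp file in the target directory, sync it, then rename it
    into place. A crash before the rename leaves the old file intact and the
    temp file never becomes the visible key."""
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)  # same directory, so rename is atomic
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

The temp file must live in the same directory as the target: `os.replace` is only atomic within a filesystem.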
Handoff documents
agent_state_handoff(handle, summary) stores a JSON envelope at
__handoff.json:
{
"_type": "agent_state_handoff",
"version": 1,
"session_id": "session-123",
"key": "__handoff.json",
"summary": {
"status": "ready"
}
}
Callers own the shape of summary. Harn owns the outer envelope and the
well-known key.
Two-writer discipline
Each handle can carry a writer identity and conflict policy:
let state = agent_state_init(".harn/state", {
session_id: "demo",
writer_id: "planner",
conflict_policy: "error"
})
Supported policies:
- `"ignore"`: accept overlapping writes silently
- `"warn"`: accept the write and emit a runtime warning
- `"error"`: reject the write before replacing the existing content
Conflict detection compares the previous writer id for that key with the current writer id. This is intentionally simple and deterministic: it is a guard rail against accidental stage overlap, not a full distributed locking protocol.
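A Python model of the policy check; the `(allowed, warning)` return shape is illustrative, not the runtime's actual API:

```python
def check_conflict(prev_writer, current_writer, policy):
    """Compares the previous writer id for a key with the current writer id.
    Returns (allowed, warning_message_or_None)."""
    if prev_writer is None or prev_writer == current_writer:
        return True, None  # first write, or same writer: never a conflict
    if policy == "ignore":
        return True, None
    if policy == "warn":
        return True, (f"writer {current_writer!r} overwrote key last "
                      f"written by {prev_writer!r}")
    if policy == "error":
        return False, None
    raise ValueError(f"unknown conflict_policy: {policy!r}")
```

Only cross-writer overwrites trigger the policy, which is what makes it a guard rail against accidental stage overlap rather than a lock.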
Backend seam
The default implementation is a filesystem backend, but the storage
layer is split behind a backend trait in
crates/harn-vm/src/stdlib/agent_state/backend.rs.
That trait is designed around:
- scope creation/resume
- atomic blob read/write/delete
- deterministic list
- conflict metadata on write
so future backends such as in-memory, SQLite, or remote stores can plug in without changing the Harn-facing handle semantics.
Example
import "std/agent_state"
pipeline default() {
let state = agent_state_init(".harn/state", {
session_id: "review-42",
writer_id: "triage"
})
agent_state_write(state, "plan.md", "# Plan\n- inspect PR")
agent_state_handoff(state, {
status: "needs_review",
next_stage: "implement"
})
let resumed = agent_state_resume(".harn/state", "review-42", {
writer_id: "implement"
})
println(agent_state_read(resumed, "plan.md"))
}
Transcript architecture
Harn transcripts are now versioned runtime values with three distinct layers:
- `messages`: durable conversational turns used to continue model calls.
- `events`: normalized audit history derived from messages plus lifecycle/runtime events.
- `assets`: durable descriptors for large or non-text payloads that should not be inlined into prompt history.
The intended schema is:
{
"_type": "transcript",
"version": 2,
"id": "tr_...",
"state": "active",
"summary": "optional compacted summary",
"metadata": {},
"messages": [
{
"role": "user",
"content": [
{"type": "image", "asset_id": "asset_1", "visibility": "public"},
{"type": "text", "text": "Review this screenshot", "visibility": "public"}
]
}
],
"events": [
{
"kind": "message",
"role": "user",
"visibility": "public",
"text": "<image:screenshot.png> Review this screenshot",
"blocks": [...]
}
],
"assets": [
{
"_type": "transcript_asset",
"id": "asset_1",
"kind": "image",
"mime_type": "image/png",
"visibility": "internal",
"storage": {"path": ".harn/assets/asset_1.png"}
}
]
}
Rules:
- Put prompt-relevant turn content in `messages`.
- Put replay/audit/lifecycle facts in `events`.
- Put large media, file blobs, provider payload dumps, and durable attachments in `assets`.
- Message blocks should reference assets by `asset_id` instead of embedding base64 when persistence matters.
- Compaction should summarize archived text while retaining asset descriptors and recent multimodal turns.
Persistence split:
- Hosts should persist asset files and any product-level chat/session metadata needed to reopen a conversation in the app shell.
- Harn run records, worker snapshots, and transcript values should persist the structured transcript object, including asset descriptors and message/event links.
- Hosts should avoid inventing a parallel hidden memory model. If a chat needs continuity, reuse or restore the Harn transcript and run record state.
Workflow runtime
Harn’s workflow runtime is the layer above raw llm_call() and
agent_loop(). It gives host applications a typed, inspectable, replayable
orchestration boundary instead of pushing orchestration logic into app code.
Core concepts
Workflow graphs
Use workflow_graph(...) to normalize a workflow definition into a typed
graph with:
- named nodes
- explicit edges
- node kinds such as stage, verify, join, condition, fork, map, reduce, subagent, and escalation
- typed stage input/output contracts
- explicit branch semantics and typed run transitions
- per-node model, transcript, context, retry, and capability policies
- workflow-level capability ceiling
- mutation audit log entries
subagent nodes are now a real delegated execution boundary. They run through
the worker lifecycle, attach worker metadata to their stage records, and tag
their produced artifacts with delegated provenance so parent workflows can
inspect and reduce child results explicitly.
Start with a helper that registers the tools the workflow will expose to each node. Each tool carries its own capability policy so validation can enforce them automatically:
fn review_tools() {
var tools = tool_registry()
tools = tool_define(tools, "read", "Read a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {workspace: ["read_text"]},
side_effect_level: "read_only",
path_params: ["path"],
mutation_classification: "read_only"
}
})
tools = tool_define(tools, "edit", "Edit a file", {
parameters: {path: {type: "string"}},
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {workspace: ["write_text"]},
side_effect_level: "workspace_write",
path_params: ["path"],
mutation_classification: "apply_workspace"
}
})
tools = tool_define(tools, "run", "Run a command", {
parameters: {command: {type: "string"}},
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {process: ["exec"]},
side_effect_level: "process_exec",
mutation_classification: "ambient_side_effect"
}
})
return tools
}
let graph = workflow_graph({
name: "repair_loop",
entry: "act",
nodes: {
act: {kind: "stage", mode: "agent", tools: review_tools()},
verify: {kind: "verify", mode: "agent", tools: tool_select(review_tools(), ["run"])},
repair: {kind: "stage", mode: "agent", tools: tool_select(review_tools(), ["edit", "run"])}
},
edges: [
{from: "act", to: "verify"},
{from: "verify", to: "repair", branch: "failed"},
{from: "repair", to: "verify", branch: "retry"}
]
})
let report = workflow_validate(graph)
assert(report.valid)
When tool entries include policy, Harn folds that metadata into workflow
validation and execution automatically. That keeps the registry itself as the
source of truth for capability requirements instead of forcing products to
repeat the same information in both tool definitions and node policy blocks.
Action graphs
std/agents now exposes an action-graph layer above raw workflow graphs for
planner-driven orchestration:
- `action_graph(raw, options?)` canonicalizes planner output variants into a stable `{_type: "action_graph", actions: [...]}` envelope.
- `action_graph_batches(graph, completed?)` repairs missing cross-phase dependencies and groups ready work by phase plus tool class.
- `action_graph_flow(graph, config?)` turns that plan envelope into a typed workflow graph with one scheduled batch stage per ready batch.
- `action_graph_run(task, graph, config?, overrides?)` attaches a durable `plan` artifact and executes the generated workflow via `workflow_execute`.
This is the intended shared substrate for “research -> plan -> execute -> verify” style pipelines when the planner output is unstable but the executor should still see a canonical schedule.
import "std/agents"
let raw_plan = {
steps: [
{id: "inspect", kind: "research", title: "Inspect parser", tools: ["read", "search"]},
{id: "patch", title: "Patch diagnostics", tools: ["edit"]},
{id: "docs", title: "Update release notes", tools: ["edit"]}
]
}
let plan = action_graph(raw_plan, {task: "Fix parser diagnostics"})
let run = action_graph_run("Fix parser diagnostics", plan, {
research: {mode: "llm", model_policy: {provider: "mock"}},
execute: {mode: "llm", model_policy: {provider: "mock"}},
verify: {command: "cargo test --workspace --quiet", expect_status: 0}
})
println(run.status)
println(len(run.batches))
Artifacts and resources
Artifacts are the real context boundary. Instead of building context mostly by concatenating strings, Harn selects typed artifacts under policy and budget.
Core artifact kinds that ship in the runtime include:
`artifact`, `resource`, `summary`, `analysis_note`, `diff`, `test_result`, `verification_result`, `plan`
Artifacts carry provenance fields such as:
`source`, `created_at`, `freshness`, `lineage`, `relevance`, `estimated_tokens`, `metadata`
Example:
let selection = artifact({
kind: "resource",
title: "Selected code",
text: read_file("src/parser.rs"),
source: "workspace",
relevance: 0.95
})
let plan = artifact_derive(selection, "plan", {
text: "Update the parser diagnostic wording and preserve spans."
})
let context = artifact_context([selection, plan], {
include_kinds: ["resource", "plan"],
max_tokens: 1200
})
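One plausible reading of policy-plus-budget selection, sketched in Python. The relevance-first ordering is an assumption, not documented behavior; only kind filtering and the token budget come from the example above:

```python
def select_artifacts(artifacts, include_kinds, max_tokens):
    """Keep only included kinds, prefer higher relevance, and skip any
    artifact whose estimated_tokens would push past the budget."""
    pool = [a for a in artifacts if a["kind"] in include_kinds]
    pool.sort(key=lambda a: a.get("relevance", 0.0), reverse=True)
    chosen, used = [], 0
    for a in pool:
        cost = a.get("estimated_tokens", 0)
        if used + cost > max_tokens:
            continue  # over budget: drop this artifact, try smaller ones
        chosen.append(a)
        used += cost
    return chosen
```

With a 1200-token budget, a high-relevance 800-token `resource` crowds out a 500-token `plan`; raising the budget admits both.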
Executing workflows
workflow_execute(task, graph, artifacts?, options?) executes a typed
workflow and persists a structured run record.
let run = workflow_execute(
"Fix the diagnostic regression and verify the tests.",
graph,
[selection, plan],
{max_steps: 8}
)
println(run.status)
println(run.path)
println(run.run.stages)
verify nodes can also run deterministic checks without an LLM loop:
verify: {
kind: "verify",
verify: {
command: "cargo test --workspace --quiet",
expect_status: 0,
assert_text: "test result: ok"
}
}
Command-based verification records stdout, stderr, exit_status, and a
derived success flag on the stage result while still flowing through the same
workflow branch/outcome machinery as LLM-backed verification.
Verifier requirements can also be published as structured contract inputs for earlier planning and execution stages. Harn injects these contracts into the stage prompt automatically so the model sees exact verifier-owned identifiers, paths, and wiring text before it starts editing:
verify: {
kind: "verify",
verify: {
command: "python scripts/verify_rate_limit.py",
expect_status: 0,
required_identifiers: ["rateLimit"],
required_paths: ["src/middleware/rateLimit.ts"],
required_text: ["app.use(rateLimit)"],
notes: ["Use the verifier-exact symbol names. Do not rename them."]
}
}
When the verifier contract lives outside the workflow file, point contract_path
at a JSON file relative to the workflow execution context:
verify: {
kind: "verify",
verify: {
command: "python scripts/verify_rate_limit.py",
contract_path: "scripts/verify_rate_limit.contract.json",
expect_status: 0
}
}
Options currently include:
- `max_steps`
- `persist_path`
- `resume_path`
- `resume_run`
- `replay_path`
- `replay_run`
- `replay_mode: "deterministic"`
- `audit`
- `mutation_scope`
- `approval_policy`
Resuming is practical rather than magical: if a saved run has unfinished successor stages, Harn continues from persisted ready-node checkpoints with saved artifacts, transcript state, and traversed run-graph edges.
Deterministic replay is now a runtime mode rather than a CLI-only inspection
tool: passing a prior run via replay_run or replay_path replays saved stage
records and artifacts through the workflow engine without calling providers or
tools again.
Delegated runs surface child worker lineage in each delegated stage’s metadata.
This makes replay/eval and host timelines able to distinguish parent execution
from child execution without reconstructing that structure from plain text.
Persisted runs also retain explicit parent_run_id, root_run_id, and
child_runs lineage, and load_run_tree(path) materializes that hierarchy
recursively for inspection or host-side task views.
Map nodes can now execute branch work in parallel. node.join_policy.strategy
accepts:
- `"all"` to wait for every branch result
- `"first"` to return after the first completed branch
- `"quorum"` to return after `join_policy.min_completed` branches finish
node.map_policy.max_concurrent limits branch fan-out, and partial failures are
retained alongside successful branch artifacts instead of aborting the whole map
stage on the first error.
Runs may also include metadata.mutation_session, a normalized audit record
used to tie tool gates, workers, and artifacts back to one mutation boundary:
- `session_id`
- `parent_session_id`
- `run_id`
- `worker_id`
- `execution_kind`
- `mutation_scope`
- `approval_policy`
This is not an editor undo stack. It is the runtime-side provenance contract that hosts can map onto their own approval and undo/redo UX.
Transcripts and sessions
Stage transcripts are owned by the session store, not by
a per-node transcript_policy dict. Each node picks up a session id from
model_policy.session_id; two nodes that share an id share their
conversation automatically. Unset ids get a stable stage-scoped default.
To shape transcript behavior on a node, use the dedicated workflow setters plus the lifecycle builtins:
- `workflow_set_auto_compact(graph, node_id, policy)` — sets `auto_compact`, `compact_threshold`, `tool_output_max_chars`, `compact_strategy`, `hard_limit_tokens`, `hard_limit_strategy`.
- `workflow_set_output_visibility(graph, node_id, visibility)` — `"public" | "private" | nil`.
- `agent_session_reset(id)`, `agent_session_fork(src, dst?)`, `agent_session_trim(id, keep_last)`, `agent_session_compact(id, opts)` — call these in the pipeline before `workflow_execute` to branch, reset, or compact a stage’s conversation explicitly.
The old `transcript_policy` dict (with `mode: "continue" | "reset" | "fork"`) was removed in 0.7.0; see Sessions for migration.
Meta-orchestration builtins
Harn exposes typed workflow editing builtins so orchestration changes can be audited and validated against the workflow IR:
- `workflow_inspect(..., ceiling?)`
- `workflow_clone(...)`
- `workflow_insert_node(...)`
- `workflow_replace_node(...)`
- `workflow_rewire(...)`
- `workflow_set_model_policy(...)`
- `workflow_set_context_policy(...)`
- `workflow_set_auto_compact(...)`
- `workflow_set_output_visibility(...)`
- `workflow_diff(...)`
- `workflow_validate(..., ceiling?)`
- `workflow_policy_report(..., ceiling?)`
- `workflow_commit(...)`
These mutate structured workflow graphs, not free-form prompt text.
Capability ceilings
Workflows and sub-orchestration may narrow capabilities, but they must not exceed the host/runtime ceiling.
This is enforced explicitly by capability-policy intersection during validation and execution setup. If a node requests tools or host operations outside the ceiling, validation fails.
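The intersection rule itself is simple enough to state as a few lines of Python; the capability names here are placeholders, not Harn's real identifiers:

```python
# Sketch of capability-ceiling enforcement as set intersection.
def validate_node(requested: set, ceiling: set) -> set:
    """Fail validation if a node requests anything outside the ceiling;
    otherwise the effective capability set is the narrowed request."""
    excess = requested - ceiling
    if excess:
        raise ValueError(f"capabilities exceed ceiling: {sorted(excess)}")
    return requested & ceiling

ceiling = {"read_file", "llm_call"}
print(validate_node({"read_file"}, ceiling))  # narrowing is allowed
```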
Run records, replay, and evals
Workflow execution produces a persisted run record containing:
- workflow identity
- task
- stage records
- stage attempts, outcomes, and branch decisions
- traversed graph transitions
- ready-node checkpoints for resume
- stage transcripts
- visible output
- private reasoning metadata
- tool intent and tool execution events
- provider payload metadata kept separate from visible text
- verification outcomes
- artifacts
- policy metadata
- parent/root run lineage and delegated child runs
- a derived observability block summarizing planner rounds, research facts, action-graph nodes/edges, verification outcomes, and transcript pointers
- execution status
CLI support:
harn portal
harn runs inspect .harn-runs/<run>.json
harn runs inspect .harn-runs/<run>.json --compare baseline.json
harn replay .harn-runs/<run>.json
harn eval .harn-runs/<run>.json
harn eval .harn-runs/
harn eval evals/regression.json
The replay/eval surface is intentionally tied to saved typed run records so host applications do not need to build their own provenance layer.
For a local visual view over the same persisted data, harn portal reads the
run directory directly and renders stages, the derived action graph, trace
spans, transcript sections, and delegated child runs without introducing a
second storage format.
For host/runtime consumers that want the same logic inside Harn code, the VM also exposes:
- `run_record_fixture(...)`
- `run_record_eval(...)`
- `run_record_eval_suite(...)`
- `run_record_diff(...)`
- `eval_suite_manifest(...)`
- `eval_suite_run(...)`
Eval manifests group persisted runs, optional explicit replay fixtures, and optional baseline run comparisons under a single typed document. This lets hosts treat replay/eval suites as data rather than external scripts.
Host artifact handoff
Hosts and editor bridges should hand Harn typed artifacts instead of embedding their own orchestration rules in ad hoc prompt strings. The VM now exposes helpers for the most common host surfaces:
- `artifact_workspace_file(...)`
- `artifact_workspace_snapshot(...)`
- `artifact_editor_selection(...)`
- `artifact_verification_result(...)`
- `artifact_test_result(...)`
- `artifact_command_result(...)`
- `artifact_diff(...)`
- `artifact_git_diff(...)`
- `artifact_diff_review(...)`
- `artifact_review_decision(...)`
- `artifact_patch_proposal(...)`
- `artifact_verification_bundle(...)`
- `artifact_apply_intent(...)`
These helpers normalize kind names, token estimates, priority defaults, lineage, and metadata so host products can pass editor/test/diff state into Harn without recreating artifact taxonomy and provenance logic externally.
Trigger manifests
[[triggers]] extends harn.toml with declarative trigger registrations in the
same manifest-overlay family as [exports], [llm], and [[hooks]].
Each entry declares:
- a stable trigger `id`
- a trigger `kind` such as `webhook`, `cron`, or `a2a-push`
- a `provider` from the registered trigger provider catalog
- a delivery `handler`
- optional dedupe, retry, budget, secret, and predicate settings
Shape
[[triggers]]
id = "github-new-issue"
kind = "webhook"
provider = "github"
match = { events = ["issues.opened"] }
when = "handlers::should_handle"
handler = "handlers::on_new_issue"
dedupe_key = "event.dedupe_key"
retry = { max = 7, backoff = "svix", retention_days = 7 }
priority = "normal"
budget = { daily_cost_usd = 5.00, max_concurrent = 10 }
secrets = { signing_secret = "github/webhook-secret" }
filter = "event.kind"
Handler URI schemes
Harn currently accepts three handler forms:
- local function: `handler = "on_event"` or `handler = "handlers::on_event"`
- A2A dispatch: `handler = "a2a://reviewer.prod/triage"`
- worker queue dispatch: `handler = "worker://triage-queue"`
Unsupported URI schemes fail fast at load time.
Local handlers and predicates resolve through the same module-export plumbing as the manifest hook loader:
- bare names resolve against `lib.harn` next to the manifest
- `module::function` resolves either through the current manifest’s `[exports]` table or through package imports under `.harn/packages`
Validation
The manifest loader rejects invalid trigger declarations before execution:
- trigger ids must be unique across the loaded root manifest plus installed package manifests
- `provider` must exist in the registered trigger provider catalog
- `handler` must be a supported URI, and local handlers must resolve to exported functions
- `when` must resolve to a function with signature `fn(TriggerEvent) -> bool`
- `dedupe_key` and `filter` must parse as JMESPath expressions
- `retry.max` must be `<= 100`
- `retry.retention_days` defaults to `7` and must be `>= 1`
- `budget.daily_cost_usd` must be `>= 0`
- cron triggers must declare a parseable `schedule`
- cron `timezone` must be a valid IANA timezone name
- secret references must use `<namespace>/<name>` syntax and the namespace must match the trigger provider
Errors include the manifest path plus the [[triggers]] table index so the bad
entry is easy to locate.
Durable dedupe retention
Trigger dedupe now uses a durable inbox index backed by the shared EventLog
topic trigger.inbox. Each successful claim stores the binding id plus the
resolved dedupe_key, and duplicate deliveries are rejected until the claim’s
TTL expires.
- configure the TTL with `retry.retention_days`
- the default is `7` days
- shorter retention trims durable dedupe history sooner, which lowers storage cost but increases the chance that a late provider retry will be treated as a fresh event
Use a retention window at least as long as the provider’s maximum retry window. If a provider can redeliver for longer than your configured TTL, Harn may dispatch that late retry again once the durable claim has expired.
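The claim/TTL interaction can be sketched with an in-memory stand-in for the durable `trigger.inbox` index; the class and its API are hypothetical:

```python
# Sketch of TTL-based dedupe claims keyed by (binding id, dedupe_key).
import time

class DedupeIndex:
    def __init__(self, retention_days=7):
        self.ttl = retention_days * 86400
        self.claims = {}  # (binding_id, dedupe_key) -> claim timestamp

    def try_claim(self, binding_id, dedupe_key, now=None):
        now = time.time() if now is None else now
        key = (binding_id, dedupe_key)
        claimed_at = self.claims.get(key)
        if claimed_at is not None and now - claimed_at < self.ttl:
            return False  # duplicate delivery inside the TTL window
        self.claims[key] = now  # fresh claim, or an expired claim re-taken
        return True

idx = DedupeIndex(retention_days=7)
print(idx.try_claim("github-new-issue", "delivery-1", now=0))          # True
print(idx.try_claim("github-new-issue", "delivery-1", now=3600))       # False
print(idx.try_claim("github-new-issue", "delivery-1", now=8 * 86400))  # True: claim expired
```

The last line is the hazard the paragraph above warns about: a late retry after expiry looks like a fresh event.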
Doctor output
harn doctor now lists loaded triggers with:
- trigger id
- trigger kind
- provider
- handler kind (`local`, `a2a`, or `worker`)
- budget summary
Examples
See the example manifests under `examples/triggers`.
Trigger event schema
TriggerEvent is the normalized envelope every inbound trigger provider
converges on before dispatch. Connectors preserve provider-specific payload
fidelity inside provider_payload, but the orchestration layer always sees the
same outer shape:
import "std/triggers"
fn on_event(event: TriggerEvent) {
let payload = event.provider_payload
if payload.provider == "github" && payload.event == "issues" {
println(payload.issue.title ?? "unknown")
}
let signature = event.signature_status
if signature.state == "failed" {
println(signature.reason)
}
}
Envelope fields
TriggerEvent carries:
- `id`: runtime-assigned event id.
- `provider`: provider identity such as `"github"`, `"slack"`, `"cron"`, or `"webhook"`.
- `kind`: provider-specific event kind.
- `received_at`: RFC3339 timestamp captured by the runtime.
- `occurred_at`: provider-reported RFC3339 timestamp when available.
- `dedupe_key`: delivery id or equivalent idempotency key.
- `trace_id`: trace correlation id propagated through dispatch.
- `tenant_id`: optional orchestrator-assigned tenant namespace.
- `headers`: redacted provider headers retained for audit/debugging.
- `provider_payload`: provider-tagged payload union.
- `signature_status`: typed verification result.
Signature status
signature_status is a discriminated union:
- `{ state: "verified" }`
- `{ state: "unsigned" }`
- `{ state: "failed", reason: string }`
Unsigned events are valid for synthetic sources such as cron. Failed events can still be logged for audit purposes even if the dispatcher rejects them.
Provider payloads
The initial std/triggers payload aliases are intentionally small. Each
provider variant exposes a stable normalized surface plus raw: dict. GitHub’s
payload is already narrowed into the six MVP event families (issues,
pull_request, issue_comment, pull_request_review, push, and
workflow_run) with event-specific top-level fields such as issue,
pull_request, comment, review, commits, and workflow_run:
- `GitHubEventPayload`
- `SlackEventPayload`
- `LinearEventPayload`
- `NotionEventPayload`
- `CronEventPayload`
- `GenericWebhookPayload`
- `A2aPushPayload`
- `ExtensionProviderPayload`
The runtime registers these through a ProviderCatalog, so future connectors
can contribute new payload schemas without rewriting the top-level
TriggerEvent envelope.
Header redaction
The runtime keeps delivery, event, timestamp, request-id, signature, and
user-agent headers by default. It redacts sensitive headers such as
Authorization, Cookie, and names containing secret, token, or key
unless they are explicitly allow-listed as safe metadata.
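A rough Python approximation of that filter follows; the exact keep-list and allow-list mechanics are assumptions based on the description above:

```python
# Hypothetical sketch of the redaction rule: keep delivery/signature-style
# headers, redact Authorization/Cookie and secret-looking names unless
# explicitly allow-listed as safe metadata.
KEEP_PREFIXES = ("x-github-", "x-hub-", "webhook-")
KEEP_NAMES = {"user-agent", "x-request-id", "date"}
SENSITIVE_NAMES = {"authorization", "cookie"}
SENSITIVE_SUBSTRINGS = ("secret", "token", "key")

def redact_headers(headers, allow_list=frozenset()):
    out = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered in allow_list:
            out[name] = value  # explicit opt-in wins
        elif lowered in SENSITIVE_NAMES or any(s in lowered for s in SENSITIVE_SUBSTRINGS):
            out[name] = "<redacted>"
        elif lowered in KEEP_NAMES or lowered.startswith(KEEP_PREFIXES):
            out[name] = value
    return out

print(redact_headers({"Authorization": "Bearer abc", "X-GitHub-Delivery": "d1"}))
```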
Trigger Dispatcher
The trigger dispatcher is the runtime path that turns a normalized
TriggerEvent plus a live registry binding into actual handler work.
At MVP, the dispatcher fully wires the local-function path and keeps the
remote handler schemes (a2a://..., worker://...) as explicit stubs with
clear error messages pointing at their follow-up tickets.
Dispatch shape
Each dispatch goes through the same sequence:
- Append the inbound event to `trigger.inbox`.
- Match the event against active registry bindings for the provider + event kind.
- Evaluate the optional `when` predicate in the same VM/runtime surface as the handler.
- Invoke the resolved handler target.
- Record each attempt on `trigger.attempts`.
- Record successful handler results on `trigger.outbox`.
- Schedule retries from the manifest retry policy.
- Move exhausted deliveries into the in-memory DLQ and append a copy to `trigger.dlq`.
- When the dispatch is a replay, emit a `replay_chain` action-graph edge linking the new trigger node back to the original event id.
The dispatcher keeps per-thread stats for:
- in-flight dispatch count
- retry queue depth
- DLQ depth
harn doctor surfaces that snapshot next to the trigger registry view.
Handler URI resolution
Manifest handler URIs support three forms:
- bare/local function name: `handler = "on_issue"` or `handler = "handlers::on_issue"`
- remote A2A target: `handler = "a2a://reviewer.prod/triage"`
- worker queue target: `handler = "worker://triage-queue"`
By the time the dispatcher sees a manifest-installed binding, local function
handlers have already been resolved to concrete VmClosure values through the
same export-loading path used by manifest hooks and trigger predicates.
The dispatcher still re-normalizes those shapes internally so it can emit a stable handler kind and target URI in lifecycle logs and action-graph nodes.
Retry policy
Bindings carry a normalized TriggerRetryConfig:
- `Svix`
- `Linear { delay_ms }`
- `Exponential { base_ms, cap_ms }`
The default retry budget is 7 total attempts.
The Svix schedule is:
immediate -> 5s -> 5m -> 30m -> 2h -> 5h -> 10h -> 10h
The last slot saturates, so attempts beyond the published vector continue to wait 10 hours unless a future manifest surface narrows that policy.
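A saturating lookup like the following Python sketch reproduces that schedule; the delay values are transcribed from the vector above:

```python
# Saturating Svix-style retry schedule: attempts beyond the published
# vector keep reusing the last (10h) delay.
SVIX_DELAYS_SECS = [0, 5, 5 * 60, 30 * 60, 2 * 3600, 5 * 3600, 10 * 3600, 10 * 3600]

def retry_delay(attempt: int) -> int:
    """Delay in seconds before the given 1-based attempt."""
    index = min(attempt - 1, len(SVIX_DELAYS_SECS) - 1)
    return SVIX_DELAYS_SECS[index]

print(retry_delay(1))   # 0 (immediate)
print(retry_delay(4))   # 1800 (30m)
print(retry_delay(12))  # 36000 (saturates at 10h)
```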
Cancellation
Dispatcher shutdown is cooperative:
- a shutdown signal flips the active per-dispatch VM cancel tokens immediately
- sleeping retry waits listen for the shared shutdown broadcast and abort early
- local handlers observe cancellation through the existing VM
install_cancel_token(...)path and exit on the next instruction boundary
This keeps the trigger runtime aligned with the orchestrator shutdown model without inventing a second cancellation mechanism.
Event-log topics
The dispatcher uses the shared EventLog instead of a parallel queue layer:
- `trigger.inbox`
- `trigger.outbox`
- `trigger.attempts`
- `trigger.dlq`
- `triggers.lifecycle`
- `observability.action_graph`
triggers.lifecycle now includes dispatcher-specific lifecycle records:
- `DispatchStarted`
- `DispatchSucceeded`
- `DispatchFailed`
- `RetryScheduled`
- `DlqMoved`
Action-graph updates
Dispatcher streaming closes the local-handler portion of the trigger action-graph deferral:
- node kinds: `trigger`, `predicate`, `dispatch`, `retry`, `dlq`
- edge kinds: `trigger_dispatch`, `predicate_gate`, `retry`, `dlq_move`
Each update is appended to observability.action_graph using the shared
RunActionGraphNodeRecord / RunActionGraphEdgeRecord schema so the portal
and any other subscriber can consume dispatcher traces without special-casing a
separate payload format.
Replay dispatches add one more edge kind:
- `replay_chain`
The portal renders that edge as the visible link from the replayed trigger event back to the original event id.
Current MVP limits
- `a2a://...` returns `DispatchError::NotImplemented` and points at O-04 #181
- `worker://...` returns `DispatchError::NotImplemented` and points at O-05 #182
- DLQ storage is in-memory plus event-log append; durable replay remains follow-up work
Trigger Observability In The Action Graph
Harn now projects dispatcher-independent trigger activity into persisted run
observability. This lands the first half of issue #163: trigger and
predicate nodes, plus the matching trigger_dispatch and predicate_gate
edges.
What lands in this change
- A synthetic `trigger` node is added when a run carries a `trigger_event` envelope in `run.metadata`.
- Workflow `condition` stages render as `predicate` nodes in `observability.action_graph_nodes`.
- Entry edges from the trigger node into the workflow render as `trigger_dispatch`.
- Transitions leaving a predicate render as `predicate_gate`.
- `trace_id` propagates from the `TriggerEvent` onto the synthetic trigger node and every downstream action-graph node derived from that run.
The runtime also streams the derived graph onto the shared event-log topic
observability.action_graph whenever a run record is persisted. This reuses
the generalized EventLog infrastructure instead of a parallel observability
bus.
Current shape
This scoped change is intentionally limited to the dispatcher-independent surface:
- Landed here: `trigger` and `predicate` node kinds.
- Deferred to T-06: `dispatch`, `a2a_hop`, `worker_enqueue`, and `dlq`.
- Deferred to T-06: portal replay controls and dispatcher-coupled UI work.
- Deferred to T-06: A2A `trace_id` header propagation.
Example
When a workflow is started with a trigger_event option, the persisted run
record will include observability nodes like:
{
"kind": "trigger",
"label": "cron:tick",
"trace_id": "trace_123"
}
and:
{
"kind": "predicate",
"label": "gate",
"trace_id": "trace_123"
}
with edges such as:
{"kind": "trigger_dispatch", "from_id": "trigger:...", "to_id": "stage:..."}
{"kind": "predicate_gate", "label": "true"}
The portal does not yet render specialized UI for these nodes in this PR; it will consume the shared event-log topic in the dispatcher follow-up.
Connector authoring
Custom connectors implement the harn_vm::connectors::Connector trait and
plug into a ConnectorRegistry at orchestrator startup. The initial surface
lives in crates/harn-vm/src/connectors/ because the supporting abstractions
it depends on today already live in harn-vm:
- `EventLog` for audit and durable event plumbing
- `SecretProvider` for signing secrets and outbound tokens
- `TriggerEvent` for the normalized inbound envelope
If the connector ecosystem grows large enough, the module can be extracted into a dedicated crate later without changing the core trait contract.
Provider catalog
Connectors should treat the runtime ProviderCatalog as the authoritative
discovery surface for provider metadata. Each provider entry carries:
- the normalized payload schema name exposed through `std/triggers`
- supported trigger kinds such as `webhook` or `cron`
- outbound method names (empty today for the built-in providers)
- required secrets, including the namespace each secret must live under
- signature verification strategy metadata
- runtime connector metadata indicating whether the provider is backed by a built-in connector or a placeholder implementation
Harn also exposes that same catalog to scripts through
import "std/triggers" and list_providers(), so connector metadata has one
runtime-facing source instead of separate registry and docs tables.
Implementing a connector
A connector implementation owns two concerns:
- Inbound normalization: verify the provider request, preserve the raw bytes, and normalize into `TriggerEvent`.
- Outbound callbacks: expose provider APIs through a `ConnectorClient`.
The runtime-facing surface is:
use std::sync::Arc;

use async_trait::async_trait;
use harn_vm::connectors::{
    Connector, ConnectorClient, ConnectorCtx, ConnectorError, ProviderPayloadSchema,
    RawInbound, TriggerBinding, TriggerKind,
};
use harn_vm::{ProviderId, TriggerEvent};
use serde_json::Value as JsonValue;

struct ExampleConnector {
    provider_id: ProviderId,
    kinds: Vec<TriggerKind>,
    client: Arc<ExampleClient>,
}

struct ExampleClient;

#[async_trait]
impl ConnectorClient for ExampleClient {
    async fn call(
        &self,
        method: &str,
        args: JsonValue,
    ) -> Result<JsonValue, harn_vm::ClientError> {
        let _ = (method, args);
        Ok(JsonValue::Null)
    }
}

#[async_trait]
impl Connector for ExampleConnector {
    fn provider_id(&self) -> &ProviderId {
        &self.provider_id
    }

    fn kinds(&self) -> &[TriggerKind] {
        &self.kinds
    }

    async fn init(&mut self, _ctx: ConnectorCtx) -> Result<(), ConnectorError> {
        Ok(())
    }

    async fn activate(
        &self,
        _bindings: &[TriggerBinding],
    ) -> Result<harn_vm::ActivationHandle, ConnectorError> {
        Ok(harn_vm::ActivationHandle::new(self.provider_id.clone(), 0))
    }

    fn normalize_inbound(&self, raw: RawInbound) -> Result<TriggerEvent, ConnectorError> {
        let _payload = raw.json_body()?;
        todo!("map the provider request into TriggerEvent")
    }

    fn payload_schema(&self) -> ProviderPayloadSchema {
        ProviderPayloadSchema::named("ExamplePayload")
    }

    fn client(&self) -> Arc<dyn ConnectorClient> {
        self.client.clone()
    }
}
HMAC verification helper
Webhook-style connectors should reuse
harn_vm::connectors::verify_hmac_signed(...) instead of open-coding HMAC
checks. The helper enforces the non-negotiable rules from issue #167:
- verification happens against the raw request body bytes
- signature comparisons use constant-time equality
- timestamped schemes reject outside a caller-provided window
- rejection paths write an audit event to the `audit.signature_verify` topic
The helper currently supports the three MVP HMAC header styles needed by the planned connector tickets:
- GitHub: `X-Hub-Signature-256: sha256=<hex>`
- Stripe: `Stripe-Signature: t=<unix>,v1=<hex>[,v1=<hex>...]`
- Standard Webhooks: `webhook-id`, `webhook-timestamp`, and `webhook-signature: v1,<base64>`
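For intuition, here is an illustrative Python re-implementation of those rules for the GitHub-style header, plus the timestamp-window check used by the timestamped schemes. This is a sketch of the contract, not the actual `verify_hmac_signed` helper, and it omits the audit append:

```python
# Sketch of raw-body HMAC verification with constant-time comparison,
# shown for the GitHub-style "X-Hub-Signature-256: sha256=<hex>" header.
import hashlib
import hmac
import time

def verify_github_style(raw_body: bytes, header: str, secret: bytes) -> bool:
    if not header.startswith("sha256="):
        return False
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # constant-time equality, never `==`, to avoid timing side channels
    return hmac.compare_digest(header[len("sha256="):], expected)

def within_window(sent_at: int, tolerance_secs: int = 300, now=None) -> bool:
    # timestamped schemes (Stripe, Standard Webhooks) also reject stale deliveries
    now = int(time.time()) if now is None else now
    return abs(now - sent_at) <= tolerance_secs

body = b'{"action":"opened"}'
secret = b"webhook-secret"
sig = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_github_style(body, sig, secret))  # True
print(within_window(0, now=301))               # False: outside the 5-minute window
```

Note that verification runs against the raw bytes; re-serializing a parsed body would break signatures that depend on exact whitespace.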
Rate limiting
Connector clients should acquire outbound permits through the shared
RateLimiterFactory. The current implementation is intentionally small: a
process-local token bucket keyed by (provider_id, scope_key). That keeps the
first landing trait-pure while giving upcoming provider clients one place to
enforce per-installation or per-tenant quotas.
What is deliberately not here yet
This foundation PR does not define:
- outbound stdlib client wrappers for connector-specific APIs
- third-party manifest ABI for external connector packages
Those land in follow-up tickets once the shared trait, provider catalog, runtime registry, audit, and verification primitives are in place.
Trigger registry
The trigger registry is the runtime-owned binding table that turns
validated [[triggers]] manifest entries into live, versioned trigger
bindings inside a VM thread.
Ownership model
- The registry is thread-local, following the same pattern as the runtime hook table. Each VM thread owns its own bindings and does not share `Rc<VmClosure>` values across threads.
- Cross-thread coordination is pushed down to the event-log layer. The trigger registry only tracks the bindings that the current VM can execute.
- Manifest parsing and validation still live in `harn-cli`. Once handlers and predicates resolve, the CLI passes a compact binding spec into `harn-vm`, which owns lifecycle and metrics.
Binding shape
Each live binding stores:
- logical trigger id
- monotonically increasing version
- provider and trigger kind
- resolved handler target (`local`, `a2a`, or `worker`)
- optional resolved `when` predicate
- lifecycle state: `registering`, `active`, `draining`, `terminated`
- metrics snapshot: `received`, `dispatched`, `failed`, `dlq`, `in_flight`, and last-received timestamp
- manifest provenance for diagnostics
Hot reload keeps the logical id stable and bumps the binding version whenever the manifest definition fingerprint changes.
Lifecycle
Manifest install performs a reconcile step against the current thread-local registry:
- New trigger id: register version `1`, emit `registering`, then `active`.
- Existing trigger id with unchanged definition: keep the current active binding.
- Existing trigger id with changed definition: mark the old binding `draining`, register a new active version, and keep both bindings visible until the old version reaches `in_flight == 0`.
- Removed manifest trigger: mark the live binding `draining`. Once `in_flight == 0`, it transitions to `terminated`.
Dynamic registrations follow the same state machine, but they are not reconciled by manifest reload.
Metrics and draining
- `begin_in_flight(id, version)` increments `received` and `in_flight` and updates `last_received_ms`.
- `finish_in_flight(id, version, outcome)` decrements `in_flight` and increments one of `dispatched`, `failed`, or `dlq`.
- A draining binding becomes terminated only after the in-flight count returns to zero.
This keeps hot reload safe: events that started under version N
complete under version N, while new events route to version N+1.
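The version-draining rule can be modeled with a small Python sketch; the `Registry` and `Binding` classes are illustrative, not the runtime types:

```python
# Sketch of hot-reload draining: events finish under the version they
# started on, and a draining version terminates once in_flight hits zero.
class Binding:
    def __init__(self, version):
        self.version = version
        self.state = "active"
        self.in_flight = 0

class Registry:
    def __init__(self):
        self.versions = {1: Binding(1)}
        self.active = 1

    def reload(self):
        """Definition changed: drain version N, activate version N+1."""
        self.versions[self.active].state = "draining"
        self.active += 1
        self.versions[self.active] = Binding(self.active)

    def begin(self):
        binding = self.versions[self.active]  # new events route to the active version
        binding.in_flight += 1
        return binding.version

    def finish(self, version):
        binding = self.versions[version]
        binding.in_flight -= 1
        if binding.state == "draining" and binding.in_flight == 0:
            binding.state = "terminated"

reg = Registry()
v1 = reg.begin()              # event starts under version 1
reg.reload()                  # hot reload: v1 draining, v2 active
print(reg.begin())            # 2: new events route to version 2
reg.finish(v1)
print(reg.versions[1].state)  # terminated once v1 drains
```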
Event-log integration
When an active event log is installed for the VM thread, every lifecycle
transition appends a record to the triggers.lifecycle topic. The event
payload includes:
- logical trigger id
- `id@vN` binding key
- provider
- trigger kind
- handler kind
- transition `from_state` and `to_state`
harn doctor uses the installed registry snapshot to report the live
bindings it sees after manifest load, including state, version, and
zeroed metrics for newly installed triggers.
The trigger stdlib’s manual replay path also depends on the registry:
- `trigger_fire(...)` records the synthetic event on `triggers.events`
- `trigger_replay(...)` looks up that recorded envelope plus any pending stdlib DLQ summary entry on `triggers.dlq`
- the wrapper then re-enters the dispatcher against the resolved live binding version and threads `replay_of_event_id` through dispatch observability
Test Harness
harn_vm::triggers::test_util now provides the shared trigger-system
test harness used by both Rust unit tests and .harn conformance
fixtures. The harness owns:
- a reusable mock clock with wall-clock and monotonic hooks
- a recording connector sink/registry for emitted normalized events
- named fixture runners that cover cron, webhook verification, retry/backoff, DLQ/replay, dedupe, rate limiting, cost guards, crash recovery, hot reload, and dead-man alerts
The script-facing entrypoint is the trigger_test_harness(...) builtin,
which returns a structured report for the selected fixture instead of
requiring each conformance script to rebuild connector state by hand.
Cron connector
The cron connector is Harn’s in-process scheduler for time-triggered work. It
implements the shared Connector trait, evaluates cron expressions in an IANA
time zone, and persists the last-fired boundary for each trigger in the shared
EventLog.
Manifest shape
Cron triggers live under [[triggers]] and keep their schedule-specific
settings inline with the rest of the trigger manifest entry:
[[triggers]]
id = "daily-digest"
kind = "cron"
provider = "cron"
match = { events = ["cron.tick"] }
handler = "worker://digest-queue"
schedule = "0 9 * * *"
timezone = "America/New_York"
catchup_mode = "skip"
Supported fields:
- `schedule`: five-field cron expression parsed by `croner`
- `timezone`: IANA time zone name such as `America/New_York`
- `catchup_mode`: `skip` (default), `all`, or `latest`
Offset literals such as +02:00 and UTC-5 are rejected at manifest-load
time. Use a named zone instead so DST transitions can be evaluated correctly.
DST semantics
The cron connector intentionally favors stable wall-clock semantics over trying to synthesize impossible local times:
- Fall-back overlaps fire a matching wall-clock slot once, even though the local hour appears twice.
- Spring-forward gaps do not invent a firing for a missing local time. A schedule like `0 2 * * *` simply does not fire on the DST transition day when `02:00` is skipped.
- Named zones continue to track the intended local wall time across standard and daylight time. Midnight in `America/New_York` fires at `05:00Z` in winter and `04:00Z` in summer.
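The winter/summer offsets are easy to confirm with any IANA-aware library; this check uses Python's standard-library `zoneinfo`:

```python
# Midnight in America/New_York maps to 05:00Z under EST (winter) and
# 04:00Z under EDT (summer); the dates here are arbitrary examples.
from datetime import datetime
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")
winter = datetime(2024, 1, 15, 0, 0, tzinfo=ny)
summer = datetime(2024, 7, 15, 0, 0, tzinfo=ny)
print(winter.astimezone(ZoneInfo("UTC")).hour)  # 5
print(summer.astimezone(ZoneInfo("UTC")).hour)  # 4
```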
Durable state
Every successful firing appends the latest scheduled boundary for that trigger
to the EventLog topic connectors.cron.state. On restart, the connector reloads
the latest entry for each trigger_id and uses it to determine whether any
ticks were missed while the orchestrator was down.
The current implementation persists:
- `trigger_id`
- `last_fired_at`
This keeps recovery append-only and backend-agnostic across the memory, file, and SQLite EventLog implementations.
Catch-up modes
Catch-up behavior is evaluated from the persisted last_fired_at boundary to
the connector’s current clock on activation.
- `skip`: drop missed ticks and resume from “now”
- `all`: replay every missed scheduled tick in chronological order
- `latest`: replay only the most recent missed scheduled tick
Catch-up reuses the original scheduled boundary as occurred_at, so downstream
consumers can distinguish between when a job was due and when the process
actually resumed.
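Given a precomputed list of missed scheduled boundaries, the three modes reduce to a small selection function, sketched here in Python:

```python
# Sketch of catch-up selection over missed ticks; computing the missed
# boundaries from the cron expression is assumed to have happened already.
def catch_up(missed_ticks, mode="skip"):
    if mode == "skip":
        return []                 # drop missed ticks, resume from now
    if mode == "all":
        return missed_ticks       # replay every tick in order
    if mode == "latest":
        return missed_ticks[-1:]  # replay only the most recent tick
    raise ValueError(f"unknown catchup_mode: {mode}")

missed = ["2024-05-01T09:00Z", "2024-05-02T09:00Z", "2024-05-03T09:00Z"]
print(catch_up(missed, "skip"))    # []
print(catch_up(missed, "latest"))  # ['2024-05-03T09:00Z']
```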
Event output
Until the broader trigger dispatcher lands, cron firings are emitted as
serialized TriggerEvent envelopes on the EventLog topic connectors.cron.tick
with provider cron, kind tick, and a CronEventPayload that includes:
- `cron_id`
- `schedule`
- `tick_at`
- `raw.catchup`
- `raw.timezone`
This keeps the connector testable today and preserves a normalized event shape for the follow-up dispatcher work.
GitHub App connector
GitHubConnector is Harn’s built-in GitHub App integration for inbound webhook
events plus outbound GitHub REST calls authenticated as an installation.
The MVP scope in #170 is intentionally narrow:
- inbound GitHub webhook verification with `X-Hub-Signature-256`
- strongly typed payload narrowing for the six orchestration-relevant event families: `issues`, `pull_request`, `issue_comment`, `pull_request_review`, `push`, and `workflow_run`
- outbound installation-token lifecycle for GitHub App auth
- seven outbound helper methods exposed through `std/connectors/github`
Guided install / OAuth setup remains deferred to C-10. This landing supports the manual-config path now: provide the App id, installation id, private key, and webhook secret through the orchestrator config + secret providers.
Inbound webhook bindings
Configure GitHub as a provider = "github" webhook trigger:
[[triggers]]
id = "github-prs"
kind = "webhook"
provider = "github"
match = { path = "/hooks/github" }
handler = "handlers::on_github"
dedupe_key = "event.dedupe_key"
secrets = { signing_secret = "github/webhook-secret" }
The connector verifies X-Hub-Signature-256 against the raw request body using
the shared verify_hmac_signed(...) helper from the generic webhook path. It
does not duplicate HMAC logic. Successful deliveries normalize into
TriggerEvent with:
- `kind` from `X-GitHub-Event`
- `dedupe_key` from `X-GitHub-Delivery`
- `signature_status = { state: "verified" }`
- `provider_payload = GitHubEventPayload`
GitHubEventPayload is narrowed into the six MVP event families. For example,
an issues delivery exposes payload.issue, while pull_request_review
exposes both payload.review and payload.pull_request.
Outbound configuration
Outbound helpers authenticate as a GitHub App installation. Required config:
- `app_id`
- `installation_id`
- `private_key_pem` or `private_key_secret`
Optional config:
- `api_base_url` for GitHub Enterprise or tests; defaults to `https://api.github.com`
Recommended production shape:
import { configure } from "std/connectors/github"
configure({
app_id: 12345,
installation_id: 67890,
private_key_secret: "github/app-private-key",
})
For tests and local fixtures, private_key_pem can be passed inline.
Installation-token lifecycle
The connector follows the GitHub App installation flow:
- Mint a short-lived App JWT (`RS256`, `iss = app_id`) from the configured private key.
- Exchange it at `POST /app/installations/{installation_id}/access_tokens`.
- Cache the returned installation token per installation.
- Refresh lazily a little before expiry, or immediately after a `401`.
The in-process cache refreshes roughly every 55 minutes, comfortably ahead of GitHub's one-hour token validity. Token fetches still flow through the shared secret-provider-backed connector context, and outbound requests are scoped through the connector `RateLimiterFactory`.
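The refresh-before-expiry behavior can be sketched as a small cache; the 55-minute horizon comes from the text, while the class, `fetch` callable, and method names are hypothetical:

```python
# Sketch of a lazy refresh-before-expiry installation-token cache.
import time

class TokenCache:
    REFRESH_AFTER = 55 * 60  # refresh ~5 minutes before the 1h expiry

    def __init__(self, fetch):
        self.fetch = fetch  # callable minting a fresh installation token
        self.tokens = {}    # installation_id -> (token, fetched_at)

    def get(self, installation_id, now=None):
        now = time.time() if now is None else now
        cached = self.tokens.get(installation_id)
        if cached and now - cached[1] < self.REFRESH_AFTER:
            return cached[0]
        token = self.fetch(installation_id)  # lazy re-mint
        self.tokens[installation_id] = (token, now)
        return token

    def invalidate(self, installation_id):
        # called after a 401 so the next get() re-mints immediately
        self.tokens.pop(installation_id, None)

mints = []
cache = TokenCache(lambda i: mints.append(i) or f"tok-{len(mints)}")
print(cache.get(67890, now=0))     # tok-1
print(cache.get(67890, now=60))    # tok-1, still fresh
print(cache.get(67890, now=3400))  # tok-2, past the 55-minute horizon
```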
Outbound helpers
Import from std/connectors/github:
import {
add_labels,
comment,
create_issue,
get_pr_diff,
list_stale_prs,
merge_pr,
request_review,
} from "std/connectors/github"
Available methods:
- `comment(issue_url, body, options = nil)`
- `add_labels(issue_url, labels, options = nil)`
- `request_review(pr_url, reviewers, options = nil)`
- `merge_pr(pr_url, options = nil)`
- `list_stale_prs(repo, days, options = nil)`
- `get_pr_diff(pr_url, options = nil)`
- `create_issue(repo, title, body = nil, labels = nil, options = nil)`
All helpers accept the same auth/config fields through options, but
configure(...) is the intended shared setup path.
Example:
import {
comment,
configure,
list_stale_prs,
merge_pr,
} from "std/connectors/github"
pipeline default() {
configure({
app_id: 12345,
installation_id: 67890,
private_key_secret: "github/app-private-key",
})
let stale = list_stale_prs("acme/api", 14)
if stale.total_count > 0 {
let pr = stale.items[0]
comment("https://github.com/acme/api/issues/" + to_string(pr.number), "Taking a look.")
}
let merged = merge_pr(
"https://github.com/acme/api/pull/42",
{merge_method: "squash", admin_override: true},
)
println(merged.merged)
}
admin_override: true records that the caller requested an override and
annotates the returned JSON with admin_override_requested = true. GitHub’s
REST merge endpoint does not currently expose a distinct override flag, so the
connector still uses the standard merge call.
Rate limiting
The connector uses the shared RateLimiterFactory with a per-installation
scope key before each outbound request. It also reacts to GitHub rate-limit
responses:
- retries once after 429 using Retry-After or X-RateLimit-Reset
- invalidates cached tokens and re-mints on 401
- emits observations to the connectors.github.rate_limit event-log topic
This keeps the MVP aligned with the generic connector rate-limit contract without introducing a second bespoke limiter.
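The single-retry behavior on 429 can be sketched as follows. This is an illustrative sketch, not the connector's code; http_post, sleep, and the response shape are hypothetical stand-ins.

```harn
// Illustrative sketch of the retry-once-on-429 policy (hypothetical helpers).
fn github_request(url, body) {
    let resp = http_post(url, body)
    if resp.status == 429 {
        // Prefer Retry-After; fall back to X-RateLimit-Reset.
        let wait = resp.headers["Retry-After"] ?? resp.headers["X-RateLimit-Reset"]
        sleep(to_int(wait))
        return http_post(url, body)   // retry exactly once
    }
    return resp
}
```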
Generic webhook connector
GenericWebhookConnector is the first concrete inbound connector built on top
of the C-01 Connector trait. It accepts generic HTTP webhook deliveries,
verifies supported HMAC signature conventions against the raw request body, and
normalizes the delivery into a TriggerEvent with the built-in
GenericWebhookPayload shape.
The current implementation is intentionally small:
- activation-only; the O-02 HTTP listener wires up request routing later
- raw-body verification for Standard Webhooks, Stripe-style, and GitHub-style signatures
- TriggerEvent normalization with header redaction and provider payload preservation
- process-local dedupe stub, keyed by the manifest dedupe_key opt-in, until the durable trigger inbox lands
Manifest shape
[[triggers]]
id = "incoming-webhook"
kind = "webhook"
provider = "webhook"
match = { path = "/hooks/incoming" }
handler = "handlers::on_webhook"
dedupe_key = "event.dedupe_key"
secrets = { signing_secret = "webhook/incoming" }
[triggers.webhook]
signature_scheme = "standard" # "standard" | "stripe" | "github"
timestamp_tolerance_secs = 300
source = "incoming"
signature_scheme defaults to "standard" when omitted. Standard Webhooks and
Stripe-style signatures default to a 5-minute timestamp tolerance. GitHub-style
signatures are untimestamped and therefore ignore timestamp skew.
Supported signature conventions
The connector delegates signature checks to
harn_vm::connectors::verify_hmac_signed(...), so it inherits the shared
verification rules from C-01:
- verify against the raw inbound bytes, not a reparsed body
- compare signatures in constant time
- enforce a timestamp window for timestamped schemes
- append signature failures to the audit.signature_verify event-log topic
Supported variants:
- Standard Webhooks: webhook-id, webhook-timestamp, webhook-signature: v1,<base64>
- Stripe-style: Stripe-Signature: t=<unix>,v1=<hex>[,v1=<hex>...]
- GitHub-style: X-Hub-Signature-256: sha256=<hex>
Normalized event fields
For successful deliveries the connector produces:
- provider = "webhook"
- kind from RawInbound.kind, then X-GitHub-Event, then the payload's type/event, else "webhook"
- dedupe_key from the provider-native delivery identifier: webhook-id, Stripe event id, or X-GitHub-Delivery
- signature_status = { state: "verified" }
- provider_payload = GenericWebhookPayload
GenericWebhookPayload.raw keeps parsed JSON when the body is JSON. When the
payload is not valid JSON, the connector preserves the bytes as:
{
"raw_base64": "<base64-encoded body>",
"raw_utf8": "optional utf-8 view"
}
GenericWebhookPayload.source comes from X-Webhook-Source when present, or
from the binding’s optional webhook.source override.
Dedupe
If the trigger manifest declares dedupe_key, the connector records the
normalized event.dedupe_key in the current inbox dedupe stub and rejects
replays for the same binding. This is process-local today; durable inbox-backed
dedupe is still deferred to T-09.
Activation and listener integration
The connector’s activate() hook validates the binding config and reserves
unique match.path values across active bindings. Because O-02 is still
outstanding, request routing is not implemented here. Until the listener lands:
- a single active binding can call normalize_inbound(...) directly
- multiple active bindings must pass the selected binding_id in RawInbound.metadata.binding_id
Notes and follow-up
- Signature failures are audited even when normalization returns an error.
- Production TLS handling is owned by the eventual listener, not this connector.
- Streaming request bodies larger than 10 MiB is still a follow-up item.
Cookbook
Practical patterns for building agents and pipelines in Harn. Each recipe is self-contained with a short explanation and working code.
1. Basic LLM call
Single-shot prompt with a system message. Set ANTHROPIC_API_KEY (or the
appropriate key for your provider) before running.
pipeline default(task) {
let response = llm_call(
"Explain the builder pattern in three sentences.",
"You are a software engineering tutor. Be concise."
)
println(response)
}
To use a different provider or model, pass an options dict:
pipeline default(task) {
let response = llm_call(
"Explain the builder pattern in three sentences.",
"You are a software engineering tutor. Be concise.",
{provider: "openai", model: "gpt-4o", max_tokens: 512}
)
println(response)
}
2. Agent loop with tools
Register tools with JSON Schema-compatible definitions, generate a system prompt that describes them, then let the LLM call tools in a loop.
pipeline default(task) {
var tools = tool_registry()
tools = tool_define(tools, "read", "Read a file from disk", {
parameters: {path: {type: "string", description: "Path to read"}},
returns: {type: "string"},
handler: { path -> return read_file(path) }
})
tools = tool_define(tools, "search", "Search code for a pattern", {
parameters: {query: {type: "string", description: "Query to search"}},
returns: {type: "string"},
handler: { query ->
let result = shell("grep -r '${query}' src/ || true")
return result.stdout
}
})
let system = tool_prompt(tools)
var messages = task
var done = false
var iterations = 0
while !done && iterations < 10 {
let response = llm_call(messages, system)
let calls = tool_parse_call(response)
if calls.count() == 0 {
println(response)
done = true
} else {
var tool_output = ""
for call in calls {
let t = tool_find(tools, call.name)
let handler = t.handler
// Each tool in this recipe declares a single parameter, so pass
// the first (and only) argument value to the handler.
let result = handler(call.arguments[call.arguments.keys()[0]])
tool_output = tool_output + tool_format_result(call.name, result)
}
messages = tool_output
}
iterations = iterations + 1
}
}
3. Parallel tool execution
Run multiple independent operations concurrently with parallel each.
Results preserve the original list order.
pipeline default(task) {
let files = ["src/main.rs", "src/lib.rs", "src/utils.rs"]
let reviews = parallel each files { file ->
let content = read_file(file)
llm_call(
"Review this code for bugs and suggest fixes:\n\n${content}",
"You are a senior code reviewer. Be specific."
)
}
for i in 0 to files.count exclusive {
println("=== ${files[i]} ===")
println(reviews[i])
}
}
Use parallel when you need to run N indexed tasks rather than mapping
over a list:
pipeline default(task) {
let prompts = [
"Write a haiku about Rust",
"Write a haiku about concurrency",
"Write a haiku about debugging"
]
let results = parallel(prompts.count) { i ->
llm_call(prompts[i], "You are a poet.")
}
for r in results {
println(r)
}
}
4. MCP client integration
Connect to an MCP-compatible tool server, list available tools, and call them. This example uses the filesystem MCP server.
pipeline default(task) {
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
// Check connection
let info = mcp_server_info(client)
println("Connected to: ${info.name}")
// List available tools
let tools = mcp_list_tools(client)
for t in tools {
println("Tool: ${t.name} - ${t.description}")
}
// Write a file, then read it back
mcp_call(client, "write_file", {path: "/tmp/hello.txt", content: "Hello from Harn!"})
let content = mcp_call(client, "read_file", {path: "/tmp/hello.txt"})
println("File content: ${content}")
// List directory
let entries = mcp_call(client, "list_directory", {path: "/tmp"})
println(entries)
mcp_disconnect(client)
}
You can also declare MCP servers in harn.toml for automatic connection.
See MCP and ACP Integration for details.
For remote HTTP MCP servers, authorize once with the CLI and then reuse the
stored token automatically from harn.toml:
harn mcp redirect-uri
harn mcp login https://mcp.notion.com/mcp --scope "read write"
5. Filtering with in and not in
Use the in and not in operators to filter collections by membership.
pipeline default(task) {
let allowed_extensions = [".rs", ".harn", ".toml"]
let files = list_dir("src")
// Filter files to only allowed extensions
let relevant = files.filter({ f ->
let ext = extname(f)
ext in allowed_extensions
})
println("Relevant files: ${relevant}")
// Exclude specific keys from a config dict
let config = {host: "localhost", port: 8080, debug: true, secret: "abc"}
let sensitive = ["secret", "password"]
let safe = {}
for entry in config {
if entry.key not in sensitive {
println("${entry.key}: ${entry.value}")
}
}
}
The in operator works with lists, strings (substring test), dicts
(key membership), and sets.
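A few quick illustrations of those membership forms:

```harn
pipeline default(task) {
    println("rn" in "harn")            // string: substring test
    println(2 in [1, 2, 3])            // list: element membership
    let cfg = {host: "localhost", port: 8080}
    println("port" in cfg)             // dict: key membership
    println(8080 not in [80, 443])     // negated membership
}
```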
6. Pipeline composition
Split agent logic across files and compose pipelines using imports and inheritance.
lib/context.harn – shared context-gathering logic:
fn gather_context(task) {
let readme = read_file("README.md")
return {
task: task,
readme: readme,
timestamp: timestamp()
}
}
lib/review.harn – a reusable review pipeline:
import "lib/context"
pipeline review(task) {
let ctx = gather_context(task)
let prompt = "Review this project.\n\nREADME:\n${ctx.readme}\n\nTask: ${ctx.task}"
let result = llm_call(prompt, "You are a code reviewer.")
println(result)
}
main.harn – extend and customize:
import "lib/review"
pipeline default(task) extends review {
override setup() {
println("Starting custom review pipeline")
}
}
7. Error handling in agent loops
Wrap LLM calls in try/catch with retry to handle transient failures.
Use typed catch for structured error handling.
pipeline default(task) {
enum AgentError {
LlmFailure(message)
ParseFailure(raw)
Timeout(seconds)
}
fn safe_llm_call(prompt, system) {
retry 3 {
try {
let raw = llm_call(prompt, system)
let parsed = json_parse(raw)
return parsed
} catch (e) {
println("LLM call failed: ${e}")
throw AgentError.LlmFailure(to_string(e))
}
}
}
try {
let result = safe_llm_call(
"Return a JSON object with keys 'summary' and 'score'.",
"You are an evaluator. Always respond with valid JSON only."
)
println("Summary: ${result.summary}")
println("Score: ${result.score}")
} catch (e) {
// Harn supports a single catch per try; branch on the error type here.
if type_of(e) == "enum" {
match e.variant {
"LlmFailure" -> { println("LLM failed after retries: ${e.fields[0]}") }
"ParseFailure" -> { println("Could not parse LLM output: ${e.fields[0]}") }
"Timeout" -> { println("Timed out after ${e.fields[0]}s") }
}
} else {
println("Unexpected error: ${e}")
}
}
}
8. Channel-based coordination
Use channels to coordinate between spawned tasks. One task produces work, another consumes it.
pipeline default(task) {
let ch = channel("work", 10)
let results_ch = channel("results", 10)
// Producer: send work items
let producer = spawn {
let items = ["item_a", "item_b", "item_c"]
for item in items {
send(ch, item)
}
send(ch, "DONE")
}
// Consumer: process work items
let consumer = spawn {
var processed = 0
var running = true
while running {
let item = receive(ch)
if item == "DONE" {
running = false
} else {
let result = "processed: ${item}"
send(results_ch, result)
processed = processed + 1
}
}
send(results_ch, "COMPLETE:${processed}")
}
await(producer)
await(consumer)
// Collect results
var collecting = true
while collecting {
let msg = receive(results_ch)
if msg.starts_with("COMPLETE:") {
println(msg)
collecting = false
} else {
println(msg)
}
}
}
9. Context building pattern
Gather context from multiple sources, merge it into a single dict, and pass it to an LLM.
pipeline default(task) {
fn read_or_empty(path) {
try {
return read_file(path)
} catch (e) {
return ""
}
}
// Gather context from multiple sources in parallel
let sources = ["README.md", "CHANGELOG.md", "docs/architecture.md"]
let contents = parallel each sources { path ->
{path: path, content: read_or_empty(path)}
}
// Build a merged context dict
var context = {task: task, files: {}}
for item in contents {
if item.content != "" {
context = context.merge({files: context.files.merge({[item.path]: item.content})})
}
}
// Format context for the LLM
var prompt = "Task: ${task}\n\n"
for entry in context.files {
prompt += "=== ${entry.key} ===\n${entry.value}\n\n"
}
let result = llm_call(prompt, "You are a helpful assistant. Use the provided files as context.")
println(result)
}
10. Structured output parsing
Ask the LLM for JSON output, parse it with json_parse, and validate
the structure before using it.
pipeline default(task) {
let system = """
You are a task planner. Given a task description, break it into steps.
Respond with ONLY a JSON array of objects, each with "step" (string) and
"priority" (int 1-5). No other text.
"""
fn get_plan(task_desc) {
retry 3 {
let raw = llm_call(task_desc, system)
let parsed = json_parse(raw)
// Validate structure
guard type_of(parsed) == "list" else {
throw "Expected a JSON array, got: ${type_of(parsed)}"
}
for item in parsed {
guard item.has("step") && item.has("priority") else {
throw "Missing required fields in: ${json_stringify(item)}"
}
}
return parsed
}
}
let plan = get_plan("Build a REST API for a todo app")
if plan != nil {
let high_priority = plan.filter({ s -> s.priority <= 3 })
for step in high_priority {
println("[P${step.priority}] ${step.step}")
}
} else {
println("Failed to get a valid plan after retries")
}
}
11. Sets for deduplication and membership testing
Use sets to track processed items and avoid duplicates. Sets provide
O(1)-style membership testing via set_contains and are immutable –
operations like set_add return a new set.
pipeline default(task) {
let urls = [
"https://example.com/a",
"https://example.com/b",
"https://example.com/a",
"https://example.com/c",
"https://example.com/b"
]
// Deduplicate with set(), then convert back to a list
let unique_urls = to_list(set(urls))
println("${len(unique_urls)} unique URLs out of ${len(urls)} total")
// Track which URLs have been processed
var visited = set()
for url in unique_urls {
if !set_contains(visited, url) {
println("Processing: ${url}")
visited = set_add(visited, url)
}
}
// Set operations: find overlap between two batches
let batch_a = set("task-1", "task-2", "task-3")
let batch_b = set("task-2", "task-3", "task-4")
let already_done = set_intersect(batch_a, batch_b)
let new_work = set_difference(batch_b, batch_a)
println("Overlap: ${len(already_done)}, New: ${len(new_work)}")
}
12. Typed functions with runtime enforcement
Add type annotations to function parameters for automatic runtime
validation. When a caller passes a value of the wrong type, the VM
throws a TypeError before the function body executes.
pipeline default(task) {
fn summarize(text: string, max_words: int) -> string {
let words = text.split(" ")
if words.count <= max_words {
return text
}
let truncated = words.slice(0, max_words)
return "${join(truncated, " ")}..."
}
println(summarize("The quick brown fox jumps over the lazy dog", 5))
// Catch type errors gracefully. `harn check` rejects this call statically
// before the catch can run — the example is shown for illustration only.
try {
summarize(42, "not a number")
} catch (e) {
println("Caught: ${e}")
// -> TypeError: parameter 'text' expected string, got int (42)
}
// Works with all primitive types: string, int, float, bool, list, dict, set
fn process_batch(items: list, verbose: bool) {
for item in items {
if verbose {
println("Processing: ${item}")
}
}
println("Done: ${len(items)} items")
}
process_batch(["a", "b", "c"], true)
}
13. MCP client with agent loop
Connect to an MCP server and pass its tools to an agent_loop, letting
the LLM decide which tools to call.
pipeline default(task) {
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let mcp_tool_list = mcp_list_tools(client)
// Build a tool registry from MCP tools
var tools = tool_registry()
for t in mcp_tool_list {
tools = tool_define(tools, t.name, t.description, {
parameters: t.inputSchema?.properties ?? {},
returns: {type: "string"},
handler: { args -> return mcp_call(client, t.name, args) }
})
}
let result = agent_loop(
"List all files in /tmp and read the first one.",
"You are a helpful file assistant.",
{
tools: tools,
persistent: true,
max_iterations: 10
}
)
println(result.text)
mcp_disconnect(client)
}
14. Recursive agent with tail call optimization
Tail-recursive functions are optimized by the VM, so they do not overflow the stack even across thousands of iterations. This is an advanced pattern useful for processing a queue of work items one at a time.
pipeline default(task) {
let items = ["Refactor auth module", "Add input validation", "Write unit tests"]
fn process(remaining, results) {
if remaining.count == 0 {
return results
}
let item = remaining.first
let rest = remaining.slice(1)
let result = retry 3 {
llm_call(
"Plan how to: ${item}",
"You are a senior engineer. Output a numbered list of steps."
)
}
return process(rest, results + [{task: item, plan: result}])
}
let plans = process(items, [])
for p in plans {
println("=== ${p.task} ===")
println(p.plan)
}
}
For non-LLM workloads, TCO handles deep recursion without issues:
pipeline default(task) {
fn sum_to(n, acc) {
if n <= 0 {
return acc
}
return sum_to(n - 1, acc + n)
}
println(sum_to(10000, 0))
}
15. Multi-agent delegation
Spawn worker agents for different roles and collect their results in parallel.
// Spawn workers and collect results
let agents = ["research", "analyze", "summarize"]
let results = parallel each agents { role ->
let agent = spawn_agent({name: role, system: "You are a ${role} agent."})
send_input(agent, task)
wait_agent(agent)
}
16. Parallel LLM evaluation
Evaluate multiple prompts concurrently using parallel each.
// Evaluate multiple prompts in parallel
let prompts = ["Explain X", "Explain Y", "Explain Z"]
let responses = parallel each prompts { p ->
llm_call({prompt: p})
}
17. MCP client usage
Connect to an MCP server, list tools, call one, and disconnect.
// Connect to an MCP server and call tools
let client = mcp_connect({command: "npx", args: ["-y", "some-mcp-server"]})
let tools = mcp_list_tools(client)
log("Available: ${len(tools)} tools")
let result = mcp_call(client, "tool_name", {arg: "value"})
mcp_disconnect(client)
18. Eval metrics tracking
Track quality metrics during agent execution for later analysis.
// Track quality metrics during agent execution
eval_metric("accuracy", score, {model: model_name})
let usage = llm_usage()
eval_metric("cost_tokens", usage.input_tokens + usage.output_tokens)
Tutorial: Build a code review agent
This tutorial shows a small but realistic review pipeline. The goal is not to rebuild a full IDE integration. Instead, we want a deterministic Harn program that can review a patch, inspect context, and return a concise report.
Use the companion example as a starting point:
cargo run --bin harn -- run examples/code-reviewer.harn
1. Start with a tight review prompt
The simplest useful reviewer is just an LLM call with a strong system prompt. Keep the instructions short, specific, and opinionated:
pipeline default(task) {
let system = """
You are a senior code reviewer.
Review the patch for correctness, security, maintainability, and tests.
Return:
- must-fix issues
- suggestions
- missing tests
End with a short verdict.
"""
let review = llm_call(task, system, {
temperature: 0.2,
max_tokens: 1200,
})
println(review.text)
}
This is enough when the user pastes a diff directly into task.
2. Add file context when you need it
Real review agents usually need a bit of surrounding code. The simplest route is to read a small, explicit list of files and combine them with the patch. Keep the list short so the prompt stays focused.
pipeline default(task) {
let files = ["src/main.rs", "src/lib.rs"]
var context = ""
for file in files {
context = context + "\n\n=== " + file + " ===\n" + read_file(file)
}
let review = llm_call(
"Patch:\n" + task + "\n\nContext:\n" + context,
"""
You are a strict code reviewer.
Flag correctness bugs first, then test gaps, then maintainability issues.
Do not invent missing context. If the context is insufficient, say so.
""",
{temperature: 0.2, max_tokens: 1400}
)
println(review.text)
}
If you want to review a directory tree instead, use list_dir() and
parallel each to gather files concurrently, then trim the result to the most
relevant ones before calling the model.
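That directory-tree variant might look like the following sketch, reusing list_dir(), parallel each, and slice() from the recipes above; the five-file cutoff is an arbitrary illustration.

```harn
pipeline default(task) {
    // Gather file contents concurrently, then trim before prompting.
    let files = list_dir("src")
    let contents = parallel each files { f ->
        {path: f, content: read_file(f)}
    }
    let trimmed = contents.slice(0, 5)   // keep the prompt focused
    var context = ""
    for item in trimmed {
        context = context + "\n\n=== " + item.path + " ===\n" + item.content
    }
    let review = llm_call(
        "Patch:\n" + task + "\n\nContext:\n" + context,
        "You are a strict code reviewer.",
        {temperature: 0.2}
    )
    println(review.text)
}
```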
3. Make the review measurable
Good review agents should record something observable, even if it is only a
small heuristic. Use eval_metric() to track whether the agent found issues
and how often it asked for more context.
pipeline default(task) {
let review = llm_call(
task,
"You are a code reviewer. Return a concise bullet list.",
{temperature: 0.2}
)
let has_issue = review.text.contains("issue") || review.text.contains("bug")
eval_metric("review_has_issue", has_issue)
eval_metric("review_chars", review.text.count)
println(review.text)
}
That makes the output easier to compare in harn eval runs later.
4. When to stop
Use the agent loop when the review needs to gather context, but stop once the review itself is stable. For code review, that usually means:
- inspect a small, explicit file set
- keep the system prompt short
- request concrete fixes, not a long essay
- record metrics so you can compare review quality over time
If you need a richer workflow, combine this with the eval tutorial and the
debugging tools in docs/src/debugging.md.
Tutorial: Build an MCP server
This tutorial builds a small MCP server in Harn. The same program can expose tools, static resources, resource templates, and prompts over stdio.
Use the companion example as a baseline:
cargo run --bin harn -- mcp-serve examples/mcp_server.harn
1. Register tools
Start by creating a tool registry and attaching a few tools with explicit schemas:
pipeline main(task) {
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone by name", {
params: { name: "string" },
handler: { args -> "Hello, " + args.name + "!" },
annotations: {
title: "Greeting Tool",
readOnlyHint: true,
destructiveHint: false,
}
})
tools = tool_define(tools, "add", "Add two numbers", {
params: { a: "number", b: "number" },
handler: { args -> to_string(args.a + args.b) }
})
mcp_tools(tools)
}
Keep tool names short and descriptive. The description should be written for a model, not for a human reading source code.
2. Add resources and templates
Resources are good for static content, while resource templates are better for parameterized data.
pipeline main(task) {
mcp_resource({
uri: "docs://readme",
name: "README",
mime_type: "text/markdown",
text: "# Harn MCP Demo\n\nThis server is implemented in Harn."
})
mcp_resource_template({
uri_template: "config://{key}",
name: "Configuration values",
mime_type: "text/plain",
handler: { args ->
if args.key == "version" {
"0.6.0"
} else if args.key == "name" {
"harn-demo"
} else {
"unknown key: " + args.key
}
}
})
}
That pattern is useful for docs, policy data, generated summaries, and other state you want to expose without writing a dedicated tool for each lookup.
3. Add prompts
Prompts let the client ask the server for structured guidance:
pipeline main(task) {
mcp_prompt({
name: "code_review",
description: "Review code for correctness and maintainability",
arguments: [
{ name: "code", description: "The code to review", required: true },
{ name: "language", description: "Programming language" }
],
handler: { args ->
let lang = args.language ?? "unknown"
"Please review this " + lang + " code for correctness, bugs, and tests:\n\n" + args.code
}
})
}
Prompts are a good way to standardize a client workflow while still letting the client supply the final payload.
4. Run it over stdio
Once the pipeline calls mcp_tools(), mcp_resource(), or mcp_prompt(),
launch the server with:
harn mcp-serve examples/mcp_server.harn
All user-visible output goes to stderr; the MCP transport stays on stdout. That keeps the server compatible with Claude Desktop, Cursor, and other MCP clients.
5. Keep the surface small
A good MCP server has a narrow surface area:
- expose only the operations the client truly needs
- keep tool names and schemas stable
- prefer explicit resources over ad hoc text blobs
- use resource templates when one static resource is not enough
If you want the server to be consumable from a desktop client, add a short launch snippet in the client config and test the tool list before expanding the surface.
Tutorial: Build an eval pipeline
This tutorial builds a small evaluation loop that runs a set of examples, records metrics, and produces an auditable summary. The goal is to make quality visible, not to build an elaborate benchmark harness.
Use the companion example as a baseline:
cargo run --bin harn -- run examples/data-pipeline.harn
1. Define the dataset inline
Start with a tiny set of representative inputs. Keep the examples small enough that you can inspect failures by eye:
pipeline main(task) {
let cases = [
{id: "case-1", input: "What is 2 + 2?", expected: "4"},
{id: "case-2", input: "Capital of France?", expected: "Paris"},
{id: "case-3", input: "Color of grass?", expected: "green"},
]
println("Loaded ${cases.count} eval cases")
}
2. Run the cases in parallel
If each case is independent, use parallel each so the slow parts overlap.
pipeline main(task) {
let cases = [
{id: "case-1", input: "What is 2 + 2?", expected: "4"},
{id: "case-2", input: "Capital of France?", expected: "Paris"},
{id: "case-3", input: "Color of grass?", expected: "green"},
]
let results = parallel each cases { tc ->
let answer = llm_call(tc.input, "Answer in one word or short phrase.", {
temperature: 0.0,
max_tokens: 64,
})
{
id: tc.id,
expected: tc.expected,
actual: answer.text,
correct: answer.text.contains(tc.expected),
}
}
println(json_stringify(results))
}
For a real eval suite, replace the inline cases list with a manifest or a
dataset file that your pipeline reads with read_file().
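For example, a newline-delimited JSON dataset could be loaded like this. The file path and JSONL format are assumptions for illustration; split(), json_parse(), and list concatenation are shown elsewhere in this cookbook.

```harn
// Load eval cases from a JSONL file instead of an inline list.
var cases = []
for line in read_file("evals/cases.jsonl").split("\n") {
    if line != "" {
        cases = cases + [json_parse(line)]
    }
}
println("Loaded ${cases.count} eval cases")
```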
3. Record metrics
The important part of an eval pipeline is the metric trail. Use
eval_metric() to record per-case and aggregate results.
pipeline main(task) {
let cases = [
{id: "case-1", input: "What is 2 + 2?", expected: "4"},
{id: "case-2", input: "Capital of France?", expected: "Paris"},
]
var passed = 0
for tc in cases {
let answer = llm_call(tc.input, "Answer in one word.", {temperature: 0.0})
let correct = answer.text.contains(tc.expected)
if correct {
passed = passed + 1
}
eval_metric("case_correct", correct, {case_id: tc.id})
}
let accuracy = passed / cases.count
eval_metric("accuracy", accuracy, {passed: passed, total: cases.count})
eval_metric("run_id", uuid())
eval_metric("generated_at", timestamp())
}
4. Export a report
Once the metrics are recorded, write a compact report so a later run can diff the results.
pipeline main(task) {
let summary = {
run_id: uuid(),
generated_at: timestamp(),
accuracy: 0.83,
notes: "Replace the fixed accuracy with real case scoring",
}
write_file("eval-summary.json", json_stringify(summary))
println(json_stringify(summary))
}
5. How to use it
Run the pipeline, inspect the metrics, then compare runs over time:
harn run examples/eval-workflow.harn
harn eval .harn-runs/<run-id>.json
A good eval pipeline answers three questions:
- did the model improve?
- did latency or token usage regress?
- which cases failed, and why?
Best practices
This guide collects the habits that keep Harn programs small, testable, and easier to operate.
Keep prompts narrow
The best prompts are short and explicit. Tell the model exactly what shape of output you want, what to avoid, and when to say it does not know something. Prefer one task per call over one giant prompt that tries to do everything.
Use explicit context
Pass the minimum useful context into each model call. If the model only needs a few files or a short patch, read those directly instead of dumping the entire repository into the prompt.
Prefer typed boundaries
Use type annotations, shape types, and small helper functions where they make the interface clearer. A narrow typed boundary is easier to debug than a large pile of implicit dicts.
Make concurrency obvious
Use parallel each when the work is independent and order matters. Use
parallel when you need indexed fan-out. Keep the body of each worker short so
it is obvious what is happening concurrently.
Record metrics early
If a pipeline matters enough to keep, add eval_metric() calls sooner rather
than later. Track the numbers you will want during regressions: accuracy,
latency, token usage, and counts of failures or retries.
Fail fast on unclear inputs
Use require, guard, typed catches, and explicit validation when the pipeline
depends on a particular shape of data. It is cheaper to fail immediately than
to let a bad input travel through several stages.
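A minimal sketch of that validation style, using guard as in the structured-output recipe; the error messages are illustrative.

```harn
pipeline default(task) {
    fn score_case(case) {
        // Fail immediately on malformed input instead of letting it
        // travel through later stages.
        guard type_of(case) == "dict" else {
            throw "Expected a dict case, got: ${type_of(case)}"
        }
        guard case.has("input") && case.has("expected") else {
            throw "Case missing 'input' or 'expected': ${json_stringify(case)}"
        }
        return case.input == case.expected
    }
    println(score_case({input: "4", expected: "4"}))
}
```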
Keep operational surfaces small
For MCP servers, host integrations, and agent tools, expose only the minimum surface you need. Smaller tool surfaces are easier to document, secure, and debug.
Inspect before you scale
Use harn repl for quick experiments, harn viz for structural overviews,
harn doctor for environment checks, and cargo run --bin harn-dap through
the DAP adapter when you need line-level stepping.
Recommended workflow
For a new agent or pipeline:
- Prototype the prompt in harn repl.
- Turn it into a named pipeline.
- Add a small example under examples/.
- Add metrics or a conformance test.
- Use harn viz and the debugger when the control flow gets complicated.
That sequence is usually enough to keep the implementation honest without turning the repository into a framework project.
Playground
harn playground runs a pipeline against a Harn-native host module in the same
process. It is intended for fast pipeline iteration without wiring a JSON-RPC
host or booting a larger app shell.
Quick start
The repo ships with a minimal example:
harn playground \
--host examples/playground/host.harn \
--script examples/playground/echo.harn \
--task "Explain this repository in plain English"
--task is exposed to the script through the HARN_TASK environment variable,
so the example reads it with env_or("HARN_TASK", "").
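A minimal entry script in that shape might look like this sketch (the shipped example may differ); build_prompt is the host-module export shown under "Host modules" below.

```harn
// Minimal playground script: read the task from the environment
// and hand it to a host-provided prompt builder.
pipeline default(task) {
    let task_text = env_or("HARN_TASK", "")
    let prompt = build_prompt(task_text)   // exported by the host module
    println(prompt)
}
```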
If you want an offline smoke test, force the mock provider:
harn playground \
--host examples/playground/host.harn \
--script examples/playground/echo.harn \
--task "Say hello" \
--llm mock:mock
For deterministic end-to-end iteration, harn playground also accepts the
same JSONL fixture flags as harn run:
harn playground \
--host examples/playground/host.harn \
--script examples/playground/echo.harn \
--task "Explain this repository" \
--llm-mock fixtures/playground.jsonl
Use --llm-mock-record <path> once to capture a replayable fixture, then
switch back to --llm-mock <path> while you iterate on control flow.
Host modules
A playground host is just a .harn file that exports the functions your
pipeline expects:
pub fn build_prompt(task_text) {
return "Task: " + task_text + "\nWorkspace: " + cwd()
}
pub fn request_permission(tool_name, request_args) -> bool {
return true
}
The playground command loads those exported functions and makes them available to the entry script during execution. If the script calls a host function that the module does not export, the command fails with a pointed error naming the missing function and the caller location.
Watch mode
Use --watch to re-run when either the host module or the script changes:
harn playground --watch --task "Refine the prompt"
The watcher tracks the host and script parent directories recursively and debounces save bursts before re-running.
Starter project
Use the built-in scaffold when you want a dedicated scratchpad:
harn new pipeline-lab-demo --template pipeline-lab
cd pipeline-lab-demo
harn playground --task "Summarize this project"
Host boundary
Harn is the orchestration layer. Hosts supply facts and platform effects.
The boundary should stay narrow:
- Hosts expose typed capabilities such as project scan data, editor state, diagnostics, git facts, approval decisions, and persistence hooks.
- Harn owns orchestration policy: workflow topology, retries, verification, transcript lifecycle, context assembly, contract enforcement, replay, evals, and worker semantics.
What belongs in Harn std/* modules or the VM:
- Generic runtime wrappers like `runtime_task()`, `process_exec()`, or `interaction_ask()`
- Reusable metadata/scanner helpers and product-agnostic project-state normalization
- Transcript schemas, assets, compaction, and replay semantics
- Context/artifact assembly rules that are product-agnostic
- Structured contract enforcement and eval/replay helpers
- Test-time typed host mocks such as `host_mock(...)` when the behavior is a runtime fixture for host-backed flows rather than a product-specific bridge
- Mutation-session identity and audit provenance for write-capable workflows and delegated workers
What should stay in host-side .harn scripts:
- Product-specific prompts and instruction tone
- IDE-specific flows such as edit application, approval UX, repo enrichment, or bespoke tool choreography
- Host-owned filesystem and edit wrappers built on capability-aware `host_call(...)`
- Host-owned editor, diagnostics, git, learning, and project-context wrappers
- Concrete undo/redo stacks and editor-native mutation application
- Proprietary ranking, routing, or heuristics tied to one host product
- Features that depend on host-only commercial, account, or app lifecycle rules
Rule of thumb:
- If a behavior decides how an agent or workflow should think, continue, verify, compact, replay, or select context, it probably belongs in Harn.
- If a behavior fetches facts from a specific editor or app surface, asks the user for approval, or performs a host-only side effect, it belongs in the host.
Keep advanced host-side .harn modules local to the host when they encode
host-only UX, proprietary behavior, or app-specific heuristics. Move a helper
into Harn only when it is general enough to be useful across hosts.
Trust boundary
Harn should own the audit contract for mutations:
- mutation-session IDs
- workflow/worker/session lineage
- tool-gate mutation classification and declared scope
- artifact and run-record provenance
Hosts should own the concrete UX:
- apply/approve/deny flows
- patch previews
- editor undo/redo semantics
- trust UI around which worker or session produced a change
Contract surfaces
Harn now ships machine-readable contract exports so hosts do not need to reverse-engineer runtime assumptions:
- `harn contracts builtins` for the builtin registry and parser/runtime drift
- `harn contracts host-capabilities` for the effective host manifest used by preflight validation
- `harn contracts bundle` for entry modules, imported modules, prompt/template assets, explicit module-dependency edges, required host capabilities, literal execution directories, worker repo dependencies, and stable summary counts
Those surfaces are intended to be the generic boundary for embedded hosts such as editors or native apps. Product-specific packaging logic should build on top of them rather than re-implementing Harn’s import, asset, and host-capability resolution rules independently.
Bridge protocol
Harn’s stdio bridge uses JSON-RPC 2.0 notifications and requests for host/runtime coordination that sits below ACP session semantics.
Tool lifecycle observation
The tool/pre_use, tool/post_use, and tool/request_approval bridge
request/response methods have been retired in favor of the canonical
ACP surface:
- Tool lifecycle is now carried on the `session/update` notification stream as `tool_call` and `tool_call_update` variants (see the ACP schema at https://agentclientprotocol.com/protocol/schema). Hosts observe every dispatch via the session update stream — there is no host-side approve/deny/modify hook at dispatch time.
- Approvals route through canonical `session/request_permission`. When Harn's declarative `ToolApprovalPolicy` classifies a call as `RequiresHostApproval`, the agent loop issues a `session/request_permission` request to the host and fails closed if the host does not implement it (or returns an error).
Internally, the agent loop emits AgentEvent::ToolCall +
AgentEvent::ToolCallUpdate events; harn-cli’s ACP server translates
them into session/update notifications via an AgentEventSink it
registers per session.
session/request_permission
Request payload (harn-issued):
{
"sessionId": "session_123",
"toolCall": {
"toolCallId": "call_123",
"toolName": "edit_file",
"rawInput": {"path": "src/main.rs"}
},
"mutation": {
"session_id": "session_123",
"run_id": "run_123",
"worker_id": null,
"mutation_scope": "apply_workspace",
"approval_policy": {"require_approval": ["edit*"]}
},
"declaredPaths": ["src/main.rs"]
}
Response payload (host-issued):
- `{ "outcome": { "outcome": "selected" } }` (ACP canonical): granted
- `{ "granted": true }` (legacy shim): granted with original args
- `{ "granted": true, "args": {...} }`: granted with rewritten args
- `{ "granted": false, "reason": "..." }`: denied
Worker lifecycle notifications
Delegated workers emit session/update notifications with worker_update
content. Those payloads include lifecycle timing, child run/snapshot paths,
and audit-session metadata so hosts can render background work without
scraping plain-text logs.
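As a rough sketch, a `worker_update` notification might look like the following. The field names here are illustrative, not a schema — the paragraph above lists only the categories of metadata the real payloads carry:

```json
{
  "jsonrpc": "2.0",
  "method": "session/update",
  "params": {
    "sessionId": "session_123",
    "update": {
      "sessionUpdate": "worker_update",
      "workerId": "worker_7",
      "workerName": "background-review",
      "status": "completed",
      "startedAt": "2025-01-01T12:00:00Z",
      "finishedAt": "2025-01-01T12:00:42Z",
      "childRunPath": ".harn-runs/run_456",
      "snapshotPath": ".harn-runs/run_456/snapshot",
      "auditSessionId": "session_123"
    }
  }
}
```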
Daemon idle/resume notifications
Daemon agents stay alive after text-only turns and wait for host activity with adaptive
backoff: 100ms, 500ms, 1s, 2s, resetting to 100ms whenever activity arrives.
agent/idle
Sent as a bridge notification whenever the daemon enters or remains in the idle wait loop.
Payload:
{
"iteration": 3,
"backoff_ms": 1000
}
agent/resume
Hosts can send this notification to wake an idle daemon without injecting a user-visible message.
Payload:
{}
A host may also wake the daemon by sending a queued user_message, session/input, or
agent/user_message notification.
Client-executed tool search
When a Harn script opts into tool_search against a provider that lacks
native defer-loading support, the runtime switches to a client-executed
fallback (see the LLM and agents guide). For the
"bm25" and "regex" strategies everything stays in-VM; the
"semantic" and "host" strategies round-trip the query through the
bridge.
tool_search/query
Request payload (harn-issued, host response required):
{
"strategy": "semantic",
"query": "deploy a new service version",
"candidates": ["deploy_service", "rollback_service", "query_metrics", "..."]
}
- `strategy`: one of `"semantic"` or `"host"`. The in-tree strategies (`"bm25"`/`"regex"`) never hit the bridge.
- `query`: the raw query string the model passed to the synthetic search tool. For `strategy: "regex"`/`"bm25"` hosts don't see this; those strategies run inside the VM.
- `candidates`: full list of deferred tool names the host may choose from. The host should return a subset.
Response payload (host-issued):
{
"tool_names": ["deploy_service", "rollback_service"],
"diagnostic": "matched by vector similarity"
}
- `tool_names` (required): ordered list of tool names to promote. Unknown names are ignored by the runtime — they can't be surfaced because their schemas weren't registered. Return at most ~20 names per call; the runtime applies a soft per-turn cap on promotions regardless.
- `diagnostic` (optional): short explanation surfaced to the model in the tool result alongside `tool_names`. Useful for "no hits, try broader terms"-style feedback.
An ACP-style wrapper { "result": { "tool_names": [...] } } is also
accepted for hosts that re-wrap everything in a result envelope.
Errors: a JSON-RPC error response (standard shape) is surfaced to the
model as a tool_names: [] result with a diagnostic that includes the
host error message. The loop continues — the model can retry with a
different query.
Host tool discovery
Hosts can expose their own dynamic tool surface to scripts without
pre-registering every tool in the initial prompt. Harn discovers that
surface through one bridge RPC and then invokes individual tools
through the existing builtin_call request path.
host/tools/list
VM-issued request. No parameters (or an empty object). The host responds with a list of tool descriptors. Canonical response shape:
{
"tools": [
{
"name": "Read",
"description": "Read a file from the active workspace",
"schema": {
"type": "object",
"properties": {
"path": {"type": "string", "description": "File path to read"}
},
"required": ["path"]
},
"deprecated": false
},
{
"name": "open_file",
"description": "Reveal a file in the editor",
"schema": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
},
"deprecated": true
}
]
}
Accepted variants:
- a bare array `[{...}, {...}]`
- an ACP-style wrapper `{ "result": { "tools": [...] } }`
- compatibility field names `short_description`, `parameters`, or `input_schema`; Harn normalizes them to `description` and `schema`
Each normalized descriptor surfaced to scripts has exactly these keys:
- `name`: string, required
- `description`: string, defaults to `""`
- `schema`: JSON Schema object or `null`
- `deprecated`: boolean, defaults to `false`
Invocation:
- `host_tool_list()` returns the normalized list directly.
- `host_tool_call(name, args)` then dispatches that tool through the existing `builtin_call` bridge request using `name` as the builtin name and `args` as the single argument payload.
Skill registry (issue #73)
Hosts expose their own managed skill store to the VM through three RPCs.
Filesystem skill discovery works without the bridge (harn run walks
the seven non-host layers described in Skills); these
RPCs add a layer 8 so cloud hosts, enterprise deployments, and the
Burin Code IDE can serve skills the filesystem can’t see.
skills/list
VM-issued request. No parameters (or an empty object). The host
responds with an array of SkillManifestRef entries. Minimal shape:
[
{ "id": "deploy", "name": "deploy", "description": "Ship it", "source": "host" },
{ "id": "acme/ops/review", "name": "review", "description": "Code review", "source": "host" }
]
The VM also accepts { "skills": [ ... ] } for hosts that wrap
collections in an object.
skills/fetch
VM-issued request. Parameters: { "id": "<skill id>" }. Response is a
single skill object carrying enough metadata to populate a Skill:
{
"name": "deploy",
"description": "Ship it",
"body": "# Deploy runbook\n...",
"manifest": {
"when_to_use": "...",
"allowed_tools": ["bash", "git"],
"paths": ["infra/**"],
"model": "claude-opus-4-7"
}
}
Hosts may flatten the manifest fields into the top level instead — the CLI accepts either shape.
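For example, the flattened form of the same skill — manifest fields hoisted to the top level — would also be accepted:

```json
{
  "name": "deploy",
  "description": "Ship it",
  "body": "# Deploy runbook\n...",
  "when_to_use": "...",
  "allowed_tools": ["bash", "git"],
  "paths": ["infra/**"],
  "model": "claude-opus-4-7"
}
```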
skills/update
Host-issued notification. No parameters. Invalidates the VM’s cached
skill catalog; the CLI re-runs layered discovery (including another
skills/list call) on the next iteration boundary — for harn watch,
between file changes; for long-running agents, between turns. A VM
without an active bridge simply ignores the notification.
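On the wire, this notification is just the bare JSON-RPC envelope — per the description above it carries no parameters:

```json
{ "jsonrpc": "2.0", "method": "skills/update" }
```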
Host-delegated skill matching
Harn agents that opt into
skill_match: { strategy: "host" } (or the alias "embedding")
delegate skill ranking to the host via a single JSON-RPC request. The
host response is purely advisory — unknown skill names are ignored,
and an RPC error falls back to the in-VM metadata ranker with a
warning logged against agent.skill_match.
skill/match
Request payload (harn-issued, host response required):
{
"strategy": "host",
"prompt": "Ship the new release to production",
"working_files": ["infra/terraform/cluster.tf"],
"candidates": [
{
"name": "ship",
"description": "Ship a production release",
"when_to_use": "User says ship/release/deploy",
"paths": ["infra/**", "Dockerfile"]
},
{
"name": "review",
"description": "Review existing code for correctness",
"when_to_use": "User asks to review/audit",
"paths": []
}
]
}
Response payload (host-issued):
{
"matches": [
{"name": "ship", "score": 0.92, "reason": "matched by embedding similarity"}
]
}
- `matches[*].name` (required): the candidate's skill name. Names absent from the original `candidates` list are ignored.
- `matches[*].score` (optional): non-negative float; higher scores rank earlier. Defaults to `1.0` when omitted.
- `matches[*].reason` (optional): short diagnostic stored on the `skill_matched`/`skill_activated` transcript events. Defaults to `"host match"`.
Alternative shapes accepted for host convenience:
- a top-level array: `[{"name": ..., "score": ...}, ...]`
- a `{"skills": [...]}` wrapper
- a `{"result": {"matches": [...]}}` ACP envelope
Skill lifecycle session updates
Agents emit ACP session/update notifications for skill lifecycle
transitions so hosts can surface active-skill state in real time.
harn-cli’s ACP server translates the canonical AgentEvent
variants into:
- `sessionUpdate: "skill_activated"` — `{skillName, iteration, reason}`
- `sessionUpdate: "skill_deactivated"` — `{skillName, iteration}`
- `sessionUpdate: "skill_scope_tools"` — `{skillName, allowedTools}`
skill_matched stays internal to the VM transcript — the candidate
list can be large and host UIs typically only care about activation
transitions, not every ranking pass.
Host tools over the bridge
host_tool_list() and host_tool_call(name, args) are the host-side
mirror of Harn’s LLM-facing tool_search flow: the script can ask the
host what tools exist right now, inspect their schemas, and invoke the
one it actually needs.
This is useful when the host owns the real capabilities:
- Claude Code style tools such as `Read`, `Edit`, and `Bash`
- IDE actions such as `open_file`, `ide.panel.focus`, or `ide.git.worktree`
- product-specific actions that vary by project, session, or user role
Worked example
The script below discovers a readable tool at runtime, refuses to use a deprecated one, and then calls it with a single structured argument payload.
import { host_tool_available, host_tool_lookup } from "std/host"
pipeline inspect_readme(task) {
if !host_tool_available("Read") {
log("Host does not expose a Read tool in this session")
return nil
}
let read_tool = host_tool_lookup("Read")
assert(read_tool != nil, "Read tool metadata should be present")
assert(read_tool?.deprecated != true, "Read tool is deprecated on this host")
let result = host_tool_call("Read", {path: "README.md"})
log(result)
}
What happens at runtime:
- `host_tool_list()` sends `host/tools/list` to the active bridge host.
- The host replies with tool descriptors: `name`, `description`, `schema`, and `deprecated`.
- `host_tool_call("Read", {path: "README.md"})` reuses the bridge's existing `builtin_call` path, so the host receives the dynamic tool invocation without Harn needing a second bespoke call protocol.
Shape conventions
Harn normalizes each entry returned by host/tools/list to this form:
{
"name": "Read",
"description": "Read a file",
"schema": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
},
"deprecated": false
}
That means scripts can safely branch on tool.schema or
tool.deprecated without having to care whether the host originally
used compatibility field names such as short_description or
input_schema.
Notes
- Without a bridge host, `host_tool_list()` returns `[]`.
- `host_tool_call(...)` requires an attached bridge host and throws if none is active.
- Hosts remain authoritative: if a tool disappears between discovery and invocation, the host error is surfaced to the script normally.
MCP and ACP integration
Harn has built-in support for the Model Context Protocol (MCP), Agent Client Protocol (ACP), and Agent-to-Agent (A2A) protocol. This guide covers how to use each from both client and server perspectives.
MCP client (connecting to MCP servers)
Connect to any MCP-compatible tool server, list its capabilities, and call tools from within a Harn program. Harn supports both stdio MCP servers and remote HTTP MCP servers.
Connecting manually
Use mcp_connect to spawn an MCP server process and perform the
initialize handshake:
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let info = mcp_server_info(client)
println("Connected to: ${info.name}")
Listing and calling tools
let tools = mcp_list_tools(client)
for t in tools {
println("${t.name}: ${t.description}")
}
let content = mcp_call(client, "read_file", {path: "/tmp/data.txt"})
println(content)
mcp_call returns a string for single-text results, a list of content
dicts for multi-block results, or nil when empty. If the tool reports an
error, mcp_call throws.
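A defensive caller can branch on those three shapes. The sketch below assumes an expression-form `try`/`catch` and a hypothetical `type_of` builtin — substitute whatever type-inspection helper your runtime actually exposes:

```
let content = try {
    mcp_call(client, "read_file", {path: "/tmp/data.txt"})
} catch e {
    log("tool error: ${e}")
    nil
}
if content == nil {
    println("empty result")
} else if type_of(content) == "string" {
    // single-text result
    println(content)
} else {
    // multi-block result: list of content dicts
    for block in content { println(block) }
}
```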
Resources and prompts
let resources = mcp_list_resources(client)
let data = mcp_read_resource(client, "file:///tmp/config.json")
let prompts = mcp_list_prompts(client)
let prompt = mcp_get_prompt(client, "review", {code: "fn main() {}"})
Disconnecting
mcp_disconnect(client)
Auto-connection via harn.toml
Instead of calling mcp_connect manually, declare servers in harn.toml.
They connect automatically before the pipeline executes and are available
through the global mcp dict:
[[mcp]]
name = "filesystem"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
[[mcp]]
name = "github"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
[[mcp]]
name = "notion"
transport = "http"
url = "https://mcp.notion.com/mcp"
scopes = "read write"
Lazy boot (harn#75)
Servers marked lazy = true are NOT booted at pipeline startup. They
start on the first mcp_call, mcp_ensure_active("name"), or skill
activation that declares the server in requires_mcp. This keeps cold
starts fast when many servers are declared but only a few are needed
per run.
[[mcp]]
name = "github"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
lazy = true
keep_alive_ms = 30_000 # keep the process alive 30s after last release
[[mcp]]
name = "datadog"
command = "datadog-mcp"
lazy = true
Ref-counting: each skill activation or explicit
mcp_ensure_active(name) call bumps a binder count. On deactivation or
mcp_release(name), the count drops. When it reaches zero, Harn
disconnects the server — immediately if keep_alive_ms is absent, or
after the window elapses if set.
Explicit control from user code:
// Start the lazy server and hold it open.
let client = mcp_ensure_active("github")
let issues = mcp_call(client, "list_issues", {repo: "burin-labs/harn"})
// Release when done — lets the registry shut it down.
mcp_release("github")
// Inspect current state.
let status = mcp_registry_status()
for s in status {
println("${s.name}: lazy=${s.lazy} active=${s.active} refs=${s.ref_count}")
}
Server Cards (MCP v2.1)
A Server Card is a small JSON document that advertises a server’s identity, capabilities, and tool catalog without requiring a connection. Harn consumes cards for discoverability and can publish its own when running as an MCP server.
Declare a card source in harn.toml:
[[mcp]]
name = "notion"
transport = "http"
url = "https://mcp.notion.com/mcp"
card = "https://mcp.notion.com/.well-known/mcp-card"
[[mcp]]
name = "local-agent"
command = "my-agent"
lazy = true
card = "./agents/my-agent-card.json"
Fetch it from a pipeline:
// Look up by registered server name.
let card = mcp_server_card("notion")
println(card.description)
for t in card.tools {
println("- ${t.name}")
}
// Or pass a URL / path directly.
let card = mcp_server_card("./agents/my-agent-card.json")
Cards are cached in-process with a 5-minute TTL — repeated calls are free. Skill matchers can factor card metadata into scoring without paying connection cost.
Skill-scoped MCP binding
Skills can declare the MCP servers they need via requires_mcp (or the
equivalent mcp) frontmatter field. On activation, Harn ensures every
listed server is running; on deactivation, it releases them.
skill github_triage {
description: "Triage GitHub issues and cut fixes",
when_to_use: "User mentions a GitHub issue or PR by number",
requires_mcp: ["github"],
allowed_tools: ["list_issues", "create_pr", "add_comment"],
prompt: "You are a triage assistant...",
}
When agent_loop activates github_triage, the lazy github MCP
server boots (if configured that way) and its process stays alive for
as long as the skill is active. When the skill deactivates, the server
is released — and if no other skill holds it, the process shuts down
(respecting keep_alive_ms).
Transcript events emitted along the way: skill_mcp_bound,
skill_mcp_unbound, skill_mcp_bind_failed.
MCP tools in the tool-search index
When an LLM uses tool_search (progressive tool disclosure), MCP tools
are auto-tagged with both mcp:<server> and <server> in the BM25
corpus. That means a query like "github" or "mcp:github" surfaces
every tool from that server even when the tool’s own name and
description don’t contain the word. Tools returned by mcp_list_tools
carry an _mcp_server field that the indexer consumes automatically —
no extra wiring needed.
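Concretely, a descriptor returned by `mcp_list_tools` might carry (fields other than `_mcp_server`, which the text above names, are illustrative):

```json
{
  "name": "create_issue",
  "description": "Open a new issue in a repository",
  "_mcp_server": "github"
}
```

A query for `github` or `mcp:github` would then surface `create_issue` even though neither string appears in the tool's own name or description.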
Use them in your pipeline:
pipeline default(task) {
let tools = mcp_list_tools(mcp.filesystem)
let content = mcp_call(mcp.filesystem, "read_file", {path: "/tmp/data.txt"})
println(content)
}
If a server fails to connect, a warning is printed to stderr and that
server is omitted from the mcp dict. Other servers still connect
normally.
For HTTP MCP servers, Harn can reuse OAuth tokens stored with the CLI:
harn mcp redirect-uri
harn mcp login notion
If the server uses a pre-registered OAuth client, you can provide those
values in harn.toml or on the CLI:
[[mcp]]
name = "internal"
transport = "http"
url = "https://mcp.example.com"
client_id = "https://client.example.com/metadata.json"
client_secret = "super-secret"
scopes = "read:docs write:docs"
When no client_id is provided, Harn will attempt dynamic client
registration if the authorization server advertises it.
Example: filesystem MCP server
A complete example connecting to the filesystem MCP server, writing a file, and reading it back:
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
mcp_call(client, "write_file", {path: "/tmp/hello.txt", content: "Hello from Harn!"})
let content = mcp_call(client, "read_file", {path: "/tmp/hello.txt"})
println(content)
let entries = mcp_call(client, "list_directory", {path: "/tmp"})
println(entries)
mcp_disconnect(client)
MCP server (exposing Harn as an MCP server)
Harn pipelines can expose tools, resources, resource templates, and prompts as an MCP server. This lets Claude Desktop, Cursor, or any MCP client call into your Harn code.
Defining tools
Use tool_registry() and tool_define() to create tools, then register
them with mcp_tools():
pipeline main(task) {
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone", {
parameters: {name: "string"},
handler: { args -> "Hello, ${args.name}!" }
})
tools = tool_define(tools, "search", "Search files", {
parameters: {query: "string"},
handler: { args -> "results for ${args.query}" },
annotations: {
title: "File Search",
readOnlyHint: true,
destructiveHint: false
}
})
mcp_tools(tools)
}
Defining resources and prompts
pipeline main(task) {
// Static resource
mcp_resource({
uri: "docs://readme",
name: "README",
text: "# My Agent\nA demo MCP server."
})
// Dynamic resource template
mcp_resource_template({
uri_template: "config://{key}",
name: "Config Values",
handler: { args -> "value for ${args.key}" }
})
// Prompt
mcp_prompt({
name: "review",
description: "Code review prompt",
arguments: [{name: "code", required: true}],
handler: { args -> "Please review:\n${args.code}" }
})
}
Running as an MCP server
harn mcp-serve agent.harn
All print/println output goes to stderr (stdout is the MCP
transport). The server supports the 2025-11-25 MCP protocol version
over stdio.
Publishing a Server Card
Attach a Server Card so clients can discover your server’s identity and capabilities before connecting:
harn mcp-serve agent.harn --card ./card.json
The card JSON is embedded in the initialize response’s
serverInfo.card field and also exposed as a read-only resource at
well-known://mcp-card. Minimal shape:
{
"name": "my-agent",
"version": "1.0.0",
"description": "Short one-line summary shown in pickers.",
"protocolVersion": "2025-11-25",
"capabilities": { "tools": true, "resources": false, "prompts": false },
"tools": [
{"name": "greet", "description": "Greet someone by name"}
]
}
--card also accepts an inline JSON string for ad-hoc publishing:
--card '{"name":"demo","description":"…"}'.
Configuring in Claude Desktop
Add to claude_desktop_config.json:
{
"mcpServers": {
"my-agent": {
"command": "harn",
"args": ["mcp-serve", "agent.harn"]
}
}
}
ACP (Agent Client Protocol)
ACP lets host applications and local clients use Harn as a runtime backend. Communication is JSON-RPC 2.0 over stdin/stdout.
Bridge-level tool gates and daemon idle/resume notifications are documented in Bridge protocol.
Running the ACP server
harn acp # no pipeline, uses bridge mode
harn acp pipeline.harn # execute a specific pipeline per prompt
Protocol overview
The ACP server supports these JSON-RPC methods:
| Method | Description |
|---|---|
initialize | Handshake with capabilities |
session/new | Create a new session (returns session ID) |
session/prompt | Send a prompt to the agent for execution |
session/cancel | Cancel the currently running prompt |
Queued user messages during agent execution
ACP hosts can inject user follow-up messages while an agent is running. Harn owns the delivery semantics inside the runtime so product apps do not need to reimplement queue/orchestration logic.
Supported notification methods:
- `user_message`
- `session/input`
- `agent/user_message`
- `session/update` with `worker_update` content for delegated worker lifecycle events
Payload shape:
{
"content": "Please stop editing that file and explain first.",
"mode": "interrupt_immediate"
}
Supported mode values:
- `interrupt_immediate`
- `finish_step`
- `wait_for_completion`
Runtime behavior:
- `interrupt_immediate`: inject immediately, on the next agent loop boundary
- `finish_step`: inject after the current tool/operation completes
- `wait_for_completion`: defer until the current agent interaction yields
- Worker lifecycle updates are emitted as structured `session/update` payloads with worker id/name, status, lineage metadata, artifact counts, transcript presence, snapshot path, execution metadata, child run ids/paths, lifecycle summaries, and audit-session metadata when applicable. Hosts can render these as background task notifications instead of scraping stdout.
- Bridge-mode logs also stream boot timing records (`ACP_BOOT` with `compile_ms`, `vm_setup_ms`, and `execute_ms`) and live `span_end` duration events while a prompt is still running, so hosts do not need to wait for the final stdout flush to surface basic timing telemetry.
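Putting the pieces together, a complete queued-message notification might look like the following. Whether `params` also carries a session identifier alongside the documented payload is an assumption here:

```json
{
  "jsonrpc": "2.0",
  "method": "session/input",
  "params": {
    "content": "Please stop editing that file and explain first.",
    "mode": "finish_step"
  }
}
```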
Typed pipeline returns (Harn → ACP boundary)
Pipelines are what produce ACP events (agent_message_chunk,
tool_call, tool_call_update, plan, sessionUpdate). Declaring a
return type on a pipeline turns the Harn→ACP boundary into a
type-checked contract instead of an implicit shape that only the bridge
validates:
type PipelineResult = {
text: string | nil,
events: list<dict> | nil,
}
pub pipeline ghost_text(task) -> PipelineResult {
return {
text: "hello",
events: [],
}
}
The type checker verifies every return <expr> against the declared
type, so drift between pipeline output and bridge expectation is caught
before the Swift/TypeScript bridge ever sees the message.
Public pipelines without an explicit return type emit the
pipeline-return-type lint warning. Explicit return types on the
Harn→ACP boundary will be required in a future release; the warning is
a one-release deprecation window.
Well-known entry pipelines (default, main, auto, test) are
exempt from the warning because their return value is host-driven, not
consumed by a protocol bridge.
Canonical ACP envelope types are provided as Harn type aliases in
std/acp — SessionUpdate, AgentMessageChunk, ToolCall,
ToolCallUpdate, and Plan — and can be used directly as pipeline
return types so a pipeline’s contract matches the ACP schema
byte-for-byte.
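For instance, a pipeline that returns a canonical chunk can lean on those aliases directly. The field layout below is a sketch — the real `AgentMessageChunk` shape follows the ACP schema:

```
import { AgentMessageChunk } from "std/acp"

pub pipeline greet(task) -> AgentMessageChunk {
    // Type-checked against the std/acp alias at the Harn→ACP boundary
    return {
        sessionUpdate: "agent_message_chunk",
        content: {type: "text", text: "Hello from Harn"},
    }
}
```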
Security notes
Remote MCP OAuth
harn mcp login stores remote MCP OAuth tokens in the local OS keychain for
standalone CLI reuse. Treat that as durable delegated access:
- prefer the narrowest scopes the server supports
- treat configured
client_secretvalues as secrets - review remote MCP capabilities before wiring them into autonomous workflows
Safer write defaults
Harn now propagates mutation-session audit metadata through workflow runs, delegated workers, and bridge tool gates. Recommended host defaults remain:
- proposal-first application for direct workspace edits
- worktree-backed execution for autonomous/background workers
- explicit approval for destructive or broad-scope mutation tools
Bridge mode
ACP internally uses Harn’s host bridge so the host can retain control over tool execution while Harn still owns agent/runtime orchestration.
Unknown builtins are delegated to the host via builtin_call JSON-RPC
requests. This enables the host to provide filesystem access, editor
integration, or other capabilities that Harn code can call as regular
builtins.
A2A (Agent-to-Agent Protocol)
A2A exposes a Harn pipeline as an HTTP server that other agents can interact with. The server implements A2A protocol version 1.0.0.
Running the server
harn serve agent.harn # default port 8080
harn serve --port 3000 agent.harn # custom port
Agent card
The server publishes an agent card at GET /.well-known/agent.json
describing the agent’s capabilities. MCP clients and other A2A agents
use this to discover the agent.
Task submission
Submit a task with a POST request:
POST /message/send
Content-Type: application/json
{
"message": {
"role": "user",
"parts": [{"type": "text", "text": "Analyze this codebase"}]
}
}
Task status
Check the status of a submitted task:
GET /task/get?id=<task-id>
Task states follow the A2A protocol lifecycle: submitted, working,
completed, failed, cancelled.
Harn portal
harn portal launches a local observability UI for persisted Harn runs.
The portal frontend is now a Vite-built React application embedded into
harn-cli as static assets. Running harn portal does not require Node once
those built assets are present in the repository, but editing the portal UI
does.
The portal treats .harn-runs/ as the source of truth and gives you one place
to inspect:
- run history
- the derived action-graph / planner observability artifact
- workflow stages
- nested trace spans
- transcript/story sections
- delegated child runs
- token/call usage
Start the portal
harn portal
make portal
By default the portal:
- serves from `http://127.0.0.1:4721`
- watches `.harn-runs`
- opens a browser automatically
For a fresh source checkout, the simplest local setup is:
./scripts/dev_setup.sh
make portal
./scripts/dev_setup.sh also installs the portal’s Node dependencies and
builds crates/harn-cli/portal-dist up front, so the git hooks and
harn portal start from a ready state.
For portal frontend work specifically:
npm run portal:build
npm run portal:test
Useful flags:
harn portal --dir runs/archive
harn portal --host 0.0.0.0 --port 4900
harn portal --open false
For frontend development with Vite, npm run portal:dev starts:
- the Rust portal server on `http://127.0.0.1:4721`
- the Vite UI on `http://127.0.0.1:4723` with `/api` proxied to the Rust server
Quick demo
To generate a purpose-built demo dataset and launch the portal against it:
make portal-demo
That script creates .harn-runs/portal-demo/ with:
- a successful workflow-graph run
- a deterministic replay of that run
- a failed verification run with failure context in the run list
If you only want the data without launching the server:
./scripts/portal_demo.sh --generate-only
cargo run --bin harn -- portal --dir .harn-runs/portal-demo --open false
If you want to regenerate that dataset from scratch, pass --refresh.
How to read it
The UI is organized around a few simple ideas:
- `Launch` is a dedicated workspace for playground runs and script execution
- `Runs` is a dedicated paginated library for persisted run records
- `Run detail` is a separate inspector page for one run at a time
- the top of the detail view is the quick read
- the action-graph panel is the “debug this run from one artifact” view: planner rounds, research facts, worker lineage, verification outcomes, and transcript pointers all come from the same derived block in the saved run
- the policy panel shows the effective run ceiling plus saved validation output
- the replay panel shows whether a run already carries replay/eval assertions
- the flamegraph shows where time went
- the activity feed shows what the runtime actually did
- the transcript story shows the human-visible text that was preserved
- the stage detail drawers expose persisted per-stage policy, contracts, worker, prompt, and rendered-context metadata
The portal is intentionally generic. It does not assume a particular editor, client, or host integration. If Harn persisted the run, the portal can inspect it.
Live updates
The portal polls conservatively instead of hammering the run directory:
- the runs index refreshes on a slower cadence
- the selected run detail refreshes faster only while that run is still active
- hidden browser tabs do not poll
The portal also supports:
- deep-linking to a selected run via the URL
- manual refresh without waiting for the poll interval
- comparing a run against any other run of the same workflow, not just the latest earlier one
- surfacing action-graph, worker-lineage, transcript-pointer, and tool-result diffs alongside stage-level drift
Launch and playground
The portal can also launch Harn directly through a small control panel at the top of the page.
It supports three modes:
- existing .harn files from examples/ and conformance/tests/
- inline Harn source through the script editor
- a lightweight playground that turns a task plus provider/model selection into a real persisted workflow run
For local model servers, the launch UI also exposes the provider's endpoint-override environment variable when one exists, so you can point local or similar providers at another localhost or LAN address without editing config files first.
The portal now shows both roots explicitly in the launch panel:
- Workspace root: the directory where harn portal was started, and the current working directory for launches
- Run artifacts: the watched run directory passed via --dir
Inline and playground launches create a concrete per-job workspace under the watched run directory:
.harn-runs/playground/<job-id>/
workflow.harn
task.txt
launch.json
run.json
run-llm/llm_transcript.jsonl
That keeps the portal useful even before building a larger hosted playground: you get an inspectable source file, launch metadata, and a real run record that the debugger can reopen later.
Security and privacy constraints:
- env overrides are passed only to the child harn run process
- env overrides are validated as uppercase shell-style keys
- env values are not persisted in portal job state or run metadata
- launch file paths must stay inside the current workspace
- run inspection paths must stay inside the configured run directory
The transcript sidecar is only populated for runtime paths that currently emit
HARN_LLM_TRANSCRIPT_DIR output. Agent-loop traffic supports this today;
generic workflow-stage model calls may still only appear in the persisted run
record itself.
Saved model-turn detail
If a run has a sibling transcript sidecar directory named like:
.harn-runs/<run-id>.json
.harn-runs/<run-id>-llm/llm_transcript.jsonl
the portal will automatically render step-by-step model turns, including:
- kept vs newly added context
- saved request messages
- reply text
- tool calls
- token counts
- span ids
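The record-to-sidecar naming rule can be sketched as a small helper (hypothetical Python, not part of the Harn CLI):

```python
from pathlib import Path

def transcript_sidecar(run_record: str) -> Path:
    """Map a run record like .harn-runs/<run-id>.json to its sibling
    <run-id>-llm/llm_transcript.jsonl sidecar (naming rule from this guide)."""
    record = Path(run_record)
    return record.with_name(record.stem + "-llm") / "llm_transcript.jsonl"
```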
For richer live observability, Harn already exposes ACP session/update
notifications with:
- call_start
- call_progress
- call_end
- worker_update
Those can power a future streaming view without inventing a second provenance system alongside run records.
Skill observability
Each run detail page renders three skill-focused panels above the replay/eval section:
- Skill timeline — horizontal bars showing which skills activated on which agent-loop iteration and when they deactivated. Hover a bar for the matcher score and the reason the skill was promoted.
- Tool-load waterfall — one row per tool_search_query event, pairing each query with the tool_search_result that followed so you can see which deferred tools entered the LLM’s context in each turn.
- Matcher decisions — per-iteration expansions showing every candidate the matcher considered, its score, and the working-file snapshot it scored against.
The runs index also accepts a skill=<name> query parameter (and
exposes it as a filter input on the runs page), so you can narrow
evals to runs where a specific skill was active — useful when
validating that a new skill attracts the right prompts.
Orchestrator
harn orchestrator serve is the long-running process entry point for
manifest-driven trigger ingestion and connector activation.
Today, the command:
- loads harn.toml through the existing manifest loader
- boots the selected orchestrator role
- initializes the shared EventLog under --state-dir
- initializes the configured secret-provider chain
- resolves and registers manifest triggers
- activates connectors for the manifest’s providers
- binds an HTTP listener for webhook and a2a-push triggers
- writes a state snapshot and stays up until shutdown
Current limitations:
- multi-tenant returns a clear not-implemented error that points at O-12 #190
- inspect, replay, dlq, and queue are placeholders for O-08 #185
Command
harn orchestrator serve \
--config harn.toml \
--state-dir ./.harn/orchestrator \
--bind 0.0.0.0:8080 \
--cert certs/dev.pem \
--key certs/dev-key.pem \
--role single-tenant
Omit --cert and --key to serve plain HTTP. When both are present,
the listener serves HTTPS and terminates TLS with rustls.
On startup, the command logs the active secret-provider chain, loaded
triggers, registered connectors, and the actual bound listener URL. On
SIGTERM, it stops accepting new requests, lets in-flight requests drain,
appends lifecycle events to the EventLog, and persists a final
orchestrator-state.json snapshot under --state-dir.
--manifest is an alias for --config, and --listen is an alias for
--bind. Container deployments can also configure those through
HARN_ORCHESTRATOR_MANIFEST, HARN_ORCHESTRATOR_LISTEN,
HARN_ORCHESTRATOR_STATE_DIR, HARN_ORCHESTRATOR_CERT, and
HARN_ORCHESTRATOR_KEY.
On Unix, SIGHUP reloads manifest-backed HTTP trigger bindings without
rebinding the socket. The orchestrator reparses harn.toml,
re-collects manifest triggers, installs a new manifest binding version
for changed webhook / a2a-push entries, and swaps the live listener
route table in place. Requests already in flight keep the binding
version they started with; new requests route to the newest active
binding version. The orchestrator records reload_succeeded /
reload_failed events on orchestrator.manifest and refreshes
orchestrator-state.json after a successful reload.
Current reload scope is intentionally narrow: listener-wide settings
such as --bind, TLS files, allowed_origins, max_body_bytes, and
connector-managed trigger changes still require a full restart.
HTTP Listener
The orchestrator listener assembles routes from [[triggers]] entries
with kind = "webhook" or kind = "a2a-push".
- If a trigger declares path = "/github/issues", that path is used.
- Otherwise the route defaults to /triggers/<id>.
- /health, /healthz, and /readyz are reserved listener endpoints; use GET /health for container health checks.
Accepted deliveries are normalized into TriggerEvent records and
appended to the shared orchestrator.triggers.pending queue in the
event log for downstream dispatch.
Hot reload uses the trigger registry’s versioned manifest bindings. A modified trigger id drains the old binding version, activates a new version, and keeps terminated versions around for a short retention window so operators can inspect the handoff without the registry growing unbounded.
Listener controls
Listener-wide controls live under [orchestrator] in harn.toml.
[orchestrator]
allowed_origins = ["https://app.example.com"]
max_body_bytes = 10485760
- allowed_origins defaults to ["*"] semantics when omitted or empty. Requests with an Origin header outside the allowlist are rejected with 403 Forbidden.
- max_body_bytes defaults to 10485760 bytes (10 MiB). Larger requests are rejected with 413 Payload Too Large.
Listener auth
Health probes stay public:
- GET /health
- GET /healthz
- GET /readyz
Webhook routes keep using their provider-specific signature checks.
a2a-push routes require either a bearer API key or a shared-secret
HMAC authorization header.
Configure the auth material with environment variables:
export HARN_ORCHESTRATOR_API_KEYS="dev-key-1,dev-key-2"
export HARN_ORCHESTRATOR_HMAC_SECRET="replace-me"
Bearer requests use:
Authorization: Bearer <api-key>
HMAC requests use:
Authorization: HMAC-SHA256 timestamp=<unix>,signature=<base64>
The canonical string is:
METHOD
PATH
TIMESTAMP
SHA256(BODY)
METHOD is uppercased, PATH is the request path without the query
string, TIMESTAMP is a Unix epoch seconds value, and SHA256(BODY) is
the lowercase hex digest of the raw request body. Timestamps outside the
5-minute replay window are rejected with 401 Unauthorized.
Deployment
Release tags publish a distroless container image to
ghcr.io/burin-labs/harn for both linux/amd64 and linux/arm64.
docker run \
-p 8080:8080 \
-v "$PWD/triggers.toml:/etc/harn/triggers.toml:ro" \
-e HARN_ORCHESTRATOR_API_KEYS=xxx \
-e HARN_ORCHESTRATOR_HMAC_SECRET=replace-me \
-e RUST_LOG=info \
ghcr.io/burin-labs/harn
The image runs as UID 10001 and stores orchestrator state under
/var/lib/harn/state by default. Override the startup contract with
environment variables instead of replacing the entrypoint:
- HARN_ORCHESTRATOR_MANIFEST defaults to /etc/harn/triggers.toml
- HARN_ORCHESTRATOR_LISTEN defaults to 0.0.0.0:8080
- HARN_ORCHESTRATOR_STATE_DIR defaults to /var/lib/harn/state
- HARN_ORCHESTRATOR_API_KEYS supplies bearer credentials for authenticated a2a-push routes
- HARN_ORCHESTRATOR_HMAC_SECRET supplies the shared secret for canonical-request HMAC auth on a2a-push routes
- HARN_SECRET_*, provider API-key env vars, and deployment-specific HARN_PROVIDER_* values are passed through to connector/provider code
- RUST_LOG controls runtime log verbosity
The image healthcheck issues GET /health against the local listener, so
it works with Docker, BuildKit smoke tests, and most container platforms
without requiring curl inside the distroless runtime.
Trigger examples
[[triggers]]
id = "github-new-issue"
kind = "webhook"
provider = "github"
path = "/triggers/github-new-issue"
match = { events = ["issues.opened"] }
handler = "handlers::on_new_issue"
secrets = { signing_secret = "github/webhook-secret" }
[[triggers]]
id = "incoming-review-task"
kind = "a2a-push"
provider = "a2a-push"
path = "/a2a/review"
match = { events = ["a2a.task.received"] }
handler = "a2a://reviewer.prod/triage"
GitHub webhook triggers verify the X-Hub-Signature-256 HMAC against
secrets.signing_secret before enqueueing. Generic provider = "webhook"
triggers use the shared Standard Webhooks verifier. a2a-push routes
require either Authorization: Bearer <api-key> or a valid
Authorization: HMAC-SHA256 ... header before enqueueing.
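The GitHub-style check can be sketched in Python (this is the standard X-Hub-Signature-256 scheme, not Harn’s actual verifier code):

```python
import hashlib
import hmac

def verify_github_signature(signing_secret: str, body: bytes, header: str) -> bool:
    """Compare an X-Hub-Signature-256 header ("sha256=<hex>") against an
    HMAC of the raw body, using a constant-time comparison."""
    expected = "sha256=" + hmac.new(
        signing_secret.encode(), body, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, header)
```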
Orchestrator Secrets
Reactive Harn features need a single way to fetch secrets without
sprinkling provider-specific code across connectors, OAuth flows, and
future orchestrator runtime surfaces. The secret layer lives in
harn_vm::secrets and currently ships with two concrete providers:
- EnvSecretProvider
- KeyringSecretProvider
The default chain is:
env -> keyring
Use harn doctor --no-network to inspect the active chain and to verify
that the keyring backend is reachable on the current machine.
Secret model
Secrets are addressed by a structured SecretId:
use harn_vm::secrets::{SecretId, SecretVersion};

let id = SecretId::new(
    "harn.orchestrator.github",
    "installation-12345/private-key",
)
.with_version(SecretVersion::Latest);
Secret values are held in SecretBytes:
- bytes are zeroized on drop
- Debug is redacted
- Display is intentionally absent
- explicit duplication requires reborrow()
- callers expose bytes via with_exposed(|bytes| ...)
Successful get() calls also emit a structured audit event through the
existing VM event sink with the secret id, provider name, caller span,
mutation session id when present, and a timestamp. The event payload never
contains the secret bytes.
Provider chain configuration
The provider order is controlled with HARN_SECRET_PROVIDERS:
export HARN_SECRET_PROVIDERS=env,keyring
The doctor output also reports a namespace used for backend grouping. By
default Harn derives it as harn/<current-directory-name>. Override it
with:
export HARN_SECRET_NAMESPACE="harn/my-workspace"
Environment provider
EnvSecretProvider is first in the chain so CI, local shells, and
containers can override secrets without touching the OS credential store.
Environment variable names use:
HARN_SECRET_<NAMESPACE>_<NAME>
For example:
export HARN_SECRET_HARN_ORCHESTRATOR_GITHUB_INSTALLATION_12345_PRIVATE_KEY="$(cat github-app.pem)"
Non-alphanumeric characters are normalized to underscores and multiple separators collapse.
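The normalization rule can be expressed as a short helper (behavior inferred from the example above; this is an illustrative sketch, not the actual harn_vm code):

```python
import re

def secret_env_var(namespace: str, name: str) -> str:
    """Derive HARN_SECRET_<NAMESPACE>_<NAME>: uppercase, with runs of
    non-alphanumeric characters collapsed into single underscores."""
    def norm(part: str) -> str:
        return re.sub(r"[^A-Za-z0-9]+", "_", part).strip("_").upper()
    return f"HARN_SECRET_{norm(namespace)}_{norm(name)}"
```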
Keyring provider
KeyringSecretProvider uses the keyring
crate so the same code path works against:
- macOS Keychain
- Linux native keyring / Secret Service backends supported by keyring
- Windows Credential Manager
This is the default local-first provider. The CLI already uses it for MCP
OAuth token storage, and harn doctor probes it directly.
Recommended setups
Laptop development:
export HARN_SECRET_PROVIDERS=env,keyring
CI or containers:
export HARN_SECRET_PROVIDERS=env
Cloud deployments:
Today, use env for injected platform secrets. The SecretProvider
surface is intentionally ready for Vault / AWS / GCP implementations, but
those provider backends are not wired in yet.
CLI reference
All commands available in the harn CLI.
harn run
Execute a .harn file.
harn run <file.harn>
harn run --trace <file.harn>
harn run -e 'println("hello")'
harn run --deny shell,exec <file.harn>
harn run --allow read_file,write_file <file.harn>
| Flag | Description |
|---|---|
--trace | Print LLM trace summary after execution |
-e <code> | Evaluate inline code instead of a file |
--deny <builtins> | Deny specific builtins (comma-separated) |
--allow <builtins> | Allow only specific builtins (comma-separated) |
You can also run a file directly without the run subcommand:
harn main.harn
Before starting the VM, harn run <file> builds the cross-module
graph for the entry file. When all imports resolve, unknown call
targets produce a static error and the VM is never started — the same
call target ... is not defined or imported message you see from
harn check. The inline -e <code> form has no importing file and
therefore skips the cross-module check.
harn playground
Run a pipeline against a Harn-native host module for fast local iteration.
harn playground --host host.harn --script pipeline.harn --task "Explain this repo"
harn playground --watch --task "Refine the prompt"
harn playground --llm ollama:qwen2.5-coder:latest --task "Use a local model"
| Flag | Description |
|---|---|
--host <file> | Host module exporting the functions the script expects (default: host.harn) |
--script <file> | Pipeline entrypoint to execute (default: pipeline.harn) |
--task <text> | Task string exposed as HARN_TASK during the run |
--llm <provider:model> | Override the provider/model selection for this invocation |
--llm-mock <path> | Replay LLM responses from a JSONL fixture file instead of calling the provider |
--llm-mock-record <path> | Record executed LLM responses into a JSONL fixture file |
--watch | Re-run when the host module or script changes |
harn playground type-checks the host module, merges its exported function
names into the script’s static call-target validation, then executes the script
with an in-process host adapter. Missing host functions fail with a pointed
error naming the function and caller location.
harn test
Run tests.
harn test conformance # run conformance test suite
harn test conformance tests/language/arithmetic.harn # run one conformance file
harn test conformance tests/stdlib/ # run a conformance subtree
harn test tests/ # run user tests in directory
harn test tests/ --filter "auth*" # filter by pattern
harn test tests/ --parallel # run tests concurrently
harn test tests/ --watch # re-run on file changes
harn test conformance --verbose # show per-test timing
harn test conformance --timing # show timing summary without verbose failures
harn test tests/ --record # record LLM fixtures
harn test tests/ --replay # replay LLM fixtures
| Flag | Description |
|---|---|
--filter <pattern> | Only run tests matching pattern |
--parallel | Run tests concurrently |
--watch | Re-run tests on file changes |
--verbose / -v | Show per-test timing and detailed failures |
--timing | Show per-test timing plus summary statistics |
--junit <path> | Write JUnit XML report |
--timeout <ms> | Per-test timeout in milliseconds (default: 30000) |
--record | Record LLM responses to .harn-fixtures/ |
--replay | Replay recorded LLM responses |
When no path is given, harn test auto-discovers a tests/ directory
in the current folder. Conformance targets must resolve to a file or directory
inside conformance/; the CLI now errors instead of silently falling back to
the full suite when a requested target is missing.
harn repl
Start an interactive REPL with syntax highlighting, multiline editing, live
builtin completion, and persistent history in ~/.harn/repl_history.
harn repl
The REPL keeps incomplete blocks open until braces, brackets, parentheses, and quoted strings are balanced, so you can paste or type multi-line pipelines and control-flow blocks directly.
harn bench
Benchmark a .harn file over repeated runs.
harn bench main.harn
harn bench main.harn --iterations 25
harn bench parses and compiles the file once, executes it with a fresh VM for
each iteration, and reports wall time plus aggregated LLM token, call, and cost
metrics.
harn viz
Render a .harn file as a Mermaid flowchart.
harn viz main.harn
harn viz main.harn --output docs/graph.mmd
harn viz parses the file, walks the AST, and emits a Mermaid flowchart TD
graph showing pipelines, functions, branches, loops, and other workflow-shaped
control-flow nodes.
harn fmt
Format .harn source files. Accepts files or directories.
harn fmt main.harn
harn fmt src/
harn fmt --check main.harn # check mode (no changes, exit 1 if unformatted)
harn fmt --line-width 80 main.harn # custom line width
| Flag | Description |
|---|---|
--check | Check mode: exit 1 if any file would be reformatted, make no changes |
--line-width <N> | Maximum line width before wrapping (default: 100) |
The formatter enforces a 100-character line width by default (overridable with --line-width). When a line exceeds
this limit the formatter wraps it automatically:
- Comma-separated forms — function call arguments, function declaration parameters, list literals, dict literals, struct construction fields, enum constructor payloads, selective import names, interface method parameters, and enum variant fields all wrap with one item per line and trailing commas.
- Binary operator chains — long expressions like a + b + c + d break before the operator. Operators that the parser cannot resume across a bare newline (-, ==, !=, <, >, <=, >=, in, not in, ??) get an automatic backslash continuation (\); other operators (+, *, /, %, ||, &&, |>) break without one.
- Operator precedence parentheses — the formatter inserts parentheses to preserve semantics when the AST drops them (e.g. a * (b + c) stays parenthesised) and for clarity when mixing && / || (e.g. a && b || c becomes (a && b) || c).
harn lint
Lint one or more .harn files or directories for common issues (unused
variables, unused functions, unreachable code, empty blocks, missing
/** */ HarnDoc on public functions, etc.).
harn lint main.harn
harn lint src/ tests/
Pass --fix to automatically apply safe fixes (e.g., var → let for
never-reassigned bindings, boolean comparison simplification, unused import
removal, and string interpolation conversion):
harn lint --fix main.harn
harn check
Type-check one or more .harn files or directories and run preflight
validation without executing them. The preflight pass resolves imports, checks
literal render(...) / render_prompt(...) targets, detects import symbol collisions across
modules, validates host_call("capability.operation", ...) capability
contracts, and flags missing template resources, execution directories, and worker repos that would
otherwise fail only at runtime. Source-aware lint rules run as part of
check, including the missing-harndoc warning for undocumented pub fn
APIs.
check builds a cross-module graph from each entry file and follows
import statements recursively. When every import in a file resolves,
the typechecker knows the exact set of names that module brings into
scope and will emit a hard error for any call target that is neither a
builtin, a local declaration, a struct constructor, a callable
variable, nor an imported symbol:
error: call target `helpr` is not defined or imported
This catches typos and stale imports before the VM runs. If any import in the file is unresolved, the stricter check is turned off for that file so one broken import does not avalanche into spurious errors — the unresolved import itself still fails at runtime.
harn check main.harn
harn check src/ tests/
harn check --host-capabilities host-capabilities.json main.harn
harn check --bundle-root .bundle main.harn
harn check --workspace
harn check --preflight warning src/
| Flag | Description |
|---|---|
--host-capabilities <file> | Load a host capability manifest for preflight validation. Supports plain {capability: [ops...]} objects, nested {capabilities: ...} wrappers, and per-op metadata dictionaries. Overrides [check].host_capabilities_path in harn.toml. |
--bundle-root <dir> | Validate render(...), render_prompt(...), and template paths against an alternate bundled layout root |
--workspace | Walk every path listed in [workspace].pipelines of the nearest harn.toml. Positional targets remain additive. |
--preflight <severity> | Override preflight diagnostic severity: error (default, fails the check), warning (reports but does not fail), or off (suppresses all preflight diagnostics). Overrides [check].preflight_severity. |
--strict-types | Flag unvalidated boundary-API values used in field access. |
harn.toml — [check] and [workspace] sections
harn check walks upward from the target file (stopping at the first .git
directory) to find the nearest harn.toml. The following keys are honored:
[check]
# Load an external capability manifest. Path is resolved relative to
# harn.toml. Accepts JSON or TOML with the namespaced shape
# { workspace = [...], process = [...], project = [...], ... }.
host_capabilities_path = "./schemas/host-capabilities.json"
# Or declare inline:
[check.host_capabilities]
project = ["ensure_enriched", "enrich"]
workspace = ["read_text", "write_text"]
[check]
# Downgrade preflight errors to warnings (or suppress entirely with "off").
# Keeps type diagnostics visible while an external capability schema is
# still catching up to a host's live surface.
preflight_severity = "warning"
# Suppress preflight diagnostics for specific capabilities/operations.
# Entries match either an exact "capability.operation" pair, a
# "capability.*" wildcard, a bare "capability" name, or a blanket "*".
preflight_allow = ["mystery.*", "runtime.task"]
[workspace]
# Directories or files checked by `harn check --workspace`. Paths are
# resolved relative to harn.toml.
pipelines = ["Sources/BurinCore/Resources/pipelines", "scripts"]
Preflight diagnostics are reported under the preflight category so they
can be distinguished from type-checker errors in IDE output streams and
CI log filters.
harn contracts
Export machine-readable contracts for hosts, release tooling, and embedded bundles.
harn contracts builtins
harn contracts host-capabilities --host-capabilities host-capabilities.json
harn contracts bundle main.harn --verify
harn contracts bundle src/ --bundle-root .bundle --host-capabilities host-capabilities.json
harn contracts builtins
Print the parser/runtime builtin registry as JSON, including return-type hints and alignment status.
harn contracts host-capabilities
Print the effective host-capability manifest used by preflight validation after merging the built-in defaults with any external manifest file.
harn contracts bundle
Print a bundle manifest for one or more .harn targets. The manifest includes:
- explicit entry_modules, import_modules, and module_dependencies edges
- explicit prompt_assets and template_assets slices, plus a full assets table resolved through the same source-relative rules as render(...)
- required host capabilities discovered from literal host_call(...) sites
- literal execution directories and worker worktree repos
- a summary block with stable counts for packagers and release tooling
Use --verify to run normal Harn preflight validation before emitting the
bundle manifest and return a non-zero exit code if the selected targets are not
bundle-safe.
harn init
Scaffold a new project with harn.toml and main.harn.
harn init # create in current directory
harn init my-project # create in a new directory
harn init --template eval
harn new
Scaffold a new project from a starter template. Supported templates are
basic, agent, mcp-server, and eval.
harn new my-agent --template agent
harn new local-mcp --template mcp-server
harn new eval-suite --template eval
harn init and harn new share the same scaffolding engine. Use init for
the default quick-start flow and new when you want the template choice to be
explicit.
harn doctor
Inspect the local environment and report the current Harn setup, including the resolved secret-provider chain and keyring health.
harn doctor
harn doctor --no-network
harn watch
Watch a file for changes and re-run it automatically.
harn watch main.harn
harn watch --deny shell main.harn
harn acp
Start an ACP (Agent Client Protocol) server on stdio.
harn acp # bridge mode, no pipeline
harn acp pipeline.harn # execute a pipeline per prompt
See MCP and ACP Integration for protocol details.
harn portal
Launch the local Harn observability portal for persisted runs.
harn portal
harn portal --dir runs/archive
harn portal --host 0.0.0.0 --port 4900
harn portal --open false
See Harn Portal for the full guide.
harn runs
Inspect persisted workflow run records.
harn runs inspect .harn-runs/<run>.json
harn runs inspect .harn-runs/<run>.json --compare baseline.json
harn replay
Replay a persisted workflow run record from saved output.
harn replay .harn-runs/<run>.json
harn eval
Evaluate a persisted workflow run record as a regression fixture.
harn eval .harn-runs/<run>.json
harn eval .harn-runs/<run>.json --compare baseline.json
harn eval .harn-runs/
harn eval evals/regression.json
harn eval accepts three inputs:
- a single run record JSON file
- a directory of run record JSON files
- an eval suite manifest JSON file with grouped cases and optional baseline comparisons
harn serve
Start an A2A (Agent-to-Agent) HTTP server.
harn serve agent.harn # default port 8080
harn serve --port 3000 agent.harn # custom port
See MCP and ACP Integration for protocol details.
harn mcp-serve
Serve a Harn pipeline as an MCP server over stdio.
harn mcp-serve agent.harn
See MCP and ACP Integration for details on defining tools, resources, and prompts.
harn mcp
Manage standalone OAuth state for remote HTTP MCP servers.
harn mcp redirect-uri
harn mcp login notion
harn mcp login https://mcp.notion.com/mcp
harn mcp login my-server --url https://example.com/mcp --client-id <id> --client-secret <secret>
harn mcp status notion
harn mcp logout notion
harn mcp login resolves the server from the nearest harn.toml when you pass
an MCP server name, or uses the explicit URL when you pass --url or a raw
https://... target. The CLI:
- discovers OAuth protected resource and authorization server metadata
- prefers pre-registered client_id / client_secret values when supplied
- falls back to dynamic client registration when supported by the server
- stores tokens in the local OS keychain and refreshes them automatically
Relevant flags:
| Flag | Description |
|---|---|
--url <url> | Explicit MCP server URL when logging in/out by a custom name |
--client-id <id> | Use a pre-registered client ID instead of dynamic registration |
--client-secret <secret> | Optional client secret for client_secret_post / client_secret_basic servers |
--scope <scopes> | Override or provide requested OAuth scopes |
--redirect-uri <uri> | Override the default loopback redirect URI (default shown by harn mcp redirect-uri) |
Security guidance:
- prefer the narrowest scopes the remote MCP server supports
- treat configured client_secret values as secrets
- review remote MCP capabilities before using them in autonomous workflows
Release gate
For repo maintainers, the deterministic full-release path is:
./scripts/release_ship.sh --bump patch
This runs audit → dry-run publish → bump → commit → tag → push → cargo publish
→ GitHub release in that order. Pushing happens before cargo publish so
downstream consumers (GitHub release binary workflows, burin-code’s
fetch-harn) start in parallel with crates.io.
For piecewise work, the docs audit, verification gate, bump flow, and publish sequence are exposed individually:
./scripts/release_gate.sh audit
./scripts/release_gate.sh full --bump patch --dry-run
harn add
Add a dependency to harn.toml.
harn add my-lib --git https://github.com/user/my-lib
harn install
Install dependencies declared in harn.toml.
harn install
harn version
Show version information.
harn version
Builtin functions
Complete reference for all built-in functions available in Harn.
Output
| Function | Parameters | Returns | Description |
|---|---|---|---|
log(msg) | msg: any | nil | Print with [harn] prefix and newline |
print(msg) | msg: any | nil | Print without prefix or newline |
println(msg) | msg: any | nil | Print with newline, no prefix |
progress(phase, message, progress?, total?) | phase: string, message: string, optional numeric progress | nil | Emit standalone progress output. Dict options support mode: "spinner" with step, or mode: "bar" with current, total, and optional width |
color(text, name) | text: any, name: string | string | Wrap text with an ANSI foreground color code |
bold(text) | text: any | string | Wrap text with ANSI bold styling |
dim(text) | text: any | string | Wrap text with ANSI dim styling |
Type conversion
| Function | Parameters | Returns | Description |
|---|---|---|---|
type_of(value) | value: any | string | Returns type name: "int", "float", "string", "bool", "nil", "list", "dict", "closure", "taskHandle", "duration", "enum", "struct" |
to_string(value) | value: any | string | Convert to string representation |
to_int(value) | value: any | int or nil | Parse/convert to integer. Floats truncate, bools become 0/1 |
to_float(value) | value: any | float or nil | Parse/convert to float |
unreachable(value?) | value: any (optional) | never | Throws “unreachable code was reached” at runtime. When the argument is a variable, the type checker verifies it has been narrowed to never (exhaustiveness check) |
iter(x) | x: list, dict, set, string, generator, channel, or iter | Iter<T> | Lift an iterable source into a lazy, single-pass, fused iterator. No-op on an existing iter. Dict iters yield Pair(key, value); string iters yield chars. See Iterator methods |
pair(a, b) | a: any, b: any | Pair | Construct a two-element Pair value. Access via .first / .second, or destructure in a for-loop: for (k, v) in ... |
Runtime shape validation
Function parameters with structural type annotations (shapes) are validated at runtime. If a dict or struct argument is missing a required field or has the wrong field type, a descriptive error is thrown before the function body executes.
fn greet(u: {name: string, age: int}) {
println("${u.name} is ${u.age}")
}
greet({name: "Alice", age: 30}) // OK
greet({name: "Alice"}) // Error: parameter 'u': missing field 'age' (int)
See Error handling – Runtime shape validation errors for more details.
Result
Harn has a built-in Result type for representing success/failure values
without exceptions. Ok and Err create Result.Ok and Result.Err
enum variants respectively. When called on a non-Result value, unwrap
and unwrap_or pass the value through unchanged.
| Function | Parameters | Returns | Description |
|---|---|---|---|
Ok(value) | value: any | Result.Ok | Create a Result.Ok value |
Err(value) | value: any | Result.Err | Create a Result.Err value |
is_ok(result) | result: any | bool | Returns true if value is Result.Ok |
is_err(result) | result: any | bool | Returns true if value is Result.Err |
unwrap(result) | result: any | any | Extract Ok value. Throws on Err. Non-Result values pass through |
unwrap_or(result, default) | result: any, default: any | any | Extract Ok value. Returns default on Err. Non-Result values pass through |
unwrap_err(result) | result: any | any | Extract Err value. Throws on non-Err |
Example:
let good = Ok(42)
let bad = Err("something went wrong")
println(is_ok(good)) // true
println(is_err(bad)) // true
println(unwrap(good)) // 42
println(unwrap_or(bad, 0)) // 0
println(unwrap_err(bad)) // something went wrong
JSON
| Function | Parameters | Returns | Description |
|---|---|---|---|
json_parse(str) | str: string | value | Parse JSON string into Harn values. Throws on invalid JSON |
json_stringify(value) | value: any | string | Serialize Harn value to JSON. Closures and handles become null |
yaml_parse(str) | str: string | value | Parse YAML string into Harn values. Throws on invalid YAML |
yaml_stringify(value) | value: any | string | Serialize Harn value to YAML |
toml_parse(str) | str: string | value | Parse TOML string into Harn values. Throws on invalid TOML |
toml_stringify(value) | value: any | string | Serialize Harn value to TOML |
json_validate(data, schema) | data: any, schema: dict | bool | Validate data against a schema. Returns true if valid, throws with details if not |
schema_check(data, schema) | data: any, schema: dict | Result | Validate data against an extended schema and return Result.Ok(data) or Result.Err({message, errors, value?}) |
schema_parse(data, schema) | data: any, schema: dict | Result | Same as schema_check, but applies default values recursively |
schema_is(data, schema) | data: any, schema: dict | bool | Validate data against a schema and return true/false without throwing |
schema_expect(data, schema, apply_defaults?) | data: any, schema: dict, bool (optional) | any | Validate data and return the normalized value, throwing on failure |
schema_from_json_schema(schema) | schema: dict | dict | Convert a JSON Schema object into Harn’s canonical schema dict |
schema_from_openapi_schema(schema) | schema: dict | dict | Convert an OpenAPI Schema Object into Harn’s canonical schema dict |
schema_to_json_schema(schema) | schema: dict | dict | Convert an extended Harn schema into JSON Schema |
schema_to_openapi_schema(schema) | schema: dict | dict | Convert an extended Harn schema into an OpenAPI-friendly schema object |
schema_extend(base, overrides) | base: dict, overrides: dict | dict | Shallow-merge two schema dicts |
schema_partial(schema) | schema: dict | dict | Remove required recursively so properties become optional |
schema_pick(schema, keys) | schema: dict, keys: list | dict | Keep only selected top-level properties |
schema_omit(schema, keys) | schema: dict, keys: list | dict | Remove selected top-level properties |
json_extract(text, key?) | text: string, key: string (optional) | value | Extract JSON from text (strips markdown code fences). If key given, returns that key’s value |
Type mapping:
| JSON | Harn |
|---|---|
| string | string |
| integer | int |
| decimal/exponent | float |
| true/false | bool |
| null | nil |
| array | list |
| object | dict |
Canonical schema format
The canonical schema is a plain Harn dict. The validator also accepts compatible
JSON Schema / OpenAPI Schema Object spellings such as object, array,
integer, number, boolean, oneOf, allOf, minLength, maxLength,
minItems, maxItems, and additionalProperties, normalizing them into the
same internal form.
Supported canonical keys:
| Key | Type | Description |
|---|---|---|
type | string | Expected type: "string", "int", "float", "bool", "list", "dict", "any" |
required | list | List of required key names (for dicts) |
properties | dict | Dict mapping property names to sub-schemas (for dicts) |
items | dict | Schema to validate each item against (for lists) |
additional_properties | bool or dict | Whether unknown dict keys are allowed, or which schema they must satisfy |
Example:
let schema = {
type: "dict",
required: ["name", "age"],
properties: {
name: {type: "string"},
age: {type: "int"},
tags: {type: "list", items: {type: "string"}}
}
}
json_validate(data, schema) // throws if invalid
Extended schema constraints
The schema builtins support these additional keys:
| Key | Type | Description |
|---|---|---|
nullable | bool | Allow nil |
min / max | int or float | Numeric bounds |
min_length / max_length | int | String length bounds |
pattern | string | Regex pattern for strings |
enum | list | Allowed literal values |
const | any | Exact required literal value |
min_items / max_items | int | List length bounds |
union | list of schemas | Value must match one schema |
all_of | list of schemas | Value must satisfy every schema |
default | any | Default value applied by schema_parse |
Example:
let user_schema = {
type: "dict",
required: ["name", "age"],
properties: {
name: {type: "string", min_length: 1},
age: {type: "int", min: 0},
role: {type: "string", enum: ["admin", "user"], default: "user"}
}
}
let parsed = schema_parse({name: "Ada", age: 36}, user_schema)
println(is_ok(parsed))
println(unwrap(parsed).role)
println(schema_to_json_schema(user_schema).type)
schema_is(...) is useful for dynamic checks and can participate in static
type refinement when the schema is a literal (or a variable bound from a
literal schema).
The lazily loaded std/schema module provides ergonomic builders such as
schema_string(), schema_object(...), schema_union(...),
get_typed_result(...), get_typed_value(...), and is_type(...).
Composition helpers:
let public_user = schema_pick(user_schema, ["name", "role"])
let patch_schema = schema_partial(user_schema)
let admin_user = schema_extend(user_schema, {
properties: {
name: {type: "string", min_length: 1},
age: {type: "int", min: 0},
role: {type: "string", enum: ["admin"], default: "admin"}
}
})
json_extract
Extracts JSON from LLM responses that may contain markdown code fences
or surrounding prose. Handles ```json ... ```, ``` ... ```,
and bare JSON with surrounding text. Uses balanced bracket matching to
correctly extract nested objects and arrays from mixed prose.
let result = llm_call("Return JSON with name and age")
let data = json_extract(result.text) // parse, stripping fences
let name = json_extract(result.text, "name") // extract just one key
Math
| Function | Parameters | Returns | Description |
|---|---|---|---|
abs(n) | n: int or float | int or float | Absolute value |
ceil(n) | n: float | int | Ceiling (rounds up). Ints pass through unchanged |
floor(n) | n: float | int | Floor (rounds down). Ints pass through unchanged |
round(n) | n: float | int | Round to nearest integer. Ints pass through unchanged |
sqrt(n) | n: int or float | float | Square root |
pow(base, exp) | base: number, exp: number | int or float | Exponentiation. Returns int when both args are int and exp is non-negative |
min(a, b) | a: number, b: number | int or float | Minimum of two values. Returns float if either argument is float |
max(a, b) | a: number, b: number | int or float | Maximum of two values. Returns float if either argument is float |
random() | none | float | Random float in [0, 1) |
random_int(min, max) | min: int, max: int | int | Random integer in [min, max] inclusive |
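A short example combining the math builtins above (random_int is non-deterministic, so its result is only bounded):
println(abs(-5))    // 5
println(ceil(2.1))  // 3
println(floor(2.9)) // 2
println(pow(2, 10)) // 1024 (int result: both args int, exp non-negative)
let roll = random_int(1, 6)  // integer in [1, 6] inclusive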
Trigonometry
| Function | Parameters | Returns | Description |
|---|---|---|---|
sin(n) | n: float | float | Sine (radians) |
cos(n) | n: float | float | Cosine (radians) |
tan(n) | n: float | float | Tangent (radians) |
asin(n) | n: float | float | Inverse sine |
acos(n) | n: float | float | Inverse cosine |
atan(n) | n: float | float | Inverse tangent |
atan2(y, x) | y: float, x: float | float | Two-argument inverse tangent |
Logarithms and exponentials
| Function | Parameters | Returns | Description |
|---|---|---|---|
log2(n) | n: float | float | Base-2 logarithm |
log10(n) | n: float | float | Base-10 logarithm |
ln(n) | n: float | float | Natural logarithm |
exp(n) | n: float | float | Euler’s number raised to the power n |
Constants and utilities
| Function | Parameters | Returns | Description |
|---|---|---|---|
pi | — | float | The constant pi (3.14159…) |
e | — | float | Euler’s number (2.71828…) |
sign(n) | n: int or float | int | Sign of a number: -1, 0, or 1 |
is_nan(n) | n: float | bool | Check if value is NaN |
is_infinite(n) | n: float | bool | Check if value is infinite |
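A brief sketch using the trigonometric helpers and constants (exact float formatting may vary):
println(sign(-3))                  // -1
let right_angle = atan2(1.0, 0.0)  // pi/2 radians
println(right_angle < pi)          // true
println(is_nan(sqrt(4.0)))         // false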
Sets
| Function | Parameters | Returns | Description |
|---|---|---|---|
set(items?) | items: list (optional) | set | Create a new set, optionally from a list |
set_add(s, value) | s: set, value: any | set | Add a value to a set, returns new set |
set_remove(s, value) | s: set, value: any | set | Remove a value from a set, returns new set |
set_contains(s, value) | s: set, value: any | bool | Check if set contains a value |
set_union(a, b) | a: set, b: set | set | Union of two sets |
set_intersect(a, b) | a: set, b: set | set | Intersection of two sets |
set_difference(a, b) | a: set, b: set | set | Difference (elements in a but not b) |
set_symmetric_difference(a, b) | a: set, b: set | set | Elements in either but not both |
set_is_subset(a, b) | a: set, b: set | bool | True if all elements of a are in b |
set_is_superset(a, b) | a: set, b: set | bool | True if a contains all elements of b |
set_is_disjoint(a, b) | a: set, b: set | bool | True if a and b share no elements |
to_list(s) | s: set | list | Convert a set to a list |
Set methods (dot syntax)
Sets also support method syntax: my_set.union(other).
| Method | Parameters | Returns | Description |
|---|---|---|---|
.count() / .len() | none | int | Number of elements |
.empty() | none | bool | True if set is empty |
.contains(val) | val: any | bool | Check membership |
.add(val) | val: any | set | New set with val added |
.remove(val) | val: any | set | New set with val removed |
.union(other) | other: set | set | Union |
.intersect(other) | other: set | set | Intersection |
.difference(other) | other: set | set | Elements in self but not other |
.symmetric_difference(other) | other: set | set | Elements in either but not both |
.is_subset(other) | other: set | bool | True if self is a subset of other |
.is_superset(other) | other: set | bool | True if self is a superset of other |
.is_disjoint(other) | other: set | bool | True if no shared elements |
.to_list() | none | list | Convert to list |
.map(fn) | fn: closure | set | Transform elements (deduplicates) |
.filter(fn) | fn: closure | set | Keep elements matching predicate |
.any(fn) | fn: closure | bool | True if any element matches |
.all(fn) / .every(fn) | fn: closure | bool | True if all elements match |
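A sketch of set operations using both the function and method forms described above:
let a = set([1, 2, 3])
let b = set([3, 4])
println(a.contains(2))                 // true
println(a.union(b).count())            // 4
println(a.is_disjoint(b))              // false
println(set_difference(a, b).count())  // 2
println(set_is_subset(set([3]), b))    // true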
String functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
len(value) | value: string, list, or dict | int | Length of string (chars), list (items), or dict (keys) |
trim(str) | str: string | string | Remove leading and trailing whitespace |
lowercase(str) | str: string | string | Convert to lowercase |
uppercase(str) | str: string | string | Convert to uppercase |
split(str, sep) | str: string, sep: string | list | Split string by separator |
starts_with(str, prefix) | str: string, prefix: string | bool | Check if string starts with prefix |
ends_with(str, suffix) | str: string, suffix: string | bool | Check if string ends with suffix |
contains(str, substr) | str: string, substr: string | bool | Check if string contains substring. Also works on lists |
replace(str, old, new) | str: string, old: string, new: string | string | Replace all occurrences |
join(list, sep) | list: list, sep: string | string | Join list elements with separator |
substring(str, start, len?) | str: string, start: int, len: int (optional) | string | Extract substring from start position |
format(template, ...) | template: string, args: any | string | Format string with {} placeholders. With a dict as the second arg, supports named {key} placeholders |
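A quick sketch of the string functions, based on the descriptions above:
println(trim("  hi  "))                    // hi
println(uppercase("harn"))                 // HARN
println(len(split("a,b,c", ",")))          // 3
println(replace("2024-01-15", "-", "/"))   // 2024/01/15
println(starts_with("harn.toml", "harn"))  // true
println(join(["a", "b", "c"], "-"))        // a-b-c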
String methods (dot syntax)
These are called on string values with dot notation: "hello".uppercase().
| Method | Parameters | Returns | Description |
|---|---|---|---|
.trim() | none | string | Remove leading/trailing whitespace |
.trim_start() | none | string | Remove leading whitespace only |
.trim_end() | none | string | Remove trailing whitespace only |
.lines() | none | list | Split string by newlines |
.char_at(index) | index: int | string or nil | Character at index (nil if out of bounds) |
.index_of(substr) | substr: string | int | First character offset of substring (-1 if not found) |
.last_index_of(substr) | substr: string | int | Last character offset of substring (-1 if not found) |
.lower() / .to_lower() | none | string | Lowercase string |
.len() | none | int | Character count |
.upper() / .to_upper() | none | string | Uppercase string |
.chars() | none | list | List of single-character strings |
.reverse() | none | string | Reversed string |
.repeat(n) | n: int | string | Repeat n times |
.pad_left(width, char?) | width: int, char: string | string | Pad to width with char (default space) |
.pad_right(width, char?) | width: int, char: string | string | Pad to width with char (default space) |
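The same operations are available in chained method form:
println("harn".upper())           // HARN
println("ab".repeat(3))           // ababab
println("7".pad_left(3, "0"))     // 007
println("hello".index_of("l"))    // 2
println("hello".reverse())        // olleh
println(len("one\ntwo".lines()))  // 2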
List methods (dot syntax)
| Method | Parameters | Returns | Description |
|---|---|---|---|
.map(fn) | fn: closure | list | Transform each element |
.filter(fn) | fn: closure | list | Keep elements where fn returns truthy |
.reduce(init, fn) | init: any, fn: closure | any | Fold with accumulator |
.find(fn) | fn: closure | any or nil | First element matching predicate |
.find_index(fn) | fn: closure | int | Index of first match (-1 if not found) |
.any(fn) | fn: closure | bool | True if any element matches |
.all(fn) / .every(fn) | fn: closure | bool | True if all elements match |
.none(fn?) | fn: closure | bool | True if no elements match (no arg: checks emptiness) |
.first(n?) | n: int (optional) | any or list | First element, or first n elements |
.last(n?) | n: int (optional) | any or list | Last element, or last n elements |
.partition(fn) | fn: closure | list | Split into [[truthy], [falsy]] |
.group_by(fn) | fn: closure | dict | Group into dict keyed by fn result |
.sort() / .sort_by(fn) | fn: closure (optional) | list | Sort (natural or by key function) |
.min() / .max() | none | any | Minimum/maximum value |
.min_by(fn) / .max_by(fn) | fn: closure | any | Min/max by key function |
.chunk(size) | size: int | list | Split into chunks of size |
.window(size) | size: int | list | Sliding windows of size |
.each_cons(size) | size: int | list | Sliding windows of size |
.compact() | none | list | Remove nil values |
.unique() | none | list | Remove duplicates |
.flatten() | none | list | Flatten one level of nesting |
.flat_map(fn) | fn: closure | list | Map then flatten |
.tally() | none | dict | Frequency count: {value: count} |
.zip(other) | other: list | list | Pair elements from two lists |
.enumerate() | none | list | List of {index, value} dicts |
.take(n) / .skip(n) | n: int | list | First/remaining n elements |
.sum() | none | int or float | Sum of numeric values |
.join(sep?) | sep: string | string | Join to string |
.reverse() | none | list | Reversed list |
.push(item) / .pop() | item: any | list | New list with item added/removed (immutable) |
.contains(item) | item: any | bool | Check if list contains item |
.index_of(item) | item: any | int | Index of item (-1 if not found) |
.slice(start, end?) | start: int, end: int | list | Slice with negative index support |
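A sketch of chained list methods, assuming the { x -> ... } closure form used in the pipeline examples earlier in this guide:
let nums = [3, 1, 4, 1, 5]
let doubled = nums.map({ n -> n * 2 })
println(doubled.sum())        // 28
println(nums.sort().first())  // 1
println(len(nums.unique()))   // 4
println(nums.contains(4))     // true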
Iterator methods
Eager list/dict/set/string methods listed above are unchanged — they
still return eager collections. Lazy iteration is opt-in via
.iter(), which lifts a list, dict, set, string, generator, or
channel into an Iter<T> value. Iterators are single-pass, fused,
and snapshot — they Rc-clone the backing collection, so mutating
the source after .iter() does not affect the iter.
On a dict, .iter() yields Pair(key, value) values (use .first /
.second, or destructure in a for-loop). String iteration yields
chars (Unicode scalar values).
Printing with log(it) renders <iter> or <iter (exhausted)> and
does not drain the iterator.
Lazy combinators (return a new Iter)
| Method | Parameters | Returns | Description |
|---|---|---|---|
.iter() | none | Iter<T> | Lift a source into an iter; no-op on an existing iter |
.map(fn) | fn: closure | Iter<U> | Lazily transform each item |
.filter(fn) | fn: closure | Iter<T> | Lazily keep items where fn returns truthy |
.flat_map(fn) | fn: closure | Iter<U> | Map then flatten, lazily |
.take(n) | n: int | Iter<T> | First n items |
.skip(n) | n: int | Iter<T> | Drop first n items |
.take_while(fn) | fn: closure | Iter<T> | Items until predicate first returns falsy |
.skip_while(fn) | fn: closure | Iter<T> | Drop items while predicate is truthy |
.zip(other) | other: iter | Iter<Pair<T, U>> | Pair items from two iters, stops at shorter |
.enumerate() | none | Iter<Pair<int, T>> | Pair each item with a 0-based index |
.chain(other) | other: iter | Iter<T> | Yield items from self, then from other |
.chunks(n) | n: int | Iter<list<T>> | Non-overlapping fixed-size chunks |
.windows(n) | n: int | Iter<list<T>> | Sliding windows of size n |
Sinks (drain the iter, return an eager value)
| Method | Parameters | Returns | Description |
|---|---|---|---|
.to_list() | none | list | Collect all items into a list |
.to_set() | none | set | Collect all items into a set |
.to_dict() | none | dict | Collect Pair(key, value) items into a dict |
.count() | none | int | Count remaining items |
.sum() | none | int or float | Sum of numeric items |
.min() / .max() | none | any | Min/max item |
.reduce(init, fn) | init: any, fn: closure | any | Fold with accumulator |
.first() / .last() | none | any or nil | First/last item |
.any(fn) | fn: closure | bool | True if any remaining item matches |
.all(fn) | fn: closure | bool | True if all remaining items match |
.find(fn) | fn: closure | any or nil | First item matching predicate |
.for_each(fn) | fn: closure | nil | Invoke fn on each remaining item |
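Putting it together, a lazy pipeline that filters, maps, and then drains with a sink (closure syntax assumed from earlier examples):
let total = [1, 2, 3, 4, 5, 6]
    .iter()
    .filter({ n -> n % 2 == 0 })
    .map({ n -> n * n })
    .sum()
println(total)  // 56 (4 + 16 + 36)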
Path functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
dirname(path) | path: string | string | Directory component of path |
basename(path) | path: string | string | File name component of path |
extname(path) | path: string | string | File extension including dot (e.g., .harn) |
path_join(parts...) | parts: strings | string | Join path components |
path_workspace_info(path, workspace_root?) | path: string, workspace_root?: string | dict | Classify a path as workspace_relative, host_absolute, or invalid, and project both workspace-relative and host-absolute forms when known |
path_workspace_normalize(path, workspace_root?) | path: string, workspace_root?: string | string or nil | Normalize a path into workspace-relative form when it is safely inside the workspace (including common leading-slash drift like /packages/...) |
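A quick sketch of the path helpers (path_join is assumed to use the platform separator; the output below shows the Unix form):
let p = path_join("src", "main.harn")
println(basename(p))  // main.harn
println(extname(p))   // .harn
println(dirname(p))   // src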
File I/O
| Function | Parameters | Returns | Description |
|---|---|---|---|
read_file(path) | path: string | string | Read entire file as UTF-8 string. Throws on failure. Deprecated in favor of read_file_result for new code; the throwing form remains supported. |
read_file_result(path) | path: string | Result<string, string> | Non-throwing read: returns Result.Ok(content) on success or Result.Err(message) on failure. Shares read_file’s content cache |
write_file(path, content) | path: string, content: string | nil | Write string to file. Throws on failure |
append_file(path, content) | path: string, content: string | nil | Append string to file, creating it if it doesn’t exist. Throws on failure |
copy_file(src, dst) | src: string, dst: string | nil | Copy a file. Throws on failure |
delete_file(path) | path: string | nil | Delete a file or directory (recursive). Throws on failure |
file_exists(path) | path: string | bool | Check if a file or directory exists |
list_dir(path?) | path: string (default ".") | list | List directory contents as sorted list of file names. Throws on failure |
mkdir(path) | path: string | nil | Create directory and all parent directories. Throws on failure |
stat(path) | path: string | dict | File metadata: {size, is_file, is_dir, readonly, modified}. Throws on failure |
temp_dir() | none | string | System temporary directory path |
render(path, bindings?) | path: string, bindings: dict | string | Read a template file relative to the current module’s asset root and render it. The template language supports {{ name }} interpolation (with nested paths and filters), {{ if }} / {{ elif }} / {{ else }} / {{ end }}, {{ for item in xs }} ... {{ end }} (with {{ loop.index }} etc.), {{ include "..." }} partials, {{# comments #}}, {{ raw }} ... {{ endraw }} verbatim blocks, and {{- -}} whitespace trim markers. See the Prompt templating reference for the full grammar and filter list. When called from an imported module, resolves relative to that module’s directory, not the entry pipeline. Without bindings, just reads the file |
render_prompt(path, bindings?) | path: string, bindings: dict | string | Prompt-oriented alias of render(...). Use this for .harn.prompt / .prompt assets when you want the asset to be surfaced explicitly in bundle manifests and preflight output |
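A minimal file I/O round-trip, using temp_dir() so the sketch does not touch the working directory:
let path = path_join(temp_dir(), "harn_demo.txt")
write_file(path, "hello")
println(read_file(path))  // hello
println(unwrap_or(read_file_result("no/such/file"), "fallback"))  // fallback
delete_file(path)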
Environment and system
| Function | Parameters | Returns | Description |
|---|---|---|---|
env(name) | name: string | string or nil | Read environment variable |
env_or(name, default) | name: string, default: any | string or default | Read environment variable, or return default when unset. One-line replacement for the common let v = env(K); if v { v } else { default } pattern |
timestamp() | none | float | Unix timestamp in seconds with sub-second precision |
elapsed() | none | int | Milliseconds since VM startup |
exec(cmd, args...) | cmd: string, args: strings | dict | Execute external command. Returns {stdout, stderr, status, success} |
exec_at(dir, cmd, args...) | dir: string, cmd: string, args: strings | dict | Execute external command inside a specific directory |
shell(cmd) | cmd: string | dict | Execute command via shell. Returns {stdout, stderr, status, success} |
shell_at(dir, cmd) | dir: string, cmd: string | dict | Execute shell command inside a specific directory |
exit(code) | code: int (default 0) | never | Terminate the process |
username() | none | string | Current OS username |
hostname() | none | string | Machine hostname |
platform() | none | string | OS name: "darwin", "linux", or "windows" |
arch() | none | string | CPU architecture (e.g., "aarch64", "x86_64") |
uuid() | none | string | Generate a random v4 UUID |
home_dir() | none | string | User’s home directory path |
pid() | none | int | Current process ID |
cwd() | none | string | Current working directory |
execution_root() | none | string | Directory used for source-relative execution helpers such as exec_at(...) / shell_at(...) |
asset_root() | none | string | Directory used for source-relative asset helpers such as render(...) / render_prompt(...) |
source_dir() | none | string | Directory of the currently-executing .harn file (falls back to cwd) |
project_root() | none | string or nil | Nearest ancestor directory containing harn.toml |
runtime_paths() | none | dict | Resolved runtime path model: {execution_root, asset_root, state_root, run_root, worktree_root} |
date_iso() | none | string | Current UTC time in ISO 8601 format (e.g., "2026-03-29T14:30:00.123Z") |
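A short sketch of the environment and process helpers (outputs depend on the machine):
let editor = env_or("EDITOR", "vi")
println(platform())  // e.g. "linux"
let result = exec("echo", "hi")
if result.success {
    println(trim(result.stdout))  // hi
}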
Regular expressions
| Function | Parameters | Returns | Description |
|---|---|---|---|
regex_match(pattern, text) | pattern: string, text: string | list or nil | Find all non-overlapping matches. Returns nil if no matches |
regex_replace(pattern, replacement, text) | pattern: string, replacement: string, text: string | string | Replace all matches. Throws on invalid regex |
regex_captures(pattern, text) | pattern: string, text: string | list | Find all matches with capture group details |
regex_captures
Returns a list of dicts, one per match. Each dict contains:
- match – the full matched string
- groups – a list of positional capture group values (from (...))
- Named capture groups (from (?P<name>...)) appear as additional keys
let results = regex_captures("(\\w+)@(\\w+)", "alice@example bob@test")
// [
// {match: "alice@example", groups: ["alice", "example"]},
// {match: "bob@test", groups: ["bob", "test"]}
// ]
Named capture groups are added as top-level keys on each result dict:
let named = regex_captures("(?P<user>\\w+):(?P<role>\\w+)", "alice:admin")
// [{match: "alice:admin", groups: ["alice", "admin"], user: "alice", role: "admin"}]
Returns an empty list if there are no matches. Throws on invalid regex.
Encoding
| Function | Parameters | Returns | Description |
|---|---|---|---|
base64_encode(string) | string: string | string | Base64 encode a string (standard alphabet with padding) |
base64_decode(string) | string: string | string | Base64 decode a string. Throws on invalid input |
url_encode(string) | string: string | string | URL percent-encode a string. Unreserved characters (alphanumeric, -, _, ., ~) pass through unchanged |
url_decode(string) | string: string | string | Decode a URL-encoded string. Decodes %XX sequences and + as space |
Example:
let encoded = base64_encode("Hello, World!")
println(encoded) // SGVsbG8sIFdvcmxkIQ==
println(base64_decode(encoded)) // Hello, World!
println(url_encode("hello world")) // hello%20world
println(url_decode("hello%20world")) // hello world
println(url_encode("a=1&b=2")) // a%3D1%26b%3D2
println(url_decode("hello+world")) // hello world
Hashing
| Function | Parameters | Returns | Description |
|---|---|---|---|
sha256(string) | string: string | string | SHA-256 hash, returned as a lowercase hex-encoded string |
md5(string) | string: string | string | MD5 hash, returned as a lowercase hex-encoded string |
Example:
println(sha256("hello")) // 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
println(md5("hello")) // 5d41402abc4b2a76b9719d911017c592
Date/Time
| Function | Parameters | Returns | Description |
|---|---|---|---|
date_now() | none | dict | Current UTC datetime as dict with year, month, day, hour, minute, second, weekday, and timestamp fields |
date_parse(str) | str: string | float | Parse a datetime string (e.g., "2024-01-15 10:30:00") into a Unix timestamp. Extracts numeric components from the string. Throws if fewer than 3 parts (year, month, day). Validates month (1-12), day (1-31), hour (0-23), minute (0-59), second (0-59) |
date_format(dt, format?) | dt: float, int, or dict; format: string (default "%Y-%m-%d %H:%M:%S") | string | Format a timestamp or date dict as a string. Supports %Y, %m, %d, %H, %M, %S placeholders. Throws for negative timestamps |
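A round-trip example based on the formats described above:
let ts = date_parse("2024-01-15 10:30:00")
println(date_format(ts))               // 2024-01-15 10:30:00
println(date_format(ts, "%Y/%m/%d"))   // 2024/01/15
let now = date_now()
println(now.year >= 2024)              // true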
Testing
| Function | Parameters | Returns | Description |
|---|---|---|---|
assert(condition, msg?) | condition: any, msg: string (optional) | nil | Assert value is truthy. Throws with message on failure |
assert_eq(a, b, msg?) | a: any, b: any, msg: string (optional) | nil | Assert two values are equal. Throws with message on failure |
assert_ne(a, b, msg?) | a: any, b: any, msg: string (optional) | nil | Assert two values are not equal. Throws with message on failure |
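Example:
assert(1 < 2, "comparison should hold")
assert_eq(to_int("7"), 7)
assert_ne(uppercase("a"), "a")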
HTTP
| Function | Parameters | Returns | Description |
|---|---|---|---|
http_get(url, options?) | url: string, options: dict | dict | GET request |
http_post(url, body, options?) | url: string, body: string, options: dict | dict | POST request |
http_put(url, body, options?) | url: string, body: string, options: dict | dict | PUT request |
http_patch(url, body, options?) | url: string, body: string, options: dict | dict | PATCH request |
http_delete(url, options?) | url: string, options: dict | dict | DELETE request |
http_request(method, url, options?) | method: string, url: string, options: dict | dict | Generic HTTP request |
All HTTP functions return {status: int, headers: dict, body: string, ok: bool}.
Options: timeout (ms), retries, backoff (ms), headers (dict),
auth (string or {bearer: "token"} or {basic: {user, password}}),
follow_redirects (bool), max_redirects (int), body (string).
Throws on network errors.
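A sketch of a request with options (the URL and token here are illustrative placeholders):
let resp = http_get("https://api.example.com/items", {
    timeout: 5000,
    retries: 2,
    headers: {accept: "application/json"},
    auth: {bearer: "my-token"}
})
if resp.ok {
    println(resp.status)  // 200
}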
Mock HTTP
For testing pipelines that make HTTP calls without hitting real servers.
| Function | Parameters | Returns | Description |
|---|---|---|---|
http_mock(method, url_pattern, response) | method: string, url_pattern: string, response: dict | nil | Register a mock. Use * in url_pattern for glob matching (supports multiple * wildcards, e.g., https://api.example.com/*/items/*) |
http_mock_clear() | none | nil | Clear all mocks and recorded calls |
http_mock_calls() | none | list | Return list of {method, url, body} for all intercepted calls |
http_mock("GET", "https://api.example.com/users", {
status: 200,
body: "{\"users\": [\"alice\"]}",
headers: {}
})
let resp = http_get("https://api.example.com/users")
assert_eq(resp.status, 200)
Interactive input
| Function | Parameters | Returns | Description |
|---|---|---|---|
prompt_user(msg?) | msg: string (optional) | string | Display message, read line from stdin |
Host interop
| Function | Parameters | Returns | Description |
|---|---|---|---|
host_call(name, args) | name: string, args: any | any | Call a host capability operation using capability.operation naming |
host_capabilities() | — | dict | Typed host capability manifest |
host_has(capability, op?) | capability: string, op: string | bool | Check whether a typed host capability/operation exists |
host_tool_list() | — | list | List host-exposed bridge tools as {name, description, schema, deprecated} |
host_tool_call(name, args) | name: string, args: any | any | Invoke a bridge-exposed host tool by name using the existing builtin_call path |
host_mock(capability, op, response_or_config, params?) | capability: string, op: string, response_or_config: any or dict, params: dict | nil | Register a runtime mock for a typed host operation |
host_mock_clear() | — | nil | Clear registered typed host mocks and recorded mock invocations |
host_mock_calls() | — | list | Return recorded typed host mock invocations |
host_capabilities() returns the capability manifest surfaced by the active
host bridge. The local runtime exposes generic process, template, and
interaction capabilities. Product hosts can add capabilities such as
workspace, project, runtime, editor, git, or diagnostics.
Prefer host_call("capability.operation", args) in shared wrappers and
host-owned .harn modules so capability names stay consistent across the
runtime, host manifest, and preflight validation.
host_tool_list() is the discovery surface for host-native tools such as
Read, Edit, Bash, or IDE actions exposed by the active bridge host.
Without a bridge it returns []. host_tool_call(name, args) uses that same
bridge host’s existing dynamic builtin dispatch path, so scripts can discover a
tool at runtime and then call it by name without hard-coding it into the
initial prompt. Import std/host when you want small helpers such as
host_tool_lookup(name) or host_tool_available(name).
host_mock(...) is intended for tests and local conformance runs. The third
argument may be either a direct result value or a config dict containing
result, params, and/or error. Mock matching is last-write-wins and only
requires the declared params subset to match the actual host call.
Matched calls are recorded in host_mock_calls() as
{capability, operation, params} dictionaries.
For higher-level test helpers, import std/testing:
import {
assert_host_called,
clear_host_mocks,
mock_host_error,
mock_host_result,
} from "std/testing"
clear_host_mocks()
mock_host_result("project", "metadata_get", "hello", {dir: ".", namespace: "facts"})
assert_eq(host_call("project.metadata_get", {dir: ".", namespace: "facts"}), "hello")
assert_host_called("project", "metadata_get", {dir: ".", namespace: "facts"}, nil)
mock_host_error("project", "scan", "scan failed", nil)
let result = try { host_call("project.scan", {}) }
assert(is_err(result))
Async and timing
| Function | Parameters | Returns | Description |
|---|---|---|---|
sleep(duration) | duration: int (ms) or duration literal | nil | Pause execution |
Concurrency primitives
Channels
| Function | Parameters | Returns | Description |
|---|---|---|---|
channel(name?) | name: string (default "default") | dict | Create a channel with name, type, and messages fields |
send(ch, value) | ch: dict, value: any | nil | Send a value to a channel |
receive(ch) | ch: dict | any | Receive a value from a channel (blocks until data available) |
close_channel(ch) | ch: channel | nil | Close a channel, preventing further sends |
try_receive(ch) | ch: channel | any or nil | Non-blocking receive. Returns nil if no data available |
select(ch1, ch2, ...) | channels: channel | dict or nil | Wait for data on any channel. Returns {index, value, channel} for the first ready channel, or nil if all closed |
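Example (a sketch assuming FIFO delivery; in a real pipeline the sender usually runs in a spawned task):

```harn
let ch = channel("results")
send(ch, "first")
send(ch, "second")
// receive blocks until data is available; here the buffer already has items.
println(receive(ch))
// try_receive never blocks and returns nil when the channel is empty.
println(try_receive(ch))
close_channel(ch)
```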
Atomics
| Function | Parameters | Returns | Description |
|---|---|---|---|
atomic(initial?) | initial: any (default 0) | dict | Create an atomic value |
atomic_get(a) | a: dict | any | Read the current value |
atomic_set(a, value) | a: dict, value: any | int | Set value, returns previous value |
atomic_add(a, delta) | a: dict, delta: int | int | Add delta, returns previous value |
atomic_cas(a, expected, new) | a: dict, expected: int, new: int | bool | Compare-and-swap. Returns true if the swap succeeded |
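Example (a shared counter plus a compare-and-swap guard):

```harn
let counter = atomic(0)
atomic_add(counter, 1)
atomic_add(counter, 1)
println(atomic_get(counter)) // 2

// Compare-and-swap succeeds only when the current value equals expected.
if atomic_cas(counter, 2, 10) {
    println("swapped to 10")
}
```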
Persistent store
| Function | Parameters | Returns | Description |
|---|---|---|---|
store_get(key) | key: string | any | Retrieve value from store, nil if missing |
store_set(key, value) | key: string, value: any | nil | Store value, auto-saves to .harn/store.json |
store_delete(key) | key: string | nil | Remove key from store |
store_list() | none | list | List all keys (sorted) |
store_save() | none | nil | Explicitly flush store to disk |
store_clear() | none | nil | Remove all keys from store |
The store is backed by .harn/store.json relative to the script’s
directory. The file is created lazily on first store_set. In bridge mode,
the host can override these builtins.
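Example:

```harn
store_set("greeting", "hello")   // auto-saves to .harn/store.json
println(store_get("greeting"))   // "hello"
println(store_get("missing"))    // nil when the key is absent
println(store_list())            // sorted list of keys
store_delete("greeting")
```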
LLM
See LLM calls and agent loops for full documentation.
| Function | Parameters | Returns | Description |
|---|---|---|---|
llm_call(prompt, system?, options?) | prompt: string, system: string, options: dict | dict | Single LLM request. Returns {text, model, input_tokens, output_tokens}. Throws on transport / rate-limit / schema-validation failures |
llm_call_safe(prompt, system?, options?) | prompt: string, system: string, options: dict | dict | Non-throwing envelope around llm_call. Returns {ok: bool, response: dict or nil, error: {category, message} or nil}. error.category is one of ErrorCategory’s canonical strings ("rate_limit", "timeout", "overloaded", "server_error", "transient_network", "schema_validation", "auth", "not_found", "circuit_open", "tool_error", "tool_rejected", "cancelled", "generic") |
with_rate_limit(provider, fn, options?) | provider: string, fn: closure, options: dict | whatever fn returns | Acquire a permit from the provider’s sliding-window rate limiter, invoke fn, and retry with exponential backoff on retryable errors (rate_limit, overloaded, transient_network, timeout). Options: max_retries (default 5), backoff_ms (default 1000, capped at 30s after doubling) |
llm_completion(prefix, suffix?, system?, options?) | prefix: string, suffix: string, system: string, options: dict | dict | Text completion / fill-in-the-middle request. Returns {text, model, input_tokens, output_tokens} |
agent_loop(prompt, system?, options?) | prompt: string, system: string, options: dict | dict | Multi-turn agent loop with ##DONE## sentinel, daemon/idling support, and optional per-turn context filtering. Returns {status, text, iterations, duration_ms, tools_used} |
daemon_spawn(config) | config: dict | dict | Start a daemon-mode agent and return a daemon handle with persisted state + queue metadata |
daemon_trigger(handle, event) | handle: dict or string, event: any | dict | Enqueue a durable FIFO trigger event for a running daemon; throws VmError::DaemonQueueFull on overflow |
daemon_snapshot(handle) | handle: dict or string | dict | Return the latest daemon snapshot plus live queue state (pending_events, inflight_event, counts, capacity) |
daemon_stop(handle) | handle: dict or string | dict | Stop a daemon and preserve queued trigger state for resume |
daemon_resume(path) | path: string | dict | Resume a daemon from its persisted state directory |
trigger_list() | — | list | Return the live trigger registry snapshot as list<TriggerBinding> |
trigger_register(config) | config: dict | dict | Dynamically register a trigger and return its TriggerHandle |
trigger_fire(handle, event) | handle: dict or string, event: dict | dict | Fire a synthetic event into a trigger and return a DispatchHandle; execution routes through the trigger dispatcher |
trigger_replay(event_id) | event_id: string | dict | Fetch a historical event from triggers.events, re-dispatch it through the trigger dispatcher, and thread replay_of_event_id through the returned DispatchHandle |
trigger_inspect_dlq() | — | list | Return the current DLQ snapshot as list<DlqEntry> with retry history |
trigger_test_harness(fixture) | fixture: string or {fixture: string} | dict | Run a named trigger-system harness fixture and return a structured report. Intended for Rust/unit/conformance coverage of cron, webhook, retry, DLQ, dedupe, rate-limit, cost-guard, recovery, and dead-man-switch scenarios |
llm_info() | — | dict | Current LLM config: {provider, model, api_key_set} |
llm_usage() | — | dict | Cumulative usage: {input_tokens, output_tokens, total_duration_ms, call_count, total_calls} |
llm_resolve_model(alias) | alias: string | dict | Resolve model alias to {id, provider} via providers.toml |
llm_pick_model(target, options?) | target: string, options: dict | dict | Resolve a model alias or tier to {id, provider, tier} |
llm_infer_provider(model_id) | model_id: string | string | Infer provider from model ID (e.g. "claude-*" → "anthropic") |
llm_model_tier(model_id) | model_id: string | string | Get capability tier: "small", "mid", or "frontier" |
llm_healthcheck(provider?) | provider: string | dict | Validate API key. Returns {valid, message, metadata} |
llm_rate_limit(provider, options?) | provider: string, options: dict | int/nil/bool | Set ({rpm: N}), query, or clear ({rpm: 0}) per-provider rate limit |
llm_providers() | — | list | List all configured provider names |
llm_config(provider?) | provider: string | dict | Get provider config (base_url, auth_style, etc.) |
llm_cost(model, input_tokens, output_tokens) | model: string, input_tokens: int, output_tokens: int | float | Estimate USD cost from embedded pricing table |
llm_session_cost() | — | dict | Session totals: {total_cost, input_tokens, output_tokens, call_count} |
llm_budget(max_cost) | max_cost: float | nil | Set session budget in USD. LLM calls throw if exceeded |
llm_budget_remaining() | — | float or nil | Remaining budget (nil if no budget set) |
llm_mock(response) | response: dict | nil | Queue a mock LLM response. Dict supports text, tool_calls, match (glob), consume_match (consume a matched pattern instead of reusing it), input_tokens, output_tokens, thinking, stop_reason, model, error: {category, message} (short-circuits the call and surfaces as VmError::CategorizedError — useful for testing llm_call_safe envelopes and with_rate_limit retry loops) |
llm_mock_calls() | — | list | Return list of {messages, system, tools} for all calls made to the mock provider |
llm_mock_clear() | — | nil | Clear all queued mock responses and recorded calls |
FIFO mocks (no match field) are consumed in order. Pattern-matched mocks
(with match) are checked in declaration order against the request transcript
text using glob patterns. They persist by default; add consume_match: true
to advance through matching fixtures step by step. When no mocks match, the
default deterministic mock behavior is used.
See Trigger stdlib for the typed std/triggers aliases,
DLQ entry shapes, and the current shallow-path replay / manual-fire caveats.
// Queue specific responses for the mock provider
llm_mock({text: "The answer is 42."})
llm_mock({
text: "Let me check that.",
tool_calls: [{name: "read_file", arguments: {path: "main.rs"}}],
})
let r = llm_call("question", nil, {provider: "mock"})
assert_eq(r.text, "The answer is 42.")
// Pattern-matched mocks (reusable, not consumed)
llm_mock({text: "Hello!", match: "*greeting*"})
llm_mock({text: "step 1", match: "*planner*", consume_match: true})
llm_mock({text: "step 2", match: "*planner*", consume_match: true})
// Error injection for testing resilient code paths. The mock
// surfaces as a real `VmError::CategorizedError`, so `error_category`,
// `try { ... } catch`, `llm_call_safe`, and `with_rate_limit` all see
// it the same way they would a live provider failure.
llm_mock({error: {category: "rate_limit", message: "429 Too Many Requests"}})
// Inspect what was sent
let calls = llm_mock_calls()
llm_mock_clear()
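The error-injection mock composes with with_rate_limit for testing retry loops. A sketch (the zero-argument closure syntax here is illustrative):

```harn
// First call fails with a retryable category, second succeeds.
llm_mock({error: {category: "rate_limit", message: "429 Too Many Requests"}})
llm_mock({text: "recovered"})
let resp = with_rate_limit("mock", {
    -> llm_call("question", nil, {provider: "mock"})
}, {max_retries: 2, backoff_ms: 10})
assert_eq(resp.text, "recovered")
llm_mock_clear()
```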
Transcript helpers
| Function | Parameters | Returns | Description |
|---|---|---|---|
transcript(metadata?) | metadata: dict | dict | Create a new transcript |
transcript_from_messages(messages_or_transcript) | list or dict | dict | Normalize a message list into a transcript |
transcript_messages(transcript) | transcript: dict | list | Get transcript messages |
transcript_summary(transcript) | transcript: dict | string or nil | Get transcript summary |
transcript_id(transcript) | transcript: dict | string | Get transcript id |
transcript_export(transcript) | transcript: dict | string | Export transcript as JSON |
transcript_import(json_text) | json_text: string | dict | Import transcript JSON |
transcript_fork(transcript, options?) | transcript: dict, options: dict | dict | Fork transcript, optionally dropping messages or summary |
transcript_summarize(transcript, options?) | transcript: dict, options: dict | dict | Summarize and compact a transcript via llm_call |
transcript_compact(transcript, options?) | transcript: dict, options: dict | dict | Compact a transcript with the runtime compaction engine, preserving durable artifacts and compaction events |
transcript_auto_compact(messages, options?) | messages: list, options: dict | list | Apply the agent-loop compaction pipeline to a message list using llm, truncate, or custom strategy |
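Example (the {role, content} message shape mirrors the prompt-handler convention; exact transcript fields may vary):

```harn
let t = transcript_from_messages([
    {role: "user", content: "hi"},
    {role: "assistant", content: "hello"},
])
println(transcript_id(t))

// Round-trip through JSON.
let json = transcript_export(t)
let restored = transcript_import(json)
println(transcript_messages(restored))

// Fork without mutating the original.
let branch = transcript_fork(t, {})
```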
Provider configuration
LLM provider endpoints, model aliases, inference rules, and default parameters are configured via a TOML file. The VM searches for config in this order:
- Built-in defaults (Anthropic, OpenAI, OpenRouter, HuggingFace, Ollama, Local)
- HARN_PROVIDERS_CONFIG if set, otherwise ~/.config/harn/providers.toml
- Installed package [llm] tables in .harn/packages/*/harn.toml
- The nearest project harn.toml [llm] table
The [llm] section uses the same schema as providers.toml, so project and
package manifests can ship provider adapters declaratively:
[llm.providers.anthropic]
base_url = "https://api.anthropic.com/v1"
auth_style = "header"
auth_header = "x-api-key"
auth_env = "ANTHROPIC_API_KEY"
chat_endpoint = "/messages"
[llm.providers.local]
base_url = "http://localhost:8000"
base_url_env = "LOCAL_LLM_BASE_URL"
auth_style = "none"
chat_endpoint = "/v1/chat/completions"
completion_endpoint = "/v1/completions"
[llm.aliases]
sonnet = { id = "claude-sonnet-4-20250514", provider = "anthropic" }
[[llm.inference_rules]]
pattern = "claude-*"
provider = "anthropic"
[[llm.tier_rules]]
pattern = "claude-*"
tier = "frontier"
[llm.model_defaults."qwen/*"]
temperature = 0.3
Timers
| Function | Parameters | Returns | Description |
|---|---|---|---|
timer_start(name?) | name: string | dict | Start a named timer |
timer_end(timer) | timer: dict | int | Stop timer, prints elapsed, returns milliseconds |
elapsed() | — | int | Milliseconds since process start |
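Example:

```harn
let t = timer_start("fetch")
// ... do work ...
let ms = timer_end(t)   // prints elapsed, returns milliseconds
println("fetch took ${ms}ms; process uptime ${elapsed()}ms")
```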
Circuit breakers
Protect against cascading failures by tracking error counts and opening a circuit when a threshold is reached.
| Function | Parameters | Returns | Description |
|---|---|---|---|
circuit_breaker(name, threshold?, reset_ms?) | name: string, threshold: int (default 5), reset_ms: int (default 30000) | string | Create a named circuit breaker. Returns the name |
circuit_check(name) | name: string | string | Check state: "closed", "open", or "half_open" (after reset period) |
circuit_record_failure(name) | name: string | bool | Record a failure. Returns true if the circuit just opened |
circuit_record_success(name) | name: string | nil | Record a success, resetting failure count and closing the circuit |
circuit_reset(name) | name: string | nil | Manually reset the circuit to closed |
Example:
circuit_breaker("api", 3, 10000)
for i in 0 to 5 exclusive {
if circuit_check("api") == "open" {
println("circuit open, skipping call")
} else {
try {
let resp = http_get("https://api.example.com/data")
circuit_record_success("api")
} catch e {
circuit_record_failure("api")
}
}
}
Tracing
Distributed tracing primitives for instrumenting pipeline execution.
| Function | Parameters | Returns | Description |
|---|---|---|---|
trace_start(name) | name: string | dict | Start a trace span. Returns a span dict with trace_id, span_id, name, start_ms |
trace_end(span) | span: dict | nil | End a span and emit a structured log line with duration |
trace_id() | none | string or nil | Current trace ID from the span stack, or nil if no active span |
enable_tracing(enabled?) | enabled: bool (default true) | nil | Enable or disable pipeline-level tracing |
trace_spans() | none | list | Peek at recorded trace spans |
trace_summary() | none | string | Formatted summary of trace spans |
Example:
let span = trace_start("fetch_data")
// ... do work ...
trace_end(span)
println(trace_summary())
Agent trace events
Fine-grained agent loop trace events for observability and debugging.
Events are collected during agent_loop execution and can be inspected
after the loop completes.
| Function | Parameters | Returns | Description |
|---|---|---|---|
agent_trace() | none | list | Peek at collected agent trace events. Each event is a dict with a type field (llm_call, tool_execution, tool_rejected, loop_intervention, context_compaction, phase_change, loop_complete) and type-specific fields |
agent_trace_summary() | none | dict | Rolled-up summary of agent trace events with aggregated token counts, durations, tool usage, and iteration counts |
Example:
let result = agent_loop("summarize this file", tools: [read_file])
let summary = agent_trace_summary()
println("LLM calls: " + str(summary.llm_calls))
println("Tools used: " + str(summary.tools_used))
Error classification
Structured error throwing and classification for retry logic and error handling.
| Function | Parameters | Returns | Description |
|---|---|---|---|
throw_error(message, category?) | message: string, category: string | never | Throw a categorized error. The error is a dict with message and category fields |
error_category(err) | err: any | string | Extract category from a caught error. Returns "timeout", "auth", "rate_limit", "tool_error", "cancelled", "not_found", "circuit_open", or "generic" |
is_timeout(err) | err: any | bool | Check if error is a timeout |
is_rate_limited(err) | err: any | bool | Check if error is a rate limit |
Example:
try {
throw_error("request timed out", "timeout")
} catch e {
if is_timeout(e) {
println("will retry after backoff")
}
println(error_category(e)) // "timeout"
}
Tool registry (low-level)
Low-level tool management functions for building and inspecting tool
registries programmatically. For MCP serving, see the tool_define /
mcp_tools API above.
| Function | Parameters | Returns | Description |
|---|---|---|---|
tool_remove(registry, name) | registry, name: string | dict | Remove a tool by name |
tool_list(registry) | registry: dict | list | List tools as [{name, description, parameters}] |
tool_find(registry, name) | registry, name: string | dict or nil | Find a tool entry by name |
tool_select(registry, names) | registry: dict, names: list | dict | Return a registry containing only the named tools |
tool_count(registry) | registry: dict | int | Number of tools in the registry |
tool_describe(registry) | registry: dict | string | Human-readable summary of all tools |
tool_schema(registry, components?) | registry, components: dict | dict | Generate JSON Schema for all tools |
tool_prompt(registry) | registry: dict | string | Generate an LLM system prompt describing available tools |
tool_parse_call(text) | text: string | list | Parse <tool_call>...</tool_call> XML from LLM output |
tool_format_result(name, result) | name, result: string | string | Format a <tool_result> XML envelope |
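Example (building on tool_registry / tool_define from the MCP server section):

```harn
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone", {
    parameters: { name: {type: "string"} },
    handler: { args -> "Hello, ${args.name}!" }
})

println(tool_count(tools))           // 1
println(tool_find(tools, "greet"))   // the tool entry, or nil
let subset = tool_select(tools, ["greet"])
println(tool_describe(subset))       // human-readable summary
println(tool_prompt(subset))         // system prompt for an LLM
```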
Structured logging
| Function | Parameters | Returns | Description |
|---|---|---|---|
log_json(key, value) | key: string, value: any | nil | Emit a JSON log line with timestamp |
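Example:

```harn
// Emits a timestamped JSON log line keyed by "deploy".
log_json("deploy", {service: "api", status: "ok"})
```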
Metadata
Project metadata store backed by host-managed sharded JSON files.
Supports hierarchical namespace resolution (child directories inherit
from parents). The default filesystem backend persists namespace shards
under .harn/metadata/<namespace>/entries.json and still reads the legacy
monolithic root.json shard.
| Function | Parameters | Returns | Description |
|---|---|---|---|
metadata_get(dir, namespace?) | dir: string, namespace: string | dict or nil | Read metadata with inheritance |
metadata_resolve(dir, namespace?) | dir: string, namespace: string | dict or nil | Read resolved metadata while preserving namespaces |
metadata_entries(namespace?) | namespace: string | list | List stored directories with local and resolved metadata |
metadata_set(dir, namespace, data) | dir: string, namespace: string, data: dict | nil | Write metadata for directory/namespace |
metadata_save() | — | nil | Flush metadata to disk |
metadata_stale(project) | project: string | dict | Check staleness: {any_stale, tier1, tier2} |
metadata_status(namespace?) | namespace: string | dict | Summarize directory counts, namespaces, missing hashes, and stale state |
metadata_refresh_hashes() | — | nil | Recompute content hashes |
compute_content_hash(dir) | dir: string | string | Hash of directory contents |
invalidate_facts(dir) | dir: string | nil | Mark cached facts as stale |
scan_directory(path?, pattern_or_options?, options?) | path: string, pattern: string or options: dict | list | Enumerate files and directories with optional pattern, max_depth, include_hidden, include_dirs, include_files |
MCP (Model Context Protocol)
Connect to external tool servers using the Model Context Protocol. Harn supports stdio transport (spawns a child process) and HTTP transport for remote MCP servers.
| Function | Parameters | Returns | Description |
|---|---|---|---|
mcp_connect(command, args?) | command: string, args: list | mcp_client | Spawn an MCP server and perform the initialize handshake |
mcp_list_tools(client) | client: mcp_client | list | List available tools from the server |
mcp_call(client, name, arguments?) | client: mcp_client, name: string, arguments: dict | string or list | Call a tool and return the result |
mcp_list_resources(client) | client: mcp_client | list | List available resources from the server |
mcp_list_resource_templates(client) | client: mcp_client | list | List resource templates (URI templates) from the server |
mcp_read_resource(client, uri) | client: mcp_client, uri: string | string or list | Read a resource by URI |
mcp_list_prompts(client) | client: mcp_client | list | List available prompts from the server |
mcp_get_prompt(client, name, arguments?) | client: mcp_client, name: string, arguments: dict | dict | Get a prompt with optional arguments |
mcp_server_info(client) | client: mcp_client | dict | Get connection info (name, connected) |
mcp_disconnect(client) | client: mcp_client | nil | Kill the server process and release resources |
Example:
let client = mcp_connect("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
let tools = mcp_list_tools(client)
println(tools)
let result = mcp_call(client, "read_file", {"path": "/tmp/hello.txt"})
println(result)
mcp_disconnect(client)
Notes:
mcp_callreturns a string when the tool produces a single text block, a list of content dicts for multi-block results, or nil when empty.- If the tool reports
isError: true,mcp_callthrows the error text. mcp_connectthrows if the command cannot be spawned or the initialize handshake fails.
Auto-connecting MCP servers via harn.toml
Instead of calling mcp_connect manually, you can declare MCP servers in
harn.toml. They will be connected automatically before the pipeline executes
and made available through the global mcp dict.
Add a [[mcp]] entry for each server:
[[mcp]]
name = "filesystem"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
[[mcp]]
name = "github"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
Each entry requires:
| Field | Type | Description |
|---|---|---|
name | string | Identifier used to access the client (e.g., mcp.filesystem) |
command | string | Executable to spawn for stdio transports |
args | list of strings | Command-line arguments for stdio transports (default: empty) |
transport | string | stdio (default) or http |
url | string | Remote MCP server URL for HTTP transports |
auth_token | string | Optional explicit bearer token for HTTP transports |
client_id | string | Optional pre-registered OAuth client ID for HTTP transports |
client_secret | string | Optional pre-registered OAuth client secret |
scopes | string | Optional OAuth scope string for login/consent |
protocol_version | string | Optional MCP protocol version override |
The connected clients are available as properties on the mcp global dict:
pipeline default() {
let tools = mcp_list_tools(mcp.filesystem)
println(tools)
let result = mcp_call(mcp.github, "list_issues", {repo: "harn"})
println(result)
}
If a server fails to connect, a warning is printed to stderr and that
server is omitted from the mcp dict. Other servers still connect
normally. The mcp global is only defined when at least one server
connects successfully.
For HTTP MCP servers, use the CLI to establish OAuth once and let Harn reuse the stored token automatically:
harn mcp redirect-uri
harn mcp login notion
MCP server mode
Harn pipelines can expose tools, resources, resource templates, and prompts
as an MCP server using harn mcp-serve. The CLI serves them over stdio
using the MCP protocol, making them callable by Claude Desktop, Cursor,
or any MCP client.
Declarative syntax (preferred):
tool greet(name: string) -> string {
description "Greet someone by name"
"Hello, " + name + "!"
}
The tool keyword declares a tool with typed parameters, an optional
description, and a body. Parameter types map to JSON Schema
(string -> "string", int -> "integer", float -> "number",
bool -> "boolean"). Parameters with default values are emitted as
optional schema fields (required: false) and carry their default
value into the generated tool registry entry. Each tool declaration produces its own
tool registry dict.
Programmatic API:
| Function | Parameters | Returns | Description |
|---|---|---|---|
tool_registry() | — | dict | Create an empty tool registry |
tool_define(registry, name, desc, config) | registry, name, desc: string, config: dict | dict | Add a tool (config: {parameters, handler, returns?, annotations?, ...}) |
mcp_tools(registry) | registry: dict | nil | Register tools for MCP serving |
mcp_resource(config) | config: dict | nil | Register a static resource ({uri, name, text, description?, mime_type?}) |
mcp_resource_template(config) | config: dict | nil | Register a resource template ({uri_template, name, handler, description?, mime_type?}) |
mcp_prompt(config) | config: dict | nil | Register a prompt ({name, handler, description?, arguments?}) |
Tool annotations (MCP spec annotations field) can be passed in the
tool_define config to describe tool behavior:
tools = tool_define(tools, "search", "Search files", {
parameters: { query: {type: "string"} },
returns: {type: "string"},
handler: { args -> "results for ${args.query}" },
annotations: {
title: "File Search",
readOnlyHint: true,
destructiveHint: false
}
})
Unknown tool_define config keys are preserved on the tool entry. Workflow
graphs use this to carry runtime policy metadata directly on a tool registry,
for example:
tools = tool_define(tools, "read", "Read files", {
parameters: { path: {type: "string"} },
returns: {type: "string"},
handler: nil,
policy: {
capabilities: {workspace: ["read_text"]},
side_effect_level: "read_only",
path_params: ["path"],
mutation_classification: "read_only"
}
})
When a workflow node uses that registry, Harn intersects the declared tool policy with the graph, node, and host ceilings during validation and at execution time.
Declarative tool approval
agent_loop, workflow_execute, and workflow stage nodes accept an
approval_policy option that declaratively gates tool calls:
agent_loop("task", "system", {
approval_policy: {
auto_approve: ["read*", "list_*"],
auto_deny: ["shell*"],
require_approval: ["edit_*", "write_*"],
write_path_allowlist: ["/workspace/**"]
}
})
Evaluation order: auto_deny → write_path_allowlist → auto_approve →
require_approval. Tools that match no pattern default to AutoApproved.
require_approval calls the host via the canonical ACP
session/request_permission request and fails closed if the host
does not implement it. Policies compose
across nested scopes with most-restrictive intersection: auto-deny and
require-approval take the union, while auto_approve and
write_path_allowlist take the intersection.
Example (agent.harn):
pipeline main(task) {
var tools = tool_registry()
tools = tool_define(tools, "greet", "Greet someone", {
parameters: { name: {type: "string"} },
returns: {type: "string"},
handler: { args -> "Hello, ${args.name}!" }
})
mcp_tools(tools)
mcp_resource({
uri: "docs://readme",
name: "README",
text: "# My Agent\nA demo MCP server."
})
mcp_resource_template({
uri_template: "config://{key}",
name: "Config Values",
handler: { args -> "value for ${args.key}" }
})
mcp_prompt({
name: "review",
description: "Code review prompt",
arguments: [{ name: "code", required: true }],
handler: { args -> "Please review:\n${args.code}" }
})
}
Run as an MCP server:
harn mcp-serve agent.harn
Configure in Claude Desktop (claude_desktop_config.json):
{
"mcpServers": {
"my-agent": {
"command": "harn",
"args": ["mcp-serve", "agent.harn"]
}
}
}
Notes:
- mcp_tools(registry) (or the alias mcp_serve) must be called to register tools.
- Resources, resource templates, and prompts are registered individually.
- All print/println output goes to stderr (stdout is the MCP transport).
- The server supports the 2025-11-25 MCP protocol version over stdio.
- Tool handlers receive arguments as a dict and should return a string result.
- Prompt handlers receive arguments as a dict and return a string (single user message) or a list of {role, content} dicts.
- Resource template handlers receive URI template variables as a dict and return the resource text.
Workflow and orchestration builtins
These builtins expose Harn’s typed orchestration runtime.
Workflow graph and planning
| Function | Parameters | Returns | Description |
|---|---|---|---|
workflow_graph(config) | config: dict | workflow graph | Normalize a workflow definition into the typed workflow IR |
workflow_validate(graph, ceiling?) | graph: workflow, ceiling: dict (optional) | dict | Validate graph structure and capability ceilings |
workflow_inspect(graph, ceiling?) | graph: workflow, ceiling: dict (optional) | dict | Return graph plus validation summary |
workflow_clone(graph) | graph: workflow | workflow graph | Clone a workflow and append an audit entry |
workflow_insert_node(graph, node, edge?) | graph, node, edge | workflow graph | Insert a node and optional edge |
workflow_replace_node(graph, node_id, node) | graph, node_id, node | workflow graph | Replace a node definition |
workflow_rewire(graph, from, to, branch?) | graph, from, to, branch | workflow graph | Rewire an edge |
workflow_set_model_policy(graph, node_id, policy) | graph, node_id, policy | workflow graph | Set per-node model policy |
workflow_set_context_policy(graph, node_id, policy) | graph, node_id, policy | workflow graph | Set per-node context policy |
workflow_set_auto_compact(graph, node_id, policy) | graph, node_id, policy | workflow graph | Set per-node auto-compaction policy |
workflow_set_output_visibility(graph, node_id, visibility) | graph, node_id, visibility | workflow graph | Set per-node output-visibility filter ("public"/"public_only"/nil) |
workflow_policy_report(graph, ceiling?) | graph, ceiling: dict (optional) | dict | Inspect workflow/node policies against an explicit or builtin ceiling |
workflow_diff(left, right) | left, right | dict | Compare two workflow graphs |
workflow_commit(graph, reason?) | graph, reason | workflow graph | Validate and append a commit audit entry |
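A sketch of the build/validate/commit flow. The node and edge shapes below are assumptions for illustration only; the real field names are defined by the workflow IR (see Workflow runtime).

```harn
// Hypothetical minimal graph config; the nodes/edges field names are assumed.
let graph = workflow_graph({
    nodes: [
        {id: "plan", kind: "stage"},
        {id: "apply", kind: "stage"},
    ],
    edges: [{from: "plan", to: "apply"}],
})
let report = workflow_validate(graph)
println(report)

// Mutations return a new graph and append audit entries.
let committed = workflow_commit(workflow_clone(graph), "initial graph")
```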
Workflow execution and run records
| Function | Parameters | Returns | Description |
|---|---|---|---|
workflow_execute(task, graph, artifacts?, options?) | task, graph, artifacts, options | dict | Execute a workflow and persist a run record |
run_record(payload) | payload: dict | run record | Normalize a run record |
run_record_save(run, path?) | run, path | dict | Persist a run record |
run_record_load(path) | path: string | run record | Load a run record from disk |
load_run_tree(path) | path: string | dict | Load a persisted run with delegated child-run lineage |
run_record_fixture(run) | run | replay fixture | Derive a replay/eval fixture from a saved run |
run_record_eval(run, fixture?) | run, fixture | dict | Evaluate a run against an embedded or explicit fixture |
run_record_eval_suite(cases) | cases: list | dict | Evaluate a list of {run, fixture?, path?} cases as a regression suite |
run_record_diff(left, right) | left, right | dict | Compare two run records and summarize stage/status deltas |
eval_suite_manifest(payload) | payload: dict | dict | Normalize a grouped eval suite manifest |
eval_suite_run(manifest) | manifest: dict | dict | Evaluate a manifest of saved runs, fixtures, and optional baselines |
eval_metric(name, value, metadata?) | name: string, value: any, metadata: dict | nil | Record a named metric into the eval metric store |
eval_metrics() | — | list | Return all recorded eval metrics as {name, value, metadata?} dicts |
workflow_execute options currently include:
- max_steps
- persist_path
- resume_path
- resume_run
- replay_path
- replay_run
- replay_mode ("deterministic" currently replays saved stage fixtures)
- parent_run_id
- root_run_id
- execution ({cwd?, env?, worktree?} for isolated delegated execution)
- audit (seed mutation-session metadata for trust/audit grouping)
- mutation_scope
- approval_policy (declarative tool approval policy; see below)
verify nodes may also define execution checks inside node.verify, including:
- command to execute via the host shell in the current execution context
- assert_text to require visible output to contain a substring
- expect_status to require a specific exit status
Tool lifecycle hooks
| Function | Parameters | Returns | Description |
|---|---|---|---|
register_tool_hook(config) | config: dict | nil | Register a pre/post hook for tool calls matching pattern (glob). deny string blocks matching tools; max_output int truncates results |
clear_tool_hooks() | none | nil | Remove all registered tool hooks |
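Example:

```harn
// Block shell-like tools outright and cap every tool's output size.
register_tool_hook({pattern: "shell*", deny: "shell access is disabled"})
register_tool_hook({pattern: "*", max_output: 4000})

// ... run agent_loop(...) with the hooks active ...

clear_tool_hooks()
```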
Context and compaction utilities
| Function | Parameters | Returns | Description |
|---|---|---|---|
estimate_tokens(messages) | messages: list | int | Estimate token count for a message list (chars / 4 heuristic) |
microcompact(text, max_chars?) | text, max_chars (default 20000) | string | Snip oversized text, keeping head and tail with a marker |
select_artifacts_adaptive(artifacts, policy) | artifacts: list, policy: dict | list | Deduplicate, microcompact oversized artifacts, then select with token budget |
transcript_auto_compact(messages, options?) | messages: list, options: dict | list | Run the same transcript auto-compaction pipeline used by agent_loop |
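Example (the message shape is illustrative):

```harn
let messages = [
    {role: "user", content: "summarize the design doc"},
    {role: "assistant", content: "It describes the workflow runtime."},
]
println(estimate_tokens(messages))   // rough chars / 4 estimate

// Snip oversized text, keeping head and tail around a marker;
// text under the limit comes back unchanged.
let snipped = microcompact("short text", 2000)
```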
Delegated workers
| Function | Parameters | Returns | Description |
|---|---|---|---|
spawn_agent(config) | config: dict | dict | Start a worker from a workflow graph or delegated stage config |
sub_agent_run(task, options?) | task: string, options: dict | dict | Run an isolated child agent loop and return a clean envelope {summary, artifacts, evidence_added, tokens_used, budget_exceeded, ...} without leaking the child transcript into the parent |
send_input(handle, task) | handle, task | dict | Re-run a completed worker with a new task, carrying forward worker state where applicable |
resume_agent(id_or_snapshot_path) | id or path | dict | Restore a persisted worker snapshot into the current runtime |
wait_agent(handle_or_list) | handle or list | dict or list | Wait for one worker or a list of workers to finish |
close_agent(handle) | handle | dict | Cancel a worker and mark it terminal |
list_agents() | none | list | List worker summaries tracked by the current runtime |
spawn_agent(...) accepts either:
- `{task, graph, artifacts?, options?, name?, wait?}` for typed workflow runs
- `{task, node, artifacts?, transcript?, name?, wait?}` for delegated stage runs
- Either shape may also include `policy: <capability_policy>` to narrow the worker’s inherited execution ceiling.
- Either shape may also include `tools: ["name", ...]` as shorthand for a worker policy that only allows those tool names.
- Either shape may also include `execution: {cwd?, env?, worktree?}` where `worktree` accepts `{repo, path?, branch?, base_ref?, cleanup?}`.
- Either shape may also include `audit: {session_id?, parent_session_id?, mutation_scope?, approval_policy?}`.
Worker configs may also include carry to control continuation behavior:
- `carry: {artifacts: "inherit" | "none" | <context_policy>}`
- `carry: {resume_workflow?: bool, persist_state?: bool}`
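Putting the pieces together, a hedged sketch of spawning and awaiting a delegated worker (`review_node` stands in for a real delegated stage config built elsewhere):

```harn
let handle = spawn_agent({
    task: "Review the proposed diff",
    node: review_node,               // assumed stage config
    tools: ["read_file"],            // shorthand worker policy
    carry: {artifacts: "inherit"},
})

// Block until the worker finishes, then inspect the handle dict.
let done = wait_agent(handle)
println(done.status)
```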
To give a spawned worker prior conversation context, open a session
before spawning and set model_policy.session_id on the worker’s node.
Use agent_session_fork(parent) if the worker should start from a
branch of an existing conversation; agent_session_reset(id) before
the call if you want a fresh run with the same id.
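For example, a sketch of branching an existing conversation into a worker; the node shape is abbreviated to the one relevant field:

```harn
// Fork the parent conversation so the worker starts from a branch.
let forked = agent_session_fork(parent_session_id)

let handle = spawn_agent({
    task: "Continue the investigation with prior context",
    node: {model_policy: {session_id: forked}},   // illustrative node shape
})
```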
Workers return handle dicts with an id, lifecycle timestamps, status,
mode, result/error fields, transcript presence, produced artifact count,
snapshot/child-run paths, immutable original request metadata, normalized
provenance, and audit mutation-session metadata when available.
The request object preserves canonical research_questions,
action_items, workflow_stages, and verification_steps arrays when the
caller supplied them.
When a worker-scoped policy denies a tool call, the agent receives a structured
tool result payload: {error: "permission_denied", tool: "...", reason: "..."}.
sub_agent_run(task, options?) is the lighter-weight context-firewall primitive.
It starts a child session, runs a full agent_loop, and returns only a single
typed envelope to the parent:
- `summary`, `artifacts`, `evidence_added`, `tokens_used`, `budget_exceeded`, `session_id`, and optional `data`
- `ok: false` plus `error: {category, message, tool?}` when the child fails or hits a capability denial
- `background: true` returns a normal worker handle whose `mode` is `sub_agent`
Options mirror agent_loop where relevant (provider, model, tools,
tool_format, max_iterations, token_budget, policy, approval_policy,
session_id, system) and also accept:
- `allowed_tools: ["name", ...]` to narrow the child tool registry and capability ceiling
- `response_format: "json"` to parse structured child JSON into `data` from the final successful transcript when possible
- `returns: {schema: ...}` to validate that structured child JSON against a schema
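A minimal sketch of the context-firewall pattern, assuming the listed tool exists in the registry:

```harn
let child = sub_agent_run("Audit error handling in src/", {
    allowed_tools: ["read_file"],
    max_iterations: 8,
    response_format: "json",
})

if child.budget_exceeded {
    println("child hit its token budget")
} else {
    println(child.summary)   // only the envelope reaches the parent
}
```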
Artifacts and context
| Function | Parameters | Returns | Description |
|---|---|---|---|
artifact(payload) | payload: dict | artifact | Normalize a typed artifact/resource |
artifact_derive(parent, kind, extra?) | parent, kind, extra | artifact | Derive a new artifact from a prior one |
artifact_select(artifacts, policy?) | artifacts, policy | list | Select artifacts under context policy and budget |
artifact_context(artifacts, policy?) | artifacts, policy | string | Render selected artifacts into context |
artifact_workspace_file(path, content, extra?) | path, content, extra | artifact | Build a normalized workspace-file artifact with path provenance |
artifact_workspace_snapshot(paths, summary?, extra?) | paths, summary, extra | artifact | Build a workspace snapshot artifact for host/editor context |
artifact_editor_selection(path, text, extra?) | path, text, extra | artifact | Build an editor-selection artifact from host UI state |
artifact_verification_result(title, text, extra?) | title, text, extra | artifact | Build a verification-result artifact |
artifact_test_result(title, text, extra?) | title, text, extra | artifact | Build a test-result artifact |
artifact_command_result(command, output, extra?) | command, output, extra | artifact | Build a command-result artifact with structured output |
artifact_diff(path, before, after, extra?) | path, before, after, extra | artifact | Build a unified diff artifact from before/after text |
artifact_git_diff(diff_text, extra?) | diff_text, extra | artifact | Build a git-diff artifact from host/tool output |
artifact_diff_review(target, summary?, extra?) | target, summary, extra | artifact | Build a diff-review artifact linked to a diff/patch target |
artifact_review_decision(target, decision, extra?) | target, decision, extra | artifact | Build an accept/reject review-decision artifact linked by lineage |
artifact_patch_proposal(target, patch, extra?) | target, patch, extra | artifact | Build a proposed patch artifact linked to an existing target |
artifact_verification_bundle(title, checks, extra?) | title, checks, extra | artifact | Bundle structured verification checks into one review artifact |
artifact_apply_intent(target, intent, extra?) | target, intent, extra | artifact | Record an apply or merge intent linked to a reviewed artifact |
Core artifact kinds commonly used by the runtime include resource,
workspace_file, workspace_snapshot, editor_selection, summary,
transcript_summary, diff, git_diff, patch, patch_set,
patch_proposal, diff_review, review_decision, verification_bundle,
apply_intent, test_result, verification_result, command_result,
and plan.
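For instance, a diff can be built, reviewed, and rendered into context in a few calls (the path and text values are illustrative):

```harn
let diff = artifact_diff("src/lib.rs", before_text, after_text)
let decision = artifact_review_decision(diff, "accept")

// Select under the default context policy, then render for a prompt.
let selected = artifact_select([diff, decision])
let context = artifact_context(selected)
println(context)
```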
Sessions
Sessions are the first-class resource for agent-loop conversations. They own a transcript history, closure subscribers, and a lifecycle. See the Sessions chapter for the full model.
| Function | Parameters | Returns | Description |
|---|---|---|---|
agent_session_open(id?) | id: string or nil | string | Idempotent open; nil mints a UUIDv7 |
agent_session_exists(id) | id | bool | Safe on unknown ids |
agent_session_length(id) | id | int | Message count; errors on unknown id |
agent_session_snapshot(id) | id | dict or nil | Read-only deep copy of the transcript |
agent_session_reset(id) | id | nil | Wipes history; preserves id and subscribers |
agent_session_fork(src, dst?) | src, dst | string | Copies transcript; subscribers are not copied |
agent_session_trim(id, keep_last) | id, keep_last: int | int | Retain last keep_last messages; returns kept count |
agent_session_compact(id, opts) | id, opts: dict | int | Runs the LLM/truncate/observation-mask compactor |
agent_session_inject(id, message) | id, message: dict | nil | Appends {role, content, …}; missing role errors |
agent_session_close(id) | id | nil | Evicts immediately regardless of LRU cap |
Pair with agent_loop(..., {session_id: id, ...}): prior messages load
as prefix and the final transcript is persisted back on exit.
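A minimal session round-trip, assuming `agent_loop` takes the task followed by an options dict as elsewhere in this guide:

```harn
let id = agent_session_open(nil)   // nil mints a UUIDv7

// Seed prior context, then run a loop against the session.
agent_session_inject(id, {role: "user", content: "We already fixed the parser."})
let result = agent_loop("Now update the tests.", {session_id: id})

println(agent_session_length(id))  // transcript persisted back on exit
agent_session_close(id)
```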
Transcript lifecycle
Lower-level transcript primitives. Most callers should prefer sessions; these remain useful for building synthetic transcripts, replay fixtures, and offline analysis.
| Function | Parameters | Returns | Description |
|---|---|---|---|
transcript(metadata?) | metadata: any | transcript | Create an empty transcript |
transcript_messages(transcript) | transcript | list | Return transcript messages |
transcript_assets(transcript) | transcript | list | Return transcript asset descriptors |
transcript_add_asset(transcript, asset) | transcript, asset | transcript | Register a durable asset reference on a transcript |
transcript_events(transcript) | transcript | list | Return canonical transcript events |
transcript_events_by_kind(transcript, kind) | transcript, kind | list | Filter transcript events by their kind field |
transcript_stats(transcript) | transcript | dict | Count messages, tool calls, and visible events on a transcript |
transcript_summary(transcript) | transcript | string or nil | Return transcript summary |
transcript_fork(transcript, options?) | transcript, options | transcript | Fork transcript state |
transcript_reset(options?) | options | transcript | Start a fresh active transcript with optional metadata |
transcript_archive(transcript) | transcript | transcript | Mark transcript archived and append an internal lifecycle event |
transcript_abandon(transcript) | transcript | transcript | Mark transcript abandoned and append an internal lifecycle event |
transcript_resume(transcript) | transcript | transcript | Mark transcript active again and append an internal lifecycle event |
transcript_compact(transcript, options?) | transcript, options | transcript | Compact a transcript with the runtime compaction engine |
transcript_summarize(transcript, options?) | transcript, options | transcript | Compact via LLM-generated summary |
transcript_auto_compact(messages, options?) | messages, options | list | Apply the agent-loop compaction pipeline to a message list |
transcript_render_visible(transcript) | transcript | string | Render only public/human-visible messages |
transcript_render_full(transcript) | transcript | string | Render the full execution history |
Transcript messages may now carry structured block content instead of plain
text. Use add_user(...), add_assistant(...), or add_message(...) with a
list of blocks such as {type: "text", text: "..."},
{type: "image", asset_id: "..."}, {type: "file", asset_id: "..."}, and
{type: "tool_call", ...}, with per-block
visibility: "public" | "internal" | "private". Durable media belongs in
transcript.assets, while message/event blocks should reference those assets
by id or path.
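A hedged sketch of a block-content message; the `add_user` signature and the asset descriptor fields are assumptions based on the description above:

```harn
let t = transcript({title: "vision demo"})

// Durable media lives in transcript.assets...
let t2 = transcript_add_asset(t, {id: "img-1", path: "shots/home.png"})

// ...and message blocks reference it by id.
add_user(t2, [
    {type: "text", text: "What does this screenshot show?", visibility: "public"},
    {type: "image", asset_id: "img-1", visibility: "public"},
])
```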
Project scanning
The std/project module now includes a deterministic L0/L1 project scanner for
lightweight “what kind of project is this?” evidence without any LLM calls.
Import it with:
import "std/project"
What it returns
project_scan(path, options?) resolves path to a directory and returns a
dictionary describing exactly that directory:
let ev = project_scan(".", {tiers: ["ambient", "config"]})
Typical fields:
- `path`: absolute path to the scanned directory
- `languages`: stable, confidence-filtered language IDs such as `["rust"]`
- `frameworks`: coarse framework IDs when an anchor is obvious
- `build_systems`: coarse build systems such as `["cargo"]` or `["npm"]`
- `vcs`: currently `"git"` when the directory is inside a Git checkout
- `anchors`: anchor files or directories found at the project root
- `lockfiles`: lockfiles found at the project root
- `confidence`: coarse per-language/per-framework scores
- `package_name`: root package/module name when it can be parsed deterministically
When tiers includes "config", the scan also fills in:
- `build_commands`: default or discovered build/test commands
- `declared_scripts`: parsed `package.json` scripts
- `makefile_targets`: parsed Makefile targets
- `dockerfile_commands`: parsed `RUN`, `CMD`, and `ENTRYPOINT` commands
- `readme_code_fences`: fenced-language labels found in the README
Tiers
- `ambient`: anchor files, lockfiles, coarse build system detection, VCS, and confidence scoring. No config parsing.
- `config`: deterministic config reads for files already found by `ambient`.
If tiers is omitted, project_scan(...) defaults to ["ambient"].
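For example, a config-tier scan surfaces discovered build commands alongside the ambient evidence:

```harn
import "std/project"

let ev = project_scan(".", {tiers: ["ambient", "config"]})
println(ev.languages)        // e.g. ["rust"]
println(ev.build_commands)   // filled in by the config tier
```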
Polyglot repos
Single-directory scans stay leaf-scoped on purpose. For polyglot repos and
monorepos, use project_scan_tree(...) and let callers decide how to combine
sub-project evidence:
let tree = project_scan_tree(".", {tiers: ["ambient"], depth: 3})
// {".": {...}, "frontend": {...}, "backend": {...}}
project_scan_tree(...):
- always includes `"."` for the requested base directory
- walks subdirectories deterministically
- honors `.gitignore` by default
- skips standard vendor/build directories such as `node_modules/` and `target/` by default
You can override those defaults with:
- `respect_gitignore: false`
- `include_vendor: true`
- `include_hidden: true`
Enrichment
project_enrich(path, options) layers an L2, caller-owned enrichment pass on
top of deterministic project_scan(...) evidence. The caller supplies the
prompt template and the output schema; Harn owns prompt rendering, bounded file
selection, schema-retry plumbing, and content-hash caching.
Typical use:
let base = project_scan(".", {tiers: ["ambient", "config"]})
let enriched = project_enrich(".", {
base_evidence: base,
prompt: "Project: {{package_name}}\n{{ for file in files }}FILE {{file.path}}\n{{file.content}}\n{{ end }}\nReturn JSON.",
schema: {
type: "object",
required: ["framework", "indent_style"],
properties: {
framework: {type: "string"},
indent_style: {type: "string"},
},
},
budget_tokens: 4000,
model: "auto",
cache_key: "coding-enrichment-v1",
})
Bindings available to the template:
- `path`: absolute project path
- `base_evidence` / `evidence`: the supplied or auto-scanned L0/L1 evidence
- every top-level key from `base_evidence`
- `files`: deterministic bounded file context as `{path, content, truncated}`
Behavior:
- the cache key includes `cache_key`, path, schema, rendered prompt, and the content hash of the selected files
- cached hits surface `_provenance.cached == true`
- when the rendered prompt would exceed `budget_tokens`, the call returns the base evidence with `budget_exceeded: true` instead of failing
- schema-retry exhaustion returns an envelope with `validation_error` and `base_evidence` instead of raising
By default, cache entries live under .harn/cache/enrichment/ inside the
project root. Override that with cache_dir when a caller wants a different
location.
Cached deep scans
project_deep_scan(path, options?) layers a cached per-directory tree on top
of the metadata store. It is intended for repeated L2/L3 repo analysis where
callers want stable hierarchical evidence instead of re-running enrichment on
every turn.
Typical shape:
let tree = project_deep_scan(".", {
namespace: "coding-enrichment-v1",
tiers: ["ambient", "config", "enriched"],
incremental: true,
max_staleness_seconds: 86400,
depth: nil,
enrichment: {
prompt: "Return valid JSON only.",
schema: {purpose: "string", conventions: ["string"]},
provider: "mock",
budget_tokens_per_dir: 1024,
},
})
Notes:
- `namespace` is caller-owned, so multiple agents can keep separate trees for the same repo without collisions.
- `incremental: true` reuses cached directories whose local directory `structure_hash` and `content_hash` still match.
- `depth: nil` means unbounded traversal.
- The filesystem backend persists namespace shards under `.harn/metadata/<namespace>/entries.json`.
- `project_deep_scan_status(namespace, path?)` returns the last recorded scan summary for that scope: `{total_dirs, enriched_dirs, stale_dirs, cache_hits, last_refresh, ...}`.
project_enrich(path, options?) is the single-directory building block used by
deep scan when the enriched tier is requested.
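To check how fresh a cached tree is before reusing it:

```harn
let status = project_deep_scan_status("coding-enrichment-v1", ".")
println("enriched ${status.enriched_dirs} of ${status.total_dirs} dirs")
println("stale: ${status.stale_dirs}, cache hits: ${status.cache_hits}")
```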
Catalog
project_catalog() returns the authoritative built-in catalog that drives
ambient detection. Each entry includes:
- `id`
- `languages`
- `frameworks`
- `build_systems`
- `anchors`
- `lockfiles`
- `source_globs`
- `default_build_cmd`
- `default_test_cmd`
The catalog lives in
crates/harn-vm/src/stdlib/project_catalog.rs. Adding a new language should be
a table entry plus a test, not a new custom code path.
Existing helper
project_root_package() now delegates to the scanner’s config tier after
checking metadata enrichment, so existing callers keep the same package-name
surface while the manifest parsing logic stays centralized.
Prompt templating
Harn ships a small template language for rendering .harn.prompt and .prompt
asset files. It is invoked by the render(path, bindings?) and
render_prompt(path, bindings?) builtins (and, equivalently, via the
template.render host capability). The engine is intentionally minimal — a
rendering layer for prompts, not a scripting language — but it covers the
ergonomics most prompt authors reach for: conditionals with else/elif,
loops, includes, filters, comments, and whitespace control.
This page is the reference. The one-page quickref has a condensed version for agents writing Harn.
At a glance
{{ name }} interpolation
{{ user.name }} / {{ items[0] }} nested path access
{{ name | upper | default: "anon" }} filter pipeline
{{ if expr }} ... {{ elif expr }} ... {{ else }} ... {{ end }}
{{ for item in xs }} ... {{ else }} ... {{ end }} else = empty-iterable fallback
{{ for key, value in dict }} ... {{ end }}
{{ include "partial.harn.prompt" }}
{{ include "partial.harn.prompt" with { x: name } }}
{{# stripped at parse time #}}
{{ raw }} ... literal {{braces}} ... {{ endraw }}
{{- name -}} whitespace-trim markers
Interpolation
{{ path }} evaluates an expression and writes its string form into the
output. Paths support nested field access and integer/string indexing:
{{ user.name }} — field
{{ user.tags[0] }} — list index
{{ user.tags[-1] }} — negative index (counts from end)
{{ config["api-key"] }} — string key with non-identifier characters
Missing values render as the empty string, except for legacy bare
identifiers (e.g. {{ name }} with no dots/brackets/filters). For
back-compat, those render their source verbatim on a miss (the pre-v2
behavior), so existing templates that relied on “missing → literal passthrough”
keep working.
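For example, given a hypothetical greeting.harn.prompt whose body is shown in the comment:

```harn
// greeting.harn.prompt contains:
//   Hello {{ user.name | upper }}!{{ missing.field }}
let out = render("greeting.harn.prompt", {user: {name: "ada"}})
println(out)   // "Hello ADA!" (the dotted miss renders as empty string)
```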
Conditionals
{{ if role == "admin" }}
welcome, admin
{{ elif role == "user" and active }}
welcome back!
{{ else }}
please sign in
{{ end }}
Only {{ if expr }} ... {{ end }} is required; elif and else branches are
optional and can be combined. The expression grammar is:
| Category | Syntax |
|---|---|
| Literals | "str", 'str', 123, 1.5, true, false, nil |
| Paths | ident, a.b.c, a[0], a["key"] |
| Unary | not x, !x |
| Equality | ==, != |
| Comparison | <, <=, >, >= (numbers and strings) |
| Boolean (short-circuit) | and / &&, or / \|\| |
| Grouping | (expr) |
| Filters | expr \| filter, expr \| filter: arg1, arg2 |
String escapes inside quoted literals: \n, \t, \r, \\, \", \'.
Truthiness
Used both by if and by the short-circuit and/or:
| Value kind | Truthy? |
|---|---|
nil | false |
false | false |
0, 0.0 | false |
| empty/whitespace-only string | false |
| empty list / set / dict | false |
| everything else | true |
Loops
{{ for x in xs }}
- {{ loop.index }}. {{ x }}
{{ else }}
(no items)
{{ end }}
{{ else }} inside a for block renders when the iterable is empty — a
cleaner alternative to wrapping the loop in an {{ if }}.
Loop variables
Inside the loop body, a synthetic loop dict is in scope:
| Field | Type | Description |
|---|---|---|
loop.index | int | 1-based index of the current item |
loop.index0 | int | 0-based index |
loop.first | bool | true on the first iteration |
loop.last | bool | true on the final iteration |
loop.length | int | total number of items |
Dict iteration
{{ for key, value in my_dict }}
{{ key }} = {{ value }}
{{ end }}
Dicts iterate in their canonical (BTreeMap) order.
Includes
Include another template file. Paths resolve relative to the including file’s directory:
{{ include "partials/header.harn.prompt" }}
The included template inherits the parent’s scope by default. Pass explicit
bindings with with { ... } — these are merged into the parent scope for the
inner render only:
{{ include "partials/item.prompt" with { item: x, style: "bold" } }}
Safety:
- Circular includes are detected (e.g. `a.prompt` includes `b.prompt` which includes `a.prompt`) and produce a `circular include detected` error with the full chain.
- Include depth is capped at 32 levels.
- A missing included file fails with `failed to read included template <path>`.
Comments
Before{{# this never renders #}}After
Comments are stripped entirely at parse time. Use them to document a template without leaking the note into the final prompt.
Raw blocks
When a prompt needs to emit literal {{ / }} (say, the prompt includes
another template language, JSON with braces, etc.):
{{ raw }}
{{ this is output verbatim }}
{{ endraw }}
Everything between {{ raw }} and {{ endraw }} is passed through as-is,
no directive interpretation.
Whitespace control
Directives support {{- ... -}} trim markers (Jinja-style). A leading -
strips the preceding whitespace and one newline; a trailing - strips the
following whitespace and one newline. This is the idiomatic way to keep
templates readable without emitting extra blank lines:
Items:
{{- for x in xs -}}
{{ x }},
{{- end -}}
DONE
renders Items: a, b, c,DONE — no leading or trailing newlines introduced
by the control directives.
Filters
Apply transformations to a value via a pipeline. Filters can be chained and some accept arguments after a colon:
{{ items | join: ", " }}
{{ name | upper }}
{{ user.bio | default: "(no bio)" | indent: 4 }}
Built-in filters
| Filter | Args | Description |
|---|---|---|
upper | — | Uppercase the string form |
lower | — | Lowercase |
trim | — | Strip leading/trailing whitespace |
capitalize | — | First char upper, rest lower |
title | — | Title Case (capitalize each word) |
length | — | Number of items (string chars, list/set/dict entries, range size) |
first | — | First element (or char) |
last | — | Last element (or char) |
reverse | — | Reversed list or string |
join | sep: string | Join list items with sep |
default | fallback: any | Use fallback when the value is falsey |
json | pretty?: bool | Serialize as JSON (pass true for pretty) |
indent | width: int, first?: bool | Indent every line by width spaces; pass true to indent the first line too |
lines | — | Split string on \n into a list |
escape_md | — | Escape Markdown special characters |
replace | from: str, to: str | Replace all occurrences |
Unknown filters raise a clear error at render time.
Errors
On any parse or render error, the engine raises a thrown value (via
VmError::Thrown) with a message of the form:
<template-path> at <line>:<col>: <what went wrong>
Typical cases:
- `unterminated directive` — a `{{` without a matching `}}`.
- `unterminated comment` — a `{{#` without a matching `#}}`.
- `unterminated {{ raw }} block` — missing `{{ endraw }}`.
- `unknown filter foo` — the named filter isn’t registered.
- `circular include detected: a.prompt → b.prompt → a.prompt`.
- `include path must be a string` — the `{{ include }}` target wasn’t a string.
Preflight checks
harn check parses every template referenced by a literal render(...) /
render_prompt(...) call and surfaces syntax errors before you run the
pipeline. Catches things like an unterminated {{ for }} block at static
time rather than at first render.
Back-compat
The engine is a strict superset of the pre-v2 syntax:
- `{{ name }}` — interpolation; a missing bare identifier passes through verbatim
- `{{ if key }} ... {{ end }}` — truthy test
All pre-v2 templates render identically. Migrating awkward workarounds to the new forms is optional but usually shorter — see the migration guide.
Configuring LLM Providers
Harn supports multiple LLM providers out of the box. This page explains how provider and API key resolution works, and how to configure each one.
Provider resolution order
When you call llm_call() or start an agent_loop(), Harn resolves the
provider in this order:
1. Explicit option — `llm_call({provider: "openai", ...})` in your script
2. Environment variable — `HARN_LLM_PROVIDER`
3. Inferred from model name — e.g. `gpt-4o` → OpenAI, `claude-3` → Anthropic
4. Default — `anthropic`
5. Fallback — if the Anthropic key is missing, Harn tries `ollama` then `local`
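To pin the provider regardless of environment variables or model inference, pass it explicitly:

```harn
let answer = llm_call({
    provider: "openai",
    model: "gpt-4o",
    prompt: "Summarize this file in one sentence.",
})
println(answer)
```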
API key resolution
Each provider defines an auth_style and one or more environment variables:
| Provider | Environment Variable(s) | Auth Style |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | header |
| OpenAI | OPENAI_API_KEY | bearer |
| OpenRouter | OPENROUTER_API_KEY | bearer |
| HuggingFace | HF_TOKEN, HUGGINGFACE_API_KEY | bearer |
| Ollama | (none) | none |
| Local | (none) | none |
Model selection
Set the model explicitly or via environment:
// In code
llm_call({model: "claude-sonnet-4-5-20241022", prompt: "..."})
// Or via environment
// export HARN_LLM_MODEL=gpt-4o
The HARN_LLM_MODEL environment variable sets the default model when none
is specified in the script.
Rate limiting
Harn supports per-provider rate limiting (requests per minute):
# Set via environment
export HARN_RATE_LIMIT_ANTHROPIC=60
export HARN_RATE_LIMIT_OPENAI=120
Or in code:
llm_rate_limit("anthropic", 60)
The rate limiter uses a token-bucket algorithm and will pause before sending requests that would exceed the configured RPM.
Local LLM support
For local models (Ollama, llama.cpp, vLLM, etc.):
export LOCAL_LLM_BASE_URL=http://localhost:11434
export LOCAL_LLM_MODEL=llama3
Harn will automatically fall back to a local provider if no cloud API key is configured. This makes it easy to develop and test without incurring API costs.
Troubleshooting
- “No API key found” — check that the correct environment variable is set for your provider. Run `echo $ANTHROPIC_API_KEY` to verify.
- Wrong provider selected — set `HARN_LLM_PROVIDER` explicitly to override automatic detection.
- Rate limit errors — use `HARN_RATE_LIMIT_<PROVIDER>` to throttle requests below your plan’s limit.
- Debug message shapes — set `HARN_DEBUG_MESSAGE_SHAPES=1` to log the structure of messages sent to the LLM provider.
Debugging Agent Runs
Harn provides several tools for inspecting, replaying, and evaluating agent runs. This page walks through the debugging workflow.
Source-level debugging
For step-through debugging, start the Debug Adapter Protocol server:
cargo run --bin harn-dap
In VS Code, the Harn extension contributes a harn debug configuration
automatically. The equivalent launch.json entry is:
{
"type": "harn",
"request": "launch",
"name": "Debug Current Harn File",
"program": "${file}",
"cwd": "${workspaceFolder}"
}
This supports line breakpoints, variable inspection, stack traces, and step
in / over / out against .harn files.
Host-call bridge (harnHostCall)
The debug adapter advertises supportsHarnHostCall: true in its
Capabilities response. When a script calls host_call(capability, operation, params) and the VM has no built-in handler for the op, the
adapter forwards it to the DAP client as a reverse request named
harnHostCall — mirroring the DAP runInTerminal pattern:
{"seq": 17, "type": "request", "command": "harnHostCall",
"arguments": {"capability": "workspace", "operation": "project_root",
"params": {}}}
The client replies with a normal DAP response:
{"seq": 18, "type": "response", "request_seq": 17, "command": "harnHostCall",
"success": true, "body": {"value": "/Users/x/proj"}}
On success: true, the adapter returns the body’s value field (or the
whole body when value is absent) to the script. On success: false,
the adapter throws VmError::Thrown(message) so scripts can try /
catch the failure like any other Harn exception. Clients that do not
implement harnHostCall still work — the script just sees the
standalone fallbacks (workspace.project_root, workspace.cwd, etc.).
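Because a failed host call surfaces as a thrown value, scripts can guard it with ordinary error handling (the catch binding form follows the error-handling chapter):

```harn
try {
    let root = host_call("workspace", "project_root", {})
    println("project root: ${root}")
} catch (err) {
    // Client rejected the capability; fall back gracefully.
    println("host call failed: ${err}")
}
```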
LLM telemetry output events
During run / step-through, the adapter forwards every llm_call the
VM makes as a DAP output event with category: "telemetry" and a
JSON body:
{"category": "telemetry",
"output": "{\"call_id\":\"…\",\"model\":\"…\",\"prompt_tokens\":…,\"completion_tokens\":…,\"cache_tokens\":…,\"total_ms\":…,\"iteration\":…}"}
IDEs can parse these to show a live LLM-call ledger alongside the debug session.
Run records
Every agent_loop() or workflow_execute() call can produce a run record —
a JSON file in .harn-runs/ that captures the full execution trace including
LLM calls, tool invocations, and intermediate results.
# List recent runs
ls .harn-runs/
# Inspect a run record
harn runs inspect .harn-runs/<run-id>.json
The inspect command shows a structured summary: stages executed, tools called, token usage, timing, and final output.
Comparing runs
Compare a run against a baseline to identify regressions:
harn runs inspect .harn-runs/new.json --baseline .harn-runs/old.json
This highlights differences in tool calls, outputs, and token consumption.
Replay
Replay re-executes a recorded run, using the saved LLM responses instead of making live API calls. This is useful for deterministic debugging:
harn replay .harn-runs/<run-id>.json
Replay shows each stage transition and lets you verify that your pipeline produces the same results given the same LLM responses.
Visualizing a pipeline
When you want a quick structural view instead of a live debug session, render a Mermaid graph from the AST:
harn viz main.harn
harn viz main.harn --output docs/main.mmd
The generated graph is useful for reviewing branch-heavy pipelines, match arms, parallel blocks, and nested retries before you start stepping through them.
Evaluation
The harn eval command scores a run or set of runs against expected outcomes:
# Evaluate a single run
harn eval .harn-runs/<run-id>.json
# Evaluate all runs in a directory
harn eval .harn-runs/
# Evaluate using a manifest
harn eval eval-suite.json
Custom metrics
Use eval_metric() in your pipeline to record domain-specific metrics:
eval_metric("accuracy", 0.95, {dataset: "test-v2"})
eval_metric("latency_ms", 1200)
These metrics appear in run records and are aggregated by harn eval.
Token usage tracking
Track LLM costs during a run:
let usage = llm_usage()
log("Tokens used: ${usage.input_tokens + usage.output_tokens}")
log("LLM calls: ${usage.total_calls}")
Portal
The Harn portal is an interactive web UI for inspecting runs:
harn portal
This opens a dashboard showing all runs in .harn-runs/, with drill-down
into individual stages, tool calls, and transcript snapshots.
Tips
- Add `eval_metric()` calls to your pipelines early — they’re cheap to record and invaluable for tracking quality over time.
- Use replay for debugging non-deterministic failures: record the failing run, then replay it locally to step through the logic.
- Compare baselines when refactoring prompts or changing tool definitions to catch regressions before they ship.
Editor integration
Harn provides first-class editor support through an LSP server, a DAP debugger, and a tree-sitter grammar. These cover most modern editors and IDE workflows.
VS Code
The editors/vscode/ directory contains a VS Code extension that bundles
syntax highlighting (via tree-sitter) and automatic LSP/DAP client
configuration.
Install from the extension directory:
cd editors/vscode && npm install && npm run build
Then use Extensions: Install from VSIX or symlink into
~/.vscode/extensions/.
Language server (LSP)
Start the LSP server with:
cargo run --bin harn-lsp
Or use the compiled binary directly (harn-lsp). The server communicates
over stdin/stdout using the Language Server Protocol.
Supported capabilities
| Feature | Description |
|---|---|
| Diagnostics | Real-time parse errors, type errors (including cross-module undefined-call errors), and warnings. Shares the same module graph used by harn check and harn run. |
| Completions | Scope-aware: pipelines, functions, variables, parameters, enums, structs, interfaces. Dot-completions for methods plus inferred shape fields, struct members, and enum payload fields. Builtins and keywords. |
| Go-to-definition | Jump to the declaration of pipelines, functions, variables, enums, structs, and interfaces. Cross-file navigation walks the recursive module graph (relative paths and .harn/packages/), so symbols reachable through any number of transitive imports resolve. |
| Find references | Locate all usages of a symbol across the document |
| Hover | Shows type information and documentation for builtins |
| Signature help | Parameter hints while typing function arguments |
| Document symbols | Outline view of pipelines, functions, structs, enums |
| Workspace symbols | Cross-file search for pipelines and functions |
| Semantic tokens | Fine-grained syntax highlighting for keywords, types, functions, parameters, enums, and more |
| Code actions | Quick fixes for lint warnings (var→let, boolean simplification, unused import removal, string interpolation) and type errors |
| Rename | Rename symbols across the document |
| Document formatting | Delegates to harn-fmt for format-on-save support |
Configuration
Most editors auto-detect the LSP binary. For manual configuration, point
your editor’s LSP client at the harn-lsp binary with no arguments. The
server uses TextDocumentSyncKind::FULL and debounces full-document reparses
so diagnostics stay responsive while you are typing.
Debug adapter (DAP)
Start the debugger with:
cargo run --bin harn-dap
The DAP server communicates over stdin/stdout using the Debug Adapter Protocol. It supports:
- Breakpoints (line-based)
- Step in / step over / step out
- Variable inspection in scopes
- Stack frame navigation
- Continue / pause execution
VS Code launch configuration
The VS Code extension now contributes a harn debugger type and an initial
Debug Current Harn File launch configuration. You can also add it manually:
{
"type": "harn",
"request": "launch",
"name": "Debug Harn",
"program": "${file}",
"cwd": "${workspaceFolder}"
}
Set harn.dapPath if harn-dap is not on your PATH.
Tree-sitter grammar
The tree-sitter-harn/ directory contains a tree-sitter grammar for Harn.
This powers syntax highlighting in editors that support tree-sitter
(Neovim, Helix, Zed, etc.).
Build the grammar:
cd tree-sitter-harn && npx tree-sitter generate
Highlight queries are in tree-sitter-harn/queries/highlights.scm.
Formatter
Format Harn files from the command line or integrate with editor format-on-save:
harn fmt file.harn # format in place
harn fmt --check file.harn # check without modifying
Linter
Run the linter for static analysis:
harn lint file.harn
harn lint --fix file.harn # automatically apply safe fixes
The linter checks for: shadow variables, unused variables, unused types,
undefined functions, unreachable code, missing harndoc comments, naming
convention drift, branch-heavy functions, and prompt-injection risks such as
interpolated llm_call system prompts. With --fix, the linter automatically
rewrites fixable issues (e.g., var → let, boolean comparison
simplification, unused import removal).
Testing
Harn provides several layers of testing support: a conformance test runner, a standard library testing module, and host-mock helpers for isolating agent behavior from real host capabilities.
Conformance tests
Conformance tests are the primary executable specification for the Harn
language and runtime. They live under conformance/tests/ as paired files:
- test_name.harn — Harn source code
- test_name.expected — exact expected stdout output
Tests are grouped by area into subdirectories. ls conformance/tests/ gives
the current top-level map (examples: language/, control_flow/, types/,
collections/, concurrency/, stdlib/, templates/, modules/,
agents/, integration/, runtime/). The runner discovers .harn files
recursively, so new tests just need to be dropped into the appropriate
subdirectory.
Shared helpers live alongside the tests that use them:
conformance/tests/modules/lib/ holds import targets for the modules/
tests, and conformance/tests/templates/fixtures/ holds prompt-template
fixtures for the templates/ tests.
Error tests (Harn programs that are expected to fail) live under
conformance/errors/, similarly subdivided into syntax/, types/,
semantic/, and runtime/.
Running tests
# Run the full conformance suite
harn test conformance
# Filter by name (substring match)
harn test conformance --filter workflow_runtime
# Filter by tag (if test uses tags)
harn test conformance --tag agent
# Verbose output
harn test conformance --filter my_test -v
# Timing summary without verbose failure details
harn test conformance --timing --filter my_test
Writing a conformance test
Create a .harn file with a pipeline default(task) entry point and use
log() or println() to produce output:
// conformance/tests/<group>/my_feature.harn (e.g. stdlib/, types/)
pipeline default(task) {
let result = my_feature(42)
log(result)
}
Then create a .expected file with the exact output:
[harn] 84
The std/testing module
Import std/testing in your Harn tests for higher-level test helpers:
import { mock_host_result, assert_host_called, clear_host_mocks } from "std/testing"
Host mock helpers
| Function | Description |
|---|---|
| clear_host_mocks() | Remove all registered host mocks |
| mock_host_result(cap, op, result, params?) | Mock a host capability to return a value |
| mock_host_error(cap, op, message, params?) | Mock a host capability to return an error |
| mock_host_response(cap, op, config) | Mock with full response configuration |
Host call assertions
| Function | Description |
|---|---|
| host_calls() | Return all recorded host calls |
| host_calls_for(cap, op) | Return calls for a specific capability/operation |
| assert_host_called(cap, op, params?) | Assert a host call was made |
| assert_host_call_count(cap, op, expected_count) | Assert exact call count |
| assert_no_host_calls() | Assert no host calls were made |
Example
import { mock_host_result, assert_host_called, clear_host_mocks } from "std/testing"
pipeline default(task) {
clear_host_mocks()
// Mock the workspace.read_text capability
mock_host_result("workspace", "read_text", "file contents")
// Code under test calls host_call("workspace.read_text", ...)
let content = host_call("workspace.read_text", {path: "test.txt"})
log(content)
// Verify the call was made
assert_host_called("workspace", "read_text")
}
LLM mocking
For testing agent loops without real LLM calls, use llm_mock():
llm_mock({text: "The answer is 42"})
let result = llm_call([
{role: "user", content: "What is the answer?"}
])
log(result)
This queues a canned response that the next LLM call consumes.
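Because each queued mock is consumed by the next call, several llm_mock calls can script a short multi-turn test. A minimal sketch, assuming only the FIFO behavior described above (the prompts and canned texts are illustrative):

```harn
// Queue two canned responses; each llm_call consumes the next one in order.
llm_mock({text: "Step 1: read the file"})
llm_mock({text: "Step 2: summarize it"})

let first = llm_call([{role: "user", content: "What should I do first?"}])
let second = llm_call([{role: "user", content: "And then?"}])

log(first)
log(second)
```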
For end-to-end CLI runs, harn run and harn playground can preload the same mock
infrastructure from a JSONL fixture file:
{"text":"PLAN: find the middleware module first","model":"fixture-model"}
{"match":"*hello*","text":"matched","model":"fixture-model"}
{"match":"*","error":{"category":"rate_limit","message":"fake rate limit"}}
harn run script.harn --llm-mock fixtures.jsonl
harn playground --script pipeline.harn --llm-mock fixtures.jsonl
- A line without match is FIFO and is consumed on use.
- A line with match is checked in file order as a glob against the request transcript text.
- Add "consume_match": true when repeated matching prompts should advance through a scripted sequence instead of reusing the same line forever.
- When no fixture matches, harn run --llm-mock ... and harn playground --llm-mock ... fail with the first prompt snippet so you can add the missing case directly.
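For example, a fixture where two successive prompts hit the same glob but should get different answers, with a catch-all at the end (the globs and texts here are illustrative):

```jsonl
{"match":"*review*","text":"first pass: looks fine","consume_match":true}
{"match":"*review*","text":"second pass: found an off-by-one","consume_match":true}
{"match":"*","text":"fallback answer"}
```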
To capture a replayable fixture from a run, record once and then replay the saved JSONL:
harn run script.harn --llm-mock-record fixtures.jsonl
harn run script.harn --llm-mock fixtures.jsonl
harn playground --script pipeline.harn --llm-mock-record fixtures.jsonl
harn playground --script pipeline.harn --llm-mock fixtures.jsonl
Built-in assertions
Harn provides assert, assert_eq, and assert_ne builtins for test pipelines:
assert(x > 0, "x must be positive")
assert_eq(actual, expected)
assert_ne(actual, unexpected)
assert_eq(len(items), 3)
Failed assertions throw an error with a descriptive message including the expected and actual values.
Use require for runtime invariants in normal pipelines. The linter warns if
you use assert* outside test pipelines, and it suggests assert* instead of
require inside test pipelines.
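A sketch of the split, assuming require takes the same (condition, message) shape as assert (its exact signature is not shown here):

```harn
// Normal pipeline: runtime invariants use require.
pipeline default(task) {
    require(len(task) > 0, "task must not be empty")
    log(task)
}

// Test pipeline: expectations use the assert* builtins.
pipeline test_len(task) {
    let items = [1, 2, 3]
    assert_eq(len(items), 3)
    assert_ne(len(items), 0)
}
```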
Migrating from 0.6.x to 0.7.0
Harn 0.7.0 replaces the implicit transcript_policy dict with
first-class sessions. Session lifecycle is driven by
imperative builtins, and unknown inputs hard-error instead of silently
no-op’ing.
This guide lists every removed surface with a side-by-side rewrite.
transcript_policy on workflow nodes
The per-node policy dict is gone. Its fields moved to two dedicated setters plus lifecycle verbs.
Before (0.6)
workflow_set_transcript_policy(graph, "summarize", {
mode: "reset",
visibility: "public",
auto_compact: true,
compact_threshold: 8000,
compact_strategy: "truncate",
keep_last: 6,
})
After (0.7)
// Shape the node's compaction behavior:
workflow_set_auto_compact(graph, "summarize", {
auto_compact: true,
compact_threshold: 8000,
compact_strategy: "truncate",
keep_last: 6,
})
workflow_set_output_visibility(graph, "summarize", "public")
// To reset the stage's conversation explicitly before execution,
// open a caller-controlled session and wire it into the node's
// model_policy:
let sid = agent_session_open("summarize-v2")
workflow_set_model_policy(graph, "summarize", {session_id: sid})
agent_session_reset(sid)
mode: "fork" maps to agent_session_fork(src, dst?) called before
workflow_execute, wiring the fork id into the node’s
model_policy.session_id. mode: "continue" is the new default — two
stages sharing a session_id share a conversation automatically.
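Putting the fork mapping together, a migrated mode: "fork" node looks roughly like this sketch (graph construction omitted; the session and node names are illustrative):

```harn
// 0.6: workflow_set_transcript_policy(graph, "summarize", {mode: "fork", ...})
// 0.7: fork the source session before execution, wire the fork in.
let src = agent_session_open("main-thread")
let dst = agent_session_fork(src)
workflow_set_model_policy(graph, "summarize", {session_id: dst})
workflow_execute(graph)
```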
transcript_id / transcript_metadata on llm_call
Both keys were removed. Session id subsumes them.
Before
let result = llm_call("hi", {
transcript_id: "chat-42",
transcript_metadata: {user: "ada"},
})
After
// `session_id` is honored by `agent_loop`; `llm_call` is single-shot.
// For conversational continuity, move to agent_loop:
let sid = agent_session_open("chat-42")
let result = agent_loop("hi", nil, {session_id: sid})
If you relied on the transcript_metadata bag, attach it to the
session via your own store or pass per-call context through the
metadata field of injected messages. transcript_summary (per-call
summary injection for mid-loop compaction output) is unchanged.
transcript option on llm_call / agent_loop
Passing a raw transcript dict through the transcript option is now a
hard error.
Before
let t = transcript()
let result = agent_loop("task", nil, {transcript: t, provider: "mock"})
After
let sid = agent_session_open()
let result = agent_loop("task", nil, {session_id: sid, provider: "mock"})
// `agent_session_snapshot(sid)` if you want the transcript back as a dict.
The loop loads prior messages from the session store as a prefix before running and persists the final transcript back on exit.
Lifecycle via dict (mode: "reset" | "fork")
Previously some call sites accepted a lifecycle dict. That pattern is gone — call the verbs explicitly:
- mode: "reset" → agent_session_reset(id)
- mode: "fork" → let dst = agent_session_fork(src) (optionally with a caller-provided dst id)
- mode: "continue" → no-op; just reuse the same session_id
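As Harn code, the three rewrites look like the following sketch (session names illustrative):

```harn
let sid = agent_session_open("review")

// mode: "reset": wipe the conversation before reuse.
agent_session_reset(sid)

// mode: "fork": branch the conversation into a new session.
let branch = agent_session_fork(sid)

// mode: "continue": nothing to call; reuse the same id.
let result = agent_loop("next step", nil, {session_id: sid})
```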
Subscribers
CLOSURE_SUBSCRIBERS (thread-local in agent_events.rs) was removed.
Subscribers now live on SessionState.subscribers.
- agent_subscribe(id, cb) opens the session lazily and appends.
- agent_session_fork does not copy subscribers — a fork is a conversation branch, not an event fanout.
- clear_session_sinks only clears external ACP-style sinks now; it no longer evicts sessions.
Unknown-key / unknown-id behavior
A class of silent pass-throughs is now an error:
- Unknown agent_session_compact option keys.
- Missing role on agent_session_inject.
- Negative keep_last.
- reset/fork/close/trim/inject/length/compact called against an unknown session id.
exists, open, and snapshot remain tolerant of unknown ids by
design.
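A sketch of the contrast, using the session verbs shown above (the try/catch binding syntax follows the error-handling chapter; treat this as illustrative):

```harn
// Tolerant by design: snapshotting an unknown id does not throw.
let snap = agent_session_snapshot("no-such-session")

// Hard error in 0.7: lifecycle verbs against an unknown id throw.
try {
    agent_session_reset("no-such-session")
} catch e {
    log("reset failed as expected: ${e}")
}
```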
agent_loop terminal status
max_iterations reached without a natural break now reports
status = "budget_exhausted" (previously "done"). If your host keys
off "done" to detect “agent is finished,” add "budget_exhausted" to
the accept list — the loop ran out of rope, not out of work. Daemon
loops in the same condition no longer silently relabel to "idle".
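If your host inspects the loop result, the accept-list change is mechanical. A sketch, assuming the loop result exposes the status described above as result.status:

```harn
let sid = agent_session_open("worker")
let result = agent_loop("do the task", nil, {session_id: sid, max_iterations: 10})

if result.status == "done" {
    log("agent finished naturally")
} else if result.status == "budget_exhausted" {
    // New in 0.7: max_iterations hit with work remaining.
    // The session persists, so resuming with the same session_id continues it.
    log("out of iterations; consider resuming")
}
```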
See the Sessions chapter for the full model and the 0.7.0 entry in the changelog for the complete breaking-change list.
Prompt templates: v2 migration
The prompt-template engine used by render(...) / render_prompt(...) now
supports else/elif, loops, includes, filters, comments, raw blocks, and
whitespace trim markers. Existing templates keep rendering unchanged — this
is a strict superset. But many pre-v2 workarounds can now be simplified.
If / else
Before — mutually-exclusive {{ if }} blocks with inverted flags:
{{if expected_output}}
Expected: {{expected_output}}
{{end}}{{if no_expected_output}}
(no expected output provided)
{{end}}
After:
{{if expected_output}}
Expected: {{expected_output}}
{{else}}
(no expected output provided)
{{end}}
Loops instead of hand-rolled list concatenation
Before — build a string in .harn and inject it as a single variable:
let block = ""
for sample in samples {
block = "${block}### ${sample.path}\n\`\`\`\n${sample.content}\n\`\`\`\n\n"
}
let prompt = render("enrichment.prompt", {block: block, ...})
# enrichment.prompt
## Samples
{{block}}
After — iterate in the template:
let prompt = render("enrichment.prompt", {samples: samples, ...})
# enrichment.prompt
## Samples
{{for s in samples}}
### {{s.path}}
```
{{s.content}}
```
{{end}}
Shared prose → {{ include }}
When multiple repair-stage prompts share the same boilerplate (“self-verification instructions”, system rules, etc.), extract the shared text into a partial:
# lib/partials/self-verify.harn.prompt
Before responding, verify your answer against: {{verification_hint}}
Call it from each repair stage:
{{include "partials/self-verify.harn.prompt"}}
...stage-specific instructions...
Pass stage-specific overrides with with:
{{include "partials/self-verify.harn.prompt" with { verification_hint: "compile output" }}}
Filters instead of pre-processing
Before — uppercase, join lists, JSON-stringify in .harn before rendering:
let tags_str = join(map(tags, fn(t) { return uppercase(t) }), ", ")
render("x.prompt", {tags: tags_str})
After:
Tags: {{tags | join: ", " | upper}}
Comments and raw blocks
Add {{# authoring notes #}} to document a template without leaking the note
into the final prompt. Wrap literal {{ / }} (e.g. examples of another
template language embedded in a prompt) in a {{ raw }} ... {{ endraw }}
block.
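A small template showing both, using the {{# #}} and raw-block syntax described above (the embedded mustache-style snippet is arbitrary):

```
{{# Reviewer prompt, v3. Keep the rubric in sync with the grader template. #}}
Grade the answer below.

{{ raw }}
In Handlebars you would write {{user.name}} here; Harn leaves this verbatim.
{{ endraw }}
```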
Whitespace trim
{{- ... -}} markers strip whitespace and one newline on the respective
side. Use them to keep source templates readable without introducing blank
lines in the rendered output:
Items:
{{- for x in xs -}}
{{ x }},
{{- end -}}
DONE
See Prompt templating for the full reference.
Migration — schema-as-type (type aliases drive output_schema)
Prior to this change, Harn had two parallel representations for structured LLM output:
- Harn-native types — type Foo = {verdict: string, ...}.
- Raw JSON-Schema dicts — passed as output_schema: {type: "dict", properties: {...}, required: [...]} to llm_call, and consumed by schema_is, schema_expect, schema_parse, and friends.
The two representations drifted. A grader script that declared a type alias for documentation and a separate schema dict for validation had no compile-time check that the two agreed.
This release unifies them. A single type alias now feeds:
- Static type-checking on the values that flow through it.
- JSON-Schema emission for llm_call structured output.
- schema_is / schema_expect narrowing on runtime-typed values (unknown, unions, parsed JSON).
- ACP ToolAnnotations.args compatibility (same emitted schema).
Migrating a grader script
Before — duplicated surface, no cross-check:
let grader_schema = {
type: "object",
required: ["verdict", "summary"],
properties: {
verdict: {type: "string", enum: ["pass", "fail", "unclear"]},
summary: {type: "string"},
},
}
let r = llm_call(prompt, nil, {
model: routing.model,
output_schema: grader_schema,
schema_retries: 2,
})
// No compile-time check that r.data matches this shape.
log("verdict=${r.data.verdict}")
After — one alias, two uses:
type GraderOut = {
verdict: "pass" | "fail" | "unclear",
summary: string,
}
let r = llm_call(prompt, nil, {
model: routing.model,
output_schema: GraderOut, // compiled to the JSON-Schema dict
schema_retries: 2,
})
if schema_is(r.data, GraderOut) {
// r.data is narrowed to GraderOut here.
log("verdict=${r.data.verdict}")
}
What translates mechanically
| Old schema key | New type grammar |
|---|---|
| {type: "string"} | string |
| {type: "int"} / "integer" | int |
| {type: "bool"} / "boolean" | bool |
| {type: "list", items: T} | list<T> |
| {type: "dict", additional_properties: V} | dict<string, V> |
| {type: "string", enum: ["a","b"]} | "a" \| "b" |
| {type: "int", enum: [0,1,2]} | 0 \| 1 \| 2 |
| {properties, required} with additional_properties: false | type T = {field: type, optional?: type} |
| {union: [A, B]} / {oneOf: [A, B]} | A \| B |
| {nullable: true} wrapping T | T \| nil |
Staying with raw schema dicts
Nothing forces you to migrate. output_schema: dict_literal still
works and is still the right tool when you need schema features Harn’s
type grammar does not yet express (regex pattern, min_length,
numeric min/max, const, nested $ref, etc.). You can mix:
type Name = {first: string, last: string}
let r = llm_call(prompt, nil, {
output_schema: {
type: "dict",
properties: {
name: schema_of(Name), // alias → schema dict
email: {type: "string", pattern: "^[^@]+@[^@]+$"},
},
required: ["name", "email"],
},
})
Caveats
- schema_of(T) lowers at compile time. T must be a top-level type alias visible to the compiler. Dynamic construction (let T = ...) falls back to the runtime schema_of builtin, which is a dict-passthrough — it does not look up alias names at runtime.
- The compiler-level alias emitter handles shapes, lists, dict<string, V>, literal-string/int unions, and nested aliases. Shapes containing Applied<T> (generic containers) or fn types emit a best-effort {type: "closure"} placeholder; prefer raw schema dicts there.
- response.data of llm_call(..., {output_schema: T}) is not yet automatically narrowed to T by the type checker. Use if schema_is(r.data, T) { ... } in the interim — the narrowing there is exact.