MCP surface
Start, advance, and inspect chit runs from inside a chat, with one run id and a heartbeat on each long step
The MCP surface is the primary way to operate chit. It runs the same runtime as the CLI, exposed as tools a model can call from inside Claude Code, so you and the model drive and inspect a run from the chat instead of a second terminal.
The model is small and worth holding in your head:
- A chit is a declared routine (a manifest). A run executes one chit.
- A manifest's policy decides what a run is: one-shot (a single pass over the step DAG) or loop (a write-capable implementer and a read-only reviewer iterate until the reviewer says proceed, blocks, or the budget runs out).
- A mode decides where the run lives: foreground (this chat supervises it, one unit per call) or background (a detached worker drives it to completion and it survives a reconnect).
- A batch coordinates several background runs, one per git worktree.
- Audit reads the receipts a run leaves behind.
You hold one id: the run_id that chit_start returns. Everything else (chit_next, chit_status, chit_trace, chit_cancel) takes that id.
Install the CLI, then register the MCP server with chit mcp:
bun install -g @chit-run/cli@latest
claude mcp add chit --scope local -- chit mcp
chit mcp runs the stdio MCP server from the installed chit binary (the same binary as chit run, chit audit). Upgrade with the same command, bun install -g @chit-run/cli@latest (during 0.x, bun update -g will not cross a minor). From a source checkout the equivalent is claude mcp add chit --scope local -- bun <repo>/apps/cli/src/cli/run.ts mcp.
Source: apps/cli/src/surfaces/mcp/ (server.ts registers the tools; engine.ts is the one-shot run engine; converge-engine.ts is the single-iteration loop driver; controller.ts resolves a run_id to its run).
Execution model
A run is a stepwise projection of a manifest DAG. chit, not the model, decides what is legal to run next.
- chit owns the order. A step is ready only when every step in its manifest.dependencies is done. chit_next runs only ready steps. The model drives, but it cannot invent routing. There are no dynamic, model-decided handoffs; the static-DAG thesis holds.
- chit_next advances one unit and returns. It never drains by surprise. For a one-shot run, one unit is the currently-ready wave (every step whose dependencies are met), or a single step_id if you name one. For a loop run, one unit is one implement-to-review iteration. To drive a foreground run to completion, call chit_next until it reports complete; to run unattended, start it with mode: "background".
- A step runs exactly once. A step marks itself running before its first await, so a concurrent advance on the same step is rejected. On settle the record is terminal: done | failed | cancelled.
- Completion is all-steps-done, not just the output step. An independent pending or failed branch keeps the run incomplete.
- chit_next blocks until its unit settles. The heartbeat renders on that in-flight tool call, so a model turn is pinned for the unit's whole duration.
- Sessions. per_scope participants, and every loop run, need a scope. chit_start refuses a per_scope or loop run started without one rather than silently running stateless. Within a run, a per_scope participant resumes its own session across steps.
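The readiness and completion rules above can be sketched as two small pure functions. This is a minimal illustration, not chit's engine: the status names come from this section, while the Step shape and the function names readyWave and isComplete are hypothetical.

```typescript
// Hypothetical shapes for illustration; chit's real types may differ.
type StepStatus = "pending" | "running" | "done" | "failed" | "cancelled";

interface Step {
  id: string;
  dependencies: string[]; // ids of steps that must be done first
  status: StepStatus;
}

// A step is ready when it has not started and every dependency is done.
function readyWave(steps: Step[]): string[] {
  const statusById = new Map(
    steps.map((s): [string, StepStatus] => [s.id, s.status]),
  );
  return steps
    .filter((s) => s.status === "pending")
    .filter((s) => s.dependencies.every((d) => statusById.get(d) === "done"))
    .map((s) => s.id);
}

// Completion is all-steps-done, so a pending or failed branch keeps the run incomplete.
function isComplete(steps: Step[]): boolean {
  return steps.every((s) => s.status === "done");
}

const steps: Step[] = [
  { id: "plan", dependencies: [], status: "done" },
  { id: "build", dependencies: ["plan"], status: "pending" },
  { id: "lint", dependencies: ["plan"], status: "pending" },
  { id: "report", dependencies: ["build", "lint"], status: "pending" },
];

console.log(readyWave(steps)); // build and lint are ready; report waits on both
console.log(isComplete(steps));
```

In this example one chit_next call on a one-shot run would correspond to running the whole returned wave, and a later call would pick up report once both of its dependencies are done.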
What the ids mean
One public id, and it never shifts meaning.
- run_id is the only handle you pass. chit_start returns it; chit_next / chit_status / chit_trace / chit_cancel take it. For a foreground run it lives in this session's memory; for a background run it is the durable record's id, so it resolves after a reconnect.
- audit refs are receipt ids. A run records one per audited step or iteration; they appear in chit_trace and chit_status output and you open them with chit_audit_show. A run is never its own audit ref.
- loop ids and job ids are internal. They key the loop log and the durable job record on disk. chit never asks you for one and never returns one as a handle. If you see an id in a control or status result, it is the run_id.
Foreground vs background
mode is a property of a run, not a different kind of run.
- Foreground (default) means this chat supervises the run. You advance it one unit at a time with chit_next and watch each unit settle. A foreground run lives in memory: a new session starts with none, and an idle one is evicted after an hour. Use it when you want to review every unit.
- Background means a detached worker drives the run to completion on its own while you keep working. Its state is durable (the job record, loop log, and audit transcripts are on disk), so you inspect or stop it from any later turn, and it survives an MCP reconnect. Use it when the task is scoped enough to run unattended. Poll with chit_status, stop with chit_cancel.
One-shot vs loop
policy is a property of the manifest, declared once, that decides what a run does.
- One-shot (the default when a manifest declares no policy) is a single pass over the DAG. Each ready wave runs once; the run completes when every step is done. Pass inputs, not a task.
- Loop is the converge pattern: an implementer step and a reviewer step run in turn, and chit reads the reviewer's verdict (proceed | revise | block) each round. proceed converges, block stops, an exhausted max_iterations budget stops as max-iterations. A loop run takes a task and a scope, not inputs. The bundled default loop is a write-capable Claude implementer and a read-only Codex reviewer; chit_start with a bare task uses it.
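The per-round decision in a loop run can be sketched as a single function. This is an illustration under stated assumptions, not chit's converge engine: the verdict values and the max-iterations stop come from the text above, "continue" is a hypothetical label for "run another iteration", and checking the verdict before the budget is an assumption about ordering.

```typescript
// Verdicts the reviewer can return, per the text above.
type Verdict = "proceed" | "revise" | "block";
// "continue" is a hypothetical label meaning "run another iteration".
type LoopDecision = "converged" | "blocked" | "max-iterations" | "continue";

// iteration is 1-based: the round that just finished.
// Assumption: the verdict is checked before the budget, so a proceed
// on the final round still converges.
function decide(
  verdict: Verdict,
  iteration: number,
  maxIterations = 3,
): LoopDecision {
  if (verdict === "proceed") return "converged";
  if (verdict === "block") return "blocked";
  // verdict is "revise": go again only while budget remains.
  return iteration >= maxIterations ? "max-iterations" : "continue";
}

console.log(decide("revise", 1));
console.log(decide("revise", 3));
console.log(decide("proceed", 2));
```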
Start and advance a run
All results are JSON in a single text block. Most embed a run view: { run_id, mode, execution, ... , nextAction }, where mode is foreground | background, execution is one-shot | loop | job, and nextAction names the next tool to call.
chit_start
Open a run and return its run_id. Inputs:
- task? - a slice to converge on with the built-in loop. A loop run requires it; omit it when manifest_path names a one-shot manifest.
- manifest_path? - a manifest .json (absolute, or relative to cwd). Its policy decides one-shot vs loop. Omit to converge on task with the built-in loop.
- mode - foreground (default) or background.
- scope? - session scope id. Required for a loop run and for any per_scope manifest.
- cwd? - repo / working dir (defaults to the server cwd; also where a loop log is written).
- inputs - manifest inputs as string key/value pairs, default {} (one-shot runs).
- audit - persist a full audit transcript (prompts/outputs/usage as blobs), default false since blobs can hold secrets. Background runs are always audited.
- max_iterations - loop iteration budget, default 3 (loop runs only).
- allow_unenforced_permissions - run even when a declared permission cannot be enforced (emits warnings), default false (such a manifest is otherwise refused).
Returns the run view with the initial ready set (one-shot) or the open loop (loop). Errors: unreadable or invalid manifest, a one-shot manifest given a task, a loop run with no scope, unknown agent, enforcement gap without the flag.
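As a rough illustration of the inputs and defaults listed above, the sketch below types the chit_start arguments and fills in the documented defaults. The field names and defaults are taken from the list; the StartArgs type, the withDefaults helper, and the example task, scope, path, and input values are all hypothetical.

```typescript
// Field names and defaults follow the input list above; the helper is hypothetical.
interface StartArgs {
  task?: string;
  manifest_path?: string;
  mode?: "foreground" | "background";
  scope?: string;
  cwd?: string;
  inputs?: Record<string, string>;
  audit?: boolean;
  max_iterations?: number; // only meaningful for loop runs
  allow_unenforced_permissions?: boolean;
}

function withDefaults(args: StartArgs) {
  const mode = args.mode ?? "foreground";
  return {
    ...args,
    mode,
    inputs: args.inputs ?? {},
    // Background runs are always audited, whatever the caller passes.
    audit: mode === "background" ? true : (args.audit ?? false),
    max_iterations: args.max_iterations ?? 3,
    allow_unenforced_permissions: args.allow_unenforced_permissions ?? false,
  };
}

// A loop run on the built-in loop: a bare task plus the scope a loop run requires.
const loopStart = withDefaults({
  task: "add retry to the fetch client", // made-up task
  scope: "retry-work", // made-up scope id
  mode: "background",
});

// A one-shot run: a manifest path and inputs, no task. Path and input are made up.
const oneShotStart = withDefaults({
  manifest_path: "chits/summarize.json",
  inputs: { topic: "release notes" },
});

console.log(loopStart, oneShotStart);
```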
chit_next
Advance the run by one unit and return control; emits a heartbeat while the unit runs. Inputs: run_id, and step_id? (one-shot only: advance just that ready step instead of the whole ready wave).
- One-shot: runs the ready wave (or the named step). Returns { ran[], ...run view }, each ran entry { step, durationMs, output } or { step, cancelled: true, durationMs }. When nothing is ready, the run is complete.
- Loop: runs one implement-to-review iteration. Returns { iteration, verdict, decision, findingCount, checksRun, changedFiles, workspaceWarnings, usage?, auditRef?, stopStatus?, ...run view }. A set stopStatus means the loop also stopped this round. A cancelled iteration returns { cancelled: true, iteration, ...run view } and records a clean cancelled stop, never a fake-successful round. A graceful manifest failure returns { failed: true, iteration, failure, ...run view } and closes the loop blocked.
chit_next rejects a run that has already finished, and a loop that already has an iteration in flight (one advancer per loop; a foreground call and a background worker cannot advance the same loop at once).
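A caller usually has to tell the three loop result shapes apart before deciding whether to call chit_next again. The sketch below shows one way to do that, assuming only the fields named above; the LoopNextResult type is a trimmed, hypothetical view of the result and summarize is not part of chit.

```typescript
// Only the fields this sketch inspects; the real result carries more (see above).
interface LoopNextResult {
  iteration: number;
  cancelled?: boolean;
  failed?: boolean;
  verdict?: "proceed" | "revise" | "block";
  stopStatus?: string;
}

// Hypothetical helper: turn a loop chit_next result into a one-line summary.
function summarize(r: LoopNextResult): string {
  if (r.cancelled) return `iteration ${r.iteration} cancelled`;
  if (r.failed) return `iteration ${r.iteration} failed, loop closed blocked`;
  // A set stopStatus means the loop also stopped this round.
  if (r.stopStatus) {
    return `loop stopped after iteration ${r.iteration}: ${r.stopStatus}`;
  }
  return `iteration ${r.iteration} returned ${r.verdict}; call chit_next again`;
}

console.log(summarize({ iteration: 1, verdict: "revise" }));
console.log(
  summarize({ iteration: 3, verdict: "revise", stopStatus: "max-iterations" }),
);
```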
Inspect and stop a run
chit_status
Run status. With a run_id: that run's status, and whether it is foreground (supervised by this session) or a durable background run. With no run_id: the operator overview of what is active in this server now plus a compact list of recently finished runs (newest first). Read-only: it never sweeps or touches the in-memory stores, so polling never keeps a run alive. Inputs: run_id?, recent_limit? (overview only, default 5; 0 for none). Active foreground state is per-session; background runs and recent history are durable across reconnect.
chit_status is a snapshot. When you want to wait for something to happen rather than poll, use chit_wait.
chit_wait
Block until a background run or batch reaches a meaningful state, then return the same view as chit_status / chit_batch_status plus a waitResult. This is the "tell me when it's done" tool: reach for it instead of polling chit_status in a loop, and never read chit's state files to detect completion (they are private). Read-only: it never advances a batch or mutates a run; it watches the durable state and returns. Emits a heartbeat while waiting; press Esc to stop waiting (the run keeps running). Inputs: run_id OR batch_id, timeout_ms? (default 900000), cwd? (batch only).
- With a run_id (background runs only): waits until the run is terminal (completed/failed/cancelled, or its worker died), so a crashed worker never hangs the wait. A foreground run is rejected (advance it with chit_next, which already blocks per unit).
- With a batch_id: waits until chit_batch_advance would do real work (a task can launch, or a finished/stale job can reconcile) or the batch is fully terminal. It does not advance the batch; call chit_batch_advance after.
waitResult is terminal | needs_advance | timeout.
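One way a caller might act on waitResult is sketched below. The three values come from the line above and the tool names from this document; the mapping itself is a suggested policy rather than chit behavior, and it assumes needs_advance is only ever returned for a batch.

```typescript
type WaitResult = "terminal" | "needs_advance" | "timeout";
type Target = "run" | "batch";

// Hypothetical helper: name the tool to call next after chit_wait returns.
function afterWait(result: WaitResult, target: Target): string {
  switch (result) {
    case "needs_advance":
      // Assumed batch-only: chit_wait never advances the batch itself.
      return "chit_batch_advance";
    case "terminal":
      // Read the history of a finished run, or the final batch overview.
      return target === "run" ? "chit_trace" : "chit_batch_status";
    case "timeout":
      // Nothing meaningful happened yet; wait again.
      return "chit_wait";
  }
}

console.log(afterWait("needs_advance", "batch"));
console.log(afterWait("terminal", "run"));
```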
chit_trace
The history of a run. Input: run_id. For a one-shot run, the step transcript: { run_id, execution: "one-shot", complete, trace[] }, each entry { step, kind, participant, agent, status, durationMs, output, error }. For a loop or background run, the iteration log read from the durable loop log: { run_id, execution, records[] }, where records are the header, each iteration (summary, changed files, verdict, decision, usage, audit ref), and the stop record. Audit refs appear here; the run_id is the only handle. Read-only.
chit_cancel
Cancel a run by run_id, foreground or background. Input: run_id.
- Foreground one-shot: aborts every running step; each settles cancelled (terminal, blocks dependents).
- Foreground loop: if an iteration is in flight, aborts it (it settles as a clean cancelled stop); if the loop is open but idle, closes it cancelled.
- Background: records the cancel intent first (so it survives a worker restart), then signals the worker's process group; the worker stops at the next safe point and records a clean cancelled stop.
A run that already finished is reported back unchanged.
Run several tasks in parallel
A batch runs several loop tasks in parallel, one per git worktree, as background runs. It is a thin coordinator: it plans a task graph, creates a worktree per task, and launches a background run per runnable task. It owns no execution and never auto-merges; the deliverable is a set of reviewable worktree branches. There is no daemon: progress happens only at explicit tool calls. Batch state is durable under the state dir, keyed by repo (not in the reviewed tree).
You hand chit a reviewed task graph; chit runs it. Each task declares claimedPaths (the files it will touch); tasks with overlapping claims never run concurrently (they serialize into later waves), so parallel tasks cannot race on the same files. A task may declare dependencies (task ids that must reach review_ready first). Dependencies are a launch gate, not integration: each task's worktree branches from the batch base (base_branch), so a dependent task starts from that base and does not receive its dependencies' changes. Use dependencies to order work; merging the resulting branches is yours. The manifest per task resolves as task manifestPath > batch manifest_path > the bundled default (a write-capable Claude implementer and a read-only Codex reviewer). A batch can mix model pairs by pointing tasks at different manifests: e.g. a Codex-writer task using examples/converge-codex-writer.json beside a default Claude-writer task.
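The launch rules above (dependencies as a gate, overlapping claims never concurrent, a parallelism cap) can be sketched as a greedy wave picker. This is an illustration, not chit's planner: the Task shape reuses the field names above, nextWave and its helpers are hypothetical, and the overlap rule (equal paths, or one path a directory prefix of the other) is an assumption, since the text does not define overlap precisely.

```typescript
// Hypothetical task shape, using field names from the text above.
interface Task {
  id: string;
  dependencies?: string[];
  claimedPaths: string[];
}

// Assumption: two claims overlap when they are equal or one is a
// directory prefix of the other.
function pathsOverlap(a: string, b: string): boolean {
  const na = a.replace(/\/+$/, "");
  const nb = b.replace(/\/+$/, "");
  return na === nb || na.startsWith(nb + "/") || nb.startsWith(na + "/");
}

function claimsConflict(a: Task, b: Task): boolean {
  return a.claimedPaths.some((p) =>
    b.claimedPaths.some((q) => pathsOverlap(p, q)),
  );
}

// Pick the next wave: dependencies review_ready, no claim overlap with
// running or already-picked tasks, and at most maxParallel in flight.
function nextWave(
  pending: Task[],
  running: Task[],
  reviewReady: Set<string>,
  maxParallel = 2,
): Task[] {
  const picked: Task[] = [];
  for (const task of pending) {
    if (running.length + picked.length >= maxParallel) break;
    const depsMet = (task.dependencies ?? []).every((d) => reviewReady.has(d));
    const clash = [...running, ...picked].some((o) => claimsConflict(task, o));
    if (depsMet && !clash) picked.push(task);
  }
  return picked;
}

const tasks: Task[] = [
  { id: "api", claimedPaths: ["src/api"] },
  { id: "api-tests", claimedPaths: ["src/api/client.ts"] },
  { id: "docs", claimedPaths: ["docs"] },
];

// api and docs launch together; api-tests overlaps api and waits for a later wave.
console.log(nextWave(tasks, [], new Set<string>(), 2).map((t) => t.id));
```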
chit_batch_start
Plan the graph, create worktrees, launch the initial runnable wave, return immediately. Inputs: tasks (each { id, title, body, dependencies?, claimedPaths?, allowPathOverlap?, manifestPath? }), cwd?, max_parallel? (default 2), base_branch? (default HEAD), manifest_path?, max_iterations?. claimedPaths is required per task unless allowPathOverlap is set (which makes the task run alone). Returns the batch overview (see status).
chit_batch_list
List the batches in this repo, newest first: id, status, task count, how many tasks are review_ready / failed, createdAt, and cleanedAt if it has been cleaned up. Use it to recover a batch id you lost, then chit_batch_status for the full view. Read-only. Inputs: limit? (newest N), cwd?.
chit_batch_status
Read-only overview: each task's status, live run state/phase, branch/worktree path, changed files, audit refs, plus runnableCount and a nextAction. Inspection is safe: this never launches runs, creates worktrees, or mutates state. Inputs: batch_id, cwd?.
chit_batch_advance
The progression trigger. Reconciles finished runs into task state (converged -> review_ready; blocked / max-iterations / failed / stale -> failed; a dependent proceeds only past a review_ready task), then launches the next runnable wave. Call it when status reports a finished run or runnable tasks. Inputs: batch_id, cwd?.
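The reconcile step described above is essentially a lookup table plus a gate on dependents. A minimal sketch follows; the outcome and status names come from the text, while reconcileTable and dependentMayLaunch are hypothetical names and the example task ids are made up.

```typescript
// Run outcomes and task statuses named in the text above.
type RunOutcome = "converged" | "blocked" | "max-iterations" | "failed" | "stale";
type TaskStatus = "review_ready" | "failed";

// Only a converged run yields a review_ready task; every other outcome fails it.
const reconcileTable: Record<RunOutcome, TaskStatus> = {
  converged: "review_ready",
  blocked: "failed",
  "max-iterations": "failed",
  failed: "failed",
  stale: "failed",
};

// A dependent proceeds only past review_ready dependencies.
function dependentMayLaunch(
  dependencies: string[],
  statusByTask: Record<string, string>,
): boolean {
  return dependencies.every((d) => statusByTask[d] === "review_ready");
}

const statusByTask = {
  schema: reconcileTable["converged"],
  migrate: reconcileTable["max-iterations"],
};

console.log(dependentMayLaunch(["schema"], statusByTask));
console.log(dependentMayLaunch(["schema", "migrate"], statusByTask));
```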
chit_batch_cancel
Request cancellation of every active task run (intent-first, the same safety as chit_cancel) and mark pending tasks cancelled. Running runs settle cleanly in the background; worktrees are left in place for inspection. Inputs: batch_id, cwd?.
chit_batch_cleanup
Retire a batch's worktrees and branches once you are done reviewing them. Safe by default: with confirm omitted/false it is a DRY RUN that lists which worktrees/branches would be removed and which changed-file diffs that would discard, and removes nothing. With confirm: true it removes them (git worktree remove --force + branch -D). Refuses while any task is still running. Never deletes the batch / run / audit receipts - those stay as durable history (it records cleanedAt on the batch). Inputs: batch_id, confirm? (default false), cwd?.
Read receipts
The audit tools read the local transcripts that audited runs write: chit run --audit, an audited MCP run (chit_start audit: true), and every background run (always audited). Same reader as the CLI chit audit list/show. Read-only: a run with no run.completed event is reported incomplete with the reason from the timeline alone (an open call killed mid-flight, a failed step, or an abandoned run). Bodies are read only through blob refs a run's own events carry, never a caller-supplied path, so inspection can never serve an arbitrary file.
chit_audit_list
List audited runs, newest first. Input: limit?. Returns { runs[] }, each run { audit_ref, manifestId, surface, scope?, iteration?, startedAt?, status, stepCount, usage?, openCall? }, where audit_ref is the receipt handle you pass to chit_audit_show, status is the run.completed status or incomplete, and openCall (when present) names a step whose adapter call started but never completed (killed mid-call). (audit_ref is a receipt id, distinct from a control run_id: a loop run has one run_id but one audit_ref per iteration.)
chit_audit_show
Show one audited run as a receipt, by its audit_ref (from chit_trace's auditRefs or chit_audit_list, not a control run_id). Inputs: audit_ref, include_bodies (default false), verbose (default false). Returns { summary, incompleteReason?, participants?, timeline[], note? }: the summary above, the reason when incomplete, the participant config recorded at start, and the structured event timeline. Without verbose the timeline is a receipt (the raw per-call adapter events are hidden, and note says how many). Prompt/output/event bodies attach to their timeline entries (input/output/raw) only when include_bodies is true, since they can be large or hold secrets.
Observability (heartbeat)
While a call step runs, chit_next emits, every ~5s, both a progress notification (with progressToken) and a logging notification carrying the same latest-state text. Claude Code renders the latest heartbeat live in the collapsed tool call; the full transcript is chit_trace.
There is no within-step streaming of the agent's output to the MCP client. The heartbeat is latest-state text, not a token stream, and chit_next returns only the unit's final result. On an audited run the adapter does capture the agent's live event stream as adapter.event records, but that feeds the audit log, not the MCP client.
Cancellation
Cancellation has two reachable paths. The portable one is chit_cancel. Each in-flight unit owns an AbortController registered for the whole call. chit_cancel aborts it; both adapters (claude-cli, codex-exec) kill their child process on abort and reject; and the engine settles the unit as cancelled. The second path is ambient: in Claude Code, pressing Esc during a blocking chit_next propagates request cancellation through the call's folded-in extra.signal, which aborts the same controller. A live probe confirmed a long codex step settling cancelled in ~5s on Esc, with no chit_cancel call. Esc behavior is client-specific, so chit_cancel stays the portable backup.
A cancelled loop iteration records a clean cancelled stop with no iteration record, so the loop log never carries a fake-successful round.
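The "one AbortController per in-flight unit" pattern can be sketched with the standard AbortController API. This is a simplified, synchronous illustration and not chit's engine: the registry, startUnit, and cancel are hypothetical, and the abort listener stands in for the adapter killing its child process.

```typescript
type UnitStatus = "running" | "done" | "cancelled";

interface Unit {
  status: UnitStatus;
  controller: AbortController;
}

// Hypothetical in-memory registry keyed by run_id.
const inFlight = new Map<string, Unit>();

function startUnit(runId: string): Unit {
  const unit: Unit = { status: "running", controller: new AbortController() };
  // Stand-in for the adapter: a real one would kill its child process here.
  unit.controller.signal.addEventListener("abort", () => {
    if (unit.status === "running") unit.status = "cancelled";
  });
  inFlight.set(runId, unit);
  return unit;
}

// An explicit cancel call and a client-side request cancellation would
// both end up aborting the same controller.
function cancel(runId: string): UnitStatus | undefined {
  const unit = inFlight.get(runId);
  unit?.controller.abort();
  return unit?.status;
}

const unit = startUnit("run-1");
console.log(cancel("run-1"), unit.controller.signal.aborted);
```

Because both paths share one controller, a second cancel on the same unit is a no-op in this sketch: the signal is already aborted and the status stays cancelled.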
Limits
- Foreground runs live in an in-memory store. A server restart or reconnect loses them. The store is idle-evicting: a run untouched for more than 1h is dropped on the next chit_start sweep, unless it still has work in flight (those are never evicted). A background run is durable and is not affected.
- After a foreground run is evicted, chit_status / chit_trace no longer find that run_id. A background run's loop log and audit transcripts persist regardless.
- inputs are string to string. file[] inputs are not expressible via MCP.
- Concurrent per_scope steps would hit the session store's read-modify-write race, so keep same-scope steps serial.
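The idle-eviction rule for foreground runs can be sketched as a filter over the in-memory store. The 1h limit and the in-flight exemption come from the limits above; the ForegroundRun shape and the sweep function are hypothetical, with time passed in explicitly so the example is deterministic.

```typescript
// Hypothetical record for a foreground run held in memory.
interface ForegroundRun {
  runId: string;
  lastTouchedMs: number;
  inFlight: boolean; // a unit is currently running
}

const IDLE_LIMIT_MS = 60 * 60 * 1000; // 1h, per the limit above

// Returns the runs that survive a sweep at time nowMs. A run with work
// in flight is never evicted, however long it has been untouched.
function sweep(runs: ForegroundRun[], nowMs: number): ForegroundRun[] {
  return runs.filter(
    (r) => r.inFlight || nowMs - r.lastTouchedMs <= IDLE_LIMIT_MS,
  );
}

const twoHours = 2 * IDLE_LIMIT_MS;
const store: ForegroundRun[] = [
  { runId: "idle", lastTouchedMs: 0, inFlight: false },
  { runId: "busy", lastTouchedMs: 0, inFlight: true },
  { runId: "recent", lastTouchedMs: twoHours - 30 * 60 * 1000, inFlight: false },
];

// "idle" is dropped; "busy" and "recent" are kept.
console.log(sweep(store, twoHours).map((r) => r.runId));
```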
Not supported yet
Client-facing output streaming (a live token stream to the MCP client). The heartbeat is enough for now. Live adapter event capture for the audit log is separate, and has shipped.