
Runner

The Runner is the execution engine that powers agent runs in herdctl. It integrates with the Claude Agent SDK to execute agents, stream output in real-time, and manage the full job lifecycle.

┌──────────────────────────────────────────────────┐
│                   JobExecutor                    │
├──────────────────────────────────────────────────┤
│ 1. Create job record                             │
│ 2. Transform config → SDK options (sdk-adapter)  │
│ 3. Execute SDK query (async iterator)            │
│ 4. Process messages (message-processor)          │
│ 5. Stream output to JSONL                        │
│ 6. Update job status and session info            │
└──────────────────────────────────────────────────┘
        │                  │                   │
        ▼                  ▼                   ▼
  ┌──────────┐      ┌──────────────┐     ┌────────────┐
  │   SDK    │      │    State     │     │   Error    │
  │ Adapter  │      │  Management  │     │  Handler   │
  └──────────┘      └──────────────┘     └────────────┘

The runner module consists of four main components:

| Component         | File                 | Purpose                                     |
| ----------------- | -------------------- | ------------------------------------------- |
| JobExecutor       | job-executor.ts      | Main execution engine and lifecycle manager |
| SDK Adapter       | sdk-adapter.ts       | Transforms agent config to SDK format       |
| Message Processor | message-processor.ts | Validates and transforms SDK messages       |
| Error Handler     | errors.ts            | Classifies errors and provides diagnostics  |

The runner integrates with the Claude Agent SDK using an async iterator pattern. This enables real-time streaming of agent output without buffering.

The SDK’s query() function returns an AsyncIterable<SDKMessage>, which the runner consumes:

// SDK query function signature
type SDKQueryFunction = (params: {
  prompt: string;
  options?: Record<string, unknown>;
  abortController?: AbortController;
}) => AsyncIterable<SDKMessage>;

// Execution loop
const messages = sdkQuery({ prompt, options: sdkOptions });
for await (const message of messages) {
  // Process each message as it arrives
  const processed = processSDKMessage(message);

  // Write immediately to JSONL (no buffering)
  await appendJobOutput(jobsDir, jobId, processed.output);

  // Check for terminal message
  if (processed.isFinal) {
    break;
  }
}

This pattern gives the runner:

  • Real-time streaming: Messages appear immediately in job output
  • Memory efficiency: No buffering of large outputs
  • Concurrent readers: Other processes can tail the JSONL file
  • Graceful shutdown: Can stop mid-execution via AbortController
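
The graceful-shutdown point can be sketched with a plain AbortController driving the consumption loop. The helper and mock stream below are illustrative stand-ins, not herdctl or Claude Agent SDK APIs:

```typescript
// Illustrative sketch: stop consuming an AsyncIterable mid-stream when an
// AbortController fires. `consumeUntilAborted` and `mockStream` are
// hypothetical helpers for this example only.
async function consumeUntilAborted<T>(
  stream: AsyncIterable<T>,
  signal: AbortSignal,
  onMessage: (msg: T) => void,
): Promise<number> {
  let count = 0;
  for await (const msg of stream) {
    if (signal.aborted) break; // stop cleanly; earlier output is already on disk
    onMessage(msg);
    count++;
  }
  return count;
}

// Stand-in for the SDK's query() iterator
async function* mockStream(): AsyncIterable<string> {
  yield* ["init", "assistant", "tool_use", "tool_result", "end"];
}

async function demo(): Promise<number> {
  const controller = new AbortController();
  let seen = 0;
  return consumeUntilAborted(mockStream(), controller.signal, () => {
    seen++;
    if (seen === 2) controller.abort(); // in practice, triggered by SIGINT etc.
  });
}
```

Because each processed message is flushed to JSONL before the next one is read, aborting mid-run loses nothing that was already streamed.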

The runner supports four permission modes that control how tool calls are approved:

| Mode              | Description                            | Auto-Approved Tools                  |
| ----------------- | -------------------------------------- | ------------------------------------ |
| default           | Requires approval for everything       | None                                 |
| acceptEdits       | Default. Auto-approves file operations | Read, Write, Edit, mkdir, rm, mv, cp |
| bypassPermissions | Auto-approves all tools                | All tools                            |
| plan              | Planning only, no execution            | None                                 |

Set the permission mode in your agent configuration:

agents/my-agent/agent.yaml
name: my-agent
permissions:
  mode: acceptEdits # default, acceptEdits, bypassPermissions, plan
  # Optional: explicitly allow specific tools
  allowed_tools:
    - Bash
    - Read
    - Write
  # Optional: deny specific tools
  denied_tools:
    - mcp__github__create_issue

Every tool call requires human approval:

permissions:
  mode: default

Use for: High-stakes operations, new agents, untested workflows.

File operations auto-approve, other tools require approval:

permissions:
  mode: acceptEdits

Use for: Most development workflows where file edits are the primary action.

All tools auto-approve—the agent runs fully autonomously:

permissions:
  mode: bypassPermissions

Use for: Trusted agents in controlled environments, scheduled jobs, CI/CD.

Agent can plan but not execute tools:

permissions:
  mode: plan

Use for: Exploring solutions without making changes, generating plans for review.

Fine-grained control over specific tools:

permissions:
  mode: acceptEdits
  # Whitelist specific tools
  allowed_tools:
    - Bash
    - Read
    - Write
    - Edit
    - mcp__github__* # Wildcard for all GitHub MCP tools
  # Blacklist dangerous tools
  denied_tools:
    - mcp__postgres__execute_query # Prevent database writes
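
As a rough illustration of how a wildcard entry like mcp__github__* can be checked against a concrete tool name, consider the matcher below. The matching semantics here (trailing-star prefix match only) are an assumption for illustration; herdctl's actual algorithm may differ:

```typescript
// Hypothetical prefix-wildcard matcher for allowed_tools / denied_tools
// patterns. Only a trailing `*` is handled; exact names otherwise.
function matchesToolPattern(pattern: string, toolName: string): boolean {
  if (!pattern.endsWith("*")) return pattern === toolName;
  return toolName.startsWith(pattern.slice(0, -1));
}
```

Under this sketch, mcp__github__* matches mcp__github__create_issue but not mcp__postgres__query, and a bare entry like Bash matches only itself.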

MCP (Model Context Protocol) servers extend agent capabilities with external tools.

Spawn a local process that communicates via stdio:

mcp_servers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_TOKEN: ${GITHUB_TOKEN} # Environment variable interpolation

Connect to a remote MCP endpoint:

mcp_servers:
  custom-api:
    url: http://localhost:8080/mcp

MCP tools are namespaced as mcp__<server>__<tool>:

mcp__github__create_issue
mcp__github__list_pull_requests
mcp__postgres__query
mcp__filesystem__read_file
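
The convention can be split back into its parts; this small parser is a sketch based only on the mcp__&lt;server&gt;__&lt;tool&gt; pattern above, and is not a helper that herdctl necessarily exposes:

```typescript
// Split a namespaced MCP tool name into server and tool parts.
// Illustrative only; not part of the herdctl API.
interface ParsedMcpTool {
  server: string;
  tool: string;
}

function parseMcpToolName(name: string): ParsedMcpTool | null {
  const parts = name.split("__");
  if (parts.length < 3 || parts[0] !== "mcp") return null;
  // Tool names may themselves contain double underscores, so rejoin the tail.
  return { server: parts[1], tool: parts.slice(2).join("__") };
}
```

Built-in tools like Bash have no mcp__ prefix and return null here.
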
| Server     | Package                                 | Purpose                    |
| ---------- | --------------------------------------- | -------------------------- |
| GitHub     | @modelcontextprotocol/server-github     | Issues, PRs, repos         |
| Filesystem | @modelcontextprotocol/server-filesystem | File operations            |
| PostgreSQL | @modelcontextprotocol/server-postgres   | Database access            |
| Memory     | @modelcontextprotocol/server-memory     | Persistent key-value store |
agents/full-stack/agent.yaml
name: full-stack-agent
mcp_servers:
  # GitHub for issue management
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_TOKEN: ${GITHUB_TOKEN}
  # Database for analytics
  postgres:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-postgres"]
    env:
      DATABASE_URL: ${DATABASE_URL}
  # Custom internal API
  internal-api:
    url: ${INTERNAL_API_URL}
permissions:
  mode: acceptEdits
  allowed_tools:
    - mcp__github__*
    - mcp__postgres__query # Read-only
  denied_tools:
    - mcp__postgres__execute # No writes

Sessions enable resuming conversations and forking agent state.

  • Session ID: Unique identifier from the Claude SDK for conversation context
  • Resume: Continue a previous conversation with full context
  • Fork: Branch from a previous state to explore alternatives

Resume continues the exact conversation:

Job A (creates session)
Job B (resume from A) → Continues with full context
Job C (resume from B) → Continues with full context

Usage:

const result = await runner.execute({
  agent: myAgent,
  prompt: "Continue from where we left off",
  stateDir: ".herdctl",
  resume: "session-id-from-previous-job",
});

Fork branches from a point in history:

Job A (creates session)
├─► Job B (fork from A) → New branch with A's context
└─► Job C (fork from A) → Another branch with A's context

Usage:

const result = await runner.execute({
  agent: myAgent,
  prompt: "Try a different approach",
  stateDir: ".herdctl",
  fork: "session-id-to-fork-from",
});

Session info is persisted in .herdctl/sessions/<agent-name>.json:

{
  "agent_name": "bragdoc-coder",
  "session_id": "claude-session-xyz789",
  "created_at": "2024-01-19T08:00:00Z",
  "last_used_at": "2024-01-19T10:05:00Z",
  "job_count": 15,
  "mode": "autonomous"
}
| Scenario                   | Use                             |
| -------------------------- | ------------------------------- |
| Continue a task            | resume with previous session ID |
| Try alternative approaches | fork from a checkpoint          |
| Start fresh                | Neither (creates a new session) |
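
One way to drive that decision is to read the persisted session file first. The path layout and JSON shape follow the example above; loadSessionInfo itself is an illustrative helper, not part of @herdctl/core:

```typescript
// Sketch: load persisted session info from .herdctl/sessions/<agent-name>.json
// to decide between resuming and starting fresh. Hypothetical helper.
import { readFile } from "node:fs/promises";
import { join } from "node:path";

interface SessionInfo {
  agent_name: string;
  session_id: string;
  created_at: string;
  last_used_at: string;
  job_count: number;
  mode: string;
}

async function loadSessionInfo(
  stateDir: string,
  agentName: string,
): Promise<SessionInfo | null> {
  try {
    const raw = await readFile(join(stateDir, "sessions", `${agentName}.json`), "utf8");
    return JSON.parse(raw) as SessionInfo;
  } catch {
    return null; // no session file yet → caller starts a fresh session
  }
}
```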

The runner streams output in real-time using JSONL (newline-delimited JSON).

Each line is a complete, self-contained JSON object:

{"type":"system","subtype":"init","timestamp":"2024-01-19T09:00:00Z"}
{"type":"assistant","content":"Starting analysis...","partial":false,"timestamp":"2024-01-19T09:00:01Z"}
{"type":"tool_use","tool_name":"Bash","tool_use_id":"toolu_123","input":"ls -la","timestamp":"2024-01-19T09:00:02Z"}
{"type":"tool_result","tool_use_id":"toolu_123","result":"total 42...","success":true,"timestamp":"2024-01-19T09:00:03Z"}
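
Because every line is independent JSON, a reader can be a one-liner per line. The parser below is a minimal sketch; field names beyond those shown in the examples above are assumptions:

```typescript
// Minimal sketch of parsing runner JSONL output. Each non-empty line is a
// complete JSON object; blank lines are skipped.
interface OutputMessage {
  type: "system" | "assistant" | "tool_use" | "tool_result" | "error";
  timestamp: string;
  [key: string]: unknown;
}

function parseJsonlLines(text: string): OutputMessage[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0) // tolerate trailing newline
    .map((line) => JSON.parse(line) as OutputMessage);
}
```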

The runner outputs five message types:

Session lifecycle events:

{
  "type": "system",
  "subtype": "init",
  "content": "Session initialized",
  "timestamp": "2024-01-19T09:00:00Z"
}

Subtypes: init, end, complete

Claude’s text responses:

{
  "type": "assistant",
  "content": "I'll analyze the codebase...",
  "partial": false,
  "usage": {
    "input_tokens": 1500,
    "output_tokens": 200
  },
  "timestamp": "2024-01-19T09:00:01Z"
}
  • partial: True for streaming chunks, false for complete messages
  • usage: Token counts (when available)
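
Given those semantics, a consumer that only wants finished text can filter on the partial flag. This assumes, as a sketch, that a message with partial: false carries the complete, self-contained text:

```typescript
// Keep only complete assistant messages, skipping streaming chunks.
// Assumption: `partial: false` marks a final, self-contained message.
interface AssistantMessage {
  type: "assistant";
  content: string;
  partial: boolean;
}

function completeAssistantTexts(messages: AssistantMessage[]): string[] {
  return messages.filter((m) => !m.partial).map((m) => m.content);
}
```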

Tool invocations by the agent:

{
  "type": "tool_use",
  "tool_name": "Bash",
  "tool_use_id": "toolu_abc123",
  "input": "git status",
  "timestamp": "2024-01-19T09:00:02Z"
}

Results from tool execution:

{
  "type": "tool_result",
  "tool_use_id": "toolu_abc123",
  "result": "On branch main\nNothing to commit",
  "success": true,
  "error": null,
  "timestamp": "2024-01-19T09:00:05Z"
}

Error events:

{
  "type": "error",
  "message": "API rate limit exceeded",
  "code": "RATE_LIMIT",
  "stack": "...",
  "timestamp": "2024-01-19T09:00:05Z"
}

Stream output in real-time using the async generator:

import { readJobOutput } from '@herdctl/core';

// Memory-efficient streaming read
for await (const message of readJobOutput(jobsDir, jobId)) {
  console.log(message.type, message.content || message.tool_name);
}

Or tail the file directly:

tail -f .herdctl/jobs/job-2024-01-19-abc123.jsonl | jq .

The runner provides structured error handling with detailed diagnostics.

RunnerError (base)
├── SDKInitializationError
│   └── Missing API key, network issues
├── SDKStreamingError
│   └── Rate limits, connection drops
└── MalformedResponseError
    └── Invalid SDK message format

Errors are classified to determine the appropriate exit reason:

| Exit Reason | Trigger                            |
| ----------- | ---------------------------------- |
| success     | Job completed normally             |
| error       | Unrecoverable error                |
| timeout     | Execution time exceeded            |
| cancelled   | User or system cancellation        |
| max_turns   | Reached maximum conversation turns |
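
A classifier mapping an outcome to one of these reasons might look like the sketch below. The string checks are assumptions for illustration; herdctl's real classifier lives in errors.ts and may use different signals:

```typescript
// Hypothetical exit-reason classifier, mirroring the table above.
type ExitReason = "success" | "error" | "timeout" | "cancelled" | "max_turns";

function classifyExit(error: Error | null): ExitReason {
  if (error === null) return "success";
  if (error.name === "AbortError") return "cancelled"; // AbortController fired
  if (/timed out|timeout/i.test(error.message)) return "timeout";
  if (/max[ _-]?turns/i.test(error.message)) return "max_turns";
  return "error"; // anything unrecognized is an unrecoverable error
}
```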

The runner detects common error patterns:

// Missing API key
if (error.isMissingApiKey()) {
  // Prompt user to set ANTHROPIC_API_KEY
}

// Rate limiting
if (error.isRateLimited()) {
  // Implement backoff or wait
}

// Network issues
if (error.isNetworkError()) {
  // Check connectivity
}

// Recoverable errors
if (error.isRecoverable()) {
  // Can retry the operation
}
SDKInitializationError: Missing or invalid API key

Solution: Set your Anthropic API key:

export ANTHROPIC_API_KEY=sk-ant-...
SDKStreamingError: Rate limit exceeded

Solutions:

  1. Wait and retry (the error includes retry-after when available)
  2. Reduce concurrent agent runs
  3. Use a higher-tier API plan
SDKStreamingError: Connection refused (ECONNREFUSED)

Solutions:

  1. Check network connectivity
  2. Verify MCP server URLs are accessible
  3. Check firewall rules
MalformedResponseError: Invalid message format

This usually indicates an SDK version mismatch or API changes. The runner logs these but continues processing other messages.

The runner currently does not retry failed operations. For critical workflows, implement retry logic at the orchestration layer:

async function runWithRetry(options: RunnerOptions, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const result = await runner.execute(options);
    if (result.success) return result;
    if (
      result.error instanceof SDKStreamingError &&
      result.error.isRecoverable() &&
      attempt < maxRetries
    ) {
      await sleep(1000 * attempt); // Linear backoff: 1s, 2s, 3s, ...
      continue;
    }
    throw result.error;
  }
}

Handle partial failures gracefully:

const result = await runner.execute(options);

if (!result.success && result.errorDetails?.code === 'RATE_LIMIT') {
  // Save progress and schedule retry
  await scheduleRetry(result.jobId, result.sessionId);
}

The runner returns a structured result:

interface RunnerResult {
  success: boolean;          // Whether the run completed successfully
  jobId: string;             // The job ID for this run
  sessionId?: string;        // Session ID for resume/fork
  summary?: string;          // Brief summary of accomplishments
  error?: Error;             // Error if run failed
  errorDetails?: {           // Detailed error info
    code: string;
    message: string;
    recoverable: boolean;
  };
  durationSeconds?: number;  // Total execution time
}
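
A caller typically branches on this shape after execution. The interface below mirrors the definition above; describeResult is a hypothetical helper for illustration:

```typescript
// Sketch: summarizing a RunnerResult for logs. The interface matches the
// documented shape; `describeResult` itself is illustrative.
interface RunnerResult {
  success: boolean;
  jobId: string;
  sessionId?: string;
  summary?: string;
  error?: Error;
  errorDetails?: { code: string; message: string; recoverable: boolean };
  durationSeconds?: number;
}

function describeResult(result: RunnerResult): string {
  if (result.success) {
    return `Job ${result.jobId} succeeded in ${result.durationSeconds ?? 0}s`;
  }
  const code = result.errorDetails?.code ?? "UNKNOWN";
  const retry = result.errorDetails?.recoverable ? " (retryable)" : "";
  return `Job ${result.jobId} failed: ${code}${retry}`;
}
```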