HTTP API

The HTTP API is a thin REST layer over FleetManager. Like the CLI, it contains no business logic of its own — every endpoint delegates to a FleetManager method and returns the result. The API powers the web dashboard and can be consumed by external scripts, CI/CD pipelines, or custom integrations.

Figure: HTTP API request flow, showing client requests passing through Fastify, route handlers, and FleetBridge to FleetManager, plus the WebSocket event relay path.

The API follows the same thin client principle as every other interaction layer in herdctl:

  1. FleetManager is the single source of truth. Every REST endpoint calls a FleetManager method. No endpoint reads state files directly or manages agent lifecycle on its own.
  2. REST for queries and commands, WebSocket for real-time events. Clients fetch initial state via REST and receive incremental updates via a single WebSocket connection.
  3. No authentication (MVP). The API is designed for local use on localhost. Authentication is planned but not yet implemented.
  4. Consistent error responses. All errors return a JSON object with error and statusCode fields.

The HTTP API uses Fastify as its server framework. Fastify was chosen over Express for its TypeScript-first design, built-in WebSocket support via @fastify/websocket, plugin architecture, and request/response schema validation.

The server is created by the createWebServer() factory function in packages/web/src/server/index.ts. It registers:

  • CORS via @fastify/cors (allows localhost origins for development)
  • WebSocket via @fastify/websocket
  • Static file serving via @fastify/static (serves the built React SPA)
  • REST route modules for fleet, agents, jobs, schedules, and chat
  • SPA fallback handler that serves index.html for client-side routing

WebManager is the IChatManager implementation for the web platform. FleetManager dynamically imports @herdctl/web at startup when the fleet configuration includes a web block with enabled: true. WebManager follows the same lifecycle as the Discord and Slack managers:

FleetManager.initialize()
  -> WebManager.initialize()   # Creates Fastify server, registers routes
-> FleetManager.start()
  -> WebManager.start()        # Starts listening on host:port, starts FleetBridge
-> FleetManager.stop()
  -> WebManager.stop()         # Stops FleetBridge, closes WebSocket connections, shuts down Fastify

The server binds to the host and port specified in herdctl.yaml:

web:
  enabled: true
  port: 3232
  host: localhost

Routes are organized into focused modules, each receiving the Fastify instance and FleetManager reference:

| Module | File | Endpoints |
| --- | --- | --- |
| Fleet | routes/fleet.ts | Fleet status |
| Agents | routes/agents.ts | Agent listing and detail |
| Jobs | routes/jobs.ts | Job listing, detail, cancel, fork |
| Schedules | routes/schedules.ts | Schedule listing, trigger, enable/disable |
| Chat | routes/chat.ts | Chat session management and messaging |
| System | index.ts (inline) | Health check, version |

All endpoints are prefixed with /api. Responses are JSON.

Figure: API route map, showing endpoint groups for fleet, agents, jobs, schedules, chat, system, and WebSocket.

| Method | Path | Description | FleetManager Method |
| --- | --- | --- | --- |
| GET | /api/fleet/status | Fleet status including state, uptime, agent count, job counts, scheduler state | getFleetStatus() |

Example response:

{
  "status": "running",
  "startedAt": "2025-01-20T10:00:00Z",
  "agentCount": 3,
  "runningJobCount": 1,
  "schedulerState": "running"
}

| Method | Path | Description | FleetManager Method |
| --- | --- | --- | --- |
| GET | /api/agents | List all agents with status, schedules, and connector info | getAgentInfo() |
| GET | /api/agents/:name | Get detailed info for a single agent by qualified name or local name | getAgentInfoByName(name) |

The :name parameter accepts either a qualified name (e.g., herdctl.security-auditor) or a local name (e.g., security-auditor). If the agent is not found, the endpoint returns 404.
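
The lookup described above can be sketched as a small pure function. This is an illustration only: `AgentSummary` and `findAgent` are hypothetical names, and preferring a qualified-name match over a local-name match is an assumption, not documented behavior of getAgentInfoByName().

```typescript
// Hypothetical shape: only `name` and `qualifiedName` are taken from the
// example response in this document.
interface AgentSummary {
  name: string; // local name, e.g. "security-auditor"
  qualifiedName: string; // e.g. "herdctl.security-auditor"
}

// Returns the matching agent, trying the qualified name first (assumption),
// or undefined, which a route handler would translate into a 404.
function findAgent(agents: AgentSummary[], name: string): AgentSummary | undefined {
  return (
    agents.find((a) => a.qualifiedName === name) ??
    agents.find((a) => a.name === name)
  );
}

const agents: AgentSummary[] = [
  { name: "security-auditor", qualifiedName: "herdctl.security-auditor" },
  { name: "docs-writer", qualifiedName: "herdctl.docs-writer" },
];

console.log(findAgent(agents, "security-auditor"));
console.log(findAgent(agents, "herdctl.docs-writer"));
console.log(findAgent(agents, "missing"));
```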

Example response for GET /api/agents/:name:

{
  "name": "security-auditor",
  "qualifiedName": "herdctl.security-auditor",
  "description": "Runs security audits on the codebase",
  "status": "idle",
  "currentJobId": null,
  "lastJobId": "job-2025-01-20-abc123",
  "schedules": [
    {
      "name": "daily-audit",
      "type": "cron",
      "expression": "0 6 * * *",
      "status": "idle",
      "lastRunAt": "2025-01-20T06:00:00Z",
      "nextRunAt": "2025-01-21T06:00:00Z"
    }
  ],
  "chatConnectors": {
    "discord": { "status": "connected" },
    "slack": { "status": "disconnected" }
  }
}

| Method | Path | Description | FleetManager Method |
| --- | --- | --- | --- |
| GET | /api/jobs | List jobs with pagination and filtering | listJobs() (core utility) |
| GET | /api/jobs/:id | Get full metadata for a single job | getJob() (core utility) |
| POST | /api/jobs/:id/cancel | Cancel a running job | cancelJob(id) |
| POST | /api/jobs/:id/fork | Fork a job, optionally with a new prompt | forkJob(id, modifications) |

Query parameters for GET /api/jobs:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | number | 50 | Max results (clamped to 1-100) |
| offset | number | 0 | Pagination offset |
| agentName | string | | Filter by agent qualified name |
| status | string | | Filter by status: pending, running, completed, failed, cancelled |
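
A minimal sketch of how a handler might normalize these query parameters before calling listJobs(). `parseJobsQuery` is a hypothetical helper. The defaults and the 1-100 clamp come from the table above; falling back to defaults for non-numeric input, clamping negative offsets to 0, and ignoring unknown status values are assumptions.

```typescript
type JobStatus = "pending" | "running" | "completed" | "failed" | "cancelled";

const JOB_STATUSES: readonly JobStatus[] = [
  "pending", "running", "completed", "failed", "cancelled",
];

interface JobsQuery {
  limit: number;
  offset: number;
  agentName?: string;
  status?: JobStatus;
}

// Query-string values always arrive as strings (or are absent).
function parseJobsQuery(raw: Record<string, string | undefined>): JobsQuery {
  const limitNum = Number.parseInt(raw.limit ?? "", 10);
  const offsetNum = Number.parseInt(raw.offset ?? "", 10);

  // Default 50, clamped to the documented 1-100 range.
  const limit = Number.isNaN(limitNum) ? 50 : Math.min(100, Math.max(1, limitNum));
  // Default 0; negative offsets are clamped to 0 (assumption).
  const offset = Number.isNaN(offsetNum) ? 0 : Math.max(0, offsetNum);
  // Unknown status values are dropped rather than rejected (assumption).
  const status = JOB_STATUSES.find((s) => s === raw.status);

  return { limit, offset, agentName: raw.agentName || undefined, status };
}

console.log(parseJobsQuery({ limit: "500", status: "running" }));
```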

Example response for GET /api/jobs:

{
  "jobs": [
    {
      "jobId": "job-2025-01-20-abc123",
      "agentName": "herdctl.security-auditor",
      "prompt": "Run daily security audit...",
      "status": "completed",
      "createdAt": "2025-01-20T06:00:00Z",
      "startedAt": "2025-01-20T06:00:00Z",
      "completedAt": "2025-01-20T06:05:30Z",
      "exitCode": 0,
      "sessionId": "claude-session-xyz",
      "triggerType": "scheduled",
      "workspace": "/home/user/projects/my-app"
    }
  ],
  "total": 142,
  "limit": 50,
  "offset": 0,
  "errors": []
}

Fork request body:

{
  "prompt": "Try a different approach to the security issue"
}

The prompt field is optional. If omitted, the fork uses the original job’s configuration.

| Method | Path | Description | FleetManager Method |
| --- | --- | --- | --- |
| GET | /api/schedules | List all schedules across all agents | getSchedules() |
| POST | /api/agents/:name/trigger | Trigger a job for an agent, optionally targeting a specific schedule | trigger(name, scheduleName, options) |
| POST | /api/schedules/:agentName/:scheduleName/enable | Enable a disabled schedule | enableSchedule(agentName, scheduleName) |
| POST | /api/schedules/:agentName/:scheduleName/disable | Disable an active schedule | disableSchedule(agentName, scheduleName) |

Trigger request body:

{
  "scheduleName": "issue-check",
  "prompt": "Custom prompt override"
}

Both fields are optional. If scheduleName is omitted, the agent’s default trigger behavior applies. If prompt is provided, it overrides the schedule’s configured prompt.

The chat API manages web chat sessions. Actual message streaming happens via the WebSocket protocol, but REST endpoints handle session lifecycle and provide a non-streaming message endpoint.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/chat/recent | List recent sessions across all agents (sorted by last activity) |
| GET | /api/chat/config | Get chat configuration defaults (message grouping, tool results) |
| POST | /api/chat/:agentName/sessions | Create a new chat session for an agent |
| GET | /api/chat/:agentName/sessions | List all sessions for an agent |
| GET | /api/chat/:agentName/sessions/:sessionId | Get session details with full message history |
| DELETE | /api/chat/:agentName/sessions/:sessionId | Delete a chat session |
| PATCH | /api/chat/:agentName/sessions/:sessionId | Rename a session (set custom name) |
| POST | /api/chat/:agentName/sessions/:sessionId/messages | Send a message (non-streaming, waits for full response) |

Create session response (201 Created):

{
  "sessionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "createdAt": "2025-01-20T12:00:00.000Z"
}

Session detail response:

{
  "sessionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "agentName": "herdctl.my-agent",
  "createdAt": "2025-01-20T12:00:00.000Z",
  "lastMessageAt": "2025-01-20T12:05:30.000Z",
  "messageCount": 4,
  "preview": "What issues are open on the repo?",
  "customName": "Issue triage session",
  "messages": [
    {
      "role": "user",
      "content": "What issues are open on the repo?",
      "timestamp": "2025-01-20T12:00:01.000Z"
    },
    {
      "role": "assistant",
      "content": "I'll check the open issues for you...",
      "timestamp": "2025-01-20T12:00:03.000Z"
    },
    {
      "role": "tool",
      "content": "Found 3 open issues...",
      "timestamp": "2025-01-20T12:00:05.000Z",
      "toolCall": {
        "toolName": "Bash",
        "inputSummary": "gh issue list --state open",
        "output": "Found 3 open issues...",
        "isError": false,
        "durationMs": 1200
      }
    }
  ]
}

Send message request body:

{
  "message": "What issues are open on the repo?"
}

The REST message endpoint collects all streaming chunks and returns the complete response synchronously. For real-time streaming, use the WebSocket chat:send message type instead.

Rename request body:

{
  "name": "Issue triage session"
}

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/health | Health check (returns { status: "ok", timestamp }) |
| GET | /api/version | Package versions for web, CLI, and core |

Health check response:

{
  "status": "ok",
  "timestamp": "2025-01-20T12:00:00.000Z"
}

Version response:

{
  "web": "0.5.0",
  "cli": "0.5.0",
  "core": "0.5.0"
}

The API provides a single WebSocket endpoint at /ws. Clients open one connection and receive all event types multiplexed over that connection. This avoids the complexity of managing multiple connections and simplifies reconnection logic.

  1. Client connects to ws://localhost:3232/ws
  2. Server immediately sends a fleet:status message with a full fleet status snapshot
  3. Client sends subscribe messages for agents whose output it wants to stream
  4. Server broadcasts events as they occur
  5. Client sends ping messages periodically for keepalive; server responds with pong
  6. On disconnect, the server cleans up the client’s subscription state

Messages sent from the browser to the server:

| Type | Payload | Description |
| --- | --- | --- |
| subscribe | { agentName } | Subscribe to an agent’s job:output events |
| unsubscribe | { agentName } | Stop receiving an agent’s job:output events |
| ping | (none) | Keepalive ping |
| chat:send | { agentName, sessionId, message } | Send a chat message to an agent |

Example subscribe message:

{
  "type": "subscribe",
  "payload": {
    "agentName": "herdctl.security-auditor"
  }
}

Example chat send message:

{
  "type": "chat:send",
  "payload": {
    "agentName": "herdctl.my-agent",
    "sessionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "message": "Check the open issues"
  }
}
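
The four client message types lend themselves to a discriminated union. The sketch below shows one way a server could validate incoming frames. `ClientMessage` and `parseClientMessage` are hypothetical names, and rejecting malformed frames by returning null (instead of, say, closing the connection) is an assumption.

```typescript
// Message shapes taken from the client-to-server table above.
type ClientMessage =
  | { type: "subscribe"; payload: { agentName: string } }
  | { type: "unsubscribe"; payload: { agentName: string } }
  | { type: "ping" }
  | { type: "chat:send"; payload: { agentName: string; sessionId: string; message: string } };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

// Parses a raw WebSocket text frame; returns null for anything malformed.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const { type, payload } = data as { type?: unknown; payload?: unknown };
  const p = (typeof payload === "object" && payload !== null ? payload : {}) as Record<string, unknown>;

  switch (type) {
    case "ping":
      return { type: "ping" };
    case "subscribe":
      return isNonEmptyString(p.agentName)
        ? { type: "subscribe", payload: { agentName: p.agentName } }
        : null;
    case "unsubscribe":
      return isNonEmptyString(p.agentName)
        ? { type: "unsubscribe", payload: { agentName: p.agentName } }
        : null;
    case "chat:send":
      return isNonEmptyString(p.agentName) && isNonEmptyString(p.sessionId) && isNonEmptyString(p.message)
        ? { type: "chat:send", payload: { agentName: p.agentName, sessionId: p.sessionId, message: p.message } }
        : null;
    default:
      return null; // unknown message type
  }
}

console.log(parseClientMessage('{"type":"subscribe","payload":{"agentName":"herdctl.security-auditor"}}'));
```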

Messages sent from the server to connected browsers:

| Type | Payload | Broadcast Scope | Description |
| --- | --- | --- | --- |
| fleet:status | FleetStatus | Single client | Full fleet snapshot, sent on connection |
| agent:updated | AgentStartedPayload or AgentStoppedPayload | All clients | Agent lifecycle change |
| job:created | JobCreatedPayload | All clients | New job started |
| job:output | JobOutputPayload | Subscribed clients only | Streaming job output (high volume) |
| job:completed | JobCompletedPayload | All clients | Job finished successfully |
| job:failed | JobFailedPayload | All clients | Job failed with error |
| job:cancelled | JobCancelledPayload | All clients | Job was cancelled |
| schedule:triggered | ScheduleTriggeredPayload | All clients | A schedule fired |
| chat:response | { agentName, sessionId, jobId, chunk } | Requesting client | Streaming text chunk from agent |
| chat:tool_call | { agentName, sessionId, jobId, toolName, inputSummary?, output, isError, durationMs? } | Requesting client | Tool call result during chat |
| chat:message_boundary | { agentName, sessionId, jobId } | Requesting client | Boundary between distinct assistant text turns |
| chat:complete | { agentName, sessionId, jobId } | Requesting client | Chat response finished |
| chat:error | { agentName, sessionId, error } | Requesting client | Chat error occurred |

| Type | Payload | Description |
| --- | --- | --- |
| pong | (none) | Response to client ping |

Not all events are sent to all clients. The FleetBridge distinguishes between low-volume events (broadcast to all clients) and high-volume events (sent only to subscribed clients):

  • Broadcast to all: fleet:status, agent:updated, job:created, job:completed, job:failed, job:cancelled, schedule:triggered
  • Subscribers only: job:output (sent only to clients that have sent a subscribe message for the relevant agent)
  • Requesting client only: All chat:* messages (sent only to the client that initiated the chat:send)

This filtering prevents flooding inactive dashboard tabs with high-volume output data from agents the user is not viewing.
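
The three delivery scopes can be illustrated with a small in-memory model. This is a sketch rather than the FleetBridge implementation: `Client`, `scopeFor`, and `recipients` are hypothetical names, and clients are represented by plain IDs instead of real WebSocket connections.

```typescript
type Scope = "all" | "subscribers" | "requester";

interface Client {
  id: string;
  subscriptions: Set<string>; // qualified agent names this client subscribed to
}

// Maps an event type to its delivery scope, following the list above.
function scopeFor(eventType: string): Scope {
  if (eventType === "job:output") return "subscribers";
  if (eventType.startsWith("chat:")) return "requester";
  return "all";
}

// Returns the IDs of the clients that should receive the event.
function recipients(
  clients: Client[],
  eventType: string,
  ctx: { agentName?: string; requesterId?: string } = {},
): string[] {
  switch (scopeFor(eventType)) {
    case "subscribers":
      return clients
        .filter((c) => ctx.agentName !== undefined && c.subscriptions.has(ctx.agentName))
        .map((c) => c.id);
    case "requester":
      return clients.filter((c) => c.id === ctx.requesterId).map((c) => c.id);
    default:
      return clients.map((c) => c.id);
  }
}

const clients: Client[] = [
  { id: "tab-1", subscriptions: new Set<string>(["herdctl.security-auditor"]) },
  { id: "tab-2", subscriptions: new Set<string>() },
];

console.log(recipients(clients, "job:created"));
console.log(recipients(clients, "job:output", { agentName: "herdctl.security-auditor" }));
console.log(recipients(clients, "chat:response", { requesterId: "tab-2" }));
```

In this model an idle tab (`tab-2`) still sees lifecycle events but never receives job:output for agents it has not subscribed to.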

The FleetBridge class connects FleetManager’s event system to WebSocket clients. It subscribes to FleetManager events at startup and translates them into WebSocket server messages:

Figure: WebSocket event relay, showing client subscriptions, FleetBridge event filtering, and real-time broadcast to connected clients.

FleetManager Events          FleetBridge                     WebSocket Clients
agent:started       -------> broadcast()              -----> All clients
agent:stopped       -------> broadcast()              -----> All clients
job:created         -------> broadcast()              -----> All clients
job:output          -------> broadcastToSubscribers() -----> Subscribed clients
job:completed       -------> broadcast()              -----> All clients
job:failed          -------> broadcast()              -----> All clients
job:cancelled       -------> broadcast()              -----> All clients
schedule:triggered  -------> broadcast()              -----> All clients

The FleetBridge properly cleans up event listeners when stopped, preventing memory leaks. It stores bound handler references so that fleetManager.off() calls remove the correct listeners.

All error responses use a consistent structure:

{
  "error": "Descriptive error message",
  "statusCode": 404
}

| Status Code | Usage |
| --- | --- |
| 200 | Successful GET, POST, PATCH, DELETE |
| 201 | Resource created (e.g., new chat session) |
| 400 | Invalid request (missing required fields, malformed input) |
| 404 | Resource not found (agent, job, session) |
| 500 | Internal server error |
| 503 | Client build not available (SPA not built) |

Errors thrown by FleetManager are classified by string matching: if the error message contains “not found” (case-insensitive), the API returns 404; any other FleetManager error returns 500. This approach avoids coupling the API layer to specific error class hierarchies while still providing meaningful status codes.
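
The mapping from a FleetManager error to a response body can be sketched in a few lines. `toApiError` is a hypothetical helper name; the response shape and the “not found” rule are the ones described above.

```typescript
interface ApiError {
  error: string;
  statusCode: number;
}

// Converts any thrown value into the API's error shape.
// A case-insensitive "not found" in the message yields 404, anything else 500.
function toApiError(err: unknown): ApiError {
  const message = err instanceof Error ? err.message : String(err);
  const statusCode = /not found/i.test(message) ? 404 : 500;
  return { error: message, statusCode };
}

console.log(toApiError(new Error("Agent 'ghost' Not Found")));
console.log(toApiError(new Error("Scheduler crashed")));
```

The trade-off of string matching is that a rewording of an error message upstream silently changes the status code, so messages effectively become part of the contract.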

The server configures CORS via @fastify/cors to allow requests from known development origins:

  • http://localhost:3232 and http://127.0.0.1:3232 (production server)
  • http://localhost:5173 and http://127.0.0.1:5173 (Vite dev server)
  • The configured host:port combination from the web config

Allowed methods are GET, POST, PUT, DELETE, and OPTIONS.
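
As a rough sketch, the allowed-origin list could be derived from the web config as follows. `WebConfig`, `allowedOrigins`, and `isOriginAllowed` are hypothetical names, and allowing requests that carry no Origin header (for example curl) is an assumption about the policy, not something this document specifies.

```typescript
interface WebConfig {
  host: string;
  port: number;
}

// Builds the de-duplicated list of origins described above.
function allowedOrigins(config: WebConfig): string[] {
  const origins = new Set<string>([
    "http://localhost:3232",
    "http://127.0.0.1:3232",
    "http://localhost:5173", // Vite dev server
    "http://127.0.0.1:5173",
    `http://${config.host}:${config.port}`, // configured host:port
  ]);
  return Array.from(origins);
}

function isOriginAllowed(origin: string | undefined, config: WebConfig): boolean {
  // No Origin header: non-browser client or same-origin request (assumption: allow).
  if (origin === undefined) return true;
  return allowedOrigins(config).includes(origin);
}

console.log(allowedOrigins({ host: "localhost", port: 8080 }));
console.log(isOriginAllowed("http://evil.example", { host: "localhost", port: 3232 }));
```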

The API is designed for local use. Security is handled at the network level:

  • Default binding: The server binds to localhost by default, preventing LAN access.
  • Warning on exposure: When host is set to 0.0.0.0, the web dashboard displays a warning about the security implications.
  • Reverse proxy pattern: For remote access, the recommended approach is placing a reverse proxy (Caddy, Nginx + OAuth2 Proxy, Authelia) in front of the herdctl server. The proxy handles authentication and sets headers like X-Forwarded-User.

The Fastify plugin architecture supports adding authentication middleware without restructuring routes. When authentication is added, it will be injected as a Fastify preHandler hook that checks all /api/* routes.

Planned authentication options:

  1. Bearer token — A simple auth_token config field. When set, the server requires Authorization: Bearer <token> on all HTTP requests and the initial WebSocket handshake.
  2. API keys — Token-based auth for scripts and CI integrations.
  3. OIDC/OAuth — Enterprise SSO integration for multi-user deployments.
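
Since authentication is not implemented yet, the following is only a guess at what the bearer-token check (option 1) might look like. `isAuthorized` is a hypothetical function; treating an unset token as "authentication disabled" mirrors the current MVP behavior and is otherwise an assumption.

```typescript
// Decides whether a request may proceed, given its Authorization header
// and the configured token (the planned `auth_token` field).
function isAuthorized(
  authorizationHeader: string | undefined,
  authToken: string | undefined,
): boolean {
  // No token configured: authentication is disabled, as in the current MVP.
  if (!authToken) return true;
  if (!authorizationHeader) return false;

  // Expect exactly "Bearer <token>".
  const match = /^Bearer (.+)$/.exec(authorizationHeader);
  return match !== null && match[1] === authToken;
}

console.log(isAuthorized("Bearer s3cret", "s3cret"));
console.log(isAuthorized("Bearer wrong", "s3cret"));
```

A production version would likely compare tokens in constant time to avoid leaking information through timing; the plain `===` here keeps the sketch short.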

The web chat system uses WebChatManager to manage chat sessions. Unlike the monitoring endpoints that purely query FleetManager state, the chat system maintains its own persistent state for conversation history.

Chat sessions are server-managed, per-agent, and shared:

  • Sessions are stored in .herdctl/web/chat-history/<agentName>/<sessionId>.json
  • Each agent can have multiple concurrent chat sessions
  • Sessions are visible to all connected browsers (no per-user scoping)
  • Session IDs are server-generated UUIDs
  • Sessions expire after session_expiry_hours (default: 24 hours)

There is no concept of user identity in the web API. Any browser can see, continue, or delete any session. This design reflects the typical use case: a single operator (or small team) using the dashboard on localhost.

When a user sends a chat message, the flow differs depending on whether they use the REST or WebSocket interface:

REST path (POST /api/chat/:agentName/sessions/:sessionId/messages):

  1. Validate session exists
  2. Call WebChatManager.sendMessage() with a chunk collector callback
  3. Wait for the agent to complete its response
  4. Return the full accumulated response

WebSocket path (chat:send message):

  1. Validate session exists via WebChatManager
  2. Call WebChatManager.sendMessage() with streaming callbacks
  3. Stream chat:response chunks, chat:tool_call results, and chat:message_boundary signals back to the requesting client in real time
  4. Send chat:complete when finished (or chat:error on failure)

Both paths use the same underlying WebChatManager.sendMessage() method, which triggers a FleetManager job with triggerType: "web" and streams the agent’s response via SDK message callbacks. The @herdctl/chat package’s ChatSessionManager handles SDK session tracking for conversation continuity across multiple messages.
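
The difference between the two paths comes down to what the chunk callback does. In the sketch below, `fakeSendMessage` is a stand-in for WebChatManager.sendMessage(), whose real signature is not shown in this document; `sendAndCollect` and `sendAndStream` are hypothetical names, and the frame payloads are trimmed to the fields needed for the illustration.

```typescript
type ChunkHandler = (chunk: string) => void;

// Stand-in for WebChatManager.sendMessage(): emits three fixed chunks.
async function fakeSendMessage(message: string, onChunk: ChunkHandler): Promise<void> {
  for (const chunk of ["You asked: ", message, " (done)"]) {
    await Promise.resolve(); // simulate chunks arriving asynchronously
    onChunk(chunk);
  }
}

// REST path: buffer every chunk and return the full text once the agent is done.
async function sendAndCollect(message: string): Promise<string> {
  const chunks: string[] = [];
  await fakeSendMessage(message, (chunk) => {
    chunks.push(chunk);
  });
  return chunks.join("");
}

// WebSocket path: forward each chunk as a frame, then signal completion or failure.
// The documented payloads also carry agentName and jobId, omitted here.
async function sendAndStream(
  sessionId: string,
  message: string,
  send: (frame: string) => void,
): Promise<void> {
  try {
    await fakeSendMessage(message, (chunk) =>
      send(JSON.stringify({ type: "chat:response", payload: { sessionId, chunk } })),
    );
    send(JSON.stringify({ type: "chat:complete", payload: { sessionId } }));
  } catch (err) {
    send(JSON.stringify({ type: "chat:error", payload: { sessionId, error: String(err) } }));
  }
}

sendAndCollect("hi").then((text) => console.log(text));
sendAndStream("session-1", "hi", (frame) => console.log(frame));
```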

The Fastify server doubles as a static file server for the React SPA. In production, Vite builds the client to dist/client/, and @fastify/static serves these files from the root path /.

A custom setNotFoundHandler implements SPA fallback routing:

  • Requests to /api/*, /ws, or /assets/* that don’t match a route return 404
  • All other requests serve index.html, allowing React Router to handle client-side routing
  • If the client build doesn’t exist (e.g., development mode), the server returns 503 with a message to run pnpm build:client
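
The fallback rules above amount to a three-way decision, sketched here as a pure function. `resolveNotFound` and `FallbackResult` are hypothetical names, the error messages are illustrative, and checking the reserved prefixes before the build check is an assumption about ordering.

```typescript
type FallbackResult =
  | { statusCode: 404; body: { error: string; statusCode: 404 } }
  | { statusCode: 503; body: { error: string; statusCode: 503 } }
  | { statusCode: 200; file: "index.html" };

// Decides what the not-found handler should do for an unmatched request path.
function resolveNotFound(path: string, clientBuildExists: boolean): FallbackResult {
  const reserved =
    path === "/ws" || path.startsWith("/api/") || path.startsWith("/assets/");

  if (reserved) {
    // API, WebSocket, and asset paths never fall through to the SPA.
    return { statusCode: 404, body: { error: `Not found: ${path}`, statusCode: 404 } };
  }
  if (!clientBuildExists) {
    return {
      statusCode: 503,
      body: { error: "Client build not available. Run pnpm build:client.", statusCode: 503 },
    };
  }
  // Any other path is a client-side route: let React Router handle it.
  return { statusCode: 200, file: "index.html" };
}

console.log(resolveNotFound("/api/nope", true));
console.log(resolveNotFound("/agents/security-auditor", true));
console.log(resolveNotFound("/agents/security-auditor", false));
```
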

| Aspect | Development | Production |
| --- | --- | --- |
| Frontend | Vite dev server on port 5173 with HMR | Pre-built static files served by Fastify |
| API server | Fastify on configured port | Fastify on configured port |
| Proxy | Vite proxies /api/* and /ws/* to Fastify | Everything on a single port |
| CORS | Required (cross-origin between Vite and Fastify) | Not needed (same origin) |
| npm package | Server code only | Server code + pre-built SPA assets |

  • System Architecture — Overall system design, FleetManager composition, event system
  • Web Dashboard — React frontend architecture, UI components, state management
  • Chat Infrastructure — Shared chat layer used by WebChatManager
  • Job Lifecycle — Job creation, status transitions, output streaming
  • Schedule System — Polling loop, interval/cron parsing, trigger mechanics
  • CLI — The other thin client over FleetManager