Background Agents

Let agents ship code while your team sleeps.

AI coding agents are here. The question is where they actually run.

The Execution Problem

AI coding agents (Claude Code, Codex, Copilot, and others) can already fix bugs, write tests, and refactor code. The hard part is where they execute.

Problems with current approaches:

On developer laptops: a developer has to start the agent, and it consumes local resources and blocks the developer while it runs
In CI: runners are ephemeral, so every job repeats the full setup (Docker pulls, dependency installation, database seeding), which takes 10-20 minutes; agents fail when the job hits its time limit

Statistic: "40% of enterprise applications will include task-specific AI agents by end of 2026" (Gartner, 2025)

Teams currently work around this with overnight terminal sessions, agents deployed to staging environments, and extended CI timeouts.

Fork, Do Not Rebuild

Start with a pre-configured machine containing the codebase, dependencies, and running services. Fork this machine for each agent task rather than rebuilding from scratch.

$ ssh boxd.sh fork dev-env --name=agent-fix-42
forked dev-env → agent-fix-42 (agent-fix-42.boxd.sh)

$ ssh boxd.sh exec agent-fix-42 -- claude \
  -p "Fix the auth timeout bug in issue #42. Run tests when done."

Key advantages:

Forks complete in under 200 milliseconds
Forks use copy-on-write at the disk level, which keeps them fast and storage-efficient
Each fork skips the 10-20 minute setup overhead
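
To put the figures above side by side, here is a rough arithmetic sketch. The 10-20 minute setup time and the 200 millisecond fork time come from the text; the workload of 50 agent tasks is a hypothetical number chosen only for illustration.

```shell
#!/bin/sh
# Rough arithmetic on setup overhead, using the figures quoted above.
# TASKS is a hypothetical workload, not a number from the source.
TASKS=50

SETUP_MIN_LOW=10     # minutes of setup per rebuilt environment (low end, from the text)
SETUP_MIN_HIGH=20    # minutes of setup per rebuilt environment (high end, from the text)
FORK_MS=200          # upper bound per fork in milliseconds (from the text)

# Total setup time if every task rebuilds its environment from scratch.
rebuild_low=$((TASKS * SETUP_MIN_LOW))
rebuild_high=$((TASKS * SETUP_MIN_HIGH))

# Total setup time if every task forks a ready machine instead.
fork_total_s=$((TASKS * FORK_MS / 1000))

echo "rebuild from scratch: ${rebuild_low}-${rebuild_high} minutes of setup"
echo "fork a ready machine: under ${fork_total_s} seconds of setup"
```

With these assumed inputs, rebuilding costs 500-1000 minutes of setup across the 50 tasks, while forking costs under 10 seconds in total.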

No Time Limits

Complex work (bug fixes taking 30 minutes to 3 hours, module refactors taking full days) requires infrastructure without session caps or idle timeouts.

Features of boxd machines:

Run as long as needed
Sleep at zero cost when idle
Enable broader task delegation
Allow long-running monitoring and exploration tasks

Actions, Not Codespaces

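Key observation: GitHub runs its own Copilot coding agent on GitHub Actions, not GitHub Codespaces. Agents need unattended server-side compute rather than an interactive editor session, and because Actions runners are ephemeral, long agent tasks still call for infrastructure that persists.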

The distinction:

Ephemeral runners: designed for short build-and-test jobs
Agent work: exploratory, iterative, requires continuous read-think-edit-test cycles

Parallel Exploration

Agents can work on multiple approaches simultaneously:

$ ssh boxd.sh fork dev-env --name=approach-a
forked dev-env → approach-a (approach-a.boxd.sh)

$ ssh boxd.sh fork dev-env --name=approach-b
forked dev-env → approach-b (approach-b.boxd.sh)

$ ssh boxd.sh exec approach-a -- claude \
  -p "Fix #42 by adding retry logic to the token refresh."

$ ssh boxd.sh exec approach-b -- claude \
  -p "Fix #42 by switching to a sliding session window."

Additional capabilities:

Agents can discover each other via DNS
Safe network communication between agents
Parallel test generation across modules
Multiple architectural approaches explored simultaneously
Refactoring and integration testing in parallel
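
The two-approach example above runs one agent after the other if typed as shown. A minimal sketch of launching both at once with shell background jobs might look like this. Only the `fork` and `exec` commands come from the examples above; the `run` and `explore` helpers, the `DRY_RUN` switch, and the log file names are hypothetical. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: explore two approaches in parallel, one fork per approach.
# DRY_RUN, LOG_DIR, run, and explore are illustrative names, not part of boxd.
DRY_RUN="${DRY_RUN:-1}"      # 1 = print commands only, 0 = actually run them
LOG_DIR="${LOG_DIR:-.}"      # where each approach's output is written

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# explore NAME PROMPT: fork the base machine, then run the agent in the fork.
explore() {
  run ssh boxd.sh fork dev-env --name="$1" &&
  run ssh boxd.sh exec "$1" -- claude -p "$2"
}

# Start both approaches as background jobs and remember their process IDs.
explore approach-a "Fix #42 by adding retry logic to the token refresh." \
  > "$LOG_DIR/approach-a.log" 2>&1 &
pid_a=$!
explore approach-b "Fix #42 by switching to a sliding session window." \
  > "$LOG_DIR/approach-b.log" 2>&1 &
pid_b=$!

# wait PID returns that job's exit status, so each approach is checked separately.
wait "$pid_a"; status_a=$?
wait "$pid_b"; status_b=$?
echo "approach-a exit=$status_a, approach-b exit=$status_b"
```

Setting `DRY_RUN=0` would execute the commands for real. Each approach writes to its own log, so the two results can be compared once both jobs finish.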

The Full Workflow

Three-step lifecycle:

1. Trigger: fork a machine when an issue is assigned

$ ssh boxd.sh fork dev-env --name=agent-fix-42
forked dev-env → agent-fix-42 (agent-fix-42.boxd.sh)

2. Agent executes (unattended, no time limit)

$ ssh boxd.sh exec agent-fix-42 -- claude \
  -p "Fix the auth timeout bug in issue #42. Run the test suite when done."

...3 hours later...

✓ Found root cause in session middleware
✓ Fixed token refresh logic
✓ All 847 tests passing
✓ PR #43 ready for review

3. Machine sleeps at $0/hour until next task

Machine specifications:

2 vCPUs
8 GB RAM
100 GB disk
Docker support
Full network access
Developer tools included

Integration options:

GitHub webhooks
Slack messages
Cron jobs
Manual commands
CLI over SSH (no SDK required)
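
Because the interface is a CLI over SSH, every trigger listed above can funnel into the same small script. The sketch below is one way that entry point might look. The `fork` and `exec` commands mirror the examples above; the `start_agent` function, the `agent-fix-<number>` naming scheme, the generic prompt, and the `DRY_RUN` switch are assumptions made for illustration. How the issue number is extracted from a GitHub or Slack payload is outside this sketch.

```shell
#!/bin/sh
# Sketch of a shared entry point for webhooks, cron jobs, or manual runs.
# start_agent, run, DRY_RUN, and BASE are hypothetical names, not part of boxd.
DRY_RUN="${DRY_RUN:-1}"      # 1 = print commands only, 0 = actually run them
BASE="${BASE:-dev-env}"      # base machine name, as in the examples above

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# start_agent ISSUE_NUMBER: validate the input, fork a machine, run the agent.
start_agent() {
  issue="$1"
  case "$issue" in
    ''|*[!0-9]*)
      # Reject empty or non-numeric input before it reaches a machine name.
      echo "usage: start_agent ISSUE_NUMBER" >&2
      return 2
      ;;
  esac
  name="agent-fix-$issue"
  run ssh boxd.sh fork "$BASE" --name="$name" || return 1
  run ssh boxd.sh exec "$name" -- claude \
    -p "Fix the bug in issue #$issue. Run the test suite when done."
}

# Use the first script argument as the issue number (42 is a demo default).
start_agent "${1:-42}"
```

Validating the issue number first keeps untrusted trigger input, such as a chat message, from ending up in a machine name or command line.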

Scale

Market projection: "$139B projected agentic AI infrastructure market by 2034, growing at 40.5% CAGR" (Grand View Research, 2025)

Scaling advantages:

Each agent gets its own isolated machine
Cost scales with actual compute usage, not allocated machines
No warm runner pools or idle Kubernetes nodes
One base image to maintain; every new fork inherits its updates

Give Agents Computers

Agents need real computers with full codebases, tools, and test suites; execution without time limits; room to explore in parallel; and zero idle cost.

The pattern: "Fork a ready machine, run the agent, sleep when done"