Background Agents

Let agents ship code while your team sleeps.

AI coding agents are here. The question is where they actually run.


The Execution Problem

AI coding agents such as Claude Code, Codex, and Copilot can already fix bugs, write tests, and refactor code. The hard part is the environment they run in.

Problems with current approaches:

  • On developer laptops: a developer has to start the agent by hand, and it consumes local resources and ties up the machine while it runs
  • In CI: runners are ephemeral, so every run repeats the full setup (Docker pulls, dependency installation, database seeding), which takes 10-20 minutes; the agent fails if the job hits its time limit

Statistic: "40% of enterprise applications will include task-specific AI agents by end of 2026" (Gartner, 2025)

Teams currently use workarounds like overnight terminal sessions, staging environment deployments, and extended CI timeouts.


Fork, Do Not Rebuild

Start with a pre-configured machine containing the codebase, dependencies, and running services. Fork this machine for each agent task rather than rebuilding from scratch.

$ ssh boxd.sh fork dev-env --name=agent-fix-42
forked dev-env → agent-fix-42 (agent-fix-42.boxd.sh)

$ ssh boxd.sh exec agent-fix-42 -- claude \
  -p "Fix the auth timeout bug in issue #42. Run tests when done."

Key advantages:

  • Forks complete in under 200 milliseconds
  • Forks use copy-on-write at the disk level, which keeps them fast and storage-efficient
  • Eliminates 10-20 minute setup overhead
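
The two commands above can be wrapped in a small helper so that each issue gets its own fork. This is a minimal sketch, not an official boxd script: the function name `fork_and_run`, the `BASE_MACHINE` variable, and the `agent-fix-<issue>` naming scheme are assumptions made for illustration. Only the `fork` and `exec` invocations are taken from the examples above.

```shell
#!/bin/sh
# Sketch only. fork_and_run, BASE_MACHINE, and the agent-fix-<issue>
# naming scheme are hypothetical; the ssh commands mirror the post.
set -eu

BASE_MACHINE="${BASE_MACHINE:-dev-env}"

# fork_and_run ISSUE PROMPT
# Forks the base machine, then starts the agent inside the fork.
# The agent is not started if the fork fails.
fork_and_run() {
  issue="$1"
  prompt="$2"
  name="agent-fix-${issue}"

  ssh boxd.sh fork "$BASE_MACHINE" --name="$name" &&
    ssh boxd.sh exec "$name" -- claude -p "$prompt"
}

# Usage:
#   fork_and_run 42 "Fix the auth timeout bug in issue #42. Run tests when done."
```

Chaining the two calls with `&&` means a failed fork stops the task early instead of running the agent against a machine that does not exist.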

No Time Limits

Complex work (bug fixes taking 30 minutes to 3 hours, module refactors taking full days) requires infrastructure without session caps or idle timeouts.

Features of boxd machines:

  • Run as long as needed
  • Sleep at zero cost when idle
  • Enable broader task delegation
  • Allow long-running monitoring and exploration tasks

Actions, Not Codespaces

Key observation: GitHub runs its own Copilot coding agent on GitHub Actions, not GitHub Codespaces, indicating agent infrastructure requires persistence over ephemerality.

The distinction:

  • Ephemeral runners: designed for short build-and-test jobs
  • Agent work: exploratory, iterative, requires continuous read-think-edit-test cycles

Parallel Exploration

Agents can work on multiple approaches simultaneously:

$ ssh boxd.sh fork dev-env --name=approach-a
forked dev-env → approach-a (approach-a.boxd.sh)

$ ssh boxd.sh fork dev-env --name=approach-b
forked dev-env → approach-b (approach-b.boxd.sh)

$ ssh boxd.sh exec approach-a -- claude \
  -p "Fix #42 by adding retry logic to the token refresh."

$ ssh boxd.sh exec approach-b -- claude \
  -p "Fix #42 by switching to a sliding session window."

Additional capabilities:

  • Agents can discover each other via DNS
  • Safe network communication between agents
  • Parallel test generation across modules
  • Multiple architectural approaches explored simultaneously
  • Refactoring and integration testing in parallel
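
The two-approach example above runs its commands one after another in a terminal. A script could start both agents at once and wait for each to finish. The following is a rough sketch under stated assumptions: the helper names `explore` and `run_parallel` are hypothetical, and it relies only on the `fork` and `exec` commands shown above plus standard shell job control.

```shell
#!/bin/sh
# Sketch only. explore and run_parallel are hypothetical helper names;
# the ssh commands mirror the examples in the post.
set -eu

BASE_MACHINE="${BASE_MACHINE:-dev-env}"

# explore NAME PROMPT
# Forks the base machine and runs one agent in the fork.
# Runs in a subshell so parallel invocations share no variables.
explore() (
  ssh boxd.sh fork "$BASE_MACHINE" --name="$1" &&
    ssh boxd.sh exec "$1" -- claude -p "$2"
)

# run_parallel
# Starts both approaches in the background, then waits for each one.
# Returns non-zero if either approach failed.
run_parallel() {
  explore approach-a "Fix #42 by adding retry logic to the token refresh." &
  pid_a=$!
  explore approach-b "Fix #42 by switching to a sliding session window." &
  pid_b=$!

  status=0
  wait "$pid_a" || status=1
  wait "$pid_b" || status=1
  return "$status"
}

# Usage:
#   run_parallel && echo "both approaches finished"
```

Waiting on each process ID separately, rather than a bare `wait`, lets the script report a failure from either approach.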

The Full Workflow

Three-step lifecycle:

1. Trigger: fork the machine when an issue is assigned

$ ssh boxd.sh fork dev-env --name=agent-fix-42
forked dev-env → agent-fix-42 (agent-fix-42.boxd.sh)

2. Agent executes (unattended, no time limit)

$ ssh boxd.sh exec agent-fix-42 -- claude \
  -p "Fix the auth timeout bug in issue #42. Run the test suite when done."

...3 hours later...

✓ Found root cause in session middleware
✓ Fixed token refresh logic
✓ All 847 tests passing
✓ PR #43 ready for review

3. Machine sleeps at $0/hour until next task

Machine specifications:

  • 2 vCPUs
  • 8 GB RAM
  • 100 GB disk
  • Docker support
  • Full network access
  • Developer tools included

Integration options:

  • GitHub webhooks
  • Slack messages
  • Cron jobs
  • Manual commands
  • CLI over SSH (no SDK required)
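
Because the interface is a CLI over SSH, every trigger listed above can call the same small script. The sketch below shows one possible entry point. The script name `agent-task.sh`, the functions `validate_issue` and `start_task`, and the digits-only check are assumptions for illustration; boxd's own rules for machine names are not described here.

```shell
#!/bin/sh
# agent-task.sh (hypothetical name): a single entry point that a webhook
# handler, a Slack bot, a cron job, or a person at a terminal could call.
set -eu

# validate_issue N
# Succeeds only if N is a non-empty string of digits. Triggers pass along
# external input, so it is checked before being used in a machine name.
validate_issue() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# start_task ISSUE PROMPT
# Forks dev-env for the issue and runs the agent in the fork.
start_task() {
  if ! validate_issue "$1"; then
    echo "usage: agent-task.sh ISSUE_NUMBER PROMPT" >&2
    return 2
  fi
  name="agent-fix-$1"
  ssh boxd.sh fork dev-env --name="$name" &&
    ssh boxd.sh exec "$name" -- claude -p "$2"
}

# Run only when arguments are given, so the file can also be sourced.
if [ "$#" -ge 2 ]; then
  start_task "$1" "$2"
fi
```

A cron job, for example, could call it with a line such as `0 2 * * * /path/to/agent-task.sh 42 "Fix the auth timeout bug in issue #42."`, where the path and schedule are placeholders.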

Scale

Market projection: "$139B projected agentic AI infrastructure market by 2034, growing at 40.5% CAGR" (Grand View Research, 2025)

Scaling advantages:

  • Each agent gets its own isolated machine
  • Cost scales with actual compute usage, not with the number of machines allocated
  • No warm runner pools or idle Kubernetes nodes to pay for
  • A single base image is maintained in one place; forks inherit its updates

Give Agents Computers

Agents need real computers with full codebases, tools, and test suites; time-unlimited execution; parallel exploration capabilities; and zero idle costs.

The pattern: "Fork a ready machine, run the agent, sleep when done"