Agents Need Sandboxes

AI agents executing code need isolated, capability-scoped environments rather than running directly on user machines or in shared infrastructure.

The Assumption

As AI agents become more autonomous—writing code, making API calls, managing files—they need sandboxed execution environments. Running agent code directly on user machines or in shared infrastructure creates unacceptable security and reliability risk.

This is not obvious. Many agent systems today (Claude Code, Cursor, Devin) execute code directly on user machines. We’re betting that isolation becomes non-negotiable as agents:

  • Handle more sensitive data (credentials, customer information)
  • Execute longer-running autonomous workflows
  • Operate in regulated enterprise environments
  • Make decisions with real financial consequences

The analogy is browser sandboxing: early browsers ran code with full system access. Security incidents made sandboxing mandatory. We expect the same pattern for agent execution.
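To make "capability-scoped" concrete in the smallest possible form, here is a sketch (ours, not any particular product's implementation) contrasting in-process execution with running agent code in a child interpreter that inherits nothing from the host. Real sandboxes add a container or microVM boundary, filesystem and network policy, and resource limits; this only shows the credential-scoping idea.

```python
import os
import subprocess
import sys

def run_isolated(code: str, timeout: float = 5.0) -> str:
    """Run untrusted agent code in a child interpreter with a scoped-down
    capability set: an empty environment (no inherited credentials),
    isolated mode (-I ignores PYTHONPATH and user site-packages), and a
    hard wall-clock timeout. A sketch only; production sandboxes add a
    container/VM boundary plus filesystem and network policy."""
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,  # kills runaway autonomous workflows
        env={},           # host credentials are simply not there
    )
    return result.stdout

# Pretend the agent host holds a credential (the "sensitive data" case above):
os.environ["FAKE_API_KEY"] = "hunter2"

# Direct execution (exec/eval in-process) would see the key; the child cannot.
print(run_isolated("import os; print(os.environ.get('FAKE_API_KEY'))"))
# → None
```

The empty-environment trick is POSIX-flavoured (on Windows the child interpreter needs at least `SystemRoot` to start), which is itself a hint at why real sandbox products wrap this in a proper container or VM boundary.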

Evidence

Supporting signals:

  • Security researchers documenting prompt injection → code execution chains
  • Enterprise security teams asking about isolation before agent deployment
  • Anthropic shipping sandboxing support for Claude Code
  • AWS, GCP, Azure launching “agent sandbox” products (2024-2025)
  • E2B raised funding specifically for “sandboxes for AI agents”
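The first signal above, prompt injection chaining into code execution, is easy to reproduce in miniature. The sketch below is a deliberately toy agent step (hypothetical names throughout; it mirrors no real framework's API): untrusted text returned by a tool flows straight into an `exec` sink, which is exactly the chain researchers document.

```python
def fetch_webpage() -> str:
    # Stands in for any tool call that returns attacker-controlled text.
    return "log.append('attacker ran code on the host')"

log: list[str] = []

def naive_agent_step() -> None:
    page = fetch_webpage()
    exec(page, {"log": log})  # the vulnerable sink: content becomes code

naive_agent_step()
print(log)  # → ['attacker ran code on the host']
```

Isolation does not prevent the injection itself, but it bounds the blast radius: the same sink inside a capability-scoped sandbox can read no credentials and touch no host files.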

Market indicators:

  • Modal, Fly.io, Cloudflare positioning around agent workloads
  • “Agent security” emerging as a conference track topic
  • Investor decks increasingly mention agent isolation as a category

Counter-Evidence

What would prove this wrong:

  • Major agent frameworks (LangChain, CrewAI) ship production systems without isolation and see no incidents
  • Enterprises deploy agents to production without sandbox requirements
  • Security incidents don’t materialise despite widespread agent adoption
  • Users consistently choose convenience over isolation

Current counter-signals:

  • Many developers run Claude Code without sandboxing today
  • Consumer users don’t seem concerned about agent permissions
  • No high-profile agent security incidents yet (attack surface may be theoretical)

Impact If Wrong

Products affected: SmartBoxes (entirely), Nomos Cloud (partially through enterprise requirements)

Revenue at risk: £17K MRR in Year 1 (SmartBoxes), plus downstream enterprise deals

Strategic impact: If agents don’t need sandboxes, our core infrastructure bet fails. We’d need to pivot to a different capability—perhaps pure observability/audit trails rather than isolation. The execution platform becomes a commodity rather than a defensible moat.

Runway impact: 6+ months of focused development potentially wasted if we discover this late.

Testing Plan

Minimum viable test:

  1. Customer discovery: 10 interviews with AI developers specifically about isolation concerns
  2. Competitive watch: Track sandbox-focused startups, launches, and funding
  3. Incident monitoring: Set up alerts for security incidents in agent systems

Signals to watch:

  • Do enterprise security questionnaires ask about agent isolation?
  • Do developers cite sandboxing as a blocker for production deployment?
  • Do competitors win deals by offering better isolation?

Timeline: 3 months to initial validation signal

Kill criteria: If 0/10 developers cite isolation as a concern AND no competitors emerge in the space AND no incidents occur, revisit this assumption at month 6.

This is a foundation assumption — it doesn’t depend on others, but many assumptions build on it.


How To Test

Customer discovery interviews with AI developers; analysis of agent failure modes in production systems.

Validation Criteria

This assumption is validated if:

  • 5+ enterprise teams cite isolation as a blocker
  • Security incidents in agent systems make news
  • Competitors emerge in sandbox space

Invalidation Criteria

This assumption is invalidated if:

  • Major agent frameworks ship without isolation
  • No security incidents despite widespread agent deployment
  • Enterprises remain comfortable with direct execution
