Read the Blast Radius Before You Let an Agent Run Code

Picture an agent mid-task: it has decided the best way to answer your question is to write a Python script, run it, and return the result. That is a genuinely useful capability. It is also the moment where a mistaken line of code - or a subtly adversarial prompt - could interact with your filesystem, your network, or your secrets. The agent does not know the difference between "delete temp files" and "delete everything." It just executes.

Among the most powerful and dangerous capabilities an agent can possess is the ability to write and execute code. A vulnerability here can easily escalate to a full remote code execution attack on the host system. That is not hypothetical. It is the reason the infrastructure that runs AI-generated code has its own name: a sandbox.

What a sandbox actually is

A sandbox is an isolated execution environment. The code runs inside it; the host system sits outside. If the code does something destructive, the damage stays contained.

Running LLM-generated code directly with Python's exec() or eval() executes it within the same process as your main application, offering no isolation at all. A subprocess call is marginally better - separate process, at least - but a subprocess does not constitute a full sandbox; the new process still inherits permissions and access rights that might be too broad.
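To make the difference concrete, here is a minimal sketch. The `generated` string and the `config` dict are hypothetical stand-ins for model output and application state. It shows that exec() lets the snippet mutate live objects, while a subprocess protects the parent's memory but nothing else.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a line of code an LLM might generate.
generated = "config['api_key'] = 'overwritten'"

# exec() runs inside this process, so the snippet can mutate our live objects.
config = {"api_key": "secret"}
exec(generated, {"config": config})
print(config["api_key"])  # overwritten

# A subprocess has its own interpreter and memory, so our dict is untouched.
# It still runs as the same user, with the same filesystem and network access.
config = {"api_key": "secret"}
result = subprocess.run(
    [sys.executable, "-I", "-c", "config = {}\n" + generated],
    capture_output=True,
    text=True,
    timeout=5,  # stop runaway scripts
    env={"PATH": os.environ.get("PATH", "")},  # do not forward our other env vars
)
print(config["api_key"])  # secret
```

The timeout and the trimmed environment narrow what the child can do, but they are mitigations, not isolation: the child can still read any file the parent's user can read.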

Docker containers get used heavily because they are familiar and fast. Container isolation creates a separate namespace for the code, complete with its own filesystem, network stack, and process space, providing a complete runtime environment while maintaining strict boundaries around what the code can access. But containers share the host kernel. If there is a kernel-level exploit in the code being run, a container is not a hard wall.

Why microVMs became the answer

The industry largely settled on a stronger primitive: the microVM. The one you will encounter most often is Firecracker, which AWS built and open-sourced in 2018.

Firecracker is an open source virtualization technology purpose-built for creating and managing secure, multi-tenant container and function-based services. It deploys workloads in lightweight virtual machines - microVMs - which provide enhanced security and workload isolation over traditional VMs, while enabling the speed and resource efficiency of containers.

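The key architectural move: Docker containers share the host kernel, providing process-level isolation. Firecracker microVMs each have their own kernel, providing hardware-level isolation. An exploit against the guest kernel compromises only that guest, because the guest kernel is not the host kernel; to reach the host, an attacker would additionally have to break out through the hypervisor, which exposes a far smaller attack surface.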

Firecracker excludes unnecessary devices and guest functionality to reduce the memory footprint and attack surface area of each microVM - which improves security, decreases startup time, and increases hardware utilization. There is no BIOS to emulate, no USB stack, no audio device. Serverless workloads simply do not need hardware features like USB, displays, speakers, and microphones, so they are not implemented.

The result is a VM that starts fast. Firecracker initiates user space code in as little as 125 ms and supports microVM creation rates of up to 150 microVMs per second per host.

Each microVM runs with a memory overhead of less than 5 MiB, enabling a high density of microVMs to be packed on each server.

What actually happens when an agent runs code

Here is the sequence, made concrete. Say an agent is asked to analyze a CSV and compute some statistics. It writes a twenty-line Python script. The execution layer then:

1. Requests a fresh microVM from a pre-warmed pool
2. Copies the generated code into the VM's isolated filesystem
3. Executes the code inside that VM
4. Captures stdout, stderr, and any output files
5. Destroys the VM

This ensures that even if the agent is tricked into generating malicious code - code that attempts to delete files or access the network - the blast radius is confined entirely to the temporary microVM. The host system's filesystem, network, and processes remain unaffected.
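The five-step sequence above can be sketched as control flow. This is an illustration under a loud assumption: a temporary directory plus a subprocess stands in for the microVM, so the sketch provides no real isolation. The names `run_once` and `ExecutionResult` are hypothetical; in a real deployment each step would be a call to your sandbox provider.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ExecutionResult:
    stdout: str
    stderr: str
    exit_code: int
    files: dict[str, bytes] = field(default_factory=dict)


def run_once(code: str, timeout_s: float = 10.0) -> ExecutionResult:
    """Mimic the acquire / copy / execute / capture / destroy lifecycle.

    ASSUMPTION: a temp directory and a subprocess stand in for the microVM.
    This shows the shape of the flow only; it is not a sandbox.
    """
    # 1. "Request a fresh VM": here, a fresh scratch directory.
    with tempfile.TemporaryDirectory() as workdir:
        # 2. Copy the generated code in.
        script = Path(workdir) / "main.py"
        script.write_text(code)
        # 3. Execute it.
        proc = subprocess.run(
            [sys.executable, "-I", str(script)],
            cwd=workdir, capture_output=True, text=True, timeout=timeout_s,
        )
        # 4. Capture stdout, stderr, and any output files.
        files = {
            p.name: p.read_bytes()
            for p in Path(workdir).iterdir()
            if p.is_file() and p != script
        }
        result = ExecutionResult(proc.stdout, proc.stderr, proc.returncode, files)
    # 5. "Destroy the VM": the directory is deleted when the with-block exits.
    return result
```

The property worth copying is that nothing survives step 5 except the explicitly captured result.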

The pre-warming detail matters for latency. E2B uses pre-warmed microVM pools and VM snapshots to achieve restoration and provisioning times of roughly 150 ms: it boots a microVM to a fully initialized state, takes a full snapshot, then serves each incoming request by restoring from that snapshot. From the user's perspective in a chat interface, that is imperceptible.
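The pooling idea itself is small enough to sketch. Everything here is hypothetical: `WarmPool` is not any provider's API, and `boot` is an assumed callable standing in for "restore a microVM from a snapshot".

```python
from collections import deque
from itertools import count


class WarmPool:
    """Keeps a fixed number of ready sandboxes so requests skip the boot step.

    ASSUMPTION: `boot` is a hypothetical callable that returns a ready
    sandbox handle; here it just returns a string label.
    """

    def __init__(self, boot, size):
        self._boot = boot
        self._size = size
        self._ready = deque(boot() for _ in range(size))

    def acquire(self):
        # Hand out a ready sandbox if one exists, else pay the boot cost now.
        return self._ready.popleft() if self._ready else self._boot()

    def refill(self):
        # A real system would do this in the background, off the request path.
        while len(self._ready) < self._size:
            self._ready.append(self._boot())


ids = count(1)
pool = WarmPool(lambda: f"vm-{next(ids)}", size=2)
first = pool.acquire()   # served from the pool, no boot on the request path
pool.refill()            # top the pool back up for the next request
```

Sandboxes are handed out once and never returned to the pool, which matches the destroy-after-use lifecycle: reuse would leak state between executions.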

With a pre-warmed pool, running code in a microVM need not be noticeably slower than running it in a container. The isolation simply comes from a minimal hypervisor rather than from shared kernel namespaces.

The multi-turn problem

Single execution is the easy case. Real agent workflows run across multiple turns. The agent installs a library, writes a file, calls a function, then in the next turn wants to reference what it wrote.

Multi-turn agent sessions need filesystem state that persists across turns. An agent working on a Python project across ten turns has installed packages, written files, and accumulated intermediate outputs. Full sandbox re-initialization on every turn wastes 200-500ms on environment setup.

Firecracker's snapshot-restore mechanism lets you pause a sandbox, preserve its memory and filesystem state, and resume it in 5-30ms - the operational foundation for agent harnesses where context accumulates over dozens of tool calls.

This is where a well-built agent system behaves differently from a naive one. Naively, each tool call gets a clean environment. That is safe but stateless. With snapshot-restore, you get isolation per execution and continuity of working state. Both, not a trade-off.
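A rough sketch of the continuity half of that claim follows, with a loud assumption: a persistent temporary directory stands in for the filesystem state a paused sandbox would preserve. It is not isolation, and unlike a real snapshot it does not preserve memory. The `Session` class is hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class Session:
    """One working directory reused across turns.

    ASSUMPTION: the directory stands in for a sandbox's preserved
    filesystem; each turn still runs in a fresh process.
    """

    def __init__(self):
        self._tmp = tempfile.TemporaryDirectory()
        self.workdir = Path(self._tmp.name)

    def run_turn(self, code: str) -> str:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],
            cwd=self.workdir, capture_output=True, text=True, timeout=10,
        )
        return proc.stdout

    def close(self):
        # End of session: discard all accumulated state.
        self._tmp.cleanup()


session = Session()
session.run_turn("open('notes.txt', 'w').write('mean=4.2')")  # turn 1 writes a file
print(session.run_turn("print(open('notes.txt').read())"))    # turn 2 reads it back
session.close()
```

State lives for the length of the session and no longer, which is the boundary a multi-turn harness needs to keep explicit.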

What the layers look like together

For teams shipping code-execution agents today, the stack usually looks like this. The orchestration layer (whatever manages the agent loop) calls out to a sandbox API. The sandbox provider runs the code in an isolated environment. E2B, for example, is an open-source sandbox platform focused on secure code execution for AI agents: its sandboxes run on Firecracker microVMs, and the platform offers a Code Interpreter SDK for running AI-generated code through a Jupyter-based environment. The sandbox returns stdout, stderr, and any artifacts. The orchestration layer passes those back to the model as the next turn's context.
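The hand-off between those layers might look something like the sketch below. All names are hypothetical: `SandboxResult`, `to_tool_message`, and `agent_step` are illustrations, `run_in_sandbox` is an assumed callable wrapping whatever sandbox API you use, and the `{"role": "tool", ...}` message shape is an assumption rather than any particular model API.

```python
from dataclasses import dataclass, field


@dataclass
class SandboxResult:
    stdout: str
    stderr: str
    artifacts: list[str] = field(default_factory=list)


def to_tool_message(result: SandboxResult, max_chars: int = 2000) -> dict:
    """Format a sandbox result as context for the model's next turn."""
    # Truncate output so one noisy script cannot flood the context window.
    parts = [f"stdout:\n{result.stdout[:max_chars]}"]
    if result.stderr:
        parts.append(f"stderr:\n{result.stderr[:max_chars]}")
    if result.artifacts:
        parts.append("artifacts: " + ", ".join(result.artifacts))
    return {"role": "tool", "content": "\n\n".join(parts)}


def agent_step(messages: list[dict], code: str, run_in_sandbox) -> list[dict]:
    """One loop iteration: execute the model's code, append the result."""
    result = run_in_sandbox(code)
    return messages + [to_tool_message(result)]


# Usage with a fake sandbox, so the sketch runs without any provider.
fake_sandbox = lambda code: SandboxResult(stdout="4\n", stderr="", artifacts=["plot.png"])
history = agent_step([], "print(2 + 2)", fake_sandbox)
print(history[-1]["content"])
```

Passing the sandbox in as a callable keeps the orchestration loop independent of the provider, so the execution layer can be swapped or tested without touching the agent logic.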

LLM-generated code often appears legitimate and functional, making traditional security scanning less effective. That is precisely why the security model here is architectural rather than content-based - you do not inspect the code and guess whether it is safe; you run it somewhere that cannot hurt you regardless.

A teammate like Beagle, living in Slack, may eventually hand off to an agent that needs to run code against a data export or a script a teammate shared. When that moment comes, the question is not whether the model is trustworthy - it is whether the execution layer is. That question has a tractable answer, and now you know what to look for when someone says their agent "sandboxes" execution.
