LangSmith Sandboxes Bring Secure Code Execution to AI Agents

LangSmith's new sandbox feature lets AI agents run code in isolated containers, keeping untrusted, agent-generated code away from your infrastructure while preserving full execution capabilities.

AI agents that write and execute code are becoming essential tools for developers, but security remains the critical blocker. LangChain's new LangSmith Sandboxes solve this problem by providing isolated, secure environments where agents can run code without risking your infrastructure. This launch signals a maturation in the AI agent ecosystem, moving from experimental demos to production-ready tools with enterprise-grade security.

The Security Problem with Coding Agents

Coding agents like Cursor, Claude Code, and OpenClaw have demonstrated how powerful it is to give AI the ability to write and run code. Agents can analyze data, call APIs, and even build applications from scratch. But this power comes with significant risk.

Without proper isolation, agents can execute destructive or malicious actions on your local environment. Traditional containers were designed to run known, vetted application code. Agent-generated code is fundamentally different: it's untrusted and unpredictable.

A web server handles a known set of operations. An agent might attempt anything, including malicious commands. Building secure code execution yourself usually means spinning up containers, locking down network access, piping output back to your agent, and tearing everything down when complete. Then you need to handle resource limits, because agents running code can rapidly consume CPU, memory, and disk if left unconstrained.
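
Two of those DIY concerns, capturing output and bounding runaway execution, can be sketched with nothing but the standard library. This is a minimal illustration of the problem, not real isolation: it bounds wall-clock time only, and the filesystem, network, and memory controls the paragraph above describes are exactly the extra work left to you.

```python
import subprocess
import sys

def run_untrusted(code: str, timeout: float = 5.0) -> str:
    """Run a snippet in a separate interpreter process with a hard timeout.

    This only bounds wall-clock time; filesystem, network, and memory
    isolation are the additional work the DIY approach requires.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.stdout
    except subprocess.TimeoutExpired:
        return "<killed: exceeded time limit>"

print(run_untrusted("print(2 + 2)"))                   # well-behaved snippet
print(run_untrusted("while True: pass", timeout=1.0))  # runaway loop is killed
```

Even this toy version shows why teardown logic matters: `subprocess.run` kills the child on timeout, but a real setup must also reclaim containers, volumes, and network rules.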

What LangSmith Sandboxes Provide

LangSmith Sandboxes are ephemeral, locked-down environments for running untrusted code at scale. Agents can execute code safely while you control what they can access and how many resources they can consume.

The core promise is simple: spin up a sandbox in a single line of code with the LangSmith SDK. Add your API key, pull in the SDK, and you're operational. LangChain has been using Sandboxes internally to power projects like Open SWE, and now they're making the same primitives available to all developers.

Runtime Configuration

Sandboxes support bring-your-own Docker images. Use LangChain's defaults or point to your own private registry. Start every sandbox with exactly the filesystem and tooling you need.

Sandbox Templates let you define an image, CPU, and memory configuration once, then reuse it every time. Shared access allows multiple agents to use the same sandbox, eliminating the need to transfer artifacts across isolated environments. Pooling and autoscaling pre-provision warm sandboxes so agents don't wait for cold starts.
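
The define-once, reuse-everywhere idea behind Sandbox Templates can be sketched with a plain dataclass. The field names and image URL here are illustrative, not the LangSmith SDK's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SandboxTemplate:
    """Reusable sandbox configuration: define once, reuse per launch.

    Field names are illustrative, not the real LangSmith SDK schema.
    """
    image: str
    cpus: int
    memory_mb: int

# Define the template once...
ci_template = SandboxTemplate(
    image="ghcr.io/example/ci-runner:latest",  # hypothetical image
    cpus=2,
    memory_mb=4096,
)

# ...then every launch request reuses the same configuration.
def launch_request(template: SandboxTemplate) -> dict:
    return {"image": template.image, "cpus": template.cpus,
            "memory_mb": template.memory_mb}

print(launch_request(ci_template))
```

Freezing the dataclass mirrors the point of templates: the configuration is fixed at definition time, so every sandbox launched from it is identical.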

Execution Capabilities

Agent tasks that take minutes or hours won't time out. Sandboxes support persistent commands over WebSockets, with real-time output streaming so you can see what's happening as it runs.
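
The consume-output-as-it-runs pattern can be demonstrated locally with a pipe instead of a WebSocket. This sketch reads a child process's stdout line by line as it is produced, rather than waiting for the process to exit; LangSmith does the equivalent over WebSockets.

```python
import subprocess
import sys

def stream_command(argv: list[str]) -> list[str]:
    """Collect a command's stdout line by line as it is produced.

    Each iteration blocks only until the next line arrives, not until
    the process exits, so long-running commands can be observed live.
    """
    lines = []
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
    return lines

# A child process that emits three lines with a pause between each.
child = ("import time\n"
         "for i in range(3):\n"
         "    print(f'step {i}', flush=True)\n"
         "    time.sleep(0.2)\n")
print(stream_command([sys.executable, "-c", child]))
```

Note the `flush=True` in the child: without it, output would be buffered and arrive all at once at exit, defeating real-time streaming.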

Persistent state carries across interactions: your agent can reuse the same sandbox across multiple threads without losing context, and files, installed packages, and environment state survive between runs. Tunnels expose sandbox ports to your local machine so you can preview your agent's output before deploying.

Security Architecture

The Auth Proxy ensures sandboxes access external services through an authentication layer, so secrets never touch the runtime. Credentials stay off the sandbox entirely.
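
The proxy pattern can be sketched in a few lines. In this toy version, the credential lives only on the proxy side; the untrusted code gets a callable that performs authenticated requests on its behalf and returns only the response body. The names and the fake upstream are illustrative, not LangSmith's implementation.

```python
SECRET_KEY = "sk-example"  # hypothetical credential, held by the proxy side only

def proxy_request(path: str) -> dict:
    """Stands in for the Auth Proxy: attach the credential, perform the
    outbound call, and hand back only the response body."""
    headers = {"Authorization": f"Bearer {SECRET_KEY}"}
    # A real proxy would forward this over the network; we fake the upstream.
    assert "Authorization" in headers
    return {"status": 200, "body": f"data for {path}"}

def agent_code(request) -> str:
    """Untrusted sandbox code: it can call out through the proxy,
    but the credential never appears in its scope."""
    return request("/v1/report")["body"]

print(agent_code(proxy_request))
```

The key property is that `agent_code` only ever sees response bodies; even a fully compromised sandbox has nothing to exfiltrate but data it was already allowed to read.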

Each sandbox runs in a hardware-virtualized microVM, not just Linux namespaces. This provides kernel-level isolation between sandboxes, significantly stronger than traditional container isolation.

Real-World Use Cases

Several workloads particularly benefit from sandboxed code execution. A coding assistant can run and validate its own output before responding, ensuring code actually works before presenting it to the user.

CI-style agents can clone a repository, install dependencies, and run a test suite before opening a pull request. LangChain's own Open SWE project uses this approach. Data analysis agents can execute Python scripts against datasets and return results without exposing the underlying infrastructure.
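
The CI-style flow above is a sequence of steps that stops at the first failure. A minimal sketch, using stand-in commands rather than a real clone-and-install (a real agent would run this inside a sandbox with the repository checked out):

```python
import pathlib
import subprocess
import sys
import tempfile

def run_pipeline(steps, cwd):
    """Run CI-style steps in order inside `cwd`; stop at the first failure."""
    for name, argv in steps:
        result = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
        if result.returncode != 0:
            return f"failed at: {name}"
    return "all steps passed"

with tempfile.TemporaryDirectory() as workdir:
    # Plant a trivial test file; a real agent would clone the repo instead.
    (pathlib.Path(workdir) / "test_ok.py").write_text(
        "import unittest\n"
        "class Smoke(unittest.TestCase):\n"
        "    def test_ok(self):\n"
        "        self.assertEqual(1 + 1, 2)\n")
    steps = [
        ("install deps", [sys.executable, "-c", "pass"]),  # stand-in for pip install
        ("run tests", [sys.executable, "-m", "unittest", "discover", "-s", workdir]),
    ]
    print(run_pipeline(steps, workdir))
```

An agent would gate the "open a pull request" action on the pipeline returning success, exactly as a human would gate a merge on green CI.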

Platform Integration

LangSmith Sandboxes use the same SDK and infrastructure as the rest of LangSmith. If you're already using the Python or JavaScript client for tracing or deployment, you can spin up sandboxes without adding new dependencies.

Sandboxes integrate directly with LangSmith Deployment, allowing you to attach a sandbox to an agent thread. They also integrate natively with LangChain's open source Deep Agents framework and with Open SWE.

What's Coming Next

LangChain is actively developing Sandboxes beyond the initial release. Shared volumes will give agents the ability to share state across sandboxes, allowing Agent 1 to write to a volume and Agent 2 to pick up where it left off.

Binary authorization will control which binaries can run inside a sandbox. Agents are prone to unexpected behavior like installing packages, exporting credentials, or consuming compute on unintended tasks. Binary authorization lets you restrict execution the same way you would on a managed corporate laptop.

Full execution tracing is also in development. While sandbox calls are already traced alongside agent runs, LangChain is working toward tracing everything that happens inside the virtual machine, including every process and network call. This serves as a complete audit log of what a sandbox did and when.

Availability and Access

LangSmith Sandboxes are available now in Private Preview. Developers building agents that need secure code execution can sign up for the waitlist to try it out. The framework-agnostic design means you can use Sandboxes with LangChain OSS, another framework, or no framework at all.

First-class clients are available in both Python and JavaScript through the LangSmith SDK. The Deep Agents integration allows plugging sandboxes directly into agentic workflows with minimal configuration.

FAQ

How is a microVM different from a Docker container?

MicroVMs provide hardware-level virtualization with kernel isolation, whereas Docker containers share the host kernel through Linux namespaces. This means microVMs offer stronger security boundaries. If a sandbox is compromised, the attacker remains trapped inside the microVM rather than having potential access to the host system or other containers.

Can I use my own Docker images with LangSmith Sandboxes?

Yes, LangSmith Sandboxes support bring-your-own Docker images. You can use LangChain's default images or point to your own private registry. This flexibility lets you start every sandbox with exactly the filesystem and tooling your agents need.

How does the Auth Proxy protect my secrets?

The Auth Proxy acts as an intermediary between the sandbox and external services. When your agent needs to access an API or database, requests go through the proxy rather than directly from the sandbox. This means API keys and credentials never touch the sandbox runtime, remaining isolated from potentially untrusted code execution.

Is LangSmith Sandboxes only for LangChain users?

No, LangSmith Sandboxes are framework-agnostic. While they integrate seamlessly with LangChain OSS and Deep Agents, you can use them with any framework or even without a framework. The Python and JavaScript SDKs provide first-class support regardless of what tools you're using to build your agents.

What's the difference between Private Preview and general availability?

Private Preview means LangSmith Sandboxes are available to selected users who sign up for the waitlist. This allows LangChain to gather feedback, identify issues, and refine the product before broader release. Features and pricing may change during this period based on user input.