Running Untrusted Agent Code Without a Sandbox
Explore the risks and strategies for executing untrusted AI agent code without sandboxing, including isolation techniques, monitoring, and practical safeguards for production systems.
Running untrusted agent code (code from external or user-supplied AI agents) is a growing challenge in modern AI deployments. Heavier isolation layers such as Docker containers, gVisor, or Firecracker provide strong isolation but add overhead, latency, and operational complexity. This article explores practical techniques for executing untrusted agent code without a full sandbox of that kind, using lightweight process-level isolation (`nsjail`, which combines Linux namespaces, seccomp-bpf, and cgroups) together with runtime monitoring. It draws on recent industry discussions, including the LangChain blog's examination of agent safety and OpenAI's ongoing work on secure execution environments. The goal is a practical, deployable approach for developers who need speed and simplicity while keeping the security trade-offs explicit.
Requirements
Before we begin, ensure your system meets the following requirements:
- **Linux-based system** (Ubuntu 22.04+ or similar): most lightweight isolation tools are Linux-native.
- **Python 3.10+**: for running agent code and our example scripts.
- **`seccomp` support**: kernel-level system call filtering (enabled by default on modern Linux).
- **`nsjail`**: a lightweight, security-focused process isolation tool for running untrusted code. Install it from your distribution's packages if it is available there, otherwise build it from source.
- **`cgroups` v2**: for resource limits (CPU, memory). Check with `mount | grep cgroup2`.
- **`apparmor` or `selinux`**: optional but recommended for additional MAC (Mandatory Access Control) enforcement.
Step-by-step Installation
We'll install and configure `nsjail` as our primary tool. `nsjail` uses Linux namespaces, seccomp-bpf, and cgroups to restrict untrusted code with minimal overhead.
1. Install nsjail
First, update your package list and install `nsjail`, if your distribution packages it:

    sudo apt-get update
    sudo apt-get install -y nsjail

Verify the installation by printing the usage text:

    nsjail --help

If the package is unavailable or you need the latest version, compile from source (instructions are available on the `nsjail` GitHub repository).
2. Install Python and dependencies
Ensure Python 3.10+ is installed:

    python3 --version

Install the required Python libraries for our example:

    pip install requests pyyaml

3. Configure seccomp profiles
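Create a custom seccomp policy to restrict system calls. `nsjail` compiles policies written in the Kafel language, so save the following as `agent_seccomp.policy`:

    // Basic seccomp policy for untrusted agent code.
    // Allow only the listed syscalls; anything else kills the process.
    // Names follow Kafel's x86_64 table (newstat/newfstat for stat/fstat).
    POLICY agent {
      ALLOW {
        read, write, open, close,
        mmap, munmap, brk, exit_group,
        clone, execve,
        newstat, newfstat, lseek, getdents64
      }
    }
    USE agent DEFAULT KILL

Because the default action is `KILL`, every syscall not on the list is blocked, including `ptrace`, `socket` (network), `mount`, and `process_vm_writev`. Treat the list as a starting point: a real CPython interpreter needs more syscalls than this just to start (for example `openat`, `mprotect`, and `rt_sigaction`), so extend it based on what your agent's runtime actually uses.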
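An allowlist maintained by hand tends to drift as it grows. As a minimal sketch, the policy text can be generated from a Python list instead. This assumes the Kafel `POLICY ... { ALLOW { ... } } USE ... DEFAULT KILL` form used in `nsjail`'s bundled examples; `render_policy` and the policy name `agent` are hypothetical helpers for illustration, not part of `nsjail`.

```python
from typing import Iterable


def render_policy(name: str, allowed: Iterable[str], default: str = "KILL") -> str:
    """Render a Kafel-style allowlist policy (syntax assumed, see the text above)."""
    # De-duplicate and sort so the generated file diffs cleanly between revisions.
    syscalls = sorted(set(allowed))
    if not syscalls:
        raise ValueError("allowlist must not be empty")
    body = ",\n".join(f"    {s}" for s in syscalls)
    return (
        f"POLICY {name} {{\n"
        f"  ALLOW {{\n{body}\n  }}\n"
        f"}}\n"
        f"USE {name} DEFAULT {default}\n"
    )


if __name__ == "__main__":
    print(render_policy("agent", ["read", "write", "close", "exit_group", "read"]))
```

Keeping the list in code makes it easy to review changes to the allowlist in version control before regenerating the policy file.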
4. Set up cgroups for resource limits
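Create a cgroup for resource isolation. The `memory` and `cpu` controllers must be enabled for child groups first:

    echo "+memory +cpu" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
    sudo mkdir -p /sys/fs/cgroup/agent_limits
    echo "104857600" | sudo tee /sys/fs/cgroup/agent_limits/memory.max    # 100 MB memory limit (value is in bytes)
    echo "50000 100000" | sudo tee /sys/fs/cgroup/agent_limits/cpu.max    # 50% of one CPU: 50 ms quota per 100 ms period

These limits prevent runaway agents from exhausting system resources. Note that `nsjail` applies the limits from its own config by creating a per-jail cgroup, so this manually created group is mainly useful for processes launched outside `nsjail`.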
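The units are easy to get wrong: `memory.max` takes a byte count, and `cpu.max` takes a quota and a period in microseconds. A small sketch of the arithmetic, with hypothetical helper names chosen for illustration:

```python
MIB = 1024 * 1024


def memory_max_bytes(mebibytes: int) -> int:
    """Value to write to memory.max: a plain byte count."""
    return mebibytes * MIB


def cpu_max_line(percent: float, period_us: int = 100_000) -> str:
    """Value to write to cpu.max: '<quota_us> <period_us>'."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    quota_us = int(period_us * percent / 100)
    return f"{quota_us} {period_us}"


print(memory_max_bytes(100))  # 104857600
print(cpu_max_line(50))       # 50000 100000
```

A percentage above 100 yields a quota larger than the period, which cgroups v2 interprets as more than one CPU's worth of time.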
Usage Examples
We'll now run untrusted agent code using `nsjail` with our seccomp policy and cgroup limits.
Example 1: Running a Simple Python Script
Create a test script `agent_code.py` that simulates untrusted agent behavior:
    # agent_code.py - Sample untrusted agent code
    import os


    def run_agent():
        print("Agent: Hello, I am an untrusted agent!")
        # Attempt to spawn a shell; a harmless command stands in for a destructive one.
        # os.system() reports failure through its return value, not an exception.
        status = os.system("echo 'shell spawned'")
        if status != 0:
            print(f"Shell command did not run (status {status})")
        return "Agent completed safely"


    if __name__ == "__main__":
        result = run_agent()
        print(f"Result: {result}")

Run it inside `nsjail`:
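    nsjail --config /dev/stdin <<EOF
    # nsjail config is protobuf text format; field names follow nsjail's config.proto
    mode: ONCE
    cwd: "/tmp"
    mount { src: "/" dst: "/" is_bind: true rw: false }
    seccomp_policy_file: "/path/to/agent_seccomp.policy"
    use_cgroupv2: true
    cgroup_mem_max: 104857600     # 100 MB
    cgroup_cpu_ms_per_sec: 500    # 50% of one CPU
    exec_bin { path: "/usr/bin/python3" arg: "/tmp/agent_code.py" }
    EOF

Explanation of each parameter:
- `mode`: `ONCE` runs the command a single time and then exits.
- `mount`: Bind-mounts the host root read-only as the jail's root. Point `src` at a dedicated directory to restrict file system access further.
- `cwd`: Working directory inside the jail.
- `seccomp_policy_file`: Path to our custom policy.
- `use_cgroupv2`, `cgroup_mem_max`, and `cgroup_cpu_ms_per_sec`: Resource limits (memory in bytes, CPU in milliseconds of CPU time per second).
- `exec_bin`: The command to execute (`path`) and its arguments (`arg`).
With `DEFAULT KILL`, the kernel terminates the process at the first syscall outside the allowlist, so a blocked `os.system` call ends the agent immediately rather than raising a Python exception. If the agent should instead see an error and keep running, deny specific syscalls with Kafel's `ERRNO(...)` action. Verify the field names against the `config.proto` shipped with your `nsjail` version.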
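When jails are launched from an agent framework rather than a shell, the config can be rendered programmatically. The following is a sketch under assumptions: it emits protobuf text with field names taken from `nsjail`'s `config.proto` (`mode`, `cwd`, `time_limit`, `mount`, `seccomp_policy_file`, `use_cgroupv2`, `cgroup_mem_max`, `cgroup_cpu_ms_per_sec`, `exec_bin`), which should be checked against the installed version. `JailSpec`, `render_config`, and `jail_command` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class JailSpec:
    script: str                          # path of the agent script inside the jail
    policy_file: str                     # path of the seccomp policy file
    mem_bytes: int = 100 * 1024 * 1024   # 100 MB
    cpu_ms_per_sec: int = 500            # 50% of one CPU
    time_limit_s: int = 5
    interpreter: str = "/usr/bin/python3"


def render_config(spec: JailSpec) -> str:
    """Return nsjail config text; field names are assumed from config.proto."""
    lines = [
        "mode: ONCE",
        'cwd: "/tmp"',
        f"time_limit: {spec.time_limit_s}",
        'mount { src: "/" dst: "/" is_bind: true rw: false }',
        f'seccomp_policy_file: "{spec.policy_file}"',
        "use_cgroupv2: true",
        f"cgroup_mem_max: {spec.mem_bytes}",
        f"cgroup_cpu_ms_per_sec: {spec.cpu_ms_per_sec}",
        f'exec_bin {{ path: "{spec.interpreter}" arg: "{spec.script}" }}',
    ]
    return "\n".join(lines) + "\n"


def jail_command(config_path: str) -> list[str]:
    """argv for launching nsjail with a config file written by render_config()."""
    return ["nsjail", "--config", config_path]


if __name__ == "__main__":
    print(render_config(JailSpec("/tmp/agent_code.py", "/path/to/agent_seccomp.policy")))
```

A caller would write the rendered text to a temporary file and pass `jail_command(path)` to `subprocess.run`. Building the argv as a list avoids shell quoting issues with agent-supplied paths.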
Example 2: Running Agent Code with Network Restrictions
Network access is often unnecessary for untrusted agents. We can fully disable networking using `nsjail`'s network namespace isolation.
Create a script that tries to fetch data:
    # network_agent.py
    import requests


    def fetch_data():
        try:
            # A timeout keeps the agent from hanging when the network is unreachable
            response = requests.get("https://api.example.com/data", timeout=5)
            return response.text
        except requests.RequestException as e:
            return f"Network blocked: {e}"


    if __name__ == "__main__":
        print(fetch_data())

Run it with network disabled:
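    nsjail --config /dev/stdin <<EOF
    mode: ONCE
    cwd: "/tmp"
    mount { src: "/" dst: "/" is_bind: true rw: false }
    use_cgroupv2: true
    cgroup_mem_max: 104857600
    cgroup_cpu_ms_per_sec: 500
    clone_newnet: true    # Run in a new, empty network namespace (nsjail's default)
    exec_bin { path: "/usr/bin/python3" arg: "/tmp/network_agent.py" }
    EOF

The `clone_newnet` setting places the agent in a fresh network namespace with no route to the outside (at most a loopback interface). The agent will fail to connect, preventing data exfiltration or external calls. The seccomp policy is left out of this example on purpose: with `socket` absent from the allowlist, the process would be killed at its first `socket` call before the namespace isolation could be observed.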
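To confirm that isolation is actually in effect, a harness can run a small connectivity probe inside the jail before starting the agent. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper, and the target host and port are whatever endpoint the deployment considers "outside".

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection succeeds, False on any socket error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, unreachable networks, timeouts.
        return False


if __name__ == "__main__":
    # Inside a jail with no external network this is expected to report False.
    print(can_connect("example.com", 443))
```

If the probe reports `True` inside the jail, the network namespace is not configured as intended and the agent should not be started.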
Example 3: Limiting Execution Time
Agents can loop infinitely. Use `nsjail`'s time limit:
    nsjail --config /dev/stdin <<EOF
    mode: ONCE
    cwd: "/tmp"
    mount { src: "/" dst: "/" is_bind: true rw: false }
    seccomp_policy_file: "/path/to/agent_seccomp.policy"
    use_cgroupv2: true
    cgroup_mem_max: 104857600
    cgroup_cpu_ms_per_sec: 500
    time_limit: 5    # 5 seconds max wall-clock execution
    exec_bin { path: "/usr/bin/python3" arg: "/tmp/agent_code.py" }
    EOF

If the agent runs longer than 5 seconds, `nsjail` kills it and logs that the time limit was reached.
Example 4: Monitoring Agent Behavior
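For production use, log what the agent attempts. Seccomp logging is switched on in the `nsjail` config rather than in the policy itself, but the policy can deny selected syscalls with an error code instead of killing the process, so the agent keeps running and each attempt is recorded:

    // agent_seccomp_log.policy
    POLICY agent_log {
      ALLOW {
        read, write, open, close,
        mmap, munmap, brk, exit_group,
        clone, execve,
        newstat, newfstat, lseek, getdents64
      },
      // Deny with EPERM (errno 1) instead of killing the process
      ERRNO(1) {
        ptrace, socket, connect
      }
    }
    USE agent_log DEFAULT KILL

Then run with logging: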
    nsjail --config /dev/stdin <<EOF
    mode: ONCE
    cwd: "/tmp"
    mount { src: "/" dst: "/" is_bind: true rw: false }
    seccomp_policy_file: "/path/to/agent_seccomp_log.policy"
    seccomp_log: true    # Ask the kernel to log every syscall that is not allowed
    use_cgroupv2: true
    cgroup_mem_max: 104857600
    cgroup_cpu_ms_per_sec: 500
    exec_bin { path: "/usr/bin/python3" arg: "/tmp/agent_code.py" }
    EOF

Check the kernel log with `dmesg | tail -20` or `journalctl -k`; on systems running `auditd`, the records go to the audit log instead. This helps identify malicious or buggy agent behavior.
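Reading raw kernel log lines does not scale past a few runs. As a sketch, seccomp audit records (audit type 1326) can be summarized per command and syscall. The regular expression assumes the usual `key=value` layout of those records, the sample line is illustrative rather than captured from a real system, and the number-to-name table is a deliberately partial x86_64 mapping.

```python
import re
from collections import Counter

# Matches kernel seccomp audit records (audit type 1326).
SECCOMP_RE = re.compile(
    r'type=1326 .*?pid=(?P<pid>\d+) .*?comm="(?P<comm>[^"]*)" .*?syscall=(?P<nr>\d+)'
)

# Partial x86_64 syscall-number map, only to make the report readable.
X86_64_NAMES = {41: "socket", 42: "connect", 101: "ptrace"}


def count_violations(log_text: str) -> Counter:
    """Count logged seccomp events per (command, syscall) pair."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = SECCOMP_RE.search(line)
        if match:
            nr = int(match.group("nr"))
            counts[(match.group("comm"), X86_64_NAMES.get(nr, str(nr)))] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample line, not real log output.
    sample = (
        'audit: type=1326 audit(1700000000.123:45): auid=1000 uid=1000 gid=1000 '
        'ses=1 pid=1234 comm="python3" exe="/usr/bin/python3.10" sig=0 '
        'arch=c000003e syscall=41 compat=0 ip=0x7f0000000000 code=0x50001'
    )
    print(count_violations(sample))
```

Feeding the output of `dmesg` or `journalctl -k` into `count_violations` gives a quick view of which denied syscalls an agent attempts most often, which is also the input needed when deciding whether to extend the allowlist.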
Best Practices and Caveats
Running untrusted code without a full sandbox is a trade-off between performance and security. Here are key considerations:
- **Use a layered approach**: Combine `nsjail` with AppArmor profiles and read-only file systems for defense in depth. Microsoft's AI blog emphasizes the importance of multiple isolation layers.
- **Avoid granting unnecessary privileges**: Drop `CAP_NET_ADMIN`, `CAP_SYS_ADMIN`, and other capabilities. `nsjail` drops all capabilities by default unless it is told to keep them.
- **Regularly update seccomp policies**: As agent code evolves, new syscalls may be needed. Review logs and adjust the allowlist.
- **Consider timeouts**: Always set a maximum execution time to prevent denial-of-service.
- **Test with real agent workloads**: Anthropic's research on AI safety suggests that testing with adversarial inputs helps uncover isolation gaps.
- **Monitor resource usage**: Use cgroups to track memory, CPU, and I/O. Sudden spikes may indicate malicious activity.
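For the resource-monitoring point above, cgroups v2 exposes current usage as plain files inside each cgroup directory. A minimal sketch that reads `memory.current` and the `usage_usec` field of `cpu.stat`; `read_cgroup_usage` is a hypothetical helper, and the directory passed in is whichever cgroup the jailed process runs in.

```python
from pathlib import Path


def read_cgroup_usage(cgroup_dir: str) -> dict[str, int]:
    """Read current memory use and cumulative CPU time from a cgroup v2 directory."""
    root = Path(cgroup_dir)
    usage = {"memory_bytes": int((root / "memory.current").read_text().strip())}
    # cpu.stat holds one "key value" pair per line; usage_usec is total CPU time.
    for line in (root / "cpu.stat").read_text().splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            usage["cpu_usec"] = int(value)
    return usage
```

Polling this at a fixed interval and comparing successive `cpu_usec` values gives a CPU rate, which makes sudden spikes easy to flag.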
Limitations
This approach is not suitable for all scenarios:
- **Kernel exploits**: If the agent code exploits a kernel vulnerability, namespaces and seccomp may not protect the host. Full sandboxing (e.g., gVisor) provides stronger isolation.
- **Side-channel attacks**: Resource usage patterns can leak information. For high-security applications, consider hardware-enforced isolation.
- **Complex agent dependencies**: Some agents require shared libraries, databases, or GPU access. These may need custom policies or partial sandboxing.
Conclusion
Running untrusted agent code without a full sandbox is feasible using lightweight Linux primitives: `nsjail`, seccomp, and cgroups. This approach adds little overhead compared with container or VM-based isolation (measure it on your own workload rather than assuming a fixed figure) while blocking dangerous system calls, network access, and resource exhaustion. By following the installation steps and examples above, you can run AI agent code in production with a much smaller attack surface, drawing on industry discussions from LangChain, OpenAI, Microsoft, and Anthropic. Start with a restrictive policy, monitor logs, and expand permissions only as needed. The key is to default-deny: give agents only the minimum capabilities required for their task.



