It takes the stress about needing to monitor all the agents all the time too, which is great and creates incentives to learn how to build longer tasks for CC with more feedback loops.
I'm on Ubuntu 22.04 and it was surprisingly pleasant to create a layered sandbox approach with bubblewrap and Landlock LSM: Landlock for filesystem restrictions (deny-first, only whitelisted paths accessible) and TCP port control (API, git, local dev servers), bubblewrap for mount namespace isolation (/tmp per-project, hiding secrets), and dnsmasq for DNS whitelisting (only essential domains resolve - everything else gets NXDOMAIN).
> a malicious AI trying to escape the VM (VM escape vulnerabilities exist, but they’re rare and require deliberate exploitation)
No VM escape vulns necessary. A malicious AI could just add arbitrary code to your Vagrantfile and get host access the first time you run a vagrant command.
If you're only worried about mistakes, Claude could decide to fix/improve something by adding a commit hook. If that contains a mistake, the mistake gets executed on your host the first time you git commit/push.
(Yes, it's unpleasantly difficult to truly isolate dev environments without inconveniencing yourself.)
Shannot[0] captures intent before execution. Scripts run in a PyPy sandbox that intercepts all system calls - commands and file writes get logged but don't happen. You review in a TUI, approve what's safe, then it actually executes.
The trade-off vs VMs: VMs let Claude do anything in isolation, Shannot lets Claude propose changes to your real system with human approval. Different use cases - VMs for agentic coding, whereas this is for "fix my server" tasks where you want the changes applied but reviewed first.
There's MCP integration for Claude, remote execution via SSH, checkpoint/rollback for undoing mistakes.
Feedback greatly appreciated!
I started running Claude Code in a devcontainer with limited file access (repo only) and limited outbound network access (allowlist only) for that reason.
This weekend, I generalized this to work with docker compose. Next up is support for additional agents (Codex, OpenCode, etc). After that, I'd like to force all network access through a proxy running on the host for greater control and logging (currently it uses iptables rules).
This workflow has been working well for me so far.
Still fresh, so may be rough around the edges, but check it out: https://github.com/mattolson/agent-sandbox
sandbox-run npx @anthropic-ai/claude-code
This runs npx (...) transparently inside a Bubblewrap sandbox, exposing only the $PWD. Contrary to many other solutions, it is a few lines of pure POSIX shell.https://code.claude.com/docs/en/sandboxing#sandboxing
> Claude Code includes an intentional escape hatch mechanism that allows commands to run outside the sandbox when necessary. When a command fails due to sandbox restrictions (such as network connectivity issues or incompatible tools), Claude is prompted to analyze the failure and may retry the command with the dangerouslyDisableSandbox parameter.
The ability for the agent itself to decide to disable the sandbox seems like a flaw. But do I understand correctly that this would cause a pause to ask for the user's approval?
You can also use Lima, a lightweight VM control plane, as it natively works with qemu and Virtualization.Framework. (I think Vagrant does too; it's been a minute since I've tried.) This has traditionally been used for running container engines, but it's great for narrowly-scoped use cases like this.
Just need to be careful about how the directory Claude is working with is shared. I copy my Git repo to a container volume to use with Claude (DinD is an issue unless you do something like what Kind did) and rsync my changes back and verify before pushing. This way, I don't have to worry if Claude decides to rewind the reflog or something.
Our next version of Docker Sandboxes will have MicroVM isolation and a Docker instance within for this exact reason. It'll let you use Claude Code + Containers without Docker-in-Docker.
The only access the container has are the folders that are bind mounted from the host’s filesystem. The container gets network access from a transparent proxy.
https://github.com/dogestreet/dev-container
Much more usable than setting up a VM and you can share the same desktop environment as the host.
There's a load of ways that a repository owner can get an LLM agent to execute code on user's machines so not a good plan to let them run on your main laptop/desktop.
Personally my approach has been put all my agents in a dedicated VM and then provide them a scratch test server with nothing on it, when they need to do something that requires bare metal.
So I just run agents as the agent user.
I don't need it to have root though. It just installs everything locally.
If I did need root I'd probably just buy a used NUC for $100, and let Claude have the whole box.
I did something similar by just renting a $3 VPS, and getting Claude root there. It sounds bad but I couldn't see any downside. If it blows it up, I can just reset it. And it's really nice having "my own sysadmin." :)
I currently apply the same strategy we use in case of the senior developer or CTO going off the deep end. Snapshots of VMs, PITR for databases and file shares, locked down master branches, etc.
I wouldn't spend a bunch of energy inventing an entirely new kind of prison for these agents. I would focus on the same mitigation strategies that could address a malicious human developer. Virtual box on a sensitive host another human is using is not how you'd go about it. Giving the developer a cheap cloud VM or physical host they can completely own is more typical. Locking down at the network is one of the simplest and most effective methods.
I needed a way to run Claude marketplace agents via Discord. Problem: agents can execute code, hit APIs, touch the filesystem—the dangerous stuff. Can't do that in a Worker's 30s timeout.
Solution: Worker handles Discord protocol (signature verification, deferred response) and queues the task. Cloudflare Sandbox picks it up with a 15min timeout and runs claude --agent plugin:agent in an isolated container. Discord threads store history, so everything stays stateless. Hono for routing.
This was surprisingly little glue. And the Cloudflare MCP made it a breeze do debug (instead of headbanging against the dashboard). Still working on getting E2E latency down.
Threat model for me is more "whoops it deleted my home directory" rather than some elaborate malicious exploit.
If you're on a Mac working on a linux docker containers, your Docker engine is already running a VM (and a linux VM doesn't need one). So you're still only "one VM away" from the real environment. Most IDEs support directly working in the VM via SSH if you need to inspect the code.
You then run --dangerously-skip-permissions and do all changes via PRs. I have been running this combined with workmux [0] for a couple of months and highly recommend it. You can one-shot several whole PRs concurrently with this setup.
The reason it beats a cloud VM is because when you're running multiple concurrent copies of all containers in a project, it quickly eats up memory. Running a cloud VM 24/7 with high enough memory is expensive.
I'm working on targeting both the curl|bash pattern and coding agents with this (via smart out of the box profiles). Early stages but functional. Feedback and bug reports would be appreciated.
I see the power and am considering Max but 5x cost is difficult to swallow. Just doing this for a lark, not professionally.
> So now you need Docker-in-Docker, which means --privileged mode, which defeats the entire purpose of sandboxing.
> That means trading “Claude might mess up my filesystem” for “Claude has root-level access to my container runtime.”
A Vagrant VM is exactly the same thing, just without Docker. The benefit of Docker is you've got an entire ecosystem of tooling and customized containers to benefit from, easier to maintain than a Vagrantfile, and no waiting for "initialization" on first booting a Vagrant box.On both Linux and MacOS, use this:
# Build 'claude' VM and Docker context
$ colima start --profile claude --vm-type=qemu
$ docker context create claude --docker "host=unix://$HOME/.colima/claude/docker.sock"
$ docker context use claude
# Start DinD, pass through ports 8080 and 8443, and mount one host directory (for a Git repo)
$ docker run -d --name dind-lab --privileged -e DOCKER_TLS_CERTDIR= -v dind-lab-data:/var/lib/docker \
-p 8080:8080 -p 8443:8443 -v /home/MYUSER/GITDIR:/mnt/host/home/MYUSER/GITDIR \
docker:27-dind
$ docker run --rm -it -e DOCKER_HOST=tcp://127.0.0.1:2375 \
-p 8080:8080 -p 8443:8443 -v /mnt/host/home/MYUSER/GITDIR:/home/MYUSER/GITDIR \
ubuntu:24.04 bash
# Or if you don't want to pass-through ports w/ DinD twice, use its network namespace directly
# ( docker run --rm -it -e DOCKER_HOST=tcp://127.0.0.1:2375 --network container:dind-lab .... )
Your normal default Docker context remains safe for normal use, and the "dangerous" context of claude euns in a different VM. If Claude destroys its container's VM, just delete it (colima stop claude; colima delete claude) and remake it.You could do rootless Docker/Podman, but there's a lot of broken stuff to deal with that will just distract the AI.
[1]: https://github.com/nikvdp/cco [2]: https://code.claude.com/docs/en/sandboxing
Claude will add Docker support and a few more tweaks in the next couple of days.
IMO, if you are not running in the dangerous mode then you are really missing out on one of the best aspects of claude code- its ability to iterate. If you have to confirm each iteration then it's just not practical.
But a simple vm and some automation to install developer tools using ansible, nix or whatever you prefer isn't that hard to (vibe) code together. I like Lima but it feels slightly sub-optimal for the job currently.
Some useful things to consider:
- Ssh agent forwarding for authenticating against e.g. git is useful. But maybe don't use the same key that authenticates to your production machines as well ...
- How do you authenticate without a browser? Most AI tools have ways to deal with that but it's slightly tedious to automate during provisioning.
- Making sure all your development tools are there; I use things like sdkman, nvm, bun, etc. And I have my shell preferences and some other tools I like to have around.
- Minimizing time provisioning these vms over and over again. This gets tedious really quickly.
- Keeping the VMs fast is important too. In my projects, build tool performance adds up and AI tools like to call them a lot. So assign enough memory and CPU.
- It would be nice to switch between local and remote/cloud based vms easily.
- Software flexibility; developers are picky about their tools. There is no one size fits all here. Even just deciding on the base image to use for your vm is likely to escalate. I picked debian for what it is worth.
In short, I think there's enough out there that you can pull something together but it still involves quite a bit of DIY. It would be nice if this got easier. And AI tools asking for permission for everything is not a good security model. Because people just turn that off. Sandboxing those things is the way to go. But AI tools need to be able to do enough to work with your software.
Windows is the best (sandboxed) linux
Does anybody have experience using microVMs (Firecracker, Kata Containers, etc.) for this use case? Would love to hear your thoughts.
Syncthing works well for getting a local copy of a directory from the VM.
I feel like the only good sandboxing at this point is one that also blocks generic web access.
It all integrates nicely with VS Code. It has a firewall script and you spin up your database within the docker compose file so it has full access to a postgres instance. I can share my full setup if anyone needs it.
(Maybe I should be asking Claude this)
Edit: someone already built this: https://github.com/neko-kai/claude-code-sandbox
https://github.com/EstebanForge/construct-cli
For Linux, WSL also of course, and macOS.
Any coding agent (from the supported ones, our you can install your own).
Podman, Docker or even Apple's container.
In case anyone is interested.
This allows you to use Claude Code from your mobile device, in a safe environment (restricted Kubernetes pod)
This seems like a very hard problem with coding specifically as you want unsafe content (web searches) to be able to impact sensitive things (code).
I'd love to find people to talk to about this stuff.
There was this HN post[0] last week on a tool for automatically shutting down the codespace container when idle.
And setup an .env for the project with user/password to access only a dev database.
industrially-making-exploits.. : https://news.ycombinator.com/item?id=46676081
Missing FreeBSD jails in 2026 is kind of weird (hello 1999)...
check it out: https://shellbox.dev
docker sandbox run claude
You can have the local environment completely isolated with vagrant. But if you’re not careful with auth tokens it can (and eventually will when it gets confused)go wipe the shared dev database or the GitHub repo. The author kinda acknowledges this, but it’s glossing over a big chunk of the problem. If it can pus to GitHub, unless you’ve set up your tokens carefully it can delete things too. Having a local isolated test database separate from the shared infrastructure is a matter of a mature dev environment, which is a completely separate thing from how you run Claude. Two of the three examples cited as “no, no, no” are not protected by vagrant or docker or even EC2. It’s what tokens the agent has and needs.
I have such a love/hate relationship with VirtualBox. It's so useful but so buggy. My current installation has a bug that causes high network latency, but I'm afraid to upgrade in case it introduces new, worse bugs.
VMware is a million times better, but it is also Proprietary™
Why don't Claude Code & other AI agents offer an option to make a sound or trigger a system notification whenever they prompt for approval? I've looked into setting this up, and it seems like I'd have to wire up a script that scrapes terminal output for an approval request. Codex has had a feature request open for a while: https://github.com/openai/codex/issues/3052
There's a bug in that it can't output smart quotes “like this”
Sonnet, Opus et al think they output it but something in the pipeline is rewriting it
https://github.com/firasd/vibesbench/blob/main/docs/2026/A/t...
Try it in Claude Code and you'll see what I mean! Very weird
Or you can just mount the socket and call docker from within docker.