My Secure AI Agent Setup: Building a Better Playground with Nix
I’ve been there more times than I care to admit. I’m about to let a new AI coding agent run "wild" on one of my projects, and that little voice in the back of my head starts whispering, "Is this about to delete my SSH keys? Is it going to find that old .env file I forgot to gitignore?"
It’s a valid fear. I've watched AI move from "suggesting code" to "acting on my behalf," and I've realized I'm essentially giving a black box the keys to my entire digital identity. They can read my files, run terminal commands, and even push code. I’m no longer just using a tool; I’m managing a "digital deputy" that has the potential to do real damage if it goes off the rails.
But I’ve also realized something else: I don’t have to trust them.
The goal I set for myself wasn't to "hope" the AI behaves well; it was to build a fence so strong that it doesn't matter if it hallucinates or decides to perform a "recursive rm" in the wrong folder. I wanted a playground where I could let agents work without ever looking over my shoulder. I wanted to move from a state of "constant monitoring" to a state of "automated containment."
To do that, I’ve put together a setup using Nix, Bubblewrap, and a library called jail.nix. This isn’t just some theoretical security exercise for me - it’s how I actually work every day. It’s what allows me to pull down untrusted code or experiment with the latest LLM models without worrying about my host system's integrity.
Why "Trust" is a Bad Strategy
In my experience, "Excessive Agency" is the silent killer of local security. It’s what happens when I give a tool more power than it needs just to be "helpful." I call this the "Helpful Stranger" problem. If a stranger offers to help me clean my house, I don't give them a master key to every room. I show them the kitchen and I close the bedroom doors.
Take a standard coding agent. To help me refactor a project, it needs a compiler, a linter, and access to my code. But it has no business looking at my ~/.ssh folder. It doesn't need to see my Slack history or my browser cookies. It doesn't need to know my hostname or see a list of every process running on my machine.
I used to try solving this with "Human-in-the-loop" reviews. I’d sit there and click "Yes" to every request the agent made. But I quickly learned that I'm not a good filter. After the 30th time clicking "Yes" for a harmless git status, my brain goes on autopilot. I will eventually click "Yes" to a dangerous command without even noticing. It’s human nature.
That’s why I moved to Digital Containment. I stopped trying to review every action and started restricting every action by default. I treat my agents like I'd treat any untrusted process: I put it in a box. This is the only way I've found to actually implement Zero Trust on my own machine.
How it works: A Deep Dive into Namespaces
To build my sandbox, I had to look at how Linux actually keeps processes apart. Most people jump straight to Docker, but I found Docker to be too heavy for my daily workflow. I didn't want a whole virtual OS; I just wanted to hide certain parts of my house from my guests.
Instead, I use the primitives that power containers: Linux Namespaces. You can find the raw technical details in the official documentation, but here's how I think about them.
1. The Mount Namespace
This is the core of my setup. When I start a mount namespace, the agent gets its own private view of my filesystem. I start with an empty root and then selectively "bind-mount" only the directories I want the agent to see. To a Python agent, my /etc/shadow file or my private keys don't just "lack permissions" - they literally don't exist. There is no path that leads back to them.
2. The Network Namespace
This is how I handle the "phone line." By default, I put my sandbox in a private network where it can't see the rest of my local network or my home lab. If I have an agent that needs to talk to an LLM API, I'll provide a specific bridge or a proxy, but I keep the rest of my digital life invisible to it.
3. The User Namespace
This is a game-changer for me. It allows me to be "root" inside the sandbox while remaining a standard, restricted user outside of it. It’s the reason I can run Bubblewrap without ever needing sudo. It maps my real user ID to a different one inside the jail. If the agent somehow finds a way to perform a root action, the kernel blocks it because, outside the jail, it has no special power.
4. PID and IPC Namespaces
I use the PID namespace so the agent can only see its own processes. It can't run ps and see that I have my password manager or my browser open. The IPC namespace stops the agent from using shared memory to talk to other apps. It truly is a "room with no windows."
My Tools: Why I Chose Nix and Bubblewrap
The journey from "knowing about namespaces" to "running them safely" was where I spent the most time. Writing raw terminal commands with dozens of --bind and --unshare flags is a recipe for a security hole. It’s too easy to forget one flag and leave a backdoor open.
I chose Nix because it provides what I call Binary Purity. In my old setup, my sandbox would just use whatever git or curl was installed on my system. If I had a vulnerable version of curl installed globally, my agent was vulnerable too. There was no "clean room" for me to work in.
Nix changed that. When I define an environment for an agent, I'm using a strictly defined, hashed, and immutable set of tools from the /nix/store.
- Immutability by Design: The Nix store is read-only. Even if an agent gains "fake root" inside the jail and tries to overwrite my git binary to exfiltrate data, it hits the read-only mount of the host store. The kernel itself blocks the write.
- The Flake Advantage: I use Nix Flakes to pin my dependencies. This means if I run the same sandbox today or a year from now, I’m getting the exact same byte-for-byte binaries. It turns my security policy into a version-controlled asset. I can "roll back" my security environment just as easily as I roll back code.
- Supply Chain Peace of Mind: Because Nix builds everything in a sandbox of its own, I’m not worried about "malicious post-install scripts" in a random package affecting my host. The build environment is as isolated as the runtime environment.
For the high-level logic, I use jail.nix. It’s a Nix-native DSL that translates my human-readable rules into low-level Bubblewrap calls. Instead of a messy 200-character shell script, I have a clean Nix file.
Key Concept: The Combinators
Before looking at the full file, it helps to understand the "combinators" - the basic building blocks of the security policy.
```nix
commonJailOptions = with jail.combinators; [
  network          # Give it access to the internet for API calls
  time-zone        # Neutralize the time zone to UTC for privacy
  mount-cwd        # Crucial: only the project folder exists in the jail
  no-new-session   # Prevents the agent from detaching background processes
  mount-dev        # Minimal device access (no webcam/mic access!)
  mount-proc       # Minimal process information visibility
];
```

The "combinators" are the magic here. For example, mount-cwd doesn't just "share" a folder; it creates a new filesystem view where my project directory is the ONLY thing visible. Everything else - my home folder, my downloads, my browser history - is gone. It’s the ultimate "clean room."
The Blueprint: A Complete flake.nix
This is the actual configuration I use to spin up my secure environments. It ties those combinators together into a usable developer shell.
```nix
{
  description = "My secure AI agent playground";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    jail-nix.url = "github:alexdavid/jail.nix";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, jail-nix, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs { inherit system; };
        jail = jail-nix.lib.init pkgs;

        # 1. Define the Security Policy
        # (reusing the combinators concept we saw above)
        commonJailOptions = with jail.combinators; [
          network
          time-zone
          mount-cwd
          mount-dev
          mount-proc
          no-new-session
        ];

        # 2. Define a Generic Agent Factory
        makeJailedAgent = { name, packages }:
          jail name pkgs.bashInteractive (with jail.combinators; (
            commonJailOptions ++ [
              (readwrite (noescape "~/.cache/${name}"))
              (readwrite (noescape "~/.local/share/${name}"))
              (add-pkg-deps packages)
            ]
          ));
      in
      {
        # 3. The Development Shell
        devShells.default = pkgs.mkShell {
          packages = [
            pkgs.nixd # Tools for ME

            # Example usage for a basic agent
            (makeJailedAgent {
              name = "basic-agent";
              packages = [ pkgs.curl pkgs.git ];
            })
          ];
        };
      }
    );
}
```

This setup does three critical things effectively:

- Dependencies: It pulls in jail-nix directly from the source.
- Policy Definition: It defines a commonJailOptions list that applies to every agent.
- Isolation: The makeJailedAgent function gives each agent its own "room."
How this stacks up against other things I’ve tried
I’ve experimented with almost every isolation method out there before settling on this Nix and Bubblewrap combo. Here is a quick breakdown of why I prefer this specific stack for my local development.
| Feature | Nix + Bubblewrap | Docker / Podman | gVisor / Firecracker |
|---|---|---|---|
| Startup Time | Milliseconds | Seconds | Seconds |
| Isolation Level | High (Namespaces) | Moderate (Namespaces) | Extreme (Guest Kernel) |
| Performance | Native (Zero overhead) | Near-native | Moderate overhead |
| Complexity | Low (Declarative) | Moderate (Imperative) | High (VM-based) |
| Reproducibility | Absolute (Flakes) | Variable (Layers) | Low (Needs base images) |
While Docker is great for shipping built apps, and gVisor is the gold standard for multi-tenant production security, for my day-to-day coding workspace, nothing beats the speed and reproducibility of Nix+Bubblewrap. It gives me the best balance of "security I can trust" and "performance I don't notice."
My Workflow: Securing Go and Python
I work across a bunch of different languages, but Go and Python are my mainstays. Each one has its own quirks that I had to account for when I started building these sandboxes. These aren't just "examples" for me; they are the configurations I use every single day to stay productive.
How I handle Go and its toolchain
Go agents often need the compiler or a language server like gopls. But compilers are dangerous. They can execute code during the build process, especially if the project uses go generate or has C-extensions (CGO). Giving an agent access to gcc is essentially giving it a box of matches.
I provide my agents with a Wrapped Toolchain. I give them access to the Go binaries, but those binaries are restricted to seeing only the Nix store and the project folder. If the project uses C libraries, I don't give the agent my system-wide C compiler. I provide a pinned version through Nix that can only see whitelisted headers.
Dealing with CGO Complexity
If I have a Go project that links against a C database driver, the agent needs the C headers. In a standard setup, I'd have to expose /usr/include. But in my Nix setup, I only expose the specific dev packages for that library. The agent gets to compile the code, but it doesn't get to see my system's internal C headers.
```nix
makeJailedGoAgent = { extraPkgs ? [] }:
  jail "jailed-go" pkgs.go (with jail.combinators; (
    commonJailOptions ++ [
      # I give it a dedicated, fast cache so builds are snappy
      (readwrite (noescape "~/.cache/go-build"))

      # I whitelist only these specific tools for the workflow
      (add-pkg-deps ([
        pkgs.go
        pkgs.gopls
        pkgs.golangci-lint
        pkgs.gcc
        pkgs.binutils
        pkgs.git
      ] ++ extraPkgs))
    ]
  ));
```

The beauty of this is that if the agent tries to run ssh, it simply fails. It doesn't even know what ssh is. This is "Containment by Omission" - if the tool isn't in the list, it doesn't exist in the agent's universe.
How I isolate Python environments
Python is a different beast entirely. Python agents often deal with massive datasets and a sprawling tree of dependencies that love to download pre-compiled "wheels" from PyPI. These wheels can contain untrusted binary code that executes the moment you import the library.
Virtualenv Isolation and /tmp Hardening
Python scripts are also notorious for "leaking" state by writing to /tmp or ~/.cache. Instead of letting a Python agent clutter my actual disk, I use a Virtual Filesystem within the sandbox. I mount a tmpfs (a memory-based filesystem) to /tmp. This ensures that any temporary data the agent creates is never written to my SSD and is deleted the second the process exits.
```nix
makeJailedPythonAgent = { pythonPkgs ? [] }:
  let
    # I build the environment ONCE using Nix.
    # The agent doesn't get to 'pip install' anything at runtime.
    # This prevents the "Supply Chain" attack vector via PyPI.
    pythonEnv = pkgs.python3.withPackages (ps: with ps; [
      pandas
      numpy
      scikit-learn
      requests
    ] ++ pythonPkgs);
  in
  jail "jailed-python" pythonEnv (with jail.combinators; (
    commonJailOptions ++ [
      (mount "tmpfs" "/tmp")    # Volatile /tmp isolated in RAM
      (mount "devtmpfs" "/dev") # Minimal devices (no real hardware access)

      # I provide only this audited environment
      (add-pkg-deps [ pythonEnv pkgs.bashInteractive ])
    ]
  ));
```

This setup gives me total control. The agent doesn't "install" packages; it only "uses" the ones I’ve already audited and provided through Nix. If it needs a new package, I update my Flake and restart the sandbox. It’s a clean, reproducible loop that keeps my host system pristine.
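Inside the jail I can also verify the volatile /tmp claim directly, by checking which filesystem backs a mount point. Here's a small sketch that parses /proc/mounts (Linux only; on an unhardened host, /tmp is often not a separate mount at all, in which case the function returns None):

```python
def filesystem_type(mount_point):
    """Return the filesystem type backing a mount point, per /proc/mounts."""
    fstype_found = None
    with open("/proc/mounts") as mounts:
        for line in mounts:
            device, point, fstype, *_ = line.split()
            # The last matching entry wins: later mounts shadow earlier ones.
            if point == mount_point:
                fstype_found = fstype
    return fstype_found

# Inside the jail this should print "tmpfs"; on a plain host it may
# print None (no dedicated /tmp mount) or a disk-backed filesystem.
print(filesystem_type("/tmp"))
```

It's a one-liner worth wiring into a shell hook, so the sandbox fails loudly if the tmpfs mount ever silently disappears from the policy.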
Going Deeper: Syscalls and Capabilities
I don't just stop at hiding files. If I truly want to secure my environment, I have to limit what the agent can actually ask the kernel to do. This is where I move from "Filesystem Isolation" to "Kernel Isolation." It’s the difference between locking the front door and installing motion sensors in every room.
Stripping Capabilities: De-clawing the Process
Linux has about 40 Capabilities (like CAP_NET_RAW, CAP_SYS_ADMIN, or CAP_CHOWN). These are essentially "fragments of root power." Most of my development tools don't need a single one of them to do their job.
In my setup, I use the no-new-privs flag. This is a critical defense-in-depth measure. It ensures that even if an agent finds a way to run a binary that has the "setuid" bit (like a misconfigured local tool), the kernel will refuse to grant it any extra power. I'm essentially "de-clawing" the process from the moment it starts. It stays as a restricted user, no matter what it finds on the disk.
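Bubblewrap sets this flag for me, but the underlying mechanism is a single prctl(2) call, which makes it easy to see in isolation. A minimal sketch using ctypes (Linux only; the constant values come from linux/prctl.h):

```python
import ctypes

# Constants from <linux/prctl.h>
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39

libc = ctypes.CDLL(None, use_errno=True)

# Flip the flag: from now on, execve() can never grant this process
# more privileges - setuid bits and file capabilities on binaries it
# runs are simply ignored by the kernel.
if libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0:
    raise OSError(ctypes.get_errno(), "prctl(PR_SET_NO_NEW_PRIVS) failed")

# The flag is one-way: it can never be cleared, and every child
# process inherits it.
flag = libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)
print("no_new_privs:", flag)  # prints 1 once the flag is set
```

The one-way, inherited nature of the flag is exactly why it makes a good containment primitive: nothing the agent does later in the process tree can undo it.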
Filtering Syscalls with Seccomp
Every single request a program makes - opening a file, checking the time, or creating a network socket - is a system call (syscall). The Linux kernel has over 300 of these. Why would a Python script ever need to reboot() my machine or kexec_load() a new kernel? It shouldn't.
I apply Seccomp (Secure Computing) filters to my sandboxes. This allows me to define a tiny, safe subset of available syscalls. If the agent (or a library it's using) tries to make a forbidden call, the kernel kills the process immediately. It’s a "Fail-Fast" model that provides a massive reduction in the attack surface. In my experience, most agents only need about 40-50 syscalls to perform complex coding tasks. Blocking the other 250+ is just good hygiene.
Expertise in Action: Secrets without Secrets
This is one of my favorite patterns, and it’s where I see many developers get tripped up. I often need to give an agent an API key for OpenAI or Anthropic, but I don't want to actually give it the key. If the agent can run env or read /proc/self/environ, that key is gone.
The Unix Socket Proxy Pattern
Instead of using an environment variable, I run a tiny Unix Socket Proxy outside the sandbox on my host machine.
- The Proxy: My proxy is a small Go program that has the real API key stored in its memory. It listens on a Unix domain socket (e.g., /tmp/llm-proxy.sock).
- The Mapping: I "bind" this Unix socket into the sandbox as a file at the same path.
- The Interaction: The agent talks to the socket using a simple HTTP-over-Unix-socket protocol.
- The Injection: The proxy receives the request, injects the Authorization: Bearer sk-... header, forwards the request to the real LLM provider, and returns the result.
The agent gets the data it needs to function, but it never even sees the string of characters that represents my key. If the agent is compromised or its logs are leaked, my credentials remain safe on my host. This is a classic "Capability-based" security move that separates the permission to use a service from the possession of the secret.
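My real proxy is written in Go, but the pattern is small enough to sketch in Python's standard library. Everything here is illustrative - the socket path, the demo key, and the "upstream" (which just reports what it would have sent) are all placeholders; a production version would forward the request to the provider over HTTPS:

```python
import http.client
import json
import os
import socket
import socketserver
import threading
from http.server import BaseHTTPRequestHandler

API_KEY = "sk-demo-0000"                # lives ONLY in the proxy process
SOCK_PATH = "/tmp/llm-proxy-demo.sock"  # this path gets bind-mounted into the jail

class UnixHTTPServer(socketserver.ThreadingUnixStreamServer):
    daemon_threads = True  # never let a stuck handler keep the host process alive

class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # A real proxy forwards `body` to the LLM provider with the
        # Authorization header attached; here we only report that the
        # key would have been injected on the host side.
        upstream = {"Authorization": f"Bearer {API_KEY}", "body": body.decode()}
        reply = json.dumps({"key_attached": "Authorization" in upstream}).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # keep the demo quiet
        pass

class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client over an AF_UNIX socket - the agent's side of the pattern."""
    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(SOCK_PATH)

# Host side: start the proxy (it is listening as soon as it's constructed).
if os.path.exists(SOCK_PATH):
    os.unlink(SOCK_PATH)
server = UnixHTTPServer(SOCK_PATH, ProxyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Jail side: the agent sees only the socket file - never the key.
conn = UnixHTTPConnection("localhost")
conn.request("POST", "/v1/chat", body=json.dumps({"prompt": "hi"}).encode())
response = json.loads(conn.getresponse().read())
print(response)
```

The point the sketch makes: the string `sk-demo-0000` exists only in the proxy's address space on the host. Nothing the agent can read - not its environment, not its filesystem, not the bytes coming back over the socket - ever contains it.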
How I Verify My Setup: The 10-Point Audit
Whenever I update my security Flake or change my kernel version, I run through this checklist. It’s my way of making sure I haven't accidentally opened a hole in the fence through a configuration drift.
- The Mount Test: Run ls /home. It should be completely empty. If I see my username, the mount namespace hasn't isolated my home directory properly.
- The Environment Scrub: Run env. Does it show any of my private tokens or PATH variables from my host? It should only show the minimal set I whitelisted.
- The Network Probe: Try to ping my local router or a home server. It should fail with "Network is unreachable." If I can see my local network, the sandbox can pivot to other devices.
- Process Visibility: Run ps aux. Can I see my code editor or my browser? I should only see the agent’s processes and the shell.
- Device Restriction: Look in /dev. I should only see the basics like null, zero, and random. If I see /dev/video0 or /dev/audio, the agent could theoretically snoop on me.
- Immutable Store Check: Try to touch /nix/store/test. It should return a "Read-only file system" error. This proves the agent can't poison my shared build tools.
- Capability Audit: Run capsh --print. It should show a minimal set of privileges. If I see CAP_SYS_ADMIN, something is fundamentally wrong with the mapping.
- Privacy Fingerprinting: Run date. Does it show my local time zone or UTC? I prefer UTC to prevent the agent from guessing my physical location.
- DNS Leakage Test: Try to resolve a local hostname. It should fail. I want the agent using specified, public DNS resolvers (like Cloudflare's 1.1.1.1) exclusively.
- Binary Lockdown: Try to run ssh or curl (if not in the whitelist). It should return "command not found." This is my "Containment by Omission" check.
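Several of these checks are easy to script so they run automatically on every shell start. Here's a hedged sketch of the idea - the function and check names are mine, not part of jail.nix, and each check returns True when the jail looks healthy (so running it on the unjailed host should show failures):

```python
import os
import shutil

def audit_sandbox():
    """Automate a few checklist items; True means the jail looks healthy."""
    results = {}

    # The Mount Test: /home should be empty (or absent) inside the jail.
    results["home_hidden"] = (not os.path.isdir("/home")) or os.listdir("/home") == []

    # The Environment Scrub: no obvious secret-bearing variables leaked in.
    leaky = ("OPENAI_API_KEY", "AWS_SECRET_ACCESS_KEY", "SSH_AUTH_SOCK")
    results["env_clean"] = not any(v in os.environ for v in leaky)

    # Immutable Store Check: writing into /nix/store must fail.
    try:
        with open("/nix/store/.audit-probe", "w"):
            pass
        os.remove("/nix/store/.audit-probe")
        results["store_readonly"] = False
    except OSError:
        results["store_readonly"] = True

    # Binary Lockdown: ssh should not resolve unless explicitly whitelisted.
    results["no_ssh"] = shutil.which("ssh") is None

    return results

report = audit_sandbox()
for check, healthy in report.items():
    print(f"{'PASS' if healthy else 'FAIL'}: {check}")
```

I treat a FAIL here the way I'd treat a failing unit test: the shell is not safe to hand to an agent until the audit is green.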
Resource Limiting: Protecting my CPU and RAM
Isolation isn't just about security; it's about system stability. I've had agents go into infinite loops - either through a bug in the code they generated or a hallucination - that would have frozen my entire machine if I hadn't used Control Groups (Cgroups).
While Bubblewrap handles the namespaces (the "where"), I use Nix to wrap my tools in Cgroup resource limits (the "how much"). This ensures that even if an agent starts a heavy processing task or a "fork bomb," it only takes a slice of my CPU and a fixed amount of RAM. I can keep working on my host machine - jumping into meetings, writing code, or browsing - without my machine stuttering. It’s about limiting the "Blast Radius" of a performance failure.
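Real cgroup limits need a writable cgroup hierarchy, so as a simpler host-side illustration of the same "blast radius" idea, here is the older setrlimit mechanism, which caps a single process tree with no setup at all (the specific byte and second budgets are arbitrary demo values, not my real policy):

```python
import resource
import subprocess
import sys

def run_limited(code, mem_bytes, cpu_seconds):
    """Run a Python snippet in a child capped on address space and CPU time."""
    def apply_limits():
        # Enforced by the kernel in the child, before exec: allocations
        # beyond mem_bytes fail, and the child is killed once it burns
        # more than cpu_seconds of CPU time.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
    )

# A well-behaved task fits inside the budget...
ok = run_limited("print('fits')", 512 * 1024 * 1024, 5)

# ...but a runaway 1 GiB allocation dies instead of freezing my machine.
runaway = run_limited("x = bytearray(1024 * 1024 * 1024)", 512 * 1024 * 1024, 5)
print(ok.returncode, runaway.returncode)  # 0, then a non-zero failure
```

Cgroups are strictly better for agents (they cap a whole tree of processes and handle fork bombs), but the principle is identical: the kernel, not my vigilance, decides how much of the machine a misbehaving process gets.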
A Day in the Life: How I use this daily
To give you an idea of how this looks in practice, here is a typical workflow for me when I’m starting a new project.
- Initialization: I create a new directory and a flake.nix that defines my "Agent Jail."
- Startup: I run my custom jail-shell command. In milliseconds, the namespaces are created.
- The Agent's View: I start my AI agent inside that shell. From its perspective, it's on a completely fresh, high-performance Linux machine with only my project files visible.
- The Build Loop: The agent generates code, runs tests, and iterates. It uses the pinned Nix tools I’ve provided. If it needs a new Go library, I update the go.mod (which is shared) and the agent runs go mod download, which writes to the shared, but isolated, cache.
- The Secret Sauce: When the agent needs to hit an API, it talks through my local Unix socket proxy. I never have to worry about the key leaking into the agent's logs.
- Cleanup: When I’m done, I simply exit the shell. The mount namespace disappears, the tmpfs mounts are wiped from RAM, and my host system is exactly as it was before I started.
This workflow has completely changed how I think about "untrusted" tools. I no longer feel the need to "micromanage" every line of code the agent writes before I run it. I let the agent fail fast and fail safely.
Troubleshooting: When the Playground Breaks
Building a secure sandbox like this isn't always a "set it and forget it" affair. I've run into plenty of friction, and most of it comes from the clash between "least privilege" and "developer convenience." Here are the most common issues I’ve had to debug.
1. The "Permission Denied" Rabbit Hole
This usually happens when I forget to "bind" a path that a tool actually needs. For example, a Go agent might need to look at /etc/pkcs11 or a specific shared library. My rule of thumb: before I open a global path, I search for the absolute minimum file needed. I’d rather add five specific file binds than one wide-open directory bind.
2. DNS and Network Blind Spots
If I isolate the network so much that the agent can't even resolve a domain name, it can't talk to LLMs or download dependencies. I solve this by ensuring my sandbox has a clean /etc/resolv.conf and access to public DNS resolvers. I also use a local DNS proxy if I want to whitelist only specific domains (like api.openai.com), making the network cage even tighter.
3. The Shebang Problem in Shell Scripts
If a script has a shebang like #!/usr/bin/env python, it might fail because /usr/bin/env doesn't exist in my minimal mount namespace. I've learned to use Nix's pkgs.writeShellScriptBin or resholve to "patch" these scripts. This ensures that every command refers to a specific, immutable path in the Nix store. It’s a bit more work up front, but it makes the sandbox incredibly robust.
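resholve and pkgs.writeShellScriptBin do this properly at build time, with full dependency resolution. As a toy illustration of the core rewrite, here is a sketch (the Nix store path is a made-up example, and a real tool also pins every command inside the script body, not just the shebang):

```python
import re

def pin_shebang(script_text, interpreter):
    """Replace an env-style shebang with a fixed interpreter path."""
    # Only touch the very first line, and only if it's an env shebang;
    # scripts without one pass through unchanged.
    return re.sub(r"\A#!\s*/usr/bin/env\s+\S+", "#!" + interpreter, script_text)

script = "#!/usr/bin/env python\nprint('hello from the jail')\n"
pinned = pin_shebang(script, "/nix/store/abc123-python3-3.12/bin/python3")
print(pinned.splitlines()[0])  # #!/nix/store/abc123-python3-3.12/bin/python3
```

Once every script points at an immutable store path, the sandbox no longer needs /usr/bin at all, which is exactly the "Containment by Omission" property I want.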
4. Shared Memory and Library Mismatches
Sometimes, a binary linked against one version of glibc on my host tries to run inside the sandbox against a different version provided by Nix. By using Nix to provide the entire toolchain - including the linker and the libraries - I avoid this "Impedance Mismatch." My agent isn't running in a guest environment; it's running in a coherent, Nix-managed universe.
Addressing Common Concerns
Is this overkill for a solo developer?
I hear this a lot. But I view my development machine as my most valuable asset. It’s where my bank accounts are accessed, my private keys are stored, and my identity is managed. Spending a few hours building a sandbox isn't "paranoia"; it's "engineering." It gives me the freedom to experiment with new AI tools with zero anxiety. The "overkill" pays for itself the first time an agent accidentally runs a destructive command in the wrong folder.
Comparison with Docker and Podman
For deployment and shipping to production, I still use Docker. But for the "Inner Loop" of development - where I’m iterating on local code with an AI - I prefer this Nix+Bubblewrap approach. It’s faster (no guest OS overhead), it’s more granular (I can hide specific files, not just whole layers), and it integrates perfectly with my existing Nix-managed system.
Performance Overhead
The performance overhead is effectively zero. I'm running native processes on my native Linux kernel. The only "cost" is the few milliseconds of setup time when I start the shell. In fact, because I use the Nix store and dedicated SSD caches for builds, my sandboxed agents often run faster than un-sandboxed tools that are fighting for system resources.
Running on macOS or Windows
The specific Linux Namespace primitives I've discussed are unique to the Linux kernel. However, if you're on macOS or Windows, you can achieve a similar result by running a lightweight Linux VM (like OrbStack or WSL2) and performing your containerization inside that VM. It’s not quite the same "native" feel, but the isolation principles remain identical.
Do I still need "Human-in-the-Loop"?
I still use HITL, but I use it for logical verification, not security enforcement. I let the sandbox handle the "Don't delete my disk" part, and I use my own eyes to handle the "Does this code actually make sense?" part. This separation of concerns makes me much more efficient.
Looking Ahead: The Future of Agent Security
As AI agents become more autonomous - eventually moving from "coding partners" to "digital employees" that run 24/7 - the stakes will only get higher. I expect we’ll see more formalizations of these sandboxing patterns. We might see "Agent OS" abstractions that build these namespaces by default.
For now, the best strategy is to take control of your own boundaries. We are in the "Wild West" of AI agency. By building your own playground with Nix and Bubblewrap, you aren't just protecting your machine; you are building the skills needed to manage the next generation of computing.
Final Thoughts: Moving from Fear to Freedom
I started this journey because I was afraid of what my tools might do. I was tired of second-guessing every git push and every automated refactor. But along the way, I realized that security isn't about living in fear - it's about engineering a system that removes the need for fear.
By using Nix and Bubblewrap, I’ve built a foundation that lets me use the most powerful AI agents with total confidence. I’m no longer hoping for the best; I’m engineering for reality. The agents have their playground, and I have my peace of mind.
Happy coding!
Sources & Further Reading:
- Alex David’s jail.nix Project (The primary library I use).
- OWASP LLM Security Top 10 (The standard for AI vulnerability research).
- NixOS Foundation (The engine for my reproducibility).
- Bubblewrap Security Model (How the isolation actually works).