Feature · 2026-06-10

An Auditable Sandbox for AI Agents

You want an agent to run the dependency upgrade, the refactor, the unfamiliar build — but not loose in your working tree with your credentials and your network. h5i's env gives it a disposable, confined box instead: a git worktree plus a policy that limits what code can read, write, and reach, and a record of everything it did.

By Koukyosyumei · Reading time: 10 min · Tags: Sandbox, Isolation, Security

A coding agent is a process you didn't write, running commands you didn't read, with whatever access your shell has. Most of the time it installs a package and edits a file. Sometimes a poisoned dependency, a prompt-injected README, or an over-eager plan turns "run the build" into reading ~/.ssh, posting your environment to a webhook, or rewriting files far outside the task. The uncomfortable combination — private data, untrusted content, and network egress in the same process — is what makes an agent worth sandboxing rather than trusting.

The usual answer is a container. That works, but it's a second tool with its own daemon, its own image, and its own divergence from your real repo. h5i takes a narrower goal: give the agent an isolated environment tied to your git history, confine it with what the host already provides, and keep a tamper-evident record — so you can hand off the risky work and audit exactly what happened.

The loop: create, shell, audit, apply

An env is a git worktree (its own checkout under .git/.h5i/, invisible to your main tree) plus a policy pinned at creation. You work with it in four commands:

~/my-project

$ h5i env create fix-auth
  created env · isolation: supervised · worktree under .git/.h5i/…

$ h5i env shell fix-auth      # work inside it — or hand the box to an agent
box$ cargo build && cargo test
box$ exit

$ h5i env diff fix-auth     # what changed, vs the frozen starting point
$ h5i env apply fix-auth    # merge onto your branch — only when you choose to

The create command freezes a base commit and forks a branch. The shell command drops you (or the agent) into the box with a real terminal, so every command the session runs is confined, not just the ones someone remembered to wrap. The diff command shows the change against the frozen base. The apply command merges it onto your branch, and it never happens on its own: you review first. For a single one-off command there's h5i env run fix-auth -- cargo build, which runs it the same confined way and captures the output.

Tiered isolation, picked for you

"Confined" means different things on different machines, and h5i is honest about it. Isolation is a ladder of tiers, and h5i env create with no flag picks the strongest level the host can actually enforce. If you ask for a level the host can't provide, it refuses rather than quietly running with less — you never think you're sandboxed when you aren't.

Tier	What confines the code	Network
workspace	git worktree only — for trusted code	host (unrestricted)
process	Landlock filesystem allowlist + a seccomp syscall deny-list + user/mount/net namespaces + cgroup limits, all rootless	off, or full
supervised	the process tier + a live seccomp socket gate	off, or a real egress allowlist
container	rootless Podman — dropped capabilities, read-only root filesystem, no docker socket	an egress allowlist (L7 proxy)

None of this needs root or a VM. h5i env probe prints exactly what your host supports. If rootless Podman isn't installed, create lands on a kernel-enforced tier and tells you why, rather than silently leaving the container tier out of reach.
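The selection rule can be sketched in a few lines. This is only an illustration of the behaviour described above, not h5i's implementation: the function name choose_tier, the TierUnavailable exception, and the assumption that the table's row order is the ladder from weakest to strongest are all hypothetical.

```python
from typing import Optional, Set

# Tier names from the table above. Treating the row order as
# weakest-to-strongest is an assumption of this sketch.
TIERS = ["workspace", "process", "supervised", "container"]


class TierUnavailable(Exception):
    """The host cannot enforce the tier that was asked for."""


def choose_tier(supported: Set[str], requested: Optional[str] = None) -> str:
    """Pick an isolation tier without ever downgrading silently."""
    if requested is not None:
        if requested not in TIERS:
            raise ValueError(f"unknown tier: {requested!r}")
        if requested not in supported:
            # Refuse rather than run with less than was asked for.
            raise TierUnavailable(f"host cannot enforce {requested!r}")
        return requested
    # No explicit request: walk the ladder from the top down.
    for tier in reversed(TIERS):
        if tier in supported:
            return tier
    raise TierUnavailable("host supports no known tier")


if __name__ == "__main__":
    host = {"workspace", "process", "supervised"}  # e.g. no rootless Podman
    print(choose_tier(host))
```

The important property is the explicit-request branch: asking for a tier the host lacks raises an error instead of falling through to the default search.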

The interesting part: an egress allowlist you can't undo

Filesystem confinement is well-trodden. Network egress is where most sandboxes get honest about their limits. An L7 proxy allowlist, the kind a container egress filter or a SOCKS proxy gives you, only stops programs that respect the proxy; a program that opens its own socket straight to an IP address walks right past it. h5i's supervised tier enforces the allowlist a layer lower, at L3/L4, where connecting by raw IP doesn't help:

.h5i/env.toml

isolation  = "supervised"
net.egress = ["example.com", "pypi.org", "github.com:443"]

Inside the box, the allowlist holds — and there's no DNS channel for anything off it:

box

box$ curl https://example.com
  200 OK                              — on the allowlist
box$ curl https://www.cloudflare.com
  could not resolve host                — blocked, no DNS for it
box$ python3 -c 'import socket; socket.create_connection(("1.1.1.1",443))'
  timed out                             — raw IP dropped at L3/L4

Three pieces make that work, all without root:

A namespaced uplink. The confined process gets its own network namespace, and slirp4netns gives that namespace a NAT'd uplink from user space — no CAP_NET_ADMIN on the host required.
An nftables default-drop allowlist. Inside its namespace the process holds capabilities over its own network only, so h5i can install an nftables ruleset there, before the untrusted program starts, that drops everything except the pinned addresses. This is the layer that actually stops a packet.
DNS pinned through /etc/hosts. Each allowlisted host is resolved once, at startup, and written into a private /etc/hosts bound only inside the box. No port 53 is opened at all: allowlisted names resolve to exactly the address nftables permits, and everything else simply doesn't resolve. That kills both DNS rebinding and DNS as an exfiltration side-channel.
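To make the last two pieces concrete, here is a minimal sketch of how an allowlist could be turned into a pinned hosts file and a default-drop nftables ruleset. It is not h5i's code. The function names, the table name h5i_egress, the loopback rule, the IPv4-only handling, and the reading of a port-less entry as "any port" are all assumptions, and the addresses come from documentation ranges rather than real lookups. Resolution is passed in as a dict so the sketch stays offline.

```python
from typing import Dict, List, Optional, Tuple


def parse_entry(entry: str) -> Tuple[str, Optional[int]]:
    """Split 'host' or 'host:port' into (host, port or None)."""
    host, sep, port = entry.partition(":")
    return host, int(port) if sep else None


def render_policy(allowlist: List[str], resolved: Dict[str, str]) -> Tuple[str, str]:
    """Return (hosts_file_text, nft_ruleset_text) for the allowlist.

    `resolved` maps each host to the single IPv4 address it was pinned to
    at startup; the same address is used in both outputs.
    """
    hosts_lines = ["127.0.0.1 localhost"]
    rules = []
    for entry in allowlist:
        host, port = parse_entry(entry)
        addr = resolved[host]
        hosts_lines.append(f"{addr} {host}")
        if port is None:
            # Assumption: no port in the entry means any port on that host.
            rules.append(f"    ip daddr {addr} accept")
        else:
            rules.append(f"    ip daddr {addr} tcp dport {port} accept")
    ruleset = "\n".join([
        "table inet h5i_egress {",  # hypothetical table name
        "  chain output {",
        "    type filter hook output priority 0; policy drop;",
        '    oifname "lo" accept',
        *rules,
        "  }",
        "}",
    ])
    return "\n".join(hosts_lines) + "\n", ruleset + "\n"


if __name__ == "__main__":
    pinned = {"example.com": "192.0.2.10", "github.com": "198.51.100.7"}
    hosts, ruleset = render_policy(["example.com", "github.com:443"], pinned)
    print(hosts)
    print(ruleset)
```

Note that nothing in the generated ruleset opens port 53. A name that is not in the hosts file has no way to resolve, and an address that is not in the ruleset has no way out.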

The natural objection: if the code runs with capabilities over its own network namespace, can't it just delete the firewall? It tries — and can't:

box

box$ nft flush ruleset
  Unable to initialize Netlink socket: Operation not permitted
box$ curl https://1.1.1.1
  timed out                             — still blocked

The seccomp socket gate sits in front of socket() and denies the AF_NETLINK family outright. nft, ip, and every other tool that would rewrite the ruleset or the routes needs a netlink socket to do it — so they can't open the door they'd need. The firewall is set before the untrusted program starts and is out of its reach afterward. Two cheap mechanisms — a syscall gate and a packet filter — compose into something neither gives alone.
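The composition is easier to see as a toy model. The real gate is a seccomp filter enforced by the kernel; the Python below only models its decision and is not how h5i implements it. The names gate_socket and Box are hypothetical, and the address-family numbers are the Linux values, hard-coded so the sketch runs anywhere.

```python
import errno
import os

# Linux address-family numbers, hard-coded for portability of the sketch.
AF_UNIX, AF_INET, AF_INET6, AF_NETLINK = 1, 2, 10, 16

DENIED_FAMILIES = {AF_NETLINK}


def gate_socket(family: int) -> int:
    """Model of the gate: 0 allows socket(family, ...), an errno denies it."""
    return errno.EPERM if family in DENIED_FAMILIES else 0


class Box:
    """A confined namespace whose firewall was installed before launch."""

    def __init__(self, rules):
        self.rules = list(rules)

    def flush_ruleset(self):
        # Tools like nft reach the kernel through a netlink socket,
        # so the flush has to pass the gate first.
        err = gate_socket(AF_NETLINK)
        if err:
            raise PermissionError(err, os.strerror(err))
        self.rules.clear()


if __name__ == "__main__":
    box = Box(["ip daddr 192.0.2.10 accept"])
    try:
        box.flush_ruleset()
    except PermissionError as exc:
        print("flush denied:", exc.strerror)
    print("rules still in place:", box.rules)
```

Ordinary sockets such as AF_INET still pass the gate, which is why the packet filter is needed as well: the gate protects the ruleset, and the ruleset decides where packets may go.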

Rootless, the whole way down. Namespaces, the slirp4netns uplink, nftables in the namespace, the bind-mounted /etc/hosts: none of it needs host root or a VM. The tradeoff is honest: this is strong containment for ordinary and opportunistic threats, not a hardware boundary against a determined kernel exploit. For that, h5i reserves microVM tiers it doesn't pretend to ship yet.

Everything is auditable after the fact

Confinement decides what code can do; the record tells you what it did. Every run in an env is captured: its output, its exit code, and the exact policy that was enforced (pinned by hash, so it's tamper-evident), with any secrets redacted from the evidence. Blocked accesses are recorded too, not silently swallowed.

~/my-project

$ h5i env log fix-auth          # every command, secret use, and denial
$ h5i env inspect fix-auth --capture <id>   # one run's full record
$ h5i env compare a b c        # rank parallel attempts side by side

Because the record is content-addressed and lives in git refs, it travels with h5i push / pull: one agent can propose a change on its clone and a teammate (or another agent) can review and apply it on theirs, with the full evidence in hand. The web dashboard renders the timeline and scores every allowed and blocked action, so the boxes that pushed hardest on their boundary are the ones you look at first.
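As a rough illustration of what "pinned by hash" buys you, the sketch below hashes a canonical form of the policy and stores the digest in each run record, so a later change to the policy no longer matches. The article does not describe h5i's actual record format, so everything here is an assumption: SHA-256, canonical JSON, the field names, and the function names.

```python
import hashlib
import json
from typing import Dict, List


def policy_digest(policy: Dict) -> str:
    """Hash a canonical JSON form so key order cannot change the digest."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_record(command: str, exit_code: int, output: str,
                policy: Dict, secrets: List[str]) -> Dict:
    """Build a run record with secrets scrubbed and the policy pinned."""
    for secret in secrets:
        output = output.replace(secret, "[REDACTED]")
    return {
        "command": command,
        "exit_code": exit_code,
        "output": output,
        "policy_sha256": policy_digest(policy),
    }


def policy_matches(record: Dict, policy: Dict) -> bool:
    """True only if the record was produced under exactly this policy."""
    return record["policy_sha256"] == policy_digest(policy)


if __name__ == "__main__":
    policy = {"isolation": "supervised", "net": {"egress": ["example.com"]}}
    record = make_record("cargo build", 0, "token=abc123 ok", policy, ["abc123"])
    loosened = {"isolation": "supervised", "net": {"egress": ["example.com", "evil.test"]}}
    print(policy_matches(record, policy), policy_matches(record, loosened))
```

A reviewer on another clone can recompute the digest from the policy they were sent and compare it with the one in the record before trusting the evidence.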

When to reach for it

Use an env whenever you'd hesitate to let a command touch your real tree or your real network: a dependency upgrade, a build you don't fully trust, an agent working unattended, a refactor you want to diff before it lands. Skip it for quick, trusted edits — that's what the workspace tier is for, and it costs nothing. The point isn't maximum paranoia on every command; it's that when the work is risky, you get a box that's confined as strongly as your host allows and a record you can actually audit — in the same tool that already holds your history.

Give your agents a box, not your shell

h5i is open source, Apache 2.0, and runs entirely locally — no model in the path.
