An Auditable Sandbox for AI Agents
You want an agent to run the dependency upgrade, the refactor, the unfamiliar build — but
not loose in your working tree with your credentials and your network. h5i's
env gives it a disposable, confined box instead: a git worktree plus a policy
that limits what code can read, write, and reach, and a record of everything it did.
A coding agent is a process you didn't write, running commands you didn't read, with whatever
access your shell has. Most of the time it installs a package and edits a file. Sometimes a
poisoned dependency, a prompt-injected README, or an over-eager plan turns "run the build" into
reading ~/.ssh, posting your environment to a webhook, or rewriting files far
outside the task. The uncomfortable combination — private data, untrusted content, and network
egress in the same process — is what makes an agent worth sandboxing rather than trusting.
The usual answer is a container. That works, but it's a second tool with its own daemon, its own image, and its own divergence from your real repo. h5i takes a narrower goal: give the agent an isolated environment tied to your git history, confine it with what the host already provides, and keep a tamper-evident record — so you can hand off the risky work and audit exactly what happened.
The loop: create, shell, audit, apply
An env is a git worktree (its own checkout under .git/.h5i/, invisible
to your main tree) plus a policy pinned at creation. You work with it in four commands:
    $ h5i env create fix-auth
    created env · isolation: supervised · worktree under .git/.h5i/…
    $ h5i env shell fix-auth     # work inside it, or hand the box to an agent
    box$ cargo build && cargo test
    box$ exit
    $ h5i env diff fix-auth      # what changed, vs the frozen starting point
    $ h5i env apply fix-auth     # merge onto your branch, only when you choose to
create freezes a base commit and forks a branch. shell drops you (or
the agent) into the box with a real terminal — so every command the session runs is confined,
not just the ones someone remembered to wrap. diff shows the change against the
frozen base. apply merges it onto your branch, and it never happens on its own: you
review first. For a single one-off command there's h5i env run fix-auth -- cargo build,
which runs it the same confined way and captures the output.
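For readers who think in plain git, the create / diff / apply loop maps roughly onto three familiar commands. The sketch below only builds the argument lists and runs nothing. It illustrates the idea and is not h5i's implementation: the `env/<name>` branch naming, the `worktree_commands` helper, and the caller-supplied worktree path are all assumptions made for illustration.

```python
from typing import Dict, List


def worktree_commands(name: str, base: str, path: str) -> Dict[str, List[str]]:
    """Build plain-git argv lists that approximate an env's lifecycle.

    name: env name, e.g. "fix-auth"
    base: the commit frozen at creation time
    path: where the worktree lives (caller-supplied; hypothetical here)
    """
    branch = f"env/{name}"  # hypothetical branch naming, not h5i's
    return {
        # Fork a new branch at the frozen base and check it out in its own worktree.
        "create": ["git", "worktree", "add", "-b", branch, path, base],
        # Compare the worktree's current files against the frozen base commit.
        "diff": ["git", "-C", path, "diff", base],
        # Merge the env's branch onto whatever branch is checked out in the main tree.
        "apply": ["git", "merge", branch],
    }


if __name__ == "__main__":
    for step, argv in worktree_commands("fix-auth", "HEAD", "/tmp/envs/fix-auth").items():
        print(step, " ".join(argv))
```

The useful property is that the diff is always taken against the frozen base, so the review does not shift if your main branch moves while the agent works.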
Tiered isolation, picked for you
"Confined" means different things on different machines, and h5i is honest about it. Isolation
is a ladder of tiers, and h5i env create with no flag picks the strongest level the
host can actually enforce. If you ask for a level the host can't provide, it refuses rather than
quietly running with less — you never think you're sandboxed when you aren't.
| Tier | What confines the code | Network |
|---|---|---|
| workspace | git worktree only — for trusted code | host (unrestricted) |
| process | Landlock filesystem allowlist + a seccomp syscall deny-list + user/mount/net namespaces + cgroup limits, all rootless | off, or full |
| supervised | the process tier + a live seccomp socket gate | off, or a real egress allowlist |
| container | rootless Podman — dropped capabilities, read-only root filesystem, no docker socket | an egress allowlist (L7 proxy) |
None of this needs root or a VM. h5i env probe prints exactly what your host
supports; if rootless Podman isn't installed, create tells you so when it lands on a
kernel tier, instead of silently leaving the container tier out of reach.
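The selection rule in this section (take the strongest tier the host supports, and refuse an explicit request the host can't meet) fits in a few lines. This is a minimal sketch, not h5i's code: it assumes the table's row order is the ladder from weakest to strongest, and `pick_tier` and `TierUnavailable` are hypothetical names.

```python
from typing import Iterable, Optional

# Assumed ordering, weakest to strongest, taken from the table's row order.
LADDER = ["workspace", "process", "supervised", "container"]


class TierUnavailable(Exception):
    """Raised instead of silently falling back to a weaker tier."""


def pick_tier(supported: Iterable[str], requested: Optional[str] = None) -> str:
    available = set(supported)
    if requested is not None:
        if requested not in LADDER:
            raise ValueError(f"unknown tier: {requested}")
        if requested not in available:
            # Refuse rather than downgrade: the caller asked for a guarantee.
            raise TierUnavailable(f"host cannot enforce '{requested}'")
        return requested
    # No explicit request: walk the ladder from the top down.
    for tier in reversed(LADDER):
        if tier in available:
            return tier
    raise TierUnavailable("host supports no known tier")


if __name__ == "__main__":
    print(pick_tier({"workspace", "process", "supervised"}))
```

The explicit-request branch never falls through to the loop, and that is the whole point: a downgrade is an error, not a default.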
The interesting part: an egress allowlist you can't undo
Filesystem confinement is well-trodden. Network egress is where most sandboxes get honest about
their limits. An L7 proxy allowlist — the kind a container egress filter or a SOCKS proxy gives
you — only stops programs that respect the proxy; a raw socket to an IP walks straight past it.
h5i's supervised tier enforces the allowlist a layer lower, at L3/L4, where a raw
socket can't help:
    isolation = "supervised"
    net.egress = ["example.com", "pypi.org", "github.com:443"]
Inside the box, the allowlist holds — and there's no DNS channel for anything off it:
    box$ curl https://example.com
    200 OK  (on the allowlist)
    box$ curl https://www.cloudflare.com
    could not resolve host  (blocked, no DNS for it)
    box$ python3 -c 'import socket; socket.create_connection(("1.1.1.1",443))'
    timed out  (raw IP dropped at L3/L4)
Three pieces make that work, all without root:
- A namespaced uplink. The confined process gets its own network namespace, and slirp4netns gives that namespace a NAT'd uplink from user space, with no CAP_NET_ADMIN on the host required.
- An nftables default-drop allowlist. Inside its namespace the process holds capabilities over its own network only, so it installs an nftables ruleset that drops everything except the pinned addresses. This is the layer that actually stops a packet.
- DNS pinned through /etc/hosts. Each allowlisted host is resolved once, at startup, and written into a private /etc/hosts bound only inside the box. No port 53 is opened at all: allowlisted names resolve to exactly the address nftables permits, and everything else simply doesn't resolve. That kills both DNS rebinding and DNS as an exfiltration side-channel.
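To make the pinning concrete, here is a minimal sketch of how an allowlist could be turned into a private hosts file and a default-drop nftables ruleset from a single resolution pass. It is an illustration under stated assumptions, not h5i's implementation: the `build_pins` helper, the `egress` table and `out` chain names, the loopback rule, and the choice to allow every port when an entry names none are all hypothetical. The resolver is injectable so the example runs without touching the network.

```python
import ipaddress
import socket
from typing import Callable, List, Optional, Tuple


def parse_entry(entry: str) -> Tuple[str, Optional[int]]:
    """Split "host" or "host:port" (hostnames only, no IPv6 literals)."""
    host, sep, port = entry.partition(":")
    return host, (int(port) if sep else None)


def system_resolve(host: str) -> str:
    """Resolve once via the system resolver (IPv4 only)."""
    return socket.gethostbyname(host)


def build_pins(
    allowlist: List[str],
    resolve: Callable[[str], str] = system_resolve,
) -> Tuple[str, str]:
    """Return (hosts_file_text, nftables_ruleset_text) from one resolution pass."""
    hosts_lines = ["127.0.0.1 localhost"]  # assumed baseline entry
    rules = []
    for entry in allowlist:
        host, port = parse_entry(entry)
        addr = ipaddress.ip_address(resolve(host))
        family = "ip" if addr.version == 4 else "ip6"
        # The same address feeds both files, so name and firewall cannot disagree.
        hosts_lines.append(f"{addr} {host}")
        if port is None:
            rules.append(f"{family} daddr {addr} accept")
        else:
            rules.append(f"{family} daddr {addr} tcp dport {port} accept")
    body = "\n".join(f"    {rule}" for rule in rules)
    ruleset = (
        "table inet egress {\n"
        "  chain out {\n"
        "    type filter hook output priority 0; policy drop;\n"
        '    oifname "lo" accept\n'
        f"{body}\n"
        "  }\n"
        "}\n"
    )
    return "\n".join(hosts_lines) + "\n", ruleset


if __name__ == "__main__":
    # Documentation-range addresses stand in for real lookups.
    fake = {"example.com": "192.0.2.10", "github.com": "192.0.2.20"}
    hosts, ruleset = build_pins(["example.com", "github.com:443"], resolve=fake.__getitem__)
    print(hosts)
    print(ruleset)
```

Because both outputs come from the same lookup, a later DNS answer has no way to point an allowlisted name at an address the packet filter would reject or, worse, accept by accident.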
The natural objection: if the code runs with capabilities over its own network namespace, can't it just delete the firewall? It tries — and can't:
    box$ nft flush ruleset
    Unable to initialize Netlink socket: Operation not permitted
    box$ curl https://1.1.1.1
    timed out  (still blocked)
The seccomp socket gate sits in front of socket() and denies the
AF_NETLINK family outright. nft, ip, and every other tool
that would rewrite the ruleset or the routes needs a netlink socket to do it — so they can't
open the door they'd need. The firewall is set before the untrusted program starts and is out of
its reach afterward. Two cheap mechanisms — a syscall gate and a packet filter — compose into
something neither gives alone.
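The gate's decision is tiny, which is part of why it is cheap. The sketch below models only the verdict for a `socket()` call by address family; it does not install a seccomp filter, and `gate_socket` is a hypothetical name. It assumes the denial surfaces as `EPERM`, which matches the "Operation not permitted" that `nft` printed above.

```python
import errno
import socket

# AF_NETLINK is 16 on Linux; fall back to that value where Python doesn't expose it.
AF_NETLINK = getattr(socket, "AF_NETLINK", 16)

# Families the gate refuses outright (only the one named in the text).
DENIED_FAMILIES = {AF_NETLINK}


def gate_socket(family: int) -> int:
    """Return 0 to let socket() proceed, or an errno value to deny it."""
    if family in DENIED_FAMILIES:
        return errno.EPERM
    return 0


if __name__ == "__main__":
    print("AF_INET    ->", gate_socket(socket.AF_INET))
    print("AF_NETLINK ->", gate_socket(AF_NETLINK))
```

Ordinary TCP and UDP sockets pass the gate and are then judged by the packet filter; only the family that could edit the filter is refused.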
A network namespace, slirp4netns, nftables, a seccomp gate, and a pinned /etc/hosts: none of it needs host root or a VM. The
tradeoff is honest: this is strong containment for ordinary and opportunistic threats, not a
hardware boundary against a determined kernel exploit. For that, h5i reserves microVM tiers it
doesn't pretend to ship yet.
Everything is auditable after the fact
Confinement decides what code can do; the record tells you what it did. Every run in an env is captured — its output, exit code, the exact policy that was enforced (pinned by hash, so it's tamper-evident), and any secrets redacted out of the evidence. Blocked accesses are recorded too, not silently swallowed.
    $ h5i env log fix-auth                      # every command, secret use, and denial
    $ h5i env inspect fix-auth --capture <id>   # one run's full record
    $ h5i env compare a b c                     # rank parallel attempts side by side
Because the record is content-addressed and lives in git refs, it travels with
h5i push / pull: one agent can propose a change on its clone and a
teammate (or another agent) can review and apply it on theirs, with the full evidence in hand.
The web dashboard renders the timeline and scores every allowed and
blocked action, so the boxes that pushed hardest on their boundary are the ones you look at
first.
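"Pinned by hash" is easy to picture in code. The following is a minimal sketch of the idea, assuming a SHA-256 digest over a canonical JSON encoding of the policy and literal string replacement for redaction. The record fields and the helper names (`policy_digest`, `make_record`, `policy_matches`) are hypothetical and are not h5i's actual format.

```python
import hashlib
import json
from typing import Iterable, List


def policy_digest(policy: dict) -> str:
    """Hash a canonical encoding so key order doesn't change the digest."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def redact(text: str, secrets: Iterable[str]) -> str:
    """Replace each known secret value before the output is stored."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return text


def make_record(command: List[str], output: str, exit_code: int,
                policy: dict, secrets: Iterable[str] = ()) -> dict:
    """Capture one run together with the digest of the policy it ran under."""
    return {
        "command": command,
        "output": redact(output, secrets),
        "exit_code": exit_code,
        "policy_sha256": policy_digest(policy),
    }


def policy_matches(record: dict, policy: dict) -> bool:
    """True only if this exact policy is the one pinned in the record."""
    return record["policy_sha256"] == policy_digest(policy)


if __name__ == "__main__":
    policy = {"isolation": "supervised", "net": {"egress": ["example.com"]}}
    record = make_record(["cargo", "build"], "token=s3cr3t ok", 0, policy, ["s3cr3t"])
    print(record["output"], policy_matches(record, policy))
```

Any change to the policy, even one extra allowlisted host, produces a different digest, so a record can't later be passed off as having run under looser or tighter rules than it did.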
When to reach for it
Use an env whenever you'd hesitate to let a command touch your real tree or your real network: a
dependency upgrade, a build you don't fully trust, an agent working unattended, a refactor you
want to diff before it lands. Skip it for quick, trusted edits — that's what the
workspace tier is for, and it costs nothing. The point isn't maximum paranoia on
every command; it's that when the work is risky, you get a box that's confined as strongly as
your host allows and a record you can actually audit — in the same tool that already holds your
history.
Give your agents a box, not your shell
h5i is open source, Apache 2.0, and runs entirely locally — no model in the path.