Sandboxing AI Agents, Part 4: How h5i Implements It
h5i treats a sandboxed agent run as a Git-addressed unit of work: code branch, reasoning context, policy, command evidence, denials, and review decision all travel together.
h5i's sandbox feature is named env. The name is deliberate. An environment is more than a container and more than a worktree. It is the place an agent works, the policy that confines it, the evidence that records it, and the review lifecycle that decides whether the result reaches the main branch.
[Figure: h5i env sandbox core abstraction]
An h5i environment fuses three objects. The code branch is a Git worktree and branch for the agent's file changes. The context branch is the h5i reasoning and memory context associated with the work. The environment manifest records policy, identity, base commit, captures, and provenance. Together, they make the sandboxed run audit-reconstructable: not bit-for-bit replay, but a durable answer to who ran what, under which policy, against which tree, producing which diff.
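One way to picture the fused record is a small manifest whose policy is hashed, so a reviewer can tell which policy a run used. The following is only a sketch: the field names, the branch naming, and the choice of SHA-256 over canonical JSON are assumptions for illustration, not h5i's actual schema.

```python
import hashlib
import json
from dataclasses import dataclass, field


def policy_digest(policy: dict) -> str:
    """Hash a policy independently of key order (canonical JSON, SHA-256)."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class EnvManifest:
    # Every field name here is hypothetical, not h5i's on-disk format.
    name: str
    code_branch: str      # Git branch backing the agent's worktree
    context_branch: str   # reasoning and memory context tied to the work
    base_commit: str      # tree the run started from
    policy: dict
    captures: list = field(default_factory=list)

    def digest(self) -> str:
        return policy_digest(self.policy)


manifest = EnvManifest(
    name="fix-auth",
    code_branch="env/fix-auth",        # assumed naming scheme
    context_branch="ctx/fix-auth",     # assumed naming scheme
    base_commit="<base commit id>",    # placeholder, not a real commit
    policy={"isolation": "process", "net": {"egress": ["pypi.org"]}},
)

# The same policy written with a different key order hashes identically.
same_policy_reordered = {"net": {"egress": ["pypi.org"]}, "isolation": "process"}
```

Because the digest depends only on policy content, two runs under the same policy can be compared directly, and any change to the policy shows up as a different digest.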
    $ h5i env create fix-auth --profile agent
    $ h5i env shell fix-auth
    box$ cargo test
    box$ exit
    $ h5i env log fix-auth
    $ h5i env diff fix-auth
    $ h5i env propose fix-auth
    $ h5i env apply fix-auth
The user gets a normal review shape: create, run or shell, inspect logs, inspect diff, propose the mediated commit, and apply when satisfied. Apply refuses an unproposed environment: the mediated-commit step is mandatory, never implicit. The agent gets a workspace that feels like a real project. The repository gets a durable record rather than an unstructured transcript.
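The rule that apply refuses an unproposed environment can be modeled as a tiny state machine. This is a sketch under assumptions: the state names, the `ReviewError` type, and whether an environment may be proposed more than once are all hypothetical, not taken from h5i.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    PROPOSED = "proposed"
    APPLIED = "applied"


class ReviewError(Exception):
    """Raised when a lifecycle step is attempted out of order."""


class EnvLifecycle:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.CREATED

    def propose(self) -> None:
        # Assumption: re-proposing is harmless, but nothing follows apply.
        if self.state is State.APPLIED:
            raise ReviewError(f"{self.name} has already been applied")
        self.state = State.PROPOSED

    def apply(self) -> None:
        # The mediated commit must exist first; apply never creates it implicitly.
        if self.state is not State.PROPOSED:
            raise ReviewError(f"{self.name} must be proposed before apply")
        self.state = State.APPLIED
```

Calling `apply()` on a freshly created `EnvLifecycle("fix-auth")` raises `ReviewError`; calling `propose()` first lets it succeed.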
Tiered isolation
h5i does not pretend every host supports every boundary. It resolves a requested isolation claim against the local machine and refuses when the claim cannot be met. The tiers are intentionally explicit:
| Tier | Boundary | Defends well against | Does not claim |
|---|---|---|---|
| workspace | Git worktree only | parallel edit collisions | execution containment |
| process | Landlock, seccomp denylist, user/mount/PID/net namespaces, cgroups or rlimits | accidental damage, many prompt-injection effects, deny-network local runs | microVM-grade hostile-code isolation |
| supervised | process tier plus seccomp-notify socket gate and rootless L3/L4 egress | off-allowlist network, raw-IP bypass attempts, netlink firewall rewrites | dynamic policy patching or separate kernel |
| container | rootless Podman with dropped capabilities, read-only rootfs, no docker socket, L7 proxy egress | containerized toolchains and stronger filesystem packaging | un-bypassable raw-socket egress unless direct sockets are blocked by config |
h5i reserves hardened-container and microvm as future slots and does not claim either is shipped. That is important: an isolation enum should express both current capability and a fail-closed upgrade path.
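Fail-closed resolution can be sketched as a lookup that either returns the requested tier or raises, and never silently downgrades. The capability names and the mapping from tier to required capabilities below are assumptions loosely based on the table above; they are not h5i's real probe logic.

```python
FUTURE_TIERS = ("hardened-container", "microvm")

# Hypothetical capability names; a real probe would inspect the kernel and tools.
REQUIREMENTS = {
    "workspace": set(),
    "process": {"landlock", "seccomp", "user_namespaces"},
    "supervised": {"landlock", "seccomp", "user_namespaces",
                   "seccomp_notify", "nftables"},
    "container": {"podman_rootless"},
}


class IsolationUnavailable(Exception):
    """The requested isolation claim cannot be honored on this host."""


def resolve_tier(requested: str, host_caps: set) -> str:
    """Return the requested tier, or refuse. Never fall back to a weaker one."""
    if requested in FUTURE_TIERS:
        raise IsolationUnavailable(f"{requested} is reserved but not shipped")
    if requested not in REQUIREMENTS:
        raise IsolationUnavailable(f"unknown tier: {requested}")
    missing = REQUIREMENTS[requested] - set(host_caps)
    if missing:
        raise IsolationUnavailable(
            f"{requested} needs {sorted(missing)} which this host lacks"
        )
    return requested
```

The design choice worth noting is the absence of any fallback branch: a caller asking for supervised on a host without nftables gets an error, not a quietly weaker sandbox.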
Worktree plus mediated commit
The worktree gives the agent a real checkout. At confined tiers, h5i hides the shared Git internals from the workload and treats the filesystem diff as the output. The host-side h5i process stages and commits after checking paths. This avoids giving the untrusted process direct write access to refs, hooks, shared objects, and other worktrees.
This design also makes review ordinary. You can inspect a diff, compare multiple environments, and apply the one you want. The sandbox does not auto-merge just because the command exited successfully.
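The host-side path check before staging might look roughly like this. It is a simplified sketch, not h5i's implementation: the function names are hypothetical, and a real check would also need to consider symlinks and other filesystem details that a pure path comparison cannot see.

```python
from pathlib import PurePosixPath


def is_committable(rel_path: str) -> bool:
    """Accept only relative paths that stay inside the worktree
    and do not touch Git internals."""
    p = PurePosixPath(rel_path)
    if p.is_absolute() or not p.parts:
        return False              # absolute or empty path
    if ".." in p.parts:
        return False              # could climb out of the worktree
    if ".git" in p.parts:
        return False              # refs, hooks, objects, config
    return True


def filter_diff(changed_paths):
    """Split the workload's changed paths into stageable and rejected."""
    accepted = [p for p in changed_paths if is_committable(p)]
    rejected = [p for p in changed_paths if not is_committable(p)]
    return accepted, rejected
```

Given `["src/auth.rs", ".git/hooks/pre-commit", "../other/file"]`, only `src/auth.rs` would be staged; the rejected paths are exactly the kind of evidence a reviewer would want surfaced.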
Filesystem defense
In the process tier, h5i grants write access to the environment worktree and selected read-only system paths. It does not grant the user's whole home directory. It does not rely on "deny this child under an allowed parent" semantics because Landlock is allowlist-only. Profiles that need runtime state, such as an agent's own credential cache, must say so explicitly.
The defense here is against accidental host writes, prompt-injected commands that try to read common secret paths, and dependency scripts that assume normal home-directory access. It is not a promise that the host kernel is unreachable; this tier still shares the kernel.
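The allowlist-only point is easier to see in a toy model. The sketch below does not call Landlock; it only imitates the "grant rights beneath a directory" shape in plain Python, with hypothetical paths, to show why there is no way to subtract a child from an allowed parent.

```python
from pathlib import PurePosixPath

READ, WRITE = "read", "write"


def allowed(path: str, access: str, rules: dict) -> bool:
    """rules maps a directory to the rights granted at and beneath it.
    Access is granted if any rule covers the path; nothing can revoke it."""
    target = PurePosixPath(path)
    if ".." in target.parts:
        return False  # this model expects already-normalized absolute paths
    for root, rights in rules.items():
        root_path = PurePosixPath(root)
        if (target == root_path or root_path in target.parents) and access in rights:
            return True
    return False


# Hypothetical policy: writable worktree, read-only system paths, no home.
rules = {
    "/work/fix-auth": {READ, WRITE},
    "/usr": {READ},
    "/lib": {READ},
}

# Granting the home directory exposes everything under it, secrets included.
too_broad = {**rules, "/home/user": {READ}}
```

With `rules`, a read of `/home/user/.ssh/id_ed25519` is refused simply because nothing grants it. With `too_broad`, the same read succeeds, which is why home access has to be left out rather than carved up.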
Network defense
h5i's most distinctive enforcement detail is the supervised tier's rootless egress allowlist. The sandbox creates a private network namespace, uses a user-space uplink, installs a default-drop nftables ruleset inside the namespace, resolves allowlisted hosts once, pins them through a private /etc/hosts, and opens no general DNS path.
The remaining escape attempt is policy mutation: if the workload can call netlink, it can try to flush nftables or change routes. h5i places a seccomp-notify socket gate in front of socket creation and denies AF_NETLINK. That means tools such as nft and ip cannot open the control channel needed to rewrite the namespace's packet policy.
    curl https://pypi.org          # allowed when pypi.org is in net.egress
    curl https://example.invalid   # blocked: no DNS route
    curl https://1.1.1.1           # blocked: packet default-drop
    nft flush ruleset              # blocked: AF_NETLINK denied
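Those four outcomes follow from three small decisions: name lookup against pinned hosts only, default-drop on destination address, and a gate on socket family. The model below is a sketch of that logic in plain Python; it performs no real networking, the function names are hypothetical, and the address `203.0.113.10` is a documentation placeholder rather than a real pypi.org address.

```python
import ipaddress

# Linux address-family numbers.
AF_INET, AF_INET6, AF_NETLINK = 2, 10, 16


def pin_hosts(resolved: dict) -> str:
    """Render a private hosts file from names resolved once at setup."""
    return "".join(f"{ip}\t{name}\n" for name, ip in sorted(resolved.items()))


def lookup(name: str, resolved: dict):
    """No general DNS path: a name resolves only if it was pinned."""
    return resolved.get(name)


def packet_allowed(dst_ip: str, resolved: dict) -> bool:
    """Default-drop: only destinations pinned at setup pass the filter."""
    pinned = {ipaddress.ip_address(ip) for ip in resolved.values()}
    return ipaddress.ip_address(dst_ip) in pinned


def socket_gate(family: int) -> bool:
    """Model of the socket gate: netlink sockets are refused."""
    return family != AF_NETLINK


resolved = {"pypi.org": "203.0.113.10"}  # placeholder address
```

In this model `lookup("example.invalid", resolved)` returns `None`, `packet_allowed("1.1.1.1", resolved)` is false, and `socket_gate(AF_NETLINK)` is false, matching the three blocked cases above.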
The container tier has a different network shape: rootless Podman plus a proxy-oriented egress allowlist. That is useful and ergonomic, but h5i documents the difference. A proxy allowlist and a packet-enforced allowlist are not the same claim.
Secrets defense
h5i includes a secrets broker model rather than treating environment variables as invisible. The policy names which secrets may be released. Captures redact values and record fingerprints so a reviewer can see that a secret was involved without seeing the secret. The design goal is not simply "make the command work"; it is "make secret use reviewable without leaking the secret into evidence."
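Redaction with fingerprints can be sketched in a few lines. The marker format, the truncated SHA-256 fingerprint, and the function names are assumptions for illustration; h5i's actual capture format may differ, and a keyed hash may be preferable for low-entropy secrets.

```python
import hashlib


def fingerprint(value: str) -> str:
    """Short, stable identifier for a secret that does not reveal it."""
    return "sha256:" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def redact(output: str, secrets: dict):
    """Replace each released secret found in captured output with a marker.
    Returns the redacted text and a map of secret name to fingerprint."""
    seen = {}
    for name, value in secrets.items():
        if value and value in output:
            fp = fingerprint(value)
            output = output.replace(value, f"[REDACTED {name} {fp}]")
            seen[name] = fp
    return output, seen
```

A reviewer reading the redacted capture sees that `API_TOKEN` appeared in the output and can match the fingerprint across runs, without the value itself ever landing in the evidence.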
Profiles for real agents are scoped to one runtime. A Claude-oriented environment should not also receive Codex credentials and egress to OpenAI endpoints; a Codex-oriented environment should not also receive Claude state and egress to Anthropic endpoints. Combining runtime credentials expands the damage of a prompt-injected agent.
Resources defense
h5i uses cgroup v2 where possible and rlimits as fallback. Memory, process count, file size, CPU, and wall-clock duration are all part of the sandbox's operational boundary. This matters for agents because many failures are not clever attacks. They are runaway builds, recursive test fixtures, package managers that explode cache size, or generated code that forks too much.
Audit and boundary pressure
h5i records command captures, raw output pointers, structured summaries, policy digests, egress summaries, redactions, and denials. The web dashboard classifies "boundary pressure" across five lanes: filesystem, network, process and privilege, resources, and provenance. The dashboard's wording is intentionally careful: a blocked off-allowlist request is important evidence, not proof of maliciousness. A sensitive access under workspace isolation is a weak-isolation gap, not a red alert.
This is the review advantage of binding sandboxing to provenance. A reviewer can prioritize the environment that attempted raw IP egress, touched sensitive paths, hit seccomp denials, or exceeded resource budgets. Clean runs and noisy runs should not consume the same review time.
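Prioritizing by boundary pressure can be sketched as counting recorded events per lane and sorting environments by the total. The five lane names come from the text above; the event kinds and their mapping to lanes are hypothetical, and a raw count is only one possible scoring choice.

```python
from collections import Counter

LANES = ("filesystem", "network", "process and privilege",
         "resources", "provenance")

# Hypothetical event kinds; h5i's real denial records may be named differently.
KIND_TO_LANE = {
    "sensitive_path_read": "filesystem",
    "egress_blocked": "network",
    "raw_ip_attempt": "network",
    "seccomp_denied": "process and privilege",
    "memory_limit_hit": "resources",
    "timeout": "resources",
    "policy_digest_mismatch": "provenance",
}


def pressure(events) -> Counter:
    """Count events per lane; unknown kinds are an error, not silently dropped."""
    counts = Counter({lane: 0 for lane in LANES})
    for kind in events:
        if kind not in KIND_TO_LANE:
            raise ValueError(f"unclassified event kind: {kind}")
        counts[KIND_TO_LANE[kind]] += 1
    return counts


def review_order(envs: dict) -> list:
    """Sort environment names so the noisiest runs are reviewed first."""
    return sorted(envs, key=lambda name: sum(pressure(envs[name]).values()),
                  reverse=True)
```

A count is evidence for ordering the queue, not a verdict: a run with several blocked requests goes to the top of the pile for a human to read, nothing more.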
What h5i defends
| Threat | h5i defense | Residual risk |
|---|---|---|
| Agent clobbers user's checkout | separate worktree and branch | none for normal Git edits; containing malicious host writes requires the process tier |
| Command reads home secrets | Landlock/file view excludes home by default | explicit profiles can reintroduce state if too broad |
| Command exfiltrates over internet | network deny or supervised egress allowlist | host mode and broad allowlists are intentionally weaker |
| Raw socket bypasses proxy | supervised L3/L4 packet filter catches raw IP | container L7 proxy tier has different claim |
| Workload rewrites firewall | AF_NETLINK denied after setup | kernel bugs remain shared-kernel risk |
| Runaway process consumes host | cgroups, rlimits, timeout | hosts without delegation may get weaker fallback limits |
| Reviewer cannot tell what happened | content-addressed captures and policy digests | not deterministic replay of external services |
What h5i does not defend
h5i's shipped tiers are shared-kernel tiers. They are useful against accidental damage, prompt-injection effects, many exfiltration attempts, and ordinary untrusted build scripts. They are not the same as a Firecracker or Kata-backed microVM boundary. A malicious binary with a working host-kernel exploit is outside the honest claim of the current implementation.
h5i also does not make a broad allowlist safe by magic. If a profile grants credentials and permits egress to a general-purpose endpoint, the policy has created a channel. Sandboxing reduces authority; it does not replace careful policy design.
The design lesson
h5i's answer to AI-agent sandboxing is not "one box to rule them all." It is a disciplined binding: create a disposable workspace, enforce the strongest local policy the host can actually satisfy, capture what happened, attach that evidence to the work, and let a reviewer choose whether the diff lands. The sandbox boundary and the review boundary are the same unit.
That is the difference between running an agent in a container and running agent work as an auditable environment. The former may be isolated. The latter can be reviewed, compared, pushed, pulled, and remembered.
Give the agent a box and keep the receipt
h5i is open source, local-first, and built around Git-native review evidence for AI-era work.