Sandbox Series · Part 4 · 2026-06-12

Sandboxing AI Agents, Part 4: How h5i Implements It

h5i treats a sandboxed agent run as a Git-addressed unit of work: code branch, reasoning context, policy, command evidence, denials, and review decision all travel together.

By Koukyosyumei · Reading time: 18 min · Tags: h5i env, Provenance, Egress

h5i's sandbox feature is named env. The name is deliberate. An environment is more than a container and more than a worktree. It is the place an agent works, the policy that confines it, the evidence that records it, and the review lifecycle that decides whether the result reaches the main branch.

Series map. This h5i deep dive follows the foundations, implementation guide, and AI sandbox comparison.

[Figure: h5i env sandbox core abstraction]

An h5i environment fuses three objects. The code branch is a Git worktree and branch for the agent's file changes. The context branch is the h5i reasoning and memory context associated with the work. The environment manifest records policy, identity, base commit, captures, and provenance. Together, they make the sandboxed run audit-reconstructable: not bit-for-bit replay, but a durable answer to who ran what, under which policy, against which tree, producing which diff.
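To make the three-object binding concrete, here is a minimal sketch of what a manifest record could look like. The class name, field names, and digest scheme are assumptions made for illustration; they are not h5i's actual schema.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvManifest:
    """Hypothetical manifest shape; field names are illustrative only."""
    name: str
    code_branch: str       # Git branch backing the agent's worktree
    context_branch: str    # branch holding the reasoning and memory context
    base_commit: str       # commit the worktree was created from
    policy: dict           # isolation tier, egress allowlist, limits, ...
    captures: tuple = ()   # content addresses of recorded command captures

    def policy_digest(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so the same policy
        # always hashes to the same digest regardless of key order.
        canonical = json.dumps(self.policy, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A digest over a canonical encoding is one way to let a reviewer confirm "under which policy" without diffing policy files by hand.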

h5i env lifecycle

$ h5i env create fix-auth --profile agent
$ h5i env shell fix-auth
box$ cargo test
box$ exit
$ h5i env log fix-auth
$ h5i env diff fix-auth
$ h5i env propose fix-auth
$ h5i env apply fix-auth

The user gets a familiar review flow: create, run or shell, inspect logs, inspect the diff, propose the mediated commit, and apply when satisfied. Apply refuses an unproposed environment: the mediated-commit step is mandatory, never implicit. The agent gets a workspace that feels like a real project. The repository gets a durable record rather than an unstructured transcript.
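The rule that apply refuses an unproposed environment can be modeled as a small state machine. This is a sketch under assumed state names (`CREATED`, `PROPOSED`, `APPLIED`); it does not describe how h5i stores lifecycle state.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    PROPOSED = "proposed"
    APPLIED = "applied"


class LifecycleError(Exception):
    """Raised when a lifecycle step is attempted out of order."""


class EnvLifecycle:
    def __init__(self, name: str):
        self.name = name
        self.state = State.CREATED

    def propose(self) -> None:
        if self.state is State.APPLIED:
            raise LifecycleError(f"{self.name}: already applied")
        self.state = State.PROPOSED

    def apply(self) -> None:
        # Fail closed: there is no implicit propose step.
        if self.state is not State.PROPOSED:
            raise LifecycleError(f"{self.name}: apply requires a proposed environment")
        self.state = State.APPLIED
```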

Tiered isolation

h5i does not pretend every host supports every boundary. It resolves a requested isolation claim against the local machine and refuses when the claim cannot be met. The tiers are intentionally explicit:

Tier	Boundary	Defends well against	Does not claim
workspace	Git worktree only	parallel edit collisions	execution containment
process	Landlock, seccomp denylist, user/mount/PID/net namespaces, cgroups or rlimits	accidental damage, many prompt-injection effects, deny-network local runs	microVM-grade hostile-code isolation
supervised	process tier plus seccomp-notify socket gate and rootless L3/L4 egress	off-allowlist network, raw-IP bypass attempts, netlink firewall rewrites	dynamic policy patching or separate kernel
container	rootless Podman with dropped capabilities, read-only rootfs, no docker socket, L7 proxy egress	containerized toolchains and stronger filesystem packaging	un-bypassable raw-socket egress unless direct sockets are blocked by config

Two further slots, hardened-container and microvm, are reserved for future tiers, and h5i does not report them as shipped. This matters because an isolation enum should express both current capability and a fail-closed upgrade path: a request for a tier the host cannot provide is refused rather than quietly downgraded.
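A fail-closed resolver along these lines might look like the sketch below. The tier names come from the table above; the host feature labels (`landlock`, `seccomp_notify`, and so on) and the requirement sets are simplified assumptions, not h5i's real probe logic.

```python
from enum import Enum


class Tier(Enum):
    WORKSPACE = "workspace"
    PROCESS = "process"
    SUPERVISED = "supervised"
    CONTAINER = "container"
    HARDENED_CONTAINER = "hardened-container"  # reserved, not shipped
    MICROVM = "microvm"                        # reserved, not shipped


# Hypothetical feature labels a host probe might report.
REQUIRES = {
    Tier.WORKSPACE: set(),
    Tier.PROCESS: {"landlock", "seccomp", "user_namespaces"},
    Tier.SUPERVISED: {"landlock", "seccomp", "user_namespaces",
                      "seccomp_notify", "nftables"},
    Tier.CONTAINER: {"podman_rootless"},
}


class TierUnavailable(Exception):
    """The requested isolation claim cannot be met on this host."""


def resolve_tier(requested: Tier, host_features: set) -> Tier:
    if requested not in REQUIRES:
        raise TierUnavailable(f"{requested.value} is reserved but not shipped")
    missing = REQUIRES[requested] - set(host_features)
    if missing:
        # Refuse instead of silently falling back to a weaker tier.
        raise TierUnavailable(f"{requested.value} needs: {sorted(missing)}")
    return requested
```

The important property is what the function never does: it never returns a tier other than the one requested.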

Worktree plus mediated commit

The worktree gives the agent a real checkout. At confined tiers, h5i hides the shared Git internals from the workload and treats the filesystem diff as the output. The host-side h5i process stages and commits after checking paths. This avoids giving the untrusted process direct write access to refs, hooks, shared objects, and other worktrees.

This design also makes review ordinary. You can inspect a diff, compare multiple environments, and apply the one you want. The sandbox does not auto-merge just because the command exited successfully.
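The host-side path check before staging can be illustrated with a purely lexical filter. The function names and the exact rejection rules are assumptions for this sketch; a real implementation would also need to handle symlinks and other filesystem details that a string check cannot see.

```python
from pathlib import PurePosixPath


def is_stageable(rel_path: str) -> bool:
    """Lexical check only: accept plain relative paths inside the worktree."""
    p = PurePosixPath(rel_path)
    if p.is_absolute():
        return False
    parts = p.parts
    if not parts:
        return False                      # empty path or "."
    if ".." in parts:
        return False                      # would escape the worktree
    if any(part.lower() == ".git" for part in parts):
        return False                      # Git metadata at any depth
    return True


def partition_changes(paths):
    """Split changed paths into (accepted, rejected) lists, preserving order."""
    accepted = [p for p in paths if is_stageable(p)]
    rejected = [p for p in paths if not is_stageable(p)]
    return accepted, rejected
```

Rejected paths are worth recording as evidence rather than dropping silently, since an attempt to write under `.git` is itself a review signal.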

Filesystem defense

In the process tier, h5i grants write access to the environment worktree and selected read-only system paths. It does not grant the user's whole home directory. It does not rely on "deny this child under an allowed parent" semantics because Landlock is allowlist-only. Profiles that need runtime state, such as an agent's own credential cache, must say so explicitly.

The defense here is against accidental host writes, prompt-injected commands that try to read common secret paths, and dependency scripts that assume normal home-directory access. It is not a promise that the host kernel is unreachable; this tier still shares the kernel.
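The allowlist-only semantics mentioned above can be modeled in a few lines: rights are the union of every grant on a path's ancestors, and there is no way to express a deny rule for a child. The class and the example paths are hypothetical; this is a model of the semantics, not a Landlock binding.

```python
from pathlib import PurePosixPath

READ, WRITE = "read", "write"


class AllowlistPolicy:
    """Toy model of additive, allowlist-only path rules."""

    def __init__(self):
        self._grants = []  # list of (root, frozenset of rights)

    def allow(self, root: str, *rights: str) -> None:
        self._grants.append((PurePosixPath(root), frozenset(rights)))

    def rights_for(self, path: str) -> set:
        p = PurePosixPath(path)
        granted = set()
        for root, rights in self._grants:
            # A grant covers the root itself and everything beneath it.
            if p == root or root in p.parents:
                granted |= rights
        return granted

    def permits(self, path: str, right: str) -> bool:
        return right in self.rights_for(path)
```

Because grants only add, excluding `~/.ssh` means never granting the home directory in the first place, which is the design choice the section describes.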

Network defense

h5i's most distinctive enforcement detail is the supervised tier's rootless egress allowlist. The sandbox creates a private network namespace, uses a user-space uplink, installs a default-drop nftables ruleset inside the namespace, resolves allowlisted hosts once, pins them through a private /etc/hosts, and opens no general DNS path.

The remaining escape attempt is policy mutation: if the workload can call netlink, it can try to flush nftables or change routes. h5i places a seccomp-notify socket gate in front of socket creation and denies AF_NETLINK. That means tools such as nft and ip cannot open the control channel needed to rewrite the namespace's packet policy.

supervised egress behavior

curl https://pypi.org            # allowed when pypi.org is in net.egress
curl https://example.invalid     # blocked: no DNS route
curl https://1.1.1.1             # blocked: packet default-drop
nft flush ruleset                # blocked: AF_NETLINK denied
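The four outcomes above can be summarized as a decision model. This sketch only simulates the decisions; it installs no nftables rules and no seccomp filter. The class name is hypothetical, and the address `192.0.2.10` is a documentation-range placeholder, not a real address for pypi.org.

```python
import socket

# AF_NETLINK exists only on Linux builds of Python; 16 is its Linux value.
AF_NETLINK = getattr(socket, "AF_NETLINK", 16)


class EgressPolicy:
    def __init__(self, pinned: dict):
        # pinned: hostname -> IPs, resolved once at setup time.
        self._hosts = {h.lower(): tuple(ips) for h, ips in pinned.items()}
        self._allowed_ips = {ip for ips in self._hosts.values() for ip in ips}

    def hosts_file(self) -> str:
        """Render the private /etc/hosts content that pins allowlisted names."""
        lines = ["127.0.0.1 localhost"]
        for host in sorted(self._hosts):
            lines.extend(f"{ip} {host}" for ip in self._hosts[host])
        return "\n".join(lines) + "\n"

    def resolve(self, host: str) -> tuple:
        # No general DNS path: unknown names resolve to nothing.
        return self._hosts.get(host.lower(), ())

    def packet_allowed(self, dst_ip: str) -> bool:
        # Default drop: only pinned addresses pass.
        return dst_ip in self._allowed_ips

    def socket_allowed(self, family: int) -> bool:
        # The socket gate denies the netlink control channel.
        return family != AF_NETLINK
```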

The container tier has a different network shape: rootless Podman plus a proxy-oriented egress allowlist. That is useful and ergonomic, but h5i documents the difference. A proxy allowlist and a packet-enforced allowlist are not the same claim.

Secrets defense

h5i includes a secrets broker model rather than treating environment variables as invisible. The policy names which secrets may be released. Captures redact values and record fingerprints so a reviewer can see that a secret was involved without seeing the secret. The design goal is not simply "make the command work"; it is "make secret use reviewable without leaking the secret into evidence."
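One way to redact values while keeping a reviewable fingerprint is sketched below. The marker format, the 12-character truncation, and the use of an unkeyed SHA-256 are all assumptions for illustration; an unkeyed hash of a low-entropy secret can be brute-forced, so a production design would more plausibly use a keyed hash.

```python
import hashlib


def fingerprint(value: str) -> str:
    """Short, stable identifier for a secret value (illustrative scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def redact(text: str, secrets: dict):
    """Replace secret values in captured text; return (text, {name: fingerprint})."""
    used = {}
    # Longest values first, so a secret that contains another is replaced whole.
    for name, value in sorted(secrets.items(), key=lambda kv: len(kv[1]), reverse=True):
        if value and value in text:
            fp = fingerprint(value)
            text = text.replace(value, f"[REDACTED:{name}:{fp}]")
            used[name] = fp
    return text, used
```

The returned mapping is what a capture would store: enough to show that a named secret was involved and whether it was the same value across runs, without the value itself.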

Profiles for real agents are scoped to one runtime. A Claude-oriented environment should not also receive Codex credentials and egress to OpenAI endpoints; a Codex-oriented environment should not also receive Claude state and egress to Anthropic endpoints. Combining runtime credentials expands the damage of a prompt-injected agent.
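A profile linter could enforce the one-runtime rule mechanically. The profile layout, the secret names, and the endpoint hostnames in this sketch are illustrative assumptions, not h5i's profile format.

```python
# Hypothetical per-runtime scope: which secrets and egress hosts belong to whom.
RUNTIME_SCOPE = {
    "claude": {"secrets": {"ANTHROPIC_API_KEY"}, "egress": {"api.anthropic.com"}},
    "codex": {"secrets": {"OPENAI_API_KEY"}, "egress": {"api.openai.com"}},
}


def cross_runtime_grants(profile: dict) -> list:
    """List grants in a profile that belong to a runtime other than its own."""
    own = profile["runtime"]
    findings = []
    for runtime in sorted(RUNTIME_SCOPE):
        if runtime == own:
            continue
        scope = RUNTIME_SCOPE[runtime]
        for name in sorted(scope["secrets"] & set(profile.get("secrets", ()))):
            findings.append(f"secret {name} belongs to {runtime}")
        for host in sorted(scope["egress"] & set(profile.get("egress", ()))):
            findings.append(f"egress {host} belongs to {runtime}")
    return findings
```

An empty result means the profile stays within one runtime's authority; any finding is a reason to split the profile.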

Resources defense

h5i uses cgroup v2 where possible and rlimits as fallback. Memory, process count, file size, CPU, and wall-clock duration are all part of the sandbox's operational boundary. This matters for agents because many failures are not clever attacks. They are runaway builds, recursive test fixtures, package managers that explode cache size, or generated code that forks too much.
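The rlimit fallback can be approximated with Python's standard `resource` module on Unix hosts. This sketch covers only CPU time, file size, and wall-clock timeout; the function names are hypothetical, and memory and process-count limits (which would map to `RLIMIT_AS` and `RLIMIT_NPROC`) are left out to keep it short. It is not how h5i applies limits.

```python
import resource
import subprocess
import sys


def rlimit_plan(cpu_seconds: int, max_file_bytes: int) -> dict:
    """Map a budget onto rlimit constants."""
    return {
        resource.RLIMIT_CPU: cpu_seconds,
        resource.RLIMIT_FSIZE: max_file_bytes,
    }


def _apply(plan: dict) -> None:
    for res, want in plan.items():
        _, hard = resource.getrlimit(res)
        # An unprivileged process cannot raise a hard limit, so clamp to it.
        limit = want if hard == resource.RLIM_INFINITY else min(want, hard)
        resource.setrlimit(res, (limit, limit))


def run_limited(argv, plan: dict, wall_clock_seconds: float):
    """Run argv with rlimits applied in the child before exec."""
    return subprocess.run(
        argv,
        preexec_fn=lambda: _apply(plan),   # runs in the child only
        timeout=wall_clock_seconds,        # wall-clock budget, enforced by the parent
        capture_output=True,
        text=True,
    )
```

Usage would look like `run_limited([sys.executable, "-c", "print('ok')"], rlimit_plan(10, 1 << 20), 30)`. Unlike cgroups, rlimits apply per process rather than to the whole process tree, which is one reason they are the weaker fallback.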

Audit and boundary pressure

h5i records command captures, raw output pointers, structured summaries, policy digests, egress summaries, redactions, and denials. The web dashboard classifies "boundary pressure" across five lanes: filesystem, network, process and privilege, resources, and provenance. The dashboard's wording is intentionally careful: a blocked off-allowlist request is important evidence, not proof of maliciousness. A sensitive access under workspace isolation is a weak-isolation gap, not a red alert.

This is the review advantage of binding sandboxing to provenance. A reviewer can prioritize the environment that attempted raw IP egress, touched sensitive paths, hit seccomp denials, or exceeded resource budgets. Clean runs and noisy runs should not consume the same review time.
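Prioritizing review by boundary pressure can be sketched as a simple tally. The five lanes come from the text; the event kind names and their lane mapping are hypothetical, and a raw count is only one possible ranking.

```python
LANES = ("filesystem", "network", "process_privilege", "resources", "provenance")

# Hypothetical mapping from recorded event kinds to lanes.
EVENT_LANE = {
    "sensitive_path_access": "filesystem",
    "off_allowlist_request": "network",
    "raw_ip_egress": "network",
    "seccomp_denial": "process_privilege",
    "resource_budget_exceeded": "resources",
    "policy_digest_mismatch": "provenance",
}


def pressure_by_lane(events) -> dict:
    """Count events per lane; unknown kinds are ignored."""
    counts = {lane: 0 for lane in LANES}
    for kind in events:
        lane = EVENT_LANE.get(kind)
        if lane is not None:
            counts[lane] += 1
    return counts


def review_order(envs: dict) -> list:
    """Order environment names noisiest first; ties break by name."""
    return sorted(envs, key=lambda name: (-sum(pressure_by_lane(envs[name]).values()), name))
```

A count is evidence for where to look first, not a verdict, which matches the careful framing above.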

What h5i defends

Threat	h5i defense	Residual risk
Agent clobbers user's checkout	separate worktree and branch	none for normal Git edits; malicious host writes need process tier
Command reads home secrets	Landlock/file view excludes home by default	explicit profiles can reintroduce state if too broad
Command exfiltrates over internet	network deny or supervised egress allowlist	host mode and broad allowlists are intentionally weaker
Raw socket bypasses proxy	supervised L3/L4 packet filter catches raw IP	container L7 proxy tier has different claim
Workload rewrites firewall	AF_NETLINK denied after setup	kernel bugs remain shared-kernel risk
Runaway process consumes host	cgroups, rlimits, timeout	hosts without delegation may get weaker fallback limits
Reviewer cannot tell what happened	content-addressed captures and policy digests	not deterministic replay of external services

What h5i does not defend

h5i's shipped tiers are shared-kernel tiers. They are useful against accidental damage, prompt-injection effects, many exfiltration attempts, and ordinary untrusted build scripts. They are not the same as a Firecracker or Kata-backed microVM boundary. A malicious binary with a working host-kernel exploit is outside the honest claim of the current implementation.

h5i also does not make a broad allowlist safe by magic. If a profile grants credentials and permits egress to a general-purpose endpoint, the policy has created a channel. Sandboxing reduces authority; it does not replace careful policy design.

The hard ceiling is the shared kernel. Process, supervised, and rootless-container tiers are strong local containment tools. They are not a substitute for separate-kernel isolation when the workload itself is a determined adversary.

The design lesson

h5i's answer to AI-agent sandboxing is not "one box to rule them all." It is a disciplined binding: create a disposable workspace, enforce the strongest local policy the host can actually satisfy, capture what happened, attach that evidence to the work, and let a reviewer choose whether the diff lands. The sandbox boundary and the review boundary are the same unit.

That is the difference between running an agent in a container and running agent work as an auditable environment. The former may be isolated. The latter can be reviewed, compared, pushed, pulled, and remembered.


