
AI code provenance: track AI-generated code in Git

Git records who ran git commit — not which model wrote the code or from what prompt. h5i attaches that provenance to every AI-assisted commit, stored in Git itself.


In an AI-assisted codebase, git log tells you a comforting lie. The author is the human who pressed commit; the message is a one-line summary. Both omit the part that actually produced the code: the prompt, the model, the agent, and whether tests passed. Six months later, "who wrote this and why" has no answer Git can give.

The problem: Git wasn't built for AI authorship

Git's identity model assumes the committer is the author. When a model writes the diff and a human reviews it, that assumption breaks, and Git has no standard field for recording the difference. Teams end up with codebases where, for many lines, nobody can recover which model or prompt produced them.

How h5i solves it

h5i captures an H5iCommitRecord for each AI-assisted commit — model, agent, prompt, token count, optional test metrics, and design decisions — and stores it as JSON in refs/h5i/notes, keyed by commit OID. Your code history is untouched; the provenance rides alongside it and is queryable with the same ergonomics as git log.
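To make the shape of that record concrete, here is a minimal Python sketch of a per-commit record serialized to JSON and keyed by commit OID. It is an illustration only: the class and field names (CommitRecord, SuiteMetrics, tokens, duration_secs, and so on) are assumptions chosen to mirror the fields listed above, not h5i's actual schema or on-disk layout.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class SuiteMetrics:
    """Optional test results attached to a commit (field names are hypothetical)."""
    framework: str
    passed: int
    failed: int
    duration_secs: float


@dataclass
class Decision:
    """One recorded design decision, using the four keys shown in this guide."""
    location: str
    choice: str
    alternatives: List[str]
    reason: str


@dataclass
class CommitRecord:
    """Illustrative stand-in for a provenance record; not the real H5iCommitRecord."""
    model: str
    agent: str
    prompt: str
    tokens: int
    tests: Optional[SuiteMetrics] = None
    decisions: List[Decision] = field(default_factory=list)


# Records are looked up by commit OID, so a plain mapping models the idea.
notes: Dict[str, CommitRecord] = {}

record = CommitRecord(
    model="claude-sonnet-4-6",
    agent="claude-code",
    prompt="add exponential backoff to the HTTP client",
    tokens=312,
    tests=SuiteMetrics(framework="pytest", passed=42, failed=0, duration_secs=1.23),
)
notes["a3f8c12"] = record

# Serialize the record the way a JSON note body might look.
payload = json.dumps(asdict(record), indent=2)
print(payload)
```

Because the record lives next to the commit rather than inside it, adding or changing a field in a structure like this never touches the commit object itself.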

Commands

The commit is the agent's job, not yours. With the hooks wired, Claude Code and Codex commit through h5i capture commit in place of git commit — and in Claude Code the prompt, model, and agent are filled in automatically, so the agent just runs:

~/my-project

$ h5i capture commit -m "add retry logic to HTTP client" --tests --audit

✔  Committed a3f8c12  add retry logic to HTTP client
    model: claude-sonnet-4-6 · agent: claude-code · 312 tokens

When you run it by hand (from Codex, from CI, or for a manual commit where no prompt-capture hook runs), pass the provenance explicitly with --intent, --model, and --agent.
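For a CI job, one way to keep those flags consistent is to build the command line in a small helper. The sketch below is hypothetical: build_capture_argv is not part of h5i, and it assumes each of --intent, --model, and --agent takes a single string value, with --intent carrying the prompt text. Check h5i's own help output before relying on that.

```python
import shlex
from typing import List, Optional


def build_capture_argv(
    message: str,
    intent: Optional[str] = None,
    model: Optional[str] = None,
    agent: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `h5i capture commit` with explicit provenance.

    Assumption: each provenance flag accepts one string argument.
    """
    argv = ["h5i", "capture", "commit", "-m", message]
    for flag, value in (("--intent", intent), ("--model", model), ("--agent", agent)):
        if value:  # skip flags that were not supplied
            argv += [flag, value]
    return argv


argv = build_capture_argv(
    "add retry logic to HTTP client",
    intent="add exponential backoff to the HTTP client",
    model="claude-sonnet-4-6",
    agent="claude-code",
)

# Print a shell-safe rendering for logs; nothing is executed here.
print(shlex.join(argv))
```

Passing the list to subprocess.run(argv, check=True) would execute it without going through a shell, so prompts containing quotes or spaces need no extra escaping.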

Read it back — per-commit provenance, and a repo-wide AI footprint:

~/my-project

$ h5i recall log --limit 1

commit a3f8c12...
Author:  Alice <alice@example.com>
Agent:   claude-code (claude-sonnet-4-6) ✨
Prompt:  "add exponential backoff to the HTTP client"
Tests:   ✔ 42 passed, 0 failed, 1.23s [pytest]

$ h5i audit vibe            # repo-wide AI footprint
  AI-generated:   38% of commits   ·   12 fully-AI directories
  top agent:      claude-code (211 commits)
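The commit percentage and top agent in a report like this come down to simple counting over per-commit records. The sketch below shows that arithmetic under a stated assumption: a commit counts as AI-generated when it has a provenance record. The function name and inputs are hypothetical, and this is not how h5i audit vibe is implemented.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def ai_footprint(
    commits: List[str],
    agent_by_commit: Dict[str, str],
) -> Tuple[float, Optional[Tuple[str, int]]]:
    """Return (percent of commits with a record, (top agent, its commit count)).

    Assumption: a commit is treated as AI-generated if it appears in
    agent_by_commit, i.e. if some provenance record exists for it.
    """
    if not commits:
        return 0.0, None
    # Count commits per agent, ignoring commits that have no record.
    agents = Counter(agent_by_commit[c] for c in commits if c in agent_by_commit)
    ai_commits = sum(agents.values())
    share = 100.0 * ai_commits / len(commits)
    top = agents.most_common(1)[0] if agents else None
    return share, top


commits = ["c1", "c2", "c3", "c4"]
agent_by_commit = {"c1": "claude-code", "c3": "claude-code", "c4": "codex"}

share, top = ai_footprint(commits, agent_by_commit)
print(f"AI-generated: {share:.0f}% of commits, top agent: {top}")
```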

Worked example: capturing the "why", not just the "what"

A commit message says what changed. When you made a non-obvious tradeoff, record the decision so the reasoning survives — it shows up under a Decisions: block in h5i recall log:

/tmp/decisions.json

[
  {
    "location": "src/http_client.rs:88",
    "choice": "exponential backoff with jitter",
    "alternatives": ["fixed delay", "linear backoff"],
    "reason": "reduces thundering herd under high load"
  }
]

~/my-project

$ h5i capture commit -m "add retry logic" \
    --decisions /tmp/decisions.json
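If an agent or script generates the decisions file, it can help to check its shape before committing. This is a minimal sketch of such a check. It assumes all four keys shown above are required and that alternatives must be an array; h5i itself may be more lenient or stricter. validate_decisions is a hypothetical helper, not an h5i command.

```python
import json
from typing import List

# The four keys used in the decisions example in this guide.
REQUIRED_KEYS = {"location", "choice", "alternatives", "reason"}


def validate_decisions(text: str) -> List[str]:
    """Return a list of problems found; an empty list means the shape looks fine."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, list):
        return ["top level must be a JSON array"]

    problems: List[str] = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"entry {i}: missing {sorted(missing)}")
        if "alternatives" in entry and not isinstance(entry["alternatives"], list):
            problems.append(f"entry {i}: 'alternatives' must be an array")
    return problems


sample = """
[
  {
    "location": "src/http_client.rs:88",
    "choice": "exponential backoff with jitter",
    "alternatives": ["fixed delay", "linear backoff"],
    "reason": "reduces thundering herd under high load"
  }
]
"""

print(validate_decisions(sample))  # prints [] for the example file above
```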

Make it automatic. h5i hook setup wires Claude Code so the exact prompt, model, and agent are captured on every commit — no flags to remember. The provenance is then a byproduct of working, not a discipline you have to maintain.

Frequently asked questions

What exactly is captured as provenance?

For each AI-assisted commit, h5i stores model, agent identity, the prompt, a token count, optional test metrics (framework, pass/fail, duration), and any recorded design decisions — as JSON in refs/h5i/notes keyed by the commit OID.

Does this change my Git history or commit hashes?

No. h5i wraps a normal git commit; the provenance is stored in a separate ref (refs/h5i/notes), so your code history, hashes, and a plain git log are unchanged. Teammates who don't use h5i see ordinary Git.

How do I find what fraction of the repo is AI-generated?

h5i audit vibe reports a repo-wide footprint: the percentage of AI-generated commits, fully-AI directories, the most active agents, and token-leak signals.

Can I record why an approach was chosen, not just the prompt?

Yes. Pass --decisions with a JSON array of {location, choice, alternatives, reason} to h5i capture commit. Those appear in h5i recall log under a Decisions block, preserving context that never fits in a commit message.

Is the provenance shared with my team?

Only if you opt in. The notes ref isn't moved by a plain git push; run h5i share push to share it and h5i share pull to receive it, so provenance follows the code for teammates who want it.

Go deeper

Deep dive: from git blame to AI blame

How per-commit provenance becomes per-line provenance — and why that changes incident triage.

Try h5i in your repo

One cargo install, then h5i init. Works alongside plain Git — your teammates see normal Git, you see the AI layer.
