Workflow · 2026-05-06

From git blame to AI blame: per-line provenance for AI-era code

git blame answers "who wrote this" with a name, a SHA, and a date. In an AI-assisted codebase that is necessary but no longer sufficient. Here's the upgrade — per-line lineage that includes the prompt, the model, the agent, and the test result that produced each line.

git blame is one of the most-googled git commands ever shipped, and for good reason. The sequence "see suspicious line → run blame → read commit message → understand the change" is a foundational debugging primitive. It works because in a pre-AI codebase, the author's name plus the commit message is enough context — the human author knew why they wrote that line, and the commit message captures the why with reasonable fidelity.

That model breaks the moment a line was written by Claude. The git author is the human who ran git commit, but they didn't write the line — they reviewed it. The commit message summarizes the user-visible intent ("add retry logic") but loses the actual prompt, the model version, the agent identity, and the tests that ran. When you're debugging at 2am, the difference between "Alice wrote this in 2026" and "Alice prompted claude-sonnet-4-6 with 'add exponential backoff' and 42 tests passed afterwards" is the difference between an hour of guessing and a one-line confirmation.

h5i extends git's blame model with that data. Same command shape, more answers.

The two-output upgrade

Standard git blame on a line:

~/my-project
$ git blame -L 88,90 src/http_client.rs

a3f9c2b9 (Alice 2026-03-27 14:02:11 +0000  88) async fn send_with_retry(req: Request) -> Result<Response> {
a3f9c2b9 (Alice 2026-03-27 14:02:11 +0000  89)     let mut delay = Duration::from_millis(100);
a3f9c2b9 (Alice 2026-03-27 14:02:11 +0000  90)     for attempt in 0..MAX_RETRIES {

Same lines under h5i blame:

~/my-project
$ h5i blame src/http_client.rs --show-prompt

STAT  COMMIT   AUTHOR/AGENT  | CONTENT
      a3f9c2b  claude-code   | async fn send_with_retry(req: Request) -> Result<Response> {
      a3f9c2b  claude-code   |     let mut delay = Duration::from_millis(100);
      a3f9c2b  claude-code   |     for attempt in 0..MAX_RETRIES {

─── boundary at a3f9c2b ─────────────────────────────────────
  Author:  Alice <alice@example.com>
  Agent:   claude-code (claude-sonnet-4-6) ✨
  Prompt:  "add exponential backoff with jitter to the HTTP client
           — cap retries at 5 to avoid infinite loops"
  Tests:   ✔ 42 passed, 0 failed, 1.23s [pytest]
  Audit:   ✔ no integrity warnings

The four extra fields — agent, model, prompt, test result — are the difference between knowing that Alice landed the change and knowing what the change was actually for. Same git commit, four times as much context.

Where the data lives

The provenance fields are stored in refs/h5i/notes as JSON keyed by commit OID. Concretely, each AI-tagged commit has an H5iCommitRecord attached:

refs/h5i/notes
{
  "commit": "a3f9c2b9...",
  "ai_metadata": {
    "model": "claude-sonnet-4-6",
    "agent": "claude-code",
    "prompt": "add exponential backoff with jitter to the HTTP client...",
    "tokens": 312
  },
  "test_metrics": {
    "framework": "pytest",
    "passed": 42, "failed": 0, "duration_ms": 1230
  },
  "integrity_report": { "severity": "Valid", "warnings": [] }
}
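To make the record's shape concrete, here is a minimal sketch of parsing one of these JSON records and rendering a one-line summary like the blame boundary above. The field names mirror the record shown; the `summarize` helper itself is illustrative, not part of h5i.

```python
import json

def summarize(record: dict) -> str:
    """Render one blame-boundary summary line from a provenance record."""
    ai = record.get("ai_metadata") or {}
    tests = record.get("test_metrics") or {}
    agent = ai.get("agent", "human")
    model = ai.get("model", "n/a")
    test_part = f'{tests.get("passed", 0)} passed, {tests.get("failed", 0)} failed'
    return f'{record["commit"][:7]}  {agent} ({model})  tests: {test_part}'

raw = '''{
  "commit": "a3f9c2b9deadbeef",
  "ai_metadata": {"model": "claude-sonnet-4-6", "agent": "claude-code",
                  "prompt": "add exponential backoff with jitter...", "tokens": 312},
  "test_metrics": {"framework": "pytest", "passed": 42, "failed": 0, "duration_ms": 1230},
  "integrity_report": {"severity": "Valid", "warnings": []}
}'''

print(summarize(json.loads(raw)))
# a3f9c2b  claude-code (claude-sonnet-4-6)  tests: 42 passed, 0 failed
```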

The notes ref isn't pulled by a plain git fetch — it would clutter remotes that don't care. h5i push and h5i pull sync the h5i refs alongside your code, so teammates who opt in see the full provenance and others see normal git behavior.
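In plain git terms, that opt-in could look like the remote configuration below. This is a sketch of what h5i pull might arrange under the hood (the actual mechanism isn't specified here); the refspec syntax is standard git, and the refs/h5i/* namespace comes from the section above.

```ini
# .git/config on a teammate's clone who opts in to provenance sync
[remote "origin"]
    fetch = +refs/heads/*:refs/remotes/origin/*
    # mirror the h5i refs alongside branches; plain git fetch now carries them
    fetch = +refs/h5i/*:refs/h5i/*
    push  = refs/h5i/*:refs/h5i/*
```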

How h5i commit populates the record

You can capture provenance manually:

~/my-project
$ h5i commit -m "add retry logic to HTTP client" \
    --model claude-sonnet-4-6 \
    --agent claude-code \
    --prompt "add exponential backoff to the HTTP client" \
    --tests \
    --audit

  Committed a3f9c2b  add retry logic to HTTP client
    model: claude-sonnet-4-6 · agent: claude-code · 312 tokens

Or you can install the UserPromptSubmit hook from h5i hook setup and stop typing --prompt entirely. The hook captures every prompt you type to Claude Code, attaches the most recent one to the next h5i commit, and you keep using normal git ergonomics.
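For reference, Claude Code's UserPromptSubmit hooks live in settings.json. The sketch below shows the shape of what h5i hook setup plausibly writes there; the h5i hook capture-prompt subcommand name is an assumption, while the event name and the nesting follow Claude Code's hooks configuration format.

```json
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          { "type": "command", "command": "h5i hook capture-prompt" }
        ]
      }
    ]
  }
}
```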

The new question: per-line prompt history

Standard blame stops at the line's introducing commit. h5i adds --ancestry, which walks backward and shows every prompt that ever touched the line, not just the last one:

~/my-project
$ h5i log --ancestry src/http_client.rs:88

Line 88 of src/http_client.rs — 3 commits in ancestry

● a3f9c2b  2026-04-12  Alice / claude-code
  prompt: "add exponential backoff with jitter to the HTTP client"

● 7216039  2026-03-08  Alice / claude-code
  prompt: "fix off-by-one in retry counter"

● 9eff001  2026-02-24  Alice (no AI metadata)
  commit: "initial HTTP client"

This is the answer to "why is this line the way it is?" not just at the most recent edit but across its whole life. When debugging a regression that bisects to a refactor, the ancestry view tells you whether the line you're staring at was rewritten by the refactor or merely moved — and what the prompt that moved it asked for.
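(Plain git can already list a line's commit history with `git log -L 88,88:src/http_client.rs`, but without prompts.) The interleaving itself is simple; here is a sketch, with record shapes assumed to mirror the notes JSON, of how tagged commits surface their prompt while untagged ones fall back to the commit message:

```python
def ancestry_lines(commits):
    """One display line per commit: prompt if AI-tagged, message otherwise."""
    out = []
    for c in commits:
        ai = c.get("ai_metadata")
        if ai:
            out.append(f'{c["sha"][:7]}  prompt: "{ai["prompt"]}"')
        else:
            out.append(f'{c["sha"][:7]}  commit: "{c["message"]}"')
    return out

history = [
    {"sha": "a3f9c2b9", "ai_metadata": {"prompt": "add exponential backoff with jitter"}},
    {"sha": "72160391", "ai_metadata": {"prompt": "fix off-by-one in retry counter"}},
    {"sha": "9eff0012", "message": "initial HTTP client"},  # pre-h5i, no metadata
]

for line in ancestry_lines(history):
    print(line)
```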

AST-mode blame

Line-level blame breaks when code is reformatted, reflowed, or moved across files. Cosmetic changes can rewrite blame for thousands of lines and obscure the meaningful authorship.

h5i blame --mode ast blames at the AST node level instead of the line level. A function whose body wasn't semantically changed keeps its original blame even if its formatting was rewritten. You get this for free wherever h5i has parsed the file's AST (Rust, Python, and a growing list).

~/my-project
$ h5i blame src/http_client.rs --mode ast

NODE                        COMMIT   AUTHOR/AGENT
fn send_with_retry        ⤷ a3f9c2b  claude-code (semantic body)
  └ for attempt in 0..MAX ⤷ a3f9c2b  claude-code
  └ tokio::time::sleep    ⤷ 7216039  claude-code (older)
fn parse_url              ⤷ 9eff001  Alice
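The reason AST-level blame survives reformatting is that two differently formatted versions of the same function parse to the identical tree once position attributes are dropped, so a blame keyed on the node can keep its original commit. A minimal demonstration of that invariance using Python's own ast module (a sketch of the idea only; h5i's actual node fingerprinting is not shown here):

```python
import ast

def node_fingerprint(src: str) -> str:
    # Dump without attributes so line/column info (pure formatting) is ignored.
    return ast.dump(ast.parse(src), include_attributes=False)

before = "def retry_max():\n    return 5\n"
after  = "def retry_max():  return 5\n"   # same function, reflowed onto one line

print(node_fingerprint(before) == node_fingerprint(after))  # True: blame can stick
```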

Practical usage patterns

Three concrete situations where AI blame earns its keep:

1. Triaging an incident

An error is firing in parse_response. h5i blame on the function shows it was authored by claude-sonnet-4-6 with the prompt "handle the new v2 envelope format" two weeks ago. The original v1 parser was untouched. Fastest path to root cause: check whether the error payload is v1 (the parser doesn't handle it) or v2 (the parser is buggy). Without provenance, you'd have had to read both code paths first.

2. Reviewing your own past decisions

Six months later you don't remember why retry_max is 5. h5i blame shows the prompt was "…cap retries at 5 to avoid infinite loops". Decision recovered without re-reading the PR thread, the design doc, or your Slack history. Particularly valuable for solo developers whose "team handoff" is to themselves three months later.

3. Vetting an inherited codebase

You're given a service to take over. h5i log --limit 200 tells you the AI ratio (60% AI? 20%?), which agents wrote which subsystems, and what kinds of prompts were used. Combined with h5i notes review, you get a triaged list of "files most likely to harbor unreviewed AI assumptions." A 30-minute orientation that previously took days.
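The orientation arithmetic is straightforward; here is a sketch, with the record shape assumed to mirror the notes JSON, of computing the AI ratio and per-agent counts from a list of commit records:

```python
from collections import Counter

def ai_footprint(records):
    """Return (fraction of AI-tagged commits, per-agent commit counts)."""
    agents = Counter(
        r["ai_metadata"]["agent"] for r in records if r.get("ai_metadata")
    )
    ratio = sum(agents.values()) / len(records) if records else 0.0
    return ratio, agents

# 120 AI-tagged commits out of 200, all via claude-code (made-up numbers)
records = [{"ai_metadata": {"agent": "claude-code"}}] * 120 + [{}] * 80

ratio, agents = ai_footprint(records)
print(f"{ratio:.0%} AI-assisted; agents: {dict(agents)}")
# 60% AI-assisted; agents: {'claude-code': 120}
```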

Bonus. h5i vibe is the explicit command for the inherited-repo case. It scans a repo in seconds and prints AI footprint, fully-AI-written directories, leaked tokens, and prompt-injection hits. Useful even on repos you didn't author.

What this is not trying to be

AI blame is not a copyright assertion. The git author is still legally and organizationally the one accountable for the commit — they ran the prompt, reviewed the diff, and merged. The AI fields are provenance, not attribution. They tell you what tools shaped the line, the same way a build log tells you what compiler version produced the binary. Useful for debugging, auditing, and triage; not for ownership.

It's also not a substitute for code review. The fact that claude-sonnet-4-6 wrote a line doesn't make it correct or incorrect. It just makes the next reviewer faster.

Adopt incrementally

h5i blame works the moment you start using h5i commit. Pre-existing commits keep their standard git blame; new commits gain the AI fields. There's no migration, no backfill, no one-time event. The provenance accrues on the commits you make from here forward, and the ancestry view interleaves it gracefully with the historical commits you've never tagged.

getting started
$ cargo install --git https://github.com/Koukyosyumei/h5i h5i-core
$ cd your-project && h5i init
$ h5i hook setup  # auto-captures prompts via UserPromptSubmit

# From your next commit forward:
$ h5i commit -m "..."  # prompt + model + agent recorded automatically
$ h5i blame src/foo.rs --show-prompt

The first time you use it on a real bug, you'll wonder how you ever debugged AI-touched code without it.

Per-line provenance for the AI era

Same blame ergonomics. Four more answers per line.
