I then iterate on that plan.md with the AI until it's what I want. I then ask it to make a detailed todo list from the plan.md and attach it to the end of plan.md.
Once I'm fully satisfied, I tell it to execute the todo list at the end of the plan.md, and don't do anything else, don't ask me any questions, and work until it's complete.
I then commit the project.md and plan.md along with the code.
So my back and forth on getting the plan.md correct isn't in the logs, but that is much like intermediate commits before a merge/squash. The plan.md is basically the artifact an AI or another engineer can use to figure out what happened and repeat the process.
The main reason I do this is so that when the models get a lot better in a year, I can go back and ask them to modify plan.md based on project.md and the existing code, on the assumption they might find their own mistakes.
I do think there's more value in ensuring that the initial spec, or the "first prompt" (which IME is usually much bigger and tries to get 80% of the way there) is stored. And, maybe part of the product is an LLM summary of that spec, the changes we made to the spec within the session, and a summary of what is built. But... that could be the commit message? Or just in a markdown file. Or in Notion or whatever.
The objections I heard, which seemed solid, are (1) there's no single input to the AI (i.e. no single session or prompt) from which such a project is generated,
(2) the back-and-forth between human and AI isn't exactly like working with a compiler (the loop of source code -> object code) - it's also like a conversation between two engineers [1]. In the former case, you can make the source code into an artifact and treat that as "the project", but you can't really do that in the latter case, and
(3) even if you could, the resulting artifact would be so noisy and complicated that saving it as part of the project wouldn't add much value.
At the same time, people have been submitting so many Show HNs of generated projects, often with nothing more than a generated repo with a generated readme. We need a better way of processing these because treating them like old-fashioned Show HNs is overwhelming the system with noise right now [2].
I don't want to exclude these projects, because (1) some of them are good, (2) there's nothing wrong with more people being able to create and share things, (3) it's foolish to fight the future, and (4) there's no obvious way to exclude them anyhow.
But the status quo isn't great because these projects, at the moment, are mostly not that interesting. What's needed is some kind of support to make them more interesting.
So, community: what should we do?
[1] this point came from seldrige at https://news.ycombinator.com/item?id=47096903 and https://news.ycombinator.com/item?id=47108653.
YoumuChan makes a similar point at https://news.ycombinator.com/item?id=47213296, comparing it to Google search history. The analogy is different but the issue (signal/noise ratio) is the same.
[2] Is Show HN dead? No, but it's drowning - https://news.ycombinator.com/item?id=47045804 - Feb 2026 (422 comments)
If you think you should squash commits, then you're only really interested in the final code change. The history of how the dev got there can go in the bin.
If you don't think you should squash commits then you're interested in being able to look back at the journey that got the dev to the final code change.
Both approaches are valid for different reasons, but they're a source of long and furious debate on every team I've been on. Keeping a history of your AI sessions alongside the code could be useful for debugging (less code debugging, more thought-process debugging), but 'prefer squash' developers usually prefer to look at the existing code rather than the history of changes to steer it back on course, so why would they start looking at AI sessions if they don't look at commits?
All that said, your AI's memory could easily be stored and managed somewhere separate from the repo history, and in a way that makes it more easily accessible to the LLM you choose, so probably not.
What actually helps is a good commit message explaining the intent. If an AI wrote the code, the interesting part isn't the transcript, it's why you asked for it and what constraints you gave it. A one-paragraph description of the goal and approach is worth more than a 200-message session log.
I think the real question isn't about storing sessions, it's about whether we're writing worse commit messages because we assume the AI context is "somewhere."
Otherwise, when fixing a bug, you just risk starting from scratch and wasting time using the same prompts and/or assumptions that led to the issue in the first place.
Much of the reason code review was/is worth the time is because it can teach people to improve, and prevent future mistakes. Code review is not really about "correctness", beyond basic issues, because subtle logic errors are in general very hard to spot; that is covered by testing (or, unfortunately, deployment surprises).
With AI, at least as it is currently implemented, there is no learning, as such, so this removes much of the value of code review. But, if the goal is to prevent future mistakes, having some info about the prompts that led to the code at least brings some value back to the review process.
EDIT: Also, from a business standpoint, you still need to select for competent/incompetent prompters/AI users. It is hard to do so when you have no evidence of what the session looked like. Also, how can you teach juniors to improve their vibe-coding if you can't see anything about their sessions?
Those UML/use-case/constraint artifacts aren’t committed as session logs per se, but they are part of the author’s intent and reasoning that gets committed alongside the resulting code. That gives future reviewers the why as well as the what, which is far more useful than a raw AI session transcript.
Stepping back, this feels like a decent and dignified position for a programmer in 2026: humans retain architectural judgement --> AI accelerates boilerplate and edge implementation --> version history still reflects intent and accountability rather than chat transcripts. I can’t afford to let go of the productivity gains that flow from using AI as part of a disciplined engineering process, but I also don’t think commit logs should become a dumping ground for unfiltered conversation history.
If by AI you mean a non-supervised, autonomous consciousness (which I believe the term has to be reserved for), then the answer is again no, as it is as responsible for the quality of its PRs as a human is.
If the thing writing code is the former, but there's no human or responsible representative of the latter in the loop, then the code shouldn't even be suggested for consideration in a project where any people participate. In that case there's no point in storing any additional information, as the code itself has no value (besides the electricity wasted to create it) and can be regenerated on demand.
Commit comments are generally underused, though, as a result of how forges work, but that's another discussion.
Make a button that does X when clicked.
Agent makes the button.
I tell it to make the button red.
Agent makes it red.
I test it, it is missing an edge case. I tell it to fix it.
It fixes it.
I don't like where the button is. I tell it to put it in the sidebar.
It does that.
I can go on and on. But we don't need to know all those intermediate steps. We just need to know: Red button that does X by Y mechanism is in the sidebar. Tests that include edge cases are here. All tests passing. 2026-03-01
And that document is persisted.
If later, the button gets deleted or moved again or something, we can instruct the agent to say why. Button deleted because not used and was noisy. 2026-03-02
This can be made trivial via skills, but I find it a good way to understand things a bit more deeply than commit messages would allow.
Of course, we can also just write (or instruct agents to write) better PRs, but AFAICT there's no easy way to know which PR introduced or deleted the button unless you spelunk through git blame.
Not once have I found it useful: if the intention isn't clear from the code and/or concise docs, the code is bad and needs to be polished.
Well written code written with intention is instantly interpretable with an LLM. Sending the developer or LLM down a rabbit hole of drafts is a waste of cognition and context.
This is the breakdown of my process - I use tons of .md files serving as a shared brain between Claude and me:
- CLAUDE.md is in the root of the repo, and it's the foundation - it describes the project vision, structure, features, architecture decisions, tech, and others. It then goes even more granular and talks about file sizes, method sizes, problem-solving methodologies (do not reinvent the wheel if a well-known library is already out there), coding practices, constraints, and other aspects like instructions for integration tests. It's basically the manual for the project vision and plan, and also for code writing. Claude reads it every session.
- Every feature has its own .md file, which is maintained. That file describes implementation details, decisions, challenges, and anything that is relevant when starting to code on the feature, and also when it's picked up by a new session.
- At a higher level, above features, I create pairs of roadmap.md and handoff.md. Those pairs are the crucial part of my process. They cover wider modules (e.g., licensing + payments + emailing features) and serve as a bridge between sessions. Roadmap.md is basically a huge checklist, based on CLAUDE.md and features .md docs, and is maintained. The handoff.md contains the current state, session notes, and knowledge. A session would start by getting up to speed with Claude.md and the specific roadmap.md + handoff.md that you plan to work on now and would end by updating the handoff, roadmap, and the impacted features.
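As a hedged illustration of the roadmap/handoff pairing described above (module names and contents are invented, not the commenter's actual files), a handoff.md bridging two sessions might look like:

```markdown
# Handoff: licensing + payments + emailing

## Current state
- License-key validation implemented and covered by integration tests
- Renewal webhook half-wired; Stripe is in test mode only

## Session notes (2026-03-01)
- Chose signed license keys over a DB lookup to keep offline checks cheap
- Gotcha: the emailing feature rate-limits itself; see features/emailing.md

## Next session
- Finish the renewal webhook, then tick items 12-14 in roadmap.md
```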
This structure greatly helps preserve crucial context and also makes it very easy to use multi-agent.
Of course the commits and PRs are also very descriptive; however, the engine is in the .md files.
1. Writing a spec with clear acceptance criteria.
2. Assigning IDs to my acceptance criteria. Sounds tedious, but actually the idea wasn’t mine, at some point an agent went and did it without me asking. The references proved so useful for guiding my review that I formalized the process (and switched from .md to .yaml to make it easier).
3. Giving my agents a source of truth to share implementation progress so they can plan their own tasks and more effectively review.
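As a hedged sketch of the ID idea above (file name, IDs, and fields are invented; the commenter's actual format isn't shown), acceptance criteria with stable IDs in YAML might look like:

```yaml
# spec.yaml: acceptance criteria with stable IDs that agents and reviewers can cite
feature: password-reset
criteria:
  - id: AC-1
    given: a registered user requests a reset
    then: a single-use token valid for 30 minutes is emailed
  - id: AC-2
    given: a token is used a second time
    then: the request is rejected with a clear error
progress:        # shared source of truth the agents update as they work
  AC-1: implemented
  AC-2: in-review
```

A review comment can then say "AC-2 isn't covered by any test" instead of pointing at code.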
Of course, I can’t help myself, I had to formalize it into a spec standard and a toolkit. Gonna open source it all soon, but I really want feedback before I go too far down the rabbit hole:
Not insisting upon this would be similar to depending on a SaaS to compile and package software, and being totally cool with it. Both LLMs and build systems convert human-friendly notation into machine-friendly notation. We should hold the LLM companies to the same standards of transparency that we hold the people who make things like nix, clang, llvm, cmake, cargo, etc.
What are they even supposed to do with feedback on the code? It has to be translated by my teammate into the language of the work they did, which is the conversation they had with the AI agent.
But the conversation isn't the "real work": the decisions made in the conversation are the real work. That is what needs capture and review.
So now that I know why code reviews are kinda wrong, what can we do to have meaningful reviews of the work my teammates have done?
What I landed on is aiming to capture more and more "work" in the form of a spec, review the spec, and ignore the code. This isn't novel or interesting. HOWEVER...
For the large, messy, legacy codebases I work in today, I don’t like the giant spec driven development approach that is most popular today. It’s too risky to solely trust the spec because it touches so much messy code with so many gotchas. However, with the rate of AI generated code rolling in, I simply can’t switch context quickly enough to review it all efficiently. Also, it’s exhausting.
The approach I have been refining is defining very small modules (think a class or meaningful collection of utils) with a spec and a concise set of unit tests, generating code from the spec, then not reading or editing the generated code.
Any changes to the code must be made to the spec, and the code re-generated. This puts the PR conversation in the right place, against the work I have done: which is write the spec.
So far the approach has worked for replacing simple code (eg: a nestjs service that has a handful of public methods, a bit of business logic, and a few API client calls). PRs usually have a handful of lines of glue code to review, but the rest are specs (and a selection of “trust” unit tests) and the idea is that the code can be skipped.
AI review bots still review the PR and comment around code quality and potential security concerns, which I then translate into updates to the spec.
I find this to be a good step towards the codegen future without totally handing over my (very messy and not very agent friendly) codebases.
> [...]
> All contributors must indicate in the commit message of their contribution if they used AI to create them and the contributor is fully responsible for the content that they submit.
> This can be a label such as `Assisted By: <Tool>` or `Generated by: <Tool>` based on what was used. This label should be representative of the contribution and how it was created for full transparency. The commit message must also be clear about how it is solving a problem/making an improvement if it is not immediately obvious.
From "Entire: Open-source tool that pairs agent context to Git commits" (2026) https://news.ycombinator.com/item?id=46964096 :
> But which metadata is better stored in git notes than in a commit message? JSON-LD can be integrated with JSON-LD SBOM metadata
I bet, without trying to be snarky, that most AI users don't even know you can commit with an editor instead of -m "message" and write more detail.
It's good that AI fans are finding out that commits are important; now don't reinvent the wheel, just spend a couple of minutes writing each commit message. You'll thank yourself later.
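A minimal sketch of that (repo and message contents invented): running `git commit` with no `-m` opens $EDITOR with room for a body, and non-interactively `-F -` reads the same multi-paragraph message from stdin:

```shell
# Throwaway repo with one staged change.
git init -q demo-msg
git -C demo-msg config user.email "dev@example.com"
git -C demo-msg config user.name "Dev"
echo "retry" > demo-msg/sync.txt
git -C demo-msg add sync.txt
# Subject line, blank line, then the why and the constraints.
git -C demo-msg commit -q -F - <<'EOF'
Add retry to the sync loop

Why: transient network failures were dropping events.
Constraints given to the agent: no new dependencies; keep the public API stable.
EOF
git -C demo-msg log -1 --format=%B
```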
For example: https://github.com/kzahel/PearSync/blob/main/sessions/sessio...
I think it's valuable to share that so people who are interested can see how you interact with agents. Sharing raw JSONL is probably a waste: it contains too many absolute paths and too much potential for sharing something unintentionally.
https://github.com/peteromallet/dataclaw?tab=readme-ov-file#... is one project I saw that makes an attempt to remove PII/secrets. But I certainly wouldn't share all my sessions right now, I just don't know what secrets accidentally got in them.
You can document the prompt chain, the plan, the design doc. But if nobody outside the team ever touches it before it ships, you are still flying blind on whether the thing actually works for a human who encounters it cold. The AI session log tells you what was intended. It does not tell you what was understood.
Now whenever I need to reason about what the agent did and why, the info is linked and ready on demand. If needed, the session is also saved.
It helps a lot.
A coding session has a lot of 'left turn, dead end, backtrack' noise that buries the decision that actually mattered. Committing the full session is like committing compiler output — technically complete, practically unreadable.
We've been experimenting with structured post-task reflections instead: after completing significant work, capture what you tried, what failed, what you'd do differently, and the actual decision reasoning. A few hundred tokens instead of tens of thousands. Commits with a reflection pointer rather than an embedded session.
The result is more useful than raw logs. Future engineers (or future AI sessions) can understand intent without replaying the whole conversation. It's closer to how good commit messages work — not 'here's what changed' but 'here's why'.
Dang's point about there being no single session is also real. Our biggest tasks span multiple sessions and multiple contributors. 'Capture the session' doesn't compose. 'Capture the decision' does.
Commits, branches, and the entire model work really well for human-to-human collaboration, but the model starts to be too much for agent-to-human interactions.
Sharing the entire session in a human-readable way, offering a rich experience that helps other humans understand, is way better than having git annotations.
That's why we built https://github.com/wunderlabs-dev/claudebin.com. A free and open-source Claude Code session sharing tool, which allows other humans to better understand decisions.
Those sessions can be shared in PR https://github.com/vtemian/blog.vtemian.com/pull/21, embedded https://blog.vtemian.com/post/vibe-infer/ or just shared with other humans.
https://github.com/eqtylab/y (just a prototype, built at the Codex hackathon)
The barrier to entry is just including the complete sessions. It gets a little nuanced because of their sheer size, workflows around squash merging and whatnot, and deciding where you actually want to store the sessions. For instance, git notes are intuitive; however, there are complexities around them. A less elegant approach is just to keep all sessions in separate branches.
Beyond this, you could have agents summarize, in an intuitive data structure, why certain commits exist and how the code got there. I think this would be a general utility for human and AI code reviewers alike. That is what we built. Cost/utility needs to make sense. Research needs to determine whether this is all actually better than proper comments in code.
This applies both to future AI tools and also experts, and experts instructing novices.
To some degree, the lack of documenting AI sessions is also at the core of much of the skepticism toward the value of AI coding in general: there are so many claims of successes / failures, but only a vanishingly small amount of actual detailed receipts.
Automating the documentation of some aspects of the sessions (skills + prompts, at least) is something both AI skeptics and proponents ought to be able to agree on.
EDIT: Heck, if you also automate documenting the time spent prompting and waiting for answers and/or code-gen, this would also go a long way to providing really concrete evidence for / against the various claims of productivity gains.
Conversations may also be very non-linear. You can take a path attempting something, roll back to a fork in the conversation, and take a different path using what you have learned from the model's output. I think trying to interpret someone else's branching flow would be more likely to create an inaccurate impression than understanding.
That's what architectural decision records (ADRs) are designed to capture, and it's where the workflow naturally lands. Not committing the full transcript, but having the agent synthesize a brief ADR at the close of each session: here's what was attempted, what was discarded and why, what the resulting code assumes. Future maintainers — human or AI — need exactly that, and it's compact enough that git handles it fine.
One thing I've added on top of the plan/project structure: a short `decisions.md` that logs only the non-obvious choices, like "tried X, it caused Y issue, went with Z instead". Basically the things that would make future-me or a future agent waste time rediscovering.
Do you find the plan.md files stay useful past the initial build, or do they mostly just serve as a commit artifact?
If I chat with an agent and give an initial prompt, and it gets "aspect A" (some arbitrary aspect of the expected code) wrong, I'll iterate to get "aspect A" corrected. Other aspects of the output may have exactly matched my (potentially unstated) expectation.
If I feed the initial prompt into the agent at some later date, should I expect exactly "aspect A" to be incorrect again? It seems more likely the result will be different, maybe with some other aspects being "unexpected". Maybe these new problems weren't even discussed in the initial archived chat log, since at that time they happened to be generated in a way aligned with the original engineer's expectations.
Maybe not a permanent part of the commit, but something stored on the side for a few weeks at a time. Or even permanently, it could be useful to go back and ask, "why did you do it that way?", and realize that the reason is no longer relevant and you can simplify the design without worrying you're breaking something.
However there is an unpleasant reality: the system could be incredibly brittle, with the slightest change in input or seed resulting in significantly different output. It would be nice if all small and seemingly inconsequential input perturbations resulted in a cluster of outputs that are more or less the same, but that seems very model dependent.
git is only one possible location.
I think there is very valuable information in session logs: the prompts, the usage statistics at the end of the session, which model was used, and so on. But git history and commit messages should focus on the outcome of the work, not on the process itself. This is why the issue discussion that precedes the work is also typically kept separately, in tickets: not in git itself, but close to it.
There are platforms like tulpal.com which move the whole local agent-supported process to the server and therefore have much better after-the-fact observability into what happened.
You'll find that at least half of it is noise.
If you put that in commits, you lose the ability to add "study git commits to ground yourself" in your agents.md or prompts. Because now you'll have 50%+ noise in your active session's context window.
Context window is precious. Guard it however you can.
If you do proper software development (planning, spec, task breakdown, test-case spec, implementation, unit tests, acceptance tests, ...), implementation is just a single step and the generated artifact is the source code. And that's what needs to be checked in. All the other artifacts are usually stored elsewhere.
If you do spec and planning with AI, you should also commit the outcome, and maybe also the prompt and session (like meeting notes from a spec meeting). But it's a different artifact then.
But if you skip all those steps and put your idea directly to a coding agent in the hope that the result is final, tested, production-ready software, you should absolutely commit the whole chat session (or at least have the AI create a summary of it).
Soon only implementation details will matter. Code can be generated based on those specifications again and again.
Back in the dark ages, you'd run "cc -S hello.c" to check the assembler source. With time we stopped doing that and hello.c became the originating artefact. On the same basis, the session becomes the originating artefact.
For my AI coding sessions I just point opencode at the issue. It makes a plan, implements and tests the plan (i.e., the build step), and commits it. For reference you always have the issue; revise the issue when something changes.
We always worked like this, recording the thinking and planning part is silly. You can always save your session data.
I honestly don't know if I'm doing something very wrong or if I have a very different working style than many people, but for me "just give the prompt/session" isn't a possibility because there isn't one.
I'm probably incredibly inefficient, because even when I don't use AI it is the same: a single commit is usually many different working states/ideas/branches of things I tried and explored that have been amended/squashed.
First, I tried using simple inline comments, but the agents happily (and silently) removed them, even when prompted not to.
The next attempt was to have a parallel markdown file for every code file. This worked OK, but suffered from a few issues:
1. Understanding context beyond the current session
2. Tracking related files/invocations
3. Cold start problem on existing codebases
To solve 1 and 3, I built a simple "doc agent" that does a poor man's tree traversal of the codebase, noting any unknowns/TODOs, and running until "done."
To solve 2, I explored using the AST directly, but this made the human aspect of the codebase even less pronounced (not to mention a variety of complex edge-cases), and I found the "doc agent" approach good enough for outlining related files/uses.
To improve the "doc agent" cold start flow, I also added a folder level spec/markdown file, which in retrospect seems obvious.
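A minimal sketch of the parallel-file bookkeeping described above, under stated assumptions (suffixes, naming scheme, and stub contents are all invented for illustration, not the commenter's actual tool):

```python
"""Walk a source tree and ensure every code file has a sibling spec stub
for a doc agent to maintain. Everything here is illustrative."""
from pathlib import Path

CODE_SUFFIXES = {".py", ".ts", ".rs", ".go"}  # assumed set of "code" files

def ensure_spec_stubs(root: str) -> list[Path]:
    """Create a <name>.spec.md next to each code file missing one.

    Returns the list of newly created stub paths."""
    created = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in CODE_SUFFIXES:
            continue
        spec = path.with_name(path.name + ".spec.md")
        if not spec.exists():
            spec.write_text(
                f"# {path.name}\n\nTODO: rationale, invariants, related files.\n",
                encoding="utf-8",
            )
            created.append(spec)
    return created
```

Running it twice is idempotent: the second pass finds every stub already present and creates nothing.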
The main benefit of this system is that when the agent is working, it not only has to change the source code, but it has to reckon with the explanation/rationale behind that source code. I haven't done any rigorous testing, but in my anecdotal experience, the models make fewer mistakes and cause fewer regressions overall.
I'm currently toying around with a more formal way to mark something as a human decision vs. an agent decision (i.e. this is very important vs. this was just the path of least resistance), however the current approach seems to work well enough.
If anyone is curious what this looks like, I ran the cold start on OpenAI's Codex repo[0].
[0]https://github.com/jumploops/codex/blob/file-specs/codex-rs/...
So I like the link's approach quite a bit.
The paradigm shift, which is a shift back, is to embrace the fact that you have to slow down and understand all the code the AI is writing.
That way if I need to find a prompt from some feature from the past, I just find the relevant .md file and it's right at the top.
Interestingly, my projects are way better documented (via prompts) than they ever were in the pre-agentic era.
I'm not sure about becoming part of the repo/project long term, but I think providing your prompts as part of the pull request makes the review much easier, because the reviewer can quickly understand your _intent_. If your intent has faulty assumptions, or if the reviewer disagrees with the intent, that should be addressed first. If the intent looks good, the reviewer can then determine whether you (or your coding agent) have actually implemented it.
I only log my own user messages, not AI responses, in a chat_log.md file, which is created by a user-message hook in the repo.
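A hedged sketch of such a hook body (the wiring is an assumption: Claude Code's UserPromptSubmit hook pipes a JSON payload to the script, and the user's text is assumed to arrive under a `prompt` key):

```python
"""Append each user prompt (not the AI response) to chat_log.md."""
import json
from datetime import datetime, timezone

def log_prompt(payload: str, log_path: str = "chat_log.md") -> str:
    """Parse the hook's JSON payload and append the prompt as a timestamped
    markdown bullet. Returns the appended line ("" if nothing to log)."""
    entry = json.loads(payload)
    prompt = entry.get("prompt", "").strip()
    if not prompt:
        return ""  # empty or whitespace-only message: skip
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    line = f"- **{stamp}** {prompt}\n"
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(line)
    return line

# In the hook script itself you would call:
#   log_prompt(sys.stdin.read())
```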
Right now this paradigm is so novel to us that we don't know if what is being saved is useful in any way or just hoarding garbage.
There are some who (rightly IMO) just neatly squash their commits and destroy the working branch after merging. There are others who would rather preserve everything.
However, I do think that a higher-level description of every notable feature should be documented, along with the general implementation details. I use this approach for my side projects and it works fairly well.
The biggest question is whether it will scale. I suspect not, and I also suspect it is probably better to include nothing than poor/disjointed/sporadic documentation of the sessions.
The entire prompt and process would be fine if my git history were subject to research, but really it is a tool for me or anyone else who wants to know what happened at a given time.
Original blogpost goes over motivations + workflow:
1. Using LLMs as a tool but still very much crafting the software "by hand",
2. Just prompting LLMs, not reading or understanding the source code and just running the software to verify the output.
A lot of comments here seem to be thinking of 1. But I'm pretty sure the OP is thinking of 2.
For my work as one of the developers on a team, no. The way I prompt is my asset and my advantage over teammates who always complain about AI not being able to provide correct solutions, and it secures my career.
It needs to be considered as the compiled output of vbc-c, vbc-python, vbc-ts, or vbc-js.
Keeping the source code (the prompt) is very natural, since the compiled, "vibecoded" binary output lacks the _context_ and _motivation_ that the source code (the prompt) provides.
Instead, we need better (self-explaining) translation from spec to code. And better tools that help us navigate codebases we've not written ourselves.
For example, imagine a UI where you click on a feature spec file and it highlights you all the relevant tests and code.
That context could clarify the problem, why the solution was chosen, key assumptions, potential risks, and future work.
That said preserved private session records might be of great personal benefit.
If that was important, why are we not already doing things like this? Should I have always been putting my browser history in commits?
Saving sessions is even more pointless without the full context the LLM uses that is hidden from the user. That's too noisy.
POH = Plain Old Human
Easy to achieve.
Why NOT include a link back? Why deprive yourself of information?
pros:
- intent is documented
- reference to see how it was made
- informal documentation
- find flaws in your mental model
- others can learn from your style

cons:
- others can see how it was made
- mention things you don't want others to see/know
- people can see how dumb we are

reality:
- you will judge and be judged for engineering competency not through code, but through words
The whole point of the source code it generates is to have the artifact. Maybe this is somewhat useful if you need to train people how to use AI, but at the end of the day the generated code is the thing that matters. If you keep other notes/documentation from meetings and design sessions, however you keep that is probably where this should go, too?
That is a cynical take, and not very different from advice to never write any documentation or never help your teammates. But that resemblance is superficial: in any organization you shouldn't help people who steal your time for their benefit (Sean Goedecke calls them predators: https://www.seangoedecke.com/predators/).
On the other hand, it may be beneficial to privately save CLAUDE.md and other parts of persistent context. You may gitignore them (but that will be conspicuous unless you also gitignore .gitignore) or just load them from ~/.claude
I expect an enterprise version of Claude Code that will save any human input to the org servers for later use.
you mean plagiarism?
I understand the drive for stabilizing control and consistency, but this ain't the way.
EOM
Consider:
"I got a bug report from this user:
... bunch of user PII ..."
The LLM will do the right thing with the code, the developer reviewed the code and didn't see any mention of the original user or bug report data.
Now the notes thing they forgot about goes and makes this all public.
Lots of comments mentioned this; for those who aren't aware, please check out
Git Notes: Git's coolest, most unloved feature (2022)
https://news.ycombinator.com/item?id=44345334
I think it's a perfect match for this case.
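For those who haven't used them, a minimal sketch (repo, commit, and note contents all invented) of attaching session context with git notes, which ride alongside a commit without changing its hash:

```shell
# Throwaway repo with one commit and an attached note.
git init -q demo-notes
git -C demo-notes config user.email "dev@example.com"
git -C demo-notes config user.name "Dev"
git -C demo-notes commit -q --allow-empty -m "Add red button to sidebar"
# The note is stored under refs/notes/commits, separate from the commit itself.
git -C demo-notes notes add -m "Session: tried toolbar first; moved to sidebar after review feedback."
git -C demo-notes log -1 --notes    # commit plus the attached note
git -C demo-notes notes show HEAD   # just the note
```

One caveat from the git-notes docs: notes live in their own ref, so they aren't pushed or fetched by default; teams have to agree to sync `refs/notes/*` explicitly.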
the actual problem is that AI produces MORE code not better code, and most people using it aren't reviewing what comes out. if you understood the code well enough to review it properly you wouldn't need the session log. and if you didn't understand it, the session log won't help you either because you'll just see the agent confidently explaining its own mistakes.
> have your agent write a commit message or a documentation file that is polished and intended for consumption
this is the right take. code review and commit messages matter more now than they ever did BECAUSE there's so much more code being generated. adding another artifact nobody reads doesn't fix the underlying issue which is that people skip the "understand what was built" step entirely.
One agent writes task specs. The other implements them. Handoff files bridge the gap. The spec IS the session artifact because it captures intent, scope, and constraints before any code gets written.
The plan.md approach people are describing here is basically what happens naturally when you force yourself to write intent before execution.