FRESH

Hacker News

Home

The Future of Version Control

260 points by c17r

by ulrikrasmussen

3 subcomments

The thing about how merges are presented seems orthogonal to how to represent history. I also hate the default in git, but that is why I just use p4merge as a merge tool and get a proper 4-pane merge tool (left, right, common base, merged result) which shows everything needed to figure out why there is a conflict and how to resolve it. I don't understand why you need to switch out the VCS to fix that issue.

by radarsat1

8 subcomments

Is it a good thing to have merges that never fail? Often a merge failure indicates a semantic conflict, not just "two changes in the same place". You want to be aware of and forced to manually deal with such cases.
I assume the proposed system addresses it somehow but I don't see it in my quick read of this.

by barrkel

3 subcomments

I don't really get the upside of focus on CRDTs.
The semantic problem with conflicts exists either way. You get a consistent outcome and a slightly better description of the conflict, but in a way that possibly interleaves changes, which I don't think is an improvement at all.
I am completely rebase-pilled. I believe merge commits should be avoided at all costs, every commit should be a fast forward commit, and a unit of work that can be rolled back in isolation. And also all commits should be small. Gitflow is an anti-pattern and should be avoided. Long-running branches are for patch releases, not for feature development.
I don't think this is the future of VCS.
Jujutsu (and Gerrit) solves a real git problem - multiple revisions of a change. That's one that creates pain in git when you have a chain of commits you need to rebase based on feedback.

by bos

2 subcomments

This is sort of a revival and elaboration of some of Bram’s ideas from Codeville, an earlier effort that dates back to the early 2000s Cambrian explosion of DVCS.
Codeville also used a weave for storage and merge, a concept that originated with SCCS (and thence into Teamware and BitKeeper).
Codeville predates the introduction of CRDTs by almost a decade, and at least on the face of it the two concepts seem like a natural fit.
It was always kind of difficult to argue that weaves produced unambiguously better merge results (and more limited conflicts) than the more heuristically driven approaches of git, Mercurial, et al, because the edit histories required to produce test cases were difficult (at least for me) to reason about.
I like that Bram hasn’t let go of the problem, and is still trying out new ideas in the space.

by simonw

1 subcomments

This thing is really short. https://github.com/bramcohen/manyana/blob/main/manyana.py is 473 lines of dependency-free Python (that file only imports difflib, itertools and inspect) and of that ~240 lines are implementation and the rest are tests.

by ZoomZoomZoom

3 subcomments

The key insight in the third sentence?
> ... CRDTs for version control, which is long overdue but hasn’t happened yet
Pijul happened and it has hundreds - perhaps thousands - of hours of real expert developer's toil put in it.
Not that Bram is not one of those, but the post reads like you all know what.

by aggregator-ios

3 subcomments

What CRDT's solve is conflicts at the system level. Not at the semantic level. 2 or more engineers setting a var to a different value cannot be handled by a CRDT.
Engineer A intended value = 1
Engineer B intended value = 2
CRDT picks 2
The outcome could be semantically wrong. It doesn't reflect the intent.
I think the primary issue with git and every other version control is the terrible names for everything. pull, push, merge, fast forward, stash, squash, rebase, theirs, ours, origin, upstream and that's just a subset. And the GUI's. They're all very confusing even to engineers who have been doing this for a decade. On top of this, conflict resolution is confusing because you don't have any prior warnings.
It would be incredibly useful if before you were about to edit a file, the version control system would warn you that someone else has made changes to it already or are actively working on it. In large teams, this sort of automation would reduce conflicts, as long as humans agree to not touch the same file. This would also reduce the amount of quality regressions that result from bad conflict resolutions.
Shameless self plug: I am trying to solve both issues with a simpler UI around git that automates some of this and it's free. https://www.satishmaha.com/BetterGit

by injidup

1 subcomments

I'm confused about what this solves. They give the example of someone editing a function and someone deleting the same function and claim that the merge never fails and then go on to demonstrate that indeed rightly the merges still fails. There are still merge markers in the sources. What is the improvement exactly?

by gnarlouse

2 subcomments

I think something like this needs to be born out of analysis of gradations of scales of teams using version control systems.
- What kind of problems do 1 person, 10 person, 100 person, 1k (etc) teams really run into with managing merge conflicts?
- What do teams of 1, 10, 100, 1k, etc care the most about?
- How does the modern "agent explosion" potentially affect this?
For example, my experience working in the 1-100 regime tells me that, for the most part, the kind of merge conflict being presented here is resolved by assigning subtrees of code to specific teams. For the large part, merge conflicts don't happen, because teams coordinate (in sprints) to make orthogonal changes, and long-running stale branches are discouraged.
However, if we start to mix in agents, a 100 person team could quickly jump into a 1000 person team, esp if each person is using subagents making micro commits.
It's an interesting idea definitely, but without real-world data, it kind of feels like this is just delivering a solution without a clear problem to assign it to. Like, yes merge-conflicts are a bummer, but they happen infrequently enough that it doesn't break your heart.

by ballsweat

0 subcomment

Everyone should vibe code a VCS from scratch in their fave language.
It’s an awesome weekend project, you can have fun visualizing commits in different ways (I’m experimenting with shaders), and importantly:
This is the way forward. So much software is a wrapper around S3 etc. now is your chance to make your own toolset.
I imagine this appeals more to DIYer types (I use Pulsar IDE lol)

by gavinhoward

2 subcomments

Bram Cohen is awesome, but this feels a little bare. I've put much more thought into version control ([1]), including the use of CRDTs (search for "# History Model" and read through the "Implementing CRDTs" section).
[1]: https://gavinhoward.com/uploads/designs/yore.md

by mikey-k

2 subcomments

Interesting idea. While conflicts can be improved, I personally don't see it as a critical challenge with VCS.
What I do think is the critical challenge (particularly with Git) is scalability.
Size of repository & rate of change of repositories are starting to push limits of git, and I think this needs revisited across the server, client & wire protocols.
What exactly, I don't know. :). But I do know that in my current role (mid-size well-known tech company) is hitting these limits today.

by merlindru

0 subcomment

I recently found a project called sem[1] that does git diffs but is aware of the language itself, giving feedback like "function validateToken added", "variable xyzzy removed", ...
i think that's where version control is going. especially useful with agents and CI
[1] https://ataraxy-labs.github.io/sem/

by logicprog

3 subcomments

This seems like an excellent idea. I'm sure a lot of us have been idly wondering why CRDTs aren't used for VCS for some time, so it's really cool to see someone take a stab at it! We really do need an improvement over git; the question is how to overcome network effects.

by jFriedensreich

2 subcomments

starts with “based on the fundamentally sound approach of using CRDTs for version control”. How on earth is crdt a sound base for a version control system? This makes no sense fundamentally, you need to reach a consistent state that is what you intended not what some crdt decided and jj shows you can do that also without blocking on merges but with first level conflicts that need to be resolved. ai and language aware merge drivers are helping so much here i really wonder if the world these “replace version control” projects were made for still exists at all.

by WCSTombs

0 subcomment

For the conflicts, note that in Git you can do
```
    git config --global merge.conflictstyle diff3
```
to get something like what is shown in the article.

by lemonwaterlime

1 subcomments

See vim-mergetool[1]. I use it to manage merge conflicts and it's quite intuitive. I've resolved conflicts that other people didn't even want to touch.
[1]: https://github.com/samoshkin/vim-mergetool

by BlueHotDog2

0 subcomment

This is cool and i keep thinking about CRDTs as a baseline for version control, but CRDTs has some major issues, mainly the fact that most of them are strict and "magic" in the way they actually converge(like the joke: CRDTs always converge, but to what). i didn't read if he's using some special CRDT that might solve for that, but i think that for agentic work especially this is very interesting

by bob1029

1 subcomments

I think there are still strong advantages to the centralized locking style of collaboration. The challenge is that it seems to work best in a setting where everyone is in the same physical location while they are working. You can break a lock in 30 seconds with your voice. Locking across time zones and date lines is a nonstarter by comparison.

by nkmnz

1 subcomments

I don't quite understand how CRDTs should help with merges. The difficult thing about merges is not that two changes touch the same part of the code; the difficult thing is that two changes can touch different parts of the code and still break each other - right?

by braidedpubes

0 subcomment

Do I have it right that it’s basically timestamp based, except not based on our clocks but one it manages itself?
So as long as all updates have been sent to the server from all clients, it will know what “time” each character changed and be able to merge automatically.
Is that it basically?

by ballsweat

0 subcomment

Cool timing.
I recently built Artifact: https://www.paganartifact.com/benny/artifact
Mirror: https://github.com/bennyschmidt/artifact
In case anyone was curious what a full rewrite of git would look like in Node!
The main difference is that on the server I only store deltas, not files, and the repo is “built”.
But yeah full alternative to git with familiar commands, and a hub to go with it.

by a-dub

0 subcomment

doesn't the side by side view in github diff solve this?
conflict free merging sounds cool, but doesn't that just mean that that a human review step is replaced by "changes become intervals rather than collections of lines" and "last set of intervals always wins"? seems like it makes sense when the conflicts are resolved instantaneously during live editing but does it still make sense with one shot code merges over long intervals of time? today's systems are "get the patch right" and then "get the merge right"... can automatic intervalization be trusted?
edit: actually really interesting if you think about it. crdts have been proven with character at a time edits and use of the mouse select tool.... these are inherently intervalized (select) or easy (character at a time). how does it work for larger patches can have loads of small edits?

0 subcomment

by echrisinger

0 subcomment

Has anyone considered a VCS that integrates more vertically with the source code through ASTs?
IE if I change something in my data model, that change & context could be surfaced with agentic tooling.

by shitfilleddonut

0 subcomment

It seems more like the past of version control

by mentalgear

1 subcomments

> [CRDT] This means merges don’t need to find a common ancestor or traverse the DAG. Two states go in, one state comes out, and it’s always correct.
Well, isn't that what the CRDT does in its own data structure ?
Also keep in mind that syntactic correctness doesn't mean functional correctness.

by lifeformed

6 subcomments

My issue with git is handling non-text files, which is a common issue with game development. git-lfs is okay but it has some tricky quirks, and you end up with lots of bloat, and you can't merge. I don't really have an answer to how to improve it, but it would be nice if there was some innovation in that area too.

by phtrivier

0 subcomment

A suggestion : is there any info to provide in diffs that is faster to parse than "left" and "right" ? Can the system have enough data to print "bob@foo.bar changed this" ?

by lasgawe

0 subcomment

This is a really interesting and well thought out idea, especially the way it turns conflicts into something informative instead of blocking. The improved conflict display alone makes it much easier to understand what actually happened. I think using CRDTs to guarantee merges always succeed while still keeping useful history feels like a strong direction for version control. Looks like a solid concept!

by alunchbox

1 subcomments

Jujutsu honestly is the future IMO, it already does what you have outlined but solved in a different way with merges, it'll let you merge but outline you have conflicts that need to be resolved for instance.
It's been amazing watching it grow over the last few years.

by socalgal2

0 subcomment

> [CRDT] This means merges don’t need to find a common ancestor or traverse the DAG. Two states go in, one state comes out, and it’s always correct.
Funny, there was just a post a couple of days ago how this is false.
https://news.ycombinator.com/item?id=47359712

by sibeliuss

0 subcomment

Why must everyone preprocess their blog posts with ChatGPT? It is such a disservice to ones ideas.

by skybrian

2 subcomments

It sounds interesting but the main selling point doesn’t really reasonate:
If you haven’t resolved conflicts then it probably doesn’t compile and of course tests won’t pass, so I don’t see any point in publishing that change? Maybe the commit is useful as a temporary state locally, but that seems of limited use?
Nowadays I’d ask a coding agent to figure out how to rebase a local branch to the latest published version before sending a pull request.

by EGreg

0 subcomment

I remember I met Bram Cohen (of Bittorent fame!) around 15 years ago. Around that time is when I had started building web-based distributed collaborative systems, starting with Qbix.com and then spun off a company to build blockchain-based smart contracts through Intercoin.org etc.
Anyway, I wanted to suggest a radical idea based on my experience:
Merges are the wrong primitive.
What organizations (whethr centralized or distributed projects) might actually need is:
1) Graph Database - of Streams and Relations
2) Governance per Stream - eg ACLs
A code base should be automatically turned into a graph database (functions calling other functions, accessing configs etc) so we know exactly what affects what.
The concept of what is “too near” each other mentioned in the article is not necessarily what leads to conflicts. Conflicts actually happen due to conflicting graph topology and propagating changes.
People should be able to clone some stream (with permission) and each stream (node in the graph) can be versioned.
Forking should happen into workspaces. Workspaces can be GOVERNED. Publishing some version of a stream just means relating it to your stream. Some people might publish one version, others another.
Rebasing is a first-class primitive, rather than a form of merging. A merge is an extremely privileged operation from a governance point of view, where some actor can just “push” (or “merge”) thousands of commits. The more commits, the more chance of conflicts.
The same problem occurs with CRDTs. I like CRDTs, but reconciling a big netsplit will result in merging strategies that create lots of unintended semantic side effects.
Instead, what if each individual stream was guarded by policies, there was a rate limit of changes, and people / AIs rejected most proposals. But occasionally they allow it with M of N sign offs.
Think of chatgpt chats that are used to modify evolving artifacts. People and bots working together. The artifacts are streams. And yes, this can even be done for codebases. It isnt about how “near” things are in a file. Rather it is about whether there is a conflict on a graph. When I modify a specific function or variable, the system knows all of its callers downstream. This is true for many other things besides coding too. We can also have AI workflows running 24/7 to try out experiments as a swarm in sandboxes, generate tests and commit the results that pass. But ultimately, each organization determines whether they want to rebase their stream relations to the next version of something or not.
That is what I’m building now with https://safebots.ai
PS: if anyone is interested in this kind of stuff, feel free to schedule a calendly meeting w me on that site. I just got started recently, but I’m dogfooding my own setup and using AI swarms which accelerates the work tremendously.

by jauntywundrkind

0 subcomment

In case the name doesn't jump out at you, this is Bram Cohen, inventory of Bittorrent. And Chia proof-of-storage (probably better descriptions available) cryptocurrency. https://en.wikipedia.org/wiki/Bram_Cohen
It's not the same as capturing it, but I would also note that there are a wide wide variety of ways to get 3-way merges / 3 way diffs from git too. One semi-recent submission (2022 discussing a 2017) discussed diff3 and has some excellent comments (https://news.ycombinator.com/item?id=31075608), including a fantastic incredibly wide ranging round up of merge tools (https://www.eseth.org/2020/mergetools.html).
However/alas git 2.35's (2022) fabulous zdiff3 doesn't seems to have any big discussions. Other links welcome but perhaps https://neg4n.dev/blog/understanding-zealous-diff3-style-git...? It works excellently for me; enthusiastically recommended!

by steveharing1

0 subcomment

Git is my first priority until or unless i see anything more robust than this one.

by Aperocky

1 subcomments

Outside of the merit of the idea itself, I thought I was going to look at a repository at least as complete as Linus when he released git after 3 weeks, especially with the tooling we had today.
Slightly disappointed to see that it is a 470 line python file being touted as "future of version control". Plenty of things are good enough in 470 lines of python, even a merge conflict resolver on top of git - but it looks like it didn't want anything to do with git.
Prototyping is almost free these days, so not sure why we only have the barest of POC here.

by codemog

1 subcomments

Nobody should have these types of problems in the age of AI agents. This kind of clean up and grunt work is perfect for AI agents. We don’t need new abstractions.

by catlifeonmars

0 subcomment

Can we stop using line-oriented diffs in favor of AST-oriented diffs?
Is it just lack of tooling, or is there something fundamentally better about line-oriented diffs that I’m missing? For the purpose of this question I’m considering line-oriented as a special case of AST-oriented where the AST is a list of lines (anticipating the response of how not all changes are syntactically meaningful or correct).

by monster_truck

0 subcomment

Not this again

by MattCruikshank

0 subcomment

For anyone who thinks diff / merge should be better - try Beyond Compare from Scooter Software.

by lowbloodsugar

0 subcomment

Araxis merge. Four views. Theirs, ours, base and “what you did so far in this damned merge hell”.

by newsoftheday

0 subcomment

OK, I'll stick with git.

by xthe

0 subcomment

[dead]

by hahhhha500012

0 subcomment

[dead]

by hahhhha500012

0 subcomment

[dead]

by hahaddmmm12x

0 subcomment

[dead]

0 subcomment

by hahaddmmm12x

0 subcomment

[dead]

by hahaddmmm12x

0 subcomment

[dead]