In general, you can't meaningfully compact tombstones without consensus, so you end up pushing at least remnants of the entire log around to each client indefinitely. You also can't easily upgrade your db: you're stuck looking after legacy data structures indefinitely, or you have to impose arbitrary cut-off points.
List CRDTs, which text CRDTs are built from, are probably unavoidable except for really simple applications. Over the last 15 years they have evolved, roughly: WOOT (paper) -> RGA (paper) -> YATA (paper) / Yjs (JS + later Rust port) -> Peritext (paper) / Automerge (Rust/JS/Swift) -> Loro (Rust/JS). Cola (Rust) is another recent one. The big libs (Yjs, Automerge, Loro) offer full doc models.
Mostly the later ones improve on space, time and intent capture (i.e. not interleaving concurrent edits).
The same few guys (Martin Kleppmann, Kevin Jahns, Joseph Gentle, probably others) pop up all over the more recent optimisations.
It would be interesting to try again now.
1) Counters
While not really useful, they demonstrate this well:
- mutations are +n and -n
- their order does not matter
- converging the state is a matter of applying the operations of remote peers locally
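A minimal sketch of such a counter in Python, to make the three points above concrete. The class and method names are made up for illustration, and tagging each operation with a (node, sequence) id so duplicates can be dropped is my assumption, not something the points above require:

```python
from dataclasses import dataclass, field


@dataclass
class OpCounter:
    """Operation-based counter (illustrative names, not a real library)."""
    node_id: str
    seq: int = 0
    ops: dict = field(default_factory=dict)  # (node_id, seq) -> +n / -n

    def add(self, n):
        """Record a local +n or -n and return the op to send to peers."""
        self.seq += 1
        op = ((self.node_id, self.seq), n)
        self.apply(op)
        return op

    def apply(self, op):
        """Apply a local or remote op; a repeated op id is ignored."""
        op_id, n = op
        self.ops.setdefault(op_id, n)

    @property
    def value(self):
        # Addition is commutative, so delivery order does not matter.
        return sum(self.ops.values())


a, b = OpCounter("a"), OpCounter("b")
op1, op2, op3 = a.add(5), b.add(-2), a.add(1)
b.apply(op3); b.apply(op1)   # remote ops arrive out of order
a.apply(op2); a.apply(op2)   # duplicate delivery is harmless
assert a.value == b.value
```

Keeping every op forever is only for clarity here; a real implementation would compact them.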
2) Append-only data structures
Useful for accounting, or replication of time-series/logs with no master/slave relationship between nodes (where writes would be accepted only on a "master" node).
- the only mutation is "append"
- converging the state is applying the peers' operations, then sorting by timestamp
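Roughly, in Python (a sketch only; `Entry` and `AppendLog` are hypothetical names). One assumption I'm adding: timestamps can collide across nodes, so ties are broken by node id and a per-node sequence number, otherwise two peers could sort the same entries differently:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class Entry:
    timestamp: float
    node_id: str   # tie-breaker (assumption): keeps the order total
    seq: int       # per-node sequence number, also part of the identity
    payload: str = field(compare=False)


class AppendLog:
    """Append-only log replicated by exchanging entries (illustrative)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.seq = 0
        self.entries = set()

    def append(self, timestamp, payload):
        """The only mutation: add an entry and return it for broadcast."""
        self.seq += 1
        entry = Entry(timestamp, self.node_id, self.seq, payload)
        self.entries.add(entry)
        return entry

    def apply(self, entry):
        """Apply a peer's entry; the set makes re-delivery a no-op."""
        self.entries.add(entry)

    def view(self):
        """Converged view: every known entry, sorted by timestamp."""
        return [e.payload for e in sorted(self.entries)]


a, b = AppendLog("a"), AppendLog("b")
from_a = [a.append(1.0, "open"), a.append(3.0, "close")]
from_b = [b.append(1.0, "deposit"), b.append(2.0, "withdraw")]
for e in from_b:
    a.apply(e)
for e in reversed(from_a):
    b.apply(e)
assert a.view() == b.view()
```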
EDIT: add more
3) Multi Value registers (and maps)
Similar to Last-Write-Wins registers (and maps), but all concurrent writes are kept, so the value becomes a set of concurrent values.
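A rough Python sketch of a multi-value register. Using version vectors to decide which writes are concurrent is my assumption (it is a common way to do it, but not spelled out above), and all names are illustrative:

```python
def dominates(a, b):
    """True if version vector a has seen everything in b, plus more."""
    keys = a.keys() | b.keys()
    return (all(a.get(k, 0) >= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) > b.get(k, 0) for k in keys))


class MVRegister:
    def __init__(self, node_id):
        self.node_id = node_id
        self.versions = []  # list of (value, version_vector)

    def write(self, value):
        """A write supersedes everything this replica has seen so far."""
        vv = {}
        for _, seen in self.versions:
            for k, n in seen.items():
                vv[k] = max(vv.get(k, 0), n)
        vv[self.node_id] = vv.get(self.node_id, 0) + 1
        self.versions = [(value, vv)]

    def merge(self, other):
        """Keep every version that no other known version dominates."""
        candidates = self.versions + [
            v for v in other.versions if v not in self.versions
        ]
        self.versions = [
            (val, vv) for val, vv in candidates
            if not any(dominates(o, vv) for _, o in candidates)
        ]

    def values(self):
        return {val for val, _ in self.versions}


a, b = MVRegister("a"), MVRegister("b")
a.write("x")
b.write("y")            # concurrent with a's write
a.merge(b); b.merge(a)  # both replicas now hold both values
a.write("z")            # a has seen both, so this replaces them
b.merge(a)
assert a.values() == b.values()
```

Resolving the set back to a single value (pick one, merge them, ask the user) is left to the application.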
4) Many more...
Each is useful for specific use cases. And since not everybody is making collaborative tools, but many are working on distributed systems, I think it's worth mentioning.
On another note, the article talks about state-based CRDTs, where you need to share the whole state. The examples I gave above are operation-based CRDTs, where you share the operations done on the state and recompute it when needed.
For example, in the Elixir ecosystem, we have Horde ( https://hexdocs.pm/horde/readme.html ), which allows distributing a worker pool over multiple nodes. It's backed by DeltaCrdt ( https://hexdocs.pm/delta_crdt/DeltaCrdt.html ).
Delta-CRDTs are an optimization over state-based CRDTs where you share state diffs instead of the whole state (described in this paper: https://arxiv.org/pdf/1603.01529 ).
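To illustrate the idea (this is not how DeltaCrdt or Horde are implemented, just a toy grow-only counter in Python with made-up names): the delta is a small fragment of the state, and it is merged with the same join as a full state.

```python
class GCounter:
    """State-based grow-only counter with delta mutators (toy sketch)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node_id -> number of increments seen from it

    def increment(self, n=1):
        """Bump the local slot (n must be positive) and return the delta:
        only the entry that changed, not the whole state."""
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n
        return {self.node_id: self.counts[self.node_id]}

    def merge(self, state_or_delta):
        """Join is a pointwise max, identical for full states and deltas."""
        for k, n in state_or_delta.items():
            self.counts[k] = max(self.counts.get(k, 0), n)

    @property
    def value(self):
        return sum(self.counts.values())


a, b, c = GCounter("a"), GCounter("b"), GCounter("c")
d1 = a.increment(3)
d2 = b.increment(2)
b.merge(d1)          # ship only the diff
a.merge(d2)
a.merge(d2)          # merging twice changes nothing
c.merge(a.counts)    # a late joiner can still take the full state
assert a.value == b.value == c.value
```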
Also a really well written piece.
An interactive intro to CRDTs - https://news.ycombinator.com/item?id=37764581 - Oct 2023 (130 comments)
Edit: I had an excerpt here which I completely misread. Sorry.