FRESH

Hacker News

The rsync algorithm (1996) [pdf]

217 points by vortex_ape

by doodlesdev

2 subcomments

Well-written, succinct.
This small document shows what computer science looked like to me when I was just getting started: a way to make computers more efficient and smarter, to solve real problems. I wish more people who claim to be "computer scientists" or "engineers" would actually work on real problems like this (efficient file sync) instead of having to spend time learning how to use the new React API or patching the f-up NextJS CVE that's affecting a multitude of services.

by ssl-3

1 subcomments

by teleforce

6 subcomments

Fun facts, the author of rsync, Andrew Tridgell, is also the one who reverse-engineered Microsoft SMB that laid the foundation for Samba [1].
How he did manage to avoid lawsuits from Microsoft is beyond me.
[1] Server Message Block:
https://en.wikipedia.org/wiki/Server_Message_Block

by imiric

1 subcomments

Rsync is one of my favorite programs. I use it daily. The CLI is a bit quirky (e.g. trailing slashes), but once you get used to it, it makes sense. And I really always use the same flags: `-avmLP`, with `-n` for dry runs.
One alternative I'd like to try is Google's abandoned CDC[1], which claims to be up to 30x faster than rsync in certain scenarios. Does anyone know if there is a maintained fork with full Linux support?
[1]: https://github.com/google/cdc-file-transfer

by craftkiller

4 subcomments

I've been using this extensively recently. I was setting up remote virtual machines that boot a live ISO containing all the software for the machine. Sometimes I need to change a small config file, which would lead to generating a new 1.7GiB ISO, but 99.9% of that ISO is identical to the previous one. So I used rsync. Blew my mind when after a day of working on these images, uploading 1.7GiB ISO after 1.7GiB ISO, wireguard showed that I had only sent 600MiBs.
Fun surprise, rsync uses file size and modified time first to see if the files are identical. I build these ISOs with nix. Nix sets the time to Jan 1st 1970 for reproducible builds, and I suspect the ISOs are padded out to the next sector. So rsync was not noticing the new ISO images when I made small changes to config files until I added the --checksum flag.

by snvzz

1 subcomments

Besides Tridgell's venerable rsync, there exists a permissively licensed implementation[0] by openbsd.
0. https://www.openrsync.org/

by linsomniac

1 subcomments

If you want to do similar for block devices: https://github.com/rolffokkens/bdsync
I use it to back up a few virtual machines that, in the event of a site loss, would be difficult to rebuild but also critical to getting our developers back to work. I take an LVM snapshot of the VM, then use bdsync to replicate it to our backup server, and from there I replicate it off to backblaze, then destroy the snapshot.

by mickael-kerjean

0 subcomment

In section 6: "tar files ... of the Linux kernel sources ... version ... 1.99.10 ... are approximately 24MB in size ... Out of the 2441 files in the 2.0.0 release 291 files had changed"
It never crossed my mind Linux at some point only had 2441 files and you could actually parse the code that went through a new version, that time has sailed

by alex1138

0 subcomment

1996 is not that long ago as Unix goes but it's fun to know as I browse with my Debian computer that I'm using something (derived from Unix but BSD also isn't original Unix either) that has a long tradition

by cyanydeez

0 subcomment

by bix6

0 subcomment