The problem: they have software written in Rust, and they need to use the libpg_query library, which is written in C. Because they can't use the C library directly, they had to use a Rust-to-C binding library, which uses Protobuf for portability reasons. The problem is that it's slow.
So what they did is write their own non-portable but much more optimized Rust-to-C bindings, with the help of an LLM.
But had they written their software in C, they wouldn't have needed to do any conversion at all. That means they could have titled the article "How we lowered the performance penalty of using Rust".
I don't know much about Rust or libpg_query, but they probably could have gone even faster by getting rid of the conversion entirely. It would most likely have involved major adaptations and some unsafe Rust, though. Writing a converter has many advantages: portability, convenience, security, etc., but it has a cost, and ultimately I think it is a big reason why computers are so fast and apps are so slow. Our machines keep copying, converting, serializing and deserializing things.
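For a sense of what "getting rid of the conversion" means in practice, here's a minimal sketch of calling libpg_query directly over FFI from unsafe Rust. The struct layouts mirror pg_query.h as I understand it (real code would generate them with bindgen), and note that pg_query_parse still hands back a JSON string, so truly skipping serialization would need the raw-access functions mentioned downthread:

    use std::ffi::{CStr, CString};
    use std::os::raw::{c_char, c_int};

    // Hand-written declarations mirroring pg_query.h (a sketch; check
    // the header for your libpg_query version before relying on this).
    #[repr(C)]
    struct PgQueryError {
        message: *mut c_char,
        funcname: *mut c_char,
        filename: *mut c_char,
        lineno: c_int,
        cursorpos: c_int,
        context: *mut c_char,
    }

    #[repr(C)]
    struct PgQueryParseResult {
        parse_tree: *mut c_char,   // JSON-encoded AST
        stderr_buffer: *mut c_char,
        error: *mut PgQueryError,
    }

    #[link(name = "pg_query")]
    extern "C" {
        fn pg_query_parse(input: *const c_char) -> PgQueryParseResult;
        fn pg_query_free_parse_result(result: PgQueryParseResult);
    }

    fn parse_to_json(query: &str) -> Option<String> {
        let c_query = CString::new(query).ok()?;
        unsafe {
            let result = pg_query_parse(c_query.as_ptr());
            let tree = if result.error.is_null() && !result.parse_tree.is_null() {
                // Copy the C string out before freeing the result.
                Some(CStr::from_ptr(result.parse_tree).to_string_lossy().into_owned())
            } else {
                None
            };
            // The result owns C-allocated strings; hand it back to the library.
            pg_query_free_parse_result(result);
            tree
        }
    }

No Protobuf round trip, but you now own the memory-safety obligations the binding layer was handling for you.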
Note: I have nothing against what they did, quite the opposite. I always appreciate those who care about performance, and what they did is reasonable and effective. Good job!
They changed the persistence system completely. It looks like they went from a generic solution to something specific to what they're carrying across the wire.
They could have done it in Lua and it would have been 3x faster.
The initial motivation for developing pg_query was for pganalyze, where we use it to parse queries extracted from Postgres, to find the referenced tables, and these days also rewrite and format queries. That use case runs in the background, and as such is much less performance critical.
pg_query actually initially used a JSON format for the parse output (AST), but we changed that to Protobuf a few major releases ago, because Protobuf makes it easy to have typed bindings in the different languages we support (Ruby, Go, Rust, Python, etc). Alternatives (e.g. using FFI directly) make sense for Rust, but would require a lot of maintained glue code for other languages.
All that said, I'm supportive of Lev's effort here, and we'll add some additional functions (see [0]) in the libpg_query library to make using it directly (i.e. via FFI) easier. But I don't see Protobuf going away, because in non-performance critical cases, it is more ergonomic across the different bindings.
This is a common pattern: "We switched to X and got 5x faster" often really means "We fixed our terrible implementation and happened to rewrite it in X."
Key lessons from this:
1. Serialization/deserialization is often a hidden bottleneck, especially in microservices where you're doing it constantly.
2. The default implementation of any library is rarely optimal for your specific use case.
3. Benchmarking before optimization is critical - they identified the actual bottleneck instead of guessing.
For anyone dealing with Protobuf performance issues, before rewriting:
- Use arena allocation to reduce memory allocations
- Pool your message objects and reuse buffers (see the sketch below)
- Consider if you actually need all the fields you're serializing
- Profile the actual hot path
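On the pooling point: in Rust, a cheap version of this is reusing one encode buffer across messages. A sketch assuming prost as the Protobuf library; ParseRequest is a made-up message type standing in for whatever you actually serialize:

    use bytes::BytesMut;
    use prost::Message;

    // Hypothetical message type; prost's derive generates the
    // encode/decode plumbing (plus Default and Debug impls).
    #[derive(Clone, PartialEq, prost::Message)]
    struct ParseRequest {
        #[prost(string, tag = "1")]
        query: String,
    }

    // Reuse one buffer across encodes: clear() drops the contents but
    // keeps the allocated capacity, so steady-state encoding stops
    // allocating once the buffer has grown to fit the largest message.
    fn encode_into(msg: &ParseRequest, buf: &mut BytesMut) {
        buf.clear();
        buf.reserve(msg.encoded_len());
        msg.encode(buf).expect("capacity reserved above");
    }

    fn main() {
        let mut buf = BytesMut::with_capacity(1024);
        for q in ["SELECT 1", "SELECT * FROM users WHERE id = 1"] {
            let msg = ParseRequest { query: q.to_string() };
            encode_into(&msg, &mut buf);
            println!("{} -> {} bytes", q, buf.len());
        }
    }

It won't turn serialization into a no-op, but it removes the per-message allocation churn that profiles often flag first.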
Rust FFI has overhead too. The real win here was probably rethinking their data flow and doing the optimization work, not just the language choice.
Well, yeah. If there's a feature you don't need, you'll see value by coding around it. Some features turn out not to be needed by anyone, maybe this is one. But some people need serialization, and that's what protobufs are for[1]. Those people are very (!) poorly served by headlines telling them to use Rust (!!) instead of serialization.
[1] Though as always the standard litany applies: you actually want JSON, and not protobufs or ASN.1 or anything else. If you like some other technology better, you're wrong and you actually want JSON. If you think you need something faster, you probably don't, and JSON would suit your needs better. If you really, 100%, know for sure that you need it faster than JSON, then you're probably isomorphic to the folks in the linked article, shouldn't have been serializing at all, and should get to work open coding your own hooks on the raw backend.
They were using a transport serialization/deserialization protocol for IPC. It's obvious why there was overhead: it was an architectural decision about how to manage the communication.
I guess the old adage holds true here: if something gets 20% faster, something was improved; if it gets 10x faster, it was just built wrong.
At the scale we were using PGDog, enabling the previous form of the query parser was extremely expensive (we would have had to 16x our pgdog fleet size).
But I would just increase the stack size limit if it ever becomes a problem. As far as I know, the only reason it's so small by default is address space exhaustion, which only affects 32-bit systems.
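In Rust you don't even need to touch ulimit for this: you can run the deep-recursion work (e.g. walking a deeply nested AST) on a thread with an explicit stack size. A minimal sketch using the standard library:

    use std::thread;

    fn main() {
        // Spawn the recursive work on a thread with an explicit 64 MiB
        // stack. Rust's spawned threads default to 2 MiB (overridable
        // via RUST_MIN_STACK); the main thread's limit comes from the
        // OS (ulimit -s, typically 8 MiB on Linux).
        let worker = thread::Builder::new()
            .stack_size(64 * 1024 * 1024)
            .spawn(|| {
                // ... deeply recursive parsing / AST walking goes here ...
            })
            .expect("failed to spawn worker thread");
        worker.join().expect("worker panicked");
    }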
I wrote memory-mapping-oriented protobuf software... in assembly. Then what? Am I allowed to say I'm going 1000 times faster than Rust now???
So it's C actually, not Rust. But hey, we used Rust somewhere, so let's post it on HN and farm internet points.