> Instead, we use ANTLR, a state-of-the-art, open source parser generator.
I don't agree with this (pre-AI-coding) take. Hand-rolled parsers are much easier to write well and maintain than people think. They also tend to be much faster and produce much better errors than parser generators. I guess if the language you're trying to parse is, say, C++, then you're going to have a miserable time (probably no matter what). But an SQL parser is very doable. (I say this as the author and maintainer of an in-house SQL dialect thingy at work.)
What makes building and maintaining a hand-written parser such a tractable task is:
- The code size can be large, but you can start with a core of a few well-chosen abstractions and then add lots of parsing code for the various language constructs. It's all fairly orthogonal, so it doesn't add compounding complexity as you go.
- It's just about the most testable kind of code there is. You can cover all the various corner cases with tests and really lock in the behavior so that you can very confidently make changes. One approach I like is to make zillions of tiny test files in the target language accompanied by some golden representation of the AST.
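To make the golden-test idea concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in: the toy grammar (`SELECT col, ... FROM table`), the `parse` function, the dict-shaped AST, and the inline `GOLDEN` / `ERRORS` lists, which in a real project would be tiny files on disk next to their expected ASTs. It is not the dialect or parser discussed above.

```python
import re

# Toy grammar (an assumption for illustration): SELECT <col> {, <col>} FROM <table>
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*|,|\*")
KEYWORDS = {"SELECT", "FROM"}


class ParseError(Exception):
    pass


def tokenize(src):
    tokens, pos = [], 0
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(src, pos)
        if not m:
            raise ParseError(f"unexpected character {src[pos]!r} at offset {pos}")
        tokens.append(m.group(0))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect_keyword(self, kw):
        tok = self.peek()
        if tok is None or tok.upper() != kw:
            raise ParseError(f"expected {kw}, found {tok!r}")
        self.pos += 1

    def identifier(self):
        tok = self.peek()
        is_name = tok is not None and (tok[0].isalpha() or tok[0] == "_")
        if not is_name or tok.upper() in KEYWORDS:
            raise ParseError(f"expected identifier, found {tok!r}")
        self.pos += 1
        return tok

    def column(self):
        if self.peek() == "*":
            self.pos += 1
            return "*"
        return self.identifier()

    def select(self):
        self.expect_keyword("SELECT")
        columns = [self.column()]
        while self.peek() == ",":
            self.pos += 1
            columns.append(self.column())
        self.expect_keyword("FROM")
        table = self.identifier()
        if self.peek() is not None:
            raise ParseError(f"unexpected trailing token {self.peek()!r}")
        return {"type": "select", "columns": columns, "from": table}


def parse(src):
    return Parser(tokenize(src)).select()


# Golden cases: source text paired with the exact AST (or error) we expect.
GOLDEN = [
    ("SELECT a FROM t", {"type": "select", "columns": ["a"], "from": "t"}),
    ("select a, b from t", {"type": "select", "columns": ["a", "b"], "from": "t"}),
    ("SELECT * FROM t", {"type": "select", "columns": ["*"], "from": "t"}),
]
ERRORS = [
    ("SELECT FROM t", "expected identifier, found 'FROM'"),
    ("SELECT a t", "expected FROM, found 't'"),
]


def run_golden_tests():
    for src, expected in GOLDEN:
        assert parse(src) == expected, src
    for src, message in ERRORS:
        try:
            parse(src)
        except ParseError as exc:
            assert str(exc) == message, (src, str(exc))
        else:
            raise AssertionError(f"expected a ParseError for {src!r}")


run_golden_tests()
```

Adding a new construct here means one more method on `Parser` plus a few more golden entries, which is roughly the "orthogonal" growth described in the first bullet. Locking in the error messages as well as the ASTs is a design choice, but it keeps the diagnostics from regressing silently.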
And of course, as the author found out, these properties make writing a parser a really good task for AI coding, too. These tools are very, very good at generating a bunch of new code based on existing abstractions and covering it with lots of test cases.
So I agree with where they ended up, just not where they started :)
Makes me think of all the algorithms we specify in proof languages and then hand-implement in production languages. This setup could maybe let you just specify the proof of an algorithm and then let LLMs derive efficient implementations, with the (slow) proof as an oracle.
If you have an oracle, and your problem is largely just a pure function, it's pretty good at generating something that both works and is fast.
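A rough sketch of what "having an oracle" looks like for a pure function, in Python. The problem (maximum subarray sum), the function names, and the `check_against_oracle` helper are all made up for illustration; the point is only the shape of the loop: a slow, obviously correct reference, a fast candidate, and random inputs compared between the two.

```python
import random


def max_subarray_slow(xs):
    """Oracle: brute force over every non-empty contiguous slice."""
    return max(
        sum(xs[i:j])
        for i in range(len(xs))
        for j in range(i + 1, len(xs) + 1)
    )


def max_subarray_fast(xs):
    """Candidate: Kadane's algorithm, linear time."""
    best = cur = xs[0]
    for x in xs[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best


def check_against_oracle(candidate, oracle, trials=500, seed=0):
    """Return the first input where the two disagree, or None if none is found."""
    rng = random.Random(seed)  # seeded so failures are reproducible
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 12))]
        if candidate(xs) != oracle(xs):
            return xs
    return None


counterexample = check_against_oracle(max_subarray_fast, max_subarray_slow)
print("counterexample:", counterexample)
```

A candidate that passes this is only as trustworthy as the input generator, so small inputs with lots of negatives and duplicates tend to be worth more than large ones. When the check does return a counterexample, that list is exactly the kind of feedback an agent can iterate on.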
I have a tool I'm building as a data plane for a graph engine, and it uses Cap'n Proto to help (and SQLite as a sort of IPC option). One of the biggest things I have is, I know I am not testing all of it to completion. I am not even really fuzzing, yet.
Thanks for sharing!
Perhaps the next target for a 100x improvement
So it's technically vibe-coding in the sense you don't really look at the code, you just look at the results and "go by the vibes"... except now you're working to rigorously quantify and enforce those vibes. (Philosophical aside: once vibes are rigorously enforced are they "vibes" anymore?)
Recently I was messing around with parquet files in Python and ended up needing to ship the results on Windows, without a Windows machine to test on.
Shipping Python to end users is half mad already, and doing it on Windows is exactly the kind of thing I don't want to spend my life maintaining.
So I figured I'd rewrite it in Go. But that meant embedding a DLL, and how would I test it? I could spin up a VM, sure. But GitHub Actions already has a Windows environment, and there was my loop: let the agent push to the repo, run tests in GHA, rinse and repeat.
In under an hour it had a full rewrite of my Python, passing every test and producing row-for-row copies of my Parquet output. And it does work on the user's machine!
Spotting a loop like that is as satisfying as noticing you can walk your chess opponent into a smothered mate. Truly empowering.
Amusing anecdotes on LLMs too:
> It did, in fact, make a lot of mistakes, kept doubting whether such a rewrite was even possible, and wanted to call it a day after each round of coding.
> Hilariously one of the most effective was to tell Claude to "think really hard about edge cases" in a background agent.
There’s something kind of amazing here in that, having read about property-based testing, I’m pretty confident I could apply it if I had a good use case.
Is this even true? I tried it in SQLite and there's a syntax error after first SELECT. It would work when "SELECT", "FROM" etc. are quoted, but that's not the same thing.
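For anyone who wants to reproduce this, here is a small check using Python's built-in `sqlite3` module. The exact statement from the article isn't quoted in this comment, so `SELECT SELECT FROM FROM` and the table and column names below are assumptions chosen to exercise the same thing: reserved words used as identifiers, unquoted versus quoted.

```python
import sqlite3


def try_sql(conn, sql):
    """Return None if the statement runs, else the SQLite error message."""
    try:
        conn.execute(sql).fetchall()
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


conn = sqlite3.connect(":memory:")
# Quoted identifiers let us create a table and column named after keywords.
conn.execute('CREATE TABLE "FROM" ("SELECT" INTEGER)')
conn.execute('INSERT INTO "FROM" VALUES (1)')

unquoted = try_sql(conn, "SELECT SELECT FROM FROM")
quoted = try_sql(conn, 'SELECT "SELECT" FROM "FROM"')

print("unquoted:", unquoted)  # an error message mentioning a syntax error
print("quoted:", quoted)      # None, meaning the query ran
```

This only says something about SQLite. Other engines and dialects draw the reserved-word line differently, so the claim may well hold elsewhere.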
What's wrong with the source language that it's better to use a sufficiently smart random code generator for the target language, and then fuzz the hell out of the output of it until it behaves the same as the slow translated code, than to create a sufficiently smart compiler from the source to target languages?
I mean, this sounds like replacing GCC with a really smart random assembly generator and a fuzzer for its output.
tobymao/sqlglot: Python SQL Parser and Transpiler; with tests and support for 30+ dialects: https://github.com/tobymao/sqlglot
Ibis depends upon sqlglot: https://github.com/tobymao/sqlglot/network/dependents