Before this, I had thought that C was a simple language, an idea propped up by articles like this one, as well as the oft-touted fact that nearly every embedded system has a C compiler: no matter what, you'll always have a C compiler.
This point was driven home by part of a blog post that simply states "you can't actually parse a C header"[0]. The blog makes a good supporting case for their claim. They link to a paper that says[1]:
> There exist many commercial and academic tools that can parse C.... Unfortunately, these parsers are often either designed for an older version of the language (such as C89) or plain incorrect. The C11 parsers found in popular compilers, such as GCC and Clang, are very likely correct, but their size is in the tens of thousands of lines.
And sure enough, in the blog post the OP linked, the author states they are only implementing a subset of the language. Of course, it still has value as a teaching tool; this is just a tangential fact about C I wanted to discuss.
[0]: https://faultlore.com/blah/c-isnt-a-language/#you-cant-actua...
Writing a C compiler in 500 lines of Python - https://news.ycombinator.com/item?id=37383913 - Sept 2023 (165 comments)
Still a really cool article and an impressive project, though. I especially like the StringPool technique; I'll have to keep it in mind if I ever write a compiler!
If you are interested in learning in more detail how to write a C compiler, I highly recommend the book "Writing a C Compiler" by Nora Sandler [0]. It is a super detailed, incremental guide to writing a C compiler, and it uses the traditional multi-pass architecture. It has its own IR called Tacky, and it even includes some optimization passes such as constant folding, copy propagation, dead code elimination, register allocation, etc. The book also implements many more features, including arrays, pointers, structs/unions, static variables, floating point, strings, linking to stdlib via the System V ABI, and much more.
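For anyone who hasn't run into constant folding before, here is a rough sketch of the idea in Python. To be clear, this is not the book's implementation and not its Tacky IR; the `Const`, `Var`, and `BinOp` node types and the `fold` function are names I made up for illustration. It also uses Python's unbounded integers, so it ignores C's fixed-width overflow rules, and it leaves out division entirely.

```python
from dataclasses import dataclass
from typing import Union
import operator


# Hypothetical expression nodes, invented for this sketch.
@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, BinOp]

# Only a few operators are handled; anything else is left unfolded.
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(expr: Expr) -> Expr:
    """Replace any operation whose operands are both constants with its result."""
    if isinstance(expr, BinOp):
        # Fold the children first so nested constants collapse bottom-up.
        left = fold(expr.left)
        right = fold(expr.right)
        if isinstance(left, Const) and isinstance(right, Const) and expr.op in _OPS:
            # Assumption: Python ints stand in for C ints (no overflow handling).
            return Const(_OPS[expr.op](left.value, right.value))
        return BinOp(expr.op, left, right)
    # Constants and variables are already as simple as they can be.
    return expr


# x * (2 + 3): the inner addition can be computed at compile time.
tree = BinOp("*", Var("x"), BinOp("+", Const(2), Const(3)))
folded = fold(tree)
```

Here the inner `2 + 3` collapses to a single constant, while the multiplication by `x` has to stay because `x` is not known until run time. A real compiler would presumably run something like this over its IR rather than the raw syntax tree, but the recursion is the same shape.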
I'd never actually looked into how compilers work before; it's surprisingly closely related to linguistics.