If the one-shot output resembles anything working (and I am betting it will), then obviously this isn't clean room at all.
I struggled a lot with some complex software, which worked on some emulators and failed on others (including mine).
For example one bug I had, which is still outstanding, relates to the Hisoft C compiler:
https://github.com/skx/cpmulator/issues/250
But I see that my cpm-dist repository is referenced in the download script so that made me happy!
It's great to see people still using CP/M, writing software for it, and sharing the knowledge. Though I do think the choice to implement the CCP in C, rather than using a genuine one, is an interesting one, and a bit of a cheat: it means that you cannot use "SUBMIT" and other commonplace binaries/utilities.
> Instead, different classes of instructions were implemented incrementally, and there were bugs that were fixed…
Not sure the author fully grasps how and why LLM agents work this way. There's a leap of logic here: the agent runs in a loop where command outputs get fed back as context for further token generation, which is what produces the incremental, human-like process he's observing. It's still that "decompression" from the weights, still the LLM's unique way of extracting and blending patterns from training data, that's doing the actual work. The agentic scaffolding just lets it happen in many small steps against real feedback instead of all at once. So the novel output is real, but he's crediting the wrong thing for it.
Looking at the two z80.c side by side, it definitely doesn't look like a copy-paste job (at least compared with [0], I'm sure it was trained on many others). The AI version is a lot better commented, although [0] is probably closer to how I would have structured it had I written one.
The interfacing between the two is very similar though, so I was curious to try the AI version to see if there was any cycle efficiency difference. Running the same short loop in Level II basic, I find the AI version to be just about 1.5% slower. Make of that what you will.
On one RP2350 core, I figure these versions top out about equal to a 6 MHz Z80. I do wonder what you would get if you asked for a version optimized for ARM Cortex-M33.
> Address bits for pixel (x, y):
> * 010 Y7 Y6 Y2 Y1 Y0 | Y5 Y4 Y3 X7 X6 X5 X4 X3
Which is wrong: since the function below takes the byte column, the low five bits are X4-X0. The comment does not match the code below it.
> static inline uint16_t zx_pixel_addr(int y, int col) {
It computes a pixel address with 0x4000 added to it, only to always subtract 0x4000 from it later. The ZX apparently has ROM at 0x0000..0x3fff, which necessitates the offset in general, but not in this case in particular.
This and the other inline function next to it for attributes are only ever used once.
> During the 192 display scanlines, the ULA fetches screen data for 128 T-states per line.
Yep.. but..
> Instead of a 69,888-byte lookup table
How does that follow? The description completely forgets to mention that a frame is 192 display scanlines plus 64 + 56 border/blanking lines, each of 224 T-states.
I'm bored. This is a pretty muddy implementation. It reminds me of the way children play with Duplo blocks.
As HN likes to say, only an amateur vibe-coder could believe this.
But anything is going to be similar, to one degree or another, to one or more real projects. So skeptics will claim that it doesn't prove anything.
But that's just the nature of reality and problem solving, and such an exercise would prove it could create a compiler for a novel platform.
It would be great if there were a website where all of these skeptics could register in solidarity their assuredness that "it's not real AI and can't be creative" and pledge not to use it.
What if Agents were hip enough to recognize that they have navigated into a specialized area and need additional hinting? "I'm set up for CP/M development, but what I really need now is Z80 memory management technique. Let me swap my tool head for the low-level Z80 unit..."
We can throw RAG on the pile and hope the context window ends up including the relevant tokens, but what if there were pointers instead?
Maybe a more sensible challenge would be to describe a system that hasn't been emulated before (or at least hasn't had an emulator's source released publicly, as far as you can tell from the internet) and then try it.
For fun, try obscure CPUs, giving it the same level of specification as you needed for this; or even try an imagined Z80-like CPU with the bit order of the encodings swapped and the ALU instructions reordered, and see how it manages.
> The above tools could theoretically be used to compile, build, and bootstrap an entire FreeBSD, Linux, or other similar operating system kernel onto MMIX hardware, were such hardware to exist.
Love the project, of course, but LLMs are a huge caveat to such claims, which will be very hard to make credibly in the future for anything not entirely novel.
I am itching to test its ability to code assembly.
I wish people would stop using this phrase altogether for LLM-assisted coding. It has a specific legal and cultural meaning, and the giant amount of proprietary IP that has been (illegally?) fed to the model during training completely disqualifies any LLM output from claiming this status.