> it’s just embarrassing — it’s as if the writer is walking around with their intellectual fly open.
I think Oxide didn't include this in the RFD because they exclusively hire senior engineers, but in an organization that includes junior engineers I'd add specific guidance on how they should approach LLM use.
Bryan has 30+ years of challenging software (and now hardware) engineering experience. He memorably said that he's worked on and completed a "hard program" (an OS), which he defines as a program you doubt you can actually get working.
The way Bryan approaches an LLM is very different from how a 2025 junior engineer does. That junior engineer may never have programmed without the tantalizing, even desperately tempting, option of being assisted by an LLM.
> Unlike prose, however (which really should be handed in a polished form to an LLM to maximize the LLM’s efficacy), LLMs can be quite effective writing code de novo.
Don't the same arguments against using LLMs to write one's prose also apply to code? Is the structure of the code, and the ideas within it, the engineer's? Or did it come from the LLM? And so on.
Before I'm misunderstood as an LLM minimalist, I want to say that I think they're incredibly good at solving blank-page syndrome -- just getting a starting point on the page is useful. But the code you actually want to ship is so far from what LLMs write that I think of them more as a crutch for blank-page syndrome than "good at writing code de novo".
I'm open to being wrong and want to hear any discussion on the matter. My worry is that this is another one of the "illusion of progress" traps, similar to the one that currently fools people with the prose side of things.
That's a bold claim. Do they have data to back it up? I'd only have confidence to say this after testing it against multiple LLM outputs; does it really work for, e.g., the em-dash leaderboard of HN, or people who tell an LLM not to use these 10 LLM-y writing clichés? I would need to see their reasoning for why they think this before I believe it.
My general procedure for using an LLM to write code, which is in the spirit of what is advocated here, is:
1) First, feed the existing relevant code into an LLM. This is usually just a few source files from a larger project.
2) Describe what I want to do, either giving an architecture or letting the LLM generate one. I tell it not to write code at this point.
3) Let it talk through the plan, and make sure that I like it. I will converse to address any deficiencies that I see, and I almost always find some.
4) Tell it to generate the code.
5) Skim and test the code to see if it's generally correct, and have it make corrections as needed.
6) Closely read the entire generated artifact at this point, and make manual corrections (occasionally automatic corrections like "replace all C-style casts with the appropriate C++-style casts", followed by a review of the diff; see the sketch below).
The hardest part for me is #6, where I feel a strong emotional bias towards not doing it, since I am not yet aware of any errors compelling such action.
This lets me operate at a higher level of abstraction (architecture) and removes the drudgery of turning an architectural idea into precise, written code. But in doing so, you are handing those details over to a non-deterministic system. This is different from, for example, using a compiler or a higher-level VM language: with those tools you can understand how they work, quickly form a good idea of what you're going to get, and rely on robust assurances. Understanding LLMs helps, but not to the same degree.
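To make that step-6 cleanup concrete, here is a minimal, purely hypothetical sketch (invented names, not from any real project) of the cast replacement I mentioned, which I'd then re-review as an ordinary diff:

```cpp
#include <cstdint>

struct Packet { std::uint8_t bytes[64]; };

// Before -- the kind of casts an LLM often emits:
//   double ratio = (double)count / total;
//   Packet* p    = (Packet*)buffer;

// After -- the mechanical correction, which then gets re-reviewed in the diff:
double ratio_of(int count, int total) {
    return static_cast<double>(count) / total;  // explicit numeric conversion
}

Packet* as_packet(void* buffer) {
    return static_cast<Packet*>(buffer);        // void* back to its real type
}

int main() {
    Packet pkt{};
    return (ratio_of(1, 2) < 1.0 && as_packet(&pkt) == &pkt) ? 0 : 1;
}
```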
It sets the rule that things must actually be read when there's a social expectation (code interviews, for example), but otherwise… remarks that the use of LLMs to assist comprehension has little downside.
I find two problems with this:
- There's an incoherence there. If LLMs are flawless at reading and summarization, there's no difference from reading the original. And if they aren't flawless, then that flaw also extends to the non-social stuff.
- In practice, I haven't found LLMs to be that good as reading assistants. I've sent them to check a linked doc and they've just read the index and inferred the context, for example. Just yesterday I asked for a comparison of three technical books on a similar topic, and it wrongly guessed the third one rather than following the three links.
There is a significant risk in placing a translation layer between content and reader.
I think this gets at a key point, but I'm not sure of the right way to articulate it.
A human-written comment may be worth something, but an LLM-generated one is cheap/worthless.
The nicest phrase capturing the thought I saw was: "I'd rather read the prompt".
It's probably just as good to let an LLM generate it again as it is to publish something written by an LLM.
Yes, allow the use of LLMs, encourage your employees to use them to move faster by rewarding "performance" regardless of risk, but make sure to place the responsibility for failure on them, so that when it happens, the company culture won't be blamed.
This applies to natural language, but, interestingly, the opposite is true of code (in my experience and that of other people that I've discussed it with).
Believing this in 2025 is really fascinating. This is like believing Meta won't use info they (il)legally collected about you to serve you ads.
This probably doesn't give them enough credit. If you can feed an LLM a list of crash dumps, it can do a remarkable job producing both analyses and fixes. And I don't mean just for super obvious crashes. I was most impressed with a deadlock where numerous engineers had tried and failed to understand exactly how to fix it.
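The real deadlock was far subtler than anything I could post here, but as a hypothetical sketch (invented names, not the actual code) the classic shape is two paths taking the same pair of locks in opposite order, and the fix is a single consistent acquisition order:

```cpp
#include <mutex>
#include <thread>

std::mutex a, b;

// Buggy shape: one path takes a then b, the other takes b then a.
// If each grabs its first lock before the other finishes, both wait forever.
void worker_1() {
    std::scoped_lock la(a);
    std::scoped_lock lb(b);
}

void worker_2_buggy() {
    std::scoped_lock lb(b);  // opposite order: this is the deadlock
    std::scoped_lock la(a);
}

// Fix: acquire both locks via a deadlock-avoiding algorithm
// (or pick one global lock order and use it everywhere).
void worker_2_fixed() {
    std::scoped_lock both(a, b);
}

int main() {
    std::thread t1(worker_1), t2(worker_2_fixed);  // buggy variant deliberately not run
    t1.join();
    t2.join();
}
```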
Is there any evidence for this?
<offtopic> Does the "RFD" here stand for "Reason/Request for Decision" or something else? (Request for Decision doesn't have a nice _ring_ to it tbh.) I'm aware of RFCs ofc and the respective status changes (draft, review, accepted, rejected), and of ADRs (Architectural Decision Records), but I have not come across the RFD acronym. Google gave several different answers. </offtopic>
I think the review by the prompt writer should be at a higher level than another person who reviews the code.
If I know how to do something, it's easier for me to avoid mistakes while doing it. Reviewing it requires different pathways in my brain. Since there is already code there, I'm drawn down that path, and I might not always spot the problem points. Or the code might be written in a way I don't recognize but that still exhibits the same mistake.
In the past, as a reviewer I used to be able to count on my colleagues' professionalism to be a moat.
The size of the moat is inversely proportional to the amount of LLM-generated code in a PR / project. At a certain point you can no longer guarantee that you stand behind everything.
Combine that with the push to do more, faster, with less, and we're increasing the amount of tech debt we're taking on.
By this own article's standards, now there are 2 authors who don't understand what they've produced.
[1] https://github.com/oxidecomputer/meta/tree/master/engineerin...
To extend that: If the LLM is the author and the responsible engineer is the genuine first reviewer, do you need a second engineer at all?
Typically in my experience one review is enough.
I couldn't disagree more. (In fact I'm shocked that Bryan Cantrill uses words like "comprehension" and "meaningfully" in relation to LLMs.)
Summaries provided by ChatGPT, and conclusions drawn by it, contain exaggerations and half-truths that are NOT there in the actual original sources, if you bother to ask ChatGPT for those sources and read them yourself. If your question is even slightly suggestive, ChatGPT's tuning is all too happy to tilt the summary in your favor; it tells you what you seem to want to hear, based on the phrasing of your prompt. ChatGPT presents, in confident and authoritative language, total falsehoods and deceptive half-truths after parsing human-written originals, whether those are natural-language text or source code. I now only trust ChatGPT to recommend sources to me, and I read those -- especially the relevant-looking parts -- myself. ChatGPT has been tuned by its masters to be a lying sack of shit.
I recently asked ChatGPT a factual question: the identity of a public figure (an artist) whom I had seen in a video on YouTube. ChatGPT answered with "Person X", and even explained why Person X's contribution to the piece of art in question was so great. I knew the answer was wrong, so I retorted only with: "Source?". ChatGPT apologized, then did the exact same thing, just with "Person Y", again explaining why Person Y was so influential in making that piece of art great. I knew the answer was still wrong, so I again said: "Source?". On the third attempt, ChatGPT finally said "Person Z", with a verifiable reference to a human-written document that identified the artist.
FUCK ChatGPT.
Actually, a lot of people dispute this, and I'm sure the author knows that!
Maybe for simple braindead tasks you can do yourself anyway.
Try them on something actually hard or complex and they get it wrong 100/100 times if they don't have adequate training data, and 90/100 if they do.
If you hand me a financial report, I expect you used Excel or a calculator. I don't feel cheated that you didn't do long division by hand to prove your understanding. Writing is no different. The value isn't in how much you sweated while producing it. The value is in how clear the final output is.
Human communication is lossy. I think X, I write X' (because I'm imperfect), you understand Y. This is where so many misunderstandings and workplace conflicts come from. People overestimate how clear they are. LLMs help reduce that gap. They remove ambiguity, clean up grammar, and strip away the accidental noise that gets in the way of the actual point.
Ultimately, outside of fiction and poetry, writing is data transmission. I don't need to know that the writer struggled with the text. I need to understand the point clearly, quickly, and without friction. Using a tool that delivers that is the highest form of respect for the reader.
THOU SHALT OWN THE CODE THAT THOU DOST RENDER.
All other values should flow from that, regardless of whether the code itself is written by you or AI or by your dog. If you look at the values in the article, they make sense even without LLMs in the picture.
The source of workslop is not AI, it's a lack of ownership. This is especially true for Open Source projects, which are seeing a wave of AI-slop PRs precisely because the onus of ownership falls largely on the maintainers and not the upstart "contributors."
Note also that this does not imply a universal set of values. Different organizations may well have different values for what ownership of code means -- e.g., in the "move fast, break things" era of Facebook, workslop may have been perfectly fine for Zuck! (I'd bet it may even have hastened the era of "Move fast with stable infrastructure.") But those values must be consistently applied regardless of how the code came to be.
1. Because reading posts like this 2. is actually frustrating as hell 3. when everything gets dragged around and filled with useless anecdotes, three-adjective mumbo jumbo, and endless em dashes, because somehow that's better than actually just writing something up.
Which just means that people in tech, or in general, have no understanding of what an editor does.
He is a long way from Sun.