Think of it as an extended bipolar-optimism-fueled glimpse into the future. Steve's MO is laid out in the Medium post - but basically, it's okay to lose things, rewrite whole subsystems, whatever, this is the future. It's really fun and interesting to watch the speed of development.
I've made a few multi agent coding setups in the last year, and I think gas town has the team side about right: big boss (mayor), operations boss (deacon), relatively linear keeper of truth (witness), single point for merges (refiner), lots of coders with their code held lightly.
I love the idea of formulas - a lot of what makes gas town work and informs how well it ultimately will work is the formulas. They're close conceptually to skills.
I don't love the mad max branding, but meh, whatever, it's fun, and a perk of the brave new world where you can make stuff like this for a few hundred bucks a month sent to Anthropic - software can have personality again, yay.
Conceptually I think there is a product team element to this still missing - deploy engineers, product managers, visual testing. Everything is sort of out there, janky in parts, but workable to glue together right now, and will only improve. That said, the mad max town analogy is going to get overstretched at some point; we already have pretty good names for all the parts that are needed, and as coordination improves, we're going to want to add more stuff into the coordination. So, I'd like to see a version of this with normal names and expanded.
Upshot - worth a look - if beads is any indication, give it a month or two or four to settle down unless you like living on the bleeding bleeding edge.
> Gas Town helps with all that yak shaving, and lets you focus on what your Claude Codes are working on.
Then:
> Working effectively in Gas Town involves committing to vibe coding. Work becomes fluid, an uncountable that you sling around freely, like slopping shiny fish into wooden barrels at the docks. Most work gets done; some work gets lost. Fish fall out of the barrel. Some escape back to sea, or get stepped on. More fish will come. The focus is throughput: creation and correction at the speed of thought.
I see -- so where exactly is my focus supposed to sit?
As someone who sits comfortably in the "Stage 8" category that this article defines, my concern has never been throughput. It has always been about retaining a high degree of quality while organizing work so that, when context switching occurs, it transitions me to near-orthogonal tasks which are easy to remember, so I can give high-quality feedback before switching again.
For instance, I know Project A -- these are the concerns of Project A. I know Project B -- these are the concerns of Project B. I have the insight to design these projects so they compose, so I don't have to keep track of a hundred parallel issues in a mono Project C.
On each of those projects, run a single agent -- with review gates for 2-3 independent agents (fresh context, different models! Codex and Gemini). Use a loop, let the agents go back and forth.
This works and actually gets shit done. I'm not convinced that 20 Claudes or massively parallel worktrees or whatever improves on quality, because, indeed, I always have to intervene at some point. The blocker for me is not throughput, it's me -- a human being -- my focus, and the random points of intervention which ... by definition ... occur stochastically (because agents).
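A minimal sketch of that single-agent-plus-review-gates loop, in Python. Everything here is an assumption for illustration: `Review`, `run_with_review_gates`, and the worker/reviewer callables are hypothetical names, and the actual model calls (Claude, Codex, Gemini) are left to whatever you plug in as callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    approved: bool
    feedback: str = ""


# Hypothetical signatures, not any real agent API:
# a worker takes (task, feedback so far) and returns a draft;
# a reviewer takes (task, draft) and returns a Review.
Worker = Callable[[str, List[str]], str]
Reviewer = Callable[[str, str], Review]


def run_with_review_gates(
    task: str,
    worker: Worker,
    reviewers: List[Reviewer],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Loop one worker against independent reviewers until all approve."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        draft = worker(task, feedback)
        # Each reviewer sees only the task and the draft (fresh context).
        reviews = [review(task, draft) for review in reviewers]
        feedback = [r.feedback for r in reviews if not r.approved]
        if not feedback:
            return draft, round_no
    # The stochastic intervention point: hand it back to the human.
    raise RuntimeError("reviewers never converged; a human needs to step in")
```

The `max_rounds` cap is the design choice that matters: when the agents stop converging, the loop fails loudly instead of burning tokens.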
Finally:
> Opus 4.5 can handle any reasonably sized task, so your job is to make tasks for it. That’s it.
This is laughably not true, for anyone who has used Opus 4.5 for non-trivial tasks. Claude Code constantly gives up early, corrupts itself with self-bias, the list goes on and on. It's getting better, but it's not that good.
Gas Town is clearly the same thing multiplied by ten thousand. The number of overlapping and ad hoc concepts in this design is overwhelming. Steve is ahead of his time but we aren't going to end up using this stuff. Instead a few of the core insights will get incorporated into other agents in a simpler but no less effective way.
And anyway the big problem is accountability. The reason everyone makes a face when Steve preaches agent orchestration is that he must be in an unusual social situation. Gas Town sounds fun if you are accountable to nobody: not for code quality, design coherence or inferencing costs. The rest of us are accountable for at least the first two and even in corporate scenarios where there is a blank check for tokens, that can't last. So the bottleneck is going to be how fast humans can review code and agree to take responsibility for it. Meaning, if it's crap code with embarrassing bugs then that goes on your EOY perf review. Lots of parallel agents can't solve that fundamental bottleneck.
I have enjoyed Steve's rants since "Execution in the Kingdom of Nouns" and the Google "Platform rant", but he may need someone to talk to him about bamboo and what a terrible life choice it is. Unless you can keep it the hell away from you and your neighbours it is bad, very bad. I'm talking about clumping varieties, the runners are a whole other level.
It's not that there's nothing useful, maybe even important, in there, it's just so far it's all just the easy parts: playing around inside a computer.
I've noticed a certain trend over the years where you get certain types of projects that get lots of hype and excitement and much progress seems to be made, but when you dig deep enough you find out that it's all just the fun, easy sort of progress.
The fun progress, which not at all coincidentally tends to also be the easy progress, is the type that happens solely inside a computer.
What do I mean by that? I mean programs that only operate at the level of artificial computer abstractions.
The hard part is always dealing with "the real world": hardware that returns "impossible" results to your nicely abstract api functions, things that stop working in places they really shouldn't be able to, or even, and this is the really tricky bit, dealing with humans.
Databases are a good example of this kind of thing. It's easy to start off a database writing all the clever (and fun) bits like btrees and hash maps and chained hashes that spill to disk to optimize certain types of tables and so on, but I'd wager that at least half of the code in a "real" database like sqlite or postgresql is devoted to dealing with strange hardware errors or leaky api abstractions across multiple platforms or the various ways a human can send nonsensical input into the system and really screw things up.
I'd also bet that this type of code is a lot less fun to write and took much longer than the rest (which incidentally is why I always get annoyed when programming language demos show code with only a happy path, but that's another rant and this comment is already excessive).
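To make the happy-path point concrete, here is a toy Python illustration (not database code, and both function names are made up for the example). The first function is what a demo would show; the second is the same job once a human is allowed to type the input.

```python
def parse_port_happy(text: str) -> int:
    # Demo version: assumes the input is always a clean number.
    return int(text)


def parse_port(text: str) -> int:
    # Same job, with the unglamorous checks a real tool ends up needing.
    if text is None:
        raise ValueError("port is missing")
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("port is empty")
    # isascii() rejects digits from other scripts that int() would accept.
    if not cleaned.isascii() or not cleaned.isdigit():
        raise ValueError(f"port must be a plain number, got {text!r}")
    port = int(cleaned)
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside 1-65535")
    return port
```

One line of logic, a dozen lines of dealing with the real world, and this is still the easy case with no hardware involved.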
Anyways, this AI thing is definitely a gold rush and it's important to keep in mind that there was in fact a lot of gold that got dug up but, as everyone constantly repeats, the more consistent way to benefit is to sell the shovels, and this is very definitely an ad for a shovel.
WARNING DANGER CAUTION GET THE F** OUT YOU WILL DIE
I have never met Steve, but this warning alone is :chefskiss:
I recognize 100% that a tool to manage ai agents with long term context tracking is going to be a big thing. Many folks have written versions of this already. But mashing together the complexity of k8s with a hodge podge of lotr and mad max references is not it.
It's like the complexity of J2EE combined with AI-fueled solipsism and a microdosing mushroom regime gone off the rails. What even are all the layers of abstractions here? And to build what? What actual apps or systems has this thing built? AFAICT it has built gas town, and nothing else. Not surprising that it has eaten its own tail.
The amount of jargon, ai art, pop culture references, and excessive complexity going on here is truly amazing, and I would assume it's satire if I didn't know Yegge's style and previous writings. It's like someone looked at the amount of overlapping and confusing tools Anthropic has released around Claude Code, and said "hold my beer, hand me 3 red bulls and a shot of espresso, I can top that!".
I do think a friend of mine nailed it though with this quote: "This whole "I'm using agents to write so much software" building-in-public trend, but without actually showing what they built, reminds me of the people selling courses on stock trading or drop shipping."
The number of get-rich-quick schemes around any new tech is boundless. As Yegge himself points out towards the end of the post, you'd be surprised what you can pull off with a ridiculous blog post, big-tech reputation, and excessive LOC dev-tools in a hype-driven market. How could it be wrong if it aligns so closely with so many CEOs' dreams?
Has to be close for the shortest time from first commit to HN front page.
I'm on my second agent orchestration framework, Omnispect - https://omnispect.dev/
Example created by Omnispect:
Oneshot - https://omnispect.dev/battleclone00.html
Polished - https://omnispect.dev/battleclone04.html
I'll add a personal anecdote - 2 years ago, I wrote a SwiftUI app by myself (mind you, I'm mostly an infrastructure/backend guy with some expertise in front end, where I get the general stuff, but never really made anything big out of it other than stuff on LAMPP back in the 2000s) and it took me a few weeks to get it to do what I wanted it to do, with the bare minimum of features. As I was playtesting my app, I kept writing a wishlist of features for myself, and later when I put it on the App Store, people around the world would email me asking for some other features. But life, work, etc. would get in the way, and I would have no time to actually do them, as some of the features would take me days/weeks.
Fast forward to 2 weeks ago: at this point I'm very familiar with Claude Code, how to steer multiple agents at a time, quickly review its outputs, stitch things together in my head, and ask for the right things. I've completed almost all of the features, rewrote the app, and it's already been submitted to the App Store. The code isn't perfect, but it's also not that bad. Honestly, it's probably better than what I would've written myself. It's an app that can be memory intensive in some parts, and it's been doing well in my testing. On top of that, since I've been steering 2-3 agents actively myself, I have the entire codebase in my mind. I also have an overwhelming amount of notes on what I would do better, etc.
My point is, if you have enough expertise and experience, you'll be able to "stitch things together" cleaner than others with no expertise. This also means user acquisition, marketing and data will be more valuable than the product itself, since it'll be easier to develop competing products. Finding users for your product will be the hard part. Which kinda sucks, if I'm being honest, but it is what it is.
He as a dev should know that adding a layer of names on top of already named entities is not a good practice. But he just had fun and this came up. Which is fantastic. But I don't want to have to translate names in my head all the time.
Just not useful. Beads also... really sorry to say this, but it is a task runner with labels, but it has 0 awareness of the actual tasks.
I don't know, maybe I am wrong, but this just doesn't seem like a thing that will work. Which is why I think it will be popular: nobody will be able to make it work, but they will not want to look dumb and will say it is awesome and amazing. Like another AI thingy that everyone is using, which I could name but will not.
But love Yegge and hope he does well. Amp, for the little bit that I used it, is a really solid agent and delivered much better results than many others.
We intend to sing the love of danger, the habit of energy and fearlessness.
Courage, audacity, and revolt will be essential elements of our poetry.
Up to now literature has exalted a pensive immobility, ecstasy, and sleep. We intend to exalt aggressive action, a feverish insomnia, the racer’s stride, the mortal leap, the punch and the slap.
We affirm that the world’s magnificence has been enriched by a new beauty: the beauty of speed. A racing car whose hood is adorned with great pipes, like serpents of explosive breath—a roaring car that seems to ride on grapeshot is more beautiful than the Victory of Samothrace. … https://www.arthistoryproject.com/artists/filippo-tommaso-ma...
225k lines for a cli issue tracker? What the fuck?
...no, I haven't lost the plot. I'm seeing another fad of the intoxicated parting with their money bending a useful tool into a golden hammer of a caricature. I dread seeing the eventual wreckage and self-realization from the inevitable hangover.
Looking at the screenshot of "Tracked Issues", it seems many of the "tasks" are likely overlapping in terms of code locality.
Based on my own experience, I've found the current crop of models to work well at a slightly higher-level of complexity than the tasks listed there, and they often benefit from having a shared context vs. when I've tried to parallelize down to that level of work (individual schema changes/helper creation/etc.).
Maybe I'm still just unclear on the inner workings, but it's my understanding each of those tasks is passed to Claude Code and developed separately?
In either case, I think this project is a glimpse into the future of software development (albeit with a grungy desert punk tinted lens).
For context, I've been "full vibe-coding"[0] for the past 6 months, and though it started painfully, the models are now good enough that not reading the code isn't much of an issue anymore.
This explains why some of the comments have timestamps that appear older than the post itself. I got tired of trying to make them line up, sorry!
I promptly gave Claude the text to the articles and had him rewrite using idiomatic distributed systems naming.
Fun times!
https://newsletter.semianalysis.com/p/how-ai-labs-are-solvin...
Assuming this isn't a parody project, maybe this just isn't for me, and that's fine. I'm struggling to understand a production use case where I'd be comfortable letting this thing loose.
Who is the intended audience for this design?
I'm looking for "the Emacs" of whatever this is, and I haven't read a blog post which isolates the design yet.
I think Gas Town looks interesting directionally and as a PoC. Like it or not, that's the world we'll end up in. Some products will do it well and some will be horrible monsters. (Like I'm already dreading Oracle Gas Town and Azure Gas Town).
I think the Amp coding agent trends in the direction of Gas Town already. Powerful but expensive, uses a mix of models and capabilities to do something that's greater than the sum of the parts.
I don't know the details, but I was wondering why people aren't "just" writing chat venues and comms protocols for the chats? So the fundamental unit is a chat that humans and agents can be a member of.
You can also have DMs etc to avoid chattiness.
But fundamentally, if you start with this kind of madness you don't have a strict hierarchy, and it might also be fun to see how it goes.
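A rough sketch of what "the fundamental unit is a chat" could look like, assuming an in-memory model with no transport at all. `Venue`, `Chat`, and `Message` are hypothetical names for illustration; humans and agents are both just member ids.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Message:
    sender: str
    text: str


@dataclass
class Chat:
    name: str
    members: Set[str] = field(default_factory=set)
    log: List[Message] = field(default_factory=list)

    def post(self, sender: str, text: str) -> None:
        # Humans and agents are treated identically: membership is the only gate.
        if sender not in self.members:
            raise PermissionError(f"{sender} is not a member of {self.name}")
        self.log.append(Message(sender, text))


class Venue:
    """Flat collection of chats; no built-in hierarchy between members."""

    def __init__(self) -> None:
        self.chats: Dict[str, Chat] = {}

    def open_chat(self, name: str, *members: str) -> Chat:
        chat = self.chats.setdefault(name, Chat(name))
        chat.members.update(members)
        return chat

    def dm(self, a: str, b: str) -> Chat:
        # A DM is just a two-member chat with an order-independent key.
        key = "dm:" + "+".join(sorted((a, b)))
        return self.open_chat(key, a, b)
```

Any hierarchy (who listens to whom) would then be a convention layered on top of the chats rather than something baked into the data model.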
I briefly started building this but just spun out and am stuck using PAL MCP for now and some dumb scripts. Not super content with any of it yet.
We're trying to orchestrate a horde of agents. The workers (polecats?) are the main problem solvers. Now you need a top level agent (mayor) to break down the problem and delegate work, and then a merger to resolve conflicts in the resulting code (refinery). Sometimes agents get stuck and need encouragement.
The molecules stuff confused me, but I think they're just "policy docs," checklists to do common tasks.
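Stripped of the names, the roles above reduce to a split / work / merge loop with retries. This is my own minimal Python sketch, not how Gas Town is actually implemented; `orchestrate` and the three callables are hypothetical stand-ins for the mayor, a polecat, and the refinery.

```python
from typing import Callable, Dict, List, Optional


def orchestrate(
    problem: str,
    mayor: Callable[[str], List[str]],              # breaks the problem into tasks
    polecat: Callable[[str, int], Optional[str]],   # returns None when stuck
    refinery: Callable[[Dict[str, str]], str],      # merges all worker output
    nudge_limit: int = 2,
) -> str:
    results: Dict[str, str] = {}
    for task in mayor(problem):
        # attempt 0 is the first try; later attempts are the "encouragement".
        for attempt in range(1 + nudge_limit):
            output = polecat(task, attempt)
            if output is not None:
                results[task] = output
                break
        else:
            raise RuntimeError(f"polecat stuck on {task!r} after {nudge_limit} nudges")
    return refinery(results)
```

The workers run sequentially here to keep the sketch short; running them in parallel is where the merge step starts to earn its keep.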
But this is baby stuff. Only one level of hierarchy? Show me a design for your VP agent and I'll be impressed for real.
He is so in love with his own voice.
Try to find actual screenshots of this shit or what it really does in the 200 000-word diarrhea (funnily he agrees it's diarrhea [1]).
---
He also references his previous slop called beads. To quote, "Course, I’ve never looked at Beads either, and it’s 225k lines of Go code that tens of thousands of people are using every day".
It's slop to a level that people create extensive scripts to try and purge it from the system since it infects everything you do: https://gist.github.com/banteg/1a539b88b3c8945cd71e4b958f319...
Do not listen to newly converted or accept anything from them. Steve Yegge used to be a good engineer with great understanding of the world. Now it's all gupps and polecats
[1] Quote from the article: "it’s a bunch of bullshit I pulled out of my arse over the past 3 weeks, and I named it after badgers and stuff."
Update: I was hoping it'd at least be smart enough to automatically test the project still builds but it did not. It also didn't commit the changes.
> are you the mayor?
Yes. I violated the Mayor protocol - I should have dispatched this work to the gmailthreading crew worktree instead of implementing it directly myself.
The CLAUDE.md is clear: "Mayor Does NOT Edit Code" and "Coordinate, don't implement."
Maybe Yegge should have built it around Codex instead - Codex is a lot better at adhering to instructions.
Pros: The overall system architecture is similar to my own latest attempt at solving this problem. I like the tmux-based console-monitoring approach (rather than going full SDK + custom UI), it makes it easier to inspect what is going on. The overlap between my ideas and Steve's is around 75%.
Cons: Arguing with "The Mayor" about some other detached process's poor workmanship seems like a major disconnect and architectural gap. A game of telephone is unlikely to be better than simply using claude. I was also hoping gastown would amplify my intent to complete the task of "Add feature X" without early-stopping, but so far it's more work than both 1. Vibing with claude directly and 2. Creating a highly-detailed spec with checkboxes and piping in "do the next task" until it's done.
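For reference, option 2 is roughly this much machinery. A hedged Python sketch: it assumes the spec is markdown-style `- [ ]` / `- [x]` checkboxes, and `do_task` is a hypothetical callable standing in for however you hand "do the next task" to the agent (no real CLI flags are assumed).

```python
import re
from typing import Callable, Optional

# Matches the first part of an unchecked item such as "- [ ] add helper".
UNCHECKED = re.compile(r"^([ \t]*[-*] )\[ \]( .*)$", re.MULTILINE)


def next_task(spec: str) -> Optional[str]:
    match = UNCHECKED.search(spec)
    return match.group(2).strip() if match else None


def check_off_first(spec: str) -> str:
    # Tick only the first unchecked box; leave the rest of the spec untouched.
    return UNCHECKED.sub(r"\1[x]\2", spec, count=1)


def run_spec(spec: str, do_task: Callable[[str], bool], max_steps: int = 100) -> str:
    """Feed unchecked items to do_task one at a time until none remain."""
    for _ in range(max_steps):
        task = next_task(spec)
        if task is None:
            return spec
        if not do_task(task):
            raise RuntimeError(f"agent gave up on: {task}")
        spec = check_off_first(spec)
    raise RuntimeError("spec not finished within max_steps")
```

The spec itself is the only shared state, which is what makes this approach easy to inspect and easy to resume after an early stop.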
Definitely looking forward to seeing how the tools in this space evolve. Eventually someone is bound to get it right!
P.S. The choice of nomenclature throughout the article is a bit odd, making it hard to follow. Movie characters, dogs and raccoons, huh? How about striving for descriptive SWE clarity?
Most likely, tens of other bugs are being introduced at each step, etc etc, right?
With vibe coding you just give the code some constraints and then the system will try to work within those constraints, but what if those constraints are wrong? What if you’re asking the wrong question? Then you’ll end up with overcomplicated slop.
It’s a shame that vibe coded slop seems to be a new standard, when in fact you can use AI tools to produce much higher quality code if you actually care to engage in thoughtful conversations with the AIs and take a growth mindset.
Our civilization is doomed if this is the future. Zero quality, zero resiliency, zero coherent vision, zero cohesive intent. Just chaotic slop everywhere, the ultimate Ouroboros.