Well, I have barely anything to show for months of this. I made Termux more accessible on Android, made a MUD client for Emacs, fixed up some Emacspeak stuff because it's been abandoned going on 3 years now, and Emacs packages wait for no one, and tried to add Grade 2 Braille entry support to BRLTTY. That failed because depression sucks and who would even use this vibe-coded junk anyway.
The more open nature of Android made it rather easier. How far behind in features TalkBack is compared to VoiceOver, besides AI image description, made it feel like trying to heal a broken arm with pain pills. So I'm trying to tell myself that I can't fix everything, and that it's not my fault if other people, and companies, choose to not consider accessibility. I mean I can't help Google if they choose to not be helped.
Ah well, Global Accessibility Awareness Day is this Thursday. Maybe Apple will finally announce LLM image descriptions, and hopefully my iPhone 16 will be good enough for them because I can't afford to upgrade in this economy.
I bet you do, working at OpenAI you get paid for more token use.
Give each Codex an AgentName and ask them to mark their PRs/issues/comments with it. Have one or two "managers" that manage PRs and overall project direction. I write the project directions and make long-lasting issues. Each Codex session has an almost unachievable `/goal`, but they are asked to achieve the goal by landing changes in `main` via PRs.
I have been running about 14 Codex sessions on 4 machines for about two weeks now, since OpenAI 10x'ed my 20x account, and I simply cannot run out of tokens fast enough.
Side note: I have multiple Claude accounts too, but the new Claude Code `/goal` command is seriously broken. It takes long pauses between iterations and sometimes stops prematurely.
Main difference would be just in how they're used (general purpose assistant vs "coding assistant") but the actual capabilities seem to be identical.
At what point do we stop calling this development? It's nothing even close to the process of development or engineering. "I tried to migrate X". No you didn't, you tried to ask an LLM and hoped for the best.
I mean, honestly, at what point would you bother? There's no learning happening, there's no creativity happening, just talking to a literal text generator to request your refund while you go for a shower. A novelty, maybe even convenient, but absolutely not development.
He must be a pleasure to work with
> When I come back to Slack, replies are often already sitting in drafts. I still decide what gets sent, but the expensive part of gathering context is done.
This just feels so dystopian to me. I hope that I never work with you or someone else doing this.
I personally do use LLMs for work messaging, but I'm extremely careful to state it clearly, like "here's a draft for that quotation request that Claude wrote:" or something like that. I would never present that as my own words.
Help me prioritize what matters most.
If someone asks me a question, research the answer as deeply as you can and draft a reply for me, but do not send it."
This is a very dangerous road to go down. You may feel like you are getting more done but end up living your life on autopilot, without any introspection or applying your own taste.
And about voice mode, I thought it was a good idea but I seriously don't know how you guys use it. My thoughts whenever I use voice are "aaaaaaaaahhhhhh, uhmmm" and then I cancel it so that I can type and organize my thoughts. I don't really think those "brain dumps" are useful when you are thinking out loud like "We should really do X oh wait but actually Y is in the way and we have to take into consideration Z, but wait Y was actually done" and so on. When it turns out that your assumptions are wrong, it becomes a mess. I am in favor of having the LLM work with facts and always verifying them. To me this post is basically selling the Codex app and that's it, nothing new inside.
- growing fast as fuck
- overrepresentation on starred repos (even though stars mean less these days, it is definitely something to look at)
- overrepresentation in `rust`
- in terms of aliveness, codex is first
The cost of memory-as-files isn't writing them. It's that the agent will cheerfully claim it updated something and not actually do it, or write a one-line stub that satisfies the spec but loses the original signal. Without a verification layer, the vault accumulates plausible-looking entries that quietly drift from reality.
What ended up working for me was treating the agent's self-reported summary as a wish, not a fact. A separate process diffs the actual file system against the claimed changes and flags mismatches.
After a few cycles, the agent gets calibrated and stops claiming things that don't survive a file check. That has the side benefit of making the diff review itself much higher signal: most of what shows up is real.
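For anyone curious what that check can look like, here is a minimal sketch, not my exact setup. It assumes you can snapshot the vault directory before and after an agent run, and that the agent's summary can be reduced to a list of relative paths it says it touched. The names `snapshot`, `changed_paths`, and `check_claims` are made up for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_paths(before: dict[str, str], after: dict[str, str]) -> set[str]:
    """Paths that were added, removed, or whose contents differ."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }


def check_claims(
    before: dict[str, str], after: dict[str, str], claimed: list[str]
) -> dict[str, set[str]]:
    """Compare what the agent says it changed with what changed on disk.

    `claimed` is assumed to be a list of relative paths pulled from the
    agent's self-reported summary (hypothetical format).
    """
    actual = changed_paths(before, after)
    claimed_set = set(claimed)
    return {
        # The agent said it updated these, but the bytes are identical.
        "claimed_but_unchanged": claimed_set - actual,
        # These changed on disk, but the summary never mentioned them.
        "changed_but_unclaimed": actual - claimed_set,
    }
```

Take a `snapshot` before the run and another after it, then pass both to `check_claims` along with the claimed paths. This only catches the "said it wrote, didn't write" case. The one-line stub problem needs something extra, like comparing sizes or diffing content for files that shrank a lot.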
The split I'd make early is per-agent instructions vs. cross-thread shared notes.
They sound like the same artifact, but “what this agent should always do” and “what sibling work just learned” age very differently. Mixing them means the wisdom gets stale together.