Hacker News

The Codex App

478 points by meetpateltech

by OlympicMarmoto

8 subcomments

It is baffling how these AI companies, with billions of dollars, cannot build native applications, even with the help of AI. From a UI perspective, these are mostly just chat apps, which are not particularly difficult to code from scratch. Before the usual excuses come about how it is impossible to build a custom UI, consider software that is orders of magnitude more complex, such as raddbg, 10x, Superluminal, Blender, Godot, Unity, and UE5, or any video game with a UI. On top of that, programs like Claude Cowork or Codex should, by design, integrate as deeply with the OS as possible. This requires calling native APIs (e.g., Win32), which is not feasible from Electron.

by jtrn

0 subcomment

People's mileage may vary, but in my instance, this was so bad that I actually got angry while trying to use it.
It's slow and stupid. It does not do proper research. It does not follow instructions. It randomly decides to stop being agentic, and instead just dumps the code for me to paste. It has the extremely annoying habit of just doing stuff without understanding what I meant, making a mess, then claiming everything is fine. The outdated training data is extremely annoying when working with Nuxt 4+. It is not creative at solving problems. It doesn't show the thinking. The Undo code does not give proper feedback on the diff and if it actually did "undo." And I hate the personality. It HAS to be better than it comes off for me because I am actually in a bad mood after having worked with it. I would rather YOLO code with Gemini 3 Flash, since it's actually smarter in my assessment, and at least I can iterate faster, and it feels like it has better common sense.
Just as an example, I found an old, terrible app I made years ago for our firm that handles room reservations. I told it to update from Bootstrap to Flowbite UI. Codex just took forever to make a mess, installed version 2.7 when 4.0.1 is the latest, even when I explicitly stated that it should use the absolute latest version. Then it tried to install it and failed, so it reverted to the outdated CDN.
I gave the same task to Claude Code. Same prompt. It one-shotted it quickly. Then I asked it to swap out ALL the fetch logic to have SPA-like functionality with the new beta 4 version of HTMX, and it one-shot that too in the time Codex spent just trying to read a few files in the project.
This reminds me of the feeling I had when I got the Nokia N800. It was so promising on paper, but the product was so bad and terrible to use that I knew Nokia was done for. If this was their take on what an acceptable smartphone could be, it proves that the whole foundation is doomed. If this is OpenAI's take on what an agentic coding assistant should be—something that can run by itself and iterate until it completes its task in an intelligent and creative way.... OpenAI is doomed.

by strongpigeon

4 subcomments

Genuinely excited to try this out. I've started using Codex much more heavily in the past two months and honestly, it's been shockingly good. Not perfect mind you, but it keeps impressing me with what it's able to "get". It often gets stuff wrong, and at times runs with faulty assumptions, but overall it's no worse than having average L3-L4 engs at your disposal.
That being said, the app is stuck at the launch screen, with "Loading projects..." taking forever...
Edit: A lot of links to documentation aren't working yet. E.g.: https://developers.openai.com/codex/guides/environments. My current setup involves having a bunch of different environments in their own VMs using Tart and using VS Code Remote for each of them. I'm not married to that setup, but I'm curious how it handles multiple environments.
Edit 2: Link is working now. Looks like I might have to tweak my setup to have port offsets instead of running VMs.
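For what the port-offset idea could look like in practice, here is a minimal sketch. It assumes every environment runs the same set of services and simply shifts each port by a fixed stride. The service names, base ports, and stride are hypothetical and not taken from the Codex docs.

```python
# Hypothetical base ports for the services each environment runs.
BASE_PORTS = {"web": 3000, "api": 8000, "db": 5432}

# Assumed gap between consecutive environments.
STRIDE = 10


def ports_for(env_index: int) -> dict[str, int]:
    """Return the port map for the env_index-th environment (0-based)."""
    if env_index < 0:
        raise ValueError("env_index must be non-negative")
    offset = env_index * STRIDE
    return {name: base + offset for name, base in BASE_PORTS.items()}


if __name__ == "__main__":
    # Print the port map for three side-by-side environments.
    for i in range(3):
        print(i, ports_for(i))
```

With these assumed values, environment 0 keeps the base ports and each further environment is shifted up by 10, so several checkouts can run on one machine without a VM per environment.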

by nr378

4 subcomments

Looks like another Claude App/Cowork-type competitor with slightly different tradeoffs (Cowork just calls Claude Code in a VM, this just calls Codex CLI with OS sandboxing).
Here's the Codex tech stack in case anyone was interested like me.
Framework: Electron 40.0.0
Frontend:
- React 19.2.0
- Jotai (state management)
- TanStack React Form
- Vite (bundler)
- TypeScript
Backend/Main Process:
- Node.js
- better-sqlite3 (local database)
- node-pty (terminal emulation)
- Zod (validation)
- Immer (immutable state)
Build & Dev:
- pnpm (package manager)
- Electron Forge
- Vitest (testing)
- ESLint + Prettier
Native/macOS:
- Sparkle (auto-updates)
- Squirrel (installer)
- electron-liquid-glass (macOS vibrancy effects)
- Sentry (error tracking)

by samuelstros

5 subcomments

It's basically what Emdash (https://www.emdash.sh/), Conductor (https://www.conductor.build/) and co. have been building, but as a first-class product from OpenAI.
It raises the question of whether Anthropic will follow up with a first-class Claude Code "multi agent" (git worktree) app themselves.

by dworks

0 subcomment

Somewhat underwhelmed. I consider agents to be a sidetrack. The key insight from the Recursive Language Models paper is that requirements, implementation plans, and other types of core information should not be part of context but exist as immutable objects that can be referenced as a source of truth. In practice this just means creating an .md file per stage (spec, analysis, implementation plan, implementation summary, verification and test plan, manual qa plan, global state reference doc).
I created this using PLANS.md and it basically replicates a kanban/scrum process with gated approvals per stage, locked artifacts when it moves to the next stage, etc. It works very well and it doesn't need a UI. Sure, I could have several agents running at the same time, but I believe manual QA is key to keeping the codebase clean, so time spent on this today means that future requirements can be implemented 10x faster than with a messy codebase.
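A rough sketch of the gating logic described above, assuming stages run in a fixed order and an artifact becomes read-only once its stage is approved. The class and method names are hypothetical; this illustrates the idea and is not the actual PLANS.md setup.

```python
from dataclasses import dataclass, field

# Stage names follow the list in the comment above; the order is an assumption.
STAGES = [
    "spec",
    "analysis",
    "implementation_plan",
    "implementation_summary",
    "verification_and_test_plan",
    "manual_qa_plan",
]


@dataclass
class Pipeline:
    """Tracks one requirement through gated stages with locked artifacts."""

    artifacts: dict[str, str] = field(default_factory=dict)
    locked: set[str] = field(default_factory=set)
    current: int = 0

    @property
    def stage(self) -> str:
        return STAGES[self.current]

    def write(self, stage: str, text: str) -> None:
        """Write an artifact; only the current, unlocked stage is editable."""
        if stage in self.locked:
            raise PermissionError(f"{stage} is locked")
        if stage != self.stage:
            raise ValueError(f"current stage is {self.stage}, not {stage}")
        self.artifacts[stage] = text

    def approve(self) -> None:
        """Manual gate: lock the current artifact and move to the next stage."""
        stage = self.stage
        if stage not in self.artifacts:
            raise ValueError(f"nothing written for {stage}")
        self.locked.add(stage)
        if self.current < len(STAGES) - 1:
            self.current += 1


if __name__ == "__main__":
    p = Pipeline()
    p.write("spec", "Room reservations must support recurring bookings.")
    p.approve()
    print(p.stage)
```

The point of the lock is that later stages can reference the spec as a fixed source of truth instead of carrying it around in context.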

by waldopat

0 subcomment

It seems the big feature is working agents in parallel? I've been working agents in parallel in Claude Code for almost 9 months now. Just create a command in .claude/commands that references an agent in .claude/agents. You can also just call parallel default Task agents to work concurrently.
Using slash commands and agents has been a game changer for me for anything from creating and executing on plans to following proper CI/CD policies when I commit changes.
To Codex more generally, I love it for surgical changes or whenever Claude chases its tail. It's also very, very good at finding Claude's blindspots on plans. Using AI tools adversarially is another big win in terms of getting things 90% right the first time. Once you get the right execution plan with the right code snippets, Claude is essentially a very fast typer. That's how I prefer to do AI-assisted development personally.
That said, I agree with the comments on tokens. I can use Codex until the sun goes down on $20/month. I use the $200/month pro plan with Claude and have only maxed out a couple times, but I do find the volume-to-quality ratio to be better with Claude. So far it's worth the money.

by oneneptune

2 subcomments

I'm a Claude Code user primarily. The best UI based orchestrator I've used is Zenflow by Zencoder.ai -- I am in no way affiliated with them, but their UI / tool can connect to any model or service you have. They offer their own model but I've not used it.
What I like is that the sessions are highly configurable from their plan.md which translates a md document into a process. So you can tweak and add steps. This is similar to some of the other workflow tools I've seen around hooks and such -- but presented in a way that is easy for me to use. I also like that it can update the plan.md as it goes to dynamically add steps and even add "hooks" as needed based on the problem.

by nycdatasci

2 subcomments

The landing page for the demo game "Voxel Velocity" mentions "<Enter> start" at the bottom, but <Enter> actually changes selection. One would think that after 7mm tokens and use of a QA agent, they would catch something like this.

by vzaliva

3 subcomments

What about us Linux users? This is Mac only. Do they plan to support the CLI version with all the features they are adding to the desktop app?

by lacoolj

4 subcomments

OpenAI, ChatGPT, Codex
So many of the things that pioneered the way for the truly good (Claude, Gemini) to evolve. I am thankful for what they have done.
But the quality is gone, and they are now in catch-up mode. This is clear, not just from the quality of GPT-5.x outputs, but from this article.
They launch something new, flashy, should get the attention of all of us. And yet, they only launch to Apple devices?
Then, there are typos in the article. Again. I can't believe they would be sloppy about this with so much on the line. EDIT: since I know someone will ask, couple of examples - "7MM Tokens", "...this prompt initial prompt..."
And why are they not giving the full prompt used for these examples? "...that we've summarized for clarity" but we want to see the actual prompt. How unclear do we need to make our prompts to get to the level that you're showing us? Slight red flag there.
Anyway, good luck to them, and I hope it improves! Happy to try it out when it does, or at the very least, when it exists for a platform I own.

by rubslopes

6 subcomments

To me, the obvious next step for these companies is to integrate their products with web hosting. At this point, the remaining hurdle for non-developers is deploying their creations to the cloud with built-in monetization.

by oompydoompy74

2 subcomments

Looks like they forgot the part of the code editor where you can… edit code. Claude Code in Zed is about the most optimal experience I can imagine. I want the agent on the side and a code editor in the middle.

by surrTurr

1 subcomments

- looks like OpenAI's answer to Claude Code Desktop / Cowork
- workspace agent runner apps (like Conductor) get more and more obsolete
- "vibe working" is becoming a thing - people use folder based agents to do their work (not just coding)
- new workflows seem to be evolving into folder based workspaces, where agents can self-configure MCP servers and skills + memory files and instructions
kinda interested to see if openai has the ideas & shipping power to compete with anthropic going forward; anthropic does not only have an edge over openai because of how op their models are at coding, but also because they innovate on workflows and ai tooling standards; openai so far has only followed in adoption (mcp, skills, now codex desktop) but rarely pushed the SOTA themselves.

by samstokes

1 subcomments

Bit of a buried lede:
> For a limited time we're including Codex with ChatGPT Free
Is this the first free frontier coding agent? (I know there have been OSS coding agents for years, but not Codex/Claude Code.)

by mellosouls

4 subcomments

Mac only. Again.
Apple is great but this is OpenAI devs showing their disconnect from the mainstream. It's complacent at best, contemptuous at worst.
SamA or somebody really needs to give the product managers here a kick up the arse.

by barbazoo

3 subcomments

> "Localize my app and add the option to change units"
To me this still feels like the wrong way to interact with a coding agent. Does this lead people to success? I've never seen it not go off the rails in some way unless you provide clear boundaries as to what the scope of the expected change is. It's gonna write code if you don't even want it to yet, it's gonna write the test first or the logic first, whichever you don't want it to do. It'll be much too verbose or much too hacky, etc.

by beklein

0 subcomment

This will actually work well with my current workflow: dictation for prompts, parallel execution, and working on multiple bigger and smaller projects so waiting times while Codex is coding are fully utilized, plus easy commits with auto commit messages. Wow, thank you for this. Since skills are now first class tools, I will give it a try and see what I can accomplish with them.
I know/hope some OpenAI people are lurking in the comments and perhaps they will implement this, or at least consider it, but I would love to be able to use @ to add files via voice input as if I had typed it. So when I say "change the thingy at route slash to slash somewhere slash page dot tsx", I will get the same prompt as if I had typed it on my keyboard, including the file pill UI element shown in the input box. Same for slash commands. Voice is a great input modality, please make it a first class input. You are 90% there, this way I don't need my dictation app (Handy, highly recommended) anymore.
Also, I see myself using the built in console often to ls, cat, and rg to still follow old patterns, and I would love to pin the console to a specific side of the screen instead of having it at the bottom and pls support terminal tabs or I need to learn tmux.

by epolanski

1 subcomments

OT: I never liked how Codex didn't ask for confirmation before editing. Claude has auto-accept off by default, and I never understood why Codex didn't have the same. I want to iterate on the LLM's edit suggestions.
Did they fix it?
Otherwise I'm not interested.

by daviding

1 subcomments

This looks interesting and I use Codex a fair bit already in vscode etc, but I'm having trouble leaving a 'code editor with AI' to an environment that sort of looks like it puts the code as a hidden secondary artefact. I guess the key thing is the multi agent spinning plates part.

by eamag

2 subcomments

> For a limited time, Codex will also be available to ChatGPT Free and Go users to help build more with agents. We’re also doubling rate limits for existing Codex users across all paid plans during this period.
Is there more information about it? For how long and what are the limits?

by fanyangxyz33

0 subcomment

I really look forward to using this. I tried Codex first time yesterday and it was able to complete a task (i.e. drawing Penrose tilings) that Claude Code previously failed at. Also a little overwhelmed by all the new features that this app brings. I feel that I'm behind all the fancy new tools.

by inercia

0 subcomment

Something similar but for any ACP server: https://github.com/inercia/mitto

by hmokiguess

3 subcomments

I'm still waiting for the big pivotal moment in this space, I think there is a lot of potential with rethinking an IDE to be Agent first, and lots of what is out there is still lacking. (It's like we all don't know what we don't know, so we are just recycling UX around trying to solve it)
I keep coming back to my basic terminal with tmux running multiple sessions. Recently, though, I forked https://github.com/tiann/hapi and have been loving using Tailscale to expose my setup on my mobile device for convenience (plus the voice input there).

by justkez

3 subcomments

Genuinely curious if people would just let this rip with no obvious isolation?
I’m aware macOS has some isolation/sandboxing, but without running Codex via Docker I wouldn’t be running Codex.
(Appreciate there are still risks)

by archiepeach

0 subcomment

Interesting timing for me personally as I just switched from running Codex in multiple tabs in Cursor to Ghostty. It had nicer fonts by default, better tab switching that was consistent with the keyboard shortcut to switch to any tab on Mac, and it had native notifications that would ping when Codex had finished. Worktrees requiring manual configuration was probably the one sticking point, so definitely looking forward to this.

by daxfohl

1 subcomments

It would be nice if it didn't have to be all local. I'd love a managed cluster feature where you could just blast some workloads off to some designated server or cluster and manage them remotely, share progress with teammates, etc. (Not "cloud" though; I'd still want them on the internal network). I imagine something like that is in the works.

by causal

0 subcomment

This is the 5th OpenAI product called Codex if I'm counting correctly

by punnerud

0 subcomment

When can I get remote access in the iPhone app? Start on my laptop, check results using Tailscale/VPN, and add follow-ups on mobile to run on the computer. I know many who would love this feature.

by avazhi

0 subcomment

ChatGPT can’t even write me a simple working AutoHotKey script so I’m not sure why I’d trust it with any actual coding. As I’ve done for about the past year with OpenAI showcases like this, this elicited an ‘Oh, that’s kinda neat, I’ll just wait for Gemini to do something similar so it will actually work’ from me.

by hollowturtle

0 subcomment

I don't know about you, but apart from AI tools race fatigue (which feels pretty much like framework fatigue), all I see is the mouse traveling a lot between far-apart small elements, buttons and textareas. AI should have brought innovation even to UIs, but we basically stopped innovating there.

by blueaquilae

0 subcomment

This is an ode to opencode and how OpenAI, very strangely, is just porting the layout and features of real open source.
So much valuation, so much internal competition and shenanigans that the creatives left.

by keeeba

0 subcomment

Is everything OpenAI do/release now a response to something Anthropic have recently released?
I remember the days when it was worth reading about their latest research/release. Halcyon days indeed.

by solomatov

1 subcomments

Is it open source? Do they disclose which framework they use for the GUI? Is it Electron or Tauri?

by shevy-java

5 subcomments

No.
I am glad to not depend on AI. It would annoy me to no end how it tries to assimilate everything. It's like systemd on roids in this aspect. It will swallow up more and more tasks. Granted, in a way this is saying "then it was not necessary to have these things anymore now that AI solves it all", but I am skeptical of "the promised land" here. Skynet was not trusted back in 1982 or so. I don't trust AI either.

by elpakal

0 subcomment

why would it need local network access though, I wonder?

by gordon_freeman

1 subcomments

How does the Codex Mac app compare with Cursor? Can anyone who has tried both explain here?
My experience with Cursor is generally good and I like that it gives me UX of using VS Code and also allows selection of multiple models to choose if one model is stuck on the prompt and does not work.

by xiphias2

0 subcomment

I guess the next it was meant to happen...I tried Google's Antigravity and found it quite buggy.
May give a go at this and Claude Code desktop as well, but Cursor guys are still working the hardest to keep themselves alive.

by Dowwie

0 subcomment

Hey, that's great OpenAI. Now add about 6 zeroes to the end of the weekly token limit for your customers and maybe we could use the app

by joe8756438

1 subcomments

Is there any marked difference or benefit over Claude Code?

by Oras

0 subcomment

I’ve been using codex regularly and it’s pretty good at model extra high with pretty generous context.
From the video, I can see how this app would be useful in:
- Creating branches without having to open another terminal, or creating a new branch before the session.
- Seeing diff in the same app.
- working on multiple sessions at once without switching CLI
- I quite like the “address the comments”, I can see how this would be valuable
I will give it a try for sure

by e1g

2 subcomments

Wow, this is nearly an exact copy of Codex Monitor[1]: voice mode, project + threads/agents, git panel, PR button, terminal drawer, IDE integrations, local/worktree/cloud edits, archiving threads, etc.
[1] https://github.com/Dimillian/CodexMonitor

by lvl155

2 subcomments

Bugs me they treat macOS as first class. Do people actually develop on a Mac in 2026? Why not just start with Linux?

by mdrzn

0 subcomment

Maybe it's because I'm not used to the flow, but I prefer to work directly on the machine where I'm logged in via ssh, instead of working "somewhere in a git tree", and then have to deploy/test/etc.
Once this app (or a similar app by Anthropic) will allow me to have the same level of "orchestration" but on a remote machine, I'll test it.

by aed

0 subcomment

I typically bounce between Claude Code and Codex for the same project, and generally enjoy using both to check each other.
One cool thing about this: upon installing it immediately found all previous projects I've used with Codex and has those projects in the sidebar with all of the "threads" (sessions) I've had with Codex on these projects!

by hamasho

0 subcomment

A simpler, similar app: vibe-kanban
https://www.vibekanban.com/

by archiepeach

2 subcomments

We are certainly approaching the point where a high end MacBook Pro for development isn’t required. Feels very close to just being able to use an iPad? My current workplace deploys on Vercel, we already test actively on feature branches and the models have gotten so good that you can reliably just commit what they’ve done (with linting and type check hooks etc) and in the rare event something is broken, follow up with a new commit.

by freeqaz

1 subcomments

Does anybody know when Codex is going to roll out subagent support? That has been an absolute game changer in Claude Code. It lets me run with a single session for so much longer and chip away at much more complex tasks. This was my biggest pain point when I used Codex last week.

by SunshineTheCat

1 subcomments

This does look like it would simplify some aspects of using Codex on Mac, however, when I first saw the headline I thought this was going to be a phone app. And that started running a whole list of ideas through my brain... :(
But overall, looks very nice and I'm looking forward to giving it a try.

by luke_walsh

0 subcomment

I'm excited to try this out, it seems like it would solve a lot of my workflow issues. I hope there is the ability to review/edit research docs and plans it generates and not just code.

by macmac_mac

1 subcomments

i've been using ai vibe coding tools since Copilot was basically spicy autocomplete, and this feels like the next obvious step: less “help me type” and more “please do this while I watch nervously.” The agent model sounds powerful, but in practice it’s still a lot of supervision, retries, and quiet hope it doesn’t hallucinate itself into a refactor I didn’t ask for.

by austinhutch

0 subcomment

I'm managing context with codex inside VSCode using different threads. I'm trying to figure out if there are use cases where I'd rather be in this app.

by geooff_

0 subcomment

Having dictation and worktree support built in is nice. Currently there is a whole ecosystem of tools implementing similar functionality for Claude Code. The automations look cool too!

by xGrill

2 subcomments

Is this not just a skinned version of Goose: https://block.github.io/goose/

by thefounder

0 subcomment

How is this better than vscode with the codex extension?

by karmasimida

2 subcomments

Not to rain on the parade, but this app feels to me ... unpolished. Some of the options in the demo feel less thought out and just put together.
I will try it out, but is it just me, or is the product/UX side of recent OpenAI products sort of ... skipped over? It is good that agents help ship software quickly, but please no half-baked stuff like Atlas 2.0 again ...

by poolnoodle

3 subcomments

These paid offerings geared toward software development must be a hell of a lot "smarter" than the regular chatbots. The amount of nonsense and bad or outright wrong code Gemini and ChatGPT throw at me lately is off the charts. I feel like they are getting dumber.

by _rwo

0 subcomment

seems like I need to update my toolset for the 3rd time this week

by ngrilly

0 subcomment

Currently using opencode with Codex 5.2 and wondering why I should switch.

by kblissett

0 subcomment

Does this support users who access Codex via Azure OpenAI API keys?

by asdev

0 subcomment

Built an open source lightweight version of this that works with any cli agent: https://github.com/built-by-as/FleetCode

by robmn

0 subcomment

This is so garbage. OpenAI is never catching up.

by minimaxir

1 subcomments

The inclusion of a live vibe-coded game on the webpage is fun, except the game barely works and it's odd they didn't attempt any polish/QA for what is ostensibly a PR announcement. It just adds more fuel to the fire to the argument that vibecoding results in AI slop.

by gigatexal

0 subcomment

For pure code generation, is ChatGPT 5.2 so much better than Claude Opus 4.5 thinking that I should switch? I’m basically all in on Claude.
Sure, I could move to opencode and use them as commodities, but I’ve gotten used to Claude Code and like using the vendor's first-party app.

by ChrisMarshallNY

0 subcomment

Eh. Kicked the tires for a few minutes. Back to the old clunker app.
No worries. I'm not their target demographic, anyway.

by FergusArgyll

0 subcomment

> and we're doubling the rate limits on Plus, Pro, Business, Enterprise, and Edu plans.
I love competition

by christkv

0 subcomment

What are the max context sizes ?

by teaearlgraycold

0 subcomment

Kind of embarrassing to demo "Please change this string to gpt-5.2". Presumably the diff UI doesn't let you edit the text manually? Or are they demonstrating being so AI-brained you refuse to type anything yourself?

by wpm

1 subcomments

Maybe I'm just not getting it, but I just don't give a flying fuck about any of this crap.
Like, seriously, this is the grand new vision of using a computer, this is the interface to these LLMs we're settling on? This is the best we could come up with? Having an army of chatbots chatting to each other running basic build commands in a terminal while we what? Supervise them? Yell at them? When am I getting manager pay bumps then?
Sorry. I'll stick with occasionally chatting with one of these things in a sandboxed web browser on a single difficult problem I'm having. I just don't see literally any value in using them this way. More power to the rest of you.

by desireco42

0 subcomment

It keeps offering me "Get Plus" even though I am signed in and already have a Plus plan.
Codex has really grown on me lately. I re-signed to try it out on a project I have and it turned out to be a really great addition to my toolkit.
It isn't always perfect, and its CLI (how I mostly use it) isn't as sophisticated as OpenCode, which is my default.
I am happy with this app. I am using Superset, a terminal app which is surprisingly well positioned to help you if you work in the CLI like I do. But like I said, the new desktop app seems like a solid addition.

by submeta

1 subcomments

> Work with multiple agents in parallel
But you can already do that, in the terminal. Open your favourite terminal, use splits or tmux and spin up as many claude code or codex instances as you want. In parallel. I do it constantly. For all kinds of tasks, not only coding.
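As a sketch of that terminal-only workflow, the helper below builds the tmux commands for one pane per working directory. It only prints the commands and does not run anything. The agent command `codex` and the directory paths are placeholders, and the helper name is hypothetical. The tmux subcommands and flags used (`new-session -d -s -c`, `split-window -t -c`, `select-layout tiled`) are standard tmux options.

```python
import shlex


def tmux_commands(
    agent_cmd: str, workdirs: list[str], session: str = "agents"
) -> list[list[str]]:
    """Build tmux commands that start one agent pane per working directory."""
    if not workdirs:
        raise ValueError("need at least one working directory")
    # The first pane is created together with the detached session.
    cmds = [
        ["tmux", "new-session", "-d", "-s", session, "-c", workdirs[0], agent_cmd]
    ]
    # Every further directory gets its own split in the same session.
    for workdir in workdirs[1:]:
        cmds.append(
            ["tmux", "split-window", "-t", session, "-c", workdir, agent_cmd]
        )
    # Even out the pane sizes once all panes exist.
    cmds.append(["tmux", "select-layout", "-t", session, "tiled"])
    return cmds


if __name__ == "__main__":
    # Placeholder directories, e.g. separate git worktrees of one repository.
    for cmd in tmux_commands("codex", ["/tmp/wt-a", "/tmp/wt-b", "/tmp/wt-c"]):
        print(shlex.join(cmd))
```

Pointing each pane at its own git worktree keeps the agents from editing the same checkout, which is roughly what the desktop apps automate.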

by jauntywundrkind

0 subcomment

Does the Codex app host MCP Apps?

by simianwords

0 subcomment

I really want to like the native Mac app aesthetic but I kinda hate it. It screams minimalist but also clearly tells me it’s not meant for a power user. That ruggedness and sensitivity is missing.

by sergiotapia

0 subcomment

No Linux support? :(

by vcryan

0 subcomment

Another boring update from OpenAI. Why would I want an orchestration tool tied to one model? Part of the value of orchestration tools is using the most appropriate and/or cost effective model for the task, sub-task, etc

by drcongo

3 subcomments

> We're also excited to show more people what's now possible with Codex. For a limited time we're including Codex with ChatGPT Free and Go, and we're doubling the rate limits on Plus, Pro, Business, Enterprise, and Edu plans.
Translated from Marketingspeak, this is presumably "we're also desperate for some people to actually use it because everyone shrugged and went back to Claude Code when we released it".

by ath3nd

0 subcomment

[dead]

by miroljub

2 subcomments

Given the prevalence of Opencode and its ability to use any model and provider, I don't see why anyone would bother with random vendors' half-assed tools.