Hacker News

Uber's $1,500/Month AI Limit Is a Useful Signal for AI Tool Pricing

45 points by pdyc

by CharlieDigital

12 subcomments

$1500/mo is $18,000/seat/annum.
Maybe Microsoft and Nvidia are on to something.
128 GB machines that can run local LLMs are a bargain even if priced $5-8k. Yes, tok/s is not quite there, but that's probably OK since the bottleneck really isn't the code; it's WTF did Uber build with all of that spend? How did it meaningfully impact their revenue in a positive direction?
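A rough sketch of the arithmetic behind that comparison, assuming an engineer actually spends the full $1,500 cap every month and ignoring electricity, maintenance, and the tok/s gap. The function names are illustrative, not from any real tool.

```python
MONTHLY_CAP_USD = 1500  # the per-engineer cap discussed in the thread


def annual_cost(monthly_usd: float) -> float:
    """Yearly spend per seat if the monthly cap is fully used."""
    return monthly_usd * 12


def breakeven_months(machine_price_usd: float,
                     monthly_usd: float = MONTHLY_CAP_USD) -> float:
    """Months of full-cap API spend that equal the one-off machine price."""
    return machine_price_usd / monthly_usd


if __name__ == "__main__":
    print(f"annual cost per seat: ${annual_cost(MONTHLY_CAP_USD):,.0f}")
    for price in (5000, 8000):  # the $5-8k range quoted above
        print(f"${price:,} machine: {breakeven_months(price):.1f} months")
```

Under those assumptions, a machine in the quoted price range pays for itself in well under a year of full-cap spend.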

by ashahin

2 subcomments

The $1,500 frames this as a per-engineer ceiling, but the unit of consumption shifted under everyone's feet — engineers don't issue prompts anymore, they kick off agent loops that fan out into 20–100 tool calls and 10–50 LLM calls per task. A single agent run on a non-trivial refactor burns more tokens than the engineer typing for an hour. So the cap doesn't constrain engineers, it constrains agent-task throughput per engineer — which is a different thing. The leaderboard-vs-cap debate misses that the metric worth bounding is $/successful-PR or $/correct-completion, not $/engineer-month; variance between cheap and expensive tasks at the same budget is 10–50x now rather than 2–3x. Per-tool caps eventually force every team to ask: which workflows justify burning through tokens, and which should be cached, retrieved, or templated.
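A minimal sketch of what tracking $/successful-PR could look like, assuming each agent run is logged with its token cost and whether its PR was merged. `AgentRun` and `cost_per_successful_pr` are hypothetical names, and the numbers in the usage example are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AgentRun:
    cost_usd: float  # total LLM spend for the run (hypothetical field)
    merged: bool     # whether the run produced a PR that was merged


def cost_per_successful_pr(runs: Iterable[AgentRun]) -> Optional[float]:
    """Total spend, including failed runs, divided by merged PRs.

    Returns None when nothing was merged, since the ratio is undefined.
    """
    runs = list(runs)
    total = sum(r.cost_usd for r in runs)
    successes = sum(1 for r in runs if r.merged)
    if successes == 0:
        return None
    return total / successes


if __name__ == "__main__":
    sample = [AgentRun(2.0, True), AgentRun(6.0, False), AgentRun(4.0, True)]
    print(cost_per_successful_pr(sample))
```

Failed runs are charged against the successes on purpose: a workflow that burns tokens without producing a merged PR still raises the cost of the ones that did.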

by jkwang

1 subcomments

The $1500 number is less interesting than the fact that they hit a ceiling at all. Most engineering teams I've talked to have no idea what their AI spend is per developer because it's buried in a consolidated cloud bill. Having a hard cap forces two useful conversations: what workflows actually justify API calls vs local inference, and whether the output is being measured against any real productivity metric. Without that feedback loop it's just a race to see who can burn tokens fastest.

by PessimalDecimal

5 subcomments

These are still at currently subsidized prices. We'll see if they think they're getting $1500/month of value when that buys significantly fewer tokens.

by f311a

0 subcomment

by LurkandComment

0 subcomment

1) This happened because they fundamentally misunderstand how to use AI and how AI is priced. 2) Most organizations are throwing everything in for analysis and not limiting the answer they want. You need to be specific about what you analyze and what answers you want. 3) People undervalue prompting and templated responses. I will have written, validated, and sanity-checked a prompt several times, and run it across several models, before I say it's ready for use. But when it is, I know what it will give me, and that the scope of its research and answer is as close to what I want as it can be, with as little excess as I can manage. This all saves tokens.

by jwpapi

0 subcomment

If you estimate a $10k monthly salary per engineer, the $1,500 cap is 15% of salary, which suggests that is the point where it becomes cheaper for them to hire another engineer. That doesn't mean AI is improving productivity by 15%, but if 15% is the point where it stops being better than another human, can we assume about 7.5%?
Probably even less, because you would spend that extra $1,500 per employee even if you just save 10%, so $150 per employee, which is 1.5% of salary.
IMHO this is one of the best ranges we can assume for now. How much would that be across the whole SWE market?

by ilia-a

2 subcomments

Seems like an odd limit, especially since it is highly dependent on the token provider used. With Opus this is not much and could easily be burnt in a week or less, but with something like DeepSeek the $1,500 can literally be an annual budget.
That being said, I do have to wonder why someone as big as, say, Uber doesn't simply roll out an OSS model in the cloud for their team. I'd imagine that would be the cheapest and most flexible option, while also keeping all the data shared with the LLM private.

by epsteingpt

0 subcomment

Uber engineers reported that loading their workspace and pulling recent commits exhausted that AI limit for Claude Code (4.8 x-high) immediately.

by cloudking

0 subcomment

They are also beholden to enterprise pricing and can't use the subsidized consumer max plans.

by sremani

0 subcomment

I have a strong conviction that companies will now choose their tech stack and programming languages based on 'tokenomics'. I am vibe coding in Clojure, a language I can read but cannot write, and I never hit the usage limits even when using the latest model on Claude. I have a similar experience with F#, which is a bit more verbose than Clojure but absolutely beats every OOP language, Python, TypeScript, etc.
The reason I use F# and Clojure is that they target the JVM and CLR, two popular enterprise stacks.
In my not so humble opinion, Lisp (Clojure) still remains the language of AI.

by jedisct1

1 subcomments

by ChrisArchitect

0 subcomment