Hacker News

Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing

121 points by _____k

by delichon

0 subcomment

There is little new under the big fusion reactor in the sky. I just read a chapter in James Gleick's "The Information" about tokenmaxxing in the telegraphy industry. There used to be a big market for code books to reduce the per-character charges for sending telegrams. Compression was cash in the pocket. The telegraph companies discouraged the practice but were forced to accept it. The telegraph code industry started with the initial commercialization of telegraphy and didn't end until the 1920s.
There was a cost to it though. Codes greatly reduced redundancy, and caused large miscommunications from very small errors. As Gleick explains it, this was the opposite of the African drumming practice of adding redundancy to strengthen the relationship between the rhythm and the language that the drums mimic.

by izanton

8 subcomments

What if... we stop for a moment, and then, after thinking for a moment, we stop hammering nails with a microscope, and stop using token usage as a metric of productivity?
I know it sounds stupid, but what if

by FartyMcFarter

3 subcomments

If any company announces that they use token consumption as an employee performance signal, for me that's close to a red flag to stay away from that company.
No company with good engineering leadership should act like this is remotely a good idea.

by mrkeen

9 subcomments

I always used to wonder this about software stacks even prior to LLMs, but it seems more relevant now somehow:
When will Uber (or your favourite company) be 'done'? They've been writing software for 16 years.
They match drivers to passengers. More software isn't going to increase the chance that I seek them out instead of taking a bus or train.
Will their software be finished in 20 years? 80?

by crorella

3 subcomments

Tokenmaxxing makes no sense. It is akin to writing extremely inefficient SQL / Spark jobs, full of cartesian joins, ultra skewed datasets, etc., just for the sake of using as much compute / memory / IO as possible.
This always happens when the metric becomes the goal. Companies should nurture and foster an environment where AI is used in the most efficient way possible, first asking "do we really need an agent for this" and, if so, what kind of agent is needed, what model, reasoning level, etc.
They should also promote projects that aim at saving tokens, increasing cache hits, and codifying the information in ways that use as little context as possible (graphs of knowledge are pretty good for this!)

by rr808

3 subcomments

I have Opus 4.7 at work at 15x. Burns through tokens like water. It feels like one of these new mega datacenters is just for me. I'd love to know what the bill is, but we're just encouraged to do as much AI as possible.

by mchusma

2 subcomments

I actually do think token maxing is good, but they should have limited it per user. I find it really hard to get people to max out the Claude $100 plan, let alone the $200 plan. I understand the enterprise plans are different and more expensive, which is how you get these kinds of issues. But encouraging people to try things with AI is very important, and some amount of token maxing is important.

by jhack

3 subcomments

Maybe don't use the most expensive models on the planet? Maybe use AI like a tool and not this black box that grants wishes?

by bilater

0 subcomment

The black bill that is coming that nobody is prepared for is that the value of a token varies greatly depending on the human. Companies will quickly find out it's much better to give your top 10% engineers a lot more tokens and lay off your average engineers. The 10x engineer will become the 1000x engineer.
Wrote about this and the impact on jobs here: https://x.com/deepwhitman/status/2058324179506831372

by simonw

1 subcomments

I'd be interested to know if this is about individual employee AI usage, or use of AI tokens in production features, or both - and assuming both, what the split is.
I can see how Uber could burn unbelievable amounts of tokens if they start running internal features that run a bunch of prompts against every completed ride, or every customer profile, for example.
Or maybe this is about employee usage, but they introduced some stupid "you get evaluated on how many tokens you used" thing a couple of months ago when that was trendy and are just beginning to notice how much that cost?

by mmastrac

0 subcomment

I am certain that the max sustainable boost from AI use -- with code review and otherwise all-in -- is approximately 20% with the appropriately skilled senior engineering talent, and the token budget for any engineer should not exceed that.
I do not believe that engineers who are tokenmaxxing are truly productive and I have not seen any evidence whatsoever (perhaps the opposite).
I've personally found that with the right flow and codebase knowledge, that's achievable with sustainable levels of effort.

by cryo32

2 subcomments

Waiting for tokenedging next.

by InsideOutSanta

0 subcomment

"He said that, based on talks with Uber's senior engineering leaders, he realized higher token usage did not translate into a proportional increase in useful consumer features."
He's saying that like it's some grand epiphany and not the most self-evident, obvious thing I've heard this month. Some of the literal dumbest people on earth are in charge of these major companies.

by victor9000

0 subcomment

Clearly they need more layoffs, and for that matter why keep anyone around? After all, AI will be writing 100% of code in 2026.

by mustaphah

0 subcomment

Feels like they are debating internally whether to cut people or AI spending. Very healthy debate. Let's hope they spare people.

by chihuahua

6 subcomments

It's amazing that it took months to figure this out. "Well we thought that if engineers are told to maximize costs through AI use, to consume as much as possible of a resource that costs us money, then obviously good things will happen. Imagine my surprise when it didn't turn out that way."
Imagine if engineers were ranked based on their AWS spend. People allocate VMs and fill databases with terabytes of random bits, to get to the top of the AWS leaderboard. If you don't do this, you're ranked at the bottom, and good luck at the next review cycle. Who could have expected that this is not the road to success?

by JackDanMeier

0 subcomment

At what point is there a difference between a burn rate and tokenmaxxing? Isn't it the same as during the dotcom bubble?

by matheusmoreira

0 subcomment

LLMs are great, I can understand using them in general. I can even understand chasing 100% weekly usage if you're using the gacha-like subscriptions since that's how you get the most value out of what you paid for.
The way these corporations are going about it is completely insane though. They're essentially ordering their employees to set money on fire or be fired themselves. The more money you burn on tokens at insane API rates, the better an employee you are. Absolutely mind boggling.

by rcvassallo83

0 subcomment

Oof, leaders of the bubble are starting to take a step back?

by mustaphah

0 subcomment

Tokenmaxxing is so dumb. You should never show your team how exactly you're measuring their performance; people will optimize for the metric, not the actual performance.
Classic Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure.

by illithid0

1 subcomments

>"He said that, based on talks with Uber's senior engineering leaders, he realized higher token usage did not translate into a proportional increase in useful consumer features."
Goodhart's law strikes again at someone with enough power to be both ignorant of it and make others suffer their ignorance. You cannot simply measure productivity by tokens spent just like you can't measure it by hours spent in a chair at a desk.

by phendrenad2

0 subcomment

AI productivity hasn't been well studied yet, but I'm betting that we'll end up with some variation on Price's Law, i.e. some small subset of workers get most of the benefit, while most just burn tokens with little to show for it.
I also want to call out the false productivity opportunities AI offers. There are whole teams building their own "gas town" and not shipping features.

by lorecore

0 subcomment

Not all tokens are created equal. It's easy to use a ton of tokens by having agents work together in parallel. That's basically the equivalent of people spending time in meetings, hardly a productivity win. As with everything in development, results matter, how you get there doesn't (unless you're a bad manager).

by irishcoffee

1 subcomments

I just realized my company is months behind this curve. About to blow my token allocation. Before I do, anyone have requests? Sincerely.

by 7777777phil

9 subcomments

As soon as tokens stop being subsidized, heavy agentic use will become at least as expensive as paying an (entry level) employee. When this happens many companies will trade off heavy token usage for (maybe a bit slower, bit less accurate) employees again.

by paulpauper

0 subcomment

Many of these leading AI companies are operating at large losses and subsidizing users with VC money. Profitability will entail having to impose greater limits and raising prices, so this will reduce to some degree the value proposition of AI compared to humans.

by pocksuppet

0 subcomment

what the fuck is this timeline I am stuck living in

by Rohunyyy

1 subcomments

Now we are going to get a new profession. Token Engineer! They will be experts on tokenmaxxing! The job growth that the billionaire CEOs promised us from AI is finally here!

by aplomb1026

0 subcomment

[flagged]

by nekzn

3 subcomments

It’s funny that “maxxing” entered the common vocabulary.

by gigatexal

5 subcomments

I find it useful enough that if they cut the use altogether I will pay for it out of pocket.

by egypturnash

1 subcomments

Uber COO says he just decided to short a bunch of AI company stock.

by hmokiguess

0 subcomment

Why do we keep doing this? It's the same as measuring by LoC, we know it's not gonna work. Also, see Goodhart's Law[1]
[1] https://en.wikipedia.org/wiki/Goodhart%27s_law