It's impressive to see how fast open-weights models are catching up in specialized domains like math and reasoning.
I'm curious if anyone has tested this model for complex logic tasks in coding? Sometimes strong math performance correlates well with debugging or algorithm generation.
by WhitneyLand
0 subcomment
Shouldn’t there be a lot of skepticism here?
All the problems they claim to have solved are on the Internet, and they explicitly say they crawled them. They do not mention doing any benchmark decontamination or excluding 2024/2025 competition problems from training.
IIRC, OpenAI/Google did not have access to the 2025 problems before testing their experimental math models.
by terespuwash
1 subcomments
Why isn’t OpenAI’s gold medal-winning model available to the public yet?
by simianwords
2 subcomments
It's worth noting that this model is not general purpose, whereas the ones Google and OpenAI used were general purpose.
by H8crilA
3 subcomments
How do you run this kind of a model at home? On a CPU on a machine that has about 1TB of RAM?
by letmetweakit
0 subcomment
Does anyone know if this will become available on OpenRouter?
by sschueller
6 subcomments
How is OpenAI going to be able to serve ads in ChatGPT without everyone immediately jumping ship to another model?
by LZ_Khan
0 subcomment
Don't they distill directly off OpenAI/Google outputs?