This is very misleading because the generalisation ability of LLMs is extremely high. They don't just memorise problems; that's nonsense.
At high-school-level maths you genuinely can't get GPT-5 Thinking to make a single mistake. Not possible at all, unless you give it some convoluted, ambiguous prompt that no human could parse either. If you grant that, how could memorisation alone explain it?
In fact, even undergraduate-level mathematics is quite simple for GPT-5 Thinking.
IMO gold was won... by what? Memorising solutions?
I challenge people to find ONE example of high-school or undergrad-level maths that GPT-5 Thinking gets wrong. I couldn't manage it myself. You must allow it all tools, though.
Just like how GPUs were optimised to pass synthetic benchmarks.