OS models close the gap (via distillation) with frontier models, then get burned to chip, then offer commoditized inference via data farms or local plugins.
With thought loops this fast even if the models are less smart they can be self correcting to level them selves up.