The problem is that the companies serving LLMs can't delineate compute between users. You're using a cloud of compute, not a single individual unit of compute. If usage could be measured at the unit of compute, you'd just pay by the minute for what you use instead of by the token, and we would get out of this situation.