This gives you practically unlimited usage of frontier models like Kimi, DeepSeek, and GLM. Their models are always full-size, never quantised, except where the lab itself provides a 4-bit or 8-bit model. You can see from the model config exactly which Hugging Face model it pulls and the serving configuration used.
Prompts are encrypted using a Trusted Execution Environment (TEE), so neither the model host nor a neighbour can view your prompts. That's as close as you can get to local-level privacy in the cloud.
On Claude Max 20x, I use ~30% of the weekly limit per day. The equivalent on Kimi AI: I use 60% of the weekly limit in one day.
On DeepSeek's latest model, with the 95% discount and caching, I racked up ~$60/day before I stopped.
I don't know how Claude computes its daily limits, but it seems much cheaper.
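For a rough sense of scale, here is a back-of-the-envelope sketch of the numbers above. It assumes the percentages are shares of a weekly quota burned per day at a steady rate, and that the ~$60/day would have continued for a full week; the function names are just illustrative.

```python
def days_until_exhausted(daily_fraction: float) -> float:
    """Days of steady use before a weekly quota reaches 100%.

    daily_fraction is the share of the weekly quota used per day
    (assumed constant, which real usage rarely is).
    """
    return 1.0 / daily_fraction


def weekly_cost(daily_cost_usd: float, days: int = 7) -> float:
    """Total spend if the daily pay-per-token cost held for `days` days."""
    return daily_cost_usd * days


# Figures taken from the comment above; the interpretation is an assumption.
claude_days = days_until_exhausted(0.30)   # ~30% of the weekly limit per day
kimi_days = days_until_exhausted(0.60)     # ~60% of the weekly limit per day
deepseek_week = weekly_cost(60.0)          # ~$60/day on pay-per-token

print(f"Claude quota lasts about {claude_days:.1f} days")
print(f"Kimi quota lasts about {kimi_days:.1f} days")
print(f"DeepSeek at that rate: about ${deepseek_week:.0f}/week")
```

Under those assumptions the same workload eats a weekly quota roughly twice as fast on Kimi as on Claude, and pay-per-token at that rate adds up quickly over a week.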