by A_D_E_P_T
11 subcomments
- Kimi K2 is a really weird model, just in general.
It's not nearly as smart as Opus 4.5 or 5.2-Pro or whatever, but it has a very distinct writing style and also a much more direct "interpersonal" style. As a writer of very-short-form stuff like emails, it's probably the best model available right now. As a chatbot, it's the only one that seems to really relish calling you out on mistakes or nonsense, and it doesn't hesitate to be blunt with you.
I get the feeling that it was trained very differently from the other models, which makes it situationally useful even if it's not very good for data analysis or working through complex questions. For instance, as it's both a good prose stylist and very direct/blunt, it's an extremely good editor.
I like it enough that I actually pay for a Kimi subscription.
by Kim_Bruning
2 subcomments
- Kimi K2 is a very impressive model! It's particularly un-obsequious, which makes it useful for actually checking your reasoning on things.
Some ChatGPT models, especially older ones, will tell you that everything you say is fantastic and great. Kimi, on the other hand, doesn't mind taking a detour to question your intelligence and likely your entire ancestry if you ask it to be brutal.
- Claims like this are, as always, misleading, as they don't show the context length or the prefill time if you use a lot of context. It will be fun waiting minutes for a reply.
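The prefill complaint is easy to quantify with back-of-envelope arithmetic. The sketch below is illustrative only; the throughput figure is an assumed placeholder, not a measured number for this Mac Studio setup:

```python
# Hypothetical prefill-wait estimate. The tokens/s figure below is an
# assumption for illustration, not a benchmark of this hardware.
def prefill_wait_seconds(context_tokens: int, prefill_tokens_per_sec: float) -> float:
    """Time spent processing the prompt before the first output token appears."""
    return context_tokens / prefill_tokens_per_sec

# Example: a 100k-token context at an assumed 150 tokens/s of prefill.
wait = prefill_wait_seconds(100_000, 150.0)
print(f"{wait / 60:.1f} minutes")  # about 11 minutes before the first token
```

Headline tokens-per-second numbers measure decode speed on short prompts, which is why long-context waits like this never show up in them.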
- A single 512GB M3 Ultra is $9,499.00
https://www.apple.com/shop/buy-mac/mac-studio/apple-m3-ultra...
- Is there a Linux equivalent of this setup? I see some mention of RDMA support on Linux distros, but it's unclear to me whether this is hardware-specific (requires ConnectX, or in this case Apple Thunderbolt) or whether something interesting can be done with "vanilla 10G NIC" hardware.
by websiteapi
6 subcomments
- I get tempted to buy a couple of these, but I just feel like the amortization doesn’t make sense yet. Surely in the next few years this will be orders of magnitude cheaper.
- I use this model in Perplexity Pro (included in Revolut Premium), usually in threads where I alternate between Claude 4.5 Sonnet, GPT-5.2, Gemini 3 Pro, Grok 4.1 and Kimi K2.
The beauty of this setup is that any model you switch to can read the whole thread, so it can critique and augment the answers from the models before it. I've done this for ages with the various OpenAI models inside ChatGPT, and now I can do the same with all these SOTA thinking models.
To my surprise, Kimi K2 is quite sharp, and often finds errors or omissions in the thinking and analyses of its colleagues. Now I always include it in these ensembles, usually at the end, to judge the preceding models and add its own "Tenth Man" angle.
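The ensemble pattern described above can be sketched in a few lines. `call_model` here is a placeholder stub, not any provider's real client API; the point is only the structure, where every model receives the full thread and the judge runs last:

```python
# Sketch of the multi-model ensemble described above: each model sees the
# entire thread so far, and the final model acts as a "Tenth Man" judge.
# `call_model` is a stand-in stub for a real chat-completion client.

def call_model(model: str, messages: list[dict]) -> str:
    # Stub: a real implementation would call the provider's API here.
    return f"[{model} answered after reading {len(messages)} messages]"

def run_ensemble(question: str, models: list[str]) -> list[dict]:
    thread = [{"role": "user", "content": question}]
    for model in models:
        # Each model receives the whole thread, including prior answers,
        # so it can critique and extend what came before it.
        reply = call_model(model, thread)
        thread.append({"role": "assistant", "name": model, "content": reply})
    return thread

thread = run_ensemble(
    "Compare these designs.",
    ["claude-4.5-sonnet", "gpt-5.2", "gemini-3-pro", "kimi-k2"],  # judge goes last
)
```

Ordering matters: whichever model runs last has read everything, which is why placing Kimi K2 at the end turns it into the judge of the preceding answers.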
- What benchmarks are good these days? I generally just try different models in Cursor, but most of the open-weight models aren't available there (DeepSeek V3.2 and many others are missing, and Kimi K2 has some formatting problems), so I'd be curious to see some benchmarks, especially for non-web stuff (C++, Rust, etc.).
by Alifatisk
2 subcomments
- You should mention that it is a 4-bit quant. Still very impressive!
- Does this also run with Exo Labs' token prefill acceleration using DGX Spark? I.e., take 2 Sparks and 2 Mac Studios and get inference speed comparable to what 2x M5 Ultras will be able to do?
- Is this using the new RDMA-over-Thunderbolt support from macOS 26.2?
- Isn't this the same model that recently won the competition for drawing a real-time clock?
- What is it using for interconnect?
- Is there no API for the Kimi K2 Instruct...?