- Well props to them for continuing to improve, winning on cost-effectiveness, and continuing to publicly share their improvements. Hard not to root for them as a force to prevent an AI corporate monopoly/duopoly.
by gradus_ad
15 subcomments
- How will the Googles/Anthropics/OpenAIs of the world make money on AI if open models are competitive with their models? What hurt open source in the past was its inability to keep up with the quality and feature depth of closed-source competitors, but models seem to be reaching a performance plateau; the top open-weight models are generally indistinguishable from the top private models.
Infrastructure owners with access to the cheapest energy will be the long-run winners in AI.
- Worth noting this is not only good on benchmarks, but significantly more efficient at inference https://x.com/_thomasip/status/1995489087386771851
by embedding-shape
1 subcomments
- > DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a "thinking with tools" capability.
At first, I thought they had gone the route of implementing yet another chat format that can handle more dynamic conversations like that, instead of just using Harmony, but looking at the syntax, doesn't it look exactly like Harmony? That's a good thing, don't get me wrong, but why not mention straight up that they've implemented Harmony, so people can already understand up front that it's compatible with whatever parsing we're using for GPT-OSS?
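If you want to poke at the new tool-calling format without writing a parser for it, the DeepSeek API is OpenAI-compatible, so a minimal sketch like this should exercise it (the tool schema here is a made-up example, not from their docs):

    from openai import OpenAI

    client = OpenAI(base_url="https://api.deepseek.com",
                    api_key="sk-...")  # your DeepSeek API key

    # Toy tool definition (illustrative); the server-side chat template
    # renders it into the Harmony-like syntax discussed above.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "Weather in Hangzhou?"}],
        tools=tools,
    )
    print(resp.choices[0].message.tool_calls)

And if the template really is Harmony-shaped, whatever parsing you already have for GPT-OSS raw output should presumably carry over.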
- It's awesome that stuff like this is open source, but even if you have a basement rig with four NVIDIA GeForce RTX 5090 graphics cards (a $15-20k machine), can it even run with any reasonable context window at better than a crawling 10 tokens/s?
Frontier models are far exceeding even the most hardcore consumer hobbyist's hardware, and the gap only seems to be widening.
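For scale, a rough back-of-envelope (assuming the ~671B-total / ~37B-active MoE sizing of the V3 line and 32 GB per 5090; numbers are illustrative, not measured):

    # Illustrative memory arithmetic, not a benchmark.
    total_params = 671e9           # ~671B total parameters (MoE)
    bytes_per_param = 0.5          # 4-bit quantized weights
    weights_gb = total_params * bytes_per_param / 1e9
    print(f"weights alone: ~{weights_gb:.0f} GB")   # ~336 GB

    vram_gb = 4 * 32               # four RTX 5090s at 32 GB each
    print(f"quad-5090 VRAM: {vram_gb} GB")          # 128 GB

So even at 4-bit, the weights alone are roughly 2.6x the VRAM of a quad-5090 box, before any KV cache; local runs at this size typically spill experts to system RAM and eat the throughput hit.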
- Benchmarks are super impressive, as usual. Interesting to note in Table 3 of the paper (p. 15): DS-Speciale is 1st or 2nd in accuracy on all tests, but has much higher token output (50% more, or 3.5x vs Gemini 3 on the Codeforces test!).
by BoorishBears
1 subcomments
- 3.2-Exp came out in September: this is 3.2, along with a special checkpoint (DeepSeek-V3.2-Speciale) for deep reasoning that they're claiming surpasses GPT-5 and matches Gemini 3.0
https://x.com/deepseek_ai/status/1995452641430651132
by nickandbro
1 subcomments
- For anyone who is interested:
"create me a svg of a pelican riding on a bicycle"
https://www.svgviewer.dev/s/FhqYdli5
by mcbuilder
1 subcomments
- After a couple of hours playing around with it, it is a very solid entry, and very competitive with the big US releases. I'd say it's better than GLM-4.6 and Kimi K2. Looking forward to V4.
by singularity2001
3 subcomments
- Why are there so few 32, 64, 128, 256, or 512 GB models which could run on current consumer hardware? And why is the maximum RAM on the Mac Studio M4 128 GB??
- So DSA means a lightweight indexing model evaluated over the entire context window, plus a top-k attention evaluation over the selected positions. There's no softmax in the indexing model, so it can run blazingly fast in parallel.
I'm surprised that a fixed-size k doesn't degrade in long context windows, though. That's a _lot_ of responsibility to push into that indexing function. How could such a simple model achieve high enough precision and recall with a fixed-size k over long contexts?
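A minimal sketch of that two-stage shape in PyTorch (single head, causal masking omitted; names and shapes are mine, not DeepSeek's actual kernels):

    import torch

    def dsa_sketch(q, k, v, idx_q, idx_k, top_k=2048):
        # q: (seq, d); k, v: (ctx, d); idx_q: (seq, d_i); idx_k: (ctx, d_i)
        # Stage 1: lightweight indexer -- raw dot-product scores over the
        # whole context, no softmax, so it's cheap and fully parallel.
        scores = idx_q @ idx_k.T                       # (seq, ctx)
        top_k = min(top_k, k.size(0))
        top_idx = scores.topk(top_k, dim=-1).indices   # (seq, top_k)

        # Stage 2: exact softmax attention, but only over the top-k
        # positions the indexer picked for each query.
        k_sel, v_sel = k[top_idx], v[top_idx]          # (seq, top_k, d)
        att = torch.einsum('sd,std->st', q, k_sel) / q.size(-1) ** 0.5
        return torch.einsum('st,std->sd', att.softmax(dim=-1), v_sel)

Everything outside the top-k positions contributes nothing, which is exactly why the indexer's precision and recall matter so much.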
- Funny to see tau2-bench on the list of benchmarks, when tau2-bench is flawed and a 100% score is impossible unless you add the tasks to the training set: https://github.com/sierra-research/tau2-bench/issues/89
by mark_l_watson
0 subcomment
- I used DeepSeek-v3.2 to solve two coding problems by pasting code and directions as one large prompt into a chat interface and it performed very well. VERY WELL!
I am still happy to pay Google because of their ecosystem: the Gemini app, NotebookLM, Colab, gemini-cli, etc. Google's moat for me is all the tooling and engineering around the models.
That said, my one-year Google AI subscription ends in four months and I might try an alternative, or at least evaluate options. Alibaba Cloud looks like an interesting low-cost alternative to AWS for building systems. I am now a retired ‘gentleman scientist’ and my personal research is inexpensive no matter who I pay for inference compute, but it is fun to spend a small amount of time evaluating alternatives, even though mostly using Google is time efficient.
by sfdlkj3jk342a
1 subcomments
- What version is actually running on chat.deepseek.com?
It refuses to tell me when asked, only that it's been trained on data up until July 2024, which would make it quite old. I turned off search and asked it for the winner of the 2024 US election, and it said it didn't know, so I guess that confirms it's not a recent model.
- I've found it better than ChatGPT lately, at least compared to the free version of GPT.
I don't know, but GPT seems to have regressed a lot, at least on the free tier.
- I genuinely do not understand the valuations of the US AI industry. The Chinese models are so close, and far cheaper.
- The AI market is hard to predict because new algorithms can emerge unexpectedly. Refer to this summary of Ilya's opinions for insight into why new algorithms are necessary: https://youtu.be/DcrXHTOxi3I
DeepSeek is a valuable product, but its open-source nature makes it hard for it to displace larger competitors: any advances can be quickly adopted by them, and in fact it may inadvertently strengthen those companies by highlighting weaknesses in their current strategies.
- Note the combination of a big frontier-level model and an MIT license.
by nickstinemates
1 subcomments
- I am waiting for the first truly open model without any of the censorship built in.
I wonder how long it will take, and how quickly people will try to get it shut down.
- I hate that their model IDs don't change as they change the underlying model. I'm not sure how you can build on that.
% curl https://api.deepseek.com/models \
-H "Authorization: Bearer ${DEEPSEEK_API_KEY}"
{"object":"list","data":[{"id":"deepseek-chat","object":"model","owned_by":"deepseek"},{"id":"deepseek-reasoner","object":"model","owned_by":"deepseek"}]}
by matt-alive
0 subcomment
- Is it open source vs enterprise or China vs US?
- What is the ballpark VRAM/GPU requirement to run this?
- Cool to see open models catching up fast. For builders the real question is simple: which model gives you the tightest loop and the fewest surprises in production? Sometimes that is open, sometimes closed. The rest is noise.
by johnnienaked
1 subcomments
- Are we the baddies?
by arthurcolle
0 subcomment
- Surely OpenAI will follow up with a gpt-oss-780b
by twistedcheeslet
2 subcomments
- How capable are these models at tool calling?
by EternalFury
0 subcomment
- It does seem good, but it’s slow.
- Any results on FrontierMath or ARC?
- Disclaimer: I haven't tested this yet.
I don't want to make big generalizations, but one thing I noticed with Chinese models, especially Kimi, is that they do very well on benchmarks but fail on vibe testing. They feel a little over-fit to the benchmarks and less to real use cases.
I hope it's not the same here.
- To push back on the naivety I'm sensing here: I think it's a little silly to see a Chinese Communist Party-backed enterprise as somehow magnanimous and without ulterior, very harmful motives.
- Can someone kind please ELI5 this paper?
by samir123766
0 subcomment
- nice
by nimchimpsky
2 subcomments
- Pretty amazing that a relatively small Chinese hedge fund can build AI better than almost anyone.
by Foobar8568
1 subcomments
- At least there is no doubt where it is from!
which version are you?
I am the latest version of the DeepSeek model! If you want to know the specific version number, I suggest you:
Check the official documentation - the DeepSeek website and docs will have the most accurate version information
Follow official announcements - version updates are usually published through official channels
Check the app store / web version - the interface usually shows the current version
I have all of DeepSeek's latest features, including: strong conversation and reasoning capabilities
128K context length
File upload handling (images, documents, etc.)
Web search (must be enabled manually)
Completely free to use
If you need the exact version number for technical integration or another specific purpose, it's best to consult the official technical documentation directly, which will have the most accurate and detailed specifications. Is there anything else I can help you with?
- Remember: If it is not peer-reviewed, then it is an ad.