DeepSeek-V3.2 released

cm0002@lemmy.world · 4 days ago

brucethemoose@lemmy.world · edit-2 4 days ago

With sparse attention, very interesting. It seems GQA is a thing of the past.

I especially love DeepSeek’s ‘public research’ aspect: they trained this and Terminus the same way, so the attention schemes are (more or less) directly comparable. That’s awesome.

GLM 4.6 is reportedly about to drop too, which is great, as 4.5 is without a doubt my daily driver now.

DeepSeek-V3.2 - a deepseek-ai Collection