Decreased Intelligence Density in DeepSeek V4 Pro
The V3.2 paper acknowledged this problem directly: "Second, token efficiency remains a challenge; DeepSeek-V3.2 typically requires longer generation trajectories (i.e., more tokens) to match the output quality of models like Gemini 3.0-Pro. Future work will focus on optimizing the intelligence density of the model's reasoning chains to improve efficiency."

In V4 Pro, however, the situation appears to have worsened. Even the non-thinking mode uses significantly more tokens than V3.2, and V4 Pro (1.6T) is roughly 2
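To make the token-efficiency comparison concrete, here is a minimal sketch of how one might quantify it: take the per-response completion token counts each model's API reports for the same set of prompts, then compute the ratio of mean output lengths. The function name and the sample counts below are hypothetical placeholders for illustration, not measurements from the post.

```python
from statistics import mean

def token_efficiency_ratio(model_tokens: list[int], reference_tokens: list[int]) -> float:
    """Ratio of mean completion tokens across a shared prompt set.

    A value > 1 means the model needs more tokens than the reference
    to produce answers of comparable quality.
    """
    if len(model_tokens) != len(reference_tokens):
        raise ValueError("Both lists must cover the same prompt set")
    return mean(model_tokens) / mean(reference_tokens)

# Hypothetical per-prompt completion token counts (placeholders, not real data):
v4_pro_counts = [1800, 2400, 2100]
v32_counts = [900, 1300, 1000]

print(f"V4 Pro vs. V3.2 token ratio: {token_efficiency_ratio(v4_pro_counts, v32_counts):.2f}")
```

A ratio near 2 under this metric would correspond to the kind of doubling in token usage the post is describing.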