Three reasons why DeepSeek’s new model matters
EXECUTIVE SUMMARY
On April 24, Chinese AI firm DeepSeek released a preview of V4, its long-awaited new flagship model. The model can process much longer prompts than its last generation, thanks to a new design that helps it handle large amounts of text more efficiently. Like DeepSeek’s previous models, V4 is open source, meaning it is available for anyone to download, use, and modify.
V4 marks DeepSeek’s most significant release since R1, the reasoning model it launched in January 2025. R1, which was trained on limited computing resources, stunned the global AI industry with its strong performance and efficiency, turning DeepSeek from a little-known research team into China’s best-known AI company almost overnight. It also helped set off a wave of open-weight model releases from other Chinese AI firms. DeepSeek has kept a relatively low profile since then—but earlier this month, it effectively teased V4’s release when it added “expert” and “flash” modes to the online version of its model, prompting speculation that the updates were tied to a bigger upcoming release.
While the company has become a powerful symbol of China’s AI ambitions, its return to frontier models follows a turbulent few months marked by major personnel departures, delays to previous model launches, and growing scrutiny from both the US and Chinese governments. So, will V4 shake the AI field the way R1 did? Almost certainly not, but here are three big reasons why this release matters.
1. It breaks new ground for an open-source model
As with R1 before it, DeepSeek claims that V4’s performance rivals the best models available at a fraction of the price. This is great news for developers and for companies using the tech, because it means they can access frontier AI capabilities on their own terms, and without worrying about skyrocketing costs. The new model comes in two versions, both of which are available on DeepSeek’s website and in its app, with API access also open to developers. V4-Pro is a larger model built for coding and complex agent tasks, and V4-Flash is a smaller version designed to be faster and cheaper to run.
Both versions offer reasoning modes, in which the model can carefully parse a user’s prompt and show each step as it works through the problem. For V4-Pro, DeepSeek charges $1.74 per million input tokens and $3.48 per million output tokens, a fraction of the cost of comparable models from OpenAI and Anthropic. V4-Flash is even cheaper, at about $0.14 per million input tokens and about $0.28 per million output tokens, making it one of the cheapest top-tier models available. This would make it a very appealing model to build applications on.
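To make the per-token prices concrete, here is a minimal sketch of a cost estimate for a single request. The prices are the ones quoted above (the V4-Flash figures are approximate); the dictionary keys are plain labels rather than official API model identifiers, and the function name and the example workload are made up for illustration.

```python
# Prices in USD per million tokens, as quoted in the article.
# The V4-Flash figures are approximate ("about $0.14" / "about $0.28").
# The keys are labels for this sketch, not official API model names.
PRICES_PER_MILLION = {
    "V4-Pro": {"input": 1.74, "output": 3.48},
    "V4-Flash": {"input": 0.14, "output": 0.28},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 5,000 output tokens.
    for model in PRICES_PER_MILLION:
        print(f"{model}: ${request_cost(model, 200_000, 5_000):.4f}")
```

Under these prices, the hypothetical request costs about $0.37 on V4-Pro and about $0.03 on V4-Flash.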
In terms of performance, V4 is, perhaps unsurprisingly, a huge jump from R1—and it seems to be a strong alternative to just about all the latest big AI models. On the major benchmarks, according to results shared by the company, DeepSeek V4-Pro competes with leading closed-source models, matching the performance of Anthropic’s Claude-Opus-4.6, OpenAI’s GPT-5.4, and Google’s Gemini-3.1. And compared to other open-source models, such as Alibaba’s Qwen-3.5 or Z.ai’s GLM-5.1, DeepSeek V4 exceeds them all on coding, math, and STEM problems, making it one of the strongest open-source models ever released.
DeepSeek also says that V4-Pro now ranks among the strongest open-source models on benchmarks for agentic coding tasks and performs well on other tests that measure the ability to solve multistep problems. Its writing ability and world knowledge also lead the field, according to benchmarking results shared by the company. In a technical report released alongside the model, DeepSeek shared results from an internal survey of 85 experienced developers: more than 90% included V4-Pro among their top model choices for coding tasks. DeepSeek says it has specifically optimized V4 for popular agent frameworks such as Claude Code, OpenClaw, and CodeBuddy.
2. It delivers on a new approach to memory efficiency
One of the key innovations of V4 is its long context window—the amount of text the model can process at once. Both versions can handle 1 million tokens, which is large enough to fit all three volumes of The Lord of the Rings and The Hobbit combined. The company says this context window size is now the default across all DeepSeek services and it matches what is offered by cutting-edge versions of models like Gemini and Claude.
But it’s important to know not just that DeepSeek has made this leap, but how it did so. V4 makes significant architectural changes relative to the company’s earlier models, especially in the attention mechanism, the component of AI models that helps them understand each part of a prompt in relation to the rest. As the prompt text gets longer, these comparisons become much more costly, making attention one of the main bottlenecks for long-context models.
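To see why long prompts get expensive, it helps to count comparisons. The sketch below assumes plain causal self-attention, in which each token is compared with itself and every token before it. This is a textbook simplification rather than a description of any DeepSeek model, and the function name is made up for illustration.

```python
def causal_attention_pairs(n_tokens: int) -> int:
    """Count token-to-token comparisons when every token attends to
    itself and to all earlier tokens (full causal attention)."""
    return n_tokens * (n_tokens + 1) // 2


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(f"{n:>9,} tokens -> {causal_attention_pairs(n):,} comparisons")
```

Going from 1,000 to 1,000,000 tokens makes the prompt 1,000 times longer but multiplies the comparison count by roughly a million, which is why full attention becomes a bottleneck at long context lengths.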
DeepSeek’s innovation was to make the model more selective about what it pays attention to. Instead of treating all earlier text as equally important, V4 compresses older information and focuses on the parts most likely to matter in the present moment, while still keeping nearby text in full so it does not miss important details. DeepSeek says this sharply reduces the cost of using long context. In a 1-million-token context, V4-Pro uses only 27% of the computing power required by its previous model, V3.2, while cutting memory use to 10%. The reduction in V4-Flash is even larger, using just 10% of the computing power and 7% of the memory.
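The general idea of compressing older text, picking the most relevant parts, and keeping nearby text in full can be illustrated with a toy example. The sketch below is not DeepSeek's actual mechanism: mean-pooling each block into one summary vector, ranking blocks by dot product with the query, and every name and parameter (`select_context`, `window`, `block_size`, `top_k`) are assumptions chosen only to show the shape of the technique.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Compress a block of vectors into a single average vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_context(
    query: Vector,
    keys: Sequence[Vector],
    window: int = 4,
    block_size: int = 4,
    top_k: int = 2,
) -> Tuple[List[int], List[int]]:
    """Decide what a single query looks at (toy illustration).

    Returns (selected_block_ids, recent_token_ids):
    - older tokens are grouped into blocks, each block is compressed to
      its mean vector, and only the top_k best-scoring blocks are kept;
    - the last `window` tokens are kept individually, uncompressed.
    """
    split = max(0, len(keys) - window)
    recent_ids = list(range(split, len(keys)))

    # Compress older tokens: one summary vector per block.
    blocks = [keys[i:i + block_size] for i in range(0, split, block_size)]
    summaries = [mean_vector(block) for block in blocks]

    # Score every summary against the query and keep the top_k blocks.
    ranked = sorted(
        range(len(summaries)),
        key=lambda i: dot(query, summaries[i]),
        reverse=True,
    )
    return sorted(ranked[:top_k]), recent_ids


if __name__ == "__main__":
    # Twelve toy tokens: two older blocks of four, plus four recent tokens.
    keys = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4 + [[1.0, 1.0]] * 4
    blocks, recent = select_context([0.0, 1.0], keys, top_k=1)
    print("selected older blocks:", blocks)
    print("recent tokens kept in full:", recent)
```

In this toy run the query is scored against two block summaries and four recent tokens, six comparisons instead of twelve. The saving grows with context length because the number of summaries grows far more slowly than the number of tokens.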
In practice, this could make it cheaper to build tools that need to work across huge amounts of material, such as an AI coding assistant that can read an entire codebase, or a research agent that can analyze a long archive of documents without constantly forgetting what came before. DeepSeek’s interest in long context windows didn’t start with V4. Over the past year and a half, the company has quietly published a series of papers on how AI models “remember” in…