‘Dangerous’ AI Models Are Coming No Matter What

Late last week, Anthropic took its new Claude Fable 5 and Mythos 5 AI models offline following a United States government export-control directive barring “any foreign national” from using the services. The company has been in talks with the White House since Friday but has yet to secure an agreement that would allow it to reinstate the offerings.

Since Mythos debuted in April, Anthropic has claimed, and warned, that the model has advanced capabilities not only for finding software vulnerabilities so defenders can patch them, but also for working out ways to exploit them that bad actors could use. Anthropic itself noted this double-edged sword when it launched Mythos 5 and Claude Fable 5. “A great deal of advanced usage of AI models is dual use: the same queries that are beneficial in the hands of cybersecurity professionals and biology researchers could be dangerous if available to malicious actors,” the company wrote in a blog post last week.

With this in mind, the company initially released a version called Mythos Preview to a select consortium as part of a working group known as Project Glasswing. Mythos 5 was also privately released to this group last week, while Claude Fable 5, which is a Mythos-grade model, was released to the general public with specific blocks on its ability to give responses to questions about biology and cybersecurity.

Then, at the end of last week, the Trump administration moved to restrict both models because it believes Fable 5’s guardrails can be disabled to allow full access to Mythos 5’s capabilities, allegedly making Fable 5 a national security risk.

Experts say, though, that this institutional clash is simply delaying or masking a hard truth: Anthropic may be the tip of the spear at this moment, but AI is advancing across the board, and models from multiple companies and open-weight developers will almost certainly have capabilities similar to those of Mythos 5 in the near future, if they don’t already.

“It’s myopic in the extreme to think that no other competitors to Anthropic will develop similar capabilities to Mythos or even that they have not already done so,” says Tarah Wheeler, chief security officer of the specialized cybersecurity consulting firm TPO Group. “There are other companies hot on Anthropic’s heels who probably have the capabilities, too, and are holding them in reserve as they see how Anthropic is being treated in the current regulatory environment.”

Anthropic itself has emphasized this point since the launch of Mythos Preview. “The real message is that this is not about the model or Anthropic,” Logan Graham, the company’s frontier red team lead, told WIRED when Mythos Preview launched in April. “We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months.”

OpenAI, for example, also privately released a cybersecurity-focused model in mid-April and announced an expanded cybersecurity strategy.

Researchers note that even before this next generation of models, existing AI offerings could be used for advanced vulnerability-hunting and exploit development with a refined harness. A large group of cybersecurity leaders emphasized this to the administration in an open letter on Sunday, arguing that the White House’s export-control directive was misguided.

“It’s not one model; it’s the general trend of technology,” says Bruce Schneier, a researcher at Harvard University and the University of Toronto who has been analyzing the situation. “Smaller, cheaper, open-source models, sometimes by themselves and sometimes in concert with each other, can match Mythos/Fable’s performance with more sophisticated prompting. And we should expect other models to match Mythos/Fable’s creativity and tenaciousness within months—slightly longer for open-source models.”

What the White House and governments around the world need to focus on, experts say, is democratically developing much broader and more transparent plans for contending with advances in AI capabilities in cybersecurity and other sensitive areas as they inevitably occur.

“The policy question is not whether a technology has risk,” says Chris Wysopal, cofounder of the cloud security firm Veracode. “The question is whether a specific restriction meaningfully reduces that risk or whether it mainly slows down the people trying to make systems safer.”
