"Dangerous" AI models are coming no matter what

Late last week, Anthropic took its new Claude Fable 5 and Mythos 5 AI models offline following a United States government export-control directive barring “any foreign national” from using the services. The company has been in talks with the White House since Friday but has yet to secure an agreement that would allow it to reinstate the offerings.

Since Mythos debuted in April, Anthropic has claimed—and warned—that the model has advanced capabilities for not only finding software vulnerabilities to help defenders patch them, but also figuring out ways to exploit them that could be used by bad actors. Anthropic itself noted this double-edged sword in its launch of Mythos 5 and Claude Fable 5. “A great deal of advanced usage of AI models is dual use: the same queries that are beneficial in the hands of cybersecurity professionals and biology researchers could be dangerous if available to malicious actors,” the company wrote in a blog post last week.

With this in mind, the company initially released a version called Mythos Preview to a select consortium as part of a working group known as Project Glasswing. Mythos 5 was also privately released to this group last week, while Claude Fable 5, which is a Mythos-grade model, was released to the general public with specific blocks on its ability to answer questions about biology and cybersecurity. Then, at the end of last week, the Trump administration moved to restrict both models because it believes that Fable 5’s guardrails can be disabled to allow full access to the Mythos 5 capabilities, which it alleges makes the model a national security risk.

Experts say, though, that this institutional clash is simply delaying or masking a hard truth: Anthropic may be the tip of the spear at this moment, but models from multiple companies and open-weight developers will almost certainly have capabilities similar to Mythos 5 in the near future, if they don’t already.

“It’s myopic in the extreme to think that no other competitors to Anthropic will develop similar capabilities to Mythos or even that they have not already done so,” says Tarah Wheeler, chief security officer of the specialized cybersecurity consulting firm TPO Group. “There are other companies hot on Anthropic’s heels who probably have the capabilities, too, and are holding them in reserve as they see how Anthropic is being treated in the current regulatory environment.”

Anthropic itself has emphasized this point since the launch of Mythos Preview. “The real message is that this is not about the model or Anthropic,” Logan Graham, the company’s frontier red team lead, told WIRED when Mythos Preview launched in April. “We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months.”

OpenAI, for example, also privately released a cybersecurity-focused model in mid-April and announced an expanded cybersecurity strategy. Researchers note that even before this next generation of models, existing AI offerings could be used for advanced vulnerability hunting and exploit development when paired with a refined harness.

A large group of cybersecurity leaders emphasized this to the administration in an open letter on Sunday, arguing that the White House’s export-control directive was misguided. “It’s not one model; it’s the general trend of technology,” says Bruce Schneier, a researcher at Harvard University and the University of Toronto who has been analyzing the situation. “Smaller, cheaper, open-source models, sometimes by themselves and sometimes in concert with each other, can match Mythos/Fable’s performance with more sophisticated prompting. And we should expect other models to match Mythos/Fable’s creativity and tenaciousness within months—slightly longer for open-source models.”

What the White House and governments around the world need to focus on, experts say, is democratically developing much broader and more transparent plans for how they will contend with advances in AI capabilities on cybersecurity and in other sensitive areas as they inevitably occur. “The policy question is not whether a technology has risk,” says Chris Wysopal, cofounder of the cloud security firm Veracode. “The question is whether a specific restriction meaningfully reduces that risk or whether it mainly slows down the people trying to make systems safer.”
