Cybersecurity vets protest ‘dangerous’ US government ban on Anthropic’s most powerful models
A group of dozens of cybersecurity experts, including several well-known veterans of the industry, published an open letter to the U.S. government asking it to lift the export control order on Anthropic’s Fable and Mythos models.
According to the open letter, “this action has taken the best models away from [cybersecurity] defenders” who now can’t use the models to find vulnerabilities and make their software and products more secure. “To pull the best capabilities away from defenders without a good reason when our adversaries are rapidly advancing is dangerous,” read the letter.
On Friday, the U.S. government ordered Anthropic to limit the export of Fable and Mythos, citing national security concerns, without explaining the specific reasons behind the order, according to Anthropic. In response, the company suspended access to the models for all users worldwide.
As of this writing, the letter is signed by 76 cybersecurity experts, including Alex Stamos, former Facebook chief of security; Casey Ellis, the founder of bug bounty platform Bugcrowd; Jon Callas, famed cryptographer and former Apple security design and architecture manager; Paul Vixie, computer scientist; Dino Dai Zovi, the former head of applied security engineering at Block; Katie Moussouris, the founder of Luta Security; and Rachel Tobac, the CEO of the security awareness training firm SocialProof Security.
When Mythos launched as a preview in April, Anthropic claimed it was so powerful at finding security vulnerabilities that the company needed to tightly restrict access to prevent malicious hackers or foreign adversaries from using it to cause havoc on the internet. In practice, that meant Anthropic gave around 50 companies initial access to Mythos, recently expanding that group to include around 150 organizations in 15 countries.
Last week, Anthropic released Fable, a public version of Mythos that the company said had strict guardrails to block its use in the fields of biology, chemistry, and cybersecurity, as well as to stop others from distilling the model in order to re-create it. The guardrails on Fable were so strict that many cybersecurity experts found that it stopped essentially any prompts related to cybersecurity.
Anthropic said that the White House export control order may have been based on a report that there was a method to bypass (or jailbreak) Fable to unlock its powerful Mythos-level capabilities.
According to Katie Moussouris, one of the signatories of the open letter, the method was demonstrated by Amazon researchers in a paper that is not public but that she has reviewed. But Moussouris said in a blog post that the paper did not actually demonstrate a real jailbreak. Instead, she wrote, the researchers simply asked Fable to fix open source code with public and known vulnerabilities along with “deliberately planted vulnerabilities,” after the model initially refused to “review the code for security issues.”
“The behavior described in the paper cannot meaningfully be fixed, and any attempt would only weaken the model for defense,” Moussouris wrote. “Defenders need to be able to ask AI to fix the bugs in a file, explain why the fix matters, and write tests that confirm the patch works. That is not a guardrail bypass. It is the most valuable thing an AI model can do for defensive security: executing the find, fix, and test loop defenders run every day.”
Moussouris’ critique was echoed in the open letter, which also said that the group of experts believes the model capabilities in the Amazon paper “can be replicated” on OpenAI’s GPT-5.5, on Anthropic’s own publicly available Claude Opus 4.8 and Sonnet, “and even Chinese models like Kimi 2.7.”
Moussouris told TechCrunch that “the bugs used to demonstrate the techniques in the paper can be found using the other models. The method in the paper is a guardrail bypass technique. Other models that lack the Fable guardrails often won’t refuse the straightforward request to look for security bugs, so they don’t need a bypass.”
The letter also asked for transparently and fairly enforced regulations created by “a democratic rule-making process” that are based on scientific research done by industry and academic experts, and “used only to the minimal extent necessary to ensure the safety of the American public.”