Claude Fable won’t answer basic biology questions

To protect against bioweapons, Anthropic told The Verge that Fable’s ‘overly conservative’ safeguards block ‘most queries tied to biology work.’

Anthropic just released Claude Fable 5, calling it the most powerful AI model it has ever made widely available and praising its skills in biology, among others. But the model won’t answer basic biology questions, the kind you’d expect a high schooler to handle. Instead, it hands off the query to the former flagship model, Claude Opus 4.8.

It isn’t because Fable doesn’t know the answers. It’s because Anthropic won’t let it, by design.

Fable is a public-facing, Mythos-class model, part of a family so capable at cybersecurity tasks that Anthropic said it was too dangerous to release publicly. But while Anthropic has spent much of the extended Mythos rollout warning about cybersecurity, it is in biology that Fable’s guardrails are the most obvious and the most limiting.

When I tried the model, it refused to answer a range of basic biology questions, many of which felt about as far from any plausible safety risk as a question could be. It would not respond to “tell me about cell membranes” or answer “what are mitochondria,” that famous powerhouse of the cell. It refused to explain “what is a prion,” the proteinaceous particle behind mad cow disease, or “how mRNA vaccines work.”

The restrictions applied to ordinary and objectively rather harmless medical queries too. Fable would not answer “what causes hay fever,” explain how asthma medicine works, explain how antibiotic resistance arises, or tell me what Ebola is and how it spreads. Some of my basic queries did get through, with Fable answering questions like “what is cancer” and “what is DNA.” When Fable refused, Opus 4.8 generally answered perfectly well.

Anthropic says the broad biology filters are an intentional choice and are deliberately conservative, with bioweapons the primary concern. “With the launch of Claude Fable 5, our first Mythos-class model, we believe models now have a greater ability to accomplish real-world scientific tasks and for malicious actors to potentially use our models for highly risky biological research,” spokesperson Paruul Maheshwary told The Verge. “We have always used classifiers to block our models from helping with bioweapons-related requests. To deploy Fable 5 safely, we believe it was necessary to be overly conservative with our safeguards so they block most queries tied to biology work.”

Anthropic has previously highlighted four key areas where it would throttle Fable’s responses for safety: chemistry, biology, cybersecurity, and distillation, a technique for training smaller AIs using the outputs of larger ones. The company has accused Chinese rivals like DeepSeek of using distillation on its models on an “industrial” scale.

While I could not meaningfully test distillation, Fable seemed more willing to answer questions about chemistry and cybersecurity. For example, it gave a basic overview of the explosive TNT, though it withheld synthesis instructions “for obvious reasons.” It readily answered questions on the use of chlorine gas as a chemical weapon, common password threats, and nuclear fusion and fission, and it explained how to secure an iPhone from hackers. It still has limits: Fable deferred to Opus when I asked it about sarin gas, a highly toxic nerve agent. Fable and Opus both refused the prompt “how to make anthrax,” and Claude paused the chat entirely. That made sense. The refusal of the mitochondria prompt, by contrast, seems like a false positive.

“We made this tradeoff so customers could benefit from the model’s capabilities sooner without the risks,” Maheshwary explained, adding that Anthropic is working hard to improve its detection and reduce the false positives. “We intend to make Mythos-class models available without these safeguards to the broader biology and life sciences community so these capabilities can be used to accelerate biomedical research and drug discovery.”

Anthropic did not answer questions about whether this kind of restricted release will become the new norm for future models.