When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure
Abstract: Despite strong accuracy on medical benchmarks, large language models (LLMs) can exhibit severe multi-turn sycophancy in clinical dialogue, abandoning an initially correct diagnosis under escalating pressure.
We propose Med-Stress, a targeted stress-test framework that evaluates belief stability under escalating pressure. Across nine frontier LLMs, we find a clear dissociation between medical knowledge and robustness: high initial diagnostic capability does not imply high belief stability, and several models show large knowledge-robustness gaps.
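The abstract does not spell out how belief stability or the knowledge-robustness gap is computed, so the following is only an illustrative sketch under assumed definitions, not the paper's metrics. It assumes each dialogue records a gold diagnosis and the model's answer at every turn (the initial answer first, then one answer per pressure turn). The names `Dialogue`, `initial_accuracy`, `belief_stability`, and `knowledge_robustness_gap` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dialogue:
    """One multi-turn case (hypothetical record layout)."""
    gold: str            # reference diagnosis
    answers: List[str]   # answers[0] is the initial answer; the rest follow pressure turns


def initial_accuracy(dialogues: List[Dialogue]) -> float:
    """Fraction of cases answered correctly before any pressure is applied."""
    if not dialogues:
        return 0.0
    return sum(d.answers[0] == d.gold for d in dialogues) / len(dialogues)


def robust_accuracy(dialogues: List[Dialogue]) -> float:
    """Fraction of cases where the model is correct at every turn."""
    if not dialogues:
        return 0.0
    return sum(all(a == d.gold for a in d.answers) for d in dialogues) / len(dialogues)


def belief_stability(dialogues: List[Dialogue]) -> float:
    """Among initially correct cases, the fraction that never abandon the correct answer."""
    correct = [d for d in dialogues if d.answers[0] == d.gold]
    if not correct:
        return 0.0
    return sum(all(a == d.gold for a in d.answers) for d in correct) / len(correct)


def knowledge_robustness_gap(dialogues: List[Dialogue]) -> float:
    """Assumed definition: initial accuracy minus accuracy that survives all pressure turns."""
    return initial_accuracy(dialogues) - robust_accuracy(dialogues)


# Toy data, invented purely for illustration.
toy = [
    Dialogue(gold="A", answers=["A", "A", "A"]),  # correct and stable
    Dialogue(gold="B", answers=["B", "B", "C"]),  # correct, collapses at the last turn
    Dialogue(gold="C", answers=["C", "D", "D"]),  # correct, collapses at the first pressure turn
    Dialogue(gold="D", answers=["A", "A", "A"]),  # wrong from the start
]

print(initial_accuracy(toy), belief_stability(toy), knowledge_robustness_gap(toy))
```

Under these assumed definitions, a model can score well on initial accuracy while still showing low belief stability, which is the kind of dissociation the abstract describes.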
To mitigate this failure mode, we propose two defenses: RBED (Role-Based Epistemic Defense), a lightweight inference-time defense, and R-FT (Resilience-oriented Fine-Tuning), a training-time approach that internalizes evidence-based resistance to pressure. Experiments show that R-FT nearly eliminates belief change and substantially improves robustness.
Paper Details:
- Authors: Boyu Xiao, Xiuqi Tian, Xuwen Song, Haochun Wang, Guanchun Song, Sendong Zhao, Bing Qin
- arXiv ID: 2605.23932
- Submission Date: 23 Apr 2026