Multi-Persona Debate System for Automated Scientific Hypothesis Generation
Abstract: Modern scientific discovery is bottlenecked not by data scarcity, but by the inability to synthesize fragmented knowledge into actionable hypotheses. This challenge is especially acute in battery materials research, where electrochemical performance, interfacial behavior, and manufacturing feasibility must be optimized simultaneously.
Here, we present the Multi-Persona Debate System (MPDS), a literature-grounded framework for automated scientific hypothesis generation that combines literature retrieval, long-context large language model reasoning, corpus-driven persona induction, and structured multi-agent debate.
MPDS constructs literature snapshots of up to 500 papers, grounds agents in role-specific evidence pools, and conducts a three-round citation-aware debate followed by moderator synthesis, enabling negotiation between personas while preserving evidence traceability.
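The debate stage described above can be pictured with a minimal sketch. It assumes a three-round loop in which every turn may cite only papers from the speaker's own evidence pool, followed by a moderator step that keeps the union of cited papers. All names here (`Persona`, `Turn`, `speak`, `run_debate`, `synthesize`) are hypothetical, and `speak` is a deterministic placeholder for the language-model call; this is an illustration, not the MPDS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    name: str
    evidence: dict[str, str]  # paper id -> snippet (role-specific pool, assumed shape)


@dataclass
class Turn:
    round_no: int
    persona: str
    claim: str
    citations: list[str] = field(default_factory=list)


def speak(persona, round_no, transcript):
    """Placeholder for an LLM call: cites one paper from the persona's pool."""
    if not persona.evidence:
        raise ValueError(f"{persona.name} has an empty evidence pool")
    ids = sorted(persona.evidence)
    cited = ids[(round_no - 1) % len(ids)]
    claim = f"{persona.name} argues from {cited}: {persona.evidence[cited]}"
    return Turn(round_no, persona.name, claim, [cited])


def run_debate(personas, rounds=3):
    """Run a fixed number of rounds; each persona speaks once per round."""
    transcript = []
    for round_no in range(1, rounds + 1):
        for persona in personas:
            turn = speak(persona, round_no, transcript)
            # Citation-aware check: reject citations outside the persona's pool.
            unknown = [c for c in turn.citations if c not in persona.evidence]
            if unknown:
                raise ValueError(f"{persona.name} cited unknown papers: {unknown}")
            transcript.append(turn)
    return transcript


def synthesize(transcript):
    """Moderator step: collect claims and keep the union of cited paper ids."""
    cited = sorted({c for t in transcript for c in t.citations})
    return {"claims": [t.claim for t in transcript], "citations": cited}
```

Because each turn carries its own citation list, the moderator output can be traced back to specific papers, which is the traceability property the abstract emphasizes.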
We evaluate MPDS under a temporally controlled protocol that excludes direct access to the target papers. The evaluation comprises two held-out battery-materials case studies and a blinded comparison across 30 matched cases. In sodium-ion anode and all-solid-state battery cathode design tasks, MPDS recovered design logics aligned with experimentally validated solution spaces and generated more mechanistically explicit, process-aware proposals than simpler baselines.
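One way to read the temporal control is as a filter applied when the literature snapshot is built. The sketch below assumes that each paper record carries an `id` and a `published` date, that the cutoff is the target paper's publication date, and that the snapshot keeps the most recent eligible papers up to the 500-paper limit. The function name `build_snapshot`, the record fields, and the recency ordering are hypothetical; the abstract does not specify how papers are ranked or selected.

```python
from datetime import date


def build_snapshot(papers, target_id, cutoff, max_papers=500):
    """Return papers published strictly before `cutoff`, excluding the target.

    `papers` is a list of dicts with "id" and "published" (datetime.date) keys.
    Ordering by recency is an assumption made only for this illustration.
    """
    eligible = [
        p for p in papers
        if p["id"] != target_id and p["published"] < cutoff
    ]
    eligible.sort(key=lambda p: p["published"], reverse=True)
    return eligible[:max_papers]


corpus = [
    {"id": "A", "published": date(2021, 5, 1)},
    {"id": "B", "published": date(2022, 3, 15)},
    {"id": "TARGET", "published": date(2023, 1, 10)},
    {"id": "C", "published": date(2023, 6, 1)},
]
snapshot = build_snapshot(corpus, "TARGET", cutoff=date(2023, 1, 10))
print([p["id"] for p in snapshot])  # ['B', 'A']
```

Excluding both the target itself and anything published on or after the cutoff keeps the held-out solution from leaking into the evidence pools.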
To assess the impact of personas and debate, we introduce Integrative Hypothesis Quality scoring. In ablation studies, MPDS achieved the highest mean score among five conditions, with its largest advantage in cross-perspective integration. A laboratory follow-up suggests that MPDS is also useful as a diagnostic aid for identifying practical bottlenecks in workflows.
These results indicate that structured debate over literature snapshots improves hypothesis formation under coupled engineering constraints and provides a reusable workflow for text-intensive scientific discovery.