Diagnosing and Mitigating Compounding Failures in Agentic Persuasion via Taxonomic Strategy Retrieval
Abstract: Foundation-model agents operating in multi-step, open-ended environments frequently suffer from compounding errors, in which early mistakes contaminate long-horizon trajectories.
While Multi-Agent Debate (MAD) succeeds in deterministic domains, agents in subjective tasks such as persuasion experience severe problem drift and sycophantic conformity.
We identify semantic leakage in standard Retrieval-Augmented Generation (RAG) as a reproducible trigger for these failures, because standard RAG prioritizes vocabulary overlap over logical necessity.
To eliminate this leakage, we introduce Taxonomic Strategy RAG (TS-RAG), a systems intervention that routes strategies through a discrete categorical bottleneck to decouple argumentative structure from topical content.
Zero-shot, cross-domain evaluations demonstrate that TS-RAG significantly improves the transfer of abstract logic in settings where standard semantic retrieval collapses.
Crucially, TS-RAG acts as a “capability bridge” in asymmetric deployments, empowering lightweight persuaders to consistently defeat parametrically superior opponents (improving win rates from 70.5% to 78.5%) and improving argumentative efficiency.
Finally, we introduce trace-level diagnostics via a turn-by-turn Debate State Representation (DSR), demonstrating that strict constraints are necessary to prevent evaluation collapse caused by default agentic sycophancy.
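
The contrast between overlap-driven retrieval and retrieval through a categorical bottleneck can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Strategy` record, the taxonomy labels, the toy strategy bank, the Jaccard token overlap used as a stand-in for standard semantic retrieval, and the rule-based `classify_need` stub that plays the role of the categorical bottleneck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    category: str  # hypothetical taxonomy label, not from the paper
    topic: str     # domain the strategy was originally written for
    template: str  # topic-free description of the argumentative move


# Toy strategy bank (illustrative entries only).
BANK = [
    Strategy("appeal_to_evidence", "nuclear power",
             "cite a measured outcome that contradicts the claim"),
    Strategy("concession_then_pivot", "school uniforms",
             "grant the minor point then show the conclusion does not follow"),
    Strategy("reframing", "nuclear power",
             "recast the risk as a comparison against the alternative"),
]


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def lexical_retrieve(query, bank):
    """Stand-in for standard retrieval: pick the entry with the highest
    Jaccard token overlap with the query, so topical vocabulary dominates."""
    q = tokens(query)

    def score(s):
        d = tokens(s.topic + " " + s.template)
        return len(q & d) / len(q | d)

    return max(bank, key=score)


def classify_need(debate_state):
    """Hypothetical bottleneck: map the debate state to one discrete
    category. A real system would presumably use a model here."""
    if debate_state["opponent_made_valid_point"]:
        return "concession_then_pivot"
    return "appeal_to_evidence"


def taxonomic_retrieve(debate_state, bank):
    """Retrieve by category only; topic vocabulary never enters the match."""
    category = classify_need(debate_state)
    return [s for s in bank if s.category == category]


query = "opponent says nuclear power reactors are unsafe"
by_overlap = lexical_retrieve(query, BANK)
by_taxonomy = taxonomic_retrieve({"opponent_made_valid_point": True}, BANK)
```

In this toy setup the overlap-based retriever returns a strategy that shares the query's topic words, whether or not its logic fits the debate state, while the category-based retriever returns the concession strategy even though it was written for a different topic.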