Traj-Evolve: A Self-Evolving Multi-Agent System for Patient Trajectory Modeling in Lung Cancer Early Detection

Abstract: Modeling patient trajectories from longitudinal electronic health records (EHRs) requires reasoning over sparse, noisy, and long-context multimodal sequences. Existing LLM-based multi-agent systems address context length but process patients in isolation, failing to mirror how clinicians leverage accumulated experience from similar prior cases.

We present Traj-Evolve, a self-evolving multi-agent system with two complementary evolving mechanisms. First, an Experience Pool (ExPool) acts as a non-parametric memory, indexing rejection-sampled reasoning traces to retrieve similar patients as few-shot contexts. Second, multi-agent reinforcement learning (MARL) via reward-ranked fine-tuning parametrically optimizes inter-agent and agent-memory collaboration. A leave-one-out cross-retrieval strategy unifies the two, aligning training- and inference-time behavior under retrieval augmentation.
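
The retrieval side of this design can be illustrated with a minimal sketch. It is not the authors' implementation: the `Experience` and `ExperiencePool` classes, the fixed-length patient embedding, and the cosine-similarity ranking are all assumptions made for illustration. The one idea taken from the abstract is the leave-one-out rule: when a training patient queries the pool, that patient's own stored trace is excluded, so training sees the same kind of retrieved context as inference on an unseen patient.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Experience:
    patient_id: str
    embedding: List[float]  # hypothetical patient representation (assumption)
    trace: str              # reasoning trace kept after rejection sampling


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ExperiencePool:
    """Toy non-parametric memory: stores traces and retrieves nearest patients."""

    def __init__(self) -> None:
        self._items: List[Experience] = []

    def add(self, exp: Experience) -> None:
        self._items.append(exp)

    def retrieve(
        self,
        query_embedding: List[float],
        k: int = 2,
        exclude_id: Optional[str] = None,
    ) -> List[Experience]:
        # Leave-one-out: drop the querying patient's own entry, if present.
        candidates = [e for e in self._items if e.patient_id != exclude_id]
        candidates.sort(
            key=lambda e: cosine(query_embedding, e.embedding), reverse=True
        )
        return candidates[:k]


pool = ExperiencePool()
pool.add(Experience("p1", [1.0, 0.0], "trace for p1"))
pool.add(Experience("p2", [0.9, 0.1], "trace for p2"))
pool.add(Experience("p3", [0.0, 1.0], "trace for p3"))

# Training-time query for p1: its own trace must not be returned.
train_hits = pool.retrieve([1.0, 0.0], k=1, exclude_id="p1")
# Inference-time query for an unseen patient: nothing to exclude.
test_hits = pool.retrieve([1.0, 0.0], k=1)
```

Without the `exclude_id` filter, a training patient would always retrieve its own verified trace as the top match, a shortcut that is unavailable at inference time.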

On a lung cancer prediction task using up to five years of multimodal EHRs, Traj-Evolve outperforms nine strong baselines on both the overall population and a challenging never-smoker subpopulation.

Analysis of the evolving dynamics highlights three key findings: (1) as the ExPool grows, the optimal retrieval strategy shifts from diverse to specific samples; (2) under MARL, the manager agent’s prediction loss converges quickly, while the worker agents’ temporal reasoning continues to benefit from more verified patients; and (3) the two mechanisms have complementary effects on the predicted risk: ExPool improves specificity, while MARL improves sensitivity.
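
For readers less familiar with the two metrics in finding (3), the sketch below computes sensitivity (the fraction of true cancer cases flagged) and specificity (the fraction of non-cases correctly cleared) from binary labels. The function name and the toy labels are illustrative assumptions, not data or code from the paper.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against an empty positive or negative class.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy labels for illustration only (not results from the paper).
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In these terms, a mechanism that mainly removes false positives raises specificity, while one that mainly recovers missed cases raises sensitivity, which is the sense in which the two mechanisms can complement each other.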
