TRUSTMEM: Learning Trustworthy Memory Consolidation for LLM Agents with Long-Term Memory

Large language model (LLM) agents rely on long-term memory to support extended interactions and personalized assistance beyond finite context windows. Existing memory agents actively update external memory through generated write, revise, and delete operations, but these updates may omit important information, corrupt existing memory, or introduce unsupported hallucinated content. Once stored, such errors become persistent system-state failures that can affect future reasoning and generation.

In this paper, we propose TrustMem, a framework for improving the trustworthiness of memory consolidation. TrustMem uses a Memory Transition Verifier to evaluate each memory update as a transition between memory states, judging it on three criteria: coverage, preservation, and faithfulness. It then constructs preference pairs among candidate updates generated from the same memory state, which enables preference-guided reinforcement learning to optimize memory-updating behavior directly.
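The pair-construction step can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the names `ScoredUpdate` and `build_preference_pairs`, the scores in [0, 1], the unweighted mean used to aggregate the three criteria, the `margin` threshold, and the one-line glosses of each criterion in the comments.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ScoredUpdate:
    """One candidate memory update with hypothetical verifier scores in [0, 1]."""
    update: str
    coverage: float      # assumed gloss: new information was captured
    preservation: float  # assumed gloss: valid existing memory was kept intact
    faithfulness: float  # assumed gloss: written content is supported by the input

    def total(self) -> float:
        # Assumption: unweighted mean; the actual aggregation may differ.
        return (self.coverage + self.preservation + self.faithfulness) / 3.0


def build_preference_pairs(candidates, margin=0.1):
    """Return (chosen, rejected) pairs for candidates that share one memory state."""
    pairs = []
    for a, b in combinations(candidates, 2):
        gap = a.total() - b.total()
        if abs(gap) < margin:
            continue  # scores too close; skip the ambiguous pair
        pairs.append((a, b) if gap > 0 else (b, a))
    return pairs
```

Restricting comparisons to candidates from the same memory state means each pair differs only in the update itself, so the preference signal reflects update quality rather than differences in the starting memory. The resulting pairs could then feed any preference-based training objective.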

Extensive experiments demonstrate that TrustMem improves both memory utility and reliability: it achieves state-of-the-art results across MemoryAgentBench, HaluMem, and the Mem-alpha validation set, improves HaluMem memory extraction by 12.14 F1 points, and reduces transition-level omission, corruption, and hallucination by 40.1%, 79.1%, and 50.0%, respectively, compared with the strongest baseline for each error type.
