CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment

Abstract: Large language models (LLMs) have become a central foundation of modern artificial intelligence, yet their lifecycle remains constrained by a rigid separation between training and deployment: once a model is deployed, learning effectively ceases. This limitation contrasts with natural intelligence, which continually adapts through interaction with its environment.

In this paper, we formalise deployment-time learning (DTL) as the third stage in the LLM lifecycle that enables LLM agents to improve from experience during deployment without modifying model parameters. We present CASCADE (CASe-based Continual Adaptation during DEployment), a general and principled framework that equips LLM agents with an explicit, evolving episodic memory.

CASCADE formulates experience reuse as a contextual bandit problem, enabling principled exploration-exploitation trade-offs and establishing no-regret guarantees over long-term interactions. This design allows agents to accumulate, select, and refine task-relevant cases, transforming past experience into actionable knowledge.
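
To make the contextual bandit framing concrete, the following is a minimal sketch, not the paper's algorithm: the abstract does not say which bandit method CASCADE uses, so this swaps in standard disjoint LinUCB as an illustration. Each stored case is treated as an arm, the task is summarised by a feature vector `x`, and the agent picks the case with the highest upper confidence bound on reward. The names `LinUCBCaseSelector` and `simulate`, the two-dimensional one-hot contexts, and the deterministic reward rule are all hypothetical.

```python
import math


class LinUCBCaseSelector:
    """Disjoint LinUCB over a fixed set of stored cases (one arm per case).

    Illustrative assumption: CASCADE's actual selection rule is not given
    in the abstract; LinUCB is a standard contextual bandit stand-in.
    """

    def __init__(self, n_cases, dim, alpha=1.0):
        self.alpha = alpha  # exploration strength
        self.dim = dim
        # Per-case inverse design matrix A^{-1} (identity at start) and vector b.
        self.a_inv = [
            [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
            for _ in range(n_cases)
        ]
        self.b = [[0.0] * dim for _ in range(n_cases)]

    def _matvec(self, m, v):
        return [sum(row[j] * v[j] for j in range(self.dim)) for row in m]

    def score(self, case, x):
        """Estimated reward plus exploration bonus for using `case` on context x."""
        a_inv_x = self._matvec(self.a_inv[case], x)
        theta = self._matvec(self.a_inv[case], self.b[case])
        mean = sum(t * xi for t, xi in zip(theta, x))
        variance = sum(a * xi for a, xi in zip(a_inv_x, x))
        return mean + self.alpha * math.sqrt(max(variance, 0.0))

    def select(self, x):
        """Return the index of the case with the highest score (ties: lowest index)."""
        scores = [self.score(c, x) for c in range(len(self.b))]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, case, x, reward):
        """Record the observed reward; Sherman-Morrison update of A^{-1}."""
        a_inv = self.a_inv[case]
        u = self._matvec(a_inv, x)  # A^{-1} x (A^{-1} is symmetric)
        denom = 1.0 + sum(ui * xi for ui, xi in zip(u, x))
        for i in range(self.dim):
            for j in range(self.dim):
                a_inv[i][j] -= u[i] * u[j] / denom
        for i in range(self.dim):
            self.b[case][i] += reward * x[i]


def simulate(rounds=200):
    """Toy deployment loop: two task types, two cases, case k only helps type k."""
    selector = LinUCBCaseSelector(n_cases=2, dim=2, alpha=1.0)
    total = 0.0
    for t in range(rounds):
        task_type = t % 2  # alternate between the two hypothetical task types
        x = [1.0, 0.0] if task_type == 0 else [0.0, 1.0]
        case = selector.select(x)
        reward = 1.0 if case == task_type else 0.0  # hypothetical success signal
        selector.update(case, x, reward)
        total += reward
    return selector, total


selector, total_reward = simulate()
```

In this toy setting the selector tries the wrong case once for the second task type, observes the failure, and thereafter matches each task type to the case that helps it. The model parameters play no role: all adaptation lives in the per-case statistics, which is the sense in which experience reuse can improve during deployment without weight updates.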

Across 16 diverse tasks spanning medical diagnosis, legal analysis, code generation, web search, tool use, and embodied interaction, CASCADE improves macro-averaged success rate by 20.9% over zero-shot prompting while consistently outperforming gradient-based and memory-based baselines. By reframing deployment as an adaptive learning process, this work establishes a foundation for continually improving AI systems.
