Strategic Decision Support for AI Agents

Abstract: Traditionally, decision support studies how humans use machine learning models to make better decisions. In modern agentic systems, this division of roles is increasingly reversed: AI agents act on behalf of users, while humans and tools become support mechanisms around them. This role reversal brings reliability concerns to the forefront, since agentic errors can be consequential and agent behavior must remain aligned with human goals and constraints.

Departing from the classical view of decision support, we revisit its two basic principles, the cost-value tradeoff of seeking support and the role of uncertainty quantification, in a setting where AI agents are the central actors. We propose a framework for strategic decision support for AI agents through an optimization problem that minimizes support usage subject to controlling a counterfactual missed-support error: the probability that the agent acts alone on instances where support would have materially improved its output.
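
One way to write this optimization problem down, as a sketch only: the symbols below are assumptions introduced for illustration and do not come from the text. Let $X$ be the agent's input, $\pi(X) \in \{0,1\}$ the decision to seek support ($1$) or act alone ($0$), $V$ the counterfactual improvement that support would bring, $\tau$ a materiality cutoff, and $\alpha$ the tolerated error level. Whether the error is measured as a joint or a conditional probability is likewise an assumption here.

```latex
\begin{equation*}
\min_{\pi : \mathcal{X} \to \{0,1\}} \; \Pr\bigl(\pi(X) = 1\bigr)
\quad \text{subject to} \quad
\Pr\bigl(\pi(X) = 0,\; V > \tau\bigr) \le \alpha .
\end{equation*}
```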

At the population level, we show that the optimal policy is a threshold rule on the value of support. Building on this structure, we develop an online algorithm that adaptively thresholds a score for the value of support and uses randomized exploration to control missed-support error without distributional assumptions. We further introduce a calibration-on-the-fly method that reduces unnecessary support calls online.
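
To make the idea of adaptive thresholding with randomized exploration concrete, here is a minimal Python sketch. It is not the algorithm described above, whose details are not given here. It swaps in a generic online update (a gradient-style step on the threshold, driven by an importance-weighted estimate of missed support), and every name and parameter (`OnlineSupportGate`, `alpha`, `explore_prob`, `step`) is hypothetical. It also assumes scores lie in [0, 1] and that, whenever support is called, the agent can observe whether it materially helped.

```python
import random


class OnlineSupportGate:
    """Hypothetical sketch: threshold a support-value score online.

    Support is called when the score is at or above the threshold, or when
    a random exploration coin comes up. Exploration rounds are the only
    source of feedback on inputs the threshold rule would have skipped.
    """

    def __init__(self, alpha=0.1, explore_prob=0.1, step=0.05,
                 threshold=0.5, seed=0):
        self.alpha = alpha                # target missed-support level (assumed)
        self.explore_prob = explore_prob  # probability of forced support calls
        self.step = step                  # learning rate for the threshold
        self.threshold = threshold        # current cutoff on the score
        self.rng = random.Random(seed)

    def decide(self, score):
        """Return (call_support, explored) for one input."""
        explored = self.rng.random() < self.explore_prob
        call_support = explored or score >= self.threshold
        return call_support, explored

    def update(self, score, explored, support_helped):
        """Adjust the threshold after one round.

        `support_helped` is only meaningful when support was called.
        """
        would_skip = score < self.threshold
        if explored:
            # Importance-weighted estimate of "the rule would have skipped
            # support although support materially helped".
            miss_estimate = float(would_skip and support_helped) / self.explore_prob
        else:
            # Without exploration, skipped inputs yield no feedback.
            miss_estimate = 0.0
        # Lower the threshold after estimated misses, raise it otherwise.
        self.threshold += self.step * (self.alpha - miss_estimate)
        self.threshold = min(1.0, max(0.0, self.threshold))
```

In use, each round would call `decide(score)`, run the agent with or without support accordingly, and then call `update(score, explored, support_helped)` before the next input arrives. The threshold drifts upward (fewer support calls) while no misses are detected and drops sharply when exploration reveals one.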

We instantiate this framework across diverse scenarios, including information gathering, human-AI collaboration, and tool use, showing how each can be modeled through the same strategic decision-support lens. Experiments across these settings show that our method reliably controls the target error while substantially reducing support usage in practice.
