Beyond expert users: agents should help users construct preferences, not just elicit them

Abstract: Agents typically assume an expert user, one with well-formed preferences about what they want, and default to asking clarifying questions whenever a task is underspecified. We argue that this assumption is unrealistic. Users often lack the domain knowledge needed to hold fully specified preferences. Asked about a particular feature, a user may be unable to answer until the agent helps them learn enough about the domain, for example through examples or explanations, to form a preference for that feature.

To formalize this view, we draw on the Search-Experience-Credence framework from information economics and introduce CoPref, a model of how users construct preferences in response to an agent's dialog actions. We then study these ideas concretely in agentic recommender systems by proposing CoShop, an interactive benchmark. In CoShop, an agent converses with a CoPref user and recommends items to them. The agent’s performance depends on whether it can help the user gain the knowledge needed to specify the task well.
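The contrast between eliciting and constructing preferences can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the CoPref model or the CoShop protocol: the class `ToyUser`, the function `elicit`, the action names, and the rule that a single explanation is enough to make a feature answerable are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical dialog actions; the names are illustrative assumptions.
ASK, EXPLAIN = "ask", "explain"


@dataclass
class ToyUser:
    """A user whose latent preferences are stateable only for features they understand."""
    latent: dict                              # feature -> preferred value, hidden from the agent
    known: set = field(default_factory=set)   # features the user can already reason about

    def respond(self, action, feature):
        if action == EXPLAIN:
            # Assumption: one explanation is enough to learn the feature.
            self.known.add(feature)
            return None
        if action == ASK:
            # A clarifying question succeeds only for features the user understands.
            return self.latent[feature] if feature in self.known else None
        raise ValueError(f"unknown action: {action}")


def elicit(user, features, teach_first):
    """Collect the preferences the user is able to state, optionally explaining first."""
    collected = {}
    for feature in features:
        if teach_first:
            user.respond(EXPLAIN, feature)
        answer = user.respond(ASK, feature)
        if answer is not None:
            collected[feature] = answer
    return collected


latent = {"budget": "under 100", "sensor_size": "full frame"}
ask_only = elicit(ToyUser(latent, known={"budget"}), latent, teach_first=False)
teaching = elicit(ToyUser(latent, known={"budget"}), latent, teach_first=True)
print(ask_only)   # only the feature the user already understood
print(teaching)   # every feature, once each has been explained
```

In this toy setting, the ask-only agent recovers just the one preference the user could already state, while the agent that explains before asking recovers both.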

Evaluating five frontier models, we find that no agent exceeds 56% accuracy on CoShop despite five turns of interaction. Failures stem not from any difficulty in finding items but from how little the interaction expands what users know about what they want.
