Critique of Agent Model

Abstract: What is an agent? What constitutes agency? Large Language Model (LLM) systems are increasingly marketed as “coding agents”, “AI co-scientists”, and other “agentic” tools that promise to drive up productivity. At the same time, “existential” concerns have grown that AI could escape human control and turn destructive power against humans under a speculative “machine agency”. It has therefore become essential to clarify where automation ends and agency begins, both for building capable systems and for understanding whether, and what, to fear.

Drawing on Descartes’ grounding of agency in independent thought, and on portrayals of autonomous beings in science fiction, we survey the current landscape of AI agents and analyze agent architectures along five dimensions: goal, identity, decision-making, self-regulation, and learning. Specifically, we argue that genuine agency requires these structures to be internalized within the system itself rather than assembled through external scaffolding.

This distinction between agentic systems, whose competence resides in engineered workflows, and agentive systems, whose capabilities (including social interaction) arise endogenously, defines the boundary between systems designed for prescribed tasks and those capable of operating in the open world with true autonomy.

Building on this analysis, we propose the Goal-Identity-Configurator (GIC) architecture for a general-purpose agent model, combining hierarchical goal decomposition, identity evolution, simulative reasoning grounded in a separately trained world model, learned self-regulation, and self-directed learning from both real and simulated experience. Furthermore, we offer insights into the auditability, controllability, and safety of agentive systems that possess greater autonomy and “agency” but remain under human oversight.
