Harness, Scaffold, and the AI Agent Terms Worth Getting Right

When a field evolves quickly, its vocabulary often evolves faster than its shared understanding. Terms blur, get reused in different contexts, or become shorthand for ideas that are never fully explained. This is happening right now with AI agents: concepts get mixed together, some are renamed, and others are widely used for a few months before quietly disappearing.

This can be overwhelming for newcomers, and even for practitioners trying to keep up with the latest developments. After ICLR 2026, one of us (@ariG23498) posted a question that captured this confusion well: “What do you mean by the terms ‘harness’ and ‘scaffold’ in the context of agents? I have heard a lot of explanations while I was at ICLR, but I could not understand why they did not converge to a single explanation.”

This glossary is our attempt to ground the terms that keep coming up without clear, consistent explanations. It is not meant to be a comprehensive dictionary of every term in the field. Instead, we focus on the concepts that are often mixed up, reused in different ways, or assumed to be obvious when they are not.

Model

The model is the LLM: it takes text in and produces text out (e.g., Claude, Qwen, GPT, Kimi, DeepSeek…). On its own, it has no memory between calls and no loop: it answers one prompt and stops. It can express the intent to call a tool, but it needs a harness to actually execute the call. Wrap it in scaffolding and a harness and it becomes an agent.
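To make "text in, text out, no memory" concrete, here is a minimal sketch. `toy_model` is a hypothetical stand-in for a real LLM call, and the JSON tool-call format is an assumption chosen for illustration, not any vendor's actual protocol.

```python
import json


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: text in, text out, no state kept."""
    if "weather" in prompt.lower():
        # The model can only *express* the intent to call a tool, as text.
        # Nothing is executed here; that is the harness's job.
        return json.dumps({"tool": "get_weather", "args": {"city": "Paris"}})
    return "I can answer that directly."


# Two independent calls: the second knows nothing about the first.
first = toy_model("What is the weather in Paris?")
second = toy_model("What did I just ask you?")

print(first)
print(second)
```

The first reply is only a description of a tool call, and the second call has no access to the first prompt. Both gaps are what the layers defined in the following sections fill.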

Scaffolding

The behavior-defining layer around the model: the system prompt, the tool descriptions, how the model’s responses get parsed, and what it remembers across steps (context management). It shapes how the model sees the world and acts in it, whether during training or at inference. Usage is not uniform, though: products like Claude Code, Codex, and Antigravity CLI call the whole thing, scaffolding included, a harness. Claude Code’s own docs say it directly: “Claude Code serves as the agentic harness around Claude.” That’s the broad use: harness means everything that isn’t the model.
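The four ingredients listed above (system prompt, tool descriptions, response parsing, context management) can be sketched as one small class. Everything here is illustrative: the `Scaffold` name, its fields, the prompt layout, and the JSON tool-call format are assumptions, not taken from any particular framework.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Scaffold:
    """Hypothetical scaffold: defines what the model sees and how replies are read."""

    system_prompt: str
    tool_descriptions: dict[str, str]
    max_history: int = 4  # context management: keep only the last N entries
    history: list[str] = field(default_factory=list)

    def build_prompt(self, user_message: str) -> str:
        """Assemble the text the model actually receives."""
        tools = "\n".join(
            f"- {name}: {desc}" for name, desc in self.tool_descriptions.items()
        )
        recent = "\n".join(self.history[-self.max_history:])
        return (
            f"{self.system_prompt}\n\nTools:\n{tools}\n\n"
            f"History:\n{recent}\n\nUser: {user_message}"
        )

    def parse(self, response: str) -> dict:
        """Interpret a reply as either a tool call or plain text."""
        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            return {"type": "text", "content": response}
        if isinstance(data, dict) and "tool" in data:
            return {"type": "tool_call", "tool": data["tool"], "args": data.get("args", {})}
        return {"type": "text", "content": response}

    def remember(self, entry: str) -> None:
        """Record a step so later prompts can include it."""
        self.history.append(entry)


scaffold = Scaffold(
    system_prompt="You are a careful assistant.",
    tool_descriptions={"add": "Add two numbers a and b."},
)
print(scaffold.build_prompt("What is 2 + 3?"))
print(scaffold.parse('{"tool": "add", "args": {"a": 2, "b": 3}}'))
```

Note that nothing in this class calls a model or runs a tool. It only decides what the model works from and how its output is read.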

Harness

The execution layer inside the agent: it calls the model, handles its tool calls, and decides when to stop. The harness is what makes the agent run, whereas scaffolding, defined above, is what the model works from: its instructions, its tools, its format. Harness engineering is the discipline of designing this layer well: deciding when the agent should stop, how errors get handled, and what guardrails keep it on track.
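A minimal sketch of such an execution loop follows, under stated assumptions: `run_harness`, the JSON tool-call convention, and the rule "plain text means final answer" are all hypothetical choices made for illustration. The three harness-engineering concerns named above show up as a stopping rule, a `try`/`except` around tool execution, and a `max_steps` guardrail.

```python
import json
from typing import Callable


def run_harness(
    model: Callable[[str], str],
    tools: dict[str, Callable],
    task: str,
    max_steps: int = 5,
) -> str:
    """Hypothetical harness: call the model, run its tool calls, decide when to stop."""
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):  # guardrail: never loop forever
        reply = model("\n".join(transcript))
        try:
            call = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # stopping rule: plain text is treated as the final answer
        if not isinstance(call, dict):
            return reply
        name = call.get("tool")
        if name not in tools:
            transcript.append(f"Error: unknown tool {name!r}")
            continue
        try:
            result = tools[name](**call.get("args", {}))
        except Exception as exc:  # error handling: report the failure back to the model
            result = f"Error: {exc}"
        transcript.append(f"Tool {name} returned: {result}")
    return "Stopped: step limit reached."


def toy_model(prompt: str) -> str:
    """Hypothetical model: asks for one tool call, then answers."""
    if "returned:" in prompt:
        return "The answer is 5."
    return json.dumps({"tool": "add", "args": {"a": 2, "b": 3}})


answer = run_harness(toy_model, {"add": lambda a, b: a + b}, "Add 2 and 3")
print(answer)
```

The model function is passed in as a plain callable, so the same loop would work with any text-in, text-out backend. Only the harness decides whether another step happens.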

Agent

The term comes from reinforcement learning, where an agent is simply a function that takes an observation and returns an action. The environment takes that action and returns a new observation, and the loop repeats. That loop is still at the core of how LLM agents work. In the LLM world, the term has expanded: an agent is a model plus everything around it that lets it act, not just respond. It turns raw text generation into something that can act in a loop: taking in information, deciding what to do, and acting on the results.
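The reinforcement-learning loop described above can be written in a few lines. This is a toy sketch: `CounterEnv`, `make_policy`, and `run_episode` are hypothetical names, and the policy is a hard-coded rule rather than a learned one or an LLM.

```python
from typing import Callable


class CounterEnv:
    """Hypothetical toy environment: the observation is the current counter value."""

    def __init__(self) -> None:
        self.value = 0

    def step(self, action: str) -> int:
        """Apply the action and return the new observation."""
        if action == "increment":
            self.value += 1
        return self.value


def make_policy(target: int) -> Callable[[int], str]:
    """Build an agent in the RL sense: a function from observation to action."""

    def policy(observation: int) -> str:
        return "increment" if observation < target else "stop"

    return policy


def run_episode(
    env: CounterEnv, policy: Callable[[int], str], max_steps: int = 100
) -> tuple[int, int]:
    """Run the observe-act loop; return the final observation and the steps taken."""
    observation = env.value
    for steps in range(max_steps):
        action = policy(observation)  # agent: observation -> action
        if action == "stop":
            return observation, steps
        observation = env.step(action)  # environment: action -> new observation
    return observation, max_steps


final_observation, steps_taken = run_episode(CounterEnv(), make_policy(target=3))
print(final_observation, steps_taken)
```

In an LLM agent the same shape holds, with the model in the role of the policy, tool results as observations, and the harness running the loop.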