Agents need control flow, not more prompts
07 May, 2026
Thesis: reliable agents tackling complex tasks need deterministic control flow encoded in software, not increasingly elaborate prompt chains.
If you’ve ever resorted to MANDATORY or DO NOT SKIP, you’ve hit the ceiling of prompting. Imagine a programming language where statements are suggestions and functions return “Success” while hallucinating. Reasoning becomes impossible; reliability collapses as complexity grows.
Software scales through recursive composability: systems built from libraries, modules, and functions. It’s code all the way down. Code exposes predictable behavior, enabling local reasoning. Prompt chains lack this property. While useful for narrow tasks, prompts are non-deterministic, weakly specified, and difficult to verify.
Reliability requires moving logic out of prose and into runtime. We need deterministic scaffolds: explicit state transitions and validation checkpoints that treat the LLM as a component, not the system.
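One way to make this concrete is a small state machine: the transitions and retry policy are ordinary code, and the model is called only inside one state. This is a minimal sketch, not a framework; `llm_call` and `validate` are hypothetical callables standing in for whatever model client and checker you actually use.

```python
from enum import Enum, auto

class State(Enum):
    DRAFT = auto()
    VALIDATE = auto()
    DONE = auto()
    FAILED = auto()

def run_pipeline(task: str, llm_call, validate, max_retries: int = 3) -> str:
    """Deterministic scaffold: state transitions live in code;
    the LLM is one component invoked at a specific step."""
    state, draft, attempts = State.DRAFT, None, 0
    while state not in (State.DONE, State.FAILED):
        if state is State.DRAFT:
            draft = llm_call(task)        # the only non-deterministic step
            state = State.VALIDATE        # deterministic transition
        elif state is State.VALIDATE:
            attempts += 1
            if validate(draft):           # programmatic checkpoint
                state = State.DONE
            elif attempts >= max_retries:
                state = State.FAILED      # bounded retry, encoded in code
            else:
                state = State.DRAFT
    if state is State.FAILED:
        raise RuntimeError(f"validation failed after {max_retries} attempts")
    return draft
```

The point is that retries, termination, and the definition of “done” are all inspectable in the code, not buried in a prompt the model may or may not honor.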
But deterministic orchestration is only half the battle. In a system prone to silent failure, an agent without aggressive error detection is just a fast way to reach the wrong conclusion. Without programmatic verification, we are left with three options:
- Babysitter: Keep a human in the loop to catch errors before they propagate.
- Auditor: Perform exhaustive end-to-end verification after the run.
- Prayer: Accept the outputs on vibes.
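The alternative to all three is a checkpoint that rejects bad output at the boundary. A minimal sketch, assuming the model is asked to return JSON with illustrative fields `name` and `amount` (both names are assumptions, not from any particular API):

```python
import json

def verify_extraction(raw: str, required_keys=("name", "amount")) -> dict:
    """Programmatic verification: reject malformed model output
    instead of trusting a hallucinated 'Success'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}")
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    # Domain checks go beyond schema: catch plausible-looking nonsense.
    if not isinstance(data["amount"], (int, float)) or data["amount"] < 0:
        raise ValueError("amount must be a non-negative number")
    return data
```

A checkpoint like this turns a silent failure into a loud one, which is exactly what a deterministic scaffold can route into a retry or a hard stop.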