Your AI Agent Isn't Broken. Your Company's Truth Is.

Your AI Agent Isn’t Broken. Your Company’s Truth Is.

你的 AI 智能体没坏,坏的是你公司的“真相”

The AI agent had one job: pay approved vendor invoices, so the finance team could stop doing it by hand. On a Tuesday morning, it picked up invoice #4471 from a freight vendor Ksh48,000, stamped Approved in the company’s ERP, cleanly matched to a valid purchase order. The agent checked the things it was told to check. They all passed. It paid the invoice. The invoice had already been paid. The previous Thursday. By a member of the finance team.

这个 AI 智能体的工作很简单:支付已审批的供应商发票,好让财务团队从繁琐的手工操作中解脱出来。周二早上,它处理了来自一家货运供应商的 #4471 号发票,金额为 48,000 先令。在公司的 ERP 系统中,这张发票显示为“已审批”,并与有效的采购订单完美匹配。智能体检查了所有被要求核对的项目,全部通过,于是它支付了这笔款项。然而,这张发票其实在上周四就已经被财务团队的一名成员支付过了。

Here is what the company’s systems believed that morning and none of them was wrong. The ERP said: Approved. Unpaid. The reconciliation job that pulls in bank activity runs overnight, and last night it had failed silently. So the ERP’s picture of the world was simply four days stale. The bank feed said: Paid. Last Thursday. It was right. Nobody had told the ERP. A Slack thread said: “hold everything to this vendor they double-billed us last quarter, I’m sorting it out with their AP team.” Posted by the accounts-payable lead. Three days earlier. Resolved in her head, and nowhere else.

以下是当天早上公司各系统所掌握的信息,且它们各自都没错:ERP 系统显示“已审批、未支付”。负责同步银行流水对账的作业在夜间运行,但昨晚静默失败了,因此 ERP 对世界的认知滞后了四天。银行流水显示“已支付,上周四”,这是正确的,但没人通知 ERP。Slack 上的一条消息写着:“暂停给这家供应商付款,他们上季度重复计费了,我正在和他们的应付账款团队沟通。”这是应付账款主管三天前发的,她心里认为问题已经解决了,但除此之外没有任何地方记录了这一点。

The vendor’s own email said: “Payment well received, thank you!” referring, of course, to Thursday’s payment. The agent’s inbox reader had seen it that morning, then set it aside, because email ranked below the ERP and the two disagreed. Every system was internally consistent. Every system was the authority on something. And there was no system anywhere not one that could answer the only question that actually mattered: has invoice #4471 been paid?

供应商发来的邮件写着:“已收到付款,谢谢!”指的当然是周四的那笔款项。智能体的收件箱读取器当天早上看到了这封邮件,但将其搁置了,因为邮件的优先级低于 ERP,且两者信息不符。每个系统内部都是自洽的,每个系统在各自领域都是权威。但没有任何一个系统——一个都没有——能回答那个唯一重要的问题:#4471 号发票到底付过款了吗?

A human clerk would almost certainly have caught it. Not because a clerk is smarter than the model they’re not. Because a clerk would have felt the friction. They’d have half-remembered cutting the check. Or scrolled past the Slack message that morning and hesitated. Or simply had the reflex to ping someone before sending $48,000 out the door. Reconciling systems that quietly disagree is most of what operations people actually do all day so much of it that nobody files it under “work.” It’s just judgment.

人类职员几乎肯定能发现这个问题。不是因为职员比模型更聪明——他们并不聪明——而是因为职员能感受到“摩擦力”。他们可能会隐约记得开过支票,或者在早上浏览 Slack 消息时产生迟疑,又或者在转出 48,000 先令之前,出于本能去询问一下同事。协调那些静默冲突的系统,占据了运营人员每天大部分的工作时间,以至于没人把这当作“工作”,这仅仅被视为一种“判断力”。

The agent had no friction to feel. It read the highest-priority system, found Approved, unpaid, and acted at machine speed, with no pause, no second source, no instinct that something was off. It didn’t break. It worked exactly as designed, against a company that had no single, trustworthy answer to give it. This is the failure almost nobody names correctly. I call it epistemic collapse.

智能体没有这种“摩擦力”可言。它读取了优先级最高的系统,发现“已审批、未支付”,便以机器的速度执行了操作,没有停顿,没有交叉验证,也没有任何“不对劲”的直觉。它没有坏,它完全按照设计运行,只是它面对的是一家无法提供单一、可信答案的公司。这就是几乎没人能准确命名的失败,我称之为“认知崩溃”(epistemic collapse)。

We keep trying to fix the agent better models, better prompts, better retrieval, tighter guardrails when the agent was never the broken part. The broken part is underneath it. Companies don’t have a layer that turns scattered data into one trustworthy answer. They have an ERP, and a Slack workspace, and a bank feed, and an inbox, and a spreadsheet each holding a fragment of the truth, none of them agreeing, none of them on the same clock held together by a thin film of human judgment that reconciles the whole mess silently, all day, forever. That film finally needs a name, because we are now, for the first time, trying to build it.

我们一直试图通过更好的模型、更好的提示词、更好的检索和更严密的护栏来修复智能体,但智能体从来都不是坏掉的部分。坏掉的部分在它之下。公司缺乏一个能将零散数据转化为单一可信答案的层级。他们有 ERP、Slack 工作区、银行流水、收件箱和电子表格,每一个都只掌握真相的碎片,彼此不一致,时间戳也不统一,全靠一层薄薄的人类判断力将这一团乱麻静默地协调在一起,日复一日,永无止境。这层“薄膜”终于需要一个名字了,因为我们现在第一次尝试去构建它。

  1. What’s actually missing. Call it epistemic infrastructure: the layer that turns data into truth. It’s worth being precise about that, because the two words get used as if they’re the same thing, and the gap between them is exactly where the agent fell in. Data is what your systems store. Truth is what is actually the case. Most companies are drowning in the first and own nothing that reliably produces the second.

  2. 真正缺失的是什么?称之为“认知基础设施”(epistemic infrastructure):即那个将数据转化为真相的层级。我们需要对此进行精确定义,因为这两个词常被混为一谈,而它们之间的鸿沟正是智能体跌落的地方。数据是系统存储的内容,而真相是实际发生的情况。大多数公司淹没在数据中,却没有任何能可靠产出真相的机制。

The ERP had data. The bank had data. Slack had data. What no one had was a place that could take all of it and resolve it into the invoice has been paid with enough confidence to bet 48,000先令 on the answer. Every business system you own makes the same quiet mistake: it crushes three genuinely different things into one overwritten field. Observation - somebody asserted X. “The ERP shows the invoice as approved and unpaid.” Truth - X is actually the case. “The invoice is genuinely unpaid.” History - X became true at some point, and used to be something else. “Unpaid through Wednesday. Paid since Thursday.”

ERP 有数据,银行有数据,Slack 也有数据。但没有人拥有一个地方,能将所有这些信息汇总并得出“发票已支付”的结论,且置信度高到足以押上 48,000 先令。你拥有的每一个业务系统都犯着同样的静默错误:它将三个本质上不同的东西压缩进了一个会被覆盖的字段中:观察(有人断言 X,“ERP 显示发票已审批且未支付”)、真相(X 确实是事实,“发票确实未支付”)、历史(X 在某个时间点变为真,且之前是另一种状态,“周三前未支付,周四起已支付”)。

A database row that reads status: unpaid smashes all three together. It can’t tell you who said so, when it became true, what it was before, or whether anything disagreed. The instant the field is written, every one of those distinctions is gone. The row doesn’t say “the overnight job hasn’t run, so this is a four-day-old observation from one source, and the bank feed disagrees.” It just says unpaid, with the full, flat confidence of a fact.

一个显示“状态:未支付”的数据库行将这三者混为一谈。它无法告诉你是谁说的、何时生效、之前是什么状态,或者是否有其他来源与之冲突。一旦该字段被写入,所有这些区别就消失了。这一行数据不会告诉你“夜间作业未运行,所以这是一个来自单一来源、滞后四天的观察结果,且银行流水与之冲突”。它只是冷冰冰地写着“未支付”,带着一种事实般绝对且平庸的自信。

Now multiply that by every system in the company each collapsing observation, truth, and history its own way, each certain about its own slice, none aware of the others and you have produced that Tuesday morning by construction. The systems weren’t malfunctioning. They were doing the only thing they were ever built to do: store data and present it as truth.

现在,将这种情况乘以公司里的每一个系统——每个系统都以自己的方式将观察、真相和历史坍缩,每个系统都对自己掌握的片段深信不疑,却对其他系统一无所知——你就“构建”出了那个周二早上的惨剧。系统并没有故障,它们只是在做它们被设计用来做的唯一一件事:存储数据,并将其呈现为真相。

Epistemic infrastructure is the missing layer that refuses to do that. Instead of overwriting a field, it records observations and keeps them who observed, from where, when it was claimed true, and what it was replaced. Instead of forcing one answer when sources disagree, it represents the disagreement as a first-class fact rather than silently picking a winner. And instead of pretending the company has a single consistent state, it can tell you the honest thing: the ERP and the bank disagree about #4471, the ERP’s data is stale, and a hold was placed in Slack three days ago do not pay this yet. That last sentence is the entire product. It’s the sentence no system at the company could produce, and the sentence a competent human produces effortlessly.

认知基础设施就是那个缺失的、拒绝这样做的层级。它不会覆盖字段,而是记录观察结果并保留它们:谁观察的、从哪里观察的、何时被宣称为真、以及它取代了什么。当来源冲突时,它不会强行给出一个答案,而是将这种冲突视为一种“一等事实”,而不是静默地选出一个胜者。它不会假装公司处于单一的一致状态,而是能告诉你诚实的情况:ERP 和银行对 #4471 号发票的记录不一致,ERP 数据滞后,且 Slack 上三天前已下达暂停支付指令——现在不要支付。最后这句话就是整个产品的核心。这是公司里没有任何系统能说出来的话,却是一个称职的人类可以毫不费力说出的结论。

  1. Why AI agents are the ones that broke. Here’s the part that reframes everything: this problem is not new. Companies have always been a heap of disagreeing systems on mismatched clocks. The contradiction in the story has been sitting in that company for years. It never detonated for one reason. Humans were the epistemic infrastructure. They were the layer that knew…

  2. 为什么 AI 智能体成了“坏掉”的那一个?这里有一个重构一切的观点:这个问题并不新鲜。公司一直以来都是一堆时钟不同步、信息不一致的系统的集合体。故事中的矛盾在公司里存在多年了,之所以从未爆发,原因只有一个:人类充当了“认知基础设施”。他们是那个知道……的层级。