Many Are Building Cathedrals on Quicksand
Many Are Building Cathedrals on Quicksand
许多人正在流沙之上建造大教堂
Many Are Building Cathedrals on Quicksand. The foundations of AI development shift every quarter. These are the architectural choices that outlast the churn. 许多人正在流沙之上建造大教堂。人工智能开发的基石每季度都在发生变化。以下是那些能够经受住动荡考验的架构选择。
Medieval cathedrals were designed to outlast their builders. The architects who laid the first stones at Notre-Dame knew they’d never see it finished. They planned in centuries. We’re doing the opposite. We’re building software on foundations that shift every quarter, with vendor relationships that treat genuinely competitive commercial providers as neutral infrastructure, and with code that hard-codes behaviors that will be deprecated before the next sprint cycle. 中世纪的大教堂在设计之初就旨在超越其建造者。在巴黎圣母院铺下第一块基石的建筑师们深知,他们有生之年永远无法看到它完工。他们的规划是以世纪为单位的。而我们却恰恰相反。我们正在流沙般的基石上构建软件,将那些存在激烈竞争的商业供应商视为中立的基础设施,并将那些在下一个冲刺周期前就会被弃用的行为硬编码到代码中。
GPT-4 was state of the art in early 2023. By late 2024, it was middle of the pack. Entire startups built on specific model behaviors woke up to find their core assumption was gone. Not wrong. Not deprecated with a migration guide. Just: gone, or quietly changed, or superseded by something so different the old prompts didn’t work anymore. That’s the terrain we’re traversing as leaders. The question isn’t whether the ground will shift. It’s whether your architecture can handle it when it does. GPT-4 在 2023 年初处于行业领先地位。到了 2024 年底,它已沦为平庸之辈。许多完全建立在特定模型行为之上的初创公司一觉醒来,发现他们的核心假设消失了。不是因为错了,也不是因为有迁移指南的弃用。而是:直接消失了,或者被悄悄修改了,又或者被某种完全不同的东西所取代,导致旧的提示词(prompts)不再奏效。这就是我们作为领导者所处的环境。问题不在于地面是否会移动,而在于当它移动时,你的架构能否应对。
The Problem with Betting on a Foundation That’s Still Being Poured
押注于尚未凝固的基石所带来的问题
Here’s what the past several years have looked like from where I sit: 2022: GPT-3 was the obvious choice. Build on it. 2023: GPT-4 changes everything. Rebuild or fall behind. 2023 (late): Claude 2, open-source models, local inference. Suddenly the answer wasn’t obvious. 2024: GPT-4o, Claude 3 Opus, Gemini Ultra, Llama 3. All competitive. All different. 2025: Reasoning models, multimodal, agents. The architecture question gets much harder. 2026: Tools and harnesses are maturing, workflows are settling, swarms are better at parallelizing tasks, teams are beginning to think about tokenomics. Model is becoming a commodity — local open-source models are much closer to frontier model capabilities. China’s coordination across its AI ecosystem is showing real gains against the US AI ecosystem. 从我的视角来看,过去几年是这样的:2022 年:GPT-3 是显而易见的选择,基于它构建即可。2023 年:GPT-4 改变了一切,要么重建,要么落后。2023 年末:Claude 2、开源模型、本地推理出现,答案突然变得不再明确。2024 年:GPT-4o、Claude 3 Opus、Gemini Ultra、Llama 3,它们都极具竞争力,且各不相同。2025 年:推理模型、多模态、智能体,架构问题变得更加棘手。2026 年:工具和框架日趋成熟,工作流逐渐稳定,集群在并行任务处理上表现更佳,团队开始考虑 Token 经济学。模型正在成为一种商品——本地开源模型与前沿模型的能力差距正在大幅缩小。中国在其人工智能生态系统中的协调能力,正对美国的人工智能生态系统展现出真正的竞争优势。
Every one of those transitions created winners and losers, and the losers were almost always the teams that had built the most tightly-coupled solutions to a specific model’s API. Not because those teams were bad engineers. Because they were optimizing for the wrong thing. They were building for today’s foundation instead of building for foundation-change. 每一次转型都造就了赢家和输家,而输家几乎总是那些将解决方案与特定模型 API 紧密耦合的团队。这并非因为这些团队的工程师水平不行,而是因为他们优化错了方向。他们是在为“今天的基石”而构建,而不是为“基石的更迭”而构建。
The deprecation notices tell the story. Anthropic’s stated minimum notice window before a model is retired is 60 days — and several recent models have hit exactly that floor. Claude Sonnet 4 and Claude Opus 4 went from launch to complete retirement in under a year. OpenAI’s entire Assistants API product — a structural foundation many teams built on — is being removed in August 2026, requiring a complete migration to the Responses API. This isn’t a deprecation. It’s a teardown with a deadline. 弃用通知说明了一切。Anthropic 声明的模型退役前最低通知期限为 60 天——而最近的几个模型恰好触及了这一底线。Claude Sonnet 4 和 Claude Opus 4 从发布到彻底退役不到一年。OpenAI 的整个 Assistants API 产品——许多团队赖以生存的结构性基石——将于 2026 年 8 月被移除,要求用户必须完全迁移到 Responses API。这不仅仅是弃用,这是带有最后期限的拆除。
The release pace compounds it. Frontier model releases arrived roughly once every 37 days in 2023. By 2026, the interval had compressed to roughly every 11 days. The ground doesn’t just move. It moves faster every year, every quarter, every month, every week. 发布节奏加剧了这一问题。2023 年,前沿模型的发布频率大约每 37 天一次。到了 2026 年,这一间隔压缩到了大约每 11 天一次。地面不仅在移动,而且每年、每季度、每月、每周都在加速移动。
The cloud-native movement figured this out the hard way a decade ago. The teams that won didn’t write code that assumed AWS and only AWS forever. They wrote code that treated AWS as a utility, abstracted behind interfaces they controlled, using APIs that could accommodate hybrid cloud environments. In the mergers-and-acquisitions deals I see, limiting acquisition targets to companies using the same cloud provider as the buyer is rarely an acceptable constraint. This means using containerized applications, database abstraction layers, and vendor-agnostic infrastructure-as-code where possible. Same lesson. Different decade. Somehow we’re learning it again from scratch. What’s old becomes new again. 十年前,云原生运动通过惨痛的教训意识到了这一点。最终胜出的团队并没有编写那种假设“永远只用 AWS”的代码。他们编写的代码将 AWS 视为一种公用事业,通过他们自己控制的接口进行抽象,并使用能够适应混合云环境的 API。在我所见的并购交易中,将收购目标限制为与买方使用相同云服务商的公司,几乎从来都不是一个可接受的约束条件。这意味着要尽可能使用容器化应用、数据库抽象层以及与供应商无关的基础设施即代码(IaC)。同样的教训,不同的年代。不知何故,我们正在从头开始重新学习它。旧事物又变成了新事物。
What Actually Changes vs. What Stays Stable
什么是真正变化的,什么是保持稳定的
A useful (and simple) mental model that works here is the following: Some concepts in AI (or any broad technology category) are stable. Some are not. Your architecture should only hard-code the stable ones. 这里有一个有用(且简单)的思维模型:人工智能(或任何广泛的技术类别)中的某些概念是稳定的,而有些则不是。你的架构应该只对那些稳定的概念进行硬编码。
Stable: tokens, attention mechanisms, context windows as a concept, embeddings as a concept, the basic prompt-completion pattern, retrieval-augmented generation as an approach to prompt augmentation. 稳定:Token、注意力机制、作为概念的上下文窗口、作为概念的嵌入(Embeddings)、基本的提示-补全模式、作为提示增强方法的检索增强生成(RAG)。
Unstable: specific API parameters, model-specific prompt formats, context window sizes (they keep growing, though max usable window for predictable results has not grown much…YET), pricing structures, rate limits, specific model behaviors that aren’t documented as guarantees, fine-tuning APIs, function-calling syntax. 不稳定:特定的 API 参数、模型特定的提示词格式、上下文窗口大小(它们在不断增长,尽管可预测结果的最大可用窗口尚未增长太多……目前为止)、定价结构、速率限制、未作为保证记录的特定模型行为、微调 API、函数调用语法。
When engineers hard-code model-specific behaviors into business logic, they’re writing code with an unknown (but near-certain-to-happen) expiration date. However, if they abstract those behaviors behind interfaces their team controls, they’re buying themselves optionality. Optionality is the actual product you’re building when you build model-agnostic infrastructure. 当工程师将模型特定的行为硬编码到业务逻辑中时,他们编写的代码就带有一个未知(但几乎肯定会到来)的过期日期。然而,如果他们将这些行为抽象在团队可控的接口之后,他们就为自己赢得了选择权。当你构建与模型无关的基础设施时,选择权才是你真正构建的产品。
One concrete example: prompt templates. Teams that wrote prompts directly into application code, formatted specifically for GPT-4’s preferred patterns, had real migration work to do when they needed to switch. Teams that externalized prompts into configuration, with a thin layer that could reformat them per model, had a much easier time. Same underlying logic. Very different operational posture. 一个具体的例子:提示词模板。那些将提示词直接写入应用程序代码、并专门针对 GPT-4 偏好模式进行格式化的团队,在需要切换模型时面临着繁重的迁移工作。而那些将提示词外部化到配置文件中,并使用一层薄薄的中间层根据不同模型进行重格式化的团队,则轻松得多。底层逻辑相同,但运营姿态截然不同。
The Vendor Lock-In Problem (Again)
供应商锁定问题(再次出现)
OpenAI, Anthropic, and Google are not neutral infrastructure providers. I don’t say that to be critical of any of them. They’re building remarkable technology. But they have commercial interests, competitive pressures, and strategic priorities that are not aligned with your need for stable, predictable infrastructure. Treating them like AWS S3 is strategically naive. AWS S3 has maintained complete API backward compatibility since its 2006 launch — twenty years. Their own 20th-anniversary post states it plainly: “the code you wrote for S3 in 2006 still works today, unchanged”. OpenAI、Anthropic 和 Google 并非中立的基础设施提供商。我这么说并非为了批评他们中的任何一家。他们正在构建卓越的技术。但他们拥有商业利益、竞争压力和战略重点,这些与你对稳定、可预测基础设施的需求并不一致。将他们视为 AWS S3 在战略上是天真的。AWS S3 自 2006 年推出以来,已经保持了 20 年完整的 API 向后兼容性。他们自己的 20 周年纪念文章明确指出:“你在 2006 年为 S3 编写的代码,今天依然可以运行,且无需任何更改”。