The Capability Curve Has No Memory
And everyone keeps building anyway. What choice do we really have? Anthropic is urging a coordinated pause on advanced AI development. Last week they published a progress report by Marina Favaro and Jack Clark arguing that AI systems are accelerating and could reach “recursive self-improvement,” and I have not been able to stop thinking about it. Not because of the headline numbers, though those are striking enough. Claude authored over 80% of the code merged into Anthropic’s own codebase, and other frontier companies are now seeing much the same. Engineers are shipping eight times more output per quarter than they did two years ago. An agent can complete tasks that would take a skilled human sixteen hours, working continuously, without being redirected once.
What got me was the graph showing lines of code per engineer over time. Flat for four years. Then a sharp bend upward in 2025, when Claude started running code rather than just suggesting it: the ouroboros, a binary Gödel machine feeding code back into itself. Then steeper again in 2026, when agents started working autonomously over longer horizons. So stop! But I don’t think anyone will, even at Anthropic’s request. Technology is like an organism; it just keeps evolving. Smart cookies, Anthropic. In just a few years they managed to get the moola, 1 trillion, in fact. Buying the missing puzzle pieces of infrastructure, like Vercept, Bun, Coefficient Biohealth, Fractional AI, and Stainless (the SDK experts, for whom Anthropic was one of their largest clients), makes sense symbiotically and strategically. Well played. I don’t know everything going on inside Anthropic, but Dario and his team are starting to look like 4D chess grandmasters.
I looked at that graph and felt two things at the same time. I was genuinely impressed; I really like Anthropic. And, if I’m honest, I’m a little concerned about the concentration of control. Pretty much all of the brains and infrastructure in AI will be consolidated into a handful of Silicon Valley tech companies, reminiscent of the ’80s and ’90s, when Microsoft made deals with the hardware manufacturers so that Windows was the only licensed OS allowed on their machines. That’s why Linux was smart to pivot to servers, and it retains 60% of market share to this day. Ubuntu is great: it works and very rarely has any reliability issues, and the same goes for Red Hat and Debian.
The Inflection Point Nobody Has a Map For
Here is what I think is actually happening, and why the idea of a pause is rational rather than alarmist. We are approaching a threshold. Not gradually, but the way the frog in hot water approaches boiling: nothing much visible, then everything all at once. Once an agent can reliably replicate its own development cycle and sustain above 90% code accuracy on open-ended tasks, the nature of human work does not just change. It restructures from the ground up; it amplifies and compounds. The Anthropic article is careful to frame this as a positive development, and they are not wrong. More code shipped faster, bugs caught before production, research that would have taken humans months or years completed in weeks. Real gains for real problems.
But here is what it means on the ground for the people doing the work. The volume of what needs to get done does not decrease. It multiplies. What changes is the type of work. Manual execution gives way to high-level direction. Writing code gives way to reviewing it, shaping it, and deciding which strategic problems it should be solving. The human role becomes a layer of high-level authorisation above an autonomous system that is already capable of most of the execution. That is not less work. It is a more complex, more cerebral job, one that requires multidisciplinary experience and deep, detective-style problem-solving skills. Ten times the output means ten times the decisions, ten times the context to hold, and ten times the responsibility for what ships correctly; that’s the compounding effect. Agentic bots are going to do all of this for us, and some already are. Being the head of HITL (human in the loop) is not easy; stuff moves so quickly. Could you read 20 pages of code and text from 20 different projects on your mobile phone and approve all of them on the spot?
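To put rough numbers on that last question, here is some back-of-the-envelope arithmetic. It is a sketch, not a measurement: it reads the question as 20 projects with 20 pages each, and the 3 minutes per page, the 8 available hours, and the function names are all made-up assumptions.

```python
def review_hours(changes: int, pages_per_change: float, minutes_per_page: float) -> float:
    """Hours of human review needed for a batch of agent-produced changes."""
    return changes * pages_per_change * minutes_per_page / 60


def backlog_hours(needed: float, available: float) -> float:
    """Review hours that do not fit into the time the reviewer actually has."""
    return max(0.0, needed - available)


# Hypothetical inputs: 20 changes of 20 pages each, 3 minutes of attention per page.
today = review_hours(changes=20, pages_per_change=20, minutes_per_page=3)
# The same reviewer after output grows tenfold.
tenfold = review_hours(changes=200, pages_per_change=20, minutes_per_page=3)

print(today, backlog_hours(today, available=8))      # 20.0 12.0
print(tenfold, backlog_hours(tenfold, available=8))  # 200.0 192.0
```

Under these assumptions the review time scales linearly with output while the reviewer’s day stays the same length, so the unreviewed backlog grows from 12 hours to 192, faster than the output itself.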
You Already Need to Know 100 Things
I feel this shift personally, and I feel it constantly. Building VEKTOR as a solo developer means I am a developer, a product manager, a security engineer, a DevOps engineer, a content writer, a growth person, a customer support function, and a business owner, all at once. AI has made each of those roles individually more accessible, even feasible. Through delegation, it has also made it technically possible to run all of them simultaneously in a way that was not realistic before. Going back in time, I remember we had 5 systems at work: an Oracle Unix green screen (it never crashed once), which was fast but took mental repetition to learn; one database; Outlook; the intranet. Then Salesforce came along, with 20 other apps bolted on. The result is not fewer tasks. It is more complicated work, spread across more domains and more systems, with APIs and MFA/2FA logins, and higher stakes at each one. Even humans can’t work this captcha out; agentic bots are going to need a standardized system to traverse the internet without getting blocked. And yes, the biggest brains are working on this problem right now, solving multiple agentic bot layers with credentialed passports. Whoever thought of this captcha idea above needs to be spanked immediately.
This week I was mid-session debugging a certbot renewal failure on the VPS when it became clear the issue was a credentials format mismatch between an old apt-installed certbot version and a Cloudflare API token that expected a newer format. The fix required understanding the snap package ecosystem, the certbot renewal hook architecture, and the Cloudflare API token permission model, all at the same time. Claude handled it flawlessly, logging into the VPS via the Vektor Cloak SSH tool.
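For anyone hitting something similar, here is a rough sketch of the kind of check involved. It is not the actual fix from that session. It assumes the mismatch was between the two credential styles the certbot-dns-cloudflare plugin documents: the legacy global key (`dns_cloudflare_email` plus `dns_cloudflare_api_key`) and the scoped token (`dns_cloudflare_api_token`), which older packaged plugin builds may not understand. The function names and sample values are hypothetical.

```python
from __future__ import annotations


def parse_credentials(text: str) -> dict[str, str]:
    """Parse the flat `key = value` format used by certbot DNS credential files."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not a key/value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def credential_style(text: str) -> str:
    """Classify a Cloudflare credentials file as api_token, global_key, mixed, or unknown."""
    keys = set(parse_credentials(text))
    has_token = "dns_cloudflare_api_token" in keys
    has_global = {"dns_cloudflare_email", "dns_cloudflare_api_key"} <= keys
    if has_token and has_global:
        return "mixed"
    if has_token:
        return "api_token"
    if has_global:
        return "global_key"
    return "unknown"


# Placeholder values only; never commit real credentials.
token_file = "# scoped token\ndns_cloudflare_api_token = PLACEHOLDER_TOKEN\n"
legacy_file = "dns_cloudflare_email = me@example.com\ndns_cloudflare_api_key = PLACEHOLDER_KEY\n"

print(credential_style(token_file))   # api_token
print(credential_style(legacy_file))  # global_key
```

If the file turns out to be in the token style while the installed plugin predates token support, the usual options are upgrading certbot and the plugin (for example via snap) or switching the credentials back to the legacy style.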