Elon, stop trying to make Grok happen
There is a harsh truth about Elon Musk’s “truth-seeking” AI chatbot Grok: It’s not very good, and not many people are using it. That’s the takeaway of a new Reuters report, which found that Grok barely appears in federal records of how the US government used AI last year. It’s not the only sign xAI’s signature chatbot is in trouble, even as Musk puts it at the heart of what could be the biggest IPO in history.
Reuters reviewed more than 400 examples of government AI use where specific vendors were named. Grok or xAI, it found, appeared in only three — each of those for basic uses like document drafting or social media management, and always alongside competitors like Microsoft and OpenAI. OpenAI’s models, by comparison, appeared in more than 230 examples, while Google and Anthropic each appeared dozens of times.
A similar pattern appeared in another database of more ambitious government AI projects with smaller numbers of users. Grok appeared just three times: twice for routine administrative tasks at the Election Assistance Commission, and once in a Department of Energy pilot at Lawrence Livermore National Laboratory for document summaries and general research. Reuters found 140 entries involving Microsoft and OpenAI, while my brief review found at least 10 entries for Anthropic and dozens for Google’s Gemini.
The lists are an incomplete and patchy measure of government adoption. Many more examples are listed without a specific vendor, and it’s clear there is no universal definition of what counts as AI. The data also doesn’t capture intelligence agencies or the Pentagon — where xAI secured a $200 million contract last year and was recently cleared to operate on classified networks after Anthropic’s blacklisting.
Still, it’s not looking good for Grok. It shows up far less than its rivals, and when it does show up, it’s mostly for basic admin work — hardly befitting the world-class frontier model Musk has spent years bragging about.
People who spoke to Reuters suggested the explanation was simple: Grok isn’t as good as its rivals. It’s “just not the best model out there,” an unnamed Pentagon source said, adding that staffers there tend to prefer Gemini or Claude. Public leaderboards ranking AI models lend weight to that view. Anthropic, Google, and OpenAI dominate the top ranks, while Grok rarely cracks the top 10 outside the occasional image or video category.
That’s awkward for Musk, and even more awkward for SpaceX, which absorbed xAI earlier this year. The rocket venture’s IPO filing shows the company has put AI — and Grok specifically — at the heart of its pitch to investors. SpaceX claims to have identified “the largest actionable total addressable market in human history”: an astonishing $28.5 trillion opportunity, though, sadly, it offers no timetable for getting there. Practically all of this estimated value comes from AI, enterprise AI in particular, not rockets or satellites.
Reuters notes that Grok’s performance in government agencies could hint at how well it does in other workplaces, too. As part of xAI’s push for enterprise customers, Musk has reportedly strong-armed banks into buying Grok subscriptions if they wish to participate in SpaceX’s IPO — but if they’re not getting their money’s worth, these deals could prove a short-term fix.
As if its dreary performance wasn’t awkward enough, Musk recently admitted that xAI has used OpenAI’s models to help train and improve Grok. The process, known as distillation, is standard when companies are using their own models, but far more contentious when it involves using a rival’s system. Grok can’t even beat the models it’s training on.
In its public-facing consumer version, Grok is deliberately unpleasant. Musk has branded the chatbot a less biased and less censored alternative to tools like ChatGPT, but that’s translated into a product with loose evidentiary standards, an unhealthy obsession with Musk, and a long track record of offensive, conspiratorial, and sexualized outputs. Even if workplace guardrails are different, it may not be the kind of thing a business would welcome. Grok’s illustrious record includes praising Adolf Hitler, casting doubt on Holocaust death tolls, plastering millions of nonconsensual sexualized deepfakes all over X, including ones of children, and powering both a racist and transphobic Wikipedia knockoff and a spicy anime girlfriend. And let us not forget the time it called itself “MechaHitler.” If Grok were a human employee, I feel HR would not take long to get involved.
SpaceX appears to understand the problem. In its filing, the company warned Grok’s “spicy” or “unhinged” modes carry “heightened risks,” including reputational damage, regulatory scrutiny, and lawsuits. In corporate speak: This chatbot is going to get us sued.
Grok takes its name from Robert A. Heinlein’s Stranger in a Strange Land, where it roughly means a deep and profound understanding of something. The thing to understand here is not particularly complex: Musk has spent billions building a chatbot that is not very good, not very popular, and somehow key to justifying SpaceX’s astronomical valuation. Good luck with that.