MCP is dead?


Chloe Kim, Backend Engineer @ Quandri

TL;DR: MCP eats context, has low reliability, and overlaps with existing CLI/API.

💡 Reference: MCP is dead. Long live the CLI

After reading the article above, we ran the same experiments on our own stack. This document covers the original argument, additional research, and our measurements.

📌 Update: Since these measurements were taken, Claude Code has rolled out Tool Search with Deferred Loading, which loads MCP tool schemas on demand and reduces context usage by 85%+. The context bloat described in Problem 1 is largely addressed for users on current Claude Code versions. The performance, debugging, and architectural arguments below still apply.


What’s Wrong with MCP

MCP (Model Context Protocol) connects LLMs to external tools (GitHub, Linear, Notion, Slack, etc.). Since its launch in late 2024, it’s been called “the USB-C of the AI ecosystem.” But developers who use it day to day are starting to think differently.

Problem 1: It Devours the Context Window

The context window is the LLM’s desk. When you connect MCP servers, tool definitions alone take up a significant chunk of that desk.

Restaurant analogy:

  • You sit down and 10 menus (MCP tool definitions) are spread across the table.
  • There’s no room left for actual food (your work).
  • Every time you order, the menus have to be pulled out again.

We extracted and measured the actual tool definitions from the MCP servers connected in our environment. With all 4 servers connected, tool definitions alone consume 10.5% of a 200K-token context window.
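As a rough illustration of how such a measurement could be taken, the sketch below serializes tool definitions and counts characters. It is a minimal sketch, not our actual extraction script: `SAMPLE_TOOLS` is a hypothetical two-tool payload shaped like an MCP `tools/list` result (`name`, `description`, `inputSchema`), and the 4-characters-per-token figure is a heuristic, not a real tokenizer.

```python
import json

# Hypothetical sample shaped like an MCP `tools/list` result.
# Real servers return many more tools with longer descriptions.
SAMPLE_TOOLS = [
    {
        "name": "get_issue",
        "description": "Fetch a single issue by ID.",
        "inputSchema": {
            "type": "object",
            "properties": {"id": {"type": "string"}},
            "required": ["id"],
        },
    },
    {
        "name": "save_issue",
        "description": "Create or update an issue.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "id": {"type": "string"},
                "title": {"type": "string"},
            },
            "required": ["title"],
        },
    },
]

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def definition_size(tools):
    """Return tool count, serialized character count, and estimated tokens."""
    chars = sum(len(json.dumps(tool, separators=(",", ":"))) for tool in tools)
    return {
        "tools": len(tools),
        "chars": chars,
        "tokens": chars // CHARS_PER_TOKEN,
    }


size = definition_size(SAMPLE_TOOLS)
print(size)
```

Exact counts depend on how the client serializes the schemas and on the model's tokenizer, so numbers produced this way are estimates.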

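Measurement: Tool Definition Sizes (Quandri Stack)

| MCP Server | Tools | Estimated Chars | Estimated Tokens |
|------------|-------|-----------------|------------------|
| Linear     | 42    | ~51,229         | ~12,807          |
| Notion     | 14    | ~16,156         | ~4,039           |
| Slack      | 12    | ~15,168         | ~3,792           |
| Postgres   | 9     | ~1,755          | ~438             |
| Total      | 77    | ~84,308         | ~21,077          |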

Context Window Usage (all servers combined)

| Model          | Context Window | Usage by Tool Definitions |
|----------------|----------------|---------------------------|
| Claude (200K)  | 200,000 tokens | 10.5%                     |
| GPT-4o (128K)  | 128,000 tokens | 16.5%                     |

Linear alone accounts for over 12,800 tokens. That’s 42 tool definitions always loaded, even if you only ever use get_issue and save_issue.
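The figures above can be re-derived with a few lines of arithmetic. The sketch below assumes tokens were estimated at roughly 4 characters per token; that ratio is inferred from the table (each token count is the character count divided by 4), not stated in the measurements themselves.

```python
# Character counts per server, taken from the measurement table.
CHARS = {"Linear": 51_229, "Notion": 16_156, "Slack": 15_168, "Postgres": 1_755}

CHARS_PER_TOKEN = 4  # assumption inferred from the table, not a real tokenizer


def estimate_tokens(chars):
    """Estimate tokens from characters using the 4-chars-per-token heuristic."""
    return chars // CHARS_PER_TOKEN


def context_share(tokens, window):
    """Percentage of a context window taken up by `tokens`, to one decimal."""
    return round(100 * tokens / window, 1)


total_tokens = estimate_tokens(sum(CHARS.values()))

print(total_tokens)
print(context_share(total_tokens, 200_000))  # Claude (200K)
print(context_share(total_tokens, 128_000))  # GPT-4o (128K)
```

The per-server estimates sum to one token fewer than the total because each row is rounded down separately.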


Problem 2: Low Operational Reliability

  • Init failure, repeated re-auth: Requires starting and maintaining a separate process.
  • Slower AI responses: External server round-trip on every tool call.
  • Mid-session tool death: MCP server process crashes.
  • Opaque permissions: Unclear what permissions each tool actually has.

Performance is a known issue. The author of the original article benchmarked Jira MCP against direct calls to its REST API and found MCP was 3x slower per call, and 9.4x slower on the first call including initialization. This isn’t Jira-specific; it’s architectural: every MCP server adds a process layer between the LLM and the underlying API. The same overhead applies to the Linear, Notion, and Slack servers in our stack.
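A comparison like this separates two numbers: the first call including initialization, and the steady-state cost per call. The harness below is a minimal sketch of that split, not the original author's benchmark. `setup` and `call` are hypothetical callables you would supply for each path (for example, one pair that starts an MCP client and invokes a tool, and one pair that calls the REST API directly).

```python
import time
from statistics import mean


def benchmark(setup, call, runs=5, clock=time.perf_counter):
    """Time the first call (including setup) and the average later call.

    setup: hypothetical callable returning a ready client or session.
    call:  hypothetical callable performing one request with that client.
    """
    start = clock()
    client = setup()
    call(client)
    first_call = clock() - start

    steady = []
    for _ in range(runs):
        t0 = clock()
        call(client)
        steady.append(clock() - t0)

    return {"first_call_s": first_call, "steady_call_s": mean(steady)}


def slowdown(candidate, baseline):
    """How many times slower `candidate` is than `baseline`, per metric."""
    return {key: candidate[key] / baseline[key] for key in baseline}
```

Running `benchmark` once per path and passing both results to `slowdown` yields ratios comparable to the 3x and 9.4x figures quoted above, though the actual values depend on the server, network, and machine.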


Problem 3: Overlaps with Existing CLI/API

| Aspect               | CLI / API                                      | MCP                                           |
|----------------------|------------------------------------------------|-----------------------------------------------|
| Human-machine parity | Same commands for humans and LLMs              | Only exists inside LLM conversations          |
| Composability        | Pipes, jq, grep freely combinable              | Locked to server return format                |
| Debugging            | Reproduce immediately in terminal              | Only reproducible inside conversation context |
| Training data        | Already learned from man pages, StackOverflow  | Requires separate tool definitions            |
| Install cost         | Mostly already installed                       | Server setup, auth, process management needed |

Token Comparison: MCP vs CLI for Linear Issue Lookup

How many tokens does it cost to look up the same Linear issue? MCP consumes ~65x more tokens than the CLI approach.

  • [ CLI approach: ~200 tokens ]
      curl -s -H "Authorization: Bearer $LINEAR_TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"query":"{ issue(id: \"ISSUE-ID\") { title state { name } assignee { name } } }"}' \
        https://api.linear.app/graphql
    • Prompt (curl command): ~50 tokens
    • Response: ~150 tokens
  • [ MCP approach: ~12,957 tokens ]
    • Tool definitions (always loaded): ~12,807 tokens (42 tools)
    • Tool call + response: ~150 tokens
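The ~65x figure follows directly from the token counts listed above. A short check, using only those numbers:

```python
# Token counts taken from the comparison above.
cli = {"prompt": 50, "response": 150}
mcp = {"tool_definitions": 12_807, "call_and_response": 150}

cli_total = sum(cli.values())
mcp_total = sum(mcp.values())

# How many times more tokens the MCP path consumes for the same lookup.
ratio = mcp_total / cli_total

# Share of the MCP total that is tool definitions rather than the call itself.
definition_share = 100 * mcp["tool_definitions"] / mcp_total

print(cli_total, mcp_total, round(ratio), round(definition_share, 1))
```

Almost all of the MCP cost is the always-loaded Linear tool definitions; the call and response themselves cost about the same on both paths.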

What Are the Alternatives?

Alternative 1: CLI-First Strategy

Offer the LLM a CLI first, then an API, then docs, in that order. LLMs have already learned common command-line tools from man pages and StackOverflow.

  • Using existing CLI directly:
    • No context wasted on tool definitions.
    • Same interface for humans and AI, easy to debug.
    • Freely composable with pipelines.
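To illustrate the composability point: because a CLI or API call returns plain JSON, a small filter can trim the response before it reaches the model, much as `jq` would in a shell pipeline. The sketch below assumes the response follows the standard GraphQL shape for the issue query shown earlier (`data.issue` with `title`, `state.name`, `assignee.name`); `SAMPLE_RESPONSE` is invented for illustration and is not real Linear output.

```python
import json

# Hypothetical response for the issue query; values are made up.
SAMPLE_RESPONSE = json.dumps(
    {
        "data": {
            "issue": {
                "title": "Fix login redirect",
                "state": {"name": "In Progress"},
                "assignee": {"name": "Dana"},
            }
        }
    }
)


def summarize_issue(response_text):
    """Reduce a GraphQL issue response to a single human-readable line."""
    issue = json.loads(response_text)["data"]["issue"]
    assignee = issue.get("assignee") or {}  # assignee may be null
    name = assignee.get("name", "unassigned")
    return f'{issue["title"]} [{issue["state"]["name"]}] {name}'


print(summarize_issue(SAMPLE_RESPONSE))
```

In a pipeline, the same function could read the curl output from standard input (`sys.stdin.read()`), so a human and an LLM would run exactly the same command and see the same result.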

Alternative 2: Skills Pattern

If MCP is “spreading all menus on the table upfront”, Skills is “asking the librarian for only the book you need”.

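| Aspect              | MCP                                    | Skills                          |
|---------------------|----------------------------------------|---------------------------------|
| Loading time        | All tool definitions loaded on connect | Only loaded when needed         |
| Context consumption | Always occupied                        | Only when in use                |
| Scalability         | Context pressure grows with each server | Not proportional to skill count |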

The key is embedding CLI usage instructions inside Skills. Combined with Alternative 1’s CLI-first strategy, this is the most efficient approach.
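The load-on-demand idea can be sketched as a small registry. This is an illustrative model only, not how any particular agent implements Skills: `Skill` and `SkillRegistry` are hypothetical names, and the sketch assumes a one-line summary per skill stays resident so the model knows what is available, while the full CLI instructions enter the context only when a skill is loaded.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    summary: str       # short line that is always visible
    instructions: str  # full CLI usage notes, loaded only on demand


@dataclass
class SkillRegistry:
    skills: dict = field(default_factory=dict)
    loaded: set = field(default_factory=set)

    def register(self, skill):
        self.skills[skill.name] = skill

    def load(self, name):
        """Bring one skill's full instructions into the context."""
        instructions = self.skills[name].instructions
        self.loaded.add(name)
        return instructions

    def context_chars(self):
        """Characters currently occupying the context for all skills."""
        total = 0
        for skill in self.skills.values():
            total += len(skill.name) + len(skill.summary)
            if skill.name in self.loaded:
                total += len(skill.instructions)
        return total


registry = SkillRegistry()
registry.register(
    Skill(
        name="linear-issue-lookup",
        summary="Look up a Linear issue with curl.",
        instructions="POST a GraphQL issue query to https://api.linear.app/graphql "
        "with an Authorization header, then keep only title, state, and assignee.",
    )
)

before = registry.context_chars()
registry.load("linear-issue-lookup")
after = registry.context_chars()
print(before, after)
```

Under this model, adding more skills grows the resident context only by their short summaries, while the cost of detailed instructions is paid per skill actually used.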

So Is MCP Really Dead?

Not entirely. MCP is still valid when:

  • No CLI exists for the service - web-only SaaS where MCP may be the only connection method.
  • Non-developer users - MCP is more accessible for those who don’t use terminals.
  • Real-time bidirectional communication - scenarios beyond simple request-response.