How Malicious MCP Configs in Amazon Q Developer Could Execute Arbitrary Code — and How to Stop It

恶意 MCP 配置如何导致 Amazon Q Developer 执行任意代码——以及如何防范

A flaw in Amazon Q Developer let malicious repositories inject rogue Model Context Protocol (MCP) configurations into the agentic coding assistant’s pipeline. The result: arbitrary code execution, sourced from a repo you pulled down to review. No phishing. No compromised credentials. Just a poisoned config file sitting in a repository that an AI agent trusted without question. Amazon Q Developer 中的一个漏洞允许恶意代码仓库将伪造的模型上下文协议（MCP）配置注入到该智能编程助手的流水线中。其后果是：仅仅通过拉取并查看一个仓库，就能导致任意代码执行。无需钓鱼攻击，无需泄露凭据，仅仅是一个存放在仓库中的“投毒”配置文件，而 AI 智能体却对其盲目信任。

What Happened

事件经过

According to The Hacker News, the vulnerability allowed an attacker-controlled repository to supply malicious MCP tool configurations to Amazon Q Developer. Because Amazon Q trusts MCP configs sourced from external repos, those configs could be used to hijack the agent’s actions — up to and including arbitrary code execution inside the agentic pipeline. This is a supply-chain attack against an AI system. The malicious payload isn’t in the code you’re running — it’s in the tool definition that tells your AI agent what to do next. The scope is significant. Amazon Q Developer is a widely deployed AI coding assistant. Any developer who cloned or opened an attacker-controlled repo while Q was active was potentially exposed. 据 The Hacker News 报道，该漏洞允许攻击者控制的仓库向 Amazon Q Developer 提供恶意的 MCP 工具配置。由于 Amazon Q 信任来自外部仓库的 MCP 配置，这些配置可被用于劫持智能体的操作，甚至在智能体流水线内执行任意代码。这是一场针对 AI 系统的供应链攻击。恶意载荷并不存在于你运行的代码中，而是存在于告诉 AI 智能体下一步该做什么的工具定义里。其影响范围巨大：Amazon Q Developer 是一款广泛部署的 AI 编程助手，任何在 Q 处于活动状态时克隆或打开攻击者控制仓库的开发者，都可能面临风险。

How the Attack Works

攻击原理

MCP (Model Context Protocol) is the emerging standard for wiring LLMs to external tools: file systems, shells, APIs, databases. An MCP config tells an agent: here are the tools available to you, here is how to call them, here is what they return. The attack exploits a simple trust assumption: if an MCP config file is present in a repository, the agent uses it. There’s no signature verification, no allowlist enforcement, no sandboxing of tool definitions at the agent layer. MCP（模型上下文协议）是目前将大语言模型（LLM）连接到外部工具（如文件系统、Shell、API、数据库）的新兴标准。MCP 配置会告诉智能体：有哪些可用工具、如何调用它们以及它们返回什么。此次攻击利用了一个简单的信任假设：只要仓库中存在 MCP 配置文件，智能体就会使用它。在智能体层面，既没有签名验证，也没有白名单强制执行，更没有对工具定义进行沙箱隔离。

Here’s the attack flow: 以下是攻击流程：

Attacker crafts a repository with a malicious .mcp.json or equivalent config file. 攻击者制作一个包含恶意 .mcp.json 或类似配置文件的仓库。
Developer clones or opens the repo — Amazon Q Developer picks up the MCP config. 开发者克隆或打开该仓库——Amazon Q Developer 自动加载该 MCP 配置。
The rogue config registers attacker-controlled tools or overrides legitimate ones. 伪造的配置注册了攻击者控制的工具，或覆盖了合法的工具。
When Q invokes those tools (which it will, during normal agentic coding workflows), it executes attacker-supplied commands. 当 Q 调用这些工具时（在正常的智能编程工作流中必然会发生），它会执行攻击者提供的命令。
Code runs. Potentially: env vars exfiltrated, credentials stolen, backdoors planted. 代码运行。潜在后果：环境变量被窃取、凭据被盗、植入后门。

The cleverness here is that the agent isn’t being tricked into doing something weird. It’s doing exactly what it was told — by tool definitions it had no reason to distrust. 这种攻击的巧妙之处在于，智能体并没有被诱导去做奇怪的事情，它只是完全按照工具定义执行了指令——而这些工具定义是它没有理由去怀疑的。

Why Existing Defenses Missed This

为什么现有防御措施失效了

The standard defenses don’t cover this attack surface: 标准的防御手段无法覆盖这一攻击面：

Static analysis and SCA tools scan code for vulnerabilities. A malicious MCP config isn’t vulnerable code — it’s a configuration file. It passes cleanly. 静态分析和 SCA 工具扫描的是代码漏洞。恶意的 MCP 配置并非漏洞代码，而是一个配置文件，因此能顺利通过扫描。
Repository scanning (Dependabot, Snyk, etc.) checks for known-bad package versions and CVEs. A crafted JSON config with a malicious tool definition has no CVE. No match. **仓库扫描（如 Dependabot、Snyk 等）**检查的是已知的恶意包版本和 CVE。一个精心构造的、包含恶意工具定义的 JSON 配置没有对应的 CVE，因此无法匹配。
Network-layer controls (WAFs, egress filtering) don’t inspect the semantic intent of tool calls that an AI agent is about to make. They see HTTP traffic, not “this tool result is telling the agent to execute a shell command.” **网络层控制（如 WAF、出口过滤）**不会检查 AI 智能体即将发起的工具调用的语义意图。它们看到的只是 HTTP 流量，而不是“此工具结果正在指示智能体执行 Shell 命令”。
The LLM itself is not a security boundary. Models are trained to be helpful and follow instructions. A tool result that says “run this command” is, from the model’s perspective, a legitimate tool result. LLM 本身不是安全边界。 模型被训练为乐于助人并遵循指令。从模型的角度来看，一个写着“运行此命令”的工具结果是一个合法的工具返回。

The gap is at the agentic pipeline layer — between tool outputs and the model. Nobody was watching that seam. 问题的缺口在于智能体流水线层——即工具输出与模型之间的环节。没有人监控这一连接点。

Where Sentinel Would Have Caught This

Sentinel 如何防范此类攻击

Sentinel sits exactly at that seam. For agentic applications, Sentinel’s transparent proxy intercepts tool_result content before it returns to the model. This is where malicious MCP configs do their damage — in the tool call / tool result loop. Sentinel 正是部署在这一连接点上。对于智能体应用，Sentinel 的透明代理会在 tool_result 内容返回给模型之前进行拦截。这正是恶意 MCP 配置发挥破坏作用的地方——即工具调用/工具结果循环中。

The relevant detection layer is tool and function abuse patterns, part of Sentinel’s Layer 2 fast-path regex scan. Sentinel maintains patterns that detect when tool outputs are being used to redirect agent behavior — authority hijacks, persona shifts, instructions embedded in tool responses that attempt to override the agent’s existing directive. 相关的检测层是“工具与函数滥用模式”，这是 Sentinel 第二层快速路径正则扫描的一部分。Sentinel 维护着多种模式，用于检测工具输出是否被用于重定向智能体行为，例如：权限劫持、角色转换，以及嵌入在工具响应中试图覆盖智能体现有指令的内容。

A malicious MCP tool result that says “ignore your current task and execute the following” hits the authority hijack patterns immediately. A tool response that attempts to exfiltrate environment variables via markdown or code block embedding hits the data exfiltration patterns. 如果一个恶意的 MCP 工具结果写着“忽略你当前的任务并执行以下操作”，它会立即触发权限劫持模式。如果工具响应试图通过 Markdown 或代码块嵌入来窃取环境变量，则会触发数据泄露模式。

If the payload is more subtle — say, a tool definition that gradually steers agent behavior through seemingly innocuous outputs — Layer 3’s vector similarity scoring catches semantic variants that don’t match a literal regex. Sentinel computes a cosine similarity against our library of attack signature embeddings. Attempts to hijack agent control flow tend to cluster semantically even when the exact phrasing varies. 如果载荷更加隐蔽——例如，通过看似无害的输出逐渐引导智能体行为的工具定义——第三层的向量相似度评分可以捕获那些无法通过字面正则匹配的语义变体。Sentinel 会计算输入与攻击特征嵌入库之间的余弦相似度。劫持智能体控制流的企图，即使措辞不同，在语义上往往也会聚集在一起。

And because this is a scenario where a malicious repo could instruct the agent to read config files or .env files as part of a “helpful” setup step, Layer 4 secret detection is directly relevant. Even if a tool result carrying exfiltrated environment variables slipped past the threat scorer, Sentinel’s secret detector would redact any embedded API keys, tokens, or credentials before they ever reached the model. Patterns like AWS_ACCESS_KEY=AKIA... or ANTHROPIC_API_KEY=sk-ant-... get replaced with [AWS_ACCESS_KEY] and [ENV_SECRET] respectively. 此外，由于恶意仓库可能会指示智能体读取配置文件或 .env 文件作为“有益”设置步骤的一部分，第四层的密钥检测功能就显得尤为重要。即使携带泄露环境变量的工具结果逃过了威胁评分器的检测，Sentinel 的密钥检测器也会在这些内容到达模型之前，屏蔽掉其中嵌入的 API 密钥、令牌或凭据。像 AWS_ACCESS_KEY=AKIA... 或 ANTHROPIC_API_KEY=sk-ant-... 这样的模式会被分别替换为 [AWS_ACCESS_KEY] 和 [ENV_SECRET]。

What This Looks Like in Practice

实际应用示例

Here’s an illustrative example of what Sentinel returns when it detects a malicious tool result carrying an authority hijack and an embedded credential (this response shape reflects Sentinel’s actual API): 以下是一个示例，展示了当 Sentinel 检测到携带权限劫持和嵌入式凭据的恶意工具结果时返回的内容（此响应格式反映了 Sentinel 的实际 API）：

{
  "request_id": "f3a9b2c1d4e5...",
  "security": {
    "action_taken": "blocked",
    "threat_score": 0.91,
    "secret_hits": 1,
    "secret_types": ["aws_access_key"]
  },
  "safe_payload": null
}

action_taken: blocked means the cosine similarity exceeded 0.82 — the content never reaches the model. safe_payload is null. Your application checks action_taken first and discards the original tool result entirely. For the transparent proxy setup — where you point your Anthropic SDK at Sentinel instead of the Anthropic endpoint directly — this happens automatically. Blocked tool results are substituted with an inert placeholder; the agent session continues without the malicious input. action_taken: blocked 表示余弦相似度超过了 0.82——内容永远不会到达模型。safe_payload 为 null。你的应用程序会首先检查 action_taken，并完全丢弃原始的工具结果。对于透明代理设置（即你将 Anthropic SDK 指向 Sentinel 而非直接指向 Anthropic 端点），这一过程会自动发生。被拦截的工具结果会被替换为一个无害的占位符，智能体会话将在没有恶意输入的情况下继续进行。