Block-Reason Headers: Make Your Security Proxy Tell You Why

When a security proxy blocks an agent’s request, the agent sees a 4xx and has to guess what happened. Was the destination wrong? The body? A header? Did the proxy time out? Did the proxy itself crash? Without context, every block looks the same, and the agent burns its retry budget on a single attempt’s worth of information.

X-Pipelock-Block-Reason is the header Pipelock emits on every block path so the agent knows why it was blocked. The vocabulary is small, the format is an open spec, and the impact on operator debugging is large. This post is about the design, the schema, and why making a security proxy explain itself is good for the security posture, not bad for it.

The problem the header solves

A coding agent runs a tool that fetches a URL, parses the response, and feeds the output back to the model. The fetch goes through Pipelock. Pipelock decides the response contains a prompt-injection pattern and returns 403 with no body. The agent has no idea what happened.

From the agent’s perspective: The host could be unreachable. The proxy could be misconfigured. The proxy could be down. The destination could be returning 403 itself. The agent’s request could have failed scanning. The response to the agent could have failed scanning. Each of these calls for a different reaction from the agent.

“Host unreachable” might mean “try a different host.” “Proxy misconfigured” might mean “tell the operator.” “Scanning blocked the request” might mean “do not retry this exact body.” Without a signal, the agent treats them all the same way: retry, hit the same block, retry again, eventually give up.

The operator’s view is no better. The audit log records the block, but correlating an agent’s confused retry sequence with the proxy’s decision tree means cross-referencing two log streams by timestamp and request ID. For one block in a quiet period, fine. For a fleet generating thousands of requests an hour, painful.

A structured block reason on the response solves both sides. The agent knows what happened. The operator does not have to grep two logs to figure out what the agent saw. This is the operator-facing half of enforcement. Politeness vs Enforcement explains how to make the kernel refuse bypasses; block-reason headers explain what the agent should do after the refusal.

The header schema

The full schema lives at docs/specs/block-reason-header.md in the Pipelock repo. The shape, in one paragraph:

The primary header is X-Pipelock-Block-Reason: <reason>, with companion headers for version, severity, retry, and the layer that fired:

X-Pipelock-Block-Reason: dlp_match
X-Pipelock-Block-Reason-Version: 1
X-Pipelock-Block-Reason-Severity: critical
X-Pipelock-Block-Reason-Retry: none
X-Pipelock-Block-Reason-Layer: dlp
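
On the agent side, the five headers above can be collected into one small record before anything else looks at them. The following is a minimal sketch, not Pipelock's own client code: the BlockReason class and the parse_block_reason function are hypothetical names, and the sketch assumes the HTTP library hands over response headers as a plain name-to-value mapping.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Header names as shown in the post, lowercased for case-insensitive lookup.
PREFIX = "x-pipelock-block-reason"


@dataclass(frozen=True)
class BlockReason:
    """Hypothetical container for the block-reason header family."""
    reason: str
    version: Optional[str] = None
    severity: Optional[str] = None
    retry: Optional[str] = None
    layer: Optional[str] = None


def parse_block_reason(headers: Mapping[str, str]) -> Optional[BlockReason]:
    """Return a BlockReason if the response carries the header, else None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    reason = lowered.get(PREFIX)
    if not reason:
        return None
    return BlockReason(
        reason=reason,
        version=lowered.get(PREFIX + "-version"),
        severity=lowered.get(PREFIX + "-severity"),
        retry=lowered.get(PREFIX + "-retry"),
        layer=lowered.get(PREFIX + "-layer"),
    )
```

A 403 for which parse_block_reason returns None did not come with a block reason, which is itself a useful signal: the 403 may have originated at the destination instead of the proxy.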

X-Pipelock-Block-Reason-Receipt is reserved in v2.4: the schema and the WithReceipt validator ship in this release, but production block paths leave the value unset until the receipt-pointer wiring lands. When populated, the value will be a 26-character Crockford-base32 ULID.
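
A client that wants to be ready for the receipt value can check its shape without depending on any Pipelock code. The sketch below is not the WithReceipt validator; looks_like_ulid is a hypothetical helper that tests only the format described above, and it assumes the canonical uppercase form.

```python
import re

# Crockford base32 omits the letters I, L, O, and U. A ULID packs 128 bits
# into 26 characters of 5 bits each, so the first character can only be 0-7.
_ULID_RE = re.compile(r"[0-7][0-9A-HJKMNP-TV-Z]{25}")


def looks_like_ulid(value: str) -> bool:
    """Format check only: 26 uppercase Crockford-base32 characters."""
    return _ULID_RE.fullmatch(value) is not None
```

Because production block paths leave the header unset for now, callers should treat a missing or empty value as normal and not as an error.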

The reason vocabulary is closed. Examples by category include:

  • Egress: ssrf_private_ip, ssrf_metadata, ssrf_dns_rebind, domain_blocklist, scheme_blocked, subdomain_entropy, url_length, path_entropy, rate_limit, data_budget.
  • Content: dlp_match, prompt_injection, redaction_failure, media_policy.
  • MCP: tool_policy_deny, tool_poisoning, tool_chain_blocked, session_binding.
  • Posture: airlock_active, kill_switch_active, envelope_verify_failed, outbound_envelope_failed, redirect_scan_denied, authority_mismatch, session_anomaly, cross_request_deny, compressed_response, browser_shield_oversize.
  • Contract: contract_default_deny, contract_enforce_default, contract_non_default_port, contract_invalid_path, contract_observed_only.
  • Generic: parse_error, timeout, bad_request, pattern_unavailable, not_enabled, block_reason_overflow.

The full canonical list lives at docs/specs/block-reason-header.md in the Pipelock repo and in internal/blockreason/blockreason.go. A block can have at most one reason code. The code is the primary signal. Severity and retry are advisory: severity tells the agent how loud to be when logging the block, retry tells the agent whether retrying with the same payload could ever succeed.

The same reason vocabulary is used for WebSocket close frames. MCP stdio does not have an HTTP header surface, so stdio blocks flow through the JSON-RPC error envelope instead.

Why the schema is small

Two design choices kept the vocabulary small:

  1. No free-form reason strings. A free-form string would let the proxy tell the agent things like “request body contained AKIA…EXAMPLE at offset 1024.” That is too useful for an attacker who controls part of the request. The agent learns exactly what the scanner saw and can adjust the next attempt to avoid the match. Closed vocabularies do not leak that detail.

  2. No retry-after seconds. A retry hint with timing would let the agent build a retry policy that matches whatever the proxy is rate-limiting. The hint is categorical: transient says “retrying might work because the cause is not your request,” none says “this exact request will never work,” and policy says “this might work only after an operator changes policy.”

Both choices trade specificity for safety. The agent gets enough signal to react sensibly without learning how to evade.
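
A categorical hint maps naturally onto a three-way decision on the agent side. The following sketch shows one way to wire it up; the Action enum and action_for are hypothetical names, and treating a missing or unrecognized hint like none is an assumption made here for safety, not something the spec is known to require.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    RETRY_LATER = "retry_later"  # cause is not the request; try again later
    GIVE_UP = "give_up"          # this exact request will never work
    ESCALATE = "escalate"        # only an operator policy change can help


def action_for(retry_hint: Optional[str]) -> Action:
    """Map the X-Pipelock-Block-Reason-Retry value to an agent action."""
    hint = (retry_hint or "").strip().lower()
    if hint == "transient":
        return Action.RETRY_LATER
    if hint == "policy":
        return Action.ESCALATE
    # "none", plus anything missing or unrecognized (assumption: fail closed
    # instead of spending retries on a request that cannot succeed).
    return Action.GIVE_UP
```

Since the header carries no timing, an agent that gets RETRY_LATER has to pick its own backoff schedule.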

What changes for operators

The operator’s experience changes from “decode the audit log against the agent’s trace” to “read the block reason on the agent’s HTTP response.”

Two examples:

A coding agent’s CI pipeline started failing on a fetch tool. The pipeline log shows:

fetch tool: HTTP 403, body empty
agent: retrying (1/3)
fetch tool: HTTP 403, body empty
agent: retrying (2/3)
fetch tool: HTTP 403, body empty
agent: failed after 3 retries

With the header on, the pipeline log includes:

fetch tool: HTTP 403, X-Pipelock-Block-Reason: ssrf_private_ip
fetch tool: X-Pipelock-Block-Reason-Severity: critical
fetch tool: X-Pipelock-Block-Reason-Retry: none
agent: not retrying (non-retryable block)
agent: surfacing block
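
The difference between the two logs comes down to one check inside the retry loop. The sketch below illustrates that loop under stated assumptions: run_fetch is a hypothetical name, the fetch callable stands in for whatever HTTP client the agent uses and returns a (status, headers) pair, and the log wording only approximates the pipeline output above. A real agent would also wait between attempts.

```python
from typing import Callable, Dict, List, Tuple

Response = Tuple[int, Dict[str, str]]


def run_fetch(fetch: Callable[[], Response], max_attempts: int = 3) -> List[str]:
    """Call fetch up to max_attempts times, stopping early on a hard block."""
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        status, headers = fetch()
        if status < 400:
            log.append(f"fetch tool: HTTP {status}")
            return log
        # Header names are case-insensitive, so normalize before lookup.
        lowered = {k.lower(): v.strip() for k, v in headers.items()}
        reason = lowered.get("x-pipelock-block-reason")
        retry = lowered.get("x-pipelock-block-reason-retry", "")
        if reason is None:
            # No explanation available: fall back to blind retries.
            log.append(f"fetch tool: HTTP {status}, body empty")
        else:
            log.append(f"fetch tool: HTTP {status}, X-Pipelock-Block-Reason: {reason}")
            if retry.lower() != "transient":
                log.append("agent: not retrying (non-retryable block)")
                log.append("agent: surfacing block")
                return log
        if attempt < max_attempts:
            log.append(f"agent: retrying ({attempt}/{max_attempts - 1})")
    log.append(f"agent: failed after {max_attempts} attempts")
    return log
```

With a response that carries ssrf_private_ip and a retry hint of none, the loop calls fetch once and stops; with a bare 403 it uses every attempt before giving up.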