From Regex to AST: Building Taint Tracking for AI Agent Code

AgentGuard v0.5.0 ships AST-based taint tracking. This post explains how it works and why it matters.

The Regex Ceiling

Regex catches obvious patterns such as prompt = f"You are helpful. {user_input}". A regex rule sees f"..." with {user_input} inside it and flags the line. Done.

But regex cannot track this:

query = request.json.get("query")
processed = query.strip().upper()
template = "Answer: {q}"
prompt = template.format(q=processed)
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}]
)

The taint flows: request.json -> query -> processed -> template.format() -> prompt -> openai call. That is four assignments before the LLM call, and no single line contains both the source and the sink. Regex sees each line independently and cannot connect them.
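
To make the ceiling concrete, here is a minimal sketch of a line-by-line regex rule. The pattern FSTRING_RULE and the helper flagged_lines are hypothetical and written for this illustration; they are not AgentGuard's actual rules.

```python
import re

# Hypothetical rule: an f-string that interpolates user_input directly.
FSTRING_RULE = re.compile(r'f"[^"]*\{user_input\}[^"]*"')

direct = 'prompt = f"You are helpful. {user_input}"'

multi_hop = '''query = request.json.get("query")
processed = query.strip().upper()
template = "Answer: {q}"
prompt = template.format(q=processed)'''


def flagged_lines(code):
    """Return the 1-based line numbers that the regex rule matches."""
    return [
        number
        for number, line in enumerate(code.splitlines(), start=1)
        if FSTRING_RULE.search(line)
    ]


print(flagged_lines(direct))     # [1]
print(flagged_lines(multi_hop))  # []
```

The direct case is flagged, but the multi-hop case produces no match because no single line pairs the source with the prompt.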

AST to the Rescue

Python's ast module parses source code into a syntax tree. We can walk that tree and track how data flows.
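
As a rough illustration of what walking the tree gives us, the sketch below lists, for each simple assignment, the names its right-hand side reads. The helper assignment_reads is a hypothetical name used only here; ast.parse and ast.walk are standard library calls.

```python
import ast


def assignment_reads(source):
    """Map each simply-assigned variable to the names its right-hand side reads."""
    result = {}
    for node in ast.parse(source).body:
        # Only handle top-level statements of the form `name = expression`.
        if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            names = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
            result[node.targets[0].id] = sorted(names)
    return result


snippet = '''
query = request.json.get("query")
processed = query.strip().upper()
prompt = "Answer: {q}".format(q=processed)
'''

print(assignment_reads(snippet))
# {'query': ['request'], 'processed': ['query'], 'prompt': ['processed']}
```

Each entry is one edge in the data-flow chain, which is exactly the connection between lines that a regex cannot see.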

Step 1: Identify Sources

A "source" is any expression that produces untrusted data:

SOURCE_PATTERNS = {
    "user_input", "user_msg", "user_message", "request", "req", "query", "message", "msg",
}

The source set also covers attribute-access patterns: request.args.get("q"), request.json["key"], and input(). In AST terms, we check ast.Name nodes against the source set, and ast.Call nodes for request.args.get patterns.
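
A minimal sketch of such a check might look like the following. The function is_source and the helper parse_expr are assumptions for illustration, as is the choice to unwrap ast.Attribute and ast.Subscript nodes down to their root name; AgentGuard's real matching logic may differ.

```python
import ast

SOURCE_PATTERNS = {
    "user_input", "user_msg", "user_message", "request", "req", "query", "message", "msg",
}


def is_source(node):
    """Return True if the expression is rooted in a known source (illustrative)."""
    if isinstance(node, ast.Name):
        return node.id in SOURCE_PATTERNS
    if isinstance(node, ast.Call):
        # The input() builtin always yields untrusted data.
        if isinstance(node.func, ast.Name) and node.func.id == "input":
            return True
        # request.args.get("q"): inspect what is being called.
        return is_source(node.func)
    if isinstance(node, (ast.Attribute, ast.Subscript)):
        # request.json["key"] or request.args: unwrap towards the root name.
        return is_source(node.value)
    return False


def parse_expr(code):
    """Parse a single expression and return its AST node."""
    return ast.parse(code, mode="eval").body


for code in ['request.args.get("q")', 'request.json["key"]', "input()", 'config.get("x")']:
    print(code, "->", is_source(parse_expr(code)))
```

The first three expressions are reported as sources; config.get("x") is not, because its root name is outside the source set.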

Step 2: Track Propagation

When a source is assigned to a variable, that variable becomes tainted. Taint also propagates through:

Method calls: processed = user_input.strip() leaves processed tainted.
F-strings: prompt = f"Hello {user_input}" makes prompt tainted.
.format(): prompt = template.format(q=query) makes prompt tainted if query is.
String concatenation: prompt = "Hello " + user_input makes prompt tainted.
List/dict construction: messages = [{"role": "user", "content": user_input}] makes messages tainted.

The tracker walks assignments in order, maintaining a tainted_vars dict. When it sees x = tainted_expr, it adds x to the dict. When it sees x = safe_expr, it removes x.
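
The propagation rules above can be sketched in a few lines. This is a simplified illustration, not AgentGuard's implementation: the names SOURCES, expr_is_tainted, and track are hypothetical, the source set is trimmed, and the sketch only looks at top-level assignments. It treats an expression as tainted if any name it reads is a source or already tainted, which covers method calls, f-strings, .format(), concatenation, and list/dict literals in one rule.

```python
import ast

SOURCES = {"user_input", "request"}  # trimmed source set, for illustration only


def expr_is_tainted(node, tainted_vars):
    """True if any name read by the expression is a source or already tainted."""
    return any(
        isinstance(n, ast.Name) and (n.id in SOURCES or n.id in tainted_vars)
        for n in ast.walk(node)
    )


def track(source):
    """Walk top-level assignments in order; return {variable: line it was tainted on}."""
    tainted_vars = {}
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Assign):
            continue
        for target in stmt.targets:
            if not isinstance(target, ast.Name):
                continue
            if expr_is_tainted(stmt.value, tainted_vars):
                tainted_vars[target.id] = stmt.lineno
            else:
                # Reassigning from a safe expression clears the taint.
                tainted_vars.pop(target.id, None)
    return tainted_vars


code = '''a = user_input.strip()
b = f"Hi {a}"
c = "x" + b
d = [{"role": "user", "content": c}]
e = "static"
a = "reset"
'''

print(track(code))  # {'b': 2, 'c': 3, 'd': 4}
```

Note that a is absent from the result: it was tainted on line 1 and then cleared by the safe reassignment on line 6.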

Step 3: Identify Sinks

A "sink" is where tainted data reaches an LLM:

Variable assignment: prompt = <tainted> or messages = [<tainted>]
Function call: openai.chat.completions.create(messages=<tainted>)

When the tracker sees a tainted expression reaching a sink, it fires a finding.
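
Here is one way the function-call sink could be detected, as a sketch only. The constants SINK_CALL_SUFFIXES and SINK_KEYWORDS and the helpers dotted_name and find_sink_hits are hypothetical, and the set of tainted names is passed in by hand rather than computed. The variable-assignment sink is left out to keep the example short.

```python
import ast

SINK_CALL_SUFFIXES = ("completions.create",)  # assumed sink list, for illustration
SINK_KEYWORDS = {"messages", "prompt"}        # assumed keyword arguments to inspect


def dotted_name(node):
    """Rebuild 'a.b.c' from nested ast.Attribute nodes."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))


def find_sink_hits(source, tainted):
    """Return (line, call name, keyword) for each tainted argument reaching a sink call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if not name.endswith(SINK_CALL_SUFFIXES):
            continue
        for kw in node.keywords:
            reads_tainted = any(
                isinstance(n, ast.Name) and n.id in tainted for n in ast.walk(kw.value)
            )
            if kw.arg in SINK_KEYWORDS and reads_tainted:
                hits.append((node.lineno, name, kw.arg))
    return hits


call = (
    'response = openai.chat.completions.create('
    'model="gpt-4", messages=[{"role": "user", "content": prompt}])'
)

print(find_sink_hits(call, {"prompt"}))
# [(1, 'openai.chat.completions.create', 'messages')]
```

With an empty tainted set the same call produces no findings, which is the hardcoded-prompt case.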

Step 4: Sanitizers

Not all transformations preserve taint. Some explicitly make data safe: safe = str(user_input)[:100]. The tracker treats str(), int(), float(), len(), and explicit escape functions as sanitizers. When data passes through a sanitizer, the taint is removed.
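
A sanitizer-aware taint check could be sketched as below. The SANITIZERS set lists only the builtins named in this post; escape helpers would have to be added by name. The functions is_tainted and parse_expr are hypothetical names for this illustration.

```python
import ast

SANITIZERS = {"str", "int", "float", "len"}  # builtins named in this post


def is_tainted(node, tainted):
    """Recursive taint check that stops at calls to known sanitizers."""
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in SANITIZERS
    ):
        # Anything wrapped in a sanitizer call is treated as clean.
        return False
    if isinstance(node, ast.Name):
        return node.id in tainted
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(node))


def parse_expr(code):
    """Parse a single expression and return its AST node."""
    return ast.parse(code, mode="eval").body


tainted = {"user_input"}
print(is_tainted(parse_expr("str(user_input)[:100]"), tainted))        # False
print(is_tainted(parse_expr("user_input.strip()"), tainted))           # True
print(is_tainted(parse_expr('"Hi " + str(user_input) + user_input'), tainted))  # True
```

The last case shows that a sanitizer only clears the sub-expression it wraps; a second, unwrapped use of user_input still taints the result.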

What It Catches (That Regex Cannot)

Multi-hop flow: 4 variable assignments leading to an LLM call.
Template .format() with named args: Correctly tracks taint through .format().
Messages array with tainted content: Detects taint inside nested list/dict structures.

What It Does Not Flag (Correctly)

Sanitized input: str(user_input)[:100] is recognized as safe.
Hardcoded prompt: No taint source, so no alert.

Limitations

Python only: JavaScript/TypeScript support is on the roadmap.
Single file only: No cross-file or interprocedural analysis yet.
No control flow: If/else branches are not tracked separately.
Permissive sanitizers: str() is treated as a sanitizer, even though it may not make data safe in every context.
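
To see why the control-flow limitation matters, consider a hypothetical branch-insensitive tracker in which the last assignment in source order wins. This is an assumed model written for illustration (LinearTracker and tainted_after are made-up names); it is not a description of how AgentGuard resolves branches.

```python
import ast


class LinearTracker(ast.NodeVisitor):
    """Branch-insensitive sketch: the last assignment seen in source order wins."""

    def __init__(self, sources):
        self.sources = set(sources)
        self.tainted = set()

    def visit_Assign(self, node):
        reads = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
        value_is_tainted = bool(reads & (self.sources | self.tainted))
        for target in node.targets:
            if isinstance(target, ast.Name):
                if value_is_tainted:
                    self.tainted.add(target.id)
                else:
                    self.tainted.discard(target.id)


def tainted_after(code):
    """Run the tracker over the code and return the final tainted set."""
    tracker = LinearTracker({"user_input"})
    tracker.visit(ast.parse(code))
    return tracker.tainted


tainted_branch_last = '''
if trusted:
    prompt = "static"
else:
    prompt = user_input
'''

safe_branch_last = '''
if trusted:
    prompt = user_input
else:
    prompt = "static"
'''

print(tainted_after(tainted_branch_last))  # {'prompt'}
print(tainted_after(safe_branch_last))     # set()
```

Both programs can send user input to the model, yet only the first is reported under this model. A branch-aware analysis would merge the taint state of both branches instead.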

Try It

pip install --upgrade dfx-agentguard
agentguard src/ --format text

The taint tracking rule (ASI01-TAINT-TRACK) runs alongside existing regex rules. Both layers work together: regex for speed, AST for precision. AgentGuard is MIT-licensed. v0.5.0 includes 38 tests and a 32-sample benchmark with a 100% detection rate.