Context Compression Is Not One Thing: Readable Symbolic Re-expression vs. Coherent Summary at Matched Budget

Abstract: We study context compression for multi-hop question answering with small language models. We propose Telegraph English, a readable symbolic format that rewrites retrieved passages into structured entity-relation statements, preserving reasoning evidence at lower token cost.

In controlled experiments on MuSiQue, TwoWiki, and HotpotQA, Telegraph English outperforms three matched-budget compression baselines (character-level deletion, truncation, and random sub-sampling) on every dataset, with gains of 13 to 20 F1 percentage points. It also outperforms a coherent prose summary produced by the same encoder on the hardest dataset.
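
To make the notion of a matched budget concrete, the following minimal sketch shows two of the baselines, truncation and random sub-sampling, each reduced to the same token count. It is an illustration under stated assumptions, not the experimental code: tokens are approximated by whitespace splitting, the function names, the example passage, and the budget of 8 are hypothetical, and character-level deletion is omitted because matching its budget depends on the tokenizer used.

```python
import random

def truncate(tokens, budget):
    """Truncation baseline: keep only the first `budget` tokens."""
    return tokens[:budget]

def random_subsample(tokens, budget, seed=0):
    """Random sub-sampling baseline: keep `budget` randomly chosen
    tokens, preserving their original order."""
    if budget >= len(tokens):
        return list(tokens)
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(tokens)), budget))
    return [tokens[i] for i in keep]

# Hypothetical passage; whitespace splitting stands in for a real tokenizer.
passage = ("Marie Curie was born in Warsaw . "
           "She later moved to Paris and won two Nobel Prizes .")
tokens = passage.split()
budget = 8  # assumed budget, shared by every method being compared

truncated = truncate(tokens, budget)
sampled = random_subsample(tokens, budget)

print(" ".join(truncated))
print(" ".join(sampled))
```

Both outputs contain exactly `budget` tokens, so any difference in downstream answer quality can be attributed to which tokens are kept rather than to how many.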

A pre-registered depth-interaction hypothesis was not supported: within each dataset, the advantage does not grow with reasoning depth. We interpret these results as evidence that readable symbolic re-expression preserves entity content more densely than either natural language or coherent summarization at a matched token budget.