RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

When a language model answers a question about a table, users have no way to verify which cells informed which reasoning steps. We introduce RSAT, a method that trains small language models (SLMs, 1-8B parameters) to produce step-by-step reasoning with cell-level citations grounded in table evidence.

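As a concrete illustration of what cell-level attribution looks like, the sketch below shows one possible shape for a step-wise answer that cites specific table cells, together with a check that every cited cell actually exists in the table. The field names and (row, column) citation encoding are assumptions for illustration, not the paper's exact schema.

```python
# Hypothetical sketch of a step-wise answer with cell-level citations.
# Field names ("steps", "cites", "answer") are illustrative assumptions,
# not the exact RSAT output schema.
table = {
    "columns": ["Year", "Revenue"],
    "rows": [["2021", "120"], ["2022", "150"]],
}

output = {
    "steps": [
        {"text": "Revenue in 2022 was 150.", "cites": [[1, 1]]},  # (row 1, col 1)
        {"text": "Revenue in 2021 was 120.", "cites": [[0, 1]]},
        {"text": "The increase is 150 - 120 = 30.", "cites": [[1, 1], [0, 1]]},
    ],
    "answer": "30",
}

def citation_validity(output, table):
    """Fraction of cited (row, col) pairs that point at real cells."""
    cites = [c for step in output["steps"] for c in step["cites"]]
    if not cites:
        return 0.0
    n_rows, n_cols = len(table["rows"]), len(table["columns"])
    ok = sum(1 for r, c in cites if 0 <= r < n_rows and 0 <= c < n_cols)
    return ok / len(cites)

print(citation_validity(output, table))  # 1.0 for this example
```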

RSAT trains in two phases. Phase 1 (supervised fine-tuning, SFT) teaches a structured JSON output format from verified reasoning traces. Phase 2 (Group Relative Policy Optimization, GRPO) optimizes a composite reward centered on faithfulness measured by natural language inference (NLI), alongside citation validity and parsimony.

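A minimal sketch of how such a composite reward could be assembled is given below; the weights, the linear length penalty, and the interface to the NLI scorer are placeholders chosen for illustration, not the paper's actual design or values.

```python
# Hypothetical composite reward for the GRPO phase, combining an NLI-based
# faithfulness score with citation validity and a parsimony term.
# Weights and the penalty form are illustrative assumptions.

def composite_reward(nli_entailment_scores, citation_validity, n_cited_cells,
                     w_faith=0.6, w_cite=0.3, w_parsimony=0.1, max_cells=20):
    """nli_entailment_scores: per-step probability that the cited cells
    entail the step text, as judged by an external NLI model."""
    # Faithfulness: average entailment over reasoning steps (0 if no steps).
    faithfulness = (sum(nli_entailment_scores) / len(nli_entailment_scores)
                    if nli_entailment_scores else 0.0)
    # Parsimony: reward citing few cells, decaying linearly to 0 at max_cells.
    parsimony = max(0.0, 1.0 - n_cited_cells / max_cells)
    return (w_faith * faithfulness
            + w_cite * citation_validity
            + w_parsimony * parsimony)

# Example: three well-grounded steps, all citations valid, four cited cells.
print(composite_reward([0.95, 0.90, 0.92], 1.0, 4))
```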

Across six models from two families, Qwen 2.5 (1.5B/3B/7B) and Llama 3 (1B/3B/8B), RSAT improves faithfulness 3.7$\times$ over SFT alone (0.224$\rightarrow$0.826), with near-perfect citation validity (0.992). Post-hoc attribution collapses to below 13% format success, confirming that attribution must be integrated into reasoning rather than retrofitted after the fact. Ablations show the faithfulness reward is essential: removing it drops faithfulness from 0.97 to 0.03.
