RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

When a language model answers a question about a table, users have no way to verify which cells informed which reasoning steps. We introduce RSAT, a method that trains small language models (SLMs, 1-8B parameters) to produce step-by-step reasoning with cell-level citations grounded in table evidence.

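As a concrete illustration of what cell-level attribution looks like, the sketch below shows one possible shape for a step-wise answer that cites specific table cells, together with a check that every cited cell actually exists in the table. The field names and (row, column) citation encoding are assumptions for illustration, not the paper's exact schema.

```python
# Hypothetical sketch of a step-wise answer with cell-level citations.
# Field names ("steps", "cites", "answer") are illustrative assumptions,
# not the exact RSAT output schema.
table = {
    "columns": ["Year", "Revenue"],
    "rows": [["2021", "120"], ["2022", "150"]],
}

output = {
    "steps": [
        {"text": "Revenue in 2022 was 150.", "cites": [[1, 1]]},  # (row 1, col 1)
        {"text": "Revenue in 2021 was 120.", "cites": [[0, 1]]},
        {"text": "The increase is 150 - 120 = 30.", "cites": [[1, 1], [0, 1]]},
    ],
    "answer": "30",
}

def citation_validity(output, table):
    """Fraction of cited (row, col) pairs that point at real cells."""
    cites = [c for step in output["steps"] for c in step["cites"]]
    if not cites:
        return 0.0
    n_rows, n_cols = len(table["rows"]), len(table["columns"])
    ok = sum(1 for r, c in cites if 0 <= r < n_rows and 0 <= c < n_cols)
    return ok / len(cites)

print(citation_validity(output, table))  # 1.0 for this example
```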

RSAT trains in two phases. Phase 1 (supervised fine-tuning, SFT) teaches a structured JSON output format from verified reasoning traces. Phase 2 (Group Relative Policy Optimization, GRPO) optimizes a composite reward centered on faithfulness measured by natural language inference (NLI), alongside citation validity and parsimony.

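A minimal sketch of how such a composite reward could be assembled is given below; the weights, the linear length penalty, and the interface to the NLI scorer are placeholders chosen for illustration, not the paper's actual design or values.

```python
# Hypothetical composite reward for the GRPO phase, combining an NLI-based
# faithfulness score with citation validity and a parsimony term.
# Weights and the penalty form are illustrative assumptions.

def composite_reward(nli_entailment_scores, citation_validity, n_cited_cells,
                     w_faith=0.6, w_cite=0.3, w_parsimony=0.1, max_cells=20):
    """nli_entailment_scores: per-step probability that the cited cells
    entail the step text, as judged by an external NLI model."""
    # Faithfulness: average entailment over reasoning steps (0 if no steps).
    faithfulness = (sum(nli_entailment_scores) / len(nli_entailment_scores)
                    if nli_entailment_scores else 0.0)
    # Parsimony: reward citing few cells, decaying linearly to 0 at max_cells.
    parsimony = max(0.0, 1.0 - n_cited_cells / max_cells)
    return (w_faith * faithfulness
            + w_cite * citation_validity
            + w_parsimony * parsimony)

# Example: three well-grounded steps, all citations valid, four cited cells.
print(composite_reward([0.95, 0.90, 0.92], 1.0, 4))
```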

Across six models from two families, Qwen 2.5 (1.5B/3B/7B) and Llama 3 (1B/3B/8B), RSAT improves faithfulness 3.7$\times$ over SFT alone (0.224$\rightarrow$0.826), with near-perfect citation validity (0.992). Post-hoc attribution collapses to below 13% format success, confirming that attribution must be integrated into reasoning rather than retrofitted after the fact. Ablations show the faithfulness reward is essential: removing it drops faithfulness from 0.97 to 0.03.
