SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG
Retrieval-augmented generation (RAG) systems must balance retrieval granularity with contextual coherence, a challenge that existing methods address through LLM-guided chunking, single-level context expansion, or hierarchical summarization. These approaches variously depend on costly LLM calls during indexing or retrieval, limit context aggregation to a single granularity level, or introduce information loss through summarization.
We present SproutRAG, an attention-guided hierarchical RAG framework that addresses this trade-off by organizing sentence-level chunks into progressively larger but semantically coherent units, using learned inter-sentence attention to construct a binary chunking tree. Unlike prior approaches that rely on external LLMs, fixed context expansion, or lossy summarization, SproutRAG learns which attention heads and layers best capture semantic document structure, enabling multi-granularity retrieval without additional LLM calls or compressed summaries.
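To make the idea of a binary chunking tree concrete, the following is a minimal sketch rather than the authors' procedure. It assumes an inter-sentence attention matrix `attn` is already available (for example, aggregated from selected heads and layers) and greedily merges the pair of adjacent spans with the highest mean symmetric attention. The names `Node`, `span_affinity`, and `build_tree`, as well as the greedy bottom-up merge rule itself, are illustrative assumptions that do not come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A contiguous span of sentences [start, end], both inclusive (hypothetical type)."""
    start: int
    end: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def span_affinity(attn: List[List[float]], a: Node, b: Node) -> float:
    """Mean attention, in both directions, between the sentences of two spans."""
    total = 0.0
    count = 0
    for i in range(a.start, a.end + 1):
        for j in range(b.start, b.end + 1):
            total += attn[i][j] + attn[j][i]
            count += 2
    return total / count


def build_tree(attn: List[List[float]]) -> Node:
    """Greedy bottom-up merge of adjacent spans (an assumed rule, not the paper's)."""
    if not attn:
        raise ValueError("attention matrix must cover at least one sentence")
    # Start with one leaf per sentence.
    nodes = [Node(i, i) for i in range(len(attn))]
    while len(nodes) > 1:
        # Score every pair of neighbouring spans and merge the best pair.
        scores = [span_affinity(attn, nodes[k], nodes[k + 1])
                  for k in range(len(nodes) - 1)]
        k = max(range(len(scores)), key=scores.__getitem__)
        merged = Node(nodes[k].start, nodes[k + 1].end, nodes[k], nodes[k + 1])
        nodes[k:k + 2] = [merged]
    return nodes[0]


# Toy example: sentences 0-1 attend to each other strongly, as do 2-3.
toy_attn = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.8],
    [0.1, 0.1, 0.8, 0.0],
]
toy_root = build_tree(toy_attn)
```

In this toy case the root covers all four sentences and its two children are the spans 0-1 and 2-3, so every internal node is a larger chunk made only of adjacent sentences. Restricting merges to neighbours is what keeps each chunk contiguous in the source document.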
At retrieval time, SproutRAG uses hierarchical beam search to retrieve candidates at multiple granularities, capturing multi-sentence relevance beyond flat retrieval. The framework is trained end-to-end with a joint objective that improves both embeddings and tree structure. Experiments across four benchmarks spanning scientific, legal, and open-domain settings demonstrate that SproutRAG improves information efficiency (IE) by 6.1% on average over the strongest baseline.
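The retrieval step can be illustrated with a short sketch of beam search over a chunk tree. This is a simplified reading of the description above, not the published algorithm: it assumes every tree node already carries an embedding, scores nodes by cosine similarity to the query, keeps the best `beam_width` nodes at each level, and returns the highest-scoring nodes seen at any depth. `TreeNode`, `cosine`, `beam_search`, and both parameters are hypothetical names chosen for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TreeNode:
    """A chunk at some granularity with a precomputed embedding (hypothetical type)."""
    text: str
    embedding: List[float]
    children: List["TreeNode"] = field(default_factory=list)


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def beam_search(root: TreeNode, query: List[float],
                beam_width: int = 2, top_k: int = 3) -> List[Tuple[float, TreeNode]]:
    """Level-by-level beam search that collects candidates at every granularity."""
    frontier = [root]
    scored: List[Tuple[float, TreeNode]] = []
    while frontier:
        # Rank the current level and keep only the best `beam_width` nodes.
        level = sorted(((cosine(n.embedding, query), n) for n in frontier),
                       key=lambda pair: pair[0], reverse=True)
        beam = level[:beam_width]
        scored.extend(beam)
        # Only children of surviving nodes are explored further.
        frontier = [child for _, node in beam for child in node.children]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy tree: the left branch is relevant to the query, the right branch is not.
leaf_a = TreeNode("a", [1.0, 0.0])
leaf_b = TreeNode("b", [0.0, 1.0])
left = TreeNode("a b", [0.7, 0.7], [leaf_a, leaf_b])
right = TreeNode("c", [0.0, 1.0])
toy_tree = TreeNode("a b c", [0.5, 0.5], [left, right])
results = beam_search(toy_tree, [1.0, 0.0], beam_width=1, top_k=3)
```

With `beam_width=1` the search follows the left branch down to leaf `a`, which ranks first, and the returned candidates mix a sentence-level chunk with its larger ancestors. Because the right branch is pruned at the first level, nothing beneath it is ever scored, which is the cost saving over scoring every node.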