DIAGRAMS: A Review Framework for Reasoning-Level Attribution in Diagram QA
Abstract: Diagram question answering (Diagram QA) requires reasoning-level attribution that links each question-answer pair to all visual regions needed to derive the answer, rather than only the region containing the final response.
Creating such structured evidence across diagrams, charts, maps, circuits, and infographics is time-consuming, and existing annotation tools tightly couple their interfaces to dataset-specific formats.
We present DIAGRAMS, a lightweight, schema-driven review framework that decouples interface logic from dataset-specific JSON structures through an internal meta-schema and dataset adapters.
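As a rough illustration of the adapter idea (the class and field names below are hypothetical, not the actual DIAGRAMS API or meta-schema), a dataset adapter maps a dataset-specific JSON record into a shared internal record that the interface renders:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical internal meta-schema records; field names are illustrative,
# not the released DIAGRAMS schema.
@dataclass
class Region:
    region_id: str
    bbox: List[float]            # [x_min, y_min, x_max, y_max] in pixels
    label: Optional[str] = None

@dataclass
class ReviewItem:
    image_path: str
    question: str
    answer: str
    candidate_regions: List[Region] = field(default_factory=list)
    evidence_region_ids: List[str] = field(default_factory=list)

def adapt_example(raw: dict) -> ReviewItem:
    """Sketch of a dataset adapter: convert one dataset-specific JSON
    record into the internal meta-schema the review interface consumes."""
    regions = [
        Region(region_id=str(i), bbox=r["box"], label=r.get("name"))
        for i, r in enumerate(raw.get("regions", []))
    ]
    return ReviewItem(
        image_path=raw["image"],
        question=raw["question"],
        answer=raw["answer"],
        candidate_regions=regions,
    )
```

Supporting a new dataset then amounts to writing one such adapter; the review interface itself stays unchanged.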
Given an image and a QA pair, optionally with candidate regions, the system performs QA-conditioned evidence selection and proposes the regions required for reasoning.
When QA pairs or candidate regions are missing, it generates them and supports human verification and refinement.
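A minimal sketch of this review-first loop, continuing the hypothetical types above; the three callables stand in for the model backend and the review UI and are assumptions, not part of the released package:

```python
from typing import Callable, List

def review_example(
    item: ReviewItem,
    propose_evidence: Callable[[ReviewItem], List[str]],
    generate_candidates: Callable[[str], List[Region]],
    reviewer_confirm: Callable[[List[str], ReviewItem], List[str]],
) -> ReviewItem:
    """Sketch of the review-first loop: propose evidence conditioned on the
    QA pair, then let a human reviewer accept, drop, or add regions."""
    # Fall back to model-generated candidate regions when the dataset
    # provides none.
    if not item.candidate_regions:
        item.candidate_regions = generate_candidates(item.image_path)

    # QA-conditioned evidence selection: pick the subset of regions needed
    # to derive the answer, not just the region holding the answer text.
    suggested_ids = propose_evidence(item)

    # Human verification and refinement before the attribution is saved.
    item.evidence_region_ids = reviewer_confirm(suggested_ids, item)
    return item
```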
Across six Diagram QA datasets, model-suggested evidence achieves 85.39% precision and 75.30% recall against reviewer-final selections (micro-averaged).
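For reference, micro-averaging here pools counts over all QA pairs before dividing; a small sketch, assuming suggested and reviewer-final evidence sets are matched by region id:

```python
from typing import Iterable, List, Tuple

def micro_precision_recall(
    pairs: Iterable[Tuple[List[str], List[str]]]
) -> Tuple[float, float]:
    """Micro-averaged precision/recall: sum true positives, suggested
    counts, and reviewer-final counts over all QA pairs, then divide once."""
    tp = n_suggested = n_final = 0
    for suggested, final in pairs:
        tp += len(set(suggested) & set(final))
        n_suggested += len(set(suggested))
        n_final += len(set(final))
    precision = tp / n_suggested if n_suggested else 0.0
    recall = tp / n_final if n_final else 0.0
    return precision, recall
```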
These results indicate that the review-first framework reduces manual region creation while maintaining high agreement with final reasoning-level attributions.
We release a public demo and installable package to support dataset auditing, grounded supervision creation, and grounded evaluation.