Document Classification Pattern Recognition via Information Fusion: A Systematic Review of Multimodal and Multiview Representation Approaches
Abstract: Information fusion is widely used to improve document classification by integrating multiple data sources (multimodal) or representations (multiview). However, the field lacks a unified framework, a quantitative synthesis of its effectiveness, and clear guidance for practitioners. This systematic review addresses these gaps by analysing 139 primary studies.
It introduces a formal framework to structure the field, presents a qualitative analysis that identifies key trends, and performs a random-effects meta-analysis (to our knowledge, the first focused on document classification) to quantify performance gains. Our meta-analysis reveals that multimodal fusion significantly improves accuracy (mean gain of +5.28 percentage points, $p=0.0016$), whereas the F1-score effect is directionally positive but statistically non-significant in our primary model.
Multiview fusion provides consistent but modest gains in accuracy (+4.67%), F1-score (+3.08%), and recall (all $p<0.05$). Critically, our qualitative synthesis uncovers challenges in reproducibility and methodological rigour: only 11.8% of multimodal studies and 23.3% of multiview studies use statistical tests to validate their findings, which undermines the reliability of many reported results.
This review’s primary contributions are a unifying framework, the first quantitative evidence base, and data-driven guidelines. It concludes that successful information fusion depends not on algorithmic complexity, but on the strategic alignment of the fusion method with the task context and a commitment to more rigorous validation.
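For readers unfamiliar with the random-effects meta-analysis mentioned in the abstract, the following is a minimal sketch of how per-study gains can be pooled. It assumes the DerSimonian-Laird estimator for the between-study variance and a normal approximation for the p-value; the abstract does not state which estimator the review uses. The function name `random_effects_pool` and the input values in the usage note are hypothetical and are not taken from the review.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    effects   -- per-study gains (e.g. accuracy difference in percentage points)
    variances -- within-study sampling variance of each gain
    Returns (pooled_effect, standard_error, tau2, two_sided_p).
    """
    k = len(effects)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # DerSimonian-Laird estimate of between-study variance, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # Two-sided p-value under a normal approximation: 2 * (1 - Phi(|z|)).
    z = pooled / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled, se, tau2, p
```

A call such as `random_effects_pool([4.0, 6.5, 5.1], [1.2, 2.0, 0.9])` (illustrative numbers only) returns the pooled gain, its standard error, the estimated between-study variance, and the p-value for the null hypothesis of no gain. When the studies disagree more than their sampling variances explain, `tau2` grows and the weights become more even across studies, which is the main practical difference from a fixed-effect pooling.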