Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study
Abstract: Multi-label topic classification without labeled training data is challenging, especially when documents contain complex relational information. We present a zero-shot multi-label topic classification framework and systematically investigate how per-article knowledge graph augmentation affects its performance.
The base framework classifies the topics of a document without labeled training data and has four variants: article-only classification, keyword-enhanced classification, and a self-consistency decoding variant of each. We then augment each base variant with a per-article knowledge graph, which is extracted from the input document as subject-predicate-object triples through a pipeline similar to KGGen.
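As a rough illustration of how the article-only, keyword-enhanced, and graph-augmented inputs differ, the following minimal sketch assembles a classification prompt from optional parts. The names `Triple` and `build_prompt`, the prompt wording, and the triple serialization are assumptions made for illustration; they are not taken from the framework itself or from KGGen.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object triple (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


def build_prompt(
    article: str,
    labels: Sequence[str],
    keywords: Optional[Sequence[str]] = None,
    triples: Optional[Sequence[Triple]] = None,
) -> str:
    """Assemble a zero-shot multi-label prompt (assumed wording).

    - article only:        keywords=None, triples=None
    - keyword-enhanced:    keywords given
    - graph-augmented:     triples given (with or without keywords)
    """
    parts: List[str] = [
        "Assign every applicable topic to the article.",
        "Candidate topics: " + ", ".join(labels),
        "Article: " + article,
    ]
    if keywords:
        parts.append("Keywords: " + ", ".join(keywords))
    if triples:
        # Serialize each triple on its own line; the format is an assumption.
        lines = [f"({t.subject}; {t.predicate}; {t.obj})" for t in triples]
        parts.append("Knowledge graph triples:\n" + "\n".join(lines))
    return "\n".join(parts)
```

Under these assumptions, the graph-augmented variants differ from the base variants only in the extra block of serialized triples appended to the prompt, so any difference in performance can be attributed to that added context.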
We test all eight methods (four base and four graph-augmented) on fifteen LLMs and eight multi-label datasets from different domains. Within the base framework, keyword-enhanced classification (AK) is the best-performing method, and six of the fifteen LLMs surpass the sentence-encoder baseline.
Graph augmentation helps small models but hurts large ones, which suggests that larger models already capture enough relational information from pretraining. Furthermore, the self-consistency decoding variants do not improve performance in any experiment while increasing computation cost about fivefold.
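To make the cost of self-consistency decoding concrete, here is a minimal sketch of one common way to aggregate several sampled multi-label predictions: a per-label majority vote. The function name `majority_vote`, the 0.5 threshold, and the use of five samples are assumptions; the abstract does not state the aggregation rule or the sample count, although five samples would be consistent with the roughly fivefold cost reported above.

```python
from collections import Counter
from typing import List, Set


def majority_vote(samples: List[Set[str]], threshold: float = 0.5) -> Set[str]:
    """Keep each label predicted in more than `threshold` of the samples.

    Each element of `samples` is the label set from one independent LLM
    call, so k samples cost roughly k times a single-pass prediction.
    """
    if not samples:
        return set()
    counts = Counter(label for sample in samples for label in sample)
    needed = threshold * len(samples)
    return {label for label, count in counts.items() if count > needed}


# Five hypothetical sampled predictions for one article.
samples = [
    {"sports", "health"},
    {"sports"},
    {"sports", "politics"},
    {"health", "sports"},
    {"sports"},
]
final_labels = majority_vote(samples)  # only "sports" appears in > 2.5 samples
```

In this sketch a label survives only if most samples agree on it, so the vote mainly filters out labels that appear sporadically; if a model's single-pass predictions are already stable, the extra samples add cost without changing the output.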