ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts

In this article, we introduce ViLegalNLI, the first large-scale Vietnamese Natural Language Inference (NLI) dataset specifically constructed for the legal domain.

The dataset consists of 42,012 premise-hypothesis pairs derived from official statutory documents and annotated with binary inference labels (Entailment and Non-entailment).
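As an illustration of the pair structure described above, the following is a minimal sketch of how one record could be represented. The two label names come from the text; the class name, field names, and the `domain` tag are assumptions made for this example and are not taken from the released dataset.

```python
from collections import Counter
from dataclasses import dataclass

# Label names follow the text; everything else here is an illustrative assumption.
LABELS = ("Entailment", "Non-entailment")


@dataclass(frozen=True)
class NLIPair:
    premise: str      # statutory passage the hypothesis is judged against
    hypothesis: str   # candidate statement to be verified
    label: str        # one of LABELS
    domain: str = ""  # hypothetical legal-domain tag

    def __post_init__(self):
        # Reject any label outside the binary scheme.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def label_distribution(pairs):
    """Count how many pairs carry each of the two labels."""
    counts = Counter(p.label for p in pairs)
    return {label: counts.get(label, 0) for label in LABELS}
```

A helper such as `label_distribution` is a quick way to check class balance before training, since a binary scheme can hide a skew toward one label.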

It covers multiple legal domains and reflects realistic legal reasoning scenarios characterized by structured logic, conditional clauses, and domain-specific terminology.

To construct ViLegalNLI, we propose a semi-automatic data generation framework that integrates large language models for controlled hypothesis generation and systematic quality validation procedures.

The framework incorporates artifact mitigation strategies and cross-model validation to improve annotation reliability and ensure legal consistency.
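The text does not spell out how cross-model validation is carried out. One plausible reading, sketched below purely as an assumption, is an agreement filter: a generated pair is kept only when enough independent validator models predict the label the generator intended. The function name, the `min_agree` threshold, and the validator interface are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: a validator maps (premise, hypothesis) to a label string.
Validator = Callable[[str, str], str]
Candidate = Tuple[str, str, str]  # (premise, hypothesis, intended_label)


def cross_model_filter(
    candidates: Iterable[Candidate],
    validators: List[Validator],
    min_agree: int,
) -> List[Candidate]:
    """Keep candidates whose intended label is confirmed by at least
    `min_agree` of the validator models (an assumed agreement rule)."""
    kept = []
    for premise, hypothesis, intended in candidates:
        votes = sum(1 for v in validators if v(premise, hypothesis) == intended)
        if votes >= min_agree:
            kept.append((premise, hypothesis, intended))
    return kept
```

In practice each validator would wrap a call to a different model; pairs that fail the filter could be routed to manual review rather than discarded.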

The resulting dataset captures diverse reasoning patterns, including paraphrasing, logical implication, and legally invalid inferences, thereby providing a comprehensive benchmark for Vietnamese legal inference tasks.

We conduct extensive experiments on ViLegalNLI using multilingual models, Vietnamese-specific pretrained language models, and instruction-tuned large language models.

The results show that few-shot LLM configurations consistently achieve superior performance, and that model performance varies significantly with hypothesis length, lexical overlap, and reasoning complexity.
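To make the notion of lexical overlap concrete, here is a minimal sketch of one way such an analysis could be set up: compute an overlap score per pair, then report accuracy within overlap buckets. The overlap definition (share of distinct hypothesis tokens found in the premise), the whitespace tokenization, and the bucket edges are assumptions for illustration; the exact measures used in the experiments are not stated here.

```python
from bisect import bisect_right
from collections import defaultdict


def lexical_overlap(premise: str, hypothesis: str) -> float:
    """Share of distinct hypothesis tokens that also occur in the premise.
    Whitespace tokenization is a simplification: it splits Vietnamese text
    into syllables rather than words."""
    p_tokens = set(premise.lower().split())
    h_tokens = set(hypothesis.lower().split())
    return len(p_tokens & h_tokens) / len(h_tokens) if h_tokens else 0.0


def accuracy_by_bucket(examples, feature, edges):
    """Group (premise, hypothesis, gold, predicted) tuples by the bucket that
    feature(premise, hypothesis) falls into, and return accuracy per bucket.
    Bucket i covers values v with edges[i-1] <= v < edges[i]."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for premise, hypothesis, gold, predicted in examples:
        bucket = bisect_right(edges, feature(premise, hypothesis))
        totals[bucket] += 1
        hits[bucket] += int(gold == predicted)
    return {b: hits[b] / totals[b] for b in sorted(totals)}
```

The same `accuracy_by_bucket` helper works for hypothesis length by passing a different feature function, for example `lambda p, h: len(h.split())`.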

Cross-domain evaluations further reveal the challenges of generalizing legal inference across distinct legal fields.
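A common way to probe cross-domain generalization is a leave-one-domain-out split, sketched below. This is an assumed protocol for illustration only; the actual evaluation design is not described here, and the `domain` key and record layout are hypothetical.

```python
def leave_one_domain_out(records):
    """Yield (held_out_domain, train, test) triples, where test contains all
    records from one domain and train contains the records from every other
    domain. Each record is assumed to be a dict with a 'domain' key."""
    domains = sorted({r["domain"] for r in records})
    for held_out in domains:
        train = [r for r in records if r["domain"] != held_out]
        test = [r for r in records if r["domain"] == held_out]
        yield held_out, train, test
```

Training on each `train` split and scoring on the matching `test` split gives one out-of-domain score per legal field, which can then be compared with in-domain results.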

Overall, ViLegalNLI establishes a foundational benchmark for Vietnamese legal NLI and supports future research in legal reasoning, statutory text understanding, and the development of reliable AI systems for legal analysis and decision support.

The dataset is publicly available for research purposes.