Automated Scoring of Arabic Text Using Large Language Models: A Literature Review

Abstract: In modern educational systems, Automatic Text Scoring (ATS) plays a central role by enabling scalable and consistent evaluation of learner responses without human intervention. Recently, the increased accessibility of large language models (LLMs) and Arabic-specific datasets has renewed interest in this area.

In this work, we investigate LLM-based approaches for the automated evaluation of Arabic texts, focusing on both automatic short answer grading (ASAG) and automated essay scoring (AES). We further introduce a structured taxonomy comprising five dimensions: application domain, feedback generation capability, LLM architecture used, alignment with competency reference frameworks, and prompt engineering strategy.

By applying this taxonomy, we conduct a comparative analysis of existing studies, examining their methodological approaches, datasets, evaluation metrics, and reported performance. The findings highlight the need for sustained and pedagogically grounded research efforts in Arabic ATS, given its significance for improving educational quality across Arabic-speaking communities.

Paper Details:

Authors: Khaoula Dahimi, Hadda Cherroun, Amel Belabbaci
arXiv ID: 2606.09830
Subject: Computation and Language (cs.CL)
Submission Date: 10 Apr 2026
