Hate Speech Detection in Turkish and Arabic Languages: A Comprehensive Study
Abstract: Online hate speech has been linked to a global rise in violence against minorities, including incidents such as mass shootings, lynchings, and ethnic cleansing. Societies grappling with this issue, particularly when hate speech targets specific groups based on religion, race, ethnicity, culture, nationality, or migration status, face the challenge of balancing freedom of expression with the need for effective content moderation on widely used online platforms.
In response to this challenge, we introduce a comprehensive hate speech dataset covering five distinct topics in Turkish: refugees, the Israel-Palestine conflict, anti-Greek sentiment in Turkey, ethnic or religious communities (Alevis, Armenians, Arabs, Jews, and Kurds), and LGBTI+, alongside one topic in Arabic (refugees).
In addition, we develop state-of-the-art BERT-based models to address multiple dimensions of hate speech analysis, including hate category classification, hate intensity prediction, target identification, and hate speech span detection, enabling a comprehensive understanding of hateful content in online discourse.
Paper Details:
- Authors: Somaiyeh Dehghan, Gökçe Uludoğan, Mehmet Umut Şen, Elif Erol, Arzucan Özgür, Berrin Yanikoglu
- Submission Date: 30 Jun 2026
- Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
- DOI: 10.48550/arXiv.2607.00143