DysLexLens: A Low-Resource LLM Framework for Analysing Dyslexic Learners' Insights from Online Forums

Abstract: Dyslexic learners increasingly use artificial intelligence (AI) tools to support reading, writing, organisation, and study-related tasks. However, their lived experiences with these tools remain largely underexamined. This paper proposes DysLexLens, a low-resource LLM framework designed to analyse dyslexic learners' experiences with AI through online forum discussions.

DysLexLens is designed as an end-to-end, evidence-traceable architecture that transforms noisy social media posts into a dictionary-driven corpus, provides knowledge-graph (KG)-based question reasoning, generates verifiable query responses, and enables response evaluation through quantitative and human-grounded assessment.

DysLexLens has four key features. First, it employs a dictionary-driven filtering method to construct a more focused Reddit corpus on dyslexia and AI, filtering out noisy and weakly related posts to improve the relevance of data collected from low-resource forum contexts. Second, it integrates LLM-assisted semantic analysis with KG-based query reasoning to uncover meaningful patterns. Third, it applies quantitative evaluation metrics (RAGAS and Query Robustness) to measure the quality of LLM-generated responses. Fourth, it provides structured qualitative validation guidelines for assessing response quality, with a specific focus on hallucination and evidence alignment.
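To illustrate the first feature, the following is a minimal sketch of dictionary-driven filtering, assuming a post is kept only when it mentions at least one dyslexia term and at least one AI term. The term lists and the function names (`tokenize`, `is_relevant`, `filter_posts`) are hypothetical and chosen for illustration; the actual dictionary and matching rules used by DysLexLens are not specified here.

```python
import re

# Hypothetical term lists (assumptions for illustration only);
# the framework's actual dictionary is not given in this abstract.
DYSLEXIA_TERMS = {"dyslexia", "dyslexic"}
AI_TERMS = {"ai", "chatgpt", "llm", "text-to-speech"}


def tokenize(text):
    """Lower-case the text and split it into word tokens, keeping hyphenated terms whole."""
    return set(re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()))


def is_relevant(post):
    """Keep a post only if it contains a dyslexia term AND an AI term."""
    tokens = tokenize(post)
    return bool(tokens & DYSLEXIA_TERMS) and bool(tokens & AI_TERMS)


def filter_posts(posts):
    """Return the posts that pass the dictionary filter, preserving order."""
    return [p for p in posts if is_relevant(p)]
```

Matching on whole tokens rather than substrings avoids false hits such as "ai" inside "said"; a real dictionary would likely also need multi-word phrases and spelling variants.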

We demonstrate the effectiveness of DysLexLens using dyslexia-related Reddit forum data and 30 questions. The results suggest that it may generalise to other low-resource forum data contexts. DysLexLens, sample data, questions, and evaluation results are available on GitHub to support reproducibility.
