Sem-Detect: Semantic Level Detection of AI Generated Peer-Reviews
Abstract: How can we tell whether a peer review was written by a human or generated by an AI model? We argue that, in this setting, authorship should be inferred not only from the textual features of a review but also from the ideas, judgments, and claims it expresses.
To this end, we propose Sem-Detect, an authorship detection method for peer reviews that operationalizes this principle by combining textual features with claim-level semantic analysis. Sem-Detect compares a target review against multiple AI-generated reviews of the same paper, leveraging the observation that different AI models tend to converge on similar points, while human reviewers introduce more unique and diverse ones.
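The convergence observation can be illustrated with a minimal sketch. This is not the Sem-Detect implementation: it assumes claims have already been extracted from each review as short strings, and it uses token-level Jaccard overlap as a crude stand-in for a real semantic similarity model. The names `shared_claim_fraction` and `jaccard`, and the threshold of 0.5, are hypothetical choices made for illustration only.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a claim and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity; a placeholder for semantic similarity."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def shared_claim_fraction(
    target_claims: list[str],
    ai_reviews: list[list[str]],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> float:
    """Fraction of the target review's claims that also appear (approximately)
    in at least one AI-generated review of the same paper."""
    if not target_claims:
        return 0.0
    # Pool the claims of all AI-generated reference reviews.
    ai_claims = [claim for review in ai_reviews for claim in review]
    matched = sum(
        1
        for claim in target_claims
        if any(jaccard(claim, other) >= threshold for other in ai_claims)
    )
    return matched / len(target_claims)
```

Under these assumptions, a high fraction suggests the target review mostly repeats points that AI models converge on, while a low fraction suggests more unique claims. In practice such a score would be one feature alongside the textual ones, not a classifier by itself.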
As a result, Sem-Detect can distinguish fully AI-generated reviews from authentic human-written ones, including those that have been refined using an LLM but still reflect human judgment. On a dataset of over 20,000 peer reviews from the ICLR and NeurIPS conferences, Sem-Detect improves over the strongest baseline by 25.5% in TPR@0.1% FPR in the binary setting.
Moreover, in the three-class scenario, we empirically show that LLM refinement preserves the semantic signals of human reviews, which remain distinct from the patterns exhibited by fully AI-generated text; as a result, fewer than 3.5% of LLM-refined human reviews are misclassified as AI-generated.
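For readers unfamiliar with the TPR@0.1% FPR metric reported above, the following sketch shows one common way to compute a true positive rate at a capped false positive rate. It assumes label 1 means AI-generated (the positive class), label 0 means human-written, and a higher score means "more likely AI". The function name `tpr_at_fpr` is hypothetical, and the paper's exact evaluation protocol may differ.

```python
def tpr_at_fpr(scores: list[float], labels: list[int], max_fpr: float = 0.001) -> float:
    """Best true positive rate achievable while keeping the false positive
    rate at or below max_fpr (0.001 corresponds to 0.1% FPR).

    A review is predicted positive when its score is >= the threshold.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    best = 0.0
    # Try every observed score as a threshold, plus +inf (predict nothing positive).
    for threshold in sorted(set(scores)) + [float("inf")]:
        fpr = sum(s >= threshold for s in neg) / len(neg)
        if fpr <= max_fpr:
            tpr = sum(s >= threshold for s in pos) / len(pos)
            best = max(best, tpr)
    return best
```

This version is quadratic in the number of reviews, which keeps it readable; a sorted single pass would be preferable at scale. Note that with a 0.1% cap, at least 1,000 human-written reviews are needed before even one false positive is permitted, so the metric is only meaningful on large evaluation sets.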