Sem-Detect: Semantic Level Detection of AI Generated Peer-Reviews

Abstract: How can we tell whether a peer review was written by a human or generated by an AI model? We argue that, in this setting, authorship should be attributed not only on the basis of a review's textual features, but also on the basis of the ideas, judgments, and claims it expresses.

To this end, we propose Sem-Detect, an authorship detection method for peer reviews that operationalizes this principle by combining textual features with claim-level semantic analysis. Sem-Detect compares a target review against multiple AI-generated reviews of the same paper, leveraging the observation that different AI models tend to converge on similar points, while human reviewers raise more distinctive and diverse ones.
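
The comparison idea can be illustrated with a toy sketch. It is not the Sem-Detect implementation: the abstract does not specify how claims are extracted or matched, so the sentence-level splitting, the Jaccard token similarity, the 0.5 threshold, and the function names (`split_claims`, `claim_overlap`) are all assumptions made for illustration. The sketch measures what fraction of a target review's claims also appear in AI-generated reviews of the same paper; under the observation above, a high fraction would point toward AI authorship.

```python
import re


def split_claims(review: str) -> list[str]:
    """Naive stand-in for claim extraction: treat each sentence as one claim."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [p for p in parts if p]


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a claim."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def claim_overlap(target_review: str, ai_reviews: list[str],
                  threshold: float = 0.5) -> float:
    """Fraction of target claims matched by at least one claim in the AI reviews.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    reference_claims = [tokens(c) for r in ai_reviews for c in split_claims(r)]
    target_claims = [tokens(c) for c in split_claims(target_review)]
    if not target_claims:
        return 0.0
    matched = sum(
        1 for t in target_claims
        if any(jaccard(t, ref) >= threshold for ref in reference_claims)
    )
    return matched / len(target_claims)


ai_reviews = [
    "The paper lacks ablation studies. The writing is clear.",
    "The writing is clear. The paper lacks ablation studies on key components.",
]
generic_review = "The paper lacks ablation studies. The writing is clear."
specific_review = "Theorem 2 assumes bounded gradients without justification."

print(claim_overlap(generic_review, ai_reviews))
print(claim_overlap(specific_review, ai_reviews))
```

A real system would presumably replace the token overlap with a semantic similarity measure and combine the resulting signal with textual features, as the abstract describes.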

As a result, Sem-Detect is able to distinguish fully AI-generated reviews from authentic human-written ones, including those that have been refined using an LLM but still reflect human judgment. Across a dataset of over 20,000 peer reviews from the ICLR and NeurIPS conferences, Sem-Detect improves over the strongest baseline by 25.5% in TPR@0.1% FPR in the binary setting.
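
For readers unfamiliar with the metric, TPR@0.1% FPR is the true positive rate obtained when the decision threshold is set so that at most 0.1% of negatives are falsely flagged. The sketch below shows one way to compute it, assuming that AI-generated reviews are the positive class and that higher detector scores mean "more likely AI"; both conventions, and the function name `tpr_at_fpr`, are assumptions rather than details taken from the paper.

```python
import math


def tpr_at_fpr(human_scores: list[float], ai_scores: list[float],
               max_fpr: float = 0.001) -> float:
    """TPR on AI reviews at the lowest threshold whose FPR on human reviews
    does not exceed max_fpr. A review is flagged when score > threshold."""
    if not human_scores or not ai_scores:
        raise ValueError("both score lists must be non-empty")
    # How many human reviews may be flagged without exceeding max_fpr.
    # The small epsilon guards against floating-point rounding in the product.
    allowed = math.floor(max_fpr * len(human_scores) + 1e-9)
    ranked = sorted(human_scores, reverse=True)
    if allowed >= len(ranked):
        threshold = -math.inf  # every human review may be flagged
    else:
        # At most `allowed` human scores are strictly above this value.
        threshold = ranked[allowed]
    flagged = sum(1 for s in ai_scores if s > threshold)
    return flagged / len(ai_scores)


human = [0.1, 0.2, 0.3, 0.4]
ai = [0.35, 0.5, 0.9, 0.05]
print(tpr_at_fpr(human, ai, max_fpr=0.25))  # threshold 0.3 -> 0.75
print(tpr_at_fpr(human, ai, max_fpr=0.0))   # threshold 0.4 -> 0.5
```

Note that at an FPR of 0.1%, fewer than 1,000 human reviews would leave no room for even a single false positive, which is why an operating point this strict is only meaningful on a large evaluation set.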

Moreover, in the three-class scenario, we empirically show that LLM refinement preserves the semantic signals of human reviews, which remain distinct from the patterns exhibited by fully AI-generated text; as a result, fewer than 3.5% of LLM-refined human reviews are misclassified as AI-generated.
