
arXiv tightens AI-authorship rules with one-strike year-long ban

Authors caught publishing unverified LLM output — hallucinated references and visible model chatter being the giveaway — face a 12-month ban and a requirement that future submissions clear peer review first.

The preprint server arXiv has formalised what it calls a "one-strike" rule against authors whose submissions show "incontrovertible evidence that the authors did not check the results of LLM generation," according to TechCrunch's reporting on 16 May 2026 [1]. The penalty is a one-year ban, after which the author may submit only papers that have already cleared a "reputable peer-reviewed venue" [1].

Thomas Dietterich, chair of arXiv's computer science section, framed the rationale in plain terms: "if a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper" [1]. The triggers arXiv cites include hallucinated references and exchanges with the model accidentally left in the manuscript body [1].

Procedurally, moderators flag suspicious submissions and section chairs confirm the evidence before any ban is imposed. Authors retain the right to appeal [1]. Responsibility for content remains with the human authors "irrespective of how the contents are generated" [1].

The move continues a pattern of arXiv tightening submission integrity. Earlier measures included endorsement requirements for first-time posters [1]. The report cites arXiv's recent transition to independent non-profit status as part of the context for the policy change [1].

The reporting leaves two questions unresolved. First, scope: the announcement focuses on the computer science section, and it does not say whether the same rule applies to physics, mathematics, biology, and the other arXiv archives. Second, enforcement data: there are no public figures on how many submissions arXiv has already caught under this standard, or on what fraction of incoming computer-science submissions show LLM-generation artefacts.

For working ML researchers, the practical implication is unambiguous. If a paper draft contains references that don't resolve, or any visible trace of the model that wrote the prose, that paper is now a career-affecting risk. Read your own citations.
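
As a rough illustration of that kind of self-check, here is a minimal Python sketch that scans a LaTeX draft for leftover chat-style phrases and for citation keys with no bibliography entry. Everything in it is an assumption made for illustration: the phrase list, the function names `find_model_chatter` and `find_undefined_citations`, and the choice of LaTeX are hypothetical, and none of it reflects the checks arXiv moderators actually run.

```python
import re

# Hypothetical phrase list: illustrative only, not arXiv's actual criteria.
CHATTER_PATTERNS = [
    r"as an ai language model",
    r"as a large language model",
    r"certainly[,!] here (is|are)",
    r"i hope this helps",
    r"regenerate response",
]


def find_model_chatter(text):
    """Return (line_number, line) pairs that match any chatter pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in CHATTER_PATTERNS]
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(c.search(line) for c in compiled):
            hits.append((number, line.strip()))
    return hits


def find_undefined_citations(tex, bib_keys):
    """Return sorted keys used in \\cite-style commands but absent from bib_keys."""
    used = set()
    # Matches \cite{a,b}, \citep[see][p. 2]{c}, \citet*{d}, and similar forms.
    pattern = r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}"
    for group in re.findall(pattern, tex):
        used.update(key.strip() for key in group.split(",") if key.strip())
    return sorted(used - set(bib_keys))


if __name__ == "__main__":
    draft = (
        "Intro.\n"
        "Certainly! Here is the revised abstract:\n"
        "We build on \\cite{smith2020, doe2021}.\n"
    )
    print(find_model_chatter(draft))
    print(find_undefined_citations(draft, {"smith2020"}))
```

A check like this only catches surface artefacts. It confirms that a cited key exists in the bibliography, not that the referenced paper is real, so each reference still has to be looked up and read by a human author.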