arXiv tightens AI-authorship rules with one-strike year-long ban
Authors caught publishing unverified LLM output, with hallucinated references and visible model chatter as the giveaways, face a 12-month ban and a requirement that future submissions clear peer review first.
The preprint server arXiv has formalised what it calls a "one-strike" rule against authors whose submissions show "incontrovertible evidence that the authors did not check the results of LLM generation," according to TechCrunch's reporting on 16 May 2026 [1]. The penalty is a one-year ban, after which the author may submit only papers that have already cleared a "reputable peer-reviewed venue" [1].
Thomas Dietterich, chair of arXiv's computer science section, framed the rationale in plain terms: "if a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper" [1]. The triggers arXiv cites include hallucinated references and messages to and from the model accidentally left in the manuscript body [1].
Procedurally, moderators flag suspicious submissions and section chairs confirm the evidence before any ban is imposed. Authors retain the right to appeal [1]. Responsibility for content remains with the human authors "irrespective of how the contents are generated" [1].
The move continues a pattern of arXiv tightening submission integrity. Earlier measures included endorsement requirements for first-time posters [1]. The reporting also cites arXiv's recent transition to independent non-profit status as part of the context for the policy change [1].
The reporting leaves two questions open. First, scope: the announcement focuses on the computer science section, and it is not stated whether the same rule applies to physics, mathematics, biology, and the other arXiv archives. Second, enforcement scale: there is no public figure for how many submissions arXiv has already caught under this standard, or for what fraction of incoming computer-science submissions show LLM-generation artefacts.
For working ML researchers, the practical implication is unambiguous. If a paper draft contains references that don't resolve, or any visible trace of the model that wrote the prose, that paper is now a career-affecting risk. Read your own citations.
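As a rough illustration of that self-check, the sketch below scans a draft for leftover chat text and lists the DOIs it finds so each can be opened and verified by hand. It is a minimal sketch under stated assumptions, not arXiv's moderation method: the phrase list, the simplified DOI pattern, and the function names `find_model_chatter` and `extract_doi_urls` are all hypothetical, and the sample DOI is a made-up placeholder. A clean result proves nothing; it only catches the most obvious slips.

```python
import re

# Hypothetical phrase list: illustrative guesses at leftover chat text,
# not arXiv's actual moderation criteria.
CHATTER_PATTERNS = [
    r"as an ai language model",
    r"as a large language model",
    r"certainly[,!] here(?: is|'s)",
    r"i hope this helps",
    r"regenerate response",
]

# Simplified DOI pattern (assumption: covers common modern DOIs only).
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>{}]+", re.IGNORECASE)


def find_model_chatter(text):
    """Return (line_number, line) pairs that match any chatter pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        if any(re.search(pattern, lowered) for pattern in CHATTER_PATTERNS):
            hits.append((number, line.strip()))
    return hits


def extract_doi_urls(text):
    """Return de-duplicated https://doi.org/ URLs for DOIs found in text."""
    urls = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".,;)")  # drop trailing sentence punctuation
        url = "https://doi.org/" + doi
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Made-up draft text; the DOI below is a placeholder, not a real paper.
    draft = (
        "We build on prior work [3].\n"
        "Certainly! Here is a revised related-work section:\n"
        "[3] A. Author, Example Title, doi:10.1234/example.5678.\n"
    )
    for number, line in find_model_chatter(draft):
        print(f"line {number}: possible model chatter: {line}")
    for url in extract_doi_urls(draft):
        print(f"check by hand: {url}")
```

The script deliberately makes no network requests. A DOI that resolves can still point to a paper that does not say what the citation claims, so the list is a reading queue, not a verdict.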