Submit AI-Generated Slop to arXiv, Get a One-Year Vacation from Publishing

arXiv just dropped the hammer on AI-generated slop. Starting today, May 16, 2026, any researcher caught submitting papers written by LLMs like GPT-4o or Claude 3.5 without proper attribution faces a mandatory 12-month submission ban. This isn’t just about quality control; it is a desperate move to save the peer-review process before it drowns in synthetic noise. I have seen some of these papers lately, and they are basically word salad with zero value. It is about time someone set a hard boundary.

📋 In This Article

How the New arXiv Enforcement Policy Works
Why LLM-Generated Papers are Breaking the System
How to Avoid the One-Year Ban
The Impact on the Global AI Community
⭐ Pro Tips
❓ FAQ


How the New arXiv Enforcement Policy Works

The Cornell-run repository is officially fed up. They are now using a combination of internal detection tools and community reporting to flag suspicious content. If your paper is flagged for containing more than 30% unedited LLM output in the core methodology or results sections, you are out. The ban lasts exactly 365 days. It is a bold move considering arXiv hosts over 2.2 million papers and serves as the backbone of open science. I think the 30% threshold is actually quite generous. If you cannot be bothered to write your own abstract or explain your own data, you probably should not be in academia in the first place. The system is designed to catch ‘prompt-engineered’ papers that lack actual human oversight.

Detection and the 12-Month Penalty

The detection algorithms are updated weekly to keep pace with Gemini 2.0 and newer models. If you get caught, your account is locked for a year, and your previous submissions get a ‘Review Required’ flag. It is a reputation killer.
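To make the numbers concrete, here is a minimal sketch of the threshold check and the ban arithmetic. It assumes the 30% share is measured by word count across the core sections, which is my guess and not something the policy spells out; the function names and the word counts are made up for illustration.

```python
from datetime import date, timedelta
from typing import Optional

LLM_SHARE_THRESHOLD = 0.30  # "more than 30%" figure quoted above
BAN_DAYS = 365              # ban length quoted above


def llm_share(flagged_words: int, total_words: int) -> float:
    """Fraction of core-section words flagged as unedited LLM output.

    Measuring by word count is an assumption made for this sketch.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return flagged_words / total_words


def ban_end(flag_date: date, flagged_words: int, total_words: int) -> Optional[date]:
    """Return the date the ban lifts, or None if the paper stays under the threshold."""
    if llm_share(flagged_words, total_words) > LLM_SHARE_THRESHOLD:
        return flag_date + timedelta(days=BAN_DAYS)
    return None


# Hypothetical paper: 1,400 of 4,000 core-section words flagged (35%).
print(ban_end(date(2026, 5, 16), 1400, 4000))  # 2027-05-16
```

A paper sitting at exactly 30% would pass in this sketch, because the wording above is "more than 30%".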

Why LLM-Generated Papers are Breaking the System

It costs real money to host this stuff. arXiv operates on an annual budget of roughly $3.2 million, funded by member institutions and the Simons Foundation. When the server gets hit with 6,000 ‘papers’ a month that are just Claude 3.5 rehashes, it wastes storage, bandwidth, and human volunteer time. I have spent way too much time on r/MachineLearning reading about reviewers getting hit with 25 papers a week that look identical. It is killing the vibe of open science. We are seeing a massive influx of low-effort content that obscures real breakthroughs. When every ‘researcher’ can generate a 15-page paper in 30 seconds, the signal-to-noise ratio hits zero. This ban is the only way to keep the platform functional.

The Burden on Infrastructure

Storing millions of PDFs is not free. With submissions up 45% since 2024, arXiv is struggling to keep up with the physical storage costs and the human cost of basic moderation.

How to Avoid the One-Year Ban

You can still use AI for basic grammar and flow. Tools like Grammarly, which costs about $12 a month, or the basic spellcheck in Microsoft Word are perfectly fine. The problem is when you ask an LLM to ‘write a methodology section for a CNN-based image classifier’ and paste it directly into your LaTeX editor. That is the slop arXiv is targeting. If you use AI for brainstorming, you need to cite it. If you use it for code generation—like GitHub Copilot—you should document it in your appendix. Transparency is the only way to stay in the clear. I suggest keeping your prompt history just in case you need to appeal a false positive, though those are becoming rare.

Proper Attribution and Tool Usage

If you used an LLM for more than simple proofreading, list it in the acknowledgments. Failure to do so is now considered academic misconduct under the new May 2026 guidelines.
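If you want a paper trail, something like the following would do. This is a rough sketch, not an arXiv-endorsed format: the JSON Lines file layout, the field names, and the wording of the generated acknowledgment sentence are all my own assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_ai_use(log_path, tool, purpose, prompt):
    """Append one AI-usage record to a JSON Lines file and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,        # e.g. "GitHub Copilot"
        "purpose": purpose,  # e.g. "code generation"
        "prompt": prompt,    # the prompt text you actually sent
    }
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def acknowledgment_text(log_path):
    """Summarize the log as one sentence for the acknowledgments section."""
    tools = {}
    with Path(log_path).open(encoding="utf-8") as fh:
        for line in fh:
            rec = json.loads(line)
            tools.setdefault(rec["tool"], set()).add(rec["purpose"])
    parts = [
        f"{tool} ({', '.join(sorted(purposes))})"
        for tool, purposes in sorted(tools.items())
    ]
    return "AI tools used in preparing this paper: " + "; ".join(parts) + "."
```

Call `log_ai_use("ai_log.jsonl", "GitHub Copilot", "code generation", "...")` each time you lean on a tool, then paste the output of `acknowledgment_text("ai_log.jsonl")` into your acknowledgments and adjust the wording to taste. The log itself doubles as the prompt history you would want in an appeal.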

The Impact on the Global AI Community

We have seen a massive spike in submissions since GPT-4 launched, and frankly, most of it is garbage. This ban will likely cut the noise by at least 20% by the end of the year. I am seeing some backlash on X from people claiming this stifles innovation, but I call BS. If your innovation depends on an LLM writing the paper for you, it is not innovation. It is just noise. Real researchers are cheering this on because their work won’t get buried under 500 copies of the same prompt-engineered junk. The goal is to return to a world where an arXiv notification actually means something. It is about restoring the prestige of the platform before it becomes a glorified blog hosting site.

Moving Back to Quality Over Quantity

The focus is shifting back to empirical results. If your paper has 10 pages of text but only 1 graph of original data, you are going to get flagged.

⭐ Pro Tips

Use Overleaf ($15/mo) to track your version history so you can prove you wrote the paper manually if flagged.
Always run your final draft through a detector like GPTZero before hitting upload to check for accidental ‘AI-isms’.
Keep your original raw data sets and computation logs; they are your best defense against an AI-generation accusation.
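
For the last tip, one cheap way to show your data existed in a given state is a checksum manifest. The sketch below is only an illustration: the function names are hypothetical, and nothing in the policy says arXiv accepts or asks for a manifest like this.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file under data_dir (by relative POSIX path) to its SHA-256 digest."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Save the manifest next to your computation logs when you submit. If you are flagged later, matching digests show the raw files have not changed since then.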

Frequently Asked Questions

Can I use ChatGPT to edit my arXiv paper?

Yes, for grammar and style, but do not let it generate whole paragraphs. arXiv policy allows for minor linguistic assistance but bans synthetic content generation for core research.

Is arXiv still free to use?

Yes, it remains free for both readers and authors. This is exactly why the ban is necessary—to prevent the free resource from being abused by low-effort AI spam.

How long is the arXiv ban?

The initial ban is exactly 365 days for the first offense. Subsequent offenses result in a permanent ban of the author’s email domain and institutional ID.

Final Thoughts

arXiv had to do this. The preprint world was turning into a digital landfill. If you are a real researcher doing real work, this should not scare you. It should make you happy that your 20-page deep dive into transformer architecture won’t be competing for attention with an AI-generated listicle. Stop using the ‘Write for me’ button and start typing. See you in a year if you don’t follow the rules.