worldbench/awesome-ai-auto-research


😎 Awesome AI Auto-Research

This repository accompanies the survey paper "AI for Auto-Research: Roadmap & User Guide" and tracks papers on AI-assisted and automated scientific research, covering the full research lifecycle.

🤖 AI Auto-Research

We organize the academic research lifecycle as eight interconnected stages grouped into four epistemological phases. Each phase serves a distinct function in producing, scrutinizing, and communicating scientific knowledge.

Phase 1: Creation
Generating novel research ideas, searching and synthesizing literature, running coding experiments, and creating publication-quality tables and figures. This phase spans Idea Generation, Literature Review, Coding & Experiments, and Tables & Figures.
Phase 2: Writing
Drafting, editing, and polishing academic manuscripts. AI assistance ranges from semi-automated grammar and citation tools to fully automated paper generation; this is the most commercially mature yet ethically contested stage.
Phase 3: Validation
Automated peer review generation, reviewer-paper matching, review quality assessment, and AI-assisted author rebuttals. This phase covers Peer Review and Rebuttal & Revision.
Phase 4: Dissemination
Converting papers into slides, posters, videos, websites, and social media content. Each output format targets a different audience and demands its own design logic and AI tool chain.
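The phase-to-stage grouping described above can be written down as a small lookup table. This is only an illustrative sketch: the names `PHASES` and `phase_of` are hypothetical and not part of the repository, and the stage labels for the Writing and Dissemination phases ("Paper Writing", "Dissemination") are assumed from the section headings of this list.

```python
# Illustrative mapping of the four phases to the eight stages named in the text.
# PHASES and phase_of are hypothetical names, not part of the repository.
PHASES = {
    "Creation": [
        "Idea Generation",
        "Literature Review",
        "Coding & Experiments",
        "Tables & Figures",
    ],
    "Writing": ["Paper Writing"],  # label assumed from the section heading
    "Validation": ["Peer Review", "Rebuttal & Revision"],
    "Dissemination": ["Dissemination"],  # label assumed from the section heading
}


def phase_of(stage: str) -> str:
    """Return the phase that contains the given stage name."""
    for phase, stages in PHASES.items():
        if stage in stages:
            return phase
    raise KeyError(f"unknown stage: {stage}")


# The text speaks of eight stages in four phases; the table should agree.
total_stages = sum(len(stages) for stages in PHASES.values())
```

For example, `phase_of("Peer Review")` looks up the phase that a stage belongs to, which can help when tagging new entries for this list.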

For additional details, kindly refer to our 📚 Paper and 🌏 Project Page.

📚 Citation

If you find this work helpful for your research, please consider citing our paper:

@article{survey-ai-auto-research,
  title   = {{AI} for {Auto-Research}: Roadmap \& User Guide},
  author  = {Kong, Lingdong and Sun, Xian and Chow, Wei and Li, Linfeng and Lin, Kevin Qinghong and Zhang, Xuan Billy
             and Wang, Song and Li, Rong and Wu, Qing and Gao, Wei and Wang, Yingshuo and Xie, Shaoyuan
             and Liu, Jiachen and Qu, Leigang and Li, Shijie and Ng, Lai Xing and Cottereau, Benoit R.
             and Liu, Ziwei and Chua, Tat-Seng and Ooi, Wei Tsang},
  journal = {arXiv preprint arXiv:2605.18661},
  year    = {2026}
}

1. Idea Generation

LLM Internal Knowledge-Based Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Chain of Ideas arXiv
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
arXiv '24 - GitHub
ResearchAgent Website
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
NAACL '25 - GitHub
SciMON arXiv
SciMON: Scientific Inspiration Machines Optimized for Novelty
ACL '24 - GitHub
Idea Gen Agent arXiv
Can LLMs Generate Novel Research Ideas? A Large Scale Human Study with 100+ NLP Researchers
arXiv '24 - -
IRIS Website
IRIS: Interactive Research Ideation System for Accelerating Scientific Discovery
ACL '25 - GitHub
Spark arXiv
Spark: A System for Scientifically Creative Idea Generation
ICCC '25 - -

External Signal-Driven Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
MOOSE-Chem Website
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
ICLR '25 - -
Nova arXiv
Nova: An Iterative Planning and Search Approach to Enhance Novelty and Diversity of LLM Generated Ideas
arXiv '24 - -
SciAgents arXiv
SciAgents: Automating Scientific Discovery through Multi-Agent Intelligent Graph Reasoning
arXiv '24 - GitHub
SciPIP arXiv
SciPIP: An LLM-based Scientific Paper Idea Proposer
arXiv '24 - GitHub
IdeaSynth arXiv
IdeaSynth: Iterative Research Idea Development Through Evolving and Composing Idea Facets with Literature-Grounded Feedback
CHI '25 - -
MOOSE-Chem2 Website
MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search
NeurIPS '25 - -

Multi-Agent Collaborative Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Combi. Creativity arXiv
Combi. Creativity
arXiv '24 - -
Deep Ideation arXiv
Deep Ideation: Designing LLM Agents to Generate Novel Research Ideas on Scientific Concept Network
arXiv '25 - GitHub
VirSci Website
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System
ACL '25 - GitHub
Multi-Agent Dial. arXiv
Multi-Agent Dial.
SIGDIAL '25 - -
Artificial Hivemind arXiv
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
NeurIPS '25 - -

Novelty and Feasibility Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
IdeaBench Website
IdeaBench: Benchmarking Large Language Models for Research Idea Generation
KDD '25 - -
LiveIdeaBench arXiv
LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
arXiv '24 - -
AI Idea Bench 2025 arXiv
AI Idea Bench 2025: AI Research Idea Generation Benchmark
arXiv '25 - GitHub
HeurekaBench arXiv
HeurekaBench: A Benchmarking Framework for AI Co-scientist
ICLR '26 - GitHub
ResearchBench arXiv
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition
ACL '26 - -
HindSight arXiv
HindSight: Evaluating LLM-Generated Research Ideas via Future Impact
arXiv '26 - -
Rubric Rewards arXiv
Training AI Co-Scientists Using Rubric Rewards
arXiv '25 - -
DeepInnovator arXiv
DeepInnovator: Triggering the Innovative Capabilities of LLMs
arXiv '26 - GitHub
FlowPIE arXiv
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration
arXiv '26 - -

2. Literature Review & Paper Search

Literature Retrieval

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
CiteME arXiv
CiteME: Can Language Models Accurately Cite Scientific Claims?
arXiv '24 - -
LitLLM arXiv
LitLLM: A Toolkit for Literature Review with Large Language Models
arXiv '24 - -
LitSearch arXiv
LitSearch: A Retrieval Benchmark for Scientific Literature Search
arXiv '24 - GitHub
PaperQA2 arXiv
Language Agents Achieve Superhuman Synthesis of Scientific Knowledge
arXiv '24 - GitHub
OpenResearcher arXiv
OpenResearcher: Unleashing AI for Accelerated Scientific Research
EMNLP '24 - -
PaSa arXiv
PaSa: An LLM Agent for Comprehensive Academic Paper Search
arXiv '25 - GitHub

Survey & Related Work Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ChatPaper Website
ChatPaper: Use LLM to summarize papers
GitHub '23 - GitHub
PaperQA arXiv
PaperQA: Retrieval-Augmented Generative Agent for Scientific Research
arXiv '23 - GitHub
AutoSurvey arXiv
AutoSurvey: Large Language Models Can Automatically Write Surveys
arXiv '24 - GitHub
GPT Researcher Website
GPT Researcher: Autonomous Agent for Comprehensive Online Research
GitHub '24 - GitHub
LLMs for Lit. Review arXiv
LLMs for Lit. Review
arXiv '24 - -
STORM arXiv
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
arXiv '24 - GitHub
Agentic AutoSurvey arXiv
Agentic AutoSurvey: Let LLMs Survey LLMs
arXiv '25 - -
Citegeist arXiv
Citegeist: Automated Generation of Related Work Analysis on the arXiv Corpus
arXiv '25 - -
IterSurvey arXiv
IterSurvey: Deep Literature Survey Automation with an Iterative Workflow
arXiv '25 - GitHub
LiRA arXiv
LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation
arXiv '25 - -
SurveyForge arXiv
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing
arXiv '25 - GitHub
SurveyG arXiv
SurveyG: A Multi-Agent LLM Framework with Hierarchical Citation Graph for Automated Survey Generation
arXiv '25 - -
SurveyX arXiv
SurveyX: Academic Survey Automation via Large Language Models
arXiv '25 - -
InteractiveSurvey arXiv
InteractiveSurvey: An LLM-based Personalized and Interactive Survey Paper Generation System
arXiv '25 - GitHub
CiteLLM arXiv
CiteLLM: An Agentic Platform for Trustworthy Scientific Reference Discovery
arXiv '26 - -

Deep Research Agents

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ASReview Website
An Open Source Machine Learning Framework for Efficient and Transparent Systematic Reviews
Nature MI '21 - GitHub
CHIME arXiv
CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support
arXiv '24 - -
DeepResearch-Agent Website
DeepResearchAgent: A Hierarchical Multi-Agent System for Deep Research
GitHub '25 - GitHub
DeerFlow Website
DeerFlow: A Deep Research Framework Orchestrating Sub-Agents, Memory, and Sandboxes
GitHub '25 - GitHub
OpenScholar Website
OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs
Nature '26 - -
AutoAgent arXiv
AutoAgent
arXiv '25 - -
Tongyi DeepResearch Website
Tongyi DeepResearch
GitHub '25 - GitHub
O-Researcher arXiv
O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL
arXiv '26 - -
OpenResearcher arXiv
OpenResearcher: Unleashing AI for Accelerated Scientific Research
arXiv '26 - GitHub

Retrieval and Synthesis Quality Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
DeepScholar-Bench arXiv
DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis
arXiv '25 - GitHub
ReportBench arXiv
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks
arXiv '25 - GitHub
IDRBench arXiv
IDRBench: Interactive Deep Research Benchmark
arXiv '26 - -
ScholarGym arXiv
ScholarGym: Benchmarking Large Language Model Capabilities in the Information-Gathering Stage of Deep Research
arXiv '26 - -
SciNetBench arXiv
SciNetBench: A Relation-Aware Benchmark for Scientific Literature Retrieval Agents
arXiv '26 - -

3. Coding & Experimentation

Code Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
SWE-bench arXiv
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
ICLR '24 - GitHub
SWE-agent arXiv
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
arXiv '24 - GitHub
OpenHands arXiv
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
ICLR '25 - GitHub
SWE-bench Pro arXiv
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
arXiv '25 - -
SWE-EVO arXiv
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios
arXiv '25 - -

Paper-to-Code

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
FunSearch Website
Mathematical Discoveries from Program Search with Large Language Models
Nature '24 - GitHub
SciCode arXiv
SciCode: A Research Coding Benchmark Curated by Scientists
arXiv '24 - GitHub
PaperBench arXiv
PaperBench: Evaluating AI's Ability to Replicate AI Research
arXiv '25 - GitHub
PaperCoder arXiv
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
arXiv '25 - GitHub
ResearchCodeBench arXiv
ResearchCodeBench: Benchmarking LLMs on Implementing Novel ML Research Code
arXiv '25 - -
SciReplicate-Bench arXiv
SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers
arXiv '25 - GitHub

Experiment Execution & Orchestration

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
BioPlanner arXiv
BioPlanner: Automatic Evaluation of LLMs on Protocol Planning
arXiv '23 - GitHub
CRISPR-GPT arXiv
CRISPR-GPT for Agentic Automation of Gene-Editing Experiments
arXiv '24 - -
DS-Agent arXiv
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
arXiv '24 - GitHub
MLE-Bench arXiv
MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering
arXiv '24 - -
MLAgentBench arXiv
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
arXiv '24 - GitHub
MLR-Copilot arXiv
MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents
arXiv '24 - -
AIDE arXiv
AIDE: AI-Driven Exploration in the Space of Code
arXiv '25 - -
AlphaEvolve arXiv
AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery
arXiv '25 - -
AutoReproduce arXiv
AutoReproduce: Automatic AI Experiment Reproduction with Paper Lineage
arXiv '25 - GitHub
CURIE arXiv
Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents
arXiv '25 - GitHub
MLGym arXiv
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
arXiv '25 - -
MLR-Bench arXiv
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
arXiv '25 - -
Execution-Grounded arXiv
Towards Execution-Grounded Automated AI Research
arXiv '26 - -
Learn to Discover arXiv
Learning to Discover at Test Time
arXiv '26 - -
SciNav arXiv
SciNav: A General Agent Framework for Scientific Coding Tasks
arXiv '26 - -
FrontierScience arXiv
FrontierScience: Evaluating AI's Ability to Perform Expert-Level Scientific Tasks
arXiv '26 - -

Code Correctness and Reproducibility Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
DiscoveryBench arXiv
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
arXiv '24 - GitHub
DiscoveryWorld arXiv
DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents
arXiv '24 - GitHub
InfiAgent-DABench arXiv
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
arXiv '24 - -
ScienceAgentBench arXiv
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
arXiv '24 - -
LAB-Bench arXiv
Lab-Bench: Measuring Capabilities of Language Models for Biology Research
arXiv '24 - GitHub
KernelBench arXiv
KernelBench: Can LLMs Write Efficient GPU Kernels?
arXiv '25 - GitHub
TritonBench arXiv
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
arXiv '25 - GitHub
AstaBench arXiv
AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite
arXiv '25 - GitHub
ResearchClawBench arXiv
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
arXiv '25 - GitHub
EXP-Bench Website
EXP-Bench: Can AI Conduct AI Research Experiments?
ICLR '26 - GitHub
PostTrainBench arXiv
PostTrainBench: Can LLM Agents Automate LLM Post-Training?
arXiv '26 - GitHub

4. Tables & Figures

Scientific Figure Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ChartGPT arXiv
ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language
arXiv '23 - -
MatPlotAgent arXiv
MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization
arXiv '24 - -
CoDA arXiv
CoDA: Agentic Systems for Collaborative Data Visualization
arXiv '25 - -
PlotGen arXiv
PlotGen: Multi-Agent LLM-based Scientific Data Visualization via Multimodal Feedback
arXiv '25 - -
VIS-Shepherd arXiv
VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation
arXiv '25 - -
DiagramAgent arXiv
From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing
CVPR '25 - -
StarVector arXiv
StarVector: Generating Scalable Vector Graphics Code from Images and Text
CVPR '25 - -
VisCoder arXiv
VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation
EMNLP '25 - -
AI-Generated Figures arXiv
AI-Generated Figures
arXiv '26 - -
AutoFigure-Edit arXiv
AutoFigure-Edit: Generating Editable Scientific Illustration
arXiv '26 - GitHub
AutoFigure arXiv
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
ICLR '26 - GitHub
PaperBanana arXiv
PaperBanana: Automating Academic Illustration for AI Scientists
arXiv '26 - -
SAIL arXiv
Setting SAIL: Leveraging Scientist-AI-Loops for Rigorous Visualization Tools
arXiv '26 - -

Table Understanding & Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ArxivDIGESTables arXiv
ArxivDIGESTables: Synthesizing Scientific Literature into Tables using Language Models
EMNLP '24 - -
Chain-of-Table arXiv
Chain-of-Table: Evolving Tables in Reasoning Chain for Table Understanding
ICLR '24 - -
ShowTable arXiv
ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement
CVPR '26 - -
Table2LaTeX-RL arXiv
Table2LaTeX-RL: Converting Table Images to High-Fidelity LaTeX Code Using Reinforced Multimodal Language Models
arXiv '25 - -

Mathematical Formulas & TikZ

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AutomaTikZ arXiv
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
ICLR '24 - -
DeTikZify arXiv
DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
NeurIPS '24 - -
TikZilla arXiv
TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning
arXiv '26 - -

Visual Fidelity and Scientific Accuracy Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
PlotCraft arXiv
PlotCraft: Pushing the Limits of LLMs for Complex and Interactive Data Visualization
arXiv '25 - -
TeXpert Website
TeXpert: Multi-Level Benchmark for LaTeX Code Generation
SDP '25 - -
AbGen arXiv
AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research
ACL '25 - -
SciFig arXiv
SciFig: Towards Automating Scientific Figure Generation
arXiv '26 - -
SciFlow-Bench arXiv
SciFlow-Bench: Evaluating Structure-Aware Scientific Diagram Generation via Inverse Parsing
arXiv '26 - -
FigureBench arXiv
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
ICLR '26 - GitHub

5. Paper Writing

Semi-Automated Writing Assistance

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
CoAuthor arXiv
CoAuthor: Human-AI Collaborative Writing with Language Models
arXiv '22 - -
AI Writing Study arXiv
AI Writing Study
AIED '25 - -
DraftMarks arXiv
DraftMarks: Enhancing Transparency in Human-AI Co-Writing Through Interactive Skeuomorphic Process Traces
arXiv '25 - -
PaperDebugger arXiv
PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing
arXiv '25 - GitHub
ScholarCopilot arXiv
ScholarCopilot: Training LLMs for Academic Writing with Integrated Citation
arXiv '25 - -
XtraGPT arXiv
XtraGPT: Context-Aware and Controllable Academic Paper Revision
arXiv '25 - -
LimAgents arXiv
Multi-Agent LLMs for Generating Research Limitations
arXiv '26 - -

Fully Automated Paper Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
CycleResearcher arXiv
CycleResearcher: Improving Automated Research via Automated Review
ICLR '25 - -
Agent Laboratory Website
Agent Laboratory: Using LLM Agents as Research Assistants
EMNLP '25 - -
FutureGen arXiv
FutureGen: A RAG-based Approach to Generate the Future Work of Scientific Article
arXiv '25 - -
AI Scientist arXiv
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
Nature '26 - GitHub
APRES arXiv
APRES: An Agentic Paper Revision and Evaluation System
arXiv '26 - -

Societal Analysis

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AI Writing Adoption Website
AI Writing Adoption
Nature '26 - -
Nature AI Survey Website
More than Half of Researchers Now Use AI for Peer Review
Nature '26 - -

Writing Quality and AI Detection Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Mapping LLM Use arXiv
Mapping the Increasing Use of LLMs in Scientific Papers
arXiv '24 - -
CycleReviewer arXiv
CycleResearcher: Improving Automated Research via Automated Review
ICLR '25 - -
Stanford Agentic Website
Stanford Agentic
Web '25 - -
SciIG arXiv
Let's Use ChatGPT To Write Our Paper! Benchmarking LLMs To Write the Introduction of a Research Paper
arXiv '25 - -
Watermarking arXiv
Detecting LLM-Generated Peer Reviews
arXiv '25 - -
PaperWritingBench arXiv
PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing
arXiv '26 - -

6. Peer Review

Automated Review Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ChatReviewer Website
ChatReviewer: ChatGPT-based Paper Reviewing and Response Generation
GitHub '23 - GitHub
AI-Peer-Review Website
AI-Peer-Review
GitHub '24 - GitHub
MARG arXiv
MARG: Multi-Agent Review Generation for Scientific Papers
arXiv '24 - -
Reviewer2 arXiv
Reviewer2: Optimizing Review Generation Through Prompt Generation
arXiv '24 - -
ReviewRL Website
ReviewRL: Towards Automated Scientific Review with RL
EMNLP '25 - -
DeepReviewer arXiv
DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process
arXiv '25 - -
OpenReviewer Website
OpenReviewer: A Specialized Large Language Model for Generating Critical Scientific Paper Reviews
NAACL '25 - -
REMOR arXiv
REMOR: Automated Peer Review Generation with LLM Reasoning and Multi-Objective Reinforcement Learning
arXiv '25 - -
ScholarPeer arXiv
ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review
arXiv '26 - -

Meta-Review & Reviewer Matching

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AgentReview Website
AgentReview: Exploring Peer Review Dynamics with LLM Agents
EMNLP '24 - -
Meta-Review LLMs Website
Meta-Review LLMs
NAACL '25 - -
RATE arXiv
RATE: Reviewer Profiling and Annotation-free Training for Expertise Ranking in Peer Review Systems
arXiv '26 - -

Adversarial Attacks & Bias Analysis

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Raina et al. arXiv
Raina et al.
EMNLP '24 - -
AI Review Lottery arXiv
The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates
arXiv '24 - -
Ye et al. arXiv
Ye et al.
arXiv '24 - -
Breaking the Reviewer arXiv
Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial Attacks
arXiv '25 - -
LLM Reviewer Bias arXiv
LLM Reviewer Bias
arXiv '25 - -
Prompt Injection arXiv
Prompt Injection Attacks on LLM Generated Reviews of Scientific Publications
arXiv '25 - -
Sahoo et al. arXiv
Sahoo et al.
arXiv '25 - -
Zhou et al. arXiv
Zhou et al.
arXiv '25 - -

Detection & Policy

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AI Detection arXiv
Is Your Paper Being Reviewed by an LLM? Benchmarking AI Text Detection in Peer Review
arXiv '25 - -
AI Use Rejects Website
Major Conference Catches Illicit AI Use - and Rejects Hundreds of Papers
Nature '26 - -
Nature AI Survey Website
More than Half of Researchers Now Use AI for Peer Review
Nature '26 - -
Policy Enforcement arXiv
Policy Enforcement
arXiv '26 - -
Reviewer Feedback Website
What Happens When Reviewers Receive AI Feedback in Their Reviews?
CHI '26 - -

Review Consistency and Bias Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Review Survey Website
More than Half of Researchers Now Use AI for Peer Review - often Against Guidance
IF '25 - -
Stanford Agentic Website
Stanford Agentic
Web '25 - -
ClaimCheck Website
ClaimCheck: How Grounded are LLM Critiques of Scientific Papers?
EMNLP '25 - -
ReViewGraph arXiv
Automatic Paper Reviewing with Heterogeneous Graph Reasoning over LLM-Simulated Reviewer-Author Debates
AAAI '26 - -
ReviewAgents arXiv
ReviewAgents: Bridging the Gap Between Human and AI-Generated Paper Reviews
arXiv '25 - -
ICLR 2025 Study Website
ICLR 2025 Study
NMI '26 - -

7. Rebuttal

Reviewer Comment Analysis

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ReviewMT arXiv
Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions
arXiv '24 - -
ICLR Rebuttal Study arXiv
ICLR Rebuttal Study
arXiv '25 - -

Automated Rebuttal Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ReviewerToo arXiv
ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review
arXiv '25 - -
RebuttalAgent arXiv
RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind
ICLR '26 - GitHub
Author-in-the-Loop arXiv
Author-in-the-Loop Response Generation and Evaluation: Integrating Author Expertise and Intent in Responses to Peer Review
ACL '26 - -
DRPG arXiv
DRPG: An Agentic Framework for Academic Rebuttal
arXiv '26 - GitHub
Paper2Rebuttal arXiv
Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance
arXiv '26 - -

Rebuttal Effectiveness Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Re$^2$ arXiv
Re$^2$
arXiv '25 - -
Commitment Checklist arXiv
Commitment Checklist: Auditing Author Commitments in Peer Review
arXiv '26 - -
Re$^3$Align arXiv
Re$^3$Align
ACL '26 - -

8. Dissemination (Paper2X)

Paper2Poster

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
P2P Website
P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark
ICLR '26 - -
Paper2Poster Website
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
NeurIPS '25 - GitHub
PosterForest arXiv
PosterForest: Hierarchical Multi-Agent Collaboration for Scientific Poster Generation
arXiv '25 - -
PosterGen arXiv
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
arXiv '25 - -
APEX arXiv
APEX: Academic Poster Editing Agentic Expert
arXiv '26 - GitHub
PosterOmni arXiv
PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
arXiv '26 - -

Paper2Slides

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
DOC2PPT Website
DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents
AAAI '22 - -
PPTAgent arXiv
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
EMNLP '25 - GitHub
AutoPresent arXiv
AutoPresent: Designing Structured Visuals from Scratch
CVPR '25 - -
Paper2Slides Website
Paper2Slides: From Paper to Presentation in One Click
GitHub '25 - GitHub
Auto-Slides arXiv
Auto-Slides: An Interactive Multi-Agent System for Creating and Customizing Research Presentations
arXiv '25 - -
PASS arXiv
PASS: Presentation Automation for Slide Generation and Speech
arXiv '25 - -
SlideGen arXiv
SlideGen: Collaborative Multimodal Agents for Scientific Slide Generation
arXiv '25 - -
Talk to Your Slides arXiv
Talk to Your Slides: Efficient Slide Editing Agent
arXiv '25 - -
SlideTailor arXiv
SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
AAAI '26 - GitHub
DeepPresenter arXiv
DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation
arXiv '26 - GitHub
Office Raccoon Website
Office Raccoon
Web '26 - -

Paper2Video

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Preacher Website
Preacher: Paper-to-Video Agentic System
ICCV '25 - GitHub
Paper2Video arXiv
Paper2Video: Automatic Video Generation from Scientific Papers
arXiv '25 - GitHub
PresentAgent Website
PresentAgent: Multimodal Agent for Presentation Video Generation
EMNLP '25 - GitHub

Paper2Web & Social Media

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Paper2Web arXiv
Paper2Web: Let's Make Your Paper Alive!
arXiv '25 - GitHub

Fidelity and Adoption Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
PPTEval arXiv
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
EMNLP '25 - GitHub
PresentQuiz arXiv
Paper2Video: Automatic Video Generation from Scientific Papers
arXiv '25 - GitHub
PresentEval Website
PresentAgent: Multimodal Agent for Presentation Video Generation
EMNLP '25 - GitHub

9. End-to-End Systems

Fully Automated Research Systems

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ResearchTown arXiv
ResearchTown: Simulator of Human Research Community
ICML 2025 Website GitHub
Agent Laboratory arXiv
Agent Laboratory: Using LLM Agents as Research Assistants
arXiv 2025 - -
AgentRxiv arXiv
AgentRxiv: Towards Collaborative Autonomous Research
arXiv 2025 - -
ARIS - GitHub 2025 - GitHub
freephdlabor arXiv
Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation
arXiv 2025 - -
SciMaster arXiv
SciMaster: Towards General-Purpose Scientific AI Agents
arXiv 2025 - GitHub
- arXiv
Towards End-to-End Automation of AI Research
Nature 2026 Website GitHub
Idea2Story arXiv
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
arXiv 2026 - -
UniScientist - Web 2026 - -
ASI-Evolve - GitHub 2026 - GitHub
FARS - Web 2026 - -
AutoResearchClaw - GitHub 2026 - GitHub
CORAL arXiv
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery
arXiv 2026 - GitHub
AutoSOTA arXiv
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
arXiv 2026 - GitHub
AiScientist-LH arXiv
Toward Autonomous Long-Horizon Engineering for ML Research
arXiv 2026 - -
OpenResearcher (2026) arXiv
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis
arXiv 2026 - GitHub
Aletheia arXiv
Towards Autonomous Mathematics Research
arXiv 2026 - GitHub

Domain-Specific Systems

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AlphaFold 3 arXiv
Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3
Nature 2024 Website -
Medical AI Scientist arXiv
Towards a Medical AI Scientist
arXiv 2026 - -

Evolutionary & Self-Improving Systems

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ShinkaEvolve arXiv
ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution
arXiv 2025 - GitHub
Darwin Godel Machine arXiv
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
arXiv 2025 - GitHub

Research Platforms & Infrastructure

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Towards an AI co-scientist arXiv
Towards an AI co-scientist
arXiv 2025 - -
PiFlow arXiv
PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration
arXiv 2025 - -
LabClaw - Web 2026 - -
- arXiv
OpenAI Is Throwing Everything into Building a Fully Automated Researcher
MIT TR 2026 Website -

10. Societal & Critical Perspectives

⏲️ In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
- arXiv
Navigating the Jagged Technological Frontier
Org. Sci. 2025 Website -
- arXiv
Reassessing Academic Integrity in the Age of AI
SSH Open 2025 Website -
The AI Deskilling Paradox arXiv
The AI Deskilling Paradox
CACM 2025 Website -
Hidden Pitfalls of AI Scientist Systems arXiv
The More You Automate, the Less You See: Hidden Pitfalls of AI Scientist Systems
arXiv 2025 - -
Rethinking Science in the Age of AI arXiv
Rethinking Science in the Age of Artificial Intelligence
arXiv 2025 - -
- arXiv
Measuring AI Ability to Complete Long Tasks
METR 2025 Website -
- arXiv
Towards a Science of Scaling Agent Systems
arXiv 2025 - -
- arXiv
Artificial Intelligence Tools Expand Scientists' Impact but Contract Science's Focus
Nature 2026 Website -
- arXiv
AI for Scientific Discovery is a Social Problem
Patterns 2026 Website -
Research Integrity in the Age of AI arXiv
Research Integrity and Academic Authority in the Age of Artificial Intelligence: From Discovery to Curation?
arXiv 2026 - -
SciSciGPT arXiv
SciSciGPT: Advancing Human-AI Collaboration in the Science of Science
Nature CS 2026 Website -
SimStep arXiv
SimStep: Chain-of-Abstractions for Incremental Specification and Debugging of AI-Generated Interactive Simulations
arXiv 2025 - -
ConvoLearn arXiv
ConvoLearn: A Learning Sciences Grounded Dataset for Fine-Tuning Dialogic AI Tutors
arXiv 2026 - -
AFIM: Academic Fraud Inclination Metric arXiv
AFIM: Academic Fraud Inclination Metric
Web 2026 Website -
- arXiv
AI Researchers' Views on Automating AI R&D and Intelligence Explosions
arXiv 2026 - -
- arXiv
AI Scientists Are Changing Research
Nature 2026 Website -
Learning by Creating (Talk) arXiv
Learning by Creating: A Human-Centered Vision for AI in Education
Talk 2026 Website -

11. Surveys & Curated Lists

⏲️ In chronological order, from the earliest to the latest.

| Model | Paper | Venue | Website | GitHub |
|---|---|---|---|---|
| LLM4SR | arXiv: LLM4SR: A Survey on Large Language Models for Scientific Research | arXiv 2025 | - | - |
| From Automation to Autonomy | arXiv: From Automation to Autonomy: A Survey on Large Language Models for Scientific Discovery | arXiv 2025 | - | - |
| AI4Research | arXiv: AI4Research: A Survey of Artificial Intelligence for Scientific Research | arXiv 2025 | - | - |
| A Survey of AI Scientists | arXiv: A Survey of AI Scientists | arXiv 2025 | - | - |
| - | arXiv: Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey | arXiv 2025 | - | - |
| - | arXiv: Large Language Models for Automated Scholarly Paper Review: A Survey | Inf. Fusion 2025 | Website | - |

12. Tools & GitHub Repos

Open-source tools, frameworks, and curated resource lists for AI-assisted research (not directly tied to a single paper).

Curated Lists

| Repository | Stars | Description |
|---|---|---|
| Awesome-Deep-Research | GitHub | Up-to-date collection of agentic deep research resources |
| Awesome-Scientific-Language-Models | GitHub | Survey of scientific LLMs (EMNLP'24) |
| Awesome-LLM-Scientific-Discovery | GitHub | Three-level autonomy framework (EMNLP'25) |
| Awesome-AI-Scientist-Papers | GitHub | Resources on AI Scientist systems |
| Awesome-Auto-Research-Tools | GitHub | Automated research tools catalog |
| awesome-autoresearch | GitHub | Autonomous improvement loops and research agents |
| awesome-ai-research-writing | GitHub | Prompt templates and agent skills for AI-assisted writing |

Idea Generation

| Repository | Stars | Description |
|---|---|---|
| Virtual-Scientists | GitHub | VirSci: multi-agent collaborative idea generation (ACL'25) |
| ResearchAgent | GitHub | Iterative idea proposal with reviewing agents |

Literature Review

| Repository | Stars | Description |
|---|---|---|
| paper-qa | GitHub | PaperQA2: superhuman RAG for scientific Q&A |
| local-deep-research | GitHub | Fully local deep research |
| researchgpt | GitHub | Conversational interaction with research papers |
| gpt-researcher | GitHub | Autonomous agent for comprehensive online research |
| AutoSurvey | GitHub | Automated comprehensive literature surveys |
| storm | GitHub | Wikipedia-style article generation (STORM) |

Coding & Experiments

| Repository | Stars | Description |
|---|---|---|
| autoresearch (Karpathy) | GitHub | Autonomous ML experiments, ~12 exp/hour overnight |
| Paper2Code | GitHub | Multi-agent ML paper to code transformation |
| RD-Agent | GitHub | Microsoft's LLM framework for autonomous data science |
| MLAgentBench | GitHub | 13 end-to-end ML experimentation tasks |
| SWE-bench | GitHub | Real-world GitHub issue resolution benchmark |
| Thoth | GitHub | Dashboard-first Claude Code and Codex runtime for durable autoresearch runs, work-item locks, ledgers, and reviewable verdicts |

Peer Review

| Repository | Stars | Description |
|---|---|---|
| paper-reviewer | GitHub | arXiv paper reviews + blog posts |
| ai-peer-review | GitHub | Multi-LLM reviews + meta-review synthesis |
| openreviewer | GitHub | Llama-8B fine-tuned on 79K expert reviews |

⬆ Back to Top

Last updated: 2026-05-18 · Maintained by WorldBench
