PaperBench: Evaluating AI’s Ability to Replicate AI Research

by Maxim Makedonsky
1 month ago

We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research.

Categories: SEO Tools
