[ad_1] <br>We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research. <br>[ad_2] <br><a href="https://openai.com/index/paperbench">Source link </a>