Nature, Published online: 01 April 2026; doi:10.1038/s41586-026-10303-2

A fully automated methodology, based on rubrics that capture a broad range of cognitive and intellectual demands, is illustrated using LLMs and tasks, demonstrating a new way to evaluate the capabilities of AI systems and anticipate their performance.

