Nature, Published online: 01 April 2026; doi:10.1038/s41586-026-10303-2
A fully automated methodology based on rubrics capturing a broad range of cognitive and intellectual demands is illustrated using LLMs and tasks, demonstrating a new way to evaluate the capabilities of AI systems and anticipate their performance.
From Nature via this RSS feed

