No AI can beat the new benchmark

January 24, 2025 19:30

The nonprofit Center for AI Safety (CAIS) and Scale AI, a data labeling and AI development company, have released a challenging new test for advanced AI systems.

The test, titled Humanity’s Last Exam, includes thousands of crowdsourced questions in math, the humanities, and science. To make the assessment more difficult, the questions come in a variety of formats, including some that incorporate charts and images.

In a preliminary study, none of the publicly available flagship AI systems scored more than 10% on Humanity’s Last Exam.

CAIS and Scale AI say they plan to open this benchmark to the research community so that researchers can “explore variations more deeply” and evaluate new AI models.
