No AI can beat the new benchmark


The nonprofit Center for AI Safety (CAIS) and Scale AI, a data labeling and AI development company, have released a challenging new benchmark for advanced AI systems.

The test, titled Humanity’s Last Exam, includes thousands of crowdsourced questions in math, the humanities, and science. To make the assessment harder, the questions come in a variety of formats, including some that incorporate charts and images.

In a preliminary study, none of the publicly available flagship AI systems managed to score more than 10% on Humanity’s Last Exam.

CAIS and Scale AI say they plan to open this benchmark to the research community so that researchers can “explore variations more deeply” and evaluate new AI models.
