AI Models Achieve Human-Level Abstract Reasoning Accuracy of 61.9%

Researchers are testing new ways to improve AI's abstract reasoning. Rather than asking a model only for the next item in a sequence, they flip the model's inputs and hold some of the data out. They then compare the predictions produced by these different combinations to decide what comes next.

For example, take a sequence like 2, 4, 6. Usually, you'd expect 8 to follow. But what if the model sees only 4 and 6? It can predict that 2 comes before them, then check that the number midway between 2 and 6 is 4, which agrees with the original sequence. The method combines search algorithms with these predictions. The models then use hierarchical voting to pick the answer on which the different views most consistently agree.
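To make the 2, 4, 6 example concrete, here is a minimal Python sketch of the idea. It is not the researchers' code: the toy "model" is a simple arithmetic rule standing in for a real AI model, and the function names and the three view groups (`forward`, `flipped`, `held_out`) are hypothetical choices made for illustration. It assumes a numeric sequence of at least three items.

```python
from collections import Counter


def predict_next(seq):
    """Toy stand-in for a model: extrapolate using the last difference."""
    return seq[-1] + (seq[-1] - seq[-2])


def predict_previous(seq):
    """Flip the input and extrapolate, i.e. guess what came before seq."""
    return predict_next(seq[::-1])


def predict_middle(left, right):
    """Fill in a held-out item as the midpoint of its two neighbours."""
    return (left + right) / 2


def majority(values):
    """Return the most common value (ties go to the value seen first)."""
    return Counter(values).most_common(1)[0][0]


def candidate_answers(seq):
    """Collect next-item guesses from several views of the same sequence."""
    groups = {"forward": [], "flipped": [], "held_out": []}

    # Forward views: every suffix that still has at least two items.
    for start in range(len(seq) - 1):
        groups["forward"].append(predict_next(seq[start:]))

    # Flipped view: hide the first item, recover it from the rest,
    # then extrapolate from the rebuilt sequence.
    recovered = predict_previous(seq[1:])
    groups["flipped"].append(predict_next([recovered] + seq[1:]))

    # Held-out views: hide each interior item, fill it in from its
    # neighbours, then extrapolate from the rebuilt sequence.
    for i in range(1, len(seq) - 1):
        filled = seq[:i] + [predict_middle(seq[i - 1], seq[i + 1])] + seq[i + 1:]
        groups["held_out"].append(predict_next(filled))

    return groups


def hierarchical_vote(groups):
    """Vote inside each group first, then vote across the group winners."""
    winners = [majority(guesses) for guesses in groups.values() if guesses]
    return majority(winners)


answer = hierarchical_vote(candidate_answers([2, 4, 6]))
print(answer)
```

For 2, 4, 6 every view agrees on 8, so the vote is unanimous. With a real model the views can disagree, and the two-level vote is what favours the answer that stays consistent across them.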


The outcome is surprising. The researchers achieved a public validation accuracy of 61.9%. This score matches the average human score in abstract reasoning tests. This achievement is significant because it hints at AI models reaching human-level performance in specific tasks. Some experts say that this brings us closer to the development of Artificial General Intelligence (AGI).

However, not everyone agrees on AGI's definition. OpenAI, for instance, describes AGI as highly autonomous systems that outperform humans at most economically valuable work. This study's results do not amount to AGI, but they demonstrate progress in AI reasoning. The methods used here can also be applied to different models to improve their accuracy.

The o1 paradigm provides insights into how AI might achieve AGI. This involves understanding how AI models reason with hidden data points during inference. When given more time to think, AI models score higher on benchmarks. This study showed a sixfold improvement using a model with just 8 billion parameters.

The research reveals that AI systems can enhance their reasoning abilities. It suggests that continued exploration and refinement of these methods could lead to smarter and more capable AI systems. While these findings do not yet confirm AGI, they offer a glimpse into the potential advancements in AI technology.