A New Benchmark for Spatial Reasoning

The International Mathematical Olympiad (IMO) represents the pinnacle of competition for talented high school students from around the world. Competitors tackle highly complex mathematical problems. In this context, geometry problems—which require rigorous formal logic and advanced spatial reasoning—have long been considered a critical benchmark in artificial intelligence (AI) research.

A team of researchers based in China has just achieved a significant milestone by developing an AI system capable not only of solving but also of generating these Olympiad-level problems. Named TongGeometry, this new system performs on par with the best human Olympiad competitors in a field that requires deep creativity.

The system’s achievements are detailed in a new study by the research team, published in the scientific journal Nature Machine Intelligence. This breakthrough marks a turning point, as it demonstrates that a machine can now rival the human mind in areas where intuition and visual construction are paramount.

The Delicate Art of Problem Formulation

While some AI systems were already capable of solving Olympiad-level geometry problems, formulating new problems requires a different kind of mathematical mastery, coupled with an aesthetic sensibility that is difficult to replicate artificially. Previous systems, such as AlphaGeometry, focused solely on problem-solving and required significant computational resources.

The study’s authors highlight the subtlety of this exercise: “The most admired problems exhibit a deceptive simplicity: they are accessible through fundamental knowledge but require deep creativity for complete solutions. Mathematical elegance, particularly symmetry in various forms, serves as a critical quality criterion in prestigious competitions.”

The visual and constructive nature of geometry poses major challenges for AI. According to the researchers, fundamental limitations arise in computational approaches due to “the combinatorial explosion of reasoning paths and the scarcity of exemplary problems for heuristic development.”

An Innovative Neuro-Symbolic Architecture

TongGeometry is a neuro-symbolic system that uses guided tree search within a Markovian framework to model geometric reasoning, an architecture designed to tame the combinatorial explosion of reasoning paths that the researchers identify as a fundamental limitation. To build it, the team fine-tuned two large language models: one suggests search directions, while the other evaluates reasoning steps.
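To make the idea of a search steered by a proposer and an evaluator more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the integer "states", the three arithmetic "actions", and the functions `propose`, `evaluate`, and `guided_search` are all hypothetical stand-ins chosen for illustration. In TongGeometry, according to the study, both roles are played by fine-tuned language models operating on geometric constructions.

```python
import heapq
from typing import Callable, List, Optional, Tuple

State = int
Action = Tuple[str, Callable[[int], int]]

# Toy stand-in for geometric construction steps: arithmetic moves on an integer.
ACTIONS: List[Action] = [
    ("+1", lambda x: x + 1),
    ("*2", lambda x: x * 2),
    ("-3", lambda x: x - 3),
]


def propose(state: State, target: State, k: int = 2) -> List[Action]:
    """Hypothetical 'proposer': keep the k actions whose result is closest to the target."""
    return sorted(ACTIONS, key=lambda a: abs(a[1](state) - target))[:k]


def evaluate(state: State, target: State) -> float:
    """Hypothetical 'evaluator': lower scores mark more promising states."""
    return abs(state - target)


def guided_search(start: State, target: State,
                  max_expansions: int = 1000) -> Optional[List[str]]:
    """Best-first search: always expand the state the evaluator ranks best,
    and only follow the actions the proposer suggests."""
    frontier = [(evaluate(start, target), start, [])]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            return None  # nothing left to expand
        _, state, path = heapq.heappop(frontier)
        if state == target:
            return path
        for name, fn in propose(state, target):
            nxt = fn(state)
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (evaluate(nxt, target), nxt, path + [name]))
    return None  # expansion budget exhausted
```

Calling `guided_search(3, 11)` returns a list of action names that, applied in order to 3, reaches 11. The proposer prunes the branching factor and the evaluator orders the frontier, which is the general way such guidance keeps a search tree from growing out of control; the path found is valid but not guaranteed to be the shortest.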

Using 196 Olympiad problems from previous competitions as guiding statistics, the system generated a vast repository of 6.7 billion geometry problems. Of these, 4.1 billion (about 61%) exhibited mathematical symmetry, a key quality criterion. The quality of this output is such that three of these problems were selected for major math Olympiads in China and the United States.

This capacity for massive generation, coupled with qualitative filtering, sets TongGeometry apart from its predecessors. The system does not merely apply rules; it explores the mathematical space to create novel solutions that are relevant and usable in real competitive contexts.

Performance and Execution Speed

The researchers tested TongGeometry’s problem-solving capabilities using a dataset designed for AlphaGeometry (IMO-AG-30) as well as a new dataset (MO-TG-225). The IMO-AG-30 dataset comprised 30 problems drawn from 23 years of IMO competitions, while the MO-TG-225 dataset contained 225 well-known theorems, such as Euler’s line theorem. On IMO-AG-30, TongGeometry solved all 30 problems.

On this specific dataset, the system outperformed the average score of IMO gold medalists. Even more impressively, it accomplished this task in just 38 minutes, using off-the-shelf computing resources. This hardware efficiency contrasts with the often colossal requirements of supercomputers used in AI research.

These benchmarks confirm the system’s robustness when dealing with historical problems and classical theorems. The speed of solution—less than an hour for a corpus that would take a human days to solve—illustrates the optimization of the search algorithms employed by the Chinese team.

Technical Comparison with AlphaGeometry

The study provides a detailed comparison with the previous system. The authors explain: “TongGeometry’s DD backend demonstrated improved problem-solving capabilities compared to AlphaGeometry’s DD+AR, achieving performance levels close to those of AlphaGeometry overall. We noted that AlphaGeometry’s success stemmed largely from its backend engine, with 72.5% of all solutions obtained by DD+AR.”

The difference lies in the use of neural networks. The researchers specify: “In contrast, TongGeometry not only solved a higher proportion of problems (81.3% versus 45.3%) but also leveraged its neural models more effectively to tackle auxiliary construction challenges, with only 55.2% of problems solved by DD alone.”

These figures indicate that TongGeometry relies less on the brute force of its deduction engine alone and manages to integrate artificial intelligence more subtly to navigate the complex steps of geometric construction.

A Tool for Education and Research

Although TongGeometry does not cover all possible geometry problems, such as those requiring algebraic or combinatorial reasoning, its architecture could be extended to other areas of mathematics. The system has already demonstrated its practical utility in educational settings, where experienced IMO coaches review and refine the problems before using them with their students.

The authors describe this process as follows: “This curated collection is then presented to students, serving a dual purpose: it provides a rich source of training material that helps students master complex topics and competition-specific techniques, while simultaneously acting as a powerful creative aid for coaches, helping them devise interesting and challenging problems for their teams.”

In conclusion, the researchers note TongGeometry’s potential to advance computational geometry and mathematics education, paving the way for increased collaboration between artificial intelligence and high-level pedagogy.

Source: phys.org

Created by humans, assisted by AI.

TongGeometry: The Artificial Intelligence That Challenges Gold Medalists in Geometry
