Google DeepMind’s AI systems can now solve complex math problems
“It is often easier to train a model for mathematics if you have a way to check its answers (e.g., in a formal language), but there is comparatively less formal mathematics data online compared to free-form natural language (informal language),” says Katie Collins, an researcher at the University of Cambridge who specializes in math and AI but was not involved in the project.
Bridging this gap was Google DeepMind’s goal in creating AlphaProof, a reinforcement-learning-based system that trains itself to prove mathematical statements in the formal programming language Lean. The key is a version of DeepMind’s Gemini AI that’s fine-tuned to automatically translate math problems phrased in natural, informal language into formal statements, which are easier for the AI to process. This created a large library of formal math problems with varying degrees of difficulty.
Automating the process of translating data into formal language is a big step forward for the math community, says Wenda Li, a lecturer in hybrid AI at the University of Edinburgh, who peer-reviewed the research but was not involved in the project.
“We can have much greater confidence in the correctness of published results if they are able to formulate this proving system, and it can also become more collaborative,” he adds.
The Gemini model works alongside AlphaZero—the reinforcement-learning model that Google DeepMind trained to master games such as Go and chess—to prove or disprove millions of mathematical problems. The more problems it has successfully solved, the better AlphaProof has become at tackling problems of increasing complexity.
Although AlphaProof was trained to tackle problems across a wide range of mathematical topics, AlphaGeometry 2—an improved version of a system that Google DeepMind announced in January—was optimized to tackle problems relating to movements of objects and equations involving angles, ratios, and distances. Because it was trained on significantly more synthetic data than its predecessor, it was able to take on much more challenging geometry questions.