Discovering novel algorithms with AlphaTensor

Estimated read time: 7 min


The first extension of AlphaZero to mathematics unlocks new possibilities for research

Algorithms have helped mathematicians perform basic operations for thousands of years. The ancient Egyptians devised an algorithm for multiplying two numbers without the need for a multiplication table, and the Greek mathematician Euclid described an algorithm for calculating the greatest common divisor, which is still in use today.

During the Islamic Golden Age, the Persian mathematician Muhammad ibn Musa al-Khwarizmi designed new algorithms for solving linear and quadratic equations. In fact, al-Khwarizmi's name, rendered into Latin as Algoritmi, gave us the term algorithm. But despite our familiarity with algorithms today, used throughout society from classroom algebra to cutting-edge scientific research, the process of discovering new ones is incredibly difficult, and an example of the remarkable reasoning capabilities of the human mind.

In our paper, published today in the journal Nature, we introduce AlphaTensor, the first artificial intelligence (AI) system for discovering novel, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication. This sheds light on a 50-year-old open question in mathematics about finding the fastest way to multiply two matrices.

This paper is a stepping stone in DeepMind's mission to advance science and unlock the most fundamental problems using AI. Our system, AlphaTensor, builds on AlphaZero, an agent that has shown breakthrough performance on board games such as chess, Go, and shogi, and this work shows AlphaZero's journey from playing games to tackling unsolved mathematical problems for the first time.

Matrix multiplication

Matrix multiplication is one of the simplest operations in algebra, and it’s usually taught in high school math classes. But outside of the classroom, this humble arithmetic operation has a huge impact in the contemporary digital world and is ubiquitous in modern computing.

Example of multiplying two 3×3 matrices.
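To make the cost of this operation concrete, here is a minimal sketch in Python of the standard schoolbook algorithm, with a counter added purely for illustration: multiplying two n×n matrices this way takes n³ scalar multiplications.

```python
import numpy as np

def schoolbook_matmul(A, B):
    """Standard matrix multiplication, counting scalar multiplications."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    C = np.zeros((n, p), dtype=A.dtype)
    mults = 0
    for i in range(n):
        for k in range(p):
            for j in range(m):
                C[i, k] += A[i, j] * B[j, k]  # one scalar multiplication
                mults += 1
    return C, mults

A = np.arange(9).reshape(3, 3)
B = np.arange(9).reshape(3, 3)
C, mults = schoolbook_matmul(A, B)
assert np.array_equal(C, A @ B)
print(mults)  # 27 = 3^3 scalar multiplications for two 3x3 matrices
```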

This process is used to process images on smartphones, recognize speech commands, create graphics for computer games, run simulations to forecast the weather, compress data and videos for sharing on the Internet, and much more. Companies around the world spend a great deal of time and money developing computing hardware for efficient matrix multiplication. Therefore, even small improvements in the efficiency of matrix multiplication can have a widespread impact.

For centuries, mathematicians believed that the standard matrix multiplication algorithm was the most efficient one achievable. But in 1969, the German mathematician Volker Strassen shocked the mathematical community by proving that better algorithms exist.

Standard algorithm compared to Strassen’s algorithm, which uses one fewer scalar multiplication (7 instead of 8) to multiply 2×2 matrices. Multiplications matter far more than additions for overall efficiency.

By studying very small matrices (size 2×2), he discovered an ingenious way to combine the entries of the matrices to produce a faster algorithm. Despite decades of research after Strassen’s breakthrough, larger versions of this problem have remained unsolved – so much so that it is not known how efficiently two matrices as small as 3×3 can be multiplied.
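Strassen's 2×2 construction is short enough to write out in full. The sketch below is a direct transcription of his published formulas, using the conventional names m1 through m7 for the seven products; the final assert is a quick sanity check.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen, 1969)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # Seven products instead of the standard eight.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Recombine the products using additions and subtractions only.
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return ((c11, c12), (c21, c22))

assert strassen_2x2(((1, 2), (3, 4)), ((5, 6), (7, 8))) == ((19, 22), (43, 50))
```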

In our paper, we explored how modern AI techniques could advance the automatic discovery of new matrix multiplication algorithms. Building on human intuition, AlphaTensor discovered algorithms that are more efficient than the state of the art for many matrix sizes. Our AI-designed algorithms outperform human-designed ones, which is a major step forward in the field of algorithmic discovery.

The process and progress of automating algorithmic discovery

First, we converted the problem of finding efficient algorithms for matrix multiplication into a single-player game. In this game, the board is a three-dimensional tensor (an array of numbers) that captures how far from correct the current algorithm is. Through a set of allowed moves, corresponding to algorithm instructions, the player attempts to modify the tensor and zero out its entries. When the player manages to do so, this results in a provably correct matrix multiplication algorithm for any pair of matrices, and its efficiency is captured by the number of steps taken to zero out the tensor.
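As a rough illustration of this setup (a simplified sketch, not DeepMind's actual implementation), the board for multiplying n×n matrices is the matrix multiplication tensor, and each move subtracts a rank-one tensor built from three coefficient vectors; the game is won when every entry reaches zero.

```python
import numpy as np

def matmul_tensor(n):
    """Tensor T with T[(i,j), (j,k), (i,k)] = 1, encoding C[i,k] += A[i,j] * B[j,k]."""
    T = np.zeros((n * n, n * n, n * n), dtype=int)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[i * n + j, j * n + k, i * n + k] = 1
    return T

def play_move(T, u, v, w):
    """One game move: subtract the rank-one tensor u (x) v (x) w."""
    return T - np.einsum("a,b,c->abc", u, v, w)

T = matmul_tensor(2)
# First move of Strassen's decomposition: m1 = (a11 + a22)(b11 + b22),
# which contributes to c11 and c22. Entries ordered a11, a12, a21, a22
# (and likewise for B and C).
u = np.array([1, 0, 0, 1])
v = np.array([1, 0, 0, 1])
w = np.array([1, 0, 0, 1])
T = play_move(T, u, v, w)
# After the six remaining Strassen moves, T becomes the all-zeros tensor,
# proving a 7-multiplication algorithm for 2x2 matrices.
```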

This game is incredibly challenging: the number of possible algorithms to consider is much greater than the number of atoms in the universe, even for small cases of matrix multiplication. Compared to the game of Go, which challenged AI for decades, the number of possible moves at each step of our game is 30 orders of magnitude larger (above 10^33 for one of the settings we consider).

Essentially, to play this game well, one needs to identify the tiniest of needles in a gigantic haystack of possibilities. To tackle the challenges of this domain, which differs significantly from traditional games, we developed multiple crucial components, including a novel neural network architecture that incorporates problem-specific inductive biases, a procedure for generating useful synthetic data, and a recipe for leveraging the symmetries of the problem.

We then trained an AlphaTensor agent using reinforcement learning to play the game, starting without any knowledge of existing matrix multiplication algorithms. Through learning, AlphaTensor gradually improved over time, rediscovering historical fast matrix multiplication algorithms such as Strassen's, and eventually surpassing the realm of human intuition to discover algorithms faster than previously known.

The single-player game played by AlphaTensor, where the objective is to find a correct matrix multiplication algorithm. The game state is a cubic array of numbers (shown as gray for 0, blue for 1, and green for -1), representing the remaining work to be done.
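A highly simplified sketch of that self-play loop is below, reusing matmul_tensor and play_move from the earlier sketch; agent.select_move and agent.learn are hypothetical placeholders, since the real system uses a neural-network-guided tree search rather than anything this simple.

```python
def self_play_training(agent, n_games, max_moves):
    """Toy self-play loop: shorter winning games earn higher reward."""
    for _ in range(n_games):
        T = matmul_tensor(2)                 # start from the full target tensor
        trajectory = []
        for _ in range(max_moves):
            u, v, w = agent.select_move(T)   # hypothetical policy call
            T = play_move(T, u, v, w)
            trajectory.append((T.copy(), (u, v, w)))
            if not T.any():                  # tensor zeroed: a correct algorithm
                break
        # Each move is one scalar multiplication, so fewer moves is better.
        agent.learn(trajectory, reward=-len(trajectory))  # hypothetical update
```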

For example, if the traditional algorithm taught in school multiplies a 4×5 matrix by a 5×5 matrix using 100 multiplications, and this number was reduced to 80 with human ingenuity, AlphaTensor has found algorithms that do the same operation using just 76 multiplications.
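The schoolbook count here is just the product of the three dimensions: multiplying an m×n matrix by an n×p matrix takes m·n·p scalar multiplications, since each of the m·p output entries requires n products.

```python
m, n, p = 4, 5, 5
schoolbook = m * n * p   # 100 scalar multiplications for the standard algorithm
previous_best = 80       # best previously known human-designed algorithm
alphatensor_best = 76    # algorithm discovered by AlphaTensor
```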

Algorithm discovered by AlphaTensor using 76 multiplications, an improvement over state-of-the-art algorithms.

Beyond this example, AlphaTensor’s algorithm improves on Strassen’s two-level algorithm in a finite field for the first time since its discovery 50 years ago. These algorithms for multiplying small matrices can be used as primitives to multiply much larger matrices of arbitrary size.
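To show how a small-matrix algorithm becomes a primitive for larger ones, here is a minimal sketch of the standard recursive trick applied to Strassen's construction: split each matrix into 2×2 blocks and apply the seven products at the block level, recursing on each block product. It assumes square matrices whose size is a power of two.

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Recursive Strassen multiplication for 2^k x 2^k matrices."""
    n = A.shape[0]
    if n <= cutoff:                      # fall back to the standard algorithm
        return A @ B
    h = n // 2
    a11, a12, a21, a22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    b11, b12, b21, b22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Strassen's seven products, each a recursive half-size multiplication.
    m1 = strassen(a11 + a22, b11 + b22, cutoff)
    m2 = strassen(a21 + a22, b11, cutoff)
    m3 = strassen(a11, b12 - b22, cutoff)
    m4 = strassen(a22, b21 - b11, cutoff)
    m5 = strassen(a11 + a12, b22, cutoff)
    m6 = strassen(a21 - a11, b11 + b12, cutoff)
    m7 = strassen(a12 - a22, b21 + b22, cutoff)
    C = np.empty_like(A)
    C[:h, :h] = m1 + m4 - m5 + m7
    C[:h, h:] = m3 + m5
    C[h:, :h] = m2 + m4
    C[h:, h:] = m1 - m2 + m3 + m6
    return C

A = np.random.rand(128, 128)
B = np.random.rand(128, 128)
assert np.allclose(strassen(A, B), A @ B)
```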

Moreover, AlphaTensor discovers a diverse set of algorithms with state-of-the-art complexity, up to thousands of matrix multiplication algorithms for each size, showing that the space of matrix multiplication algorithms is richer than previously thought.

Algorithms in this rich space have different mathematical and practical properties. Leveraging this diversity, we adapted AlphaTensor to specifically find algorithms that are fast on given hardware, such as an Nvidia V100 GPU and a Google TPU v2. These algorithms multiply large matrices 10-20% faster than the commonly used algorithms on the same hardware, demonstrating AlphaTensor’s flexibility in optimizing arbitrary objectives.

AlphaTensor with an objective corresponding to the algorithm’s runtime. When a correct matrix multiplication algorithm is discovered, it is benchmarked on the target hardware, and this feedback is fed back to AlphaTensor so it can learn algorithms that run more efficiently on that hardware.
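Conceptually, the reward for this objective is just the negated measured runtime. The harness below is a hypothetical illustration of that feedback signal, not DeepMind's actual pipeline; the benchmark function and its arguments are invented for this sketch.

```python
import time
import numpy as np

def benchmark(candidate_matmul, n=4096, repeats=10):
    """Median wall-clock runtime of a candidate matmul on the current device."""
    A = np.random.rand(n, n).astype(np.float32)
    B = np.random.rand(n, n).astype(np.float32)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        candidate_matmul(A, B)
        times.append(time.perf_counter() - start)
    return sorted(times)[len(times) // 2]

# The agent's reward signal: faster on the target hardware is better.
reward = -benchmark(np.matmul)
```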

Exploring the impact on future research and applications

From a mathematical standpoint, our results can guide further research in complexity theory, which aims to determine the fastest algorithms for solving computational problems. By exploring the space of possible algorithms more efficiently than previous approaches, AlphaTensor helps advance our understanding of the richness of matrix multiplication algorithms. Understanding this space may unlock new results for helping determine the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science.

Because matrix multiplication is a core component of many computational tasks, spanning computer graphics, digital communications, neural network training, and scientific computing, AlphaTensor-discovered algorithms could make computations in these fields significantly more efficient. AlphaTensor’s flexibility to consider any kind of objective could also spur new applications for designing algorithms that optimize metrics such as energy use and numerical stability, helping prevent small rounding errors from snowballing as an algorithm works.

While we focused here on the particular problem of matrix multiplication, we hope that our paper will inspire others to use artificial intelligence to guide algorithmic discovery for other fundamental computational tasks. Our research also shows that AlphaZero is a powerful algorithm that can be extended well beyond the domain of traditional games to help solve open problems in mathematics. Building on our research, we hope to spur on a greater body of work, applying AI to help society solve some of the most important challenges in mathematics and across the sciences.

You can find more information in AlphaTensor’s GitHub repository.
