Algorithms have helped mathematicians perform fundamental operations for thousands of years. The ancient Egyptians devised an algorithm for multiplying two numbers without requiring a multiplication table, and the Greek mathematician Euclid described an algorithm for computing the greatest common divisor, which is still in use today.
During the Islamic Golden Age, the Persian mathematician Muhammad ibn Musa al-Khwarizmi designed new algorithms for solving linear and quadratic equations. In fact, al-Khwarizmi's name, rendered in Latin as Algoritmi, gave rise to the term algorithm. Yet despite how familiar algorithms are today – used throughout society, from classroom algebra to cutting-edge scientific research – the process of discovering new algorithms is incredibly difficult, and a striking example of the reasoning abilities of the human mind.
In our paper, published today in the journal Nature, we introduce AlphaTensor, the first artificial intelligence (AI) system for discovering novel, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication. This sheds light on a 50-year-old open question in mathematics: finding the fastest way to multiply two matrices.
This paper is a stepping stone in DeepMind's mission to advance science and unlock the most fundamental problems using AI. Our system, AlphaTensor, builds on AlphaZero, an agent that has shown superhuman performance on board games such as chess, Go, and shogi, and this work shows AlphaZero's journey from playing games to tackling unsolved mathematical problems for the first time.
Matrix multiplication
Matrix multiplication is one of the simplest operations in algebra, commonly taught in high school math classes. But outside the classroom, this humble operation has enormous influence in the contemporary digital world and is ubiquitous in modern computing.
It is used to process images on smartphones, recognize speech commands, generate graphics for computer games, run simulations to forecast the weather, compress data and videos for sharing on the internet, and much more. Companies around the world spend a great deal of time and money developing computing hardware to multiply matrices efficiently. So even small improvements in the efficiency of matrix multiplication can have a widespread impact.
For centuries, mathematicians believed that the standard matrix multiplication algorithm was the most efficient one achievable. But in 1969, the German mathematician Volker Strassen shocked the mathematical community by showing that better algorithms do exist.
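To make that baseline concrete, here is a minimal Python sketch (an illustration, not code from our paper) of the standard schoolbook algorithm, with a counter showing that multiplying two n×n matrices this way takes n³ scalar multiplications:

```python
def schoolbook_matmul(A, B):
    """Standard matrix multiplication: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1  # one scalar multiplication per inner step
    return C, mults

C, mults = schoolbook_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C)      # [[19, 22], [43, 50]]
print(mults)  # 8 scalar multiplications for the 2x2 case (n^3 = 8)
```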
By studying very small matrices (size 2×2), he discovered an ingenious way to combine the entries of the matrices to produce a faster algorithm. Despite decades of research after Strassen’s breakthrough, larger versions of this problem have remained unsolved – so much so that it is not known how efficiently two matrices as small as 3×3 can be multiplied.
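Strassen's construction is compact enough to write out in full. The sketch below (plain Python, same toy matrices as above) computes a 2×2 product with seven scalar multiplications instead of the schoolbook eight, at the cost of extra additions and subtractions:

```python
def strassen_2x2(A, B):
    """Strassen's algorithm: multiply two 2x2 matrices with 7 multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # The seven products - the only multiplications performed.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # The entries of C are recovered using only additions and subtractions.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Because the multiplications dominate the cost when the entries are themselves matrix blocks, saving one of eight multiplications compounds under recursion, which is what makes Strassen's algorithm asymptotically faster.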
In our paper, we explored how modern AI techniques could advance the automatic discovery of new matrix multiplication algorithms. Building on human intuition, AlphaTensor discovered algorithms that are more efficient than the state of the art for many matrix sizes. Our AI-designed algorithms outperform human-designed ones, which is a major step forward in the field of algorithmic discovery.
The process and progress of automating algorithmic discovery
First, we converted the problem of finding efficient algorithms for matrix multiplication into a single-player game. In this game, the board is a three-dimensional tensor (an array of numbers), capturing how far from correct the current algorithm is. Through a set of allowed moves, corresponding to algorithm instructions, the player attempts to modify the tensor and zero out its entries. When the player manages to do so, the result is a provably correct matrix multiplication algorithm for any pair of matrices, and its efficiency is measured by the number of steps taken to zero out the tensor.
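To make this correspondence concrete, here is a NumPy sketch (our own illustration, with a row-major flattening convention chosen for readability) of the 2×2 matrix multiplication tensor, together with a check that Strassen's seven products, written as seven rank-one "moves", zero it out exactly:

```python
import numpy as np

# The 2x2 matrix multiplication tensor T: with A, B, C flattened row-major,
# T[a, b, c] = 1 exactly when entry a of A times entry b of B contributes
# to entry c of C = A @ B.
n = 2
T = np.zeros((n * n, n * n, n * n), dtype=int)
for i in range(n):
    for j in range(n):
        for k in range(n):
            T[i * n + k, k * n + j, i * n + j] = 1

# Strassen's algorithm as 7 rank-one factors (u, v, w): product r multiplies
# a u-combination of A's entries by a v-combination of B's entries, and w
# scatters the result into C.
U = np.array([[1,0,0,1],[0,0,1,1],[1,0,0,0],[0,0,0,1],[1,1,0,0],[-1,0,1,0],[0,1,0,-1]])
V = np.array([[1,0,0,1],[1,0,0,0],[0,1,0,-1],[-1,0,1,0],[0,0,0,1],[1,1,0,0],[0,0,1,1]])
W = np.array([[1,0,0,1],[0,0,1,-1],[0,1,0,1],[1,0,1,0],[-1,1,0,0],[0,0,0,1],[1,0,0,0]])

# Each move subtracts one rank-one tensor; reaching zero certifies a correct
# algorithm whose multiplication count equals the number of moves (7 here).
residual = T.copy()
for u, v, w in zip(U, V, W):
    residual -= np.einsum("a,b,c->abc", u, v, w)
print(np.count_nonzero(residual))  # 0 -> a provably correct 7-multiplication algorithm
```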
This game is incredibly challenging – the number of possible algorithms to consider is much greater than the number of atoms in the universe, even for small cases of matrix multiplication. Compared to the game of Go, which remained a challenge for AI for decades, the number of possible moves at each step of our game is 30 orders of magnitude larger (more than 10³³ for one of the settings we consider).
Essentially, to play this game well, one needs to identify the tiniest of needles in a gigantic haystack of possibilities. To tackle the challenges of this domain, which differ markedly from those of traditional games, we developed several crucial components: a novel neural network architecture that incorporates problem-specific inductive biases, a procedure for generating useful synthetic data, and a recipe for exploiting the symmetries of the problem; the synthetic-data idea is sketched below.
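The synthetic-data component rests on a convenient asymmetry: decomposing a given tensor is hard, but building a tensor from a known decomposition is trivial. The NumPy sketch below (a simplified illustration, not the exact sampling scheme from our paper) generates random tensors paired with a decomposition that solves them, which can serve as supervised training examples:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_example(dim=4, rank=7, entries=(-1, 0, 1)):
    """Sample random rank-one factors, then return the tensor they build.
    By construction, (U, V, W) is a known rank-`rank` decomposition of T."""
    U = rng.choice(entries, size=(rank, dim))
    V = rng.choice(entries, size=(rank, dim))
    W = rng.choice(entries, size=(rank, dim))
    T = np.einsum("ra,rb,rc->abc", U, V, W)
    return T, (U, V, W)

T, (U, V, W) = synthetic_example()
print(T.shape)  # (4, 4, 4): a tensor whose rank-7 decomposition is known
```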
We then trained an AlphaTensor agent using reinforcement learning to play the game, starting without any knowledge of existing matrix multiplication algorithms. Through learning, AlphaTensor gradually improved over time, rediscovering historical fast matrix multiplication algorithms such as Strassen's, and eventually surpassing the realm of human intuition by discovering algorithms faster than previously known.
For example, where the traditional algorithm taught in school multiplies a 4×5 matrix by a 5×5 matrix using 100 multiplications, and human ingenuity had reduced that number to 80, AlphaTensor found algorithms that perform the same operation using just 76 multiplications.
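The baseline in that example is just the schoolbook count: multiplying an m×n matrix by an n×p matrix takes m·n·p scalar multiplications, one per inner-loop step:

```python
m, n, p = 4, 5, 5
print(m * n * p)  # 100: the standard count for a 4x5 by 5x5 product
# Best previously known human-designed algorithm: 80; AlphaTensor: 76.
```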
Beyond this example, AlphaTensor's algorithm improves on Strassen's two-level algorithm in a finite field for the first time since its discovery 50 years ago. These algorithms for multiplying small matrices can be used as primitives for multiplying much larger matrices of arbitrary size.
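To see how a fixed small-matrix algorithm scales up, note that it can be applied recursively: a large matrix is split into blocks, and each block product is computed by the same rule. Here is a minimal NumPy sketch of this divide-and-conquer scheme (our illustration, using Strassen's 2×2 rule on matrices whose side is a power of two):

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Multiply square matrices (side a power of two) by recursively
    applying Strassen's 2x2 rule to half-size blocks."""
    n = A.shape[0]
    if n <= cutoff:                      # fall back to ordinary multiplication
        return A @ B
    h = n // 2
    a11, a12, a21, a22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    b11, b12, b21, b22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    m1 = strassen(a11 + a22, b11 + b22, cutoff)
    m2 = strassen(a21 + a22, b11, cutoff)
    m3 = strassen(a11, b12 - b22, cutoff)
    m4 = strassen(a22, b21 - b11, cutoff)
    m5 = strassen(a11 + a12, b22, cutoff)
    m6 = strassen(a21 - a11, b11 + b12, cutoff)
    m7 = strassen(a12 - a22, b21 + b22, cutoff)
    return np.block([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

rng = np.random.default_rng(0)
A, B = rng.standard_normal((256, 256)), rng.standard_normal((256, 256))
print(np.allclose(strassen(A, B), A @ B))  # True
```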
Moreover, AlphaTensor also discovered a diverse set of algorithms with state-of-the-art complexity – up to thousands of matrix multiplication algorithms for each size – showing that the space of matrix multiplication algorithms is richer than previously thought.
Algorithms in this rich space have different mathematical and practical properties. Leveraging this diversity, we adapted AlphaTensor to specifically find algorithms that are fast on a given piece of hardware, such as the Nvidia V100 GPU and Google TPU v2. These algorithms multiply large matrices 10-20% faster than the commonly used algorithms on the same hardware, demonstrating AlphaTensor's flexibility in optimizing arbitrary objectives.
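The selection principle behind such hardware tailoring is simple to sketch: time each candidate algorithm on the target device and keep the fastest. In the hypothetical Python harness below, the two candidates are generic stand-ins so the example runs anywhere; in our work the comparison was between discovered algorithms compiled for the specific device:

```python
import time
import numpy as np

def benchmark(fn, A, B, repeats=10):
    """Average wall-clock time of one matmul implementation on this device."""
    fn(A, B)  # warm-up run (caches, JIT, lazy initialization)
    start = time.perf_counter()
    for _ in range(repeats):
        fn(A, B)
    return (time.perf_counter() - start) / repeats

rng = np.random.default_rng(0)
A, B = rng.standard_normal((1024, 1024)), rng.standard_normal((1024, 1024))

# Real candidates would be different discovered algorithms; these two
# stand-ins simply make the harness runnable.
candidates = {"numpy_matmul": np.matmul,
              "einsum": lambda A, B: np.einsum("ik,kj->ij", A, B)}
best = min(candidates, key=lambda name: benchmark(candidates[name], A, B))
print(best)  # whichever implementation is fastest on this machine
```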
Exploring the impact on future research and applications
From a mathematical standpoint, our results can guide further research in complexity theory, which aims to determine the fastest algorithms for solving computational problems. By exploring the space of possible algorithms more efficiently than previous approaches, AlphaTensor helps advance our understanding of the richness of matrix multiplication algorithms. Understanding this space may unlock new results that help determine the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science.
Because matrix multiplication is a core component of many computational tasks, spanning computer graphics, digital communications, neural network training, and scientific computing, the algorithms discovered by AlphaTensor could make computations in these fields significantly more efficient. AlphaTensor's flexibility to consider any kind of objective could also spur new applications for designing algorithms that optimize metrics such as energy usage and numerical stability, helping prevent small rounding errors from snowballing as the algorithm works.
While we focused here on the particular problem of matrix multiplication, we hope our paper will inspire others to use AI to guide algorithmic discovery for other fundamental computational tasks. Our research also shows that AlphaZero is a powerful algorithm that can be extended well beyond the domain of traditional games to help solve open problems in mathematics. Building on our research, we hope to spur a greater body of work – applying AI to help society solve some of the most important challenges in mathematics and across the sciences.
You can find more information in AlphaTensor’s GitHub repository.