Discovering novel algorithms with AlphaTensor

Estimated read time: 7 min

 

Algorithms have helped mathematicians perform fundamental operations for thousands of years. The ancient Egyptians devised an algorithm for multiplying two numbers without needing a multiplication table, and the Greek mathematician Euclid described an algorithm for computing the greatest common divisor, which is still in use today.


During the Islamic Golden Age, the Persian mathematician Muhammad ibn Musa al-Khwarizmi designed new algorithms for solving linear and quadratic equations. In fact, al-Khwarizmi's name, rendered into Latin as Algoritmi, gave rise to the term algorithm. Yet despite how familiar algorithms are today – used throughout society, from classroom algebra to cutting-edge scientific research – the process of discovering new algorithms is incredibly difficult, and an example of the remarkable reasoning abilities of the human mind.

In our paper, published today in the journal Nature, we introduce AlphaTensor, the first artificial intelligence (AI) system for discovering novel, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication. This sheds light on a 50-year-old open question in mathematics about finding the fastest way to multiply two matrices.

This paper is a stepping stone in DeepMind's mission to advance science and unlock the most fundamental problems using AI. Our system, AlphaTensor, builds on AlphaZero, an agent that has shown superhuman performance on board games such as chess, Go, and shogi, and this work shows AlphaZero's journey from playing games to tackling unsolved mathematical problems for the first time.

Matrix multiplication

Matrix multiplication is one of the simplest operations in algebra, commonly taught in high school maths classes. But outside the classroom, this humble operation has enormous influence in the contemporary digital world and is ubiquitous in modern computing.

Example of multiplying two 3×3 matrices.
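As a reference point for what follows, here is a minimal sketch of that standard schoolbook procedure in plain Python; for two n×n matrices it performs n³ scalar multiplications (27 for the 3×3 case illustrated above).

```python
def matmul_standard(A, B):
    """Schoolbook matrix multiplication: C[i][j] = sum_k A[i][k] * B[k][j].

    For two n x n matrices this uses n**3 scalar multiplications
    (27 in the 3x3 case illustrated above)."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C
```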

Matrix multiplication is used for processing images on smartphones, recognising speech commands, generating graphics for computer games, running simulations to forecast the weather, compressing data and videos for sharing on the internet, and much more. Companies around the world spend a great deal of time and money developing computing hardware to multiply matrices efficiently. So, even minor improvements to the efficiency of matrix multiplication can have a widespread impact.

For centuries, mathematicians believed that the standard matrix multiplication algorithm was the most efficient one achievable. But in 1969, the German mathematician Volker Strassen shocked the mathematical community by showing that better algorithms do exist.

Standard algorithm compared to Strassen’s algorithm, which uses one fewer scalar multiplication (7 instead of 8) to multiply 2×2 matrices. Multiplications matter much more than additions for overall efficiency.

By studying very small matrices (size 2×2), he discovered an ingenious way to combine the entries of the matrices to produce a faster algorithm. Despite decades of research after Strassen’s breakthrough, larger versions of this problem have remained unsolved – so much so that it is not known how efficiently two matrices as small as 3×3 can be multiplied.
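Strassen’s 2×2 construction is compact enough to write out in full. The sketch below (plain Python with scalar entries) computes the seven products m1–m7 and combines them into the result; it trades one multiplication for several extra additions, a trade that pays off because multiplications dominate the overall cost.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen, 1969)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]
```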

In our paper, we explore how modern AI techniques can advance the automatic discovery of new matrix multiplication algorithms. Building on human intuition, AlphaTensor discovered algorithms that are more efficient than the state of the art for many matrix sizes. Our AI-designed algorithms outperform human-designed ones, which is a major step forward in the field of algorithmic discovery.

The process and progress of automating algorithmic discovery

First, we converted the problem of finding efficient algorithms for matrix multiplication into a single-player game. In this game, the board is a three-dimensional tensor (an array of numbers), capturing how far from correct the current algorithm is. Through a set of allowed moves, corresponding to algorithm instructions, the player attempts to modify the tensor and zero out its entries. When the player manages to do so, the result is a provably correct matrix multiplication algorithm for any pair of matrices, and its efficiency is captured by the number of steps taken to zero out the tensor.
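To make this concrete, here is a minimal NumPy sketch (our illustration, not the paper’s implementation) of the 2×2 board and its moves. Entry T[i, j, k] of the tensor is 1 exactly when output entry k needs the product of input entries i and j; a move subtracts a rank-one tensor u⊗v⊗w, and Strassen’s seven factors, played as seven moves, clear the board completely.

```python
import numpy as np

def matmul_tensor(n):
    """The n x n matrix multiplication tensor: T[i, j, k] = 1 iff output
    entry k needs the product of input entries i and j (row-major indices)."""
    T = np.zeros((n * n, n * n, n * n), dtype=int)
    for r in range(n):
        for s in range(n):
            for t in range(n):
                T[r * n + t, t * n + s, r * n + s] = 1
    return T

def play_move(T, u, v, w):
    """One game move: subtract the rank-one tensor u (x) v (x) w."""
    return T - np.einsum('i,j,k->ijk', u, v, w)

# Strassen's factorization, one (u, v, w) triple per scalar product m1..m7.
# Rows act on the flattened inputs (a11, a12, a21, a22), (b11, ..., b22)
# and the flattened output (c11, ..., c22).
U = np.array([[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
              [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]])
V = np.array([[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
              [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]])
W = np.array([[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
              [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]])

T = matmul_tensor(2)              # the 4x4x4 board for 2x2 matrices
for u, v, w in zip(U, V, W):      # seven moves, one per multiplication
    T = play_move(T, u, v, w)
assert not T.any()                # board cleared: the algorithm is correct
```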

This game is incredibly challenging – the number of possible algorithms to consider is much greater than the number of atoms in the universe, even for small cases of matrix multiplication. Compared to the game of Go, which remained a challenge for AI for decades, the number of possible moves at each step of our game is 30 orders of magnitude larger (above 10³³ for one of the settings we consider).

Essentially, to play this game well, one needs to identify the tiniest of needles in a gigantic haystack of possibilities. To tackle the challenges of this domain, which differs significantly from traditional games, we developed multiple crucial components, including a novel neural network architecture that incorporates problem-specific inductive biases, a procedure for generating useful synthetic data, and a recipe for leveraging the symmetries of the problem.

We then trained an AlphaTensor agent using reinforcement learning to play the game, starting without any knowledge of existing matrix multiplication algorithms. Through learning, AlphaTensor gradually improves over time, rediscovering historical fast matrix multiplication algorithms such as Strassen’s, and eventually surpassing the realm of human intuition to discover algorithms faster than previously known.

A single-player game played by AlphaTensor, where the objective is to find a correct matrix multiplication algorithm. The game state is a cube of numbers (shown as grey for 0, blue for 1, and green for -1), representing the remaining work to be done.
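In reinforcement-learning terms, this is an ordinary episodic environment. The sketch below is our own minimal illustration (AlphaTensor couples such an environment with an AlphaZero-style network and tree search); it charges one point per move, so shorter factorizations score higher.

```python
import numpy as np

class TensorGame:
    """Single-player game: zero out the matmul tensor with rank-one moves.

    Uses matmul_tensor() from the sketch above. Illustrative only."""

    def __init__(self, n=2, max_moves=12):
        self.n, self.max_moves = n, max_moves

    def reset(self):
        self.T = matmul_tensor(self.n)
        self.moves = 0
        return self.T.copy()

    def step(self, u, v, w):
        """Apply one move (a rank-one update); reward is -1 per step."""
        self.T -= np.einsum('i,j,k->ijk', u, v, w)
        self.moves += 1
        solved = not self.T.any()
        done = solved or self.moves >= self.max_moves
        return self.T.copy(), -1.0, done, {'solved': solved}
```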

For example, if the traditional algorithm taught in school multiplies a 4×5 matrix by a 5×5 matrix using 100 multiplications, and human ingenuity reduced this number to 80, AlphaTensor has found algorithms that do the same operation using just 76 multiplications.

The algorithm discovered by AlphaTensor uses 76 multiplications, an improvement over state-of-the-art algorithms.

Beyond this example, AlphaTensor’s algorithm improves on Strassen’s two-level algorithm in a finite field for the first time since its discovery 50 years ago. These algorithms for multiplying small matrices can be used as primitives to multiply much larger matrices of arbitrary size.
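The lift from small to large is the standard block-recursive trick: split a big matrix into blocks and apply the small scheme to the blocks, so a base case with r multiplications yields r^k multiplications at size n^k. Strassen’s two-level algorithm thus spends 7² = 49 multiplications on 4×4 matrices; the paper reports a 47-multiplication scheme for 4×4 matrices over the field with two elements, which is the improvement mentioned above. A NumPy sketch of the recursion using Strassen’s 2×2 scheme, assuming power-of-two sizes:

```python
import numpy as np

def strassen(A, B):
    """Block-recursive Strassen multiplication for n x n matrices, n a power
    of two: 7**k multiplications at size 2**k, i.e. O(n**2.81) overall."""
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    m1 = strassen(A11 + A22, B11 + B22)
    m2 = strassen(A21 + A22, B11)
    m3 = strassen(A11, B12 - B22)
    m4 = strassen(A22, B21 - B11)
    m5 = strassen(A11 + A12, B22)
    m6 = strassen(A21 - A11, B11 + B12)
    m7 = strassen(A12 - A22, B21 + B22)
    return np.block([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

A = np.random.randint(-5, 5, (8, 8))
B = np.random.randint(-5, 5, (8, 8))
assert np.array_equal(strassen(A, B), A @ B)  # exact for integer entries
```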

Moreover, AlphaTensor also discovered a diverse set of algorithms with state-of-the-art complexity – up to thousands of matrix multiplication algorithms for each size – showing that the space of matrix multiplication algorithms is richer than previously thought.

Algorithms in this rich space have different mathematical and practical properties. Leveraging this diversity, we adapted AlphaTensor to specifically find algorithms that are fast on given hardware, such as the Nvidia V100 GPU and Google TPU v2. These algorithms multiply large matrices 10-20% faster than the commonly used algorithms on the same hardware, demonstrating AlphaTensor’s flexibility in optimising arbitrary objectives.

AlphaTensor with an objective corresponding to the algorithm’s runtime. Once a correct matrix multiplication algorithm is discovered, it is benchmarked on the target hardware, and the measurement is fed back to AlphaTensor, in order to learn more algorithms that are efficient on the target hardware.
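A minimal sketch of that feedback signal (our illustration; the function and its name are hypothetical, not from the paper): benchmark a provably correct candidate on the target device and use its negative runtime as the reward, so faster algorithms score higher.

```python
import time

def runtime_reward(candidate_matmul, A, B, repeats=100):
    """Hypothetical hardware-targeted reward: time a correct candidate
    algorithm on the target device and return the negative median runtime."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        candidate_matmul(A, B)
        timings.append(time.perf_counter() - start)
    timings.sort()
    return -timings[len(timings) // 2]
```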

Exploring the impact on future research and applications

From a mathematical standpoint, our results can guide further research in complexity theory, which aims to determine the fastest algorithms for solving computational problems. By exploring the space of possible algorithms more effectively than previous approaches, AlphaTensor helps advance our understanding of the richness of matrix multiplication algorithms. Understanding this space may unlock new results for helping determine the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science.

Because matrix multiplication is a core component of many computational tasks, spanning computer graphics, digital communications, neural network training, and scientific computing, the algorithms discovered by AlphaTensor could make computations in these fields significantly more efficient. AlphaTensor’s flexibility to consider any kind of objective could also spur new applications for designing algorithms that optimise metrics such as energy usage and numerical stability, helping prevent small rounding errors from snowballing as the algorithm works.

While we focused here on the particular problem of matrix multiplication, we hope that our paper will inspire others to use AI to guide algorithmic discovery for other fundamental computational tasks. Our research also shows that AlphaZero is a powerful algorithm that can be extended well beyond the domain of traditional games to help solve open problems in mathematics. Building on our research, we hope to spur a greater body of work – applying AI to help society solve some of the most important challenges in mathematics and across the sciences.

You can find more information in AlphaTensor’s GitHub repository.

