AI and algorithmic breakthrough in matrix multiplication

I strongly believe that in the case of any econometric student this is the motto of our student life. Why? Well, because any day of our mundane lives revolves around learning the steps of solving a problem, ranging from basic Lagrangian optimizations to the more complex Markov processes type of problems everybody is freaked out about. It may seem like learning the steps means you master the entire process, but little we know because there are a quota of predicaments that could impede our way towards achieving the right solution.

I think that I can speak for everybody when I say that matrices are our biggest enemy in this Bachelor’s Degree. At least, just count the number of times you made a small mistake and your entire solution fell apart. I can count hundreds of crumpled papers in my lifetime just due to a matrix multiplication that gave me headaches. At those times, I was just asking myself why am I supposed to compute them in such a “savage” manner ? Obviously, I knew the reason, but I was still mad at the matrices themselves, so I just used to turn to any online tool that could do the job for me. However, the computation time as I was increasing the matrix size was getting bigger and bigger, which was quite a frustration because if not even the most developed applications can make those computations way faster than I do, then what is the drill for computing them?

Needless to say that I was not the only one that could see the glitches in the system, the reason why after years of development we can be now proud of a new system implemented by AI that computes everything way faster. But the story behind it is not that simple, so let’s go back in time for a second to the first attempt of optimising this process.

Back in ‘69 there was this German mathematician called Volker Strassen who stunned the mathematical world. He found an innovative way of computing two by two matrices. We all know that the standard way means multiplying row by column element wise, but this guy found that the steps involved are too many, so he came up with a new approach that involves only 7 steps instead of the standard 8 steps. Maybe, it seems quite trivial now, but at that point in time when the mathematical world was still emerging it meant a huge step towards what followed later on. Also, in order to grasp better why this is a way more efficient approach, you should visualise everything from the perspective of a computer, namely in any programming language subtraction and addition are way more time efficient than multiplication.

Back in the day, this discovery represented an incentive for one of the researcher scientists at Google DeepMind to work further in order to find an answer to the question: "Is there any way to develop this algorithm even further? ”. Hence, AlphaTensor comes into play.

DeepMind is the company behind AlphaZero, a famous AI application designed in order to defeat chess grandmasters and legends in the game of Go.AlphaTensor is an AI system based on AlphaZero, wherein the tensor-decomposition problem is converted into a single-player game .AlphaTensor is trained to find efficient ways to win the game. As any machine learning tool, AlphaTensor had no previous knowledge about matrix multiplication algorithms, but playing the game over and over again and learning from the outcomes, it slowly improved. Up to this day ,AlphaTensor discovered thousands of matrix-multiplication algorithms, one of them defeating even the efficiency of Strassen’s 50-year-old algorithm for multiplying 4 by 4 matrices.

In order to grasp the metamorphosis of these tensors, I will briefly describe how this whole process of decomposing tensors actually works. The game commence with a matrix multiplication tensor 𝓣n (green box) which is converted in the red box by rotating the axes of the reference frame attached to it. In every step of the game, the agent uses a tree-based search algorithm led by the neural network to pick the following move, corresponding to constructing blocks u, v and w. The state of the game alters after each step (lighter red boxes). The target is to “unlock” the zero tensor in the smallest number of steps. Hence, repeatedly doing this, the AlphaTensor actually learns from previous implementations, improving the process of multiplying matrices bit by bit, leading towards a so-called “algorithmic breakthrough”.

To conclude, I assume that in a few years most of the classic mathematical algorithms that nowadays are quite cumbersome to process will be improved by means of machine learning. You know, just thinking about a time when all the classic algorithms will be a hundred times faster seems like light-years away, but that time is actually closer than you think. But until that moment comes, all we can do is just hope that our all-time struggles with matrix multiplication will soothe day by day.