3988:, even in commutative field. On 4x4 matrices, AlphaTensor unexpectedly discovered a solution with 47 multiplication steps, an improvement over the 49 required with Strassen’s algorithm of 1969, albeit restricted to mod 2 arithmetic. Similarly, AlphaTensor solved 5x5 matrices with 96 rather than Strassen's 98 steps. Based on the surprising discovery that such improvements exist, other researchers were quickly able to find a similar independent 4x4 algorithm, and separately tweaked Deepmind's 96-step 5x5 algorithm down to 95 steps in mod 2 arithmetic and to 97 in normal arithmetic. Some algorithms were never discovered before, e.g. (4, 5, 5) got improved to 76 from 80 in normal and mod 2 arithmetics.
5298:
1664:
5079:
459:
2595:
1258:
3708:
are represented in the same basis. Karstadt and
Schwartz computed in different bases and traded 3 additions for less expensive basis transformations. They also proved that one cannot go below 12 additions per step using different bases. In subsequent work Beniamini et el. applied this base change trick to more general decompositions than 2x2-block matrices and improved the leading constant for their run times.
1239:
1659:{\displaystyle {\begin{pmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\\\end{pmatrix}}={\begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\\\end{pmatrix}}{\begin{pmatrix}B_{11}&B_{12}\\B_{21}&B_{22}\\\end{pmatrix}}={\begin{pmatrix}A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\\\end{pmatrix}}}
3700:
3432:
4227:
1006:
3707:
It is known that a
Strassen-like algorithm with a 2x2-block matrix step requires at least 7 block matrix multiplications. In 1976 Probert showed that such an algorithm requires at least 15 additions (including subtractions), however, a hidden assumption was that the blocks and the 2x2-block matrix
5103:
On modern architectures with hierarchical memory, the cost of loading and storing input matrix elements tends to dominate the cost of arithmetic. On a single machine this is the amount of data transferred between RAM and cache, while on a distributed memory multi-node machine it is the amount
1874:
A variant of this algorithm that works for matrices of arbitrary shapes and is faster in practice splits matrices in two instead of four submatrices, as follows. Splitting a matrix now means dividing it into two parts of equal size, or as close to equal sizes as possible in the case of odd
5284:
more memory than is needed to store the inputs. This algorithm can be combined with
Strassen to further reduce runtime. "2.5D" algorithms provide a continuous tradeoff between memory usage and communication bandwidth. On modern distributed computing environments such as
2428:
6106:
Fawzi, Alhussein; Balog, Matej; Huang, Aja; Hubert, Thomas; Romera-Paredes, Bernardino; Barekatain, Mohammadamin; Novikov, Alexander; R. Ruiz, Francisco J.; Schrittwieser, Julian; Swirszcz, Grzegorz; Silver, David; Hassabis, Demis; Kohli, Pushmeet (October 2022).
4232:
can be performed independently of each other, as can the four summations (although the algorithm needs to "join" the multiplications before doing the summations). Exploiting the full parallelism of the problem, one obtains an algorithm that can be expressed in
3984:(mod 2 arithmetic). The best "practical" (explicit low-rank decomposition of a matrix multiplication tensor) algorithm found ran in O(n). Finding low-rank decompositions of such tensors (and beyond) is NP-hard; optimal multiplication even for 3x3 matrices
2238:
3945:
that used a single-player game analogy to invent thousands of matrix multiplication algorithms, including some previously discovered by humans and some that were not. Operations were restricted to the non-commutative ground field (normal arithmetic) and
2088:
2454:
environment where cache sizes are effectively dynamic due to other processes taking up cache space. (The simple iterative algorithm is cache-oblivious as well, but much slower in practice if the matrix layout is not adapted to the algorithm.)
469:
The three loops in iterative matrix multiplication can be arbitrarily swapped with each other without an effect on correctness or asymptotic running time. However, the order can have a considerable impact on practical performance due to the
3164:
2989:
3544:
2578:
1234:{\displaystyle C={\begin{pmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\\\end{pmatrix}},\,A={\begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\\\end{pmatrix}},\,B={\begin{pmatrix}B_{11}&B_{12}\\B_{21}&B_{22}\\\end{pmatrix}},}
3276:
4019:
5256:
which arranges the processors in a 3D cube mesh, assigning every product of two input submatrices to a single processor. The result submatrices are then generated by performing a reduction over each row. This algorithm transmits
5332:-1 steps are needed. The performance improves further for repeated computations leading to 100% efficiency. The cross-wired mesh array may be seen as a special case of a non-planar (i.e. multilayered) processing structure.
5598:
2654:-matrices which require only 7 multiplications (instead of the usual 8), at the expense of several additional addition and subtraction operations. Applying this recursively gives an algorithm with a multiplicative cost of
5325:-2 steps although this is reduced to half this number for repeated computations. The standard array is inefficient because the data from the two matrices does not arrive simultaneously and it must be padded with zeroes.
571:
cache misses in the worst case. As of 2010, the speed of memories compared to that of processors is such that the cache misses, rather than the actual calculations, dominate the running time for sizable matrices.
6066:—the wreath product approach is unlikely to be applicable to the large matrix problems that arise in practice. the input matrices must be astronomically large for the difference in time to be apparent.
3538:
3270:
2286:
2723:
284:
1834:
5160:
is the size of fast memory. The naïve algorithm is then used over the block matrices, computing products of submatrices entirely in fast memory. This reduces communication bandwidth to
3981:
2751:
has its merits. A table that compares key aspects of the improved version based on recursive multiplication of 2x2-block matrices via 7 block matrix multiplications follows. As usual
2132:
3063:
2888:
3855:
1976:
3809:
1753:
5341:
3861:, Xu, Xu, and Zhou. This algorithm, like all other recent algorithms in this line of research, is a generalization of the Coppersmith–Winograd algorithm, which was given by
2636:
6747:
5346:
3985:
2589:
105:
3767:
5975:"Worst-case complexity bounds on algorithms for computing the canonical structure of finite abelian groups and the Hermite and Smith normal forms of an integer matrix"
3829:
3741:
5974:
1669:
which consists of eight multiplications of pairs of submatrices, followed by an addition step. The divide-and-conquer algorithm computes the smaller multiplications
3695:{\displaystyle 4\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-12n^{2}+1.5n^{2}\cdot \log _{2}\left({\frac {{\sqrt {2}}n}{\sqrt {M}}}\right)+5M}
3069:
2894:
3427:{\displaystyle 4\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-12n^{2}+3n^{2}\cdot \log _{2}\left({\frac {{\sqrt {2}}n}{\sqrt {M}}}\right)+5M}
2472:
4222:{\displaystyle {\begin{pmatrix}A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\\\end{pmatrix}}}
2789:
2769:
6868:
5917:
Beniamini, Gal; Cheng, Nathan; Holtz, Olga; Karstadt, Elaye; Schwartz, Oded (2020). "Sparsifying the
Operators of Fast Matrix Multiplication Algorithms".
6357:
Agarwal, R.C.; Balle, S. M.; Gustavson, F. G.; Joshi, M.; Palkar, P. (September 1995). "A three-dimensional approach to parallel matrix multiplication".
6030:
The
Coppersmith–Winograd algorithm is not practical, due to the very large hidden constant in the upper bound on the number of multiplications required.
6924:
6893:
985:
amounts to several orders of magnitude on modern machines, so that the actual calculations dominate the running time, rather than the cache misses.
5361:
2747:
Since
Strassen's algorithm is actually used in practical numerical software and computer algebra systems improving on the constants hidden in the
1855:
6888:
6740:
6418:
5752:
5641:
5420:
6317:
Irony, Dror; Toledo, Sivan; Tiskin, Alexander (September 2004). "Communication lower bounds for distributed-memory matrix multiplication".
5063:
on any real computer. The algorithm isn't practical due to the communication cost inherent in moving data to and from the temporary matrix
5217:
2D mesh, one submatrix of the result can be assigned to each processor, and the product can be computed with each processor transmitting
5297:
5047:
steps, meaning it takes that much time on an ideal machine with an infinite number of processors; therefore, it has a maximum possible
6914:
6847:
6733:
5572:
5517:
6295:
6878:
6194:
5127:
2737:
3452:
2423:{\displaystyle C={\begin{pmatrix}A_{1}&A_{2}\end{pmatrix}}{\begin{pmatrix}B_{1}\\B_{2}\end{pmatrix}}=A_{1}B_{1}+A_{2}B_{2}}
3184:
3869:
in 1990. The conceptual idea of these algorithms are similar to
Strassen's algorithm: a way is devised for multiplying two
2657:
6015:
3889:
is so large that these algorithms are only worthwhile for matrices that are too large to handle on present-day computers.
3858:
121:
206:
6837:
4010:
3712:
994:
5356:
479:
463:
6691:
Van Zee, Field G.; van de Geijn, Robert A. (2015). "BLIS: A Framework for
Rapidly Instantiating BLAS Functionality".
5491:
6791:
5740:
5562:
5272:
words per processor, which is asymptotically optimal. However, this requires replicating each input matrix element
1759:
54:. Many different algorithms have been designed for multiplying matrices on different types of hardware, including
6842:
2642:
Algorithms exist that provide better running times than the straightforward ones. The first to be discovered was
2447:
3951:
2233:{\displaystyle C=A{\begin{pmatrix}B_{1}&B_{2}\end{pmatrix}}={\begin{pmatrix}AB_{1}&AB_{2}\end{pmatrix}}}
6756:
5490:
Alman, Josh; Williams, Virginia
Vassilevska (2020), "A Refined Laser Method and Faster Matrix Multiplication",
6078:
42:
efficient. Applications of matrix multiplication in computational problems are found in many fields including
5628:. ASPLOS91: 4th Int'l Conference on Architecture Support for Programming Languages & Operating Systems.
5366:
3892:
3885:
multiplications, and this technique is applied recursively. However, the constant coefficient hidden by the
3009:
2834:
2083:{\displaystyle C={\begin{pmatrix}A_{1}\\A_{2}\end{pmatrix}}{B}={\begin{pmatrix}A_{1}B\\A_{2}B\end{pmatrix}}}
6919:
6661:
6577:
6519:
6478:
6366:
6326:
5989:
5702:
429:
66:
3834:
6801:
5024:
is a keyword that signal a computation may be run in parallel with the rest of the function call, while
3896:
1674:
471:
145:
59:
31:
5318:
5119:
2650:
in 1969 and often referred to as "fast matrix multiplication". It is based on a way of multiplying two
6652:
Goto, Kazushige; van de Geijn, Robert A. (2008). "Anatomy of high-performance matrix multiplication".
3772:
1711:
6796:
6452:
6120:
5550:
5082:
Block matrix multiplication. In the 2D algorithm, each processor is responsible for one submatrix of
5036:
51:
43:
17:
6666:
5994:
5707:
4234:
2605:
6775:
6582:
6524:
6371:
6331:
5744:
2726:
2450:: there is no tuning parameter required to get optimal cache performance, and it behaves well in a
425:
75:
62:
systems, where the computational work is spread over multiple processors (perhaps over a network).
47:
35:
5404:
6708:
6679:
6640:
6622:
6548:
6442:
6287:
6173:
5918:
5785:
5497:
5468:
5444:
4006:
2643:
133:
101:
55:
100:). Better asymptotic bounds on the time required to multiply matrices have been known since the
6395:
3746:
65:
Directly applying the mathematical definition of matrix multiplication gives an algorithm that
6414:
6223:
6154:
6136:
5939:
5748:
5734:
5637:
5568:
5416:
5078:
3159:{\displaystyle 5\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-15n^{2}+3M}
2984:{\displaystyle 6\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-18n^{2}+3M}
5130:
that partitions each input matrix into a block matrix whose elements are submatrices of size
3814:
3726:
1244:
which works for all square matrices whose dimensions are powers of two, i.e., the shapes are
478:
use of the algorithm; which order is best also depends on whether the matrices are stored in
6811:
6700:
6671:
6632:
6587:
6529:
6490:
6406:
6376:
6336:
6279:
6232:
6144:
6128:
5999:
5954:
5899:
5872:
5845:
5816:
5777:
5712:
5675:
5629:
5546:
5408:
5396:
2451:
458:
6613:(2009). "A class of parallel tiled linear algebra algorithms for multicore architectures".
6396:"Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms"
6011:
2573:{\displaystyle \Theta \left(m+n+p+{\frac {mn+np+mp}{b}}+{\frac {mnp}{b{\sqrt {M}}}}\right)}
6852:
6007:
5306:
5301:
Matrix multiplication completed in 2n-1 steps for two n×n matrices on a cross-wired mesh.
3866:
3862:
3716:
2647:
2594:
414:
109:
5397:
6456:
6268:
6172:
Kauers, Manuel; Moosbauer, Jakob (2022-12-02). "Flip Graphs for Matrix
Multiplication".
6124:
6770:
6149:
6108:
5730:
5558:
3942:
3886:
2774:
2754:
2748:
1929:
97:
5959:
6908:
6816:
6610:
6591:
6547:
Kak, S. (2014). "Efficiency of matrix multiplication on the cross-wired mesh array".
6533:
5821:
5804:
5789:
5666:
5662:
5392:
5351:
6276:
Proceedings of the thirteenth annual ACM symposium on Theory of computing - STOC '81
5842:
Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures
6712:
6683:
6291:
6059:
Even if someone manages to prove one of the conjectures—thereby demonstrating that
2741:
998:
5890:
Probert, Robert L. (1976). "On the additive complexity of matrix multiplication".
5237:
words, which is asymptotically optimal assuming that each node stores the minimum
6410:
6109:"Discovering faster matrix multiplication algorithms with reinforcement learning"
5837:
4013:. These are based on the fact that the eight recursive matrix multiplications in
6719:
6636:
6495:
6435:
5865:"Pebbling Game and Alternative Basis for High Performance Matrix Multiplication"
5412:
2443:
2442:
The cache miss rate of recursive matrix multiplication is the same as that of a
585:
6644:
6340:
6132:
5554:
5439:
Williams, Virginia Vassilevska; Xu, Yinzhan; Xu, Zixuan; Zhou, Renfei (2024),
4237:
289:
From this, a simple algorithm can be constructed which loops over the indices
6140:
1854:
to sum the four pairs of resulting matrices element-wise. Application of the
6832:
6675:
6251:
6043:
5849:
5286:
2729:
is reduced compared to the naïve algorithm, but it is faster in cases where
1670:
486:
475:
6725:
6158:
551:, every iteration of the inner loop (a simultaneous sweep through a row of
6283:
5633:
589:
version, where the matrix is implicitly divided into square tiles of size
108:) remains unknown. As of April 2024, the best announced bound on the
5693:
Miller, Webb (1975), "Computational complexity and numerical stability",
5328:
The result is even faster on a two-layered cross-wired mesh, where only 2
3938:
2458:
The number of cache misses incurred by this algorithm, on a machine with
2446:
iterative version, but unlike that algorithm, the recursive algorithm is
50:
and in seemingly unrelated problems such as counting the paths through a
6380:
6236:
5679:
6479:"A faster parallel algorithm for matrix multiplication on a mesh array"
6403:
Proceedings of the 17th International Conference on Parallel Processing
5876:
5781:
5048:
6883:
6873:
6510:
Kak, S (1988). "A two-layered mesh array for matrix multiplication".
5864:
2795:
Progress for Strassen-like recursive 2x2-block matrix multiplication
6704:
6003:
5903:
5716:
2602:
over time for the computational complexity of matrix multiplication
6178:
5923:
5502:
5473:
5449:
136:
because of the large constants and cannot be realized practically.
6627:
6553:
6447:
5077:
3947:
2740:. It is very useful for large matrices over exact domains such as
2593:
457:
5493:
32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2021)
6231:(Ph.D.). Massachusetts Institute of Technology. pp. 54–57.
5768:
Strassen, Volker (1969). "Gaussian Elimination is not Optimal".
132:
time, given by Alman and Williams. However, this algorithm is a
6729:
5624:
Lam, Monica S.; Rothberg, Edward E.; Wolf, Michael E. (1991).
5599:"6.172 Performance Engineering of Software Systems, Lecture 8"
5626:
The Cache Performance and Optimizations of Blocked Algorithms
5289:, specialized multiplication algorithms have been developed.
5180:, which is asymptotically optimal (for algorithms performing
1839:
accounting for the eight recursive calls on matrices of size
432:
is to assume that the inputs are all square matrices of size
6253:
A cellular computer to implement the Kalman Filter Algorithm
5028:
waits for all previously "forked" computations to complete.
5567:(3rd ed.). MIT Press and McGraw-Hill. pp. 75–79.
5104:
transferred between nodes; in either case it is called the
3715:
how well Strassen's algorithm can be improved in terms of
5441:
New Bounds for Matrix Multiplication: from Alpha to Omega
3533:{\displaystyle 5n^{\log _{2}7}-4n^{2}+1.5n^{2}\log _{2}n}
939:
In the idealized cache model, this algorithm incurs only
5305:
There are a variety of algorithms for multiplication on
6195:"AI Reveals New Possibilities in Matrix Multiplication"
6044:"Toward an Optimal Algorithm for Matrix Multiplication"
3265:{\displaystyle 5n^{\log _{2}7}-4n^{2}+3n^{2}\log _{2}n}
5086:. In the 3D algorithm, every pair of submatrices from
4028:
2339:
2301:
2191:
2150:
2039:
1991:
1465:
1398:
1334:
1267:
1169:
1095:
1021:
6436:"Dimension Independent Matrix Square Using MapReduce"
4022:
3954:
3837:
3817:
3775:
3769:
matrix over a field can be multiplied together using
3749:
3729:
3547:
3455:
3279:
3187:
3072:
3012:
2897:
2837:
2777:
2757:
2718:{\displaystyle O(n^{\log _{2}7})\approx O(n^{2.807})}
2660:
2608:
2475:
2289:
2135:
1979:
1762:
1714:
1261:
1009:
517:
cache lines), the above algorithm is sub-optimal for
209:
5518:"Matrix Multiplication Inches Closer to Mythic Goal"
5108:. The naïve algorithm using three nested loops uses
6861:
6825:
6784:
6763:
5940:"Matrix multiplication via arithmetic progressions"
5465:
Faster Matrix Multiplication via Asymmetric Hashing
5342:
Computational complexity of mathematical operations
575:The optimal variant of the iterative algorithm for
559:) incurs a cache miss when accessing an element of
279:{\displaystyle c_{ij}=\sum _{k=1}^{m}a_{ik}b_{kj}.}
124:, Xu, Xu, and Zhou. This improves on the bound of
5736:Numerical Recipes: The Art of Scientific Computing
4221:
3975:
3849:
3823:
3803:
3761:
3735:
3694:
3532:
3426:
3264:
3158:
3057:
2983:
2882:
2783:
2763:
2717:
2630:
2572:
2422:
2232:
2082:
1828:
1747:
1701:The complexity of this algorithm as a function of
1658:
1233:
278:
6609:Buttari, Alfredo; Langou, Julien; Kurzak, Jakub;
5347:Computational complexity of matrix multiplication
5099:Communication-avoiding and distributed algorithms
4002:
2590:Computational complexity of matrix multiplication
1856:master theorem for divide-and-conquer recurrences
993:An alternative to the iterative algorithm is the
106:computational complexity of matrix multiplication
104:in the 1960s, but the optimal time (that is, the
5317:on a standard two-dimensional mesh using the 2D
5094:that is multiplied is assigned to one processor.
5032:achieves its goal by pointer manipulation only.
2736:or so and appears in several libraries, such as
2725:. Strassen's algorithm is more complex, and the
6568:Kak, S (1988). "Multilayered array computing".
6079:"Discovering novel algorithms with AlphaTensor"
5597:Amarasinghe, Saman; Leiserson, Charles (2010).
6101:
6099:
997:for matrix multiplication. This relies on the
428:). A common simplification for the purpose of
6741:
6477:Bae, S.E.; Shinn, T.-W.; Takaoka, T. (2014).
6434:Bosagh Zadeh, Reza; Carlsson, Gunnar (2013).
5836:Karstadt, Elaye; Schwartz, Oded (July 2017).
5463:Duan, Ran; Wu, Hongxun; Zhou, Renfei (2022),
2744:, where numerical stability is not an issue.
8:
3811:field operations. The current best bound on
3743:, is the smallest real number for which any
1829:{\displaystyle T(n)=8T(n/2)+\Theta (n^{2}),}
450:, i.e., cubic in the size of the dimension.
5938:Coppersmith, Don; Winograd, Shmuel (1990),
5075:speedup, without using a temporary matrix.
4300:Otherwise, allocate space for a new matrix
305:, computing the above using a nested loop:
6748:
6734:
6726:
6352:
6350:
6269:"I/O complexity: The red-blue pebble game"
5321:, one can complete the multiplication in 3
3976:{\displaystyle \mathbb {Z} /2\mathbb {Z} }
1858:shows this recursion to have the solution
485:In particular, in the idealized case of a
6693:ACM Transactions on Mathematical Software
6665:
6654:ACM Transactions on Mathematical Software
6626:
6581:
6552:
6523:
6494:
6446:
6370:
6330:
6177:
6148:
5993:
5958:
5922:
5820:
5706:
5501:
5472:
5448:
4205:
4195:
4182:
4172:
4160:
4150:
4137:
4127:
4113:
4103:
4090:
4080:
4068:
4058:
4045:
4035:
4023:
4021:
3969:
3968:
3960:
3956:
3955:
3953:
3836:
3816:
3780:
3774:
3748:
3728:
3661:
3658:
3645:
3632:
3616:
3586:
3581:
3559:
3556:
3546:
3518:
3508:
3492:
3468:
3463:
3454:
3393:
3390:
3377:
3364:
3348:
3318:
3313:
3291:
3288:
3278:
3250:
3240:
3224:
3200:
3195:
3186:
3141:
3111:
3106:
3084:
3081:
3071:
3049:
3025:
3020:
3011:
2966:
2936:
2931:
2909:
2906:
2896:
2874:
2850:
2845:
2836:
2776:
2756:
2706:
2676:
2671:
2659:
2619:
2607:
2555:
2538:
2502:
2474:
2414:
2404:
2391:
2381:
2360:
2346:
2334:
2320:
2308:
2296:
2288:
2216:
2201:
2186:
2169:
2157:
2145:
2134:
2063:
2046:
2034:
2026:
2012:
1998:
1986:
1978:
1814:
1790:
1761:
1713:
1642:
1632:
1619:
1609:
1597:
1587:
1574:
1564:
1550:
1540:
1527:
1517:
1505:
1495:
1482:
1472:
1460:
1443:
1431:
1417:
1405:
1393:
1379:
1367:
1353:
1341:
1329:
1312:
1300:
1286:
1274:
1262:
1260:
1214:
1202:
1188:
1176:
1164:
1157:
1140:
1128:
1114:
1102:
1090:
1083:
1066:
1054:
1040:
1028:
1016:
1008:
264:
251:
241:
230:
214:
208:
6394:Solomonik, Edgar; Demmel, James (2011).
6312:
6310:
6308:
5838:"Matrix Multiplication, a Little Faster"
5657:
5655:
5653:
5434:
5432:
5387:
5385:
5383:
5381:
5296:
5067:, but a more practical variant achieves
2793:
112:of a matrix multiplication algorithm is
38:, much work has been invested in making
6879:Basic Linear Algebra Subprograms (BLAS)
6301:from the original on December 15, 2019.
6225:Cilk: Efficient Multithreaded Computing
6217:
6215:
5729:Press, William H.; Flannery, Brian P.;
5605:. Massachusetts Institute of Technology
5377:
4820:(or do a short loop, perhaps unrolled).
2771:gives the dimensions of the matrix and
1866:, the same as the iterative algorithm.
631:be a new matrix of the appropriate size
563:. This means that the algorithm incurs
325:be a new matrix of the appropriate size
6405:. Vol. Part II. pp. 90–109.
5541:
5539:
5537:
5485:
5483:
5252:elements. This can be improved by the
4737:(wait for parallel forks to complete).
3058:{\displaystyle 6n^{\log _{2}7}-5n^{2}}
2883:{\displaystyle 7n^{\log _{2}7}-6n^{2}}
5592:
5590:
5588:
5586:
5584:
2598:Improvement of estimates of exponent
18:Fast matrix multiplication algorithms
7:
6250:Cannon, Lynn Elliot (14 July 1969).
5869:SIAM Journal on Scientific Computing
5863:Schwartz, Oded; Vaknin, Noa (2023).
442:, in which case the running time is
34:is such a central operation in many
5809:Linear Algebra and Its Applications
5805:"On multiplication of 2×2 matrices"
5362:Sparse matrix–vector multiplication
5352:CYK algorithm § Valiant's algorithm
5278:times, and so requires a factor of
4297:(or multiply a small block matrix).
3992:Parallel and distributed algorithms
3850:{\displaystyle \omega <2.371552}
2462:lines of ideal cache, each of size
1932:version of the iterative algorithm.
480:row-major order, column-major order
146:definition of matrix multiplication
6256:(Ph.D.). Montana State University.
6193:Brubaker, Ben (23 November 2022).
2476:
1804:
1730:
25:
6267:Hong, J. W.; Kung, H. T. (1981).
5733:; Vetterling, William T. (2007).
5516:Hartnett, Kevin (23 March 2021).
5395:(2012). "Sorting and Searching".
6925:Matrix multiplication algorithms
6042:Robinson, Sara (November 2005),
5128:communication-avoiding algorithm
3804:{\displaystyle n^{\omega +o(1)}}
2806:#matrix multiplications per step
1928:is below some threshold, use an
1748:{\displaystyle T(1)=\Theta (1);}
525:stored in row-major order. When
40:matrix multiplication algorithms
5947:Journal of Symbolic Computation
5973:Iliopoulos, Costas S. (1989),
5844:. SPAA '17. pp. 101–110.
5191:In a distributed setting with
3796:
3790:
3721:matrix multiplication exponent
2712:
2699:
2690:
2664:
2631:{\displaystyle O(n^{\omega })}
2625:
2612:
1820:
1807:
1798:
1784:
1772:
1766:
1739:
1733:
1724:
1718:
27:Algorithm to multiply matrices
1:
5960:10.1016/S0747-7171(08)80013-2
4011:shared-memory multiprocessors
6592:10.1016/0020-0255(88)90010-2
6534:10.1016/0167-8191(88)90078-6
6411:10.1007/978-3-642-23397-5_10
5822:10.1016/0024-3795(71)90009-7
5309:. For multiplication of two
4003:divide-and-conquer algorithm
3713:theoretical computer science
2791:designates the memory size.
1252:. The matrix product is now
995:divide-and-conquer algorithm
989:Divide-and-conquer algorithm
6637:10.1016/j.parco.2008.10.002
6496:10.1016/j.procs.2014.05.208
6319:J. Parallel Distrib. Comput
5413:10.1007/978-1-84800-070-4_4
5399:The Algorithm Design Manual
5357:Matrix chain multiplication
2812:total arithmetic operations
1705:is given by the recurrence
497:bytes per cache line (i.e.
464:row- and column-major order
78:operations to multiply two
6941:
6792:System of linear equations
6341:10.1016/j.jpdc.2004.03.021
6222:Randall, Keith H. (1998).
6133:10.1038/s41586-022-05172-4
5741:Cambridge University Press
5668:Cache-Oblivious Algorithms
5564:Introduction to Algorithms
3941:introduced AlphaTensor, a
3879:-matrices with fewer than
3711:It is an open question in
2809:#matrix additions per step
2587:
971:cache misses; the divisor
88:matrices over that field (
6843:Cache-oblivious algorithm
6483:Procedia Computer Science
5982:SIAM Journal on Computing
5803:Winograd, Shmuel (1971).
5195:processors arranged in a
5116:communication bandwidth.
3997:Shared-memory parallelism
3762:{\displaystyle n\times n}
583:in row-major layout is a
6915:Numerical linear algebra
6894:General purpose software
6757:Numerical linear algebra
4005:sketched earlier can be
6676:10.1145/1356052.1356053
5850:10.1145/3087556.3087579
5367:Method of Four Russians
5106:communication bandwidth
3824:{\displaystyle \omega }
3736:{\displaystyle \omega }
487:fully associative cache
5871:. pp. C277–C303.
5302:
5095:
4223:
3977:
3851:
3825:
3805:
3763:
3737:
3696:
3534:
3428:
3266:
3160:
3059:
2985:
2884:
2785:
2765:
2719:
2639:
2632:
2574:
2424:
2234:
2084:
1830:
1749:
1660:
1235:
472:memory access patterns
466:
280:
246:
6889:Specialized libraries
6802:Matrix multiplication
6797:Matrix decompositions
6284:10.1145/800076.802486
5634:10.1145/106972.106981
5551:Leiserson, Charles E.
5403:. Springer. pp.
5300:
5293:Algorithms for meshes
5081:
5035:This algorithm has a
4224:
3978:
3899:that, given matrices
3897:Monte Carlo algorithm
3852:
3826:
3806:
3764:
3738:
3717:asymptotic complexity
3697:
3535:
3429:
3267:
3161:
3060:
2986:
2885:
2815:total I/O-complexity
2786:
2766:
2720:
2633:
2597:
2588:Further information:
2575:
2466:bytes, is bounded by
2425:
2235:
2085:
1831:
1750:
1675:scalar multiplication
1661:
1236:
461:
413:This algorithm takes
281:
226:
110:asymptotic complexity
32:matrix multiplication
6720:How To Optimize GEMM
6570:Information Sciences
6278:. pp. 326–333.
5122:, also known as the
5037:critical path length
4489:Parallel execution:
4020:
3952:
3893:Freivalds' algorithm
3835:
3815:
3773:
3747:
3727:
3545:
3453:
3277:
3185:
3070:
3010:
2895:
2835:
2775:
2755:
2658:
2644:Strassen's algorithm
2606:
2584:Sub-cubic algorithms
2473:
2287:
2133:
1977:
1760:
1712:
1259:
1007:
482:, or a mix of both.
207:
200:matrix with entries
102:Strassen's algorithm
44:scientific computing
36:numerical algorithms
6776:Numerical stability
6457:2013arXiv1304.1467B
6381:10.1147/rd.395.0575
6125:2022Natur.610...47F
2796:
2727:numerical stability
1870:Non-square matrices
426:asymptotic notation
140:Iterative algorithm
48:pattern recognition
6615:Parallel Computing
6512:Parallel Computing
5877:10.1137/22M1502719
5782:10.1007/BF02165411
5731:Teukolsky, Saul A.
5603:MIT OpenCourseWare
5319:Cannon's algorithm
5303:
5120:Cannon's algorithm
5096:
4219:
4213:
3973:
3847:
3821:
3801:
3759:
3733:
3723:, usually denoted
3692:
3530:
3424:
3262:
3173:Karstadt, Schwartz
3156:
3055:
2981:
2880:
2794:
2781:
2761:
2715:
2640:
2628:
2570:
2420:
2368:
2328:
2230:
2224:
2177:
2080:
2074:
2020:
1826:
1745:
1698:as its base case.
1656:
1650:
1451:
1387:
1320:
1231:
1222:
1148:
1074:
999:block partitioning
467:
430:algorithm analysis
276:
134:galactic algorithm
6902:
6901:
6420:978-3-642-23397-5
5754:978-0-521-88068-8
5674:(Master's). MIT.
5643:978-0-89791-380-5
5555:Rivest, Ronald L.
5547:Cormen, Thomas H.
5422:978-1-84800-069-8
3705:
3704:
3677:
3676:
3666:
3575:
3574:
3564:
3409:
3408:
3398:
3307:
3306:
3296:
3100:
3099:
3089:
2925:
2924:
2914:
2784:{\displaystyle M}
2764:{\displaystyle n}
2563:
2560:
2533:
1881:Inputs: matrices
634:Pick a tile size
16:(Redirected from
6932:
6812:Matrix splitting
6750:
6743:
6736:
6727:
6716:
6687:
6669:
6648:
6630:
6596:
6595:
6585:
6565:
6559:
6558:
6556:
6544:
6538:
6537:
6527:
6507:
6501:
6500:
6498:
6474:
6468:
6467:
6465:
6463:
6450:
6440:
6431:
6425:
6424:
6400:
6391:
6385:
6384:
6374:
6354:
6345:
6344:
6334:
6314:
6303:
6302:
6300:
6273:
6264:
6258:
6257:
6247:
6241:
6240:
6230:
6219:
6210:
6209:
6207:
6205:
6190:
6184:
6183:
6181:
6169:
6163:
6162:
6152:
6103:
6094:
6093:
6091:
6090:
6085:. 5 October 2022
6083:www.deepmind.com
6075:
6069:
6068:
6065:
6048:
6039:
6033:
6032:
6027:
6026:
6020:
6014:, archived from
5997:
5979:
5970:
5964:
5963:
5962:
5944:
5935:
5929:
5928:
5926:
5914:
5908:
5907:
5887:
5881:
5880:
5860:
5854:
5853:
5833:
5827:
5826:
5824:
5800:
5794:
5793:
5765:
5759:
5758:
5739:(3rd ed.).
5726:
5720:
5719:
5710:
5690:
5684:
5683:
5673:
5659:
5648:
5647:
5621:
5615:
5614:
5612:
5610:
5594:
5579:
5578:
5543:
5532:
5531:
5529:
5528:
5513:
5507:
5506:
5505:
5487:
5478:
5477:
5476:
5460:
5454:
5453:
5452:
5436:
5427:
5426:
5402:
5389:
5283:
5277:
5271:
5251:
5236:
5234:
5233:
5216:
5215:
5214:
5205:
5204:
5203:
5194:
5187:
5179:
5177:
5176:
5159:
5153:
5152:
5151:
5141:
5140:
5139:
5115:
5093:
5089:
5085:
5074:
5066:
5062:
5046:
5031:
5003:
4980:
4957:
4934:
4908:
4899:
4890:
4881:
4872:
4865:
4856:
4847:
4838:
4829:
4819:
4796:
4786:, element-wise:
4785:
4781:
4777:
4757:
4750:
4728:
4698:
4668:
4638:
4608:
4578:
4548:
4518:
4485:
4476:
4467:
4458:
4449:
4442:
4433:
4424:
4415:
4406:
4399:
4390:
4381:
4372:
4363:
4356:
4347:
4338:
4329:
4320:
4313:
4303:
4296:
4273:
4262:
4228:
4226:
4225:
4220:
4218:
4217:
4210:
4209:
4200:
4199:
4187:
4186:
4177:
4176:
4165:
4164:
4155:
4154:
4142:
4141:
4132:
4131:
4118:
4117:
4108:
4107:
4095:
4094:
4085:
4084:
4073:
4072:
4063:
4062:
4050:
4049:
4040:
4039:
4009:in two ways for
3982:
3980:
3979:
3974:
3972:
3964:
3959:
3928:
3918:
3910:
3906:
3902:
3884:
3878:
3856:
3854:
3853:
3848:
3830:
3828:
3827:
3822:
3810:
3808:
3807:
3802:
3800:
3799:
3768:
3766:
3765:
3760:
3742:
3740:
3739:
3734:
3701:
3699:
3698:
3693:
3682:
3678:
3672:
3671:
3667:
3662:
3659:
3650:
3649:
3637:
3636:
3621:
3620:
3599:
3598:
3591:
3590:
3580:
3576:
3570:
3569:
3565:
3560:
3557:
3539:
3537:
3536:
3531:
3523:
3522:
3513:
3512:
3497:
3496:
3481:
3480:
3473:
3472:
3441:Schwartz, Vaknin
3433:
3431:
3430:
3425:
3414:
3410:
3404:
3403:
3399:
3394:
3391:
3382:
3381:
3369:
3368:
3353:
3352:
3331:
3330:
3323:
3322:
3312:
3308:
3302:
3301:
3297:
3292:
3289:
3271:
3269:
3268:
3263:
3255:
3254:
3245:
3244:
3229:
3228:
3213:
3212:
3205:
3204:
3165:
3163:
3162:
3157:
3146:
3145:
3124:
3123:
3116:
3115:
3105:
3101:
3095:
3094:
3090:
3085:
3082:
3064:
3062:
3061:
3056:
3054:
3053:
3038:
3037:
3030:
3029:
2990:
2988:
2987:
2982:
2971:
2970:
2949:
2948:
2941:
2940:
2930:
2926:
2920:
2919:
2915:
2910:
2907:
2889:
2887:
2886:
2881:
2879:
2878:
2863:
2862:
2855:
2854:
2797:
2790:
2788:
2787:
2782:
2770:
2768:
2767:
2762:
2735:
2724:
2722:
2721:
2716:
2711:
2710:
2689:
2688:
2681:
2680:
2653:
2637:
2635:
2634:
2629:
2624:
2623:
2601:
2579:
2577:
2576:
2571:
2569:
2565:
2564:
2562:
2561:
2556:
2550:
2539:
2534:
2529:
2503:
2465:
2461:
2452:multiprogramming
2429:
2427:
2426:
2421:
2419:
2418:
2409:
2408:
2396:
2395:
2386:
2385:
2373:
2372:
2365:
2364:
2351:
2350:
2333:
2332:
2325:
2324:
2313:
2312:
2274:
2270:
2266:
2239:
2237:
2236:
2231:
2229:
2228:
2221:
2220:
2206:
2205:
2182:
2181:
2174:
2173:
2162:
2161:
2120:
2116:
2089:
2087:
2086:
2081:
2079:
2078:
2068:
2067:
2051:
2050:
2030:
2025:
2024:
2017:
2016:
2003:
2002:
1964:
1960:
1935:Recursive cases:
1927:
1908:
1898:
1894:
1884:
1865:
1853:
1845:
1835:
1833:
1832:
1827:
1819:
1818:
1794:
1754:
1752:
1751:
1746:
1704:
1697:
1665:
1663:
1662:
1657:
1655:
1654:
1647:
1646:
1637:
1636:
1624:
1623:
1614:
1613:
1602:
1601:
1592:
1591:
1579:
1578:
1569:
1568:
1555:
1554:
1545:
1544:
1532:
1531:
1522:
1521:
1510:
1509:
1500:
1499:
1487:
1486:
1477:
1476:
1456:
1455:
1448:
1447:
1436:
1435:
1422:
1421:
1410:
1409:
1392:
1391:
1384:
1383:
1372:
1371:
1358:
1357:
1346:
1345:
1325:
1324:
1317:
1316:
1305:
1304:
1291:
1290:
1279:
1278:
1251:
1247:
1240:
1238:
1237:
1232:
1227:
1226:
1219:
1218:
1207:
1206:
1193:
1192:
1181:
1180:
1153:
1152:
1145:
1144:
1133:
1132:
1119:
1118:
1107:
1106:
1079:
1078:
1071:
1070:
1059:
1058:
1045:
1044:
1033:
1032:
984:
983:
982:
970:
968:
966:
965:
964:
963:
952:
949:
933:
917:
896:
876:
860:
856:
850:
843:
827:
823:
816:
800:
796:
789:
758:
727:
693:
689:
685:
678:
674:
670:
663:
659:
655:
649:
647:
646:
630:
624:
620:
617:Input: matrices
610:
609:
608:
599:
598:
597:
582:
578:
570:
562:
558:
555:and a column of
554:
550:
549:
547:
546:
541:
538:
524:
520:
516:
514:
513:
508:
505:
496:
492:
462:Illustration of
449:
441:
423:
407:
397:
383:
363:
359:
353:
346:
342:
335:
331:
324:
318:
314:
311:Input: matrices
304:
300:
296:
292:
285:
283:
282:
277:
272:
271:
259:
258:
245:
240:
222:
221:
199:
189:
185:
181:
171:
167:
157:
131:
119:
95:
87:
74:
69:on the order of
21:
6940:
6939:
6935:
6934:
6933:
6931:
6930:
6929:
6905:
6904:
6903:
6898:
6857:
6853:Multiprocessing
6821:
6817:Sparse problems
6780:
6759:
6754:
6724:
6705:10.1145/2764454
6690:
6667:10.1.1.140.3583
6651:
6608:
6604:
6602:Further reading
6599:
6567:
6566:
6562:
6546:
6545:
6541:
6509:
6508:
6504:
6476:
6475:
6471:
6461:
6459:
6438:
6433:
6432:
6428:
6421:
6398:
6393:
6392:
6388:
6359:IBM J. Res. Dev
6356:
6355:
6348:
6316:
6315:
6306:
6298:
6271:
6266:
6265:
6261:
6249:
6248:
6244:
6228:
6221:
6220:
6213:
6203:
6201:
6199:Quanta Magazine
6192:
6191:
6187:
6171:
6170:
6166:
6119:(7930): 47–53.
6105:
6104:
6097:
6088:
6086:
6077:
6076:
6072:
6060:
6046:
6041:
6040:
6036:
6024:
6022:
6018:
6004:10.1137/0218045
5995:10.1.1.531.9309
5977:
5972:
5971:
5967:
5942:
5937:
5936:
5932:
5916:
5915:
5911:
5904:10.1137/0205016
5889:
5888:
5884:
5862:
5861:
5857:
5835:
5834:
5830:
5802:
5801:
5797:
5767:
5766:
5762:
5755:
5728:
5727:
5723:
5717:10.1137/0204009
5708:10.1.1.148.9947
5692:
5691:
5687:
5671:
5661:
5660:
5651:
5644:
5623:
5622:
5618:
5608:
5606:
5596:
5595:
5582:
5575:
5559:Stein, Clifford
5545:
5544:
5535:
5526:
5524:
5522:Quanta Magazine
5515:
5514:
5510:
5489:
5488:
5481:
5462:
5461:
5457:
5438:
5437:
5430:
5423:
5391:
5390:
5379:
5375:
5338:
5295:
5279:
5273:
5258:
5238:
5229:
5227:
5218:
5210:
5208:
5207:
5199:
5197:
5196:
5192:
5181:
5172:
5170:
5161:
5155:
5146:
5144:
5143:
5134:
5132:
5131:
5109:
5101:
5091:
5087:
5083:
5068:
5064:
5052:
5040:
5029:
5018:
5017:
5001:
4994:
4987:
4978:
4971:
4964:
4955:
4948:
4941:
4932:
4925:
4918:
4907:
4901:
4898:
4892:
4889:
4883:
4880:
4874:
4870:
4864:
4858:
4855:
4849:
4846:
4840:
4837:
4831:
4827:
4818:
4811:
4804:
4798:
4791:
4783:
4779:
4767:
4755:
4740:
4726:
4719:
4712:
4705:
4696:
4689:
4682:
4675:
4666:
4659:
4652:
4645:
4636:
4629:
4622:
4615:
4606:
4599:
4592:
4585:
4576:
4569:
4562:
4555:
4546:
4539:
4532:
4525:
4516:
4509:
4502:
4495:
4484:
4478:
4475:
4469:
4466:
4460:
4457:
4451:
4447:
4441:
4435:
4432:
4426:
4423:
4417:
4414:
4408:
4404:
4398:
4392:
4389:
4383:
4380:
4374:
4371:
4365:
4361:
4355:
4349:
4346:
4340:
4337:
4331:
4328:
4322:
4318:
4305:
4301:
4295:
4288:
4281:
4275:
4268:
4248:
4235:fork–join style
4212:
4211:
4201:
4191:
4178:
4168:
4166:
4156:
4146:
4133:
4123:
4120:
4119:
4109:
4099:
4086:
4076:
4074:
4064:
4054:
4041:
4031:
4024:
4018:
4017:
3999:
3994:
3986:remains unknown
3950:
3949:
3935:
3920:
3912:
3908:
3904:
3900:
3880:
3870:
3867:Shmuel Winograd
3863:Don Coppersmith
3833:
3832:
3813:
3812:
3776:
3771:
3770:
3745:
3744:
3725:
3724:
3660:
3654:
3641:
3628:
3612:
3582:
3558:
3552:
3551:
3543:
3542:
3514:
3504:
3488:
3464:
3459:
3451:
3450:
3392:
3386:
3373:
3360:
3344:
3314:
3290:
3284:
3283:
3275:
3274:
3246:
3236:
3220:
3196:
3191:
3183:
3182:
3137:
3107:
3083:
3077:
3076:
3068:
3067:
3045:
3021:
3016:
3008:
3007:
2962:
2932:
2908:
2902:
2901:
2893:
2892:
2870:
2846:
2841:
2833:
2832:
2773:
2772:
2753:
2752:
2730:
2702:
2672:
2667:
2656:
2655:
2651:
2648:Volker Strassen
2615:
2604:
2603:
2599:
2592:
2586:
2551:
2540:
2504:
2483:
2479:
2471:
2470:
2463:
2459:
2448:cache-oblivious
2440:
2435:
2434:
2410:
2400:
2387:
2377:
2367:
2366:
2356:
2353:
2352:
2342:
2335:
2327:
2326:
2316:
2314:
2304:
2297:
2285:
2284:
2272:
2271:vertically and
2268:
2249:
2223:
2222:
2212:
2207:
2197:
2187:
2176:
2175:
2165:
2163:
2153:
2146:
2131:
2130:
2118:
2099:
2073:
2072:
2059:
2056:
2055:
2042:
2035:
2019:
2018:
2008:
2005:
2004:
1994:
1987:
1975:
1974:
1962:
1943:
1913:
1900:
1896:
1886:
1882:
1872:
1859:
1847:
1840:
1810:
1758:
1757:
1710:
1709:
1702:
1696:
1690:
1683:
1677:
1649:
1648:
1638:
1628:
1615:
1605:
1603:
1593:
1583:
1570:
1560:
1557:
1556:
1546:
1536:
1523:
1513:
1511:
1501:
1491:
1478:
1468:
1461:
1450:
1449:
1439:
1437:
1427:
1424:
1423:
1413:
1411:
1401:
1394:
1386:
1385:
1375:
1373:
1363:
1360:
1359:
1349:
1347:
1337:
1330:
1319:
1318:
1308:
1306:
1296:
1293:
1292:
1282:
1280:
1270:
1263:
1257:
1256:
1249:
1245:
1221:
1220:
1210:
1208:
1198:
1195:
1194:
1184:
1182:
1172:
1165:
1147:
1146:
1136:
1134:
1124:
1121:
1120:
1110:
1108:
1098:
1091:
1073:
1072:
1062:
1060:
1050:
1047:
1046:
1036:
1034:
1024:
1017:
1005:
1004:
991:
978:
976:
972:
959:
957:
953:
950:
945:
944:
942:
940:
937:
936:
931:
914:
907:
902:
894:
887:
881:
862:
858:
854:
848:
829:
825:
821:
802:
798:
794:
788:
760:
757:
729:
726:
698:
691:
687:
683:
676:
672:
668:
661:
657:
653:
642:
640:
635:
628:
622:
618:
604:
602:
601:
593:
591:
590:
580:
576:
564:
560:
556:
552:
542:
539:
534:
533:
531:
526:
522:
518:
509:
506:
501:
500:
498:
494:
490:
456:
443:
433:
417:
411:
410:
405:
394:
389:
381:
374:
368:
361:
357:
351:
344:
340:
333:
329:
322:
316:
312:
302:
301:from 1 through
298:
294:
293:from 1 through
290:
260:
247:
210:
205:
204:
191:
187:
183:
173:
169:
159:
149:
142:
125:
120:time, given by
113:
89:
79:
70:
28:
23:
22:
15:
12:
11:
5:
6938:
6936:
6928:
6927:
6922:
6917:
6907:
6906:
6900:
6899:
6897:
6896:
6891:
6886:
6881:
6876:
6871:
6865:
6863:
6859:
6858:
6856:
6855:
6850:
6845:
6840:
6835:
6829:
6827:
6823:
6822:
6820:
6819:
6814:
6809:
6799:
6794:
6788:
6786:
6782:
6781:
6779:
6778:
6773:
6771:Floating point
6767:
6765:
6761:
6760:
6755:
6753:
6752:
6745:
6738:
6730:
6723:
6722:
6717:
6688:
6649:
6611:Dongarra, Jack
6605:
6603:
6600:
6598:
6597:
6583:10.1.1.90.4753
6576:(3): 347–365.
6560:
6539:
6525:10.1.1.88.8527
6502:
6469:
6426:
6419:
6386:
6372:10.1.1.44.3404
6365:(5): 575–582.
6346:
6332:10.1.1.20.7034
6325:(9): 1017–26.
6304:
6259:
6242:
6211:
6185:
6164:
6095:
6070:
6034:
5988:(4): 658–669,
5965:
5930:
5909:
5898:(2): 187–203.
5892:SIAM J. Comput
5882:
5855:
5828:
5815:(4): 381–388.
5795:
5776:(4): 354–356.
5760:
5753:
5721:
5685:
5663:Prokop, Harald
5649:
5642:
5616:
5580:
5573:
5533:
5508:
5479:
5455:
5428:
5421:
5393:Skiena, Steven
5376:
5374:
5371:
5370:
5369:
5364:
5359:
5354:
5349:
5344:
5337:
5334:
5294:
5291:
5188:computation).
5100:
5097:
5016:
5015:
5014:
5013:
5007:
5006:
5005:
4999:
4992:
4982:
4976:
4969:
4959:
4953:
4946:
4936:
4930:
4923:
4910:
4905:
4896:
4887:
4878:
4867:
4862:
4853:
4844:
4835:
4821:
4816:
4809:
4802:
4790:Base case: if
4762:
4761:
4760:
4759:
4752:
4738:
4732:
4731:
4730:
4724:
4717:
4710:
4700:
4694:
4687:
4680:
4670:
4664:
4657:
4650:
4640:
4634:
4627:
4620:
4610:
4604:
4597:
4590:
4580:
4574:
4567:
4560:
4550:
4544:
4537:
4530:
4520:
4514:
4507:
4500:
4487:
4482:
4473:
4464:
4455:
4444:
4439:
4430:
4421:
4412:
4401:
4396:
4387:
4378:
4369:
4358:
4353:
4344:
4335:
4326:
4298:
4293:
4286:
4279:
4267:Base case: if
4243:
4242:
4230:
4229:
4216:
4208:
4204:
4198:
4194:
4190:
4185:
4181:
4175:
4171:
4167:
4163:
4159:
4153:
4149:
4145:
4140:
4136:
4130:
4126:
4122:
4121:
4116:
4112:
4106:
4102:
4098:
4093:
4089:
4083:
4079:
4075:
4071:
4067:
4061:
4057:
4053:
4048:
4044:
4038:
4034:
4030:
4029:
4027:
3998:
3995:
3993:
3990:
3971:
3967:
3963:
3958:
3943:neural network
3934:
3931:
3911:, verifies in
3887:Big O notation
3846:
3843:
3840:
3820:
3798:
3795:
3792:
3789:
3786:
3783:
3779:
3758:
3755:
3752:
3732:
3703:
3702:
3691:
3688:
3685:
3681:
3675:
3670:
3665:
3657:
3653:
3648:
3644:
3640:
3635:
3631:
3627:
3624:
3619:
3615:
3611:
3608:
3605:
3602:
3597:
3594:
3589:
3585:
3579:
3573:
3568:
3563:
3555:
3550:
3540:
3529:
3526:
3521:
3517:
3511:
3507:
3503:
3500:
3495:
3491:
3487:
3484:
3479:
3476:
3471:
3467:
3462:
3458:
3448:
3445:
3442:
3439:
3435:
3434:
3423:
3420:
3417:
3413:
3407:
3402:
3397:
3389:
3385:
3380:
3376:
3372:
3367:
3363:
3359:
3356:
3351:
3347:
3343:
3340:
3337:
3334:
3329:
3326:
3321:
3317:
3311:
3305:
3300:
3295:
3287:
3282:
3272:
3261:
3258:
3253:
3249:
3243:
3239:
3235:
3232:
3227:
3223:
3219:
3216:
3211:
3208:
3203:
3199:
3194:
3190:
3180:
3177:
3174:
3171:
3167:
3166:
3155:
3152:
3149:
3144:
3140:
3136:
3133:
3130:
3127:
3122:
3119:
3114:
3110:
3104:
3098:
3093:
3088:
3080:
3075:
3065:
3052:
3048:
3044:
3041:
3036:
3033:
3028:
3024:
3019:
3015:
3005:
3002:
2999:
2996:
2992:
2991:
2980:
2977:
2974:
2969:
2965:
2961:
2958:
2955:
2952:
2947:
2944:
2939:
2935:
2929:
2923:
2918:
2913:
2905:
2900:
2890:
2877:
2873:
2869:
2866:
2861:
2858:
2853:
2849:
2844:
2840:
2830:
2827:
2824:
2821:
2817:
2816:
2813:
2810:
2807:
2804:
2801:
2780:
2760:
2749:big O notation
2714:
2709:
2705:
2701:
2698:
2695:
2692:
2687:
2684:
2679:
2675:
2670:
2666:
2663:
2627:
2622:
2618:
2614:
2611:
2585:
2582:
2581:
2580:
2568:
2559:
2554:
2549:
2546:
2543:
2537:
2532:
2528:
2525:
2522:
2519:
2516:
2513:
2510:
2507:
2501:
2498:
2495:
2492:
2489:
2486:
2482:
2478:
2439:
2438:Cache behavior
2436:
2433:
2432:
2431:
2430:
2417:
2413:
2407:
2403:
2399:
2394:
2390:
2384:
2380:
2376:
2371:
2363:
2359:
2355:
2354:
2349:
2345:
2341:
2340:
2338:
2331:
2323:
2319:
2315:
2311:
2307:
2303:
2302:
2300:
2295:
2292:
2279:
2278:
2277:
2276:
2243:
2242:
2241:
2240:
2227:
2219:
2215:
2211:
2208:
2204:
2200:
2196:
2193:
2192:
2190:
2185:
2180:
2172:
2168:
2164:
2160:
2156:
2152:
2151:
2149:
2144:
2141:
2138:
2125:
2124:
2123:
2122:
2093:
2092:
2091:
2090:
2077:
2071:
2066:
2062:
2058:
2057:
2054:
2049:
2045:
2041:
2040:
2038:
2033:
2029:
2023:
2015:
2011:
2007:
2006:
2001:
1997:
1993:
1992:
1990:
1985:
1982:
1969:
1968:
1967:
1966:
1937:
1936:
1933:
1912:Base case: if
1910:
1878:
1877:
1871:
1868:
1837:
1836:
1825:
1822:
1817:
1813:
1809:
1806:
1803:
1800:
1797:
1793:
1789:
1786:
1783:
1780:
1777:
1774:
1771:
1768:
1765:
1755:
1744:
1741:
1738:
1735:
1732:
1729:
1726:
1723:
1720:
1717:
1694:
1688:
1681:
1667:
1666:
1653:
1645:
1641:
1635:
1631:
1627:
1622:
1618:
1612:
1608:
1604:
1600:
1596:
1590:
1586:
1582:
1577:
1573:
1567:
1563:
1559:
1558:
1553:
1549:
1543:
1539:
1535:
1530:
1526:
1520:
1516:
1512:
1508:
1504:
1498:
1494:
1490:
1485:
1481:
1475:
1471:
1467:
1466:
1464:
1459:
1454:
1446:
1442:
1438:
1434:
1430:
1426:
1425:
1420:
1416:
1412:
1408:
1404:
1400:
1399:
1397:
1390:
1382:
1378:
1374:
1370:
1366:
1362:
1361:
1356:
1352:
1348:
1344:
1340:
1336:
1335:
1333:
1328:
1323:
1315:
1311:
1307:
1303:
1299:
1295:
1294:
1289:
1285:
1281:
1277:
1273:
1269:
1268:
1266:
1242:
1241:
1230:
1225:
1217:
1213:
1209:
1205:
1201:
1197:
1196:
1191:
1187:
1183:
1179:
1175:
1171:
1170:
1168:
1163:
1160:
1156:
1151:
1143:
1139:
1135:
1131:
1127:
1123:
1122:
1117:
1113:
1109:
1105:
1101:
1097:
1096:
1094:
1089:
1086:
1082:
1077:
1069:
1065:
1061:
1057:
1053:
1049:
1048:
1043:
1039:
1035:
1031:
1027:
1023:
1022:
1020:
1015:
1012:
990:
987:
935:
934:
928:
927:
926:
925:
924:
923:
922:
921:
920:
919:
918:
912:
905:
899:
898:
897:
892:
885:
851:
791:
764:
733:
702:
650:
632:
625:
614:
613:
489:consisting of
455:
454:Cache behavior
452:
409:
408:
402:
401:
400:
399:
398:
392:
386:
385:
384:
379:
372:
354:
326:
319:
308:
307:
287:
286:
275:
270:
267:
263:
257:
254:
250:
244:
239:
236:
233:
229:
225:
220:
217:
213:
141:
138:
98:big O notation
26:
24:
14:
13:
10:
9:
6:
4:
3:
2:
6937:
6926:
6923:
6921:
6920:Matrix theory
6918:
6916:
6913:
6912:
6910:
6895:
6892:
6890:
6887:
6885:
6882:
6880:
6877:
6875:
6872:
6870:
6867:
6866:
6864:
6860:
6854:
6851:
6849:
6846:
6844:
6841:
6839:
6836:
6834:
6831:
6830:
6828:
6824:
6818:
6815:
6813:
6810:
6807:
6803:
6800:
6798:
6795:
6793:
6790:
6789:
6787:
6783:
6777:
6774:
6772:
6769:
6768:
6766:
6762:
6758:
6751:
6746:
6744:
6739:
6737:
6732:
6731:
6728:
6721:
6718:
6714:
6710:
6706:
6702:
6698:
6694:
6689:
6685:
6681:
6677:
6673:
6668:
6663:
6659:
6655:
6650:
6646:
6642:
6638:
6634:
6629:
6624:
6620:
6616:
6612:
6607:
6606:
6601:
6593:
6589:
6584:
6579:
6575:
6571:
6564:
6561:
6555:
6550:
6543:
6540:
6535:
6531:
6526:
6521:
6517:
6513:
6506:
6503:
6497:
6492:
6488:
6484:
6480:
6473:
6470:
6458:
6454:
6449:
6444:
6437:
6430:
6427:
6422:
6416:
6412:
6408:
6404:
6397:
6390:
6387:
6382:
6378:
6373:
6368:
6364:
6360:
6353:
6351:
6347:
6342:
6338:
6333:
6328:
6324:
6320:
6313:
6311:
6309:
6305:
6297:
6293:
6289:
6285:
6281:
6277:
6270:
6263:
6260:
6255:
6254:
6246:
6243:
6238:
6234:
6227:
6226:
6218:
6216:
6212:
6200:
6196:
6189:
6186:
6180:
6175:
6168:
6165:
6160:
6156:
6151:
6146:
6142:
6138:
6134:
6130:
6126:
6122:
6118:
6114:
6110:
6102:
6100:
6096:
6084:
6080:
6074:
6071:
6067:
6063:
6056:
6052:
6045:
6038:
6035:
6031:
6021:on 2014-03-05
6017:
6013:
6009:
6005:
6001:
5996:
5991:
5987:
5983:
5976:
5969:
5966:
5961:
5956:
5952:
5948:
5941:
5934:
5931:
5925:
5920:
5913:
5910:
5905:
5901:
5897:
5893:
5886:
5883:
5878:
5874:
5870:
5866:
5859:
5856:
5851:
5847:
5843:
5839:
5832:
5829:
5823:
5818:
5814:
5810:
5806:
5799:
5796:
5791:
5787:
5783:
5779:
5775:
5771:
5764:
5761:
5756:
5750:
5746:
5742:
5738:
5737:
5732:
5725:
5722:
5718:
5714:
5709:
5704:
5701:(2): 97–107,
5700:
5696:
5689:
5686:
5681:
5677:
5670:
5669:
5664:
5658:
5656:
5654:
5650:
5645:
5639:
5635:
5631:
5627:
5620:
5617:
5604:
5600:
5593:
5591:
5589:
5587:
5585:
5581:
5576:
5574:0-262-03384-4
5570:
5566:
5565:
5560:
5556:
5552:
5548:
5542:
5540:
5538:
5534:
5523:
5519:
5512:
5509:
5504:
5499:
5495:
5494:
5486:
5484:
5480:
5475:
5470:
5466:
5459:
5456:
5451:
5446:
5442:
5435:
5433:
5429:
5424:
5418:
5414:
5410:
5406:
5401:
5400:
5394:
5388:
5386:
5384:
5382:
5378:
5372:
5368:
5365:
5363:
5360:
5358:
5355:
5353:
5350:
5348:
5345:
5343:
5340:
5339:
5335:
5333:
5331:
5326:
5324:
5320:
5316:
5312:
5308:
5299:
5292:
5290:
5288:
5282:
5276:
5269:
5265:
5261:
5255:
5254:3D algorithm,
5249:
5245:
5241:
5232:
5225:
5221:
5213:
5202:
5189:
5185:
5175:
5168:
5164:
5158:
5149:
5137:
5129:
5125:
5121:
5117:
5113:
5107:
5098:
5080:
5076:
5072:
5060:
5056:
5050:
5044:
5038:
5033:
5027:
5023:
5011:
5008:
4998:
4991:
4986:
4983:
4975:
4968:
4963:
4960:
4952:
4945:
4940:
4937:
4929:
4922:
4917:
4914:
4913:
4912:In parallel:
4911:
4904:
4895:
4886:
4877:
4868:
4861:
4852:
4843:
4834:
4825:
4824:
4822:
4815:
4808:
4801:
4794:
4789:
4788:
4787:
4775:
4771:
4766:
4753:
4748:
4744:
4739:
4736:
4733:
4723:
4716:
4709:
4704:
4701:
4693:
4686:
4679:
4674:
4671:
4663:
4656:
4649:
4644:
4641:
4633:
4626:
4619:
4614:
4611:
4603:
4596:
4589:
4584:
4581:
4573:
4566:
4559:
4554:
4551:
4543:
4536:
4529:
4524:
4521:
4513:
4506:
4499:
4494:
4491:
4490:
4488:
4481:
4472:
4463:
4454:
4445:
4438:
4429:
4420:
4411:
4402:
4395:
4386:
4377:
4368:
4359:
4352:
4343:
4334:
4325:
4316:
4315:
4312:
4308:
4299:
4292:
4285:
4278:
4271:
4266:
4265:
4264:
4260:
4256:
4252:
4247:
4241:
4239:
4236:
4214:
4206:
4202:
4196:
4192:
4188:
4183:
4179:
4173:
4169:
4161:
4157:
4151:
4147:
4143:
4138:
4134:
4128:
4124:
4114:
4110:
4104:
4100:
4096:
4091:
4087:
4081:
4077:
4069:
4065:
4059:
4055:
4051:
4046:
4042:
4036:
4032:
4025:
4016:
4015:
4014:
4012:
4008:
4004:
3996:
3991:
3989:
3987:
3983:
3965:
3961:
3948:finite field
3944:
3940:
3932:
3930:
3927:
3923:
3916:
3898:
3894:
3890:
3888:
3883:
3877:
3873:
3868:
3864:
3860:
3844:
3841:
3838:
3818:
3793:
3787:
3784:
3781:
3777:
3756:
3753:
3750:
3730:
3722:
3718:
3714:
3709:
3689:
3686:
3683:
3679:
3673:
3668:
3663:
3655:
3651:
3646:
3642:
3638:
3633:
3629:
3625:
3622:
3617:
3613:
3609:
3606:
3603:
3600:
3595:
3592:
3587:
3583:
3577:
3571:
3566:
3561:
3553:
3548:
3541:
3527:
3524:
3519:
3515:
3509:
3505:
3501:
3498:
3493:
3489:
3485:
3482:
3477:
3474:
3469:
3465:
3460:
3456:
3449:
3446:
3443:
3440:
3437:
3436:
3421:
3418:
3415:
3411:
3405:
3400:
3395:
3387:
3383:
3378:
3374:
3370:
3365:
3361:
3357:
3354:
3349:
3345:
3341:
3338:
3335:
3332:
3327:
3324:
3319:
3315:
3309:
3303:
3298:
3293:
3285:
3280:
3273:
3259:
3256:
3251:
3247:
3241:
3237:
3233:
3230:
3225:
3221:
3217:
3214:
3209:
3206:
3201:
3197:
3192:
3188:
3181:
3178:
3175:
3172:
3169:
3168:
3153:
3150:
3147:
3142:
3138:
3134:
3131:
3128:
3125:
3120:
3117:
3112:
3108:
3102:
3096:
3091:
3086:
3078:
3073:
3066:
3050:
3046:
3042:
3039:
3034:
3031:
3026:
3022:
3017:
3013:
3006:
3003:
3000:
2997:
2994:
2993:
2978:
2975:
2972:
2967:
2963:
2959:
2956:
2953:
2950:
2945:
2942:
2937:
2933:
2927:
2921:
2916:
2911:
2903:
2898:
2891:
2875:
2871:
2867:
2864:
2859:
2856:
2851:
2847:
2842:
2838:
2831:
2828:
2825:
2822:
2819:
2818:
2814:
2811:
2808:
2805:
2802:
2799:
2798:
2792:
2778:
2758:
2750:
2745:
2743:
2742:finite fields
2739:
2733:
2728:
2707:
2703:
2696:
2693:
2685:
2682:
2677:
2673:
2668:
2661:
2649:
2646:, devised by
2645:
2620:
2616:
2609:
2596:
2591:
2583:
2566:
2557:
2552:
2547:
2544:
2541:
2535:
2530:
2526:
2523:
2520:
2517:
2514:
2511:
2508:
2505:
2499:
2496:
2493:
2490:
2487:
2484:
2480:
2469:
2468:
2467:
2456:
2453:
2449:
2445:
2437:
2415:
2411:
2405:
2401:
2397:
2392:
2388:
2382:
2378:
2374:
2369:
2361:
2357:
2347:
2343:
2336:
2329:
2321:
2317:
2309:
2305:
2298:
2293:
2290:
2283:
2282:
2281:
2280:
2275:horizontally:
2265:
2261:
2257:
2253:
2247:
2246:
2245:
2244:
2225:
2217:
2213:
2209:
2202:
2198:
2194:
2188:
2183:
2178:
2170:
2166:
2158:
2154:
2147:
2142:
2139:
2136:
2129:
2128:
2127:
2126:
2115:
2111:
2107:
2103:
2097:
2096:
2095:
2094:
2075:
2069:
2064:
2060:
2052:
2047:
2043:
2036:
2031:
2027:
2021:
2013:
2009:
1999:
1995:
1988:
1983:
1980:
1973:
1972:
1971:
1970:
1965:horizontally:
1959:
1955:
1951:
1947:
1941:
1940:
1939:
1938:
1934:
1931:
1925:
1921:
1917:
1911:
1907:
1903:
1893:
1889:
1880:
1879:
1876:
1869:
1867:
1863:
1857:
1851:
1843:
1823:
1815:
1811:
1801:
1795:
1791:
1787:
1781:
1778:
1775:
1769:
1763:
1756:
1742:
1736:
1727:
1721:
1715:
1708:
1707:
1706:
1699:
1693:
1687:
1680:
1676:
1672:
1651:
1643:
1639:
1633:
1629:
1625:
1620:
1616:
1610:
1606:
1598:
1594:
1588:
1584:
1580:
1575:
1571:
1565:
1561:
1551:
1547:
1541:
1537:
1533:
1528:
1524:
1518:
1514:
1506:
1502:
1496:
1492:
1488:
1483:
1479:
1473:
1469:
1462:
1457:
1452:
1444:
1440:
1432:
1428:
1418:
1414:
1406:
1402:
1395:
1388:
1380:
1376:
1368:
1364:
1354:
1350:
1342:
1338:
1331:
1326:
1321:
1313:
1309:
1301:
1297:
1287:
1283:
1275:
1271:
1264:
1255:
1254:
1253:
1228:
1223:
1215:
1211:
1203:
1199:
1189:
1185:
1177:
1173:
1166:
1161:
1158:
1154:
1149:
1141:
1137:
1129:
1125:
1115:
1111:
1103:
1099:
1092:
1087:
1084:
1080:
1075:
1067:
1063:
1055:
1051:
1041:
1037:
1029:
1025:
1018:
1013:
1010:
1003:
1002:
1001:
1000:
996:
988:
986:
981:
975:
962:
956:
948:
929:
915:
908:
900:
895:
888:
879:
878:
874:
870:
866:
852:
846:
845:
841:
837:
833:
819:
818:
814:
810:
806:
792:
787:
783:
779:
775:
771:
767:
763:
756:
752:
748:
744:
740:
736:
732:
725:
721:
717:
713:
709:
705:
701:
696:
695:
681:
680:
666:
665:
651:
645:
638:
633:
626:
616:
615:
612:
607:
596:
588:
587:
573:
568:
545:
537:
529:
512:
504:
488:
483:
481:
477:
473:
465:
460:
453:
451:
447:
440:
436:
431:
427:
421:
416:
403:
395:
387:
382:
375:
366:
365:
355:
349:
348:
338:
337:
327:
320:
310:
309:
306:
273:
268:
265:
261:
255:
252:
248:
242:
237:
234:
231:
227:
223:
218:
215:
211:
203:
202:
201:
198:
194:
180:
176:
166:
162:
156:
152:
147:
139:
137:
135:
129:
123:
117:
111:
107:
103:
99:
93:
86:
82:
77:
73:
68:
63:
61:
57:
53:
49:
45:
41:
37:
33:
19:
6805:
6764:Key concepts
6696:
6692:
6657:
6653:
6618:
6614:
6573:
6569:
6563:
6542:
6518:(3): 383–5.
6515:
6511:
6505:
6486:
6482:
6472:
6460:. Retrieved
6429:
6402:
6389:
6362:
6358:
6322:
6318:
6275:
6262:
6252:
6245:
6237:1721.1/47519
6224:
6202:. Retrieved
6198:
6188:
6167:
6116:
6112:
6087:. Retrieved
6082:
6073:
6061:
6058:
6054:
6050:
6037:
6029:
6023:, retrieved
6016:the original
5985:
5981:
5968:
5950:
5946:
5933:
5912:
5895:
5891:
5885:
5868:
5858:
5841:
5831:
5812:
5808:
5798:
5773:
5769:
5763:
5735:
5724:
5698:
5694:
5688:
5680:1721.1/80568
5667:
5625:
5619:
5607:. Retrieved
5602:
5563:
5525:. Retrieved
5521:
5511:
5492:
5464:
5458:
5440:
5407:–46, 401–3.
5398:
5329:
5327:
5322:
5314:
5310:
5304:
5280:
5274:
5267:
5263:
5259:
5253:
5247:
5243:
5239:
5230:
5223:
5219:
5211:
5200:
5190:
5183:
5173:
5166:
5162:
5156:
5147:
5135:
5124:2D algorithm
5123:
5118:
5111:
5105:
5102:
5070:
5058:
5054:
5042:
5034:
5025:
5021:
5019:
5009:
4996:
4989:
4984:
4973:
4966:
4961:
4950:
4943:
4938:
4927:
4920:
4915:
4902:
4893:
4884:
4875:
4859:
4850:
4841:
4832:
4813:
4806:
4799:
4792:
4773:
4769:
4764:
4763:
4746:
4742:
4734:
4721:
4714:
4707:
4702:
4691:
4684:
4677:
4672:
4661:
4654:
4647:
4642:
4631:
4624:
4617:
4612:
4601:
4594:
4587:
4582:
4571:
4564:
4557:
4552:
4541:
4534:
4527:
4522:
4511:
4504:
4497:
4492:
4479:
4470:
4461:
4452:
4436:
4427:
4418:
4409:
4393:
4384:
4375:
4366:
4350:
4341:
4332:
4323:
4310:
4306:
4290:
4283:
4276:
4269:
4258:
4254:
4250:
4245:
4244:
4231:
4007:parallelized
4000:
3936:
3925:
3921:
3914:
3895:is a simple
3891:
3881:
3875:
3871:
3720:
3710:
3706:
2746:
2731:
2641:
2457:
2441:
2263:
2259:
2255:
2251:
2113:
2109:
2105:
2101:
1957:
1953:
1949:
1945:
1923:
1919:
1915:
1905:
1901:
1891:
1887:
1875:dimensions.
1873:
1861:
1849:
1841:
1838:
1700:
1691:
1685:
1678:
1673:, using the
1668:
1243:
992:
979:
973:
960:
954:
946:
938:
910:
903:
890:
883:
882:sum ← sum +
872:
868:
864:
839:
835:
831:
812:
808:
804:
785:
781:
777:
773:
769:
765:
761:
754:
750:
746:
742:
738:
734:
730:
723:
719:
715:
711:
707:
703:
699:
690:in steps of
675:in steps of
660:in steps of
643:
636:
605:
594:
584:
574:
566:
543:
535:
527:
510:
502:
484:
468:
445:
438:
434:
419:
412:
390:
377:
370:
369:sum ← sum +
288:
196:
192:
178:
174:
164:
160:
154:
150:
143:
127:
115:
91:
84:
80:
71:
64:
39:
29:
6699:(3): 1–33.
6660:(3): 1–25.
6489:: 2230–40.
6204:26 November
5770:Numer. Math
4823:Otherwise:
4754:Deallocate
3933:AlphaTensor
2248:Otherwise,
2121:vertically:
1671:recursively
148:is that if
60:distributed
6909:Categories
6806:algorithms
6179:2212.01175
6089:2022-11-01
6025:2015-01-16
5953:(3): 251,
5924:2008.03759
5743:. p.
5609:27 January
5527:2021-04-01
5503:2010.05846
5474:2210.10173
5450:2307.07970
5373:References
4869:Partition
4826:Partition
4446:Partition
4403:Partition
4360:Partition
4317:Partition
4238:pseudocode
790:, that is:
686:from 1 to
671:from 1 to
656:from 1 to
493:bytes and
360:from 1 to
343:from 1 to
332:from 1 to
67:takes time
6833:CPU cache
6662:CiteSeerX
6628:0709.1272
6621:: 38–53.
6578:CiteSeerX
6554:1411.3273
6520:CiteSeerX
6448:1304.1467
6367:CiteSeerX
6327:CiteSeerX
6141:1476-4687
6051:SIAM News
5990:CiteSeerX
5790:121656251
5703:CiteSeerX
5695:SIAM News
5561:(2009) .
5287:MapReduce
5030:partition
4765:Procedure
4706:multiply(
4676:multiply(
4646:multiply(
4616:multiply(
4586:multiply(
4556:multiply(
4526:multiply(
4496:multiply(
4304:of shape
4249:multiply(
4246:Procedure
3937:In 2022,
3839:ω
3819:ω
3782:ω
3754:×
3731:ω
3652:
3639:⋅
3607:−
3601:⋅
3593:
3525:
3483:−
3475:
3384:
3371:⋅
3339:−
3333:⋅
3325:
3257:
3215:−
3207:
3132:−
3126:⋅
3118:
3040:−
3032:
2957:−
2951:⋅
2943:
2865:−
2857:
2803:Reference
2694:≈
2683:
2621:ω
2477:Θ
2098:Else, if
1805:Θ
1731:Θ
1248:for some
697:Multiply
228:∑
6862:Software
6826:Hardware
6785:Problems
6296:Archived
6159:36198780
5665:(1999).
5336:See also
5154:, where
4314:, then:
3939:DeepMind
3919:time if
3859:Williams
3845:2.371552
2998:Winograd
2823:Strassen
2734:> 100
2267:. Split
2117:, split
1961:, split
1930:unrolled
1899:of size
1885:of size
122:Williams
56:parallel
30:Because
6713:1242360
6684:9359223
6462:12 July
6453:Bibcode
6292:8410593
6150:9534758
6121:Bibcode
6012:1004789
5228:√
5209:√
5198:√
5171:√
5145:√
5133:√
5126:, is a
5049:speedup
977:√
967:
958:√
943:
930:Return
849:sum = 0
641:√
603:√
592:√
548:
532:
515:
499:
404:Return
352:sum = 0
186:, then
182:matrix
172:and an
168:matrix
158:for an
6884:LAPACK
6874:MATLAB
6711:
6682:
6664:
6643:
6580:
6522:
6417:
6369:
6329:
6290:
6157:
6147:
6139:
6113:Nature
6010:
5992:
5788:
5751:
5705:
5640:
5571:
5419:
5307:meshes
5041:Θ(log
5020:Here,
4797:, set
4274:, set
3719:. The
190:is an
6869:ATLAS
6709:S2CID
6680:S2CID
6641:S2CID
6623:arXiv
6549:arXiv
6443:arXiv
6439:(PDF)
6399:(PDF)
6299:(PDF)
6288:S2CID
6272:(PDF)
6229:(PDF)
6174:arXiv
6057:(9),
6047:(PDF)
6019:(PDF)
5978:(PDF)
5943:(PDF)
5919:arXiv
5786:S2CID
5672:(PDF)
5498:arXiv
5469:arXiv
5445:arXiv
5057:/log
4873:into
4830:into
4782:into
4778:adds
4450:into
4407:into
4364:into
4321:into
3857:, by
2708:2.807
2652:2 × 2
2444:tiled
1246:2 × 2
916:+ sum
857:from
824:from
797:from
759:into
586:tiled
530:>
476:cache
396:← sum
76:field
52:graph
6848:SIMD
6464:2014
6415:ISBN
6206:2022
6155:PMID
6137:ISSN
5749:ISBN
5638:ISBN
5611:2015
5569:ISBN
5417:ISBN
5090:and
5026:join
5022:fork
5010:Join
4988:add(
4985:Fork
4965:add(
4962:Fork
4942:add(
4939:Fork
4919:add(
4916:Fork
4768:add(
4741:add(
4735:Join
4703:Fork
4673:Fork
4643:Fork
4613:Fork
4583:Fork
4553:Fork
4523:Fork
4493:Fork
4001:The
3907:and
3865:and
3842:<
3438:2023
3170:2017
2995:1971
2820:1969
2800:Year
2738:BLAS
2262:) =
2250:max(
2112:) =
2100:max(
1956:) =
1944:max(
1914:max(
1846:and
901:Set
880:Set
863:min(
853:For
847:Let
830:min(
820:For
803:min(
793:For
728:and
682:For
667:For
652:For
639:= Θ(
627:Let
621:and
579:and
521:and
474:and
424:(in
415:time
388:Set
367:Set
356:For
350:Let
339:For
328:For
321:Let
315:and
297:and
144:The
58:and
46:and
6838:TLB
6701:doi
6672:doi
6645:955
6633:doi
6588:doi
6530:doi
6491:doi
6407:doi
6377:doi
6337:doi
6280:doi
6233:hdl
6145:PMC
6129:doi
6117:610
6064:= 2
6000:doi
5955:doi
5900:doi
5873:doi
5846:doi
5817:doi
5778:doi
5745:108
5713:doi
5676:hdl
5630:doi
5409:doi
5206:by
5142:by
5051:of
5039:of
4795:= 1
4272:= 1
3831:is
3643:log
3626:1.5
3584:log
3516:log
3502:1.5
3466:log
3375:log
3316:log
3248:log
3198:log
3109:log
3023:log
2934:log
2848:log
2674:log
1942:If
861:to
828:to
801:to
600:by
420:nmp
96:in
6911::
6707:.
6697:41
6695:.
6678:.
6670:.
6658:34
6656:.
6639:.
6631:.
6619:35
6617:.
6586:.
6574:45
6572:.
6528:.
6514:.
6487:29
6485:.
6481:.
6451:.
6441:.
6413:.
6401:.
6375:.
6363:39
6361:.
6349:^
6335:.
6323:64
6321:.
6307:^
6294:.
6286:.
6274:.
6214:^
6197:.
6153:.
6143:.
6135:.
6127:.
6115:.
6111:.
6098:^
6081:.
6055:38
6053:,
6049:,
6028:,
6008:MR
6006:,
5998:,
5986:18
5984:,
5980:,
5949:,
5945:,
5894:.
5867:.
5840:.
5811:.
5807:.
5784:.
5774:13
5772:.
5747:.
5711:,
5697:,
5652:^
5636:.
5601:.
5583:^
5557:;
5553:;
5549:;
5536:^
5520:.
5496:,
5482:^
5467:,
5443:,
5431:^
5415:.
5405:45
5380:^
5182:Ω(
5150:/3
5138:/3
5110:Ω(
5069:Θ(
5053:Θ(
5000:22
4995:,
4993:22
4977:21
4972:,
4970:21
4954:12
4949:,
4947:12
4931:11
4926:,
4924:11
4906:22
4900:,
4897:21
4891:,
4888:12
4882:,
4879:11
4863:22
4857:,
4854:21
4848:,
4845:12
4839:,
4836:11
4817:11
4812:+
4810:11
4805:←
4803:11
4772:,
4745:,
4725:22
4720:,
4718:22
4713:,
4711:22
4695:21
4690:,
4688:22
4683:,
4681:21
4665:22
4660:,
4658:12
4653:,
4651:12
4635:21
4630:,
4628:12
4623:,
4621:11
4605:12
4600:,
4598:21
4593:,
4591:22
4575:11
4570:,
4568:21
4563:,
4561:21
4545:12
4540:,
4538:11
4533:,
4531:12
4515:11
4510:,
4508:11
4503:,
4501:11
4483:22
4477:,
4474:21
4468:,
4465:12
4459:,
4456:11
4440:22
4434:,
4431:21
4425:,
4422:12
4416:,
4413:11
4397:22
4391:,
4388:21
4382:,
4379:12
4373:,
4370:11
4354:22
4348:,
4345:21
4339:,
4336:12
4330:,
4327:11
4309:×
4294:11
4289:×
4287:11
4282:←
4280:11
4263::
4257:,
4253:,
4240::
4207:22
4197:22
4184:12
4174:21
4162:21
4152:22
4139:11
4129:21
4115:22
4105:12
4092:12
4082:11
4070:21
4060:12
4047:11
4037:11
3929:.
3924:=
3922:AB
3913:Θ(
3903:,
3874:×
3610:12
3447:12
3342:12
3179:12
3135:15
3004:15
2960:18
2829:18
2258:,
2254:,
2108:,
2104:,
1952:,
1948:,
1922:,
1918:,
1904:×
1895:,
1890:×
1860:Θ(
1848:Θ(
1844:/2
1695:11
1689:11
1684:=
1682:11
1644:22
1634:22
1621:12
1611:21
1599:21
1589:22
1576:11
1566:21
1552:22
1542:12
1529:12
1519:11
1507:21
1497:12
1484:11
1474:11
1445:22
1433:21
1419:12
1407:11
1381:22
1369:21
1355:12
1343:11
1314:22
1302:21
1288:12
1276:11
1216:22
1204:21
1190:12
1178:11
1142:22
1130:21
1116:12
1104:11
1068:22
1056:21
1042:12
1030:11
941:Θ(
913:ij
909:←
906:ij
893:kj
889:×
886:ik
877::
871:,
867:+
844::
838:,
834:+
817::
811:,
807:+
776:,
745:,
714:,
694::
679::
664::
611::
565:Θ(
444:Θ(
437:×
418:Θ(
393:ij
380:kj
376:×
373:ik
364::
347::
336::
195:×
177:×
163:×
155:AB
153:=
126:O(
114:O(
90:Θ(
83:×
6808:)
6804:(
6749:e
6742:t
6735:v
6715:.
6703::
6686:.
6674::
6647:.
6635::
6625::
6594:.
6590::
6557:.
6551::
6536:.
6532::
6516:6
6499:.
6493::
6466:.
6455::
6445::
6423:.
6409::
6383:.
6379::
6343:.
6339::
6282::
6239:.
6235::
6208:.
6182:.
6176::
6161:.
6131::
6123::
6092:.
6062:ω
6002::
5957::
5951:9
5927:.
5921::
5906:.
5902::
5896:5
5879:.
5875::
5852:.
5848::
5825:.
5819::
5813:4
5792:.
5780::
5757:.
5715::
5699:4
5682:.
5678::
5646:.
5632::
5613:.
5577:.
5530:.
5500::
5471::
5447::
5425:.
5411::
5330:n
5323:n
5315:n
5313:×
5311:n
5281:p
5275:p
5270:)
5268:p
5266:/
5264:n
5262:(
5260:O
5250:)
5248:p
5246:/
5244:n
5242:(
5240:O
5235:)
5231:p
5226:/
5224:n
5222:(
5220:O
5212:p
5201:p
5193:p
5186:)
5184:n
5178:)
5174:M
5169:/
5167:n
5165:(
5163:O
5157:M
5148:M
5136:M
5114:)
5112:n
5092:B
5088:A
5084:C
5073:)
5071:n
5065:T
5061:)
5059:n
5055:n
5045:)
5043:n
5012:.
5004:.
5002:)
4997:T
4990:C
4981:.
4979:)
4974:T
4967:C
4958:.
4956:)
4951:T
4944:C
4935:.
4933:)
4928:T
4921:C
4909:.
4903:T
4894:T
4885:T
4876:T
4871:T
4866:.
4860:C
4851:C
4842:C
4833:C
4828:C
4814:t
4807:c
4800:c
4793:n
4784:C
4780:T
4776:)
4774:T
4770:C
4758:.
4756:T
4751:.
4749:)
4747:T
4743:C
4729:.
4727:)
4722:B
4715:A
4708:T
4699:.
4697:)
4692:B
4685:A
4678:T
4669:.
4667:)
4662:B
4655:A
4648:T
4639:.
4637:)
4632:B
4625:A
4618:T
4609:.
4607:)
4602:B
4595:A
4588:C
4579:.
4577:)
4572:B
4565:A
4558:C
4549:.
4547:)
4542:B
4535:A
4528:C
4519:.
4517:)
4512:B
4505:A
4498:C
4486:.
4480:T
4471:T
4462:T
4453:T
4448:T
4443:.
4437:C
4428:C
4419:C
4410:C
4405:C
4400:.
4394:B
4385:B
4376:B
4367:B
4362:B
4357:.
4351:A
4342:A
4333:A
4324:A
4319:A
4311:n
4307:n
4302:T
4291:b
4284:a
4277:c
4270:n
4261:)
4259:B
4255:A
4251:C
4215:)
4203:B
4193:A
4189:+
4180:B
4170:A
4158:B
4148:A
4144:+
4135:B
4125:A
4111:B
4101:A
4097:+
4088:B
4078:A
4066:B
4056:A
4052:+
4043:B
4033:A
4026:(
3970:Z
3966:2
3962:/
3957:Z
3926:C
3917:)
3915:n
3909:C
3905:B
3901:A
3882:k
3876:k
3872:k
3797:)
3794:1
3791:(
3788:o
3785:+
3778:n
3757:n
3751:n
3690:M
3687:5
3684:+
3680:)
3674:M
3669:n
3664:2
3656:(
3647:2
3634:2
3630:n
3623:+
3618:2
3614:n
3604:M
3596:7
3588:2
3578:)
3572:M
3567:n
3562:3
3554:(
3549:4
3528:n
3520:2
3510:2
3506:n
3499:+
3494:2
3490:n
3486:4
3478:7
3470:2
3461:n
3457:5
3444:7
3422:M
3419:5
3416:+
3412:)
3406:M
3401:n
3396:2
3388:(
3379:2
3366:2
3362:n
3358:3
3355:+
3350:2
3346:n
3336:M
3328:7
3320:2
3310:)
3304:M
3299:n
3294:3
3286:(
3281:4
3260:n
3252:2
3242:2
3238:n
3234:3
3231:+
3226:2
3222:n
3218:4
3210:7
3202:2
3193:n
3189:5
3176:7
3154:M
3151:3
3148:+
3143:2
3139:n
3129:M
3121:7
3113:2
3103:)
3097:M
3092:n
3087:3
3079:(
3074:5
3051:2
3047:n
3043:5
3035:7
3027:2
3018:n
3014:6
3001:7
2979:M
2976:3
2973:+
2968:2
2964:n
2954:M
2946:7
2938:2
2928:)
2922:M
2917:n
2912:3
2904:(
2899:6
2876:2
2872:n
2868:6
2860:7
2852:2
2843:n
2839:7
2826:7
2779:M
2759:n
2732:n
2713:)
2704:n
2700:(
2697:O
2691:)
2686:7
2678:2
2669:n
2665:(
2662:O
2638:.
2626:)
2617:n
2613:(
2610:O
2600:ω
2567:)
2558:M
2553:b
2548:p
2545:n
2542:m
2536:+
2531:b
2527:p
2524:m
2521:+
2518:p
2515:n
2512:+
2509:n
2506:m
2500:+
2497:p
2494:+
2491:n
2488:+
2485:m
2481:(
2464:b
2460:M
2416:2
2412:B
2406:2
2402:A
2398:+
2393:1
2389:B
2383:1
2379:A
2375:=
2370:)
2362:2
2358:B
2348:1
2344:B
2337:(
2330:)
2322:2
2318:A
2310:1
2306:A
2299:(
2294:=
2291:C
2273:B
2269:A
2264:m
2260:p
2256:m
2252:n
2226:)
2218:2
2214:B
2210:A
2203:1
2199:B
2195:A
2189:(
2184:=
2179:)
2171:2
2167:B
2159:1
2155:B
2148:(
2143:A
2140:=
2137:C
2119:B
2114:p
2110:p
2106:m
2102:n
2076:)
2070:B
2065:2
2061:A
2053:B
2048:1
2044:A
2037:(
2032:=
2028:B
2022:)
2014:2
2010:A
2000:1
1996:A
1989:(
1984:=
1981:C
1963:A
1958:n
1954:p
1950:m
1946:n
1926:)
1924:p
1920:m
1916:n
1909:.
1906:p
1902:m
1897:B
1892:m
1888:n
1883:A
1864:)
1862:n
1852:)
1850:n
1842:n
1824:,
1821:)
1816:2
1812:n
1808:(
1802:+
1799:)
1796:2
1792:/
1788:n
1785:(
1782:T
1779:8
1776:=
1773:)
1770:n
1767:(
1764:T
1743:;
1740:)
1737:1
1734:(
1728:=
1725:)
1722:1
1719:(
1716:T
1703:n
1692:b
1686:a
1679:c
1652:)
1640:B
1630:A
1626:+
1617:B
1607:A
1595:B
1585:A
1581:+
1572:B
1562:A
1548:B
1538:A
1534:+
1525:B
1515:A
1503:B
1493:A
1489:+
1480:B
1470:A
1463:(
1458:=
1453:)
1441:B
1429:B
1415:B
1403:B
1396:(
1389:)
1377:A
1365:A
1351:A
1339:A
1332:(
1327:=
1322:)
1310:C
1298:C
1284:C
1272:C
1265:(
1250:n
1229:,
1224:)
1212:B
1200:B
1186:B
1174:B
1167:(
1162:=
1159:B
1155:,
1150:)
1138:A
1126:A
1112:A
1100:A
1093:(
1088:=
1085:A
1081:,
1076:)
1064:C
1052:C
1038:C
1026:C
1019:(
1014:=
1011:C
980:M
974:b
969:)
961:M
955:b
951:/
947:n
932:C
911:C
904:C
891:B
884:A
875:)
873:m
869:T
865:K
859:K
855:k
842:)
840:p
836:T
832:J
826:J
822:j
815:)
813:n
809:T
805:I
799:I
795:i
786:T
784:+
782:J
780::
778:J
774:T
772:+
770:I
768::
766:I
762:C
755:T
753:+
751:J
749::
747:J
743:T
741:+
739:K
737::
735:K
731:B
724:T
722:+
720:K
718::
716:K
712:T
710:+
708:I
706::
704:I
700:A
692:T
688:m
684:K
677:T
673:p
669:J
662:T
658:n
654:I
648:)
644:M
637:T
629:C
623:B
619:A
606:M
595:M
581:B
577:A
569:)
567:n
561:B
557:B
553:A
544:b
540:/
536:M
528:n
523:B
519:A
511:b
507:/
503:M
495:b
491:M
448:)
446:n
439:n
435:n
422:)
406:C
391:C
378:B
371:A
362:m
358:k
345:p
341:j
334:n
330:i
323:C
317:B
313:A
303:p
299:j
295:n
291:i
274:.
269:j
266:k
262:b
256:k
253:i
249:a
243:m
238:1
235:=
232:k
224:=
219:j
216:i
212:c
197:p
193:n
188:C
184:B
179:p
175:m
170:A
165:m
161:n
151:C
130:)
128:n
118:)
116:n
94:)
92:n
85:n
81:n
72:n
20:)
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.