Knowledge (XXG)

Matrix multiplication algorithm

Source 📝

3988:, even in commutative field. On 4x4 matrices, AlphaTensor unexpectedly discovered a solution with 47 multiplication steps, an improvement over the 49 required with Strassen’s algorithm of 1969, albeit restricted to mod 2 arithmetic. Similarly, AlphaTensor solved 5x5 matrices with 96 rather than Strassen's 98 steps. Based on the surprising discovery that such improvements exist, other researchers were quickly able to find a similar independent 4x4 algorithm, and separately tweaked Deepmind's 96-step 5x5 algorithm down to 95 steps in mod 2 arithmetic and to 97 in normal arithmetic. Some algorithms were never discovered before, e.g. (4, 5, 5) got improved to 76 from 80 in normal and mod 2 arithmetics. 5298: 1664: 5079: 459: 2595: 1258: 3708:
are represented in the same basis. Karstadt and Schwartz computed in different bases and traded 3 additions for less expensive basis transformations. They also proved that one cannot go below 12 additions per step using different bases. In subsequent work Beniamini et el. applied this base change trick to more general decompositions than 2x2-block matrices and improved the leading constant for their run times.
1239: 1659:{\displaystyle {\begin{pmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\\\end{pmatrix}}={\begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\\\end{pmatrix}}{\begin{pmatrix}B_{11}&B_{12}\\B_{21}&B_{22}\\\end{pmatrix}}={\begin{pmatrix}A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\\\end{pmatrix}}} 3700: 3432: 4227: 1006: 3707:
It is known that a Strassen-like algorithm with a 2x2-block matrix step requires at least 7 block matrix multiplications. In 1976 Probert showed that such an algorithm requires at least 15 additions (including subtractions), however, a hidden assumption was that the blocks and the 2x2-block matrix
5103:
On modern architectures with hierarchical memory, the cost of loading and storing input matrix elements tends to dominate the cost of arithmetic. On a single machine this is the amount of data transferred between RAM and cache, while on a distributed memory multi-node machine it is the amount
1874:
A variant of this algorithm that works for matrices of arbitrary shapes and is faster in practice splits matrices in two instead of four submatrices, as follows. Splitting a matrix now means dividing it into two parts of equal size, or as close to equal sizes as possible in the case of odd
5284:
more memory than is needed to store the inputs. This algorithm can be combined with Strassen to further reduce runtime. "2.5D" algorithms provide a continuous tradeoff between memory usage and communication bandwidth. On modern distributed computing environments such as
2428: 6106:
Fawzi, Alhussein; Balog, Matej; Huang, Aja; Hubert, Thomas; Romera-Paredes, Bernardino; Barekatain, Mohammadamin; Novikov, Alexander; R. Ruiz, Francisco J.; Schrittwieser, Julian; Swirszcz, Grzegorz; Silver, David; Hassabis, Demis; Kohli, Pushmeet (October 2022).
4232:
can be performed independently of each other, as can the four summations (although the algorithm needs to "join" the multiplications before doing the summations). Exploiting the full parallelism of the problem, one obtains an algorithm that can be expressed in
3984:(mod 2 arithmetic). The best "practical" (explicit low-rank decomposition of a matrix multiplication tensor) algorithm found ran in O(n). Finding low-rank decompositions of such tensors (and beyond) is NP-hard; optimal multiplication even for 3x3 matrices 2238: 3945:
that used a single-player game analogy to invent thousands of matrix multiplication algorithms, including some previously discovered by humans and some that were not. Operations were restricted to the non-commutative ground field (normal arithmetic) and
2088: 2454:
environment where cache sizes are effectively dynamic due to other processes taking up cache space. (The simple iterative algorithm is cache-oblivious as well, but much slower in practice if the matrix layout is not adapted to the algorithm.)
469:
The three loops in iterative matrix multiplication can be arbitrarily swapped with each other without an effect on correctness or asymptotic running time. However, the order can have a considerable impact on practical performance due to the
3164: 2989: 3544: 2578: 1234:{\displaystyle C={\begin{pmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\\\end{pmatrix}},\,A={\begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\\\end{pmatrix}},\,B={\begin{pmatrix}B_{11}&B_{12}\\B_{21}&B_{22}\\\end{pmatrix}},} 3276: 4019: 5256:
which arranges the processors in a 3D cube mesh, assigning every product of two input submatrices to a single processor. The result submatrices are then generated by performing a reduction over each row. This algorithm transmits
5332:-1 steps are needed. The performance improves further for repeated computations leading to 100% efficiency. The cross-wired mesh array may be seen as a special case of a non-planar (i.e. multilayered) processing structure. 5598: 2654:-matrices which require only 7 multiplications (instead of the usual 8), at the expense of several additional addition and subtraction operations. Applying this recursively gives an algorithm with a multiplicative cost of 5325:-2 steps although this is reduced to half this number for repeated computations. The standard array is inefficient because the data from the two matrices does not arrive simultaneously and it must be padded with zeroes. 571:
cache misses in the worst case. As of 2010, the speed of memories compared to that of processors is such that the cache misses, rather than the actual calculations, dominate the running time for sizable matrices.
6066:—the wreath product approach is unlikely to be applicable to the large matrix problems that arise in practice. the input matrices must be astronomically large for the difference in time to be apparent. 3538: 3270: 2286: 2723: 284: 1834: 5160:
is the size of fast memory. The naĂŻve algorithm is then used over the block matrices, computing products of submatrices entirely in fast memory. This reduces communication bandwidth to
3981: 2751:
has its merits. A table that compares key aspects of the improved version based on recursive multiplication of 2x2-block matrices via 7 block matrix multiplications follows. As usual
2132: 3063: 2888: 3855: 1976: 3809: 1753: 5341: 3861:, Xu, Xu, and Zhou. This algorithm, like all other recent algorithms in this line of research, is a generalization of the Coppersmith–Winograd algorithm, which was given by 2636: 6747: 5346: 3985: 2589: 105: 3767: 5975:"Worst-case complexity bounds on algorithms for computing the canonical structure of finite abelian groups and the Hermite and Smith normal forms of an integer matrix" 3829: 3741: 5974: 1669:
which consists of eight multiplications of pairs of submatrices, followed by an addition step. The divide-and-conquer algorithm computes the smaller multiplications
3695:{\displaystyle 4\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-12n^{2}+1.5n^{2}\cdot \log _{2}\left({\frac {{\sqrt {2}}n}{\sqrt {M}}}\right)+5M} 3069: 2894: 3427:{\displaystyle 4\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-12n^{2}+3n^{2}\cdot \log _{2}\left({\frac {{\sqrt {2}}n}{\sqrt {M}}}\right)+5M} 2472: 4222:{\displaystyle {\begin{pmatrix}A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\\\end{pmatrix}}} 2789: 2769: 6868: 5917:
Beniamini, Gal; Cheng, Nathan; Holtz, Olga; Karstadt, Elaye; Schwartz, Oded (2020). "Sparsifying the Operators of Fast Matrix Multiplication Algorithms".
6357:
Agarwal, R.C.; Balle, S. M.; Gustavson, F. G.; Joshi, M.; Palkar, P. (September 1995). "A three-dimensional approach to parallel matrix multiplication".
6030:
The Coppersmith–Winograd algorithm is not practical, due to the very large hidden constant in the upper bound on the number of multiplications required.
6924: 6893: 985:
amounts to several orders of magnitude on modern machines, so that the actual calculations dominate the running time, rather than the cache misses.
5361: 2747:
Since Strassen's algorithm is actually used in practical numerical software and computer algebra systems improving on the constants hidden in the
1855: 6888: 6740: 6418: 5752: 5641: 5420: 6317:
Irony, Dror; Toledo, Sivan; Tiskin, Alexander (September 2004). "Communication lower bounds for distributed-memory matrix multiplication".
5063:
on any real computer. The algorithm isn't practical due to the communication cost inherent in moving data to and from the temporary matrix
5217:
2D mesh, one submatrix of the result can be assigned to each processor, and the product can be computed with each processor transmitting
5297: 5047:
steps, meaning it takes that much time on an ideal machine with an infinite number of processors; therefore, it has a maximum possible
6914: 6847: 6733: 5572: 5517: 6295: 6878: 6194: 5127: 2737: 3452: 2423:{\displaystyle C={\begin{pmatrix}A_{1}&A_{2}\end{pmatrix}}{\begin{pmatrix}B_{1}\\B_{2}\end{pmatrix}}=A_{1}B_{1}+A_{2}B_{2}} 3184: 3869:
in 1990. The conceptual idea of these algorithms are similar to Strassen's algorithm: a way is devised for multiplying two
2657: 6015: 3889:
is so large that these algorithms are only worthwhile for matrices that are too large to handle on present-day computers.
3858: 121: 206: 6837: 4010: 3712: 994: 5356: 479: 463: 6691:
Van Zee, Field G.; van de Geijn, Robert A. (2015). "BLIS: A Framework for Rapidly Instantiating BLAS Functionality".
5491: 6791: 5740: 5562: 5272:
words per processor, which is asymptotically optimal. However, this requires replicating each input matrix element
1759: 54:. Many different algorithms have been designed for multiplying matrices on different types of hardware, including 6842: 2642:
Algorithms exist that provide better running times than the straightforward ones. The first to be discovered was
2447: 3951: 2233:{\displaystyle C=A{\begin{pmatrix}B_{1}&B_{2}\end{pmatrix}}={\begin{pmatrix}AB_{1}&AB_{2}\end{pmatrix}}} 6756: 5490:
Alman, Josh; Williams, Virginia Vassilevska (2020), "A Refined Laser Method and Faster Matrix Multiplication",
6078: 42:
efficient. Applications of matrix multiplication in computational problems are found in many fields including
5628:. ASPLOS91: 4th Int'l Conference on Architecture Support for Programming Languages & Operating Systems. 5366: 3892: 3885:
multiplications, and this technique is applied recursively. However, the constant coefficient hidden by the
3009: 2834: 2083:{\displaystyle C={\begin{pmatrix}A_{1}\\A_{2}\end{pmatrix}}{B}={\begin{pmatrix}A_{1}B\\A_{2}B\end{pmatrix}}} 6919: 6661: 6577: 6519: 6478: 6366: 6326: 5989: 5702: 429: 66: 3834: 6801: 5024:
is a keyword that signal a computation may be run in parallel with the rest of the function call, while
3896: 1674: 471: 145: 59: 31: 5318: 5119: 2650:
in 1969 and often referred to as "fast matrix multiplication". It is based on a way of multiplying two
6652:
Goto, Kazushige; van de Geijn, Robert A. (2008). "Anatomy of high-performance matrix multiplication".
3772: 1711: 6796: 6452: 6120: 5550: 5082:
Block matrix multiplication. In the 2D algorithm, each processor is responsible for one submatrix of
5036: 51: 43: 6666: 5994: 5707: 4234: 2605: 6775: 6582: 6524: 6371: 6331: 5744: 2726: 2450:: there is no tuning parameter required to get optimal cache performance, and it behaves well in a 425: 75: 62:
systems, where the computational work is spread over multiple processors (perhaps over a network).
47: 35: 5404: 6708: 6679: 6640: 6622: 6548: 6442: 6287: 6173: 5918: 5785: 5497: 5468: 5444: 4006: 2643: 133: 101: 55: 100:). Better asymptotic bounds on the time required to multiply matrices have been known since the 6395: 3746: 65:
Directly applying the mathematical definition of matrix multiplication gives an algorithm that
6414: 6223: 6154: 6136: 5939: 5748: 5734: 5637: 5568: 5416: 5078: 3159:{\displaystyle 5\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-15n^{2}+3M} 2984:{\displaystyle 6\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-18n^{2}+3M} 17: 5130:
that partitions each input matrix into a block matrix whose elements are submatrices of size
3814: 3726: 1244:
which works for all square matrices whose dimensions are powers of two, i.e., the shapes are
478:
use of the algorithm; which order is best also depends on whether the matrices are stored in
6811: 6700: 6671: 6632: 6587: 6529: 6490: 6406: 6376: 6336: 6279: 6232: 6144: 6128: 5999: 5954: 5899: 5872: 5845: 5816: 5777: 5712: 5675: 5629: 5546: 5408: 5396: 2451: 458: 6613:(2009). "A class of parallel tiled linear algebra algorithms for multicore architectures". 6396:"Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms" 6011: 2573:{\displaystyle \Theta \left(m+n+p+{\frac {mn+np+mp}{b}}+{\frac {mnp}{b{\sqrt {M}}}}\right)} 6852: 6007: 5306: 5301:
Matrix multiplication completed in 2n-1 steps for two n×n matrices on a cross-wired mesh.
3866: 3862: 3716: 2647: 2594: 414: 109: 5397: 6456: 6268: 6172:
Kauers, Manuel; Moosbauer, Jakob (2022-12-02). "Flip Graphs for Matrix Multiplication".
6124: 6770: 6149: 6108: 5730: 5558: 3942: 3886: 2774: 2754: 2748: 1929: 97: 5959: 6908: 6816: 6610: 6591: 6547:
Kak, S. (2014). "Efficiency of matrix multiplication on the cross-wired mesh array".
6533: 5821: 5804: 5789: 5666: 5662: 5392: 5351: 6276:
Proceedings of the thirteenth annual ACM symposium on Theory of computing - STOC '81
5842:
Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures
6712: 6683: 6291: 6059:
Even if someone manages to prove one of the conjectures—thereby demonstrating that
2741: 998: 5890:
Probert, Robert L. (1976). "On the additive complexity of matrix multiplication".
5237:
words, which is asymptotically optimal assuming that each node stores the minimum
6410: 6109:"Discovering faster matrix multiplication algorithms with reinforcement learning" 5837: 4013:. These are based on the fact that the eight recursive matrix multiplications in 6719: 6636: 6495: 6435: 5865:"Pebbling Game and Alternative Basis for High Performance Matrix Multiplication" 5412: 2443: 2442:
The cache miss rate of recursive matrix multiplication is the same as that of a
585: 6644: 6340: 6132: 5554: 5439:
Williams, Virginia Vassilevska; Xu, Yinzhan; Xu, Zixuan; Zhou, Renfei (2024),
4237: 289:
From this, a simple algorithm can be constructed which loops over the indices
6140: 1854:
to sum the four pairs of resulting matrices element-wise. Application of the
6832: 6675: 6251: 6043: 5849: 5286: 2729:
is reduced compared to the naĂŻve algorithm, but it is faster in cases where
1670: 486: 475: 6725: 6158: 551:, every iteration of the inner loop (a simultaneous sweep through a row of 6283: 5633: 589:
version, where the matrix is implicitly divided into square tiles of size
108:) remains unknown. As of April 2024, the best announced bound on the 5693:
Miller, Webb (1975), "Computational complexity and numerical stability",
5328:
The result is even faster on a two-layered cross-wired mesh, where only 2
3938: 2458:
The number of cache misses incurred by this algorithm, on a machine with
2446:
iterative version, but unlike that algorithm, the recursive algorithm is
50:
and in seemingly unrelated problems such as counting the paths through a
6380: 6236: 5679: 6479:"A faster parallel algorithm for matrix multiplication on a mesh array" 6403:
Proceedings of the 17th International Conference on Parallel Processing
5876: 5781: 5048: 6883: 6873: 6510:
Kak, S (1988). "A two-layered mesh array for matrix multiplication".
5864: 2795:
Progress for Strassen-like recursive 2x2-block matrix multiplication
6704: 6003: 5903: 5716: 2602:
over time for the computational complexity of matrix multiplication
6178: 5923: 5502: 5473: 5449: 136:
because of the large constants and cannot be realized practically.
6627: 6553: 6447: 5077: 3947: 2740:. It is very useful for large matrices over exact domains such as 2593: 457: 5493:
32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2021)
6231:(Ph.D.). Massachusetts Institute of Technology. pp. 54–57. 5768:
Strassen, Volker (1969). "Gaussian Elimination is not Optimal".
132:
time, given by Alman and Williams. However, this algorithm is a
6729: 5624:
Lam, Monica S.; Rothberg, Edward E.; Wolf, Michael E. (1991).
5599:"6.172 Performance Engineering of Software Systems, Lecture 8" 5626:
The Cache Performance and Optimizations of Blocked Algorithms
5289:, specialized multiplication algorithms have been developed. 5180:, which is asymptotically optimal (for algorithms performing 1839:
accounting for the eight recursive calls on matrices of size
432:
is to assume that the inputs are all square matrices of size
6253:
A cellular computer to implement the Kalman Filter Algorithm
5028:
waits for all previously "forked" computations to complete.
5567:(3rd ed.). MIT Press and McGraw-Hill. pp. 75–79. 5104:
transferred between nodes; in either case it is called the
3715:
how well Strassen's algorithm can be improved in terms of
5441:
New Bounds for Matrix Multiplication: from Alpha to Omega
3533:{\displaystyle 5n^{\log _{2}7}-4n^{2}+1.5n^{2}\log _{2}n} 939:
In the idealized cache model, this algorithm incurs only
5305:
There are a variety of algorithms for multiplication on
6195:"AI Reveals New Possibilities in Matrix Multiplication" 6044:"Toward an Optimal Algorithm for Matrix Multiplication" 3265:{\displaystyle 5n^{\log _{2}7}-4n^{2}+3n^{2}\log _{2}n} 5086:. In the 3D algorithm, every pair of submatrices from 4028: 2339: 2301: 2191: 2150: 2039: 1991: 1465: 1398: 1334: 1267: 1169: 1095: 1021: 6436:"Dimension Independent Matrix Square Using MapReduce" 4022: 3954: 3837: 3817: 3775: 3769:
matrix over a field can be multiplied together using
3749: 3729: 3547: 3455: 3279: 3187: 3072: 3012: 2897: 2837: 2777: 2757: 2718:{\displaystyle O(n^{\log _{2}7})\approx O(n^{2.807})} 2660: 2608: 2475: 2289: 2135: 1979: 1762: 1714: 1261: 1009: 517:
cache lines), the above algorithm is sub-optimal for
209: 5518:"Matrix Multiplication Inches Closer to Mythic Goal" 5108:. The naĂŻve algorithm using three nested loops uses 6861: 6825: 6784: 6763: 5940:"Matrix multiplication via arithmetic progressions" 5465:
Faster Matrix Multiplication via Asymmetric Hashing
5342:
Computational complexity of mathematical operations
575:The optimal variant of the iterative algorithm for 559:) incurs a cache miss when accessing an element of 279:{\displaystyle c_{ij}=\sum _{k=1}^{m}a_{ik}b_{kj}.} 124:, Xu, Xu, and Zhou. This improves on the bound of 5736:Numerical Recipes: The Art of Scientific Computing 4221: 3975: 3849: 3823: 3803: 3761: 3735: 3694: 3532: 3426: 3264: 3158: 3057: 2983: 2882: 2783: 2763: 2717: 2630: 2572: 2422: 2232: 2082: 1828: 1747: 1701:The complexity of this algorithm as a function of 1658: 1233: 278: 6609:Buttari, Alfredo; Langou, Julien; Kurzak, Jakub; 5347:Computational complexity of matrix multiplication 5099:Communication-avoiding and distributed algorithms 4002: 2590:Computational complexity of matrix multiplication 1856:master theorem for divide-and-conquer recurrences 993:An alternative to the iterative algorithm is the 106:computational complexity of matrix multiplication 104:in the 1960s, but the optimal time (that is, the 5317:on a standard two-dimensional mesh using the 2D 5094:that is multiplied is assigned to one processor. 5032:achieves its goal by pointer manipulation only. 2736:or so and appears in several libraries, such as 2725:. Strassen's algorithm is more complex, and the 6568:Kak, S (1988). "Multilayered array computing". 6079:"Discovering novel algorithms with AlphaTensor" 5597:Amarasinghe, Saman; Leiserson, Charles (2010). 6101: 6099: 997:for matrix multiplication. This relies on the 428:). A common simplification for the purpose of 6741: 6477:Bae, S.E.; Shinn, T.-W.; Takaoka, T. (2014). 6434:Bosagh Zadeh, Reza; Carlsson, Gunnar (2013). 5836:Karstadt, Elaye; Schwartz, Oded (July 2017). 5463:Duan, Ran; Wu, Hongxun; Zhou, Renfei (2022), 2744:, where numerical stability is not an issue. 8: 3811:field operations. The current best bound on 3743:, is the smallest real number for which any 1829:{\displaystyle T(n)=8T(n/2)+\Theta (n^{2}),} 450:, i.e., cubic in the size of the dimension. 5938:Coppersmith, Don; Winograd, Shmuel (1990), 5075:speedup, without using a temporary matrix. 4300:Otherwise, allocate space for a new matrix 305:, computing the above using a nested loop: 6748: 6734: 6726: 6352: 6350: 6269:"I/O complexity: The red-blue pebble game" 5321:, one can complete the multiplication in 3 3976:{\displaystyle \mathbb {Z} /2\mathbb {Z} } 1858:shows this recursion to have the solution 485:In particular, in the idealized case of a 6693:ACM Transactions on Mathematical Software 6665: 6654:ACM Transactions on Mathematical Software 6626: 6581: 6552: 6523: 6494: 6446: 6370: 6330: 6177: 6148: 5993: 5958: 5922: 5820: 5706: 5501: 5472: 5448: 4205: 4195: 4182: 4172: 4160: 4150: 4137: 4127: 4113: 4103: 4090: 4080: 4068: 4058: 4045: 4035: 4023: 4021: 3969: 3968: 3960: 3956: 3955: 3953: 3836: 3816: 3780: 3774: 3748: 3728: 3661: 3658: 3645: 3632: 3616: 3586: 3581: 3559: 3556: 3546: 3518: 3508: 3492: 3468: 3463: 3454: 3393: 3390: 3377: 3364: 3348: 3318: 3313: 3291: 3288: 3278: 3250: 3240: 3224: 3200: 3195: 3186: 3141: 3111: 3106: 3084: 3081: 3071: 3049: 3025: 3020: 3011: 2966: 2936: 2931: 2909: 2906: 2896: 2874: 2850: 2845: 2836: 2776: 2756: 2706: 2676: 2671: 2659: 2619: 2607: 2555: 2538: 2502: 2474: 2414: 2404: 2391: 2381: 2360: 2346: 2334: 2320: 2308: 2296: 2288: 2216: 2201: 2186: 2169: 2157: 2145: 2134: 2063: 2046: 2034: 2026: 2012: 1998: 1986: 1978: 1814: 1790: 1761: 1713: 1642: 1632: 1619: 1609: 1597: 1587: 1574: 1564: 1550: 1540: 1527: 1517: 1505: 1495: 1482: 1472: 1460: 1443: 1431: 1417: 1405: 1393: 1379: 1367: 1353: 1341: 1329: 1312: 1300: 1286: 1274: 1262: 1260: 1214: 1202: 1188: 1176: 1164: 1157: 1140: 1128: 1114: 1102: 1090: 1083: 1066: 1054: 1040: 1028: 1016: 1008: 264: 251: 241: 230: 214: 208: 6394:Solomonik, Edgar; Demmel, James (2011). 6312: 6310: 6308: 5838:"Matrix Multiplication, a Little Faster" 5657: 5655: 5653: 5434: 5432: 5387: 5385: 5383: 5381: 5296: 5067:, but a more practical variant achieves 2793: 112:of a matrix multiplication algorithm is 38:, much work has been invested in making 6879:Basic Linear Algebra Subprograms (BLAS) 6301:from the original on December 15, 2019. 6225:Cilk: Efficient Multithreaded Computing 6217: 6215: 5729:Press, William H.; Flannery, Brian P.; 5605:. Massachusetts Institute of Technology 5377: 4820:(or do a short loop, perhaps unrolled). 2771:gives the dimensions of the matrix and 1866:, the same as the iterative algorithm. 631:be a new matrix of the appropriate size 563:. This means that the algorithm incurs 325:be a new matrix of the appropriate size 6405:. Vol. Part II. pp. 90–109. 5541: 5539: 5537: 5485: 5483: 5252:elements. This can be improved by the 4737:(wait for parallel forks to complete). 3058:{\displaystyle 6n^{\log _{2}7}-5n^{2}} 2883:{\displaystyle 7n^{\log _{2}7}-6n^{2}} 5592: 5590: 5588: 5586: 5584: 2598:Improvement of estimates of exponent 7: 6250:Cannon, Lynn Elliot (14 July 1969). 5869:SIAM Journal on Scientific Computing 5863:Schwartz, Oded; Vaknin, Noa (2023). 442:, in which case the running time is 34:is such a central operation in many 5809:Linear Algebra and Its Applications 5805:"On multiplication of 2×2 matrices" 5362:Sparse matrix–vector multiplication 5352:CYK algorithm § Valiant's algorithm 5278:times, and so requires a factor of 4297:(or multiply a small block matrix). 3992:Parallel and distributed algorithms 3850:{\displaystyle \omega <2.371552} 2462:lines of ideal cache, each of size 1932:version of the iterative algorithm. 480:row-major order, column-major order 146:definition of matrix multiplication 6256:(Ph.D.). Montana State University. 6193:Brubaker, Ben (23 November 2022). 2476: 1804: 1730: 25: 6267:Hong, J. W.; Kung, H. T. (1981). 5733:; Vetterling, William T. (2007). 5516:Hartnett, Kevin (23 March 2021). 5395:(2012). "Sorting and Searching". 6925:Matrix multiplication algorithms 6042:Robinson, Sara (November 2005), 5128:communication-avoiding algorithm 3804:{\displaystyle n^{\omega +o(1)}} 2806:#matrix multiplications per step 1928:is below some threshold, use an 1748:{\displaystyle T(1)=\Theta (1);} 525:stored in row-major order. When 40:matrix multiplication algorithms 5947:Journal of Symbolic Computation 5973:Iliopoulos, Costas S. (1989), 5844:. SPAA '17. pp. 101–110. 5191:In a distributed setting with 3796: 3790: 3721:matrix multiplication exponent 2712: 2699: 2690: 2664: 2631:{\displaystyle O(n^{\omega })} 2625: 2612: 1820: 1807: 1798: 1784: 1772: 1766: 1739: 1733: 1724: 1718: 27:Algorithm to multiply matrices 18:Coppersmith–Winograd algorithm 1: 5960:10.1016/S0747-7171(08)80013-2 4011:shared-memory multiprocessors 6592:10.1016/0020-0255(88)90010-2 6534:10.1016/0167-8191(88)90078-6 6411:10.1007/978-3-642-23397-5_10 5822:10.1016/0024-3795(71)90009-7 5309:. For multiplication of two 4003:divide-and-conquer algorithm 3713:theoretical computer science 2791:designates the memory size. 1252:. The matrix product is now 995:divide-and-conquer algorithm 989:Divide-and-conquer algorithm 6637:10.1016/j.parco.2008.10.002 6496:10.1016/j.procs.2014.05.208 6319:J. Parallel Distrib. Comput 5413:10.1007/978-1-84800-070-4_4 5399:The Algorithm Design Manual 5357:Matrix chain multiplication 2812:total arithmetic operations 1705:is given by the recurrence 497:bytes per cache line (i.e. 464:row- and column-major order 78:operations to multiply two 6941: 6792:System of linear equations 6341:10.1016/j.jpdc.2004.03.021 6222:Randall, Keith H. (1998). 6133:10.1038/s41586-022-05172-4 5741:Cambridge University Press 5668:Cache-Oblivious Algorithms 5564:Introduction to Algorithms 3941:introduced AlphaTensor, a 3879:-matrices with fewer than 3711:It is an open question in 2809:#matrix additions per step 2587: 971:cache misses; the divisor 88:matrices over that field ( 6843:Cache-oblivious algorithm 6483:Procedia Computer Science 5982:SIAM Journal on Computing 5803:Winograd, Shmuel (1971). 5195:processors arranged in a 5116:communication bandwidth. 3997:Shared-memory parallelism 3762:{\displaystyle n\times n} 583:in row-major layout is a 6915:Numerical linear algebra 6894:General purpose software 6757:Numerical linear algebra 4005:sketched earlier can be 6676:10.1145/1356052.1356053 5850:10.1145/3087556.3087579 5367:Method of Four Russians 5106:communication bandwidth 3824:{\displaystyle \omega } 3736:{\displaystyle \omega } 487:fully associative cache 5871:. pp. C277–C303. 5302: 5095: 4223: 3977: 3851: 3825: 3805: 3763: 3737: 3696: 3534: 3428: 3266: 3160: 3059: 2985: 2884: 2785: 2765: 2719: 2639: 2632: 2574: 2424: 2234: 2084: 1830: 1749: 1660: 1235: 472:memory access patterns 466: 280: 246: 6889:Specialized libraries 6802:Matrix multiplication 6797:Matrix decompositions 6284:10.1145/800076.802486 5634:10.1145/106972.106981 5551:Leiserson, Charles E. 5403:. Springer. pp.  5300: 5293:Algorithms for meshes 5081: 5035:This algorithm has a 4224: 3978: 3899:that, given matrices 3897:Monte Carlo algorithm 3852: 3826: 3806: 3764: 3738: 3717:asymptotic complexity 3697: 3535: 3429: 3267: 3161: 3060: 2986: 2885: 2815:total I/O-complexity 2786: 2766: 2720: 2633: 2597: 2588:Further information: 2575: 2466:bytes, is bounded by 2425: 2235: 2085: 1831: 1750: 1675:scalar multiplication 1661: 1236: 461: 413:This algorithm takes 281: 226: 110:asymptotic complexity 32:matrix multiplication 6720:How To Optimize GEMM 6570:Information Sciences 6278:. pp. 326–333. 5122:, also known as the 5037:critical path length 4489:Parallel execution: 4020: 3952: 3893:Freivalds' algorithm 3835: 3815: 3773: 3747: 3727: 3545: 3453: 3277: 3185: 3070: 3010: 2895: 2835: 2775: 2755: 2658: 2644:Strassen's algorithm 2606: 2584:Sub-cubic algorithms 2473: 2287: 2133: 1977: 1760: 1712: 1259: 1007: 482:, or a mix of both. 207: 200:matrix with entries 102:Strassen's algorithm 44:scientific computing 36:numerical algorithms 6776:Numerical stability 6457:2013arXiv1304.1467B 6381:10.1147/rd.395.0575 6125:2022Natur.610...47F 2796: 2727:numerical stability 1870:Non-square matrices 426:asymptotic notation 140:Iterative algorithm 48:pattern recognition 6615:Parallel Computing 6512:Parallel Computing 5877:10.1137/22M1502719 5782:10.1007/BF02165411 5731:Teukolsky, Saul A. 5603:MIT OpenCourseWare 5319:Cannon's algorithm 5303: 5120:Cannon's algorithm 5096: 4219: 4213: 3973: 3847: 3821: 3801: 3759: 3733: 3723:, usually denoted 3692: 3530: 3424: 3262: 3173:Karstadt, Schwartz 3156: 3055: 2981: 2880: 2794: 2781: 2761: 2715: 2640: 2628: 2570: 2420: 2368: 2328: 2230: 2224: 2177: 2080: 2074: 2020: 1826: 1745: 1698:as its base case. 1656: 1650: 1451: 1387: 1320: 1231: 1222: 1148: 1074: 999:block partitioning 467: 430:algorithm analysis 276: 134:galactic algorithm 6902: 6901: 6420:978-3-642-23397-5 5754:978-0-521-88068-8 5674:(Master's). MIT. 5643:978-0-89791-380-5 5555:Rivest, Ronald L. 5547:Cormen, Thomas H. 5422:978-1-84800-069-8 3705: 3704: 3677: 3676: 3666: 3575: 3574: 3564: 3409: 3408: 3398: 3307: 3306: 3296: 3100: 3099: 3089: 2925: 2924: 2914: 2784:{\displaystyle M} 2764:{\displaystyle n} 2563: 2560: 2533: 1881:Inputs: matrices 634:Pick a tile size 16:(Redirected from 6932: 6812:Matrix splitting 6750: 6743: 6736: 6727: 6716: 6687: 6669: 6648: 6630: 6596: 6595: 6585: 6565: 6559: 6558: 6556: 6544: 6538: 6537: 6527: 6507: 6501: 6500: 6498: 6474: 6468: 6467: 6465: 6463: 6450: 6440: 6431: 6425: 6424: 6400: 6391: 6385: 6384: 6374: 6354: 6345: 6344: 6334: 6314: 6303: 6302: 6300: 6273: 6264: 6258: 6257: 6247: 6241: 6240: 6230: 6219: 6210: 6209: 6207: 6205: 6190: 6184: 6183: 6181: 6169: 6163: 6162: 6152: 6103: 6094: 6093: 6091: 6090: 6085:. 5 October 2022 6083:www.deepmind.com 6075: 6069: 6068: 6065: 6048: 6039: 6033: 6032: 6027: 6026: 6020: 6014:, archived from 5997: 5979: 5970: 5964: 5963: 5962: 5944: 5935: 5929: 5928: 5926: 5914: 5908: 5907: 5887: 5881: 5880: 5860: 5854: 5853: 5833: 5827: 5826: 5824: 5800: 5794: 5793: 5765: 5759: 5758: 5739:(3rd ed.). 5726: 5720: 5719: 5710: 5690: 5684: 5683: 5673: 5659: 5648: 5647: 5621: 5615: 5614: 5612: 5610: 5594: 5579: 5578: 5543: 5532: 5531: 5529: 5528: 5513: 5507: 5506: 5505: 5487: 5478: 5477: 5476: 5460: 5454: 5453: 5452: 5436: 5427: 5426: 5402: 5389: 5283: 5277: 5271: 5251: 5236: 5234: 5233: 5216: 5215: 5214: 5205: 5204: 5203: 5194: 5187: 5179: 5177: 5176: 5159: 5153: 5152: 5151: 5141: 5140: 5139: 5115: 5093: 5089: 5085: 5074: 5066: 5062: 5046: 5031: 5003: 4980: 4957: 4934: 4908: 4899: 4890: 4881: 4872: 4865: 4856: 4847: 4838: 4829: 4819: 4796: 4786:, element-wise: 4785: 4781: 4777: 4757: 4750: 4728: 4698: 4668: 4638: 4608: 4578: 4548: 4518: 4485: 4476: 4467: 4458: 4449: 4442: 4433: 4424: 4415: 4406: 4399: 4390: 4381: 4372: 4363: 4356: 4347: 4338: 4329: 4320: 4313: 4303: 4296: 4273: 4262: 4228: 4226: 4225: 4220: 4218: 4217: 4210: 4209: 4200: 4199: 4187: 4186: 4177: 4176: 4165: 4164: 4155: 4154: 4142: 4141: 4132: 4131: 4118: 4117: 4108: 4107: 4095: 4094: 4085: 4084: 4073: 4072: 4063: 4062: 4050: 4049: 4040: 4039: 4009:in two ways for 3982: 3980: 3979: 3974: 3972: 3964: 3959: 3928: 3918: 3910: 3906: 3902: 3884: 3878: 3856: 3854: 3853: 3848: 3830: 3828: 3827: 3822: 3810: 3808: 3807: 3802: 3800: 3799: 3768: 3766: 3765: 3760: 3742: 3740: 3739: 3734: 3701: 3699: 3698: 3693: 3682: 3678: 3672: 3671: 3667: 3662: 3659: 3650: 3649: 3637: 3636: 3621: 3620: 3599: 3598: 3591: 3590: 3580: 3576: 3570: 3569: 3565: 3560: 3557: 3539: 3537: 3536: 3531: 3523: 3522: 3513: 3512: 3497: 3496: 3481: 3480: 3473: 3472: 3441:Schwartz, Vaknin 3433: 3431: 3430: 3425: 3414: 3410: 3404: 3403: 3399: 3394: 3391: 3382: 3381: 3369: 3368: 3353: 3352: 3331: 3330: 3323: 3322: 3312: 3308: 3302: 3301: 3297: 3292: 3289: 3271: 3269: 3268: 3263: 3255: 3254: 3245: 3244: 3229: 3228: 3213: 3212: 3205: 3204: 3165: 3163: 3162: 3157: 3146: 3145: 3124: 3123: 3116: 3115: 3105: 3101: 3095: 3094: 3090: 3085: 3082: 3064: 3062: 3061: 3056: 3054: 3053: 3038: 3037: 3030: 3029: 2990: 2988: 2987: 2982: 2971: 2970: 2949: 2948: 2941: 2940: 2930: 2926: 2920: 2919: 2915: 2910: 2907: 2889: 2887: 2886: 2881: 2879: 2878: 2863: 2862: 2855: 2854: 2797: 2790: 2788: 2787: 2782: 2770: 2768: 2767: 2762: 2735: 2724: 2722: 2721: 2716: 2711: 2710: 2689: 2688: 2681: 2680: 2653: 2637: 2635: 2634: 2629: 2624: 2623: 2601: 2579: 2577: 2576: 2571: 2569: 2565: 2564: 2562: 2561: 2556: 2550: 2539: 2534: 2529: 2503: 2465: 2461: 2452:multiprogramming 2429: 2427: 2426: 2421: 2419: 2418: 2409: 2408: 2396: 2395: 2386: 2385: 2373: 2372: 2365: 2364: 2351: 2350: 2333: 2332: 2325: 2324: 2313: 2312: 2274: 2270: 2266: 2239: 2237: 2236: 2231: 2229: 2228: 2221: 2220: 2206: 2205: 2182: 2181: 2174: 2173: 2162: 2161: 2120: 2116: 2089: 2087: 2086: 2081: 2079: 2078: 2068: 2067: 2051: 2050: 2030: 2025: 2024: 2017: 2016: 2003: 2002: 1964: 1960: 1935:Recursive cases: 1927: 1908: 1898: 1894: 1884: 1865: 1853: 1845: 1835: 1833: 1832: 1827: 1819: 1818: 1794: 1754: 1752: 1751: 1746: 1704: 1697: 1665: 1663: 1662: 1657: 1655: 1654: 1647: 1646: 1637: 1636: 1624: 1623: 1614: 1613: 1602: 1601: 1592: 1591: 1579: 1578: 1569: 1568: 1555: 1554: 1545: 1544: 1532: 1531: 1522: 1521: 1510: 1509: 1500: 1499: 1487: 1486: 1477: 1476: 1456: 1455: 1448: 1447: 1436: 1435: 1422: 1421: 1410: 1409: 1392: 1391: 1384: 1383: 1372: 1371: 1358: 1357: 1346: 1345: 1325: 1324: 1317: 1316: 1305: 1304: 1291: 1290: 1279: 1278: 1251: 1247: 1240: 1238: 1237: 1232: 1227: 1226: 1219: 1218: 1207: 1206: 1193: 1192: 1181: 1180: 1153: 1152: 1145: 1144: 1133: 1132: 1119: 1118: 1107: 1106: 1079: 1078: 1071: 1070: 1059: 1058: 1045: 1044: 1033: 1032: 984: 983: 982: 970: 968: 966: 965: 964: 963: 952: 949: 933: 917: 896: 876: 860: 856: 850: 843: 827: 823: 816: 800: 796: 789: 758: 727: 693: 689: 685: 678: 674: 670: 663: 659: 655: 649: 647: 646: 630: 624: 620: 617:Input: matrices 610: 609: 608: 599: 598: 597: 582: 578: 570: 562: 558: 555:and a column of 554: 550: 549: 547: 546: 541: 538: 524: 520: 516: 514: 513: 508: 505: 496: 492: 462:Illustration of 449: 441: 423: 407: 397: 383: 363: 359: 353: 346: 342: 335: 331: 324: 318: 314: 311:Input: matrices 304: 300: 296: 292: 285: 283: 282: 277: 272: 271: 259: 258: 245: 240: 222: 221: 199: 189: 185: 181: 171: 167: 157: 131: 119: 95: 87: 74: 69:on the order of 21: 6940: 6939: 6935: 6934: 6933: 6931: 6930: 6929: 6905: 6904: 6903: 6898: 6857: 6853:Multiprocessing 6821: 6817:Sparse problems 6780: 6759: 6754: 6724: 6705:10.1145/2764454 6690: 6667:10.1.1.140.3583 6651: 6608: 6604: 6602:Further reading 6599: 6567: 6566: 6562: 6546: 6545: 6541: 6509: 6508: 6504: 6476: 6475: 6471: 6461: 6459: 6438: 6433: 6432: 6428: 6421: 6398: 6393: 6392: 6388: 6359:IBM J. Res. Dev 6356: 6355: 6348: 6316: 6315: 6306: 6298: 6271: 6266: 6265: 6261: 6249: 6248: 6244: 6228: 6221: 6220: 6213: 6203: 6201: 6199:Quanta Magazine 6192: 6191: 6187: 6171: 6170: 6166: 6119:(7930): 47–53. 6105: 6104: 6097: 6088: 6086: 6077: 6076: 6072: 6060: 6046: 6041: 6040: 6036: 6024: 6022: 6018: 6004:10.1137/0218045 5995:10.1.1.531.9309 5977: 5972: 5971: 5967: 5942: 5937: 5936: 5932: 5916: 5915: 5911: 5904:10.1137/0205016 5889: 5888: 5884: 5862: 5861: 5857: 5835: 5834: 5830: 5802: 5801: 5797: 5767: 5766: 5762: 5755: 5728: 5727: 5723: 5717:10.1137/0204009 5708:10.1.1.148.9947 5692: 5691: 5687: 5671: 5661: 5660: 5651: 5644: 5623: 5622: 5618: 5608: 5606: 5596: 5595: 5582: 5575: 5559:Stein, Clifford 5545: 5544: 5535: 5526: 5524: 5522:Quanta Magazine 5515: 5514: 5510: 5489: 5488: 5481: 5462: 5461: 5457: 5438: 5437: 5430: 5423: 5391: 5390: 5379: 5375: 5338: 5295: 5279: 5273: 5258: 5238: 5229: 5227: 5218: 5210: 5208: 5207: 5199: 5197: 5196: 5192: 5181: 5172: 5170: 5161: 5155: 5146: 5144: 5143: 5134: 5132: 5131: 5109: 5101: 5091: 5087: 5083: 5068: 5064: 5052: 5040: 5029: 5018: 5017: 5001: 4994: 4987: 4978: 4971: 4964: 4955: 4948: 4941: 4932: 4925: 4918: 4907: 4901: 4898: 4892: 4889: 4883: 4880: 4874: 4870: 4864: 4858: 4855: 4849: 4846: 4840: 4837: 4831: 4827: 4818: 4811: 4804: 4798: 4791: 4783: 4779: 4767: 4755: 4740: 4726: 4719: 4712: 4705: 4696: 4689: 4682: 4675: 4666: 4659: 4652: 4645: 4636: 4629: 4622: 4615: 4606: 4599: 4592: 4585: 4576: 4569: 4562: 4555: 4546: 4539: 4532: 4525: 4516: 4509: 4502: 4495: 4484: 4478: 4475: 4469: 4466: 4460: 4457: 4451: 4447: 4441: 4435: 4432: 4426: 4423: 4417: 4414: 4408: 4404: 4398: 4392: 4389: 4383: 4380: 4374: 4371: 4365: 4361: 4355: 4349: 4346: 4340: 4337: 4331: 4328: 4322: 4318: 4305: 4301: 4295: 4288: 4281: 4275: 4268: 4248: 4235:fork–join style 4212: 4211: 4201: 4191: 4178: 4168: 4166: 4156: 4146: 4133: 4123: 4120: 4119: 4109: 4099: 4086: 4076: 4074: 4064: 4054: 4041: 4031: 4024: 4018: 4017: 3999: 3994: 3986:remains unknown 3950: 3949: 3935: 3920: 3912: 3908: 3904: 3900: 3880: 3870: 3867:Shmuel Winograd 3863:Don Coppersmith 3833: 3832: 3813: 3812: 3776: 3771: 3770: 3745: 3744: 3725: 3724: 3660: 3654: 3641: 3628: 3612: 3582: 3558: 3552: 3551: 3543: 3542: 3514: 3504: 3488: 3464: 3459: 3451: 3450: 3392: 3386: 3373: 3360: 3344: 3314: 3290: 3284: 3283: 3275: 3274: 3246: 3236: 3220: 3196: 3191: 3183: 3182: 3137: 3107: 3083: 3077: 3076: 3068: 3067: 3045: 3021: 3016: 3008: 3007: 2962: 2932: 2908: 2902: 2901: 2893: 2892: 2870: 2846: 2841: 2833: 2832: 2773: 2772: 2753: 2752: 2730: 2702: 2672: 2667: 2656: 2655: 2651: 2648:Volker Strassen 2615: 2604: 2603: 2599: 2592: 2586: 2551: 2540: 2504: 2483: 2479: 2471: 2470: 2463: 2459: 2448:cache-oblivious 2440: 2435: 2434: 2410: 2400: 2387: 2377: 2367: 2366: 2356: 2353: 2352: 2342: 2335: 2327: 2326: 2316: 2314: 2304: 2297: 2285: 2284: 2272: 2271:vertically and 2268: 2249: 2223: 2222: 2212: 2207: 2197: 2187: 2176: 2175: 2165: 2163: 2153: 2146: 2131: 2130: 2118: 2099: 2073: 2072: 2059: 2056: 2055: 2042: 2035: 2019: 2018: 2008: 2005: 2004: 1994: 1987: 1975: 1974: 1962: 1943: 1913: 1900: 1896: 1886: 1882: 1872: 1859: 1847: 1840: 1810: 1758: 1757: 1710: 1709: 1702: 1696: 1690: 1683: 1677: 1649: 1648: 1638: 1628: 1615: 1605: 1603: 1593: 1583: 1570: 1560: 1557: 1556: 1546: 1536: 1523: 1513: 1511: 1501: 1491: 1478: 1468: 1461: 1450: 1449: 1439: 1437: 1427: 1424: 1423: 1413: 1411: 1401: 1394: 1386: 1385: 1375: 1373: 1363: 1360: 1359: 1349: 1347: 1337: 1330: 1319: 1318: 1308: 1306: 1296: 1293: 1292: 1282: 1280: 1270: 1263: 1257: 1256: 1249: 1245: 1221: 1220: 1210: 1208: 1198: 1195: 1194: 1184: 1182: 1172: 1165: 1147: 1146: 1136: 1134: 1124: 1121: 1120: 1110: 1108: 1098: 1091: 1073: 1072: 1062: 1060: 1050: 1047: 1046: 1036: 1034: 1024: 1017: 1005: 1004: 991: 978: 976: 972: 959: 957: 953: 950: 945: 944: 942: 940: 937: 936: 931: 914: 907: 902: 894: 887: 881: 862: 858: 854: 848: 829: 825: 821: 802: 798: 794: 788: 760: 757: 729: 726: 698: 691: 687: 683: 676: 672: 668: 661: 657: 653: 642: 640: 635: 628: 622: 618: 604: 602: 601: 593: 591: 590: 580: 576: 564: 560: 556: 552: 542: 539: 534: 533: 531: 526: 522: 518: 509: 506: 501: 500: 498: 494: 490: 456: 443: 433: 417: 411: 410: 405: 394: 389: 381: 374: 368: 361: 357: 351: 344: 340: 333: 329: 322: 316: 312: 302: 301:from 1 through 298: 294: 293:from 1 through 290: 260: 247: 210: 205: 204: 191: 187: 183: 173: 169: 159: 149: 142: 125: 120:time, given by 113: 89: 79: 70: 28: 23: 22: 15: 12: 11: 5: 6938: 6936: 6928: 6927: 6922: 6917: 6907: 6906: 6900: 6899: 6897: 6896: 6891: 6886: 6881: 6876: 6871: 6865: 6863: 6859: 6858: 6856: 6855: 6850: 6845: 6840: 6835: 6829: 6827: 6823: 6822: 6820: 6819: 6814: 6809: 6799: 6794: 6788: 6786: 6782: 6781: 6779: 6778: 6773: 6771:Floating point 6767: 6765: 6761: 6760: 6755: 6753: 6752: 6745: 6738: 6730: 6723: 6722: 6717: 6688: 6649: 6611:Dongarra, Jack 6605: 6603: 6600: 6598: 6597: 6583:10.1.1.90.4753 6576:(3): 347–365. 6560: 6539: 6525:10.1.1.88.8527 6502: 6469: 6426: 6419: 6386: 6372:10.1.1.44.3404 6365:(5): 575–582. 6346: 6332:10.1.1.20.7034 6325:(9): 1017–26. 6304: 6259: 6242: 6211: 6185: 6164: 6095: 6070: 6034: 5988:(4): 658–669, 5965: 5930: 5909: 5898:(2): 187–203. 5892:SIAM J. Comput 5882: 5855: 5828: 5815:(4): 381–388. 5795: 5776:(4): 354–356. 5760: 5753: 5721: 5685: 5663:Prokop, Harald 5649: 5642: 5616: 5580: 5573: 5533: 5508: 5479: 5455: 5428: 5421: 5393:Skiena, Steven 5376: 5374: 5371: 5370: 5369: 5364: 5359: 5354: 5349: 5344: 5337: 5334: 5294: 5291: 5188:computation). 5100: 5097: 5016: 5015: 5014: 5013: 5007: 5006: 5005: 4999: 4992: 4982: 4976: 4969: 4959: 4953: 4946: 4936: 4930: 4923: 4910: 4905: 4896: 4887: 4878: 4867: 4862: 4853: 4844: 4835: 4821: 4816: 4809: 4802: 4790:Base case: if 4762: 4761: 4760: 4759: 4752: 4738: 4732: 4731: 4730: 4724: 4717: 4710: 4700: 4694: 4687: 4680: 4670: 4664: 4657: 4650: 4640: 4634: 4627: 4620: 4610: 4604: 4597: 4590: 4580: 4574: 4567: 4560: 4550: 4544: 4537: 4530: 4520: 4514: 4507: 4500: 4487: 4482: 4473: 4464: 4455: 4444: 4439: 4430: 4421: 4412: 4401: 4396: 4387: 4378: 4369: 4358: 4353: 4344: 4335: 4326: 4298: 4293: 4286: 4279: 4267:Base case: if 4243: 4242: 4230: 4229: 4216: 4208: 4204: 4198: 4194: 4190: 4185: 4181: 4175: 4171: 4167: 4163: 4159: 4153: 4149: 4145: 4140: 4136: 4130: 4126: 4122: 4121: 4116: 4112: 4106: 4102: 4098: 4093: 4089: 4083: 4079: 4075: 4071: 4067: 4061: 4057: 4053: 4048: 4044: 4038: 4034: 4030: 4029: 4027: 3998: 3995: 3993: 3990: 3971: 3967: 3963: 3958: 3943:neural network 3934: 3931: 3911:, verifies in 3887:Big O notation 3846: 3843: 3840: 3820: 3798: 3795: 3792: 3789: 3786: 3783: 3779: 3758: 3755: 3752: 3732: 3703: 3702: 3691: 3688: 3685: 3681: 3675: 3670: 3665: 3657: 3653: 3648: 3644: 3640: 3635: 3631: 3627: 3624: 3619: 3615: 3611: 3608: 3605: 3602: 3597: 3594: 3589: 3585: 3579: 3573: 3568: 3563: 3555: 3550: 3540: 3529: 3526: 3521: 3517: 3511: 3507: 3503: 3500: 3495: 3491: 3487: 3484: 3479: 3476: 3471: 3467: 3462: 3458: 3448: 3445: 3442: 3439: 3435: 3434: 3423: 3420: 3417: 3413: 3407: 3402: 3397: 3389: 3385: 3380: 3376: 3372: 3367: 3363: 3359: 3356: 3351: 3347: 3343: 3340: 3337: 3334: 3329: 3326: 3321: 3317: 3311: 3305: 3300: 3295: 3287: 3282: 3272: 3261: 3258: 3253: 3249: 3243: 3239: 3235: 3232: 3227: 3223: 3219: 3216: 3211: 3208: 3203: 3199: 3194: 3190: 3180: 3177: 3174: 3171: 3167: 3166: 3155: 3152: 3149: 3144: 3140: 3136: 3133: 3130: 3127: 3122: 3119: 3114: 3110: 3104: 3098: 3093: 3088: 3080: 3075: 3065: 3052: 3048: 3044: 3041: 3036: 3033: 3028: 3024: 3019: 3015: 3005: 3002: 2999: 2996: 2992: 2991: 2980: 2977: 2974: 2969: 2965: 2961: 2958: 2955: 2952: 2947: 2944: 2939: 2935: 2929: 2923: 2918: 2913: 2905: 2900: 2890: 2877: 2873: 2869: 2866: 2861: 2858: 2853: 2849: 2844: 2840: 2830: 2827: 2824: 2821: 2817: 2816: 2813: 2810: 2807: 2804: 2801: 2780: 2760: 2749:big O notation 2714: 2709: 2705: 2701: 2698: 2695: 2692: 2687: 2684: 2679: 2675: 2670: 2666: 2663: 2627: 2622: 2618: 2614: 2611: 2585: 2582: 2581: 2580: 2568: 2559: 2554: 2549: 2546: 2543: 2537: 2532: 2528: 2525: 2522: 2519: 2516: 2513: 2510: 2507: 2501: 2498: 2495: 2492: 2489: 2486: 2482: 2478: 2439: 2438:Cache behavior 2436: 2433: 2432: 2431: 2430: 2417: 2413: 2407: 2403: 2399: 2394: 2390: 2384: 2380: 2376: 2371: 2363: 2359: 2355: 2354: 2349: 2345: 2341: 2340: 2338: 2331: 2323: 2319: 2315: 2311: 2307: 2303: 2302: 2300: 2295: 2292: 2279: 2278: 2277: 2276: 2243: 2242: 2241: 2240: 2227: 2219: 2215: 2211: 2208: 2204: 2200: 2196: 2193: 2192: 2190: 2185: 2180: 2172: 2168: 2164: 2160: 2156: 2152: 2151: 2149: 2144: 2141: 2138: 2125: 2124: 2123: 2122: 2093: 2092: 2091: 2090: 2077: 2071: 2066: 2062: 2058: 2057: 2054: 2049: 2045: 2041: 2040: 2038: 2033: 2029: 2023: 2015: 2011: 2007: 2006: 2001: 1997: 1993: 1992: 1990: 1985: 1982: 1969: 1968: 1967: 1966: 1937: 1936: 1933: 1912:Base case: if 1910: 1878: 1877: 1871: 1868: 1837: 1836: 1825: 1822: 1817: 1813: 1809: 1806: 1803: 1800: 1797: 1793: 1789: 1786: 1783: 1780: 1777: 1774: 1771: 1768: 1765: 1755: 1744: 1741: 1738: 1735: 1732: 1729: 1726: 1723: 1720: 1717: 1694: 1688: 1681: 1667: 1666: 1653: 1645: 1641: 1635: 1631: 1627: 1622: 1618: 1612: 1608: 1604: 1600: 1596: 1590: 1586: 1582: 1577: 1573: 1567: 1563: 1559: 1558: 1553: 1549: 1543: 1539: 1535: 1530: 1526: 1520: 1516: 1512: 1508: 1504: 1498: 1494: 1490: 1485: 1481: 1475: 1471: 1467: 1466: 1464: 1459: 1454: 1446: 1442: 1438: 1434: 1430: 1426: 1425: 1420: 1416: 1412: 1408: 1404: 1400: 1399: 1397: 1390: 1382: 1378: 1374: 1370: 1366: 1362: 1361: 1356: 1352: 1348: 1344: 1340: 1336: 1335: 1333: 1328: 1323: 1315: 1311: 1307: 1303: 1299: 1295: 1294: 1289: 1285: 1281: 1277: 1273: 1269: 1268: 1266: 1242: 1241: 1230: 1225: 1217: 1213: 1209: 1205: 1201: 1197: 1196: 1191: 1187: 1183: 1179: 1175: 1171: 1170: 1168: 1163: 1160: 1156: 1151: 1143: 1139: 1135: 1131: 1127: 1123: 1122: 1117: 1113: 1109: 1105: 1101: 1097: 1096: 1094: 1089: 1086: 1082: 1077: 1069: 1065: 1061: 1057: 1053: 1049: 1048: 1043: 1039: 1035: 1031: 1027: 1023: 1022: 1020: 1015: 1012: 990: 987: 935: 934: 928: 927: 926: 925: 924: 923: 922: 921: 920: 919: 918: 912: 905: 899: 898: 897: 892: 885: 851: 791: 764: 733: 702: 650: 632: 625: 614: 613: 489:consisting of 455: 454:Cache behavior 452: 409: 408: 402: 401: 400: 399: 398: 392: 386: 385: 384: 379: 372: 354: 326: 319: 308: 307: 287: 286: 275: 270: 267: 263: 257: 254: 250: 244: 239: 236: 233: 229: 225: 220: 217: 213: 141: 138: 98:big O notation 26: 24: 14: 13: 10: 9: 6: 4: 3: 2: 6937: 6926: 6923: 6921: 6920:Matrix theory 6918: 6916: 6913: 6912: 6910: 6895: 6892: 6890: 6887: 6885: 6882: 6880: 6877: 6875: 6872: 6870: 6867: 6866: 6864: 6860: 6854: 6851: 6849: 6846: 6844: 6841: 6839: 6836: 6834: 6831: 6830: 6828: 6824: 6818: 6815: 6813: 6810: 6807: 6803: 6800: 6798: 6795: 6793: 6790: 6789: 6787: 6783: 6777: 6774: 6772: 6769: 6768: 6766: 6762: 6758: 6751: 6746: 6744: 6739: 6737: 6732: 6731: 6728: 6721: 6718: 6714: 6710: 6706: 6702: 6698: 6694: 6689: 6685: 6681: 6677: 6673: 6668: 6663: 6659: 6655: 6650: 6646: 6642: 6638: 6634: 6629: 6624: 6620: 6616: 6612: 6607: 6606: 6601: 6593: 6589: 6584: 6579: 6575: 6571: 6564: 6561: 6555: 6550: 6543: 6540: 6535: 6531: 6526: 6521: 6517: 6513: 6506: 6503: 6497: 6492: 6488: 6484: 6480: 6473: 6470: 6458: 6454: 6449: 6444: 6437: 6430: 6427: 6422: 6416: 6412: 6408: 6404: 6397: 6390: 6387: 6382: 6378: 6373: 6368: 6364: 6360: 6353: 6351: 6347: 6342: 6338: 6333: 6328: 6324: 6320: 6313: 6311: 6309: 6305: 6297: 6293: 6289: 6285: 6281: 6277: 6270: 6263: 6260: 6255: 6254: 6246: 6243: 6238: 6234: 6227: 6226: 6218: 6216: 6212: 6200: 6196: 6189: 6186: 6180: 6175: 6168: 6165: 6160: 6156: 6151: 6146: 6142: 6138: 6134: 6130: 6126: 6122: 6118: 6114: 6110: 6102: 6100: 6096: 6084: 6080: 6074: 6071: 6067: 6063: 6056: 6052: 6045: 6038: 6035: 6031: 6021:on 2014-03-05 6017: 6013: 6009: 6005: 6001: 5996: 5991: 5987: 5983: 5976: 5969: 5966: 5961: 5956: 5952: 5948: 5941: 5934: 5931: 5925: 5920: 5913: 5910: 5905: 5901: 5897: 5893: 5886: 5883: 5878: 5874: 5870: 5866: 5859: 5856: 5851: 5847: 5843: 5839: 5832: 5829: 5823: 5818: 5814: 5810: 5806: 5799: 5796: 5791: 5787: 5783: 5779: 5775: 5771: 5764: 5761: 5756: 5750: 5746: 5742: 5738: 5737: 5732: 5725: 5722: 5718: 5714: 5709: 5704: 5701:(2): 97–107, 5700: 5696: 5689: 5686: 5681: 5677: 5670: 5669: 5664: 5658: 5656: 5654: 5650: 5645: 5639: 5635: 5631: 5627: 5620: 5617: 5604: 5600: 5593: 5591: 5589: 5587: 5585: 5581: 5576: 5574:0-262-03384-4 5570: 5566: 5565: 5560: 5556: 5552: 5548: 5542: 5540: 5538: 5534: 5523: 5519: 5512: 5509: 5504: 5499: 5495: 5494: 5486: 5484: 5480: 5475: 5470: 5466: 5459: 5456: 5451: 5446: 5442: 5435: 5433: 5429: 5424: 5418: 5414: 5410: 5406: 5401: 5400: 5394: 5388: 5386: 5384: 5382: 5378: 5372: 5368: 5365: 5363: 5360: 5358: 5355: 5353: 5350: 5348: 5345: 5343: 5340: 5339: 5335: 5333: 5331: 5326: 5324: 5320: 5316: 5312: 5308: 5299: 5292: 5290: 5288: 5282: 5276: 5269: 5265: 5261: 5255: 5254:3D algorithm, 5249: 5245: 5241: 5232: 5225: 5221: 5213: 5202: 5189: 5185: 5175: 5168: 5164: 5158: 5149: 5137: 5129: 5125: 5121: 5117: 5113: 5107: 5098: 5080: 5076: 5072: 5060: 5056: 5050: 5044: 5038: 5033: 5027: 5023: 5011: 5008: 4998: 4991: 4986: 4983: 4975: 4968: 4963: 4960: 4952: 4945: 4940: 4937: 4929: 4922: 4917: 4914: 4913: 4912:In parallel: 4911: 4904: 4895: 4886: 4877: 4868: 4861: 4852: 4843: 4834: 4825: 4824: 4822: 4815: 4808: 4801: 4794: 4789: 4788: 4787: 4775: 4771: 4766: 4753: 4748: 4744: 4739: 4736: 4733: 4723: 4716: 4709: 4704: 4701: 4693: 4686: 4679: 4674: 4671: 4663: 4656: 4649: 4644: 4641: 4633: 4626: 4619: 4614: 4611: 4603: 4596: 4589: 4584: 4581: 4573: 4566: 4559: 4554: 4551: 4543: 4536: 4529: 4524: 4521: 4513: 4506: 4499: 4494: 4491: 4490: 4488: 4481: 4472: 4463: 4454: 4445: 4438: 4429: 4420: 4411: 4402: 4395: 4386: 4377: 4368: 4359: 4352: 4343: 4334: 4325: 4316: 4315: 4312: 4308: 4299: 4292: 4285: 4278: 4271: 4266: 4265: 4264: 4260: 4256: 4252: 4247: 4241: 4239: 4236: 4214: 4206: 4202: 4196: 4192: 4188: 4183: 4179: 4173: 4169: 4161: 4157: 4151: 4147: 4143: 4138: 4134: 4128: 4124: 4114: 4110: 4104: 4100: 4096: 4091: 4087: 4081: 4077: 4069: 4065: 4059: 4055: 4051: 4046: 4042: 4036: 4032: 4025: 4016: 4015: 4014: 4012: 4008: 4004: 3996: 3991: 3989: 3987: 3983: 3965: 3961: 3948:finite field 3944: 3940: 3932: 3930: 3927: 3923: 3916: 3898: 3894: 3890: 3888: 3883: 3877: 3873: 3868: 3864: 3860: 3844: 3841: 3838: 3818: 3793: 3787: 3784: 3781: 3777: 3756: 3753: 3750: 3730: 3722: 3718: 3714: 3709: 3689: 3686: 3683: 3679: 3673: 3668: 3663: 3655: 3651: 3646: 3642: 3638: 3633: 3629: 3625: 3622: 3617: 3613: 3609: 3606: 3603: 3600: 3595: 3592: 3587: 3583: 3577: 3571: 3566: 3561: 3553: 3548: 3541: 3527: 3524: 3519: 3515: 3509: 3505: 3501: 3498: 3493: 3489: 3485: 3482: 3477: 3474: 3469: 3465: 3460: 3456: 3449: 3446: 3443: 3440: 3437: 3436: 3421: 3418: 3415: 3411: 3405: 3400: 3395: 3387: 3383: 3378: 3374: 3370: 3365: 3361: 3357: 3354: 3349: 3345: 3341: 3338: 3335: 3332: 3327: 3324: 3319: 3315: 3309: 3303: 3298: 3293: 3285: 3280: 3273: 3259: 3256: 3251: 3247: 3241: 3237: 3233: 3230: 3225: 3221: 3217: 3214: 3209: 3206: 3201: 3197: 3192: 3188: 3181: 3178: 3175: 3172: 3169: 3168: 3153: 3150: 3147: 3142: 3138: 3134: 3131: 3128: 3125: 3120: 3117: 3112: 3108: 3102: 3096: 3091: 3086: 3078: 3073: 3066: 3050: 3046: 3042: 3039: 3034: 3031: 3026: 3022: 3017: 3013: 3006: 3003: 3000: 2997: 2994: 2993: 2978: 2975: 2972: 2967: 2963: 2959: 2956: 2953: 2950: 2945: 2942: 2937: 2933: 2927: 2921: 2916: 2911: 2903: 2898: 2891: 2875: 2871: 2867: 2864: 2859: 2856: 2851: 2847: 2842: 2838: 2831: 2828: 2825: 2822: 2819: 2818: 2814: 2811: 2808: 2805: 2802: 2799: 2798: 2792: 2778: 2758: 2750: 2745: 2743: 2742:finite fields 2739: 2733: 2728: 2707: 2703: 2696: 2693: 2685: 2682: 2677: 2673: 2668: 2661: 2649: 2646:, devised by 2645: 2620: 2616: 2609: 2596: 2591: 2583: 2566: 2557: 2552: 2547: 2544: 2541: 2535: 2530: 2526: 2523: 2520: 2517: 2514: 2511: 2508: 2505: 2499: 2496: 2493: 2490: 2487: 2484: 2480: 2469: 2468: 2467: 2456: 2453: 2449: 2445: 2437: 2415: 2411: 2405: 2401: 2397: 2392: 2388: 2382: 2378: 2374: 2369: 2361: 2357: 2347: 2343: 2336: 2329: 2321: 2317: 2309: 2305: 2298: 2293: 2290: 2283: 2282: 2281: 2280: 2275:horizontally: 2265: 2261: 2257: 2253: 2247: 2246: 2245: 2244: 2225: 2217: 2213: 2209: 2202: 2198: 2194: 2188: 2183: 2178: 2170: 2166: 2158: 2154: 2147: 2142: 2139: 2136: 2129: 2128: 2127: 2126: 2115: 2111: 2107: 2103: 2097: 2096: 2095: 2094: 2075: 2069: 2064: 2060: 2052: 2047: 2043: 2036: 2031: 2027: 2021: 2013: 2009: 1999: 1995: 1988: 1983: 1980: 1973: 1972: 1971: 1970: 1965:horizontally: 1959: 1955: 1951: 1947: 1941: 1940: 1939: 1938: 1934: 1931: 1925: 1921: 1917: 1911: 1907: 1903: 1893: 1889: 1880: 1879: 1876: 1869: 1867: 1863: 1857: 1851: 1843: 1823: 1815: 1811: 1801: 1795: 1791: 1787: 1781: 1778: 1775: 1769: 1763: 1756: 1742: 1736: 1727: 1721: 1715: 1708: 1707: 1706: 1699: 1693: 1687: 1680: 1676: 1672: 1651: 1643: 1639: 1633: 1629: 1625: 1620: 1616: 1610: 1606: 1598: 1594: 1588: 1584: 1580: 1575: 1571: 1565: 1561: 1551: 1547: 1541: 1537: 1533: 1528: 1524: 1518: 1514: 1506: 1502: 1496: 1492: 1488: 1483: 1479: 1473: 1469: 1462: 1457: 1452: 1444: 1440: 1432: 1428: 1418: 1414: 1406: 1402: 1395: 1388: 1380: 1376: 1368: 1364: 1354: 1350: 1342: 1338: 1331: 1326: 1321: 1313: 1309: 1301: 1297: 1287: 1283: 1275: 1271: 1264: 1255: 1254: 1253: 1228: 1223: 1215: 1211: 1203: 1199: 1189: 1185: 1177: 1173: 1166: 1161: 1158: 1154: 1149: 1141: 1137: 1129: 1125: 1115: 1111: 1103: 1099: 1092: 1087: 1084: 1080: 1075: 1067: 1063: 1055: 1051: 1041: 1037: 1029: 1025: 1018: 1013: 1010: 1003: 1002: 1001: 1000: 996: 988: 986: 981: 975: 962: 956: 948: 929: 915: 908: 900: 895: 888: 879: 878: 874: 870: 866: 852: 846: 845: 841: 837: 833: 819: 818: 814: 810: 806: 792: 787: 783: 779: 775: 771: 767: 763: 756: 752: 748: 744: 740: 736: 732: 725: 721: 717: 713: 709: 705: 701: 696: 695: 681: 680: 666: 665: 651: 645: 638: 633: 626: 616: 615: 612: 607: 596: 588: 587: 573: 568: 545: 537: 529: 512: 504: 488: 483: 481: 477: 473: 465: 460: 453: 451: 447: 440: 436: 431: 427: 421: 416: 403: 395: 387: 382: 375: 366: 365: 355: 349: 348: 338: 337: 327: 320: 310: 309: 306: 273: 268: 265: 261: 255: 252: 248: 242: 237: 234: 231: 227: 223: 218: 215: 211: 203: 202: 201: 198: 194: 180: 176: 166: 162: 156: 152: 147: 139: 137: 135: 129: 123: 117: 111: 107: 103: 99: 93: 86: 82: 77: 73: 68: 63: 61: 57: 53: 49: 45: 41: 37: 33: 19: 6805: 6764:Key concepts 6696: 6692: 6657: 6653: 6618: 6614: 6573: 6569: 6563: 6542: 6518:(3): 383–5. 6515: 6511: 6505: 6486: 6482: 6472: 6460:. Retrieved 6429: 6402: 6389: 6362: 6358: 6322: 6318: 6275: 6262: 6252: 6245: 6237:1721.1/47519 6224: 6202:. Retrieved 6198: 6188: 6167: 6116: 6112: 6087:. Retrieved 6082: 6073: 6061: 6058: 6054: 6050: 6037: 6029: 6023:, retrieved 6016:the original 5985: 5981: 5968: 5950: 5946: 5933: 5912: 5895: 5891: 5885: 5868: 5858: 5841: 5831: 5812: 5808: 5798: 5773: 5769: 5763: 5735: 5724: 5698: 5694: 5688: 5680:1721.1/80568 5667: 5625: 5619: 5607:. Retrieved 5602: 5563: 5525:. Retrieved 5521: 5511: 5492: 5464: 5458: 5440: 5407:–46, 401–3. 5398: 5329: 5327: 5322: 5314: 5310: 5304: 5280: 5274: 5267: 5263: 5259: 5253: 5247: 5243: 5239: 5230: 5223: 5219: 5211: 5200: 5190: 5183: 5173: 5166: 5162: 5156: 5147: 5135: 5124:2D algorithm 5123: 5118: 5111: 5105: 5102: 5070: 5058: 5054: 5042: 5034: 5025: 5021: 5019: 5009: 4996: 4989: 4984: 4973: 4966: 4961: 4950: 4943: 4938: 4927: 4920: 4915: 4902: 4893: 4884: 4875: 4859: 4850: 4841: 4832: 4813: 4806: 4799: 4792: 4773: 4769: 4764: 4763: 4746: 4742: 4734: 4721: 4714: 4707: 4702: 4691: 4684: 4677: 4672: 4661: 4654: 4647: 4642: 4631: 4624: 4617: 4612: 4601: 4594: 4587: 4582: 4571: 4564: 4557: 4552: 4541: 4534: 4527: 4522: 4511: 4504: 4497: 4492: 4479: 4470: 4461: 4452: 4436: 4427: 4418: 4409: 4393: 4384: 4375: 4366: 4350: 4341: 4332: 4323: 4310: 4306: 4290: 4283: 4276: 4269: 4258: 4254: 4250: 4245: 4244: 4231: 4007:parallelized 4000: 3936: 3925: 3921: 3914: 3895:is a simple 3891: 3881: 3875: 3871: 3720: 3710: 3706: 2746: 2731: 2641: 2457: 2441: 2263: 2259: 2255: 2251: 2113: 2109: 2105: 2101: 1957: 1953: 1949: 1945: 1923: 1919: 1915: 1905: 1901: 1891: 1887: 1875:dimensions. 1873: 1861: 1849: 1841: 1838: 1700: 1691: 1685: 1678: 1673:, using the 1668: 1243: 992: 979: 973: 960: 954: 946: 938: 910: 903: 890: 883: 882:sum ← sum + 872: 868: 864: 839: 835: 831: 812: 808: 804: 785: 781: 777: 773: 769: 765: 761: 754: 750: 746: 742: 738: 734: 730: 723: 719: 715: 711: 707: 703: 699: 690:in steps of 675:in steps of 660:in steps of 643: 636: 605: 594: 584: 574: 566: 543: 535: 527: 510: 502: 484: 468: 445: 438: 434: 419: 412: 390: 377: 370: 369:sum ← sum + 288: 196: 192: 178: 174: 164: 160: 154: 150: 143: 127: 115: 91: 84: 80: 71: 64: 39: 29: 6699:(3): 1–33. 6660:(3): 1–25. 6489:: 2230–40. 6204:26 November 5770:Numer. Math 4823:Otherwise: 4754:Deallocate 3933:AlphaTensor 2248:Otherwise, 2121:vertically: 1671:recursively 148:is that if 60:distributed 6909:Categories 6806:algorithms 6179:2212.01175 6089:2022-11-01 6025:2015-01-16 5953:(3): 251, 5924:2008.03759 5743:. p.  5609:27 January 5527:2021-04-01 5503:2010.05846 5474:2210.10173 5450:2307.07970 5373:References 4869:Partition 4826:Partition 4446:Partition 4403:Partition 4360:Partition 4317:Partition 4238:pseudocode 790:, that is: 686:from 1 to 671:from 1 to 656:from 1 to 493:bytes and 360:from 1 to 343:from 1 to 332:from 1 to 67:takes time 6833:CPU cache 6662:CiteSeerX 6628:0709.1272 6621:: 38–53. 6578:CiteSeerX 6554:1411.3273 6520:CiteSeerX 6448:1304.1467 6367:CiteSeerX 6327:CiteSeerX 6141:1476-4687 6051:SIAM News 5990:CiteSeerX 5790:121656251 5703:CiteSeerX 5695:SIAM News 5561:(2009) . 5287:MapReduce 5030:partition 4765:Procedure 4706:multiply( 4676:multiply( 4646:multiply( 4616:multiply( 4586:multiply( 4556:multiply( 4526:multiply( 4496:multiply( 4304:of shape 4249:multiply( 4246:Procedure 3937:In 2022, 3839:ω 3819:ω 3782:ω 3754:× 3731:ω 3652:⁡ 3639:⋅ 3607:− 3601:⋅ 3593:⁡ 3525:⁡ 3483:− 3475:⁡ 3384:⁡ 3371:⋅ 3339:− 3333:⋅ 3325:⁡ 3257:⁡ 3215:− 3207:⁡ 3132:− 3126:⋅ 3118:⁡ 3040:− 3032:⁡ 2957:− 2951:⋅ 2943:⁡ 2865:− 2857:⁡ 2803:Reference 2694:≈ 2683:⁡ 2621:ω 2477:Θ 2098:Else, if 1805:Θ 1731:Θ 1248:for some 697:Multiply 228:∑ 6862:Software 6826:Hardware 6785:Problems 6296:Archived 6159:36198780 5665:(1999). 5336:See also 5154:, where 4314:, then: 3939:DeepMind 3919:time if 3859:Williams 3845:2.371552 2998:Winograd 2823:Strassen 2734:> 100 2267:. Split 2117:, split 1961:, split 1930:unrolled 1899:of size 1885:of size 122:Williams 56:parallel 30:Because 6713:1242360 6684:9359223 6462:12 July 6453:Bibcode 6292:8410593 6150:9534758 6121:Bibcode 6012:1004789 5228:√ 5209:√ 5198:√ 5171:√ 5145:√ 5133:√ 5126:, is a 5049:speedup 977:√ 967:⁠ 958:√ 943:⁠ 930:Return 849:sum = 0 641:√ 603:√ 592:√ 548:⁠ 532:⁠ 515:⁠ 499:⁠ 404:Return 352:sum = 0 186:, then 182:matrix 172:and an 168:matrix 158:for an 6884:LAPACK 6874:MATLAB 6711:  6682:  6664:  6643:  6580:  6522:  6417:  6369:  6329:  6290:  6157:  6147:  6139:  6113:Nature 6010:  5992:  5788:  5751:  5705:  5640:  5571:  5419:  5307:meshes 5041:Θ(log 5020:Here, 4797:, set 4274:, set 3719:. The 190:is an 6869:ATLAS 6709:S2CID 6680:S2CID 6641:S2CID 6623:arXiv 6549:arXiv 6443:arXiv 6439:(PDF) 6399:(PDF) 6299:(PDF) 6288:S2CID 6272:(PDF) 6229:(PDF) 6174:arXiv 6057:(9), 6047:(PDF) 6019:(PDF) 5978:(PDF) 5943:(PDF) 5919:arXiv 5786:S2CID 5672:(PDF) 5498:arXiv 5469:arXiv 5445:arXiv 5057:/log 4873:into 4830:into 4782:into 4778:adds 4450:into 4407:into 4364:into 4321:into 3857:, by 2708:2.807 2652:2 × 2 2444:tiled 1246:2 × 2 916:+ sum 857:from 824:from 797:from 759:into 586:tiled 530:> 476:cache 396:← sum 76:field 52:graph 6848:SIMD 6464:2014 6415:ISBN 6206:2022 6155:PMID 6137:ISSN 5749:ISBN 5638:ISBN 5611:2015 5569:ISBN 5417:ISBN 5090:and 5026:join 5022:fork 5010:Join 4988:add( 4985:Fork 4965:add( 4962:Fork 4942:add( 4939:Fork 4919:add( 4916:Fork 4768:add( 4741:add( 4735:Join 4703:Fork 4673:Fork 4643:Fork 4613:Fork 4583:Fork 4553:Fork 4523:Fork 4493:Fork 4001:The 3907:and 3865:and 3842:< 3438:2023 3170:2017 2995:1971 2820:1969 2800:Year 2738:BLAS 2262:) = 2250:max( 2112:) = 2100:max( 1956:) = 1944:max( 1914:max( 1846:and 901:Set 880:Set 863:min( 853:For 847:Let 830:min( 820:For 803:min( 793:For 728:and 682:For 667:For 652:For 639:= Θ( 627:Let 621:and 579:and 521:and 474:and 424:(in 415:time 388:Set 367:Set 356:For 350:Let 339:For 328:For 321:Let 315:and 297:and 144:The 58:and 46:and 6838:TLB 6701:doi 6672:doi 6645:955 6633:doi 6588:doi 6530:doi 6491:doi 6407:doi 6377:doi 6337:doi 6280:doi 6233:hdl 6145:PMC 6129:doi 6117:610 6064:= 2 6000:doi 5955:doi 5900:doi 5873:doi 5846:doi 5817:doi 5778:doi 5745:108 5713:doi 5676:hdl 5630:doi 5409:doi 5206:by 5142:by 5051:of 5039:of 4795:= 1 4272:= 1 3831:is 3643:log 3626:1.5 3584:log 3516:log 3502:1.5 3466:log 3375:log 3316:log 3248:log 3198:log 3109:log 3023:log 2934:log 2848:log 2674:log 1942:If 861:to 828:to 801:to 600:by 420:nmp 96:in 6911:: 6707:. 6697:41 6695:. 6678:. 6670:. 6658:34 6656:. 6639:. 6631:. 6619:35 6617:. 6586:. 6574:45 6572:. 6528:. 6514:. 6487:29 6485:. 6481:. 6451:. 6441:. 6413:. 6401:. 6375:. 6363:39 6361:. 6349:^ 6335:. 6323:64 6321:. 6307:^ 6294:. 6286:. 6274:. 6214:^ 6197:. 6153:. 6143:. 6135:. 6127:. 6115:. 6111:. 6098:^ 6081:. 6055:38 6053:, 6049:, 6028:, 6008:MR 6006:, 5998:, 5986:18 5984:, 5980:, 5949:, 5945:, 5894:. 5867:. 5840:. 5811:. 5807:. 5784:. 5774:13 5772:. 5747:. 5711:, 5697:, 5652:^ 5636:. 5601:. 5583:^ 5557:; 5553:; 5549:; 5536:^ 5520:. 5496:, 5482:^ 5467:, 5443:, 5431:^ 5415:. 5405:45 5380:^ 5182:Ω( 5150:/3 5138:/3 5110:Ω( 5069:Θ( 5053:Θ( 5000:22 4995:, 4993:22 4977:21 4972:, 4970:21 4954:12 4949:, 4947:12 4931:11 4926:, 4924:11 4906:22 4900:, 4897:21 4891:, 4888:12 4882:, 4879:11 4863:22 4857:, 4854:21 4848:, 4845:12 4839:, 4836:11 4817:11 4812:+ 4810:11 4805:← 4803:11 4772:, 4745:, 4725:22 4720:, 4718:22 4713:, 4711:22 4695:21 4690:, 4688:22 4683:, 4681:21 4665:22 4660:, 4658:12 4653:, 4651:12 4635:21 4630:, 4628:12 4623:, 4621:11 4605:12 4600:, 4598:21 4593:, 4591:22 4575:11 4570:, 4568:21 4563:, 4561:21 4545:12 4540:, 4538:11 4533:, 4531:12 4515:11 4510:, 4508:11 4503:, 4501:11 4483:22 4477:, 4474:21 4468:, 4465:12 4459:, 4456:11 4440:22 4434:, 4431:21 4425:, 4422:12 4416:, 4413:11 4397:22 4391:, 4388:21 4382:, 4379:12 4373:, 4370:11 4354:22 4348:, 4345:21 4339:, 4336:12 4330:, 4327:11 4309:× 4294:11 4289:× 4287:11 4282:← 4280:11 4263:: 4257:, 4253:, 4240:: 4207:22 4197:22 4184:12 4174:21 4162:21 4152:22 4139:11 4129:21 4115:22 4105:12 4092:12 4082:11 4070:21 4060:12 4047:11 4037:11 3929:. 3924:= 3922:AB 3913:Θ( 3903:, 3874:× 3610:12 3447:12 3342:12 3179:12 3135:15 3004:15 2960:18 2829:18 2258:, 2254:, 2108:, 2104:, 1952:, 1948:, 1922:, 1918:, 1904:× 1895:, 1890:× 1860:Θ( 1848:Θ( 1844:/2 1695:11 1689:11 1684:= 1682:11 1644:22 1634:22 1621:12 1611:21 1599:21 1589:22 1576:11 1566:21 1552:22 1542:12 1529:12 1519:11 1507:21 1497:12 1484:11 1474:11 1445:22 1433:21 1419:12 1407:11 1381:22 1369:21 1355:12 1343:11 1314:22 1302:21 1288:12 1276:11 1216:22 1204:21 1190:12 1178:11 1142:22 1130:21 1116:12 1104:11 1068:22 1056:21 1042:12 1030:11 941:Θ( 913:ij 909:← 906:ij 893:kj 889:× 886:ik 877:: 871:, 867:+ 844:: 838:, 834:+ 817:: 811:, 807:+ 776:, 745:, 714:, 694:: 679:: 664:: 611:: 565:Θ( 444:Θ( 437:× 418:Θ( 393:ij 380:kj 376:× 373:ik 364:: 347:: 336:: 195:× 177:× 163:× 155:AB 153:= 126:O( 114:O( 90:Θ( 83:× 6808:) 6804:( 6749:e 6742:t 6735:v 6715:. 6703:: 6686:. 6674:: 6647:. 6635:: 6625:: 6594:. 6590:: 6557:. 6551:: 6536:. 6532:: 6516:6 6499:. 6493:: 6466:. 6455:: 6445:: 6423:. 6409:: 6383:. 6379:: 6343:. 6339:: 6282:: 6239:. 6235:: 6208:. 6182:. 6176:: 6161:. 6131:: 6123:: 6092:. 6062:ω 6002:: 5957:: 5951:9 5927:. 5921:: 5906:. 5902:: 5896:5 5879:. 5875:: 5852:. 5848:: 5825:. 5819:: 5813:4 5792:. 5780:: 5757:. 5715:: 5699:4 5682:. 5678:: 5646:. 5632:: 5613:. 5577:. 5530:. 5500:: 5471:: 5447:: 5425:. 5411:: 5330:n 5323:n 5315:n 5313:× 5311:n 5281:p 5275:p 5270:) 5268:p 5266:/ 5264:n 5262:( 5260:O 5250:) 5248:p 5246:/ 5244:n 5242:( 5240:O 5235:) 5231:p 5226:/ 5224:n 5222:( 5220:O 5212:p 5201:p 5193:p 5186:) 5184:n 5178:) 5174:M 5169:/ 5167:n 5165:( 5163:O 5157:M 5148:M 5136:M 5114:) 5112:n 5092:B 5088:A 5084:C 5073:) 5071:n 5065:T 5061:) 5059:n 5055:n 5045:) 5043:n 5012:. 5004:. 5002:) 4997:T 4990:C 4981:. 4979:) 4974:T 4967:C 4958:. 4956:) 4951:T 4944:C 4935:. 4933:) 4928:T 4921:C 4909:. 4903:T 4894:T 4885:T 4876:T 4871:T 4866:. 4860:C 4851:C 4842:C 4833:C 4828:C 4814:t 4807:c 4800:c 4793:n 4784:C 4780:T 4776:) 4774:T 4770:C 4758:. 4756:T 4751:. 4749:) 4747:T 4743:C 4729:. 4727:) 4722:B 4715:A 4708:T 4699:. 4697:) 4692:B 4685:A 4678:T 4669:. 4667:) 4662:B 4655:A 4648:T 4639:. 4637:) 4632:B 4625:A 4618:T 4609:. 4607:) 4602:B 4595:A 4588:C 4579:. 4577:) 4572:B 4565:A 4558:C 4549:. 4547:) 4542:B 4535:A 4528:C 4519:. 4517:) 4512:B 4505:A 4498:C 4486:. 4480:T 4471:T 4462:T 4453:T 4448:T 4443:. 4437:C 4428:C 4419:C 4410:C 4405:C 4400:. 4394:B 4385:B 4376:B 4367:B 4362:B 4357:. 4351:A 4342:A 4333:A 4324:A 4319:A 4311:n 4307:n 4302:T 4291:b 4284:a 4277:c 4270:n 4261:) 4259:B 4255:A 4251:C 4215:) 4203:B 4193:A 4189:+ 4180:B 4170:A 4158:B 4148:A 4144:+ 4135:B 4125:A 4111:B 4101:A 4097:+ 4088:B 4078:A 4066:B 4056:A 4052:+ 4043:B 4033:A 4026:( 3970:Z 3966:2 3962:/ 3957:Z 3926:C 3917:) 3915:n 3909:C 3905:B 3901:A 3882:k 3876:k 3872:k 3797:) 3794:1 3791:( 3788:o 3785:+ 3778:n 3757:n 3751:n 3690:M 3687:5 3684:+ 3680:) 3674:M 3669:n 3664:2 3656:( 3647:2 3634:2 3630:n 3623:+ 3618:2 3614:n 3604:M 3596:7 3588:2 3578:) 3572:M 3567:n 3562:3 3554:( 3549:4 3528:n 3520:2 3510:2 3506:n 3499:+ 3494:2 3490:n 3486:4 3478:7 3470:2 3461:n 3457:5 3444:7 3422:M 3419:5 3416:+ 3412:) 3406:M 3401:n 3396:2 3388:( 3379:2 3366:2 3362:n 3358:3 3355:+ 3350:2 3346:n 3336:M 3328:7 3320:2 3310:) 3304:M 3299:n 3294:3 3286:( 3281:4 3260:n 3252:2 3242:2 3238:n 3234:3 3231:+ 3226:2 3222:n 3218:4 3210:7 3202:2 3193:n 3189:5 3176:7 3154:M 3151:3 3148:+ 3143:2 3139:n 3129:M 3121:7 3113:2 3103:) 3097:M 3092:n 3087:3 3079:( 3074:5 3051:2 3047:n 3043:5 3035:7 3027:2 3018:n 3014:6 3001:7 2979:M 2976:3 2973:+ 2968:2 2964:n 2954:M 2946:7 2938:2 2928:) 2922:M 2917:n 2912:3 2904:( 2899:6 2876:2 2872:n 2868:6 2860:7 2852:2 2843:n 2839:7 2826:7 2779:M 2759:n 2732:n 2713:) 2704:n 2700:( 2697:O 2691:) 2686:7 2678:2 2669:n 2665:( 2662:O 2638:. 2626:) 2617:n 2613:( 2610:O 2600:ω 2567:) 2558:M 2553:b 2548:p 2545:n 2542:m 2536:+ 2531:b 2527:p 2524:m 2521:+ 2518:p 2515:n 2512:+ 2509:n 2506:m 2500:+ 2497:p 2494:+ 2491:n 2488:+ 2485:m 2481:( 2464:b 2460:M 2416:2 2412:B 2406:2 2402:A 2398:+ 2393:1 2389:B 2383:1 2379:A 2375:= 2370:) 2362:2 2358:B 2348:1 2344:B 2337:( 2330:) 2322:2 2318:A 2310:1 2306:A 2299:( 2294:= 2291:C 2273:B 2269:A 2264:m 2260:p 2256:m 2252:n 2226:) 2218:2 2214:B 2210:A 2203:1 2199:B 2195:A 2189:( 2184:= 2179:) 2171:2 2167:B 2159:1 2155:B 2148:( 2143:A 2140:= 2137:C 2119:B 2114:p 2110:p 2106:m 2102:n 2076:) 2070:B 2065:2 2061:A 2053:B 2048:1 2044:A 2037:( 2032:= 2028:B 2022:) 2014:2 2010:A 2000:1 1996:A 1989:( 1984:= 1981:C 1963:A 1958:n 1954:p 1950:m 1946:n 1926:) 1924:p 1920:m 1916:n 1909:. 1906:p 1902:m 1897:B 1892:m 1888:n 1883:A 1864:) 1862:n 1852:) 1850:n 1842:n 1824:, 1821:) 1816:2 1812:n 1808:( 1802:+ 1799:) 1796:2 1792:/ 1788:n 1785:( 1782:T 1779:8 1776:= 1773:) 1770:n 1767:( 1764:T 1743:; 1740:) 1737:1 1734:( 1728:= 1725:) 1722:1 1719:( 1716:T 1703:n 1692:b 1686:a 1679:c 1652:) 1640:B 1630:A 1626:+ 1617:B 1607:A 1595:B 1585:A 1581:+ 1572:B 1562:A 1548:B 1538:A 1534:+ 1525:B 1515:A 1503:B 1493:A 1489:+ 1480:B 1470:A 1463:( 1458:= 1453:) 1441:B 1429:B 1415:B 1403:B 1396:( 1389:) 1377:A 1365:A 1351:A 1339:A 1332:( 1327:= 1322:) 1310:C 1298:C 1284:C 1272:C 1265:( 1250:n 1229:, 1224:) 1212:B 1200:B 1186:B 1174:B 1167:( 1162:= 1159:B 1155:, 1150:) 1138:A 1126:A 1112:A 1100:A 1093:( 1088:= 1085:A 1081:, 1076:) 1064:C 1052:C 1038:C 1026:C 1019:( 1014:= 1011:C 980:M 974:b 969:) 961:M 955:b 951:/ 947:n 932:C 911:C 904:C 891:B 884:A 875:) 873:m 869:T 865:K 859:K 855:k 842:) 840:p 836:T 832:J 826:J 822:j 815:) 813:n 809:T 805:I 799:I 795:i 786:T 784:+ 782:J 780:: 778:J 774:T 772:+ 770:I 768:: 766:I 762:C 755:T 753:+ 751:J 749:: 747:J 743:T 741:+ 739:K 737:: 735:K 731:B 724:T 722:+ 720:K 718:: 716:K 712:T 710:+ 708:I 706:: 704:I 700:A 692:T 688:m 684:K 677:T 673:p 669:J 662:T 658:n 654:I 648:) 644:M 637:T 629:C 623:B 619:A 606:M 595:M 581:B 577:A 569:) 567:n 561:B 557:B 553:A 544:b 540:/ 536:M 528:n 523:B 519:A 511:b 507:/ 503:M 495:b 491:M 448:) 446:n 439:n 435:n 422:) 406:C 391:C 378:B 371:A 362:m 358:k 345:p 341:j 334:n 330:i 323:C 317:B 313:A 303:p 299:j 295:n 291:i 274:. 269:j 266:k 262:b 256:k 253:i 249:a 243:m 238:1 235:= 232:k 224:= 219:j 216:i 212:c 197:p 193:n 188:C 184:B 179:p 175:m 170:A 165:m 161:n 151:C 130:) 128:n 118:) 116:n 94:) 92:n 85:n 81:n 72:n 20:)

Index

Coppersmith–Winograd algorithm
matrix multiplication
numerical algorithms
scientific computing
pattern recognition
graph
parallel
distributed
takes time
field
big O notation
Strassen's algorithm
computational complexity of matrix multiplication
asymptotic complexity
Williams
galactic algorithm
definition of matrix multiplication
time
asymptotic notation
algorithm analysis

row- and column-major order
memory access patterns
cache
row-major order, column-major order
fully associative cache
tiled
divide-and-conquer algorithm
block partitioning
recursively

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.

↑