BLIS
Numerical software library

Original author(s): Science of High-Performance Computing (SHPC) group, UT-Austin
Developer(s): Field Van Zee and Devin Matthews
Initial release: November 9, 2013
Stable release: 1.0 / May 6, 2024
Repository: www.github.com/flame/blis
Operating system: Linux, Microsoft Windows, macOS, FreeBSD
Platform: x86-64, ARM, ARM64, ...
Type: Linear algebra library; implementation of BLAS
License: new/modified/3-clause BSD License
Website: www.github.com/flame/blis

In scientific computing, BLIS (BLAS-like Library Instantiation Software) is an open-source framework for implementing a superset of BLAS (Basic Linear Algebra Subprograms) functionality for specific processor types that was awarded the J. H. Wilkinson Prize for Numerical Software in 2023. It exposes that functionality through two traditional Application Programming Interfaces (APIs): the BLAS interface and the CBLAS interface. BLIS also includes two APIs native to the framework: a typed (BLAS-like) API and an object API. These native interfaces provide access to BLAS-like functionality that is not supported by, but closely related to, operations found in the BLAS (and CBLAS). The framework is developed and supported by the Science of High-Performance Computing (SHPC) group of the Oden Institute at The University of Texas at Austin and the Matthews Research Group at Southern Methodist University.

BLIS yields high performance on many current CPU microarchitectures in both single-threaded and multithreaded modes of execution. BLIS also offers competitive performance for some cases of matrix multiplication in which one or more matrix operands are unusually skinny and/or small.

The framework achieves high performance by employing specialized kernels (typically written in GNU extended inline assembly syntax) along with cache and register blocking through matrix operands. BLIS also works on processors for which custom kernels have not yet been written; in those cases, the framework relies upon portable kernel implementations that perform at a lower rate of computation.

BLIS is sometimes described as a refactoring of GotoBLAS2, which was created by Kazushige Goto at the Texas Advanced Computing Center.

See also
Automatically Tuned Linear Algebra Software (ATLAS)
OpenBLAS
Intel Math Kernel Library (MKL)

References
"Releases · flame/blis – GitHub".
Van Zee, Field; van de Geijn, Robert (2015). "BLIS: A Framework for Rapidly Instantiating BLAS Functionality". ACM Transactions on Mathematical Software. 41 (3): 1–33. doi:10.1145/2764454.
Van Zee, Field; Smith, Tyler; Igual, Francisco; Smelyanskiy, Mikhail; Zhang, Xiangyi; Kistler, Michael; Austel, Vernon; Gunnels, John; Low, Tze Meng; Marker, Bryan; Killough, Lee; van de Geijn, Robert (2016). "The BLIS Framework: Experiments in Portability". ACM Transactions on Mathematical Software. 42 (2): 1–19. doi:10.1145/2755561.
Smith, Tyler M.; van de Geijn, Robert; Smelyanskiy, Mikhail; Hammond, Jeff R.; Van Zee, Field G. (2014). "Anatomy of High-Performance Many-Threaded Matrix Multiplication". 2014 IEEE 28th International Parallel and Distributed Processing Symposium. pp. 1049–1059. doi:10.1109/IPDPS.2014.110. ISBN 978-1-4799-3800-1.
Low, Tze Meng; Igual, Francisco; Smith, Tyler; Quintana, Enrique (2016). "Analytical Modeling is Enough for High-Performance BLIS". ACM Transactions on Mathematical Software. 43 (2): 1–18. doi:10.1145/2925987. hdl:10234/163618.
James H. Wilkinson Prize for Numerical Software, SIAM · Prizes & Recognition · Major Prizes & Lectures.
Performance.md, flame/blis on GitHub.
PerformanceSmall.md, flame/blis on GitHub.
Goto, Kazushige; van de Geijn, Robert A. (2008). "Anatomy of high-performance matrix multiplication". ACM Transactions on Mathematical Software. 34 (3): 1–25. doi:10.1145/1356052.1356053.

External links
Official website
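
The cache and register blocking that the article attributes to BLIS can be illustrated with a small, portable C sketch. This is not BLIS code and does not use the BLIS API: the function names (gemm_reference, gemm_blocked, micro_kernel) and the block sizes MC, KC, NC, MR and NR are hypothetical choices made for this example, and operand packing and assembly kernels are left out. It only shows the loop structure in which cache-sized blocks are cut into small tiles that a micro-kernel updates.

```c
#include <stddef.h>

/* Hypothetical block sizes for illustration only; a tuned library picks
   these per processor. MC/KC/NC bound cache blocks, MR/NR bound the tile
   that the micro-kernel keeps in local accumulators. */
enum { MC = 64, KC = 64, NC = 64, MR = 4, NR = 4 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Reference triple loop: C += A * B, all matrices row-major.
   A is m x k, B is k x n, C is m x n. */
void gemm_reference(size_t m, size_t n, size_t k,
                    const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += A[i * k + p] * B[p * n + j];
            C[i * n + j] += sum;
        }
}

/* Portable micro-kernel: updates an mr x nr tile of C (mr <= MR, nr <= NR)
   using kc rank-1 updates accumulated in a small local array. */
static void micro_kernel(size_t mr, size_t nr, size_t kc,
                         const double *A, size_t lda,
                         const double *B, size_t ldb,
                         double *C, size_t ldc)
{
    double acc[MR][NR] = {{0.0}};
    for (size_t p = 0; p < kc; ++p)
        for (size_t i = 0; i < mr; ++i)
            for (size_t j = 0; j < nr; ++j)
                acc[i][j] += A[i * lda + p] * B[p * ldb + j];
    for (size_t i = 0; i < mr; ++i)
        for (size_t j = 0; j < nr; ++j)
            C[i * ldc + j] += acc[i][j];
}

/* Blocked C += A * B: three outer loops pick cache-sized blocks, two inner
   loops walk register-sized tiles inside the current block. */
void gemm_blocked(size_t m, size_t n, size_t k,
                  const double *A, const double *B, double *C)
{
    for (size_t jc = 0; jc < n; jc += NC) {
        size_t nc = min_sz(NC, n - jc);
        for (size_t pc = 0; pc < k; pc += KC) {
            size_t kc = min_sz(KC, k - pc);
            for (size_t ic = 0; ic < m; ic += MC) {
                size_t mc = min_sz(MC, m - ic);
                for (size_t jr = 0; jr < nc; jr += NR) {
                    size_t nr = min_sz(NR, nc - jr);
                    for (size_t ir = 0; ir < mc; ir += MR) {
                        size_t mr = min_sz(MR, mc - ir);
                        micro_kernel(mr, nr, kc,
                                     &A[(ic + ir) * k + pc], k,
                                     &B[pc * n + (jc + jr)], n,
                                     &C[(ic + ir) * n + (jc + jr)], n);
                    }
                }
            }
        }
    }
}

/* Self-check on sizes that are not multiples of the block sizes.
   Inputs are small integers, so both results are exact and can be
   compared with ==. Returns 1 on agreement, 0 otherwise. */
int blocked_gemm_matches_reference(void)
{
    enum { M = 70, N = 67, K = 130 };
    static double A[M * K], B[K * N], C1[M * N], C2[M * N];
    for (size_t i = 0; i < (size_t)(M * K); ++i) A[i] = (double)(i % 7) - 3.0;
    for (size_t i = 0; i < (size_t)(K * N); ++i) B[i] = (double)(i % 5) - 2.0;
    for (size_t i = 0; i < (size_t)(M * N); ++i) { C1[i] = 0.0; C2[i] = 0.0; }
    gemm_reference(M, N, K, A, B, C1);
    gemm_blocked(M, N, K, A, B, C2);
    for (size_t i = 0; i < (size_t)(M * N); ++i)
        if (C1[i] != C2[i]) return 0;
    return 1;
}
```

Calling blocked_gemm_matches_reference() from a test program compares the blocked routine with the plain triple loop. In this sketch the micro-kernel plays the role of the portable fallback described above; a processor-specific kernel would replace it while the surrounding loops stay the same.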