Manycore processor

Manycore processors are special kinds of multi-core processors designed for a high degree of parallel processing, containing numerous simpler, independent processor cores (from a few tens of cores to thousands or more). Manycore processors are used extensively in embedded computers and high-performance computing.

Contrast with multicore architecture

Manycore processors are distinct from multi-core processors in being optimized from the outset for a higher degree of explicit parallelism, and for higher throughput (or lower power consumption) at the expense of latency and lower single-thread performance.

The broader category of multi-core processors, by contrast, is usually designed to run both parallel and serial code efficiently, and therefore places more emphasis on high single-thread performance (e.g. devoting more silicon to out-of-order execution, deeper pipelines, more superscalar execution units, and larger, more general caches) and on shared memory. These techniques devote runtime resources to discovering implicit parallelism in a single thread. They are used in systems that have evolved continuously (with backward compatibility) from single-core processors. Such processors usually have a 'few' cores (e.g. 2, 4, 8) and may be complemented by a manycore accelerator (such as a GPU) in a heterogeneous system.

Motivation

Cache coherency is an issue limiting the scaling of multicore processors. Manycore processors may bypass this with methods such as message passing, scratchpad memory, DMA, partitioned global address space, or read-only/non-coherent caches. A manycore processor using a network on a chip and local memories gives software the opportunity to explicitly optimise the spatial layout of tasks (e.g. as seen in tooling developed for TrueNorth).

Manycore processors may have more in common (conceptually) with technologies originating in high-performance computing, such as clusters and vector processors.

GPUs may be considered a form of manycore processor having multiple shader processing units, and only being suitable for highly parallel code (high throughput, but extremely poor single-thread performance).

Suitable programming models

- Message passing interface
- OpenCL or other APIs supporting compute kernels
- Partitioned global address space
- Actor model
- OpenMP
- Dataflow

Classes of manycore systems

- GPUs, which can be described as manycore vector processors
- Massively parallel processor array
- Asynchronous array of simple processors

Specific manycore architectures

- ZettaScaler, Japanese PEZY Computing 2,048-core modules
- Xeon Phi coprocessor, which has MIC (Many Integrated Cores) architecture
- Tilera
- Adapteva Epiphany Architecture, a manycore chip using PGAS scratchpad memory
- Coherent Logix hx3100 Processor, a 100-core DSP/GPP processor based on HyperX Architecture
- Movidius Myriad 2, a manycore vision processing unit (VPU)
- Kalray, a manycore PCI-e accelerator for data-intensive tasks
- Teraflops Research Chip, a manycore processor using message passing
- TrueNorth, an AI accelerator with a manycore network on a chip architecture
- Green arrays, a manycore processor using message passing aimed at low power applications
- Sunway SW26010, a 260-core manycore processor used in Sunway TaihuLight, at the time the top-ranked supercomputer
  - SW52020, an improved 520-core variant of the SW26010 with 512-bit SIMD (also adding support for half-precision), used in a prototype meant for an exascale system (and, in the future, a 10-exascale system). According to datacenterdynamics, China is rumored to already have two separate exascale systems in secret.
- Eyeriss, a manycore processor designed for running convolutional neural nets for embedded vision applications
- Graphcore, a manycore AI accelerator

Specific manycore computers with 1M+ CPU cores

A number of computers built from multicore processors have one million or more individual CPU cores. Examples include:

- Gyoukou (Japanese: 暁光, Hepburn: gyōkō, dawn light), a supercomputer developed by ExaScaler and PEZY Computing, with 20,480,000 processing elements in total plus 1,250 Intel Xeon D host processors.
- SpiNNaker, a massively parallel (1 million CPU cores) ARM-based manycore processor built as part of the Human Brain Project.

Specific computers with 5 million or more CPU cores

Quite a few supercomputers have over 5 million CPU cores. When coprocessors such as GPUs are also present, their cores are not included in the core count; if they were, quite a few more computers would reach these figures.

- Frontier
- Fugaku, a Japanese supercomputer using Fujitsu A64FX ARM-based cores, 7,630,848 in total.
- Sunway TaihuLight, a massively parallel (10 million CPU cores) Chinese supercomputer, once one of the fastest supercomputers in the world, using a custom manycore architecture. As of November 2018, it was the world's third fastest supercomputer (as ranked by the TOP500 list), obtaining its performance from 40,960 SW26010 manycore processors, each containing 256 compute cores plus 4 management cores.

See also

- Multi-core processor
- Vector processor
- SIMD
- High-performance computing
- Computer cluster
- Multiprocessor system on a chip
- Vision processing unit
- Memory access pattern
- Cache coherency
- Embarrassingly parallel
- Massively parallel
- CUDA
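
The message-passing approach mentioned under Motivation (private local memories plus explicit messages instead of a coherent shared cache) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `core` and `sum_of_squares` are hypothetical, and Python threads with queues merely stand in for hardware cores and an on-chip network. Under CPython the threads do not run in parallel, so this shows the programming model only, not the performance of any real manycore chip.

```python
import queue
import threading


def core(core_id, inbox, outbox):
    """One simulated core (hypothetical helper, for illustration only).

    It owns a private 'scratchpad' list and talks to the outside world
    only through its inbox and outbox queues, never via shared data.
    """
    scratchpad = []                      # local memory, private to this core
    while True:
        msg = inbox.get()
        if msg is None:                  # sentinel: no more data will arrive
            break
        scratchpad.extend(msg)           # copy the message into local memory
    # Send the partial result back as a message.
    outbox.put((core_id, sum(x * x for x in scratchpad)))


def sum_of_squares(data, n_cores=4):
    """Scatter data to n_cores simulated cores and gather partial sums."""
    inboxes = [queue.Queue() for _ in range(n_cores)]
    results = queue.Queue()
    workers = [
        threading.Thread(target=core, args=(i, inboxes[i], results))
        for i in range(n_cores)
    ]
    for w in workers:
        w.start()
    # Explicit data placement: each core receives its own copy of a slice.
    for i, inbox in enumerate(inboxes):
        inbox.put(list(data[i::n_cores]))
        inbox.put(None)
    for w in workers:
        w.join()
    # Gather exactly one partial result per core.
    partials = dict(results.get() for _ in range(n_cores))
    return sum(partials.values())


if __name__ == "__main__":
    print(sum_of_squares(list(range(10))))
```

Because each simulated core only ever touches its own scratchpad, no coherence mechanism is needed between cores; the cost is that the program must decide explicitly which data goes to which core, which is the trade-off the Motivation section describes.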

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
