142:) have a value larger than or equal to 8 in order to avoid overlap with existing instructions. The C4 byte used in the VEX scheme has no such restriction. This may prevent the use of the m-bits for other purposes in the future in the XOP scheme, but not in the VEX scheme. Another possible problem is that the pp bits have the value 00 in the XOP scheme, while they have the value 01 in the VEX scheme for instructions that have no legacy equivalent. This may complicate the use of the pp bits for other purposes in the future.
1004:. Like the AVX instruction VPBLENDVB, it is a four-operand instruction with three source operands and a destination. For each bit in the third operand (which acts as a selector), 1 selects the same bit in the first source, and 0 selects the same bit in the second source. When used together with the XOP vector comparison instructions above, this can be used to implement a vectorized ternary move or, if the second input is the same as the destination, a conditional move (
1198:. It takes three registers as input: the first two are source registers and the third is the selector register. Each byte in the selector selects one of the bytes in one of the two input registers for the output. The selector can also apply effects to the selected byte, such as setting it to 0, reversing the bit order, or repeating the most-significant bit. In addition, each of the effects or the input can be inverted.
476:
Horizontal addition instructions add adjacent values in the input vector to each other. The output size in the instructions below describes how wide the horizontal addition is. For instance, horizontal byte to word adds two bytes at a time and returns the result as a vector of words, but byte
816:
These vector compare instructions all take an immediate as an extra argument. The immediate controls what kind of comparison is performed. Eight comparisons are possible for each instruction. The vectors are compared, and all comparisons that evaluate to true set all corresponding bits in
1042:
in that they can shift each unit by a different amount using a vector register interpreted as packed signed integers. The sign indicates the direction of the shift or rotate, with positive values causing a left shift and negative values a right shift. Intel has specified a different incompatible set of variable
134:
Commentators have seen this as evidence that Intel has not allowed AMD to use any part of the large VEX coding space. AMD has been forced to use different codes in order to avoid using any code combination that Intel might possibly be using in its development pipeline for something else. The XOP
1419:
Byte value 0x8F is an existing opcode for a POP instruction. This instruction uses the ModR/M byte, which follows the opcode, but it does not make use of the "reg" (register) field, which is bits 3-5. Some opcodes which don't use "reg" multiplex instructions by using these bits to signify eight
135:
coding scheme is as close to the VEX scheme as technically possible without risking that the AMD codes overlap with future Intel codes. This inference is speculative, since no public information is available about negotiations between the two companies on this issue.
149:
instruction sets. Intel initially proposed FMA4 in AVX/FMA specification version 3 to supersede the 3-operand FMA proposed by AMD in SSE5. After AMD adopted FMA4, Intel canceled FMA4 support and reverted to FMA3 in the AVX/FMA specification version 5 (See
1420:
different instructions (0x80-0x83 and 0xD0-0xDF, among others); 0x8F does not. This means that, for a standard POP instruction, bits 3-5 should always be zero. Since the m-bits are bits 0-4, requiring a value of 8 or higher guarantees that bit 3 or bit 4 of the byte following 0x8F is set, which never happens in a standard POP encoding.
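The distinction between a POP and an XOP prefix can be sketched in a few lines of Python. This is a simplified illustration rather than a real decoder: the function name `classify_after_8f` is hypothetical, and it inspects only the single byte that follows 0x8F, using the bit positions given above (ModR/M "reg" in bits 3-5, m-bits in bits 0-4).

```python
def classify_after_8f(next_byte: int) -> str:
    """Classify the byte that follows opcode 0x8F (simplified sketch).

    Hypothetical helper for illustration only; it ignores operating
    modes and every later byte of the instruction.
    """
    if not 0 <= next_byte <= 0xFF:
        raise ValueError("expected a single byte value")

    reg_field = (next_byte >> 3) & 0b111  # ModR/M "reg" field, bits 3-5
    m_bits = next_byte & 0b11111          # XOP m-bits, bits 0-4

    if reg_field == 0:
        # A standard POP leaves the reg field at zero, so m_bits < 8 here.
        return "POP"
    if m_bits >= 8:
        # Bit 3 or bit 4 is set, which a POP encoding never has.
        return "XOP"
    # Only bit 5 is set: neither a standard POP nor an XOP prefix.
    return "neither"
```

For example, `classify_after_8f(0xC0)` yields `"POP"` (reg field zero), while `classify_after_8f(0xE8)` yields `"XOP"` because its low five bits equal 8.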
62:. Most of the instructions are integer instructions, but it also contains floating point permutation and floating point fraction extraction instructions. See the index for a list of instruction types.
1512:
But with Zen being a clean-sheet design, there are some instruction set extensions found in
Bulldozer processors not found in Zen/znver1. Those no longer present include FMA4 and XOP.
477:
to quadword adds eight bytes together at a time and returns the result as a vector of quadwords. Six additional horizontal addition and subtraction instructions can be found in
2118:
2487:
1825:
1448:
158:
817:
the destination to 1, and false comparisons set the same bits to 0. This result can be used directly in a VPCMOV instruction for a vectorized
1737:
1973:
1549:
2111:
58:
The XOP instruction set contains several different types of vector instructions since it was originally intended as a major upgrade to
2212:
2130:
2493:
2372:
2248:
2138:
2080:
2068:
2063:
2058:
2053:
2503:
2142:
1789:
1678:
1275:
These instructions extract the fractional part of a floating-point value, that is, the part that would be lost in conversion to an integer.
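As a rough scalar model of fraction extraction, the following Python sketch subtracts the truncated integer part from the input. It assumes that "conversion to integer" means truncation toward zero, and it does not model how the hardware treats NaN or infinite inputs; the function names are hypothetical.

```python
import math


def extract_fraction(x: float) -> float:
    """Return the part of x that truncation toward zero would discard.

    Illustrative model only: math.trunc raises an exception for NaN and
    infinities, which says nothing about what the instructions do.
    """
    return x - math.trunc(x)


def extract_fraction_packed(values):
    """Apply the scalar model to every element of a packed vector."""
    return [extract_fraction(v) for v in values]
```

Under this model the sign of the input carries over to the result, so `extract_fraction(-1.5)` gives `-0.5`.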
1705:
1763:
1357:
2104:
1351:
1363:
1345:
2389:
1818:
1501:
2347:
2311:
2021:
1543:
1380:
1218:
128:
116:
108:
81:
1572:
2566:
2556:
2462:
2418:
2273:
59:
36:
161:, its third-generation x86-64 architecture in its first iteration (znver1 – Zen, version 1), will not support
89:
2561:
2468:
2397:
2168:
2163:
1811:
52:
2236:
1336:
32:
1454:
157:
In March 2015, AMD explicitly revealed in the description of the patch for the GNU Binutils package that
2438:
2261:
2031:
1948:
1385:
93:
48:
123:
equivalents in AVX were classified as the XOP extension. The XOP instructions have an opcode byte 8F (
2479:
2450:
2195:
1651:
1479:
AMD64 Architecture
Programmer's Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions
1390:
2432:
2329:
2085:
2046:
2041:
2036:
1859:
1741:
186:
151:
146:
112:
85:
2523:
2517:
2511:
51:
processor core, which was released on
October 12, 2011. However, AMD removed support for XOP from
1943:
2402:
2200:
2180:
2364:
818:
1619:
1599:
115:
instruction sets announced by Intel have been changed to use the coding proposed by Intel.
2127:
1769:
28:
1652:"AMD64 Architecture Programmer's Manual, Volume4: 128-Bit and 256-Bit Media Instructions"
1968:
177:
instructions developed specifically for the "Bulldozer" family of micro-architectures.
97:
84:, parts that overlapped with AVX were removed or moved to separate standards such as
2550:
1983:
1963:
1958:
1713:
1528:
2426:
1624:
1604:
124:
1978:
1918:
1913:
1869:
481:, but they operate on two input vectors and only add or subtract values in pairs.
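The widening horizontal additions in this section can be modelled as summing fixed-size groups of adjacent elements. The Python sketch below is a behavioural illustration under that reading, with a hypothetical function name; it works on plain integer lists and does not model element widths or signedness.

```python
def horizontal_add(values, group):
    """Sum each run of `group` adjacent elements (illustrative model).

    group=2 on 16 bytes mirrors a byte-to-word addition (8 results);
    group=8 on 16 bytes mirrors a byte-to-quadword addition (2 results).
    """
    if group <= 0 or len(values) % group != 0:
        raise ValueError("vector length must be a multiple of the group size")
    return [sum(values[i:i + group]) for i in range(0, len(values), group)]
```

With sixteen input bytes, `horizontal_add(data, 2)` returns eight sums and `horizontal_add(data, 8)` returns two, matching the result counts listed in the table.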
139:
2096:
107:
All SSE5 instructions that were equivalent or similar to instructions in the
2456:
2378:
2219:
2156:
2151:
1953:
1923:
1524:
1477:
1190:
instructions PALIGNR and PSHUFB and adds more to both. Some compare it to the
1993:
1988:
1506:
2341:
2224:
2207:
2190:
2173:
1897:
1578:
1191:
2444:
2407:
2267:
2016:
2011:
1864:
1854:
1706:"Intel Architecture Instruction Set Extensions Programming Reference"
2353:
2291:
2231:
1849:
1709:
1187:
478:
390:
Multiply
Accumulate (with Saturation) High Doubleword to Quadword
348:
Multiply
Accumulate (with Saturation) Low Doubleword to Quadword
101:
44:
1803:
1225:
they can select output from any of the fields in the two inputs.
2335:
2317:
2303:
2297:
2285:
2279:
2185:
2075:
1892:
1502:"The Impact Of GCC Zen Compiler Tuning On AMD Ryzen Performance"
1395:
1039:
1001:
190:
77:
71:
2100:
1807:
305:
Multiply
Accumulate (with Saturation) Doubleword to Doubleword
2252:
1834:
1685:
1658:
1486:
1400:
40:
432:
Multiply Add
Accumulate (with Saturation) Word to Doubleword
262:
Multiply
Accumulate (with Saturation) Low Word to Doubleword
145:
A similar compatibility issue is the difference between the
685:
Horizontal add two signed/unsigned doublewords to quadword
76:
XOP is a revised subset of what was originally intended as
1324:
Extract
Fraction Scalar Single-Precision Floating Point
1314:
Extract
Fraction Scalar Double-Precision Floating-Point
1304:
Extract Fraction Packed Single-Precision Floating-Point
1294:
Extract Fraction Packed Double-Precision Floating-Point
546:
Horizontal add four signed/unsigned bytes to doubleword
80:. It was changed to be similar but not overlapping with
785:
Horizontal subtract two signed doublewords to quadword
612:
Horizontal add two signed/unsigned words to doubleword
581:
Horizontal add eight signed/unsigned bytes to quadword
1620:"[PATCH] Remove CpuFMA4 From Znver1 CPU Flags"
1574:
Intel Advanced Vector Extensions Programming Reference
1000:
works as a bitwise variant of the blend instructions in
651:
Horizontal add four signed/unsigned words to quadword
189:. These are all four-operand instructions similar to
138:
The use of the 8F byte requires that the m-bits (see
1442:
1440:
1438:
1436:
2502:
2478:
2416:
2388:
2363:
2247:
2137:
2004:
1936:
1906:
1885:
1878:
1842:
1264:Permute Two-Source Single-Precision Floating-Point
1254:Permute Two-Source Double-Precision Floating-Point
750:Horizontal subtract two signed words to doubleword
219:Multiply Accumulate (with Saturation) Word to Word
127:), but otherwise an almost identical coding scheme to
16:Computer instruction set introduced by AMD in 2009
1038:The shift instructions here differ from those in
507:Horizontal add two signed/unsigned bytes to word
100:floating-point conversion implemented as F16C by
1472:
1470:
1679:"New "Bulldozer" and "Piledriver" Instructions"
181:Integer vector multiply–accumulate instructions
35:on May 1, 2009, is an extension to the 128-bit
2375:(ABM: 2007, BMI1: 2012, BMI2: 2013, TBM: 2012)
2112:
1819:
1791:New "Bulldozer" and "Piledriver" Instructions
1646:
1644:
1642:
1640:
1638:
1636:
1634:
715:Horizontal subtract two signed bytes to word
8:
1209:instructions are two source versions of the
1034:Integer vector shift and rotate instructions
1598:Ganesh Gopalasubramanian (March 10, 2015).
2119:
2105:
2097:
1882:
1826:
1812:
1804:
1227:
1186:is a single instruction that combines the
1045:
915:
823:
483:
193:and they all operate on signed integers.
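The per-element operation in the table, r = a * b + c on signed integers, can be sketched as follows. This Python model is an illustration with a hypothetical function name: it treats the result as a signed value of `bits` bits and either wraps or saturates it, without modelling how the hardware handles intermediate precision.

```python
def multiply_accumulate(a, b, c, bits=16, saturate=False):
    """Compute a[i] * b[i] + c[i] per element as signed `bits`-bit results.

    Illustrative model: saturate=True clamps to the signed range,
    otherwise the result wraps around modulo 2**bits.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    results = []
    for x, y, z in zip(a, b, c):
        r = x * y + z
        if saturate:
            r = max(lo, min(hi, r))        # clamp to [lo, hi]
        else:
            r = (r - lo) % (1 << bits) + lo  # two's-complement wrap
        results.append(r)
    return results
```

Passing `bits=32` gives a model of the word-to-doubleword forms, where the accumulator and result are wider than the multiplied inputs.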
1277:
1010:
195:
2435:(2008); ARMv8 also has AES instructions
1432:
1412:
1600:"[PATCH] add znver1 processor"
1768:, AMD Developer blogs, archived from
1453:, AMD Developer blogs, archived from
7:
1122:Packed Shift Arithmetic Doublewords
900:Compare Vector Unsigned Doublewords
1043:vector shift instructions in AVX2.
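The sign-controlled direction described for the XOP shifts can be illustrated with a small Python model of a logical shift. It is a sketch under stated assumptions: the function name is hypothetical, elements are treated as unsigned values of `width` bits, and counts whose magnitude reaches the element width are rejected rather than modelled.

```python
def variable_shift_logical(values, counts, width=16):
    """Shift each element by its own signed count (illustrative model).

    A positive count shifts left, a negative count shifts right, as
    described in the text; bits shifted out are discarded.
    """
    mask = (1 << width) - 1
    results = []
    for value, count in zip(values, counts):
        if abs(count) >= width:
            raise ValueError("count magnitude outside the modelled range")
        value &= mask
        if count >= 0:
            results.append((value << count) & mask)
        else:
            results.append(value >> -count)
    return results
```

Each element uses its own count, so a single call can shift one lane left and its neighbour right.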
1271:Floating-point fraction extraction
1132:Packed Shift Arithmetic Quadwords
910:Compare Vector Unsigned Quadwords
860:Compare Vector Signed Doublewords
472:Integer vector horizontal addition
14:
1500:Michael Larabel (March 3, 2017).
1366:processors (including "v2"), 2015
1162:Packed Shift Logical Doublewords
185:These are integer versions of the
2535:Suspended extensions' dates are
870:Compare Vector Signed Quadwords
1545:Intel AVX Programming Reference
1172:Packed Shift Logical Quadwords
1577:, January 2009, archived from
1112:Packed Shift Arithmetic Words
1102:Packed Shift Arithmetic Bytes
890:Compare Vector Unsigned Words
880:Compare Vector Unsigned Bytes
1:
1738:"Buldozer x264 optimisations"
1618:Amit Pawar (August 7, 2015).
1342:"Heavy Equipment" processors
1762:Dave Christie (2009-05-07),
1548:, March 2008, archived from
1530:Stop the instruction set war
1447:Dave Christie (2009-05-07),
850:Compare Vector Signed Words
840:Compare Vector Signed Bytes
595:r0 = a0+a1+a2+a3+a4+a5+a6+a7
131:with the 3-byte VEX prefix.
1152:Packed Shift Logical Words
1142:Packed Shift Logical Bytes
454:r0 = a0 * b0 + a1 * b1 + c0
2583:
1082:Packed Rotate Doublewords
69:
2533:
1318:
1308:
1298:
1288:
1283:
1280:
1258:
1248:
1238:
1233:
1230:
1166:
1156:
1146:
1136:
1126:
1116:
1106:
1096:
1086:
1076:
1066:
1056:
1051:
1048:
921:
918:
904:
894:
884:
874:
864:
854:
844:
834:
829:
826:
779:
744:
709:
675:
641:
602:
571:
536:
497:
492:
489:
486:
422:
380:
338:
295:
252:
209:
204:
201:
198:
39:core instructions in the
2332:(FMA4: 2011, FMA3: 2012)
1092:Packed Rotate Quadwords
1027:Vector Conditional Move
47:instruction set for the
2390:Compressed instructions
993:Vector conditional move
88:(floating-point vector
53:Zen (microarchitecture)
954:Greater Than or Equal
812:Integer vector compare
1949:High Bandwidth Memory
2480:Transactional memory
1581:on February 29, 2012
1527:(December 5, 2009),
1244:Packed Permute Byte
1072:Packed Rotate Words
1062:Packed Rotate Bytes
443:) + 4 doublewords (
316:) + 4 doublewords (
273:) + 4 doublewords (
1797:, AMD, October 2012
1719:on February 1, 2014
1360:processors, Q1 2014
1354:processors, Q4 2012
1348:processors, Q4 2011
938:Less Than or Equal
757:) → 4 doublewords (
619:) → 4 doublewords (
553:) → 4 doublewords (
447:) → 4 doublewords (
320:) → 4 doublewords (
277:) → 4 doublewords (
187:FMA instruction set
90:multiply–accumulate
25:eXtended Operations
1765:Striking a balance
1450:Striking a balance
401:) + 2 quadwords (
393:2x4 doublewords (
359:) + 2 quadwords (
351:2x4 doublewords (
308:2x4 doublewords (
2544:
2543:
2094:
2093:
1932:
1931:
1358:Steamroller-based
1328:
1327:
1268:
1267:
1221:which means like
1176:
1175:
1031:
1030:
990:
989:
914:
913:
809:
808:
792:) → 2 quadwords (
692:) → 2 quadwords (
658:) → 2 quadwords (
588:) → 2 quadwords (
469:
468:
416:r1 = a3 * b3 + c1
412:r0 = a1 * b1 + c0
405:) → 2 quadwords (
374:r1 = a2 * b2 + c1
370:r0 = a0 * b0 + c0
363:) → 2 quadwords (
331:r1 = a1 * b1 + c1
327:r0 = a0 * b0 + c0
288:r1 = a2 * b2 + c1
284:r0 = a0 * b0 + c0
245:r1 = a1 * b1 + c1
241:r0 = a0 * b0 + c0
140:VEX coding scheme
2574:
2567:AMD technologies
2557:X86 instructions
2365:Bit manipulation
2121:
2114:
2107:
2098:
1883:
1828:
1821:
1814:
1805:
1799:
1798:
1796:
1786:
1780:
1779:
1778:
1777:
1759:
1753:
1752:
1750:
1749:
1740:. Archived from
1734:
1728:
1727:
1725:
1724:
1718:
1712:. Archived from
1702:
1696:
1695:
1693:
1692:
1683:
1675:
1669:
1668:
1666:
1665:
1656:
1648:
1629:
1628:
1615:
1609:
1608:
1595:
1589:
1588:
1587:
1586:
1569:
1563:
1562:
1561:
1560:
1554:
1540:
1534:
1533:
1521:
1515:
1514:
1497:
1491:
1490:
1484:
1474:
1465:
1464:
1463:
1462:
1444:
1421:
1417:
1352:Piledriver-based
1321:
1311:
1301:
1291:
1278:
1261:
1251:
1241:
1228:
1224:
1217:instructions in
1216:
1212:
1208:
1204:
1197:
1185:
1169:
1159:
1149:
1139:
1129:
1119:
1109:
1099:
1089:
1079:
1069:
1059:
1046:
1024:
1011:
1007:
999:
916:
907:
897:
887:
877:
867:
857:
847:
837:
824:
819:conditional move
804:
800:
795:
791:
788:4 doublewords (
782:
773:
769:
765:
760:
756:
747:
738:
734:
730:
725:
721:
712:
704:
700:
695:
691:
688:4 doublewords (
682:
678:
670:
669:r1 = a4+a5+a6+a7
666:
665:r0 = a0+a1+a2+a3
661:
657:
648:
644:
635:
631:
627:
622:
618:
609:
605:
596:
591:
587:
578:
574:
565:
564:r1 = a4+a5+a6+a7
561:
560:r0 = a0+a1+a2+a3
556:
552:
543:
539:
530:
526:
522:
517:
513:
504:
500:
484:
463:
459:
455:
450:
446:
442:
438:
429:
425:
417:
413:
408:
404:
400:
396:
387:
383:
375:
371:
366:
362:
358:
354:
345:
341:
332:
328:
323:
319:
315:
311:
302:
298:
289:
285:
280:
276:
272:
268:
259:
255:
246:
242:
237:
233:
229:
225:
216:
212:
196:
176:
172:
168:
164:
1794:
1788:
1787:
1783:
1775:
1773:
1761:
1760:
1756:
1747:
1745:
1736:
1735:
1731:
1722:
1720:
1716:
1704:
1703:
1699:
1690:
1688:
1681:
1677:
1676:
1672:
1663:
1661:
1654:
1650:
1649:
1632:
1627:(Mailing list).
1617:
1616:
1612:
1607:(Mailing list).
1597:
1596:
1592:
1584:
1582:
1571:
1570:
1566:
1558:
1556:
1552:
1542:
1541:
1537:
1523:
1522:
1518:
1499:
1498:
1494:
1482:
1476:
1475:
1468:
1460:
1458:
1446:
1445:
1434:
1430:
1425:
1424:
1418:
1414:
1409:
1377:
1364:Excavator-based
1346:Bulldozer-based
1333:
1319:
1309:
1299:
1289:
1273:
1259:
1249:
1239:
1222:
1214:
1210:
1206:
1202:
1195:
1183:
1181:
1167:
1157:
1147:
1137:
1127:
1117:
1107:
1097:
1087:
1077:
1067:
1057:
1036:
1022:
1005:
997:
995:
905:
895:
885:
875:
865:
855:
845:
835:
814:
802:
798:
793:
789:
780:
771:
767:
763:
758:
754:
745:
736:
732:
728:
723:
719:
710:
702:
698:
693:
689:
680:
676:
668:
664:
659:
655:
646:
642:
633:
629:
625:
620:
616:
607:
603:
594:
589:
585:
576:
572:
563:
559:
554:
550:
541:
537:
528:
524:
520:
515:
511:
502:
498:
474:
461:
457:
453:
448:
444:
440:
436:
427:
423:
415:
411:
406:
402:
398:
394:
385:
381:
373:
369:
364:
360:
356:
352:
343:
339:
330:
326:
321:
317:
313:
309:
300:
296:
287:
283:
278:
274:
270:
266:
257:
253:
244:
240:
235:
231:
227:
223:
214:
210:
183:
174:
170:
166:
162:
74:
68:
31:, announced by
29:instruction set
17:
12:
11:
5:
1800:
1781:
1754:
1729:
1697:
1670:
1630:
1610:
1590:
1564:
1535:
1516:
1492:
1466:
1431:
1429:
1426:
1423:
1422:
1411:
1410:
1408:
1405:
1404:
1403:
1398:
1393:
1388:
1383:
1376:
1373:
1372:
1371:
1370:
1369:
1368:
1367:
1361:
1355:
1349:
1332:
1329:
1326:
1325:
1322:
1316:
1315:
1312:
1306:
1305:
1302:
1296:
1295:
1292:
1286:
1285:
1282:
1272:
1269:
1266:
1265:
1262:
1256:
1255:
1252:
1246:
1245:
1242:
1236:
1235:
1232:
1180:
1179:Vector permute
1177:
1174:
1173:
1170:
1164:
1163:
1160:
1154:
1153:
1150:
1144:
1143:
1140:
1134:
1133:
1130:
1124:
1123:
1120:
1114:
1113:
1110:
1104:
1103:
1100:
1094:
1093:
1090:
1084:
1083:
1080:
1074:
1073:
1070:
1064:
1063:
1060:
1054:
1053:
1050:
1035:
1032:
1029:
1028:
1025:
1019:
1018:
1015:
994:
991:
988:
987:
984:
980:
979:
976:
972:
971:
968:
964:
963:
960:
956:
955:
952:
948:
947:
944:
940:
939:
936:
932:
931:
928:
924:
923:
920:
912:
911:
908:
902:
901:
898:
892:
891:
888:
882:
881:
878:
872:
871:
868:
862:
861:
858:
852:
851:
848:
842:
841:
838:
832:
831:
828:
813:
810:
807:
806:
786:
783:
777:
776:
751:
748:
742:
741:
722:5) → 8 words (
716:
713:
707:
706:
686:
683:
673:
672:
652:
649:
639:
638:
613:
610:
600:
599:
582:
579:
569:
568:
547:
544:
534:
533:
508:
505:
495:
494:
491:
488:
473:
470:
467:
466:
433:
430:
420:
419:
391:
388:
378:
377:
349:
346:
336:
335:
306:
303:
293:
292:
263:
260:
250:
249:
230:) + 8 words (
220:
217:
207:
206:
203:
200:
182:
179:
122:
98:Half-precision
70:Main article:
67:
64:
15:
13:
10:
9:
6:
4:
3:
2:
1793:
1792:
1785:
1782:
1772:on 2013-11-09
1771:
1767:
1766:
1758:
1755:
1744:on 2014-01-15
1743:
1739:
1733:
1730:
1715:
1711:
1707:
1701:
1698:
1687:
1680:
1674:
1671:
1660:
1653:
1647:
1645:
1643:
1641:
1639:
1637:
1635:
1631:
1626:
1625:
1621:
1614:
1611:
1606:
1605:
1601:
1594:
1591:
1580:
1576:
1575:
1568:
1565:
1555:on 2011-08-07
1551:
1547:
1546:
1539:
1536:
1532:
1531:
1526:
1520:
1517:
1513:
1509:
1508:
1503:
1496:
1493:
1489:, May 1, 2009
1488:
1481:
1480:
1473:
1471:
1467:
1457:on 2013-11-04
1456:
1452:
1451:
1443:
1441:
1439:
1437:
1433:
1427:
1416:
1413:
1406:
1402:
1399:
1397:
1394:
1392:
1389:
1387:
1384:
1382:
1379:
1378:
1374:
1365:
1362:
1359:
1356:
1353:
1350:
1347:
1344:
1343:
1341:
1340:
1338:
1335:
1334:
1331:CPUs with XOP
1330:
1323:
1317:
1313:
1307:
1303:
1297:
1293:
1287:
1279:
1276:
1270:
1263:
1257:
1253:
1247:
1243:
1237:
1229:
1226:
1220:
1199:
1193:
1189:
1178:
1171:
1165:
1161:
1155:
1151:
1145:
1141:
1135:
1131:
1125:
1121:
1115:
1111:
1105:
1101:
1095:
1091:
1085:
1081:
1075:
1071:
1065:
1061:
1055:
1047:
1044:
1041:
1033:
1026:
1021:
1020:
1016:
1013:
1012:
1009:
1003:
992:
985:
982:
981:
977:
974:
973:
969:
966:
965:
961:
958:
957:
953:
950:
949:
946:Greater Than
945:
942:
941:
937:
934:
933:
929:
926:
925:
917:
909:
903:
899:
893:
889:
883:
879:
873:
869:
863:
859:
853:
849:
843:
839:
833:
825:
822:
820:
811:
805:
787:
784:
778:
775:
752:
749:
743:
740:
717:
714:
708:
705:
687:
684:
674:
671:
653:
650:
640:
637:
614:
611:
601:
598:
583:
580:
570:
567:
548:
545:
535:
532:
514:) → 8 words (
509:
506:
496:
485:
482:
480:
471:
465:
434:
431:
421:
418:
392:
389:
379:
376:
350:
347:
337:
334:
307:
304:
294:
291:
264:
261:
251:
248:
234:) → 8 words (
221:
218:
208:
197:
194:
192:
188:
180:
178:
160:
155:
153:
148:
147:FMA3 and FMA4
143:
141:
136:
132:
130:
126:
120:
119:instructions
118:
114:
110:
105:
103:
99:
95:
91:
87:
83:
79:
73:
65:
63:
61:
56:
54:
50:
46:
42:
38:
34:
30:
26:
22:
1790:
1784:
1774:, retrieved
1770:the original
1764:
1757:
1746:. Retrieved
1742:the original
1732:
1721:. Retrieved
1714:the original
1700:
1689:. Retrieved
1673:
1662:. Retrieved
1623:
1613:
1603:
1593:
1583:, retrieved
1579:the original
1573:
1567:
1557:, retrieved
1550:the original
1544:
1538:
1529:
1519:
1511:
1505:
1495:
1478:
1459:, retrieved
1455:the original
1449:
1415:
1284:Description
1281:Instruction
1274:
1234:Description
1231:Instruction
1200:
1194:instruction
1182:
1052:Description
1049:Instruction
1037:
1017:Description
1014:Instruction
996:
830:Description
827:Instruction
815:
797:
762:
727:
697:
663:
624:
593:
558:
519:
490:Description
487:Instruction
475:
462:a3 * b3 + c1
458:r1 = a2 * b2
452:
435:2x8 words (
410:
368:
325:
282:
265:2x8 words (
239:
222:2x8 words (
202:Description
199:Instruction
184:
156:
144:
137:
133:
106:
75:
57:
24:
20:
18:
2403:MIPS16e ASE
922:Comparison
152:FMA history
125:hexadecimal
1776:2012-01-17
1748:2014-01-13
1723:2014-01-29
1691:2014-01-13
1664:2014-01-13
1585:2012-01-17
1559:2012-01-17
1461:2013-11-04
1428:References
1260:VPERMIL2PS
1250:VPERMIL2PD
1207:VPERMIL2PS
1203:VPERMIL2PD
970:Not Equal
930:Less Than
919:Immediate
803:r1 = a2-a3
799:r0 = a0-a1
772:r2 = a4-a5
768:r1 = a2-a3
764:r0 = a0-a1
753:8 words (
737:r2 = a4-a5
733:r1 = a2-a3
729:r0 = a0-a1
718:16 bytes (
703:r1 = a2+a3
699:r0 = a0+a1
654:8 words (
634:r2 = a4+a5
630:r1 = a2+a3
626:r0 = a0+a1
615:8 words (
584:16 bytes (
549:16 bytes (
529:r2 = a4+a5
525:r1 = a2+a3
521:r0 = a0+a1
510:16 bytes (
493:Operation
428:VPMADCSSWD
386:VPMACSSDQH
344:VPMACSSDQL
205:Operation
1525:Agner Fog
1215:VPERMILPS
1211:VPERMILPD
681:VPHADDUDQ
647:VPHADDUWQ
608:VPHADDUWD
577:VPHADDUBQ
542:VPHADDUBD
503:VPHADDUBW
424:VPMADCSWD
382:VPMACSDQH
340:VPMACSDQL
301:VPMACSSDD
258:VPMACSSWD
215:VPMACSSWW
49:Bulldozer
1507:Phoronix
1375:See also
781:VPHSUBDQ
746:VPHSUBWD
711:VPHSUBBW
677:VPHADDDQ
643:VPHADDWQ
604:VPHADDWD
573:VPHADDBQ
538:VPHADDBD
499:VPHADDBW
297:VPMACSDD
254:VPMACSWD
211:VPMACSWW
92:) and
55:onward.
1320:VFRCZSS
1310:VFRCZSD
1300:VFRCZPS
1290:VFRCZPD
1192:Altivec
906:VPCOMUQ
896:VPCOMUD
886:VPCOMUW
876:VPCOMUB
121:without
117:Integer
66:History
1240:VPPERM
1223:VPPERM
1184:VPPERM
1168:VPSHLQ
1158:VPSHLD
1148:VPSHLW
1138:VPSHLB
1128:VPSHAQ
1118:VPSHAD
1108:VPSHAW
1098:VPSHAB
1088:VPROTQ
1078:VPROTD
1068:VPROTW
1058:VPROTB
1023:VPCMOV
998:VPCMOV
978:False
962:Equal
866:VPCOMQ
856:VPCOMD
846:VPCOMW
836:VPCOMB
774:, ...
739:, ...
636:, ...
597:, ...
586:a0-a15
566:, ...
551:a0-a15
531:, ...
512:a0-a15
1795:(PDF)
1717:(PDF)
1710:Intel
1682:(PDF)
1655:(PDF)
1553:(PDF)
1483:(PDF)
1407:Notes
1386:CVT16
1196:VPERM
1188:SSSE3
986:True
794:r0-r1
790:a0-a3
759:r0-r3
755:a0-a7
724:r0-r7
720:a0-a1
694:r0-r1
690:a0-a3
660:r0-r1
656:a0-a7
621:r0-r3
617:a0-a7
590:r0-r1
555:r0-r3
516:r0-r7
479:SSSE3
464:, ...
449:r0-r3
445:c0-c3
441:b0-b7
437:a0-a7
407:r0-r3
403:c0-c1
399:b0-b3
395:a0-a3
365:r0-r3
361:c0-c1
357:b0-b3
353:a0-a3
333:, ...
322:r0-r3
318:c0-c3
314:b0-b3
310:a0-a3
279:r0-r3
275:c0-c3
271:b0-b7
267:a0-a7
247:, ...
236:r0-r7
232:c0-c7
228:b0-b7
224:a0-a7
102:Intel
94:CVT16
45:AMD64
1396:SSE5
1391:FMA4
1213:and
1205:and
1201:The
1040:SSE2
1006:CMOV
1002:SSE4
983:111
975:110
967:101
959:100
951:011
943:010
935:001
927:000
290:, ...
191:FMA4
173:and
167:FMA4
113:FMA4
111:and
86:FMA4
78:SSE5
72:SSE5
43:and
19:The
1686:AMD
1659:AMD
1487:AMD
1401:x86
1381:AVX
1337:AMD
1219:AVX
1008:).
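The bitwise selection performed by VPCMOV can be written as one expression. The Python sketch below models registers as integers of `width` bits; the function name is hypothetical, and the operand order follows the description in this section (a selector bit of 1 takes the bit from the first source, 0 takes it from the second).

```python
def bitwise_select(src1, src2, selector, width=128):
    """Pick each bit from src1 where selector is 1, else from src2.

    Illustrative model of a bitwise conditional move on integers that
    stand in for `width`-bit vector registers.
    """
    mask = (1 << width) - 1
    return ((src1 & selector) | (src2 & ~selector)) & mask
```

Because the choice is made per bit, a selector made of all-ones and all-zeros fields, such as the output of a vector compare, selects whole elements.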
175:LWP
171:XOP
163:TBM
159:Zen
154:).
129:AVX
109:AVX
104:).
82:AVX
60:SSE
41:x86
37:SSE
33:AMD
21:XOP
2553::
1708:.
1684:.
1657:.
1633:^
1622:.
1602:.
1510:.
1504:.
1485:,
1469:^
1435:^
1339::
821:.
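The effect of the immediate can be sketched as a lookup from its three-bit value to a predicate, using the eight comparisons listed in the table of immediates in this section. The Python below is an illustrative model with hypothetical names: it compares plain integers element by element and writes an all-ones or all-zeros field of `width` bits for each result.

```python
import operator

# Immediate value -> predicate, following the table in this section.
COMPARISONS = {
    0b000: operator.lt,          # Less Than
    0b001: operator.le,          # Less Than or Equal
    0b010: operator.gt,          # Greater Than
    0b011: operator.ge,          # Greater Than or Equal
    0b100: operator.eq,          # Equal
    0b101: operator.ne,          # Not Equal
    0b110: lambda x, y: False,   # False
    0b111: lambda x, y: True,    # True
}


def vector_compare(a, b, imm, width=8):
    """Return an all-ones field where the comparison holds, else zero."""
    predicate = COMPARISONS[imm & 0b111]
    all_ones = (1 << width) - 1
    return [all_ones if predicate(x, y) else 0 for x, y in zip(a, b)]
```

The all-ones and all-zeros fields are what make the result usable as a bitwise selector for a subsequent select operation.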
801:,
796:)
770:,
766:,
761:)
735:,
731:,
726:)
701:,
696:)
679:,
667:,
662:)
645:,
632:,
628:,
623:)
606:,
592:)
575:,
562:,
557:)
540:,
527:,
523:,
518:)
501:,
460:+
456:,
451:)
439:,
426:,
414:,
409:)
397:,
384:,
372:,
367:)
355:,
342:,
329:,
324:)
312:,
299:,
286:,
281:)
269:,
256:,
243:,
238:)
226:,
213:,
169:,
165:,
27:)
1751:.
1726:.
1694:.
1667:.
96:(
23:(
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.