quantization. Mesh data is usually stored using 32-bit single-precision floats for the vertices; however, in some situations it is acceptable to reduce the precision to 16-bit half precision, which requires only half the storage at the expense of some precision. Mesh quantization can also be done with
1565:
tend to use half precision: such applications usually do a large amount of calculation, but don't require a high level of precision. Due to hardware typically not supporting 16-bit half-precision floats, neural networks often use the
273:
Several earlier 16-bit floating point formats have existed including that of
Hitachi's HD61810 DSP of 1982 (a 4-bit exponent and a 12-bit mantissa), Thomas J. Scott's WIF of 1991 (5 exponent bits, 10 mantissa bits) and the
2158:
Nvidia recently introduced native half precision floating point support (FP16) into their Pascal GPUs. This was mainly motivated by the possibility that this will speed up data intensive and error tolerant applications in
97:
Depending on the computer, half-precision can be over an order of magnitude faster than double precision, e.g. 550 PFLOPS for half-precision vs 37 PFLOPS for double precision on one cloud provider.
2446:
1482:). It is almost identical to the IEEE format, but there is no encoding for infinity or NaNs; instead, an exponent of 31 encodes normalized numbers in the range 65536 to 131008.
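The difference from the IEEE encoding can be sketched in a few lines of Python. The function name `decode_arm_alternative` is hypothetical and chosen only for illustration, and the sketch assumes that zero and subnormal patterns are decoded exactly as in IEEE binary16:

```python
def decode_arm_alternative(bits: int) -> float:
    """Decode a 16-bit pattern under the ARM alternative half-precision rules.

    Hypothetical helper for illustration; assumes subnormals follow IEEE binary16.
    """
    sign = -1.0 if (bits >> 15) & 1 else 1.0
    stored_exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if stored_exponent == 0:
        # Zero and subnormals: no implicit leading 1, exponent fixed at -14.
        return sign * 2.0 ** -14 * (fraction / 1024)
    # Every other stored exponent, including 31, encodes a normalized number.
    return sign * 2.0 ** (stored_exponent - 15) * (1 + fraction / 1024)


print(decode_arm_alternative(0x7C00))  # exponent 31, fraction all zeros
print(decode_arm_alternative(0x7FFF))  # exponent 31, fraction all ones
```

Under these assumptions the two patterns decode to 65536 and 131008, the ends of the range quoted above; in IEEE binary16 the same patterns would be infinity and a NaN.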
1577:
instructions that can handle multiple floating-point numbers within one instruction, half precision can be twice as fast by operating on twice as many numbers simultaneously.
556:
The minimum strictly positive (subnormal) value is 2^−24 ≈ 5.96 × 10^−8. The minimum positive normal value is 2^−14 ≈ 6.10 × 10^−5. The maximum representable value is (2 − 2^−10) × 2^15 = 65504.
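These three limits can be checked with Python's standard `struct` module, whose `e` format code packs IEEE 754 binary16 values. This is a minimal sketch; the helper name `half_bits` is hypothetical and not part of any library:

```python
import struct


def half_bits(x: float) -> int:
    """Return the binary16 bit pattern that struct's 'e' format produces for x."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


smallest_subnormal = 2.0 ** -24
smallest_normal = 2.0 ** -14
largest_finite = (2 - 2.0 ** -10) * 2.0 ** 15

for name, value in [
    ("smallest subnormal", smallest_subnormal),
    ("smallest normal", smallest_normal),
    ("largest finite", largest_finite),
]:
    # Print the value next to its 16-bit encoding in hexadecimal.
    print(f"{name:20s} {value!r:>24} 0x{half_bits(value):04x}")
```

The bit patterns printed are 0x0001, 0x0400 and 0x7bff, matching the smallest subnormal, smallest normal and largest normal encodings of the format.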
388:
appear in the memory format but the total precision is 11 bits. In IEEE 754 parlance, there are 10 bits of significand, but there are 11 bits of significand precision (log
2523:
2663:
288:, but without the hard drive and memory cost of single or double precision floating point. The hardware-accelerated programmable shading group led by John Airey at
2419:
453:
Thus, as defined by the offset binary representation, in order to get the true exponent the offset of 15 has to be subtracted from the stored exponent.
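The bias subtraction can be illustrated with a small decoder. This is a sketch, not a reference implementation: `decode_binary16` and `reference` are hypothetical names, and the standard-library `struct` module (format code `e`) is used only to cross-check the manual arithmetic:

```python
import struct


def decode_binary16(bits: int) -> float:
    """Decode a 16-bit integer holding an IEEE 754 binary16 pattern."""
    sign = -1.0 if (bits >> 15) & 1 else 1.0
    stored_exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if stored_exponent == 0:
        # Zero or subnormal: no implicit leading 1, true exponent fixed at -14.
        return sign * 2.0 ** -14 * (fraction / 1024)
    if stored_exponent == 0x1F:
        # All-ones exponent: infinity (zero fraction) or NaN (non-zero fraction).
        return sign * float("inf") if fraction == 0 else float("nan")
    # Normal number: subtract the bias of 15 to obtain the true exponent.
    return sign * 2.0 ** (stored_exponent - 15) * (1 + fraction / 1024)


def reference(bits: int) -> float:
    """Decode the same pattern with struct, for comparison."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


for pattern in (0x3C00, 0x3555, 0x7BFF, 0x0001, 0xC000):
    print(f"{pattern:04x}", decode_binary16(pattern), reference(pattern))
```

For the pattern 0x3C00 the stored exponent is 15, so the true exponent is 0 and the value is 1.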
2827:
2653:
1703:
instruction set extension, first introduced in 2009 by AMD and fairly broadly adopted by AMD and Intel CPUs by 2012. This was further extended up the
325:, released in late 2002. However, hardware support for accelerated 16-bit floating point was later dropped by Nvidia before being reintroduced in the
The format is assumed to have an implicit lead bit with value 1 unless the exponent field is stored with all zeros. Thus, only 10 bits of the
2049:
564:
These examples are given in bit representation of the floating-point value. This includes the sign bit, (biased) exponent, and significand.
2706:
2589:
1744:, VSX and the not-yet-approved SVP64 extension provide hardware support for 16-bit half-precision floats as of PowerISA v3.1B and later.
336:
extension in 2012 allows x86 processors to convert half-precision floats to and from single-precision floats with a machine instruction.
2116:
1573:
If the hardware has instructions to compute half-precision math, it is often faster than single or double precision. If the system has
94:, and the exponent uses 5 bits. This can express values in the range ±65,504, with the minimum value above 1 being 1 + 1/1024.
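The spacing just above 1 can be observed by rounding nearby values to half precision and back. The sketch below uses Python's standard `struct` module; `round_to_half` is a hypothetical helper name:

```python
import struct


def round_to_half(x: float) -> float:
    """Round x to the nearest binary16 value and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


step = 1 / 1024  # spacing between adjacent binary16 values in [1, 2)

print(round_to_half(1 + step))          # exactly representable
print(round_to_half(1 + step / 2))      # halfway case, resolved by ties-to-even
print(round_to_half(1 + 3 * step / 2))  # halfway case, resolved by ties-to-even
print(round_to_half(65504.0))           # largest finite value survives unchanged
```

The first value is kept as 1 + 1/1024, while the two halfway cases round to 1 and to 1 + 2/1024 respectively, because ties go to the neighbour whose last significand bit is zero.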
71:. It is intended for storage of floating-point values in applications where higher precision is not essential, in particular
31:
1656:
also supports half-precision floating point numbers with the half datatype on IEEE 754-2008 half-precision storage format.
1466:
Values of 65520 and larger round to infinity. This holds for round-to-nearest-even; other rounding strategies change this cut-off.
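The cut-off can be probed from Python. One caveat, stated as an assumption about the tool rather than the format: CPython's `struct` module raises `OverflowError` for finite inputs that would round beyond 65504 instead of returning infinity, so the hypothetical helper `round_to_half` below maps that error to a signed infinity to mimic IEEE round-to-nearest-even behaviour:

```python
import math
import struct


def round_to_half(x: float) -> float:
    """Round x to binary16 (ties to even); map overflow to a signed infinity."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values that would round past the largest
        # finite binary16 value; IEEE rounding would give an infinity here.
        return math.copysign(math.inf, x)


for value in (65504.0, 65519.999, 65520.0, -65520.0):
    print(value, "->", round_to_half(value))
```

Inputs strictly below 65520 round down to 65504, while 65520 itself is a halfway case that rounds away from 65504 (whose last significand bit is one) and therefore overflows.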
296:
2000 paper (see section 4.3) and further documented in US patent 7518615. It was popularized by its use in the open-source
bit) an "alternative half-precision" format, which does away with the special case for an exponent value of 31 (11111
1979:"Patent US7518615 - Display system having floating point rasterization and floating point ... - Google Patents"
2447:"Intel® Advanced Vector Extensions 512 - FP16 Instruction Set for Intel® Xeon® Processor Based Products"
1543:
1857:
Proceedings of the twenty-second SIGCSE technical symposium on
Computer science education - SIGCSE '91
2723:
2644:
1787:
408:
representation, with the zero offset being 15; also known as exponent bias in the IEEE 754 standard.
968:, because of the odd number of bits in the significand. The bits beyond the rounding point are
Scott, Thomas J. (March 1991). "Mathematics and computer science at odds over real numbers".
292:
used the s10e5 data type in 1997 as part of the 'bali' design effort. This is described in a
C source code to convert between IEEE double, single, and half precision can be found here
2142:
1645:
1562:
68:
2679:
1756:: Alternative 16-bit floating-point format with 8 bits of exponent and 7 bits of mantissa
1832:"hitachi :: dataBooks :: HD61810 Digital Signal Processor Users Manual"
2321:"swift-evolution/proposals/0277-float16.md at main · apple/swift-evolution"
2036:. IEEE STD 754-2019 (Revision of IEEE 754-2008). July 2019. pp. 1–84.
1865:
1619:
3118:
1765:
1570:
format, which is the single precision float format truncated to 16 bits.
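Truncation of a single-precision value to its upper 16 bits can be sketched as follows. The helper names are hypothetical, and the sketch simply drops the low 16 bits without rounding, which is one possible conversion and not necessarily what a given library or accelerator does:

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Keep the upper 16 bits (sign, 8 exponent bits, 7 mantissa bits) of float32."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Widen a 16-bit truncated pattern back to float32 by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]


pattern = float32_to_bfloat16_bits(3.14159)
print(f"0x{pattern:04x}", bfloat16_bits_to_float(pattern))
```

Because the 8-bit exponent of single precision is preserved, the truncated format keeps the same range and gives up only significand bits.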
1523:
540:
358:
352:
314:
293:
114:
1704:
1644:
introduced half-precision floating point numbers in Swift 5.3 with the
1503:
1499:
297:
494:
404:
The half-precision binary floating-point exponent is encoded using an
1726:
1715:
1653:
1535:
1515:
1511:
1495:
303:
64:
1929:
1673:
provides support for half-precision floating point numbers with the
1530:. The advantage over 8-bit or 16-bit integers is that the increased
17:
2596:
1538:
for images, and avoids gamma correction. The advantage over 32-bit
2680:
Half precision floating point for one of the extended GCC features
2148:. Department of Computer Science, National University of Singapore
326:
213:
87:
1807:
2736:
2648:
2475:"RISC-V Instruction Set Manual, Volume I: RISC-V User-Level ISA"
1700:
1595:
1574:
1527:
1507:
333:
2688:
1729:
provide hardware support for 16-bit half precision floats. The
1554:
8-bit or 16-bit fixed precision depending on the requirements.
(2^11) ≈ 3.311 decimal digits, or 4 digits ± slightly less than 5
321:, released in early 2002, and implemented it in silicon in the
2731:
2560:
1693:
545:
340:
IEEE 754 half-precision binary floating-point format: binary16
2684:
2675:
Java source code for half-precision floating-point conversion
1602:
standard library type. As of
January 2024, no .NET language (
1598:
5 introduced half precision floating point numbers with the
284:
was searching for an image format that could handle a wide
2639:
2396:"Integers and Floating-Point Numbers · The Julia Language"
1950:
Mark S. Peercy; Marc Olano; John Airey; P. Jeffrey Ungar.
1768:: IEEE standard for floating-point arithmetic (IEEE 754)
1534:
allows for more detail to be preserved in highlights and
1542:
floating point is that it requires half the storage and
2585:
2580:
2296:"Data Type Summary — Visual Basic language reference"
2143:"Exploiting half precision arithmetic in Nvidia GPUs"
1788:
Power
Management Bus § Linear11 Floating Point Format
3147:
3056:
2942:
2912:
2877:
2765:
2722:
2141:Ho, Nhut-Minh; Wong, Weng-Fai (September 1, 2017).
1707:instruction set extension implemented in the Intel
2551:Khronos Vulkan signed 16-bit floating point format
2424:ARM Compiler armclang Reference Guide Version 6.7
2070:RealView Compilation Tools Compiler User Guide
2066:"Half-precision floating-point number support"
2026:
2024:
provides support for half precision with its
2700:
2420:"Half-precision floating-point number format"
2372:"Tracking Issue for f16 and f128 float types"
2246:"Floating-point numeric types ― C# reference"
1952:"Interactive Multi-Pass Programmable Shading"
1474:ARM processors support (via a floating-point
249:
8:
2096:Khronos Data Format Specification v1.2 rev 1
1667:type for IEEE half-precision 16-bit floats.
2033:IEEE Standard for Floating-Point Arithmetic
1899:"/home/usr/bk/glide/docs2.3.1/GLIDEPGM.DOC"
34:, a different 16-bit floating-point format.
2707:
2693:
2685:
256:
242:
100:
1864:
1546:(at the expense of precision and range).
1494:environments to store pixels, including
983:
566:
466:
1799:
221:
175:
113:
103:
2220:Govindarajan, Prashanth (2020-08-31).
1733:extension is a minimal alternative to
2092:"10.1. 16-bit floating-point numbers"
1920:
1918:
1663:is currently working on adding a new
964:By default, 1/3 rounds down like for
7:
632:smallest positive subnormal number
374:The format is laid out as follows:
67:(two bytes in modern computers) in
2654:OpenGL treatment of half precision
2271:"Literals ― F# language reference"
1859:. Vol. 23. pp. 130–139.
1692:Support for half precision in the
1490:Half precision is used in several
344:The IEEE 754 standard specifies a
82:Almost all modern uses follow the
25:
1774:, Language Independent Arithmetic
1689:have support for half precision.
1549:Half precision can be useful for
2634:Survey of Floating-Point Formats
2565:
1808:"About ABCI - About ABCI | ABCI"
1581:Support by programming languages
972:... which is less than 1/2 of a
848:smallest number larger than one
704:smallest positive normal number
377:
370:: 11 bits (10 explicitly stored)
348:as having the following format:
204:IBM floating-point architecture
1762:: small floating-point formats
1754:bfloat16 floating-point format
1470:ARM alternative half-precision
276:3dfx Voodoo Graphics processor
1:
2773:Arbitrary-precision or bignum
1638:) or a keyword for the type.
776:largest number less than one
27:16-bit computer number format
2222:"Introducing the Half type!"
2042:10.1109/ieeestd.2019.8766229
2007:Cg 3.1 Toolkit Documentation
1622:) has literals (e.g. in C#,
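Exponent              Significand = zero   Significand ≠ zero        Equation
00000₂                zero, −0             subnormal numbers         (−1)^signbit × 2^−14 × 0.significantbits₂
00001₂, ..., 11110₂   normalized value                               (−1)^signbit × 2^(exponent−15) × 1.significantbits₂
11111₂                ±infinity            NaN (quiet, signalling)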
2659:Fast Half Float Conversions
464:are interpreted specially.
278:of 1995 (same as Hitachi).
86:standard, where the 16-bit
3271:
1557:Hardware and software for
456:The stored exponents 00000
29:
3114:Strongly typed identifier
1928:. OpenEXR. Archived from
668:largest subnormal number
521:
90:format is referred to as
2645:Half precision constants
2499:"OPF_PowerISA_v3.1B.pdf"
2454:Intel® Builders Programs
1685:Several versions of the
30:Not to be confused with
3189:Parametric polymorphism
2346:"cl_khr_fp16 extension"
2117:"KHR_mesh_quantization"
560:Half precision examples
394:units in the last place
199:Microsoft Binary Format
2664:Analog Devices variant
2505:. OpenPOWER Foundation
2197:"Half Struct (System)"
1486:Uses of half precision
974:unit in the last place
884:largest normal number
290:SGI (Silicon Graphics)
61:computer number format
1866:10.1145/107004.107029
980:Precision limitations
740:nearest value to 1/3
628:) ≈ 0.000000059604645
3255:Floating point types
2586:improve this article
2350:registry.khronos.org
1959:People.csail.mit.edu
1699:is specified in the
700:) ≈ 0.00006103515625
329:mobile GPU in 2015.
3194:Primitive data type
3099:Recursive data type
2952:Algebraic data type
2828:Quadruple precision
2666:(four-bit exponent)
2598:footnote references
2300:learn.microsoft.com
2275:learn.microsoft.com
2250:learn.microsoft.com
2201:learn.microsoft.com
1778:Primitive data type
548:(quiet, signalling)
230:Arbitrary precision
3157:Abstract data type
2838:Extended precision
2797:Reduced precision
2400:docs.julialang.org
2072:. 10 December 2010
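Binary              Hex   Value                                       Notes
0 00000 0000000000  0000  0
0 00000 0000000001  0001  2^−14 × (0 + 1/1024) ≈ 0.000000059604645    smallest positive subnormal number
0 00000 1111111111  03ff  2^−14 × (0 + 1023/1024) ≈ 0.000060975552    largest subnormal number
0 00001 0000000000  0400  2^−14 × (1 + 0/1024) ≈ 0.00006103515625     smallest positive normal number
0 01101 0101010101  3555  2^−2 × (1 + 341/1024) ≈ 0.33325195          nearest value to 1/3
0 01110 1111111111  3bff  2^−1 × (1 + 1023/1024) ≈ 0.99951172         largest number less than one
0 01111 0000000000  3c00  2^0 × (1 + 0/1024) = 1                      one
0 01111 0000000001  3c01  2^0 × (1 + 1/1024) ≈ 1.00097656             smallest number larger than one
0 11110 1111111111  7bff  2^15 × (1 + 1023/1024) = 65504              largest normal number
0 11111 0000000000  7c00  ∞                                           infinity
1 00000 0000000000  8000  −0
1 10000 0000000000  c000  −2
1 11111 0000000000  fc00  −∞                                          negative infinity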
476:Significand ≠ zero
473:Significand = zero
214:G.711 8-bit floats
168:Extended precision
45:(sometimes called
3250:Binary arithmetic
3237:
3236:
2969:Associative array
2833:Octuple precision
2626:
2625:
2618:
2528:libre-soc.org Git
2524:"ls005.xlen.mdwn"
2090:Garrard, Andrew.
2051:978-1-5044-5924-2
1783:RGBE image format
1492:computer graphics
1464:
1463:
962:
961:
554:
553:
500:subnormal numbers
400:Exponent encoding
266:
265:
16:(Redirected from
3262:
3209:Type constructor
3094:Opaque data type
3026:Record or Struct
2823:Double precision
2818:Single precision
2445:Towner, Daniel.
1687:ARM architecture
1681:Hardware support
1676:
1666:
1648:
1637:
1633:
1629:
1625:
1601:
1591:
1559:machine learning
1540:single-precision
1476:control register
966:double precision
522:normalized value
467:
381:
258:
251:
244:
101:
73:image processing
21:
3214:Type conversion
3149:
3143:
3079:Enumerated type
3052:
2938:
2932:null-terminated
2908:
2873:
2761:
2718:
2713:
2622:
2611:
2605:
2602:
2583:
2574:This article's
2570:
2566:
2559:
2547:
2545:Further reading
2542:
2541:
2532:
2530:
2522:
2521:
2517:
2508:
2506:
2503:OpenPOWER Files
2497:
2496:
2492:
2483:
2481:
2473:
2472:
2468:
2458:
2456:
2449:
2444:
2443:
2439:
2429:
2427:
2426:. ARM Developer
2418:
2417:
2413:
2404:
2402:
2394:
2393:
2389:
2380:
2378:
2370:Cross, Travis.
2123:. Khronos Group
1709:Sapphire Rapids
1697:instruction set
1683:
1674:
1664:
1646:
1635:
1631:
1627:
1623:
1599:
1589:
1583:
1563:neural networks
209:PMBus Linear-11
77:neural networks
69:computer memory
3167:Data structure
2806:Half precision
2803:
2793:Floating point
2578:external links
2573:
2571:
2564:
2558:
2557:External links
1636:System.Decimal
844:) ≈ 1.00097656
825:
820:
814:
813:
810:
789:
784:
778:
777:
774:
772:) ≈ 0.99951172
753:
748:
742:
741:
738:
736:) ≈ 0.33325195
300:image format.
270:
267:
264:
263:
261:
260:
253:
246:
238:
235:
234:
233:
232:
224:
223:
219:
218:
217:
216:
211:
206:
201:
196:
194:TensorFloat-32
191:
186:
178:
177:
173:
172:
171:
170:
165:
158:
148:
138:
128:
118:
117:
111:
110:
105:Floating-point
63:that occupies
58:floating-point
43:half precision
26:
24:
14:
13:
10:
9:
6:
4:
3:
2:
3089:Function type
2724:Uninterpreted
2594:inappropriate
2479:Five EmbedDev
1932:on 2013-05-08
1772:ISO/IEC 10967
1628:System.Single
1532:dynamic range
443:Exponent bias
406:offset-binary
361:width: 5 bits
286:dynamic range
154:(binary128),
84:IEEE 754-2008
2994:Intersection
2805:
2640:OpenEXR site
2633:
2612:
2603:
2588:by removing
2575:
2531:. Retrieved
2527:
2518:
2507:. Retrieved
2502:
2493:
2482:. Retrieved
2478:
2469:
2457:. Retrieved
2453:
2440:
2428:. Retrieved
2423:
2414:
2403:. Retrieved
2399:
2390:
2379:. Retrieved
2375:
2365:
2353:. Retrieved
2349:
2340:
2328:. Retrieved
2324:
2315:
2304:. Retrieved
2302:. 2021-09-15
2299:
2290:
2279:. Retrieved
2277:. 2022-06-15
2274:
2265:
2254:. Retrieved
2252:. 2022-09-29
2249:
2240:
2229:. Retrieved
2225:
2215:
2204:. Retrieved
2200:
2191:
2179:. Retrieved
2175:
2166:
2157:
2150:. Retrieved
2136:
2125:. Retrieved
2120:
2111:
2100:. Retrieved
2095:
2085:
2074:. Retrieved
2069:
2060:
2032:
2011:. Retrieved
2006:
1997:
1986:. Retrieved
1982:
1973:
1962:. Retrieved
1958:
1945:
1934:. Retrieved
1930:the original
1906:. Retrieved
1902:
1893:
1856:
1850:
1839:. Retrieved
1835:
1826:
1815:. Retrieved
1811:
1802:
1739:
1713:
1705:AVX-512_FP16
1691:
1684:
1669:
1659:As of 2024,
1658:
1652:
1640:
1612:Visual Basic
1594:
1584:
1572:
1556:
1548:
1489:
1473:
1465:
963:
563:
555:
516:, ..., 11110
455:
452:
403:
383:
376:
373:
345:
343:
331:
311:
310:defined the
302:
280:
272:
222:Alternatives
144:(binary64),
134:(binary32),
124:
96:
91:
81:
50:
46:
42:
36:
3224:Type theory
3219:Type system
3069:Bottom type
3016:Option type
2957:generalized
2843:Long double
2788:Fixed point
2176:ziglang.org
1836:Archive.org
1711:processor.
1600:System.Half
386:significand
365:Significand
319:Cg language
164:(binary256)
3244:Categories
3129:Empty type
3124:Type class
3074:Collection
3031:Refinement
3009:metaobject
2857:signedness
2716:Data types
2630:Minifloats
2533:2023-07-02
2509:2023-07-02
2484:2023-07-02
2405:2024-07-11
2381:2024-07-05
2325:github.com
2306:2024-02-01
2281:2024-02-01
2256:2024-02-01
2231:2024-02-01
2206:2024-02-01
2127:2023-07-02
2102:2023-08-05
2076:2015-05-05
1988:2017-07-14
1983:Google.com
1964:2017-07-14
1936:2017-07-14
1908:2017-07-14
1903:Gamers.org
1876:0897913779
1841:2017-07-14
1817:2019-10-06
1794:References
1727:extensions
323:GeForce FX
156:decimal128
127:(binary16)
3204:Subtyping
3199:Interface
3182:metaclass
3134:Unit type
3104:Semaphore
3084:Exception
2989:Inductive
2979:Dependent
2944:Composite
2922:Character
2904:Reference
2801:Minifloat
2757:Bit array
2606:July 2017
2590:excessive
2226:.NET Blog
2181:7 January
2098:. Khronos
2013:17 August
2003:"vs_2_sw"
1926:"OpenEXR"
1760:Minifloat
1742:Power ISA
1634:has type
1626:has type
1544:bandwidth
993:interval
904:infinity
880:) = 65504
864:2 × (1 +
828:2 × (1 +
792:2 × (1 +
756:2 × (1 +
720:2 × (1 +
684:2 × (1 +
648:2 × (0 +
612:2 × (0 +
479:Equation
460:and 11111
368:precision
308:Microsoft
184:Minifloat
160:256-bit:
152:Quadruple
150:128-bit:
146:decimal64
136:decimal32
39:computing
3229:Variable
3119:Top type
2984:Equality
2892:physical
2869:Rational
2864:Interval
2811:bfloat16
2172:"Floats"
2152:July 13,
2009:. Nvidia
1885:16648394
1766:IEEE 754
1748:See also
1568:bfloat16
1524:Direct3D
541:infinity
470:Exponent
359:Exponent
353:Sign bit
346:binary16
327:Tegra X1
315:datatype
294:SIGGRAPH
189:bfloat16
140:64-bit:
130:32-bit:
123:16-bit:
115:IEEE 754
92:binary16
32:bfloat16
3172:Generic
3148:Related
3064:Boolean
3021:Product
2897:virtual
2887:Address
2879:Pointer
2852:Integer
2783:Decimal
2778:Complex
2766:Numeric
2584:Please
2576:use of
1812:abci.ai
1675:Float16
1647:Float16
1616:C++/CLI
1536:shadows
1504:JPEG XR
1500:OpenEXR
1382:
1370:
1357:
1345:
1332:
1320:
1224:
1212:
1202:
1190:
1185:
1173:
1163:
1151:
1146:
1134:
1124:
1112:
878:
866:
842:
830:
806:
794:
770:
758:
734:
722:
698:
686:
662:
650:
626:
614:
445:= 01111
435:− 01111
431:= 11110
420:− 01111
416:= 00001
355:: 1 bit
317:in the
298:OpenEXR
269:History
162:Octuple
108:formats
65:16 bits
53:) is a
51:float16
3162:Boxing
3150:topics
3109:Stream
3046:tagged
3004:Object
2927:String
2459:13 May
2430:13 May
2376:GitHub
2355:31 May
2330:13 May
2121:GitHub
2048:
1883:
1873:
1731:Zfhmin
1724:Zfhmin
1718:, the
1716:RISC-V
1677:type.
1654:OpenCL
1650:type.
1620:C++/CX
1614:, and
1592:type.
1526:, and
1516:Vulkan
1512:OpenGL
1496:MATLAB
579:Notes
570:Binary
304:Nvidia
142:Double
132:Single
88:base-2
55:binary
3057:Other
3041:Union
2974:Class
2964:Array
2747:Tryte
2647:from
2450:(PDF)
2159:GPUs.
2146:(PDF)
1955:(PDF)
1881:S2CID
1671:Julia
1642:Swift
1454:65520
1446:65520
1443:32768
1435:32768
1432:16384
1424:16384
808:) = 1
576:Value
533:11111
512:00001
484:00000
424:= −14
176:Other
3177:Kind
3139:Void
2999:List
2914:Text
2752:Word
2742:Trit
2737:Byte
2649:D3DX
2632:(in
2461:2022
2432:2022
2357:2024
2332:2024
2183:2024
2154:2020
2046:ISBN
2015:2016
1871:ISBN
1722:and
1701:F16C
1661:Rust
1632:1.0m
1624:1.0f
1618:and
1596:.NET
1575:SIMD
1551:mesh
1528:D3DX
1508:GIMP
1421:8192
1413:8192
1410:4096
1402:4096
1399:2048
1391:2048
1388:1024
1366:1024
970:0101
949:fc00
932:c000
915:8000
895:7c00
875:1024
869:1023
859:7bff
839:1024
823:3c01
812:one
803:1024
787:3c00
767:1024
761:1023
751:3bff
731:1024
715:3555
695:1024
679:0400
659:1024
653:1023
643:03ff
623:1024
607:0001
590:0000
491:zero
449:= 15
439:= 15
334:F16C
332:The
312:half
306:and
125:Half
75:and
47:FP16
18:FP16
3036:Set
2732:Bit
2592:or
2038:doi
1861:doi
1740:On
1735:Zfh
1720:Zfh
1714:On
1694:x86
1665:f16
1630:or
1590:f16
1586:Zig
1561:or
1449:32
1438:16
1363:512
1341:512
1338:256
1316:256
1313:128
1305:128
990:Max
987:Min
725:341
573:Hex
546:NaN
429:max
414:min
396:).
282:ILM
49:or
37:In
3246::
2526:.
2501:.
2477:.
2452:.
2422:.
2398:.
2374:.
2348:.
2323:.
2298:.
2273:.
2248:.
2224:.
2199:.
2174:.
2156:.
2119:.
2094:.
2068:.
2044:.
2023:^
2005:.
1981:.
1957:.
1917:^
1901:.
1879:.
1869:.
1834:.
1810:.
1737:.
1610:,
1608:F#
1606:,
1604:C#
1522:,
1520:Cg
1518:,
1514:,
1510:,
1506:,
1502:,
1498:,
1460:∞
1427:8
1416:4
1405:2
1394:1
1308:2
1302:64
1297:2
1294:64
1291:32
1286:2
1283:32
1280:16
1275:2
1272:16
1264:2
1253:2
1242:2
1231:2
1206:2
1167:2
1128:2
1103:2
1092:2
1081:2
1070:2
1059:2
1048:2
1037:2
1026:2
1015:2
1004:2
976:.
954:−∞
937:−2
920:−0
495:−0
493:,
390:10
79:.
41:,
2636:)
2619:)
2613:(
2608:)
2604:(
2600:.
2582:.
2536:.
2512:.
2487:.
2463:.
2434:.
2408:.
2384:.
2359:.
2334:.
2309:.
2284:.
2259:.
2234:.
2209:.
2185:.
2130:.
2105:.
2079:.
2054:.
2040::
2017:.
1991:.
1967:.
1939:.
1911:.
1887:.
1863::
1844:.
1820:.
1480:2
1457:∞
1379:2
1376:/
1373:1
1354:4
1351:/
1348:1
1329:8
1326:/
1323:1
1269:8
1261:8
1258:4
1250:4
1247:2
1239:2
1236:1
1228:1
1221:2
1218:/
1215:1
1199:2
1196:/
1193:1
1182:4
1179:/
1176:1
1160:4
1157:/
1154:1
1143:8
1140:/
1137:1
1121:8
1118:/
1115:1
1108:2
1100:2
1097:2
1089:2
1086:2
1078:2
1075:2
1067:2
1064:2
1056:2
1053:2
1045:2
1042:2
1034:2
1031:2
1023:2
1020:2
1012:2
1009:2
1001:2
998:0
900:∞
872:/
836:/
833:1
800:/
797:0
764:/
728:/
692:/
689:0
656:/
620:/
617:1
595:0
539:±
535:2
527:2
518:2
514:2
506:2
486:2
462:2
458:2
447:2
437:2
433:2
427:E
422:2
418:2
412:E
20:)