285:, as well as in associated peripherals. Since the punched card code then in use only allowed digits, upper-case English letters and a few special characters, six bits were sufficient. These BCD encodings extended existing simple four-bit numeric encoding to include alphabetic and special characters, mapping them easily to punch-card encoding which was already in widespread use. IBM's codes were used primarily with IBM equipment; other computer vendors of the era had their own character codes, often six-bit, but usually had the ability to read tapes produced on IBM equipment. These BCD encodings were the precursors of IBM's
33:
227:
ASCII committee (which contained at least one member of the
Fieldata committee, W. F. Leubbert), which addressed most of the shortcomings of Fieldata, using a simpler code. Many of the changes were subtle, such as collatable character sets within certain numeric ranges. ASCII63 was a success, widely adopted by industry, and with the follow-up issue of the 1967 ASCII code (which added lower-case letters and fixed some "control code" issues) ASCII67 was adopted fairly widely. ASCII67's American-centric nature was somewhat addressed in the European
235:
1165:
4320:
598:), how those code points are mapped to a series of fixed-size natural numbers (code units), and finally how those units are encoded as a stream of octets (bytes). The purpose of this decomposition is to establish a universal set of characters that can be encoded in a variety of ways. To describe this model precisely, Unicode uses its own set of terminology to describe its process:
370:
is the set of characters that can be represented by a particular coded character set. The repertoire may be closed, meaning that no additions are allowed without creating a new standard (as is the case with ASCII and most of the ISO-8859 series); or it may be open, allowing additions (as is the case
326:
Informally, the terms "character encoding", "character map", "character set" and "code page" are often used interchangeably. Historically, the same standard would specify a repertoire of characters and how they were to be encoded into a stream of code units — usually with a single character per code
245:
invented punch card data encoding in the late 19th century to analyze census data. Initially, each hole position represented a different data element, but later, numeric information was encoded by numbering the lower rows 0 to 9, with a punch in a column representing its row number. Later alphabetic
226:
code, a six-or seven-bit code, introduced by the U.S. Army Signal Corps. While
Fieldata addressed many of the then-modern issues (e.g. letter and digit codes arranged for machine collation), it fell short of its goals and was short-lived. In 1963 the first ASCII code was released (X3.4-1963) by the
296:
In trying to develop universally interchangeable character encodings, researchers in the 1980s faced the dilemma that, on the one hand, it seemed necessary to add more bits to accommodate additional characters, but on the other hand, for the users of the relatively small character set of the Latin
1082:
Note in particular that 𐐀 is represented with either one 32-bit value (UTF-32), two 16-bit values (UTF-16), or four 8-bit values (UTF-8). Although each of those forms uses the same total number of bits (32) to represent the glyph, it is not obvious how the actual numeric byte values are related.
641:
to facilitate storage in a system that represents numbers as bit sequences of fixed length (i.e. practically any computer system). For example, a system that stores numeric information in 16-bit units can only directly represent code points 0 to 65,535 in each unit, but larger code points (say,
622:(each code point represents one character). For example, in a given repertoire, the capital letter "A" in the Latin alphabet might be represented by the code point 65, the character "B" by 66, and so on. Multiple coded character sets may share the same character repertoire; for example
524:
UTF-16: code units are twice as long as 8-bit code units. Therefore, any code point with a scalar value less than U+10000 is encoded with a single code unit. Code points with a value U+10000 or higher require two code units each. These pairs of code units have a unique term in UTF-16:
314:. Code points would then be represented in a variety of ways and with various default numbers of bits per character (code units) depending on context. To encode code points higher than the length of the code unit, such as above 256 for eight-bit units, the solution was to implement
309:
was to break the assumption (dating back to telegraph codes) that each character should always directly correspond to a particular sequence of bits. Instead, characters would first be mapped to a universal intermediate representation in the form of abstract numbers called
154:
The history of character codes illustrates the evolving need for machine-mediated character-based symbolic information over a distance, using once-novel electrical means. The earliest codes were based upon manual and hand-written encoding and cyphering systems, such as
179:, introduced in the 1840s, used a system of four "symbols" (short signal, long signal, short space, long space) to generate codes of variable length. Though some commercial use of Morse code was via machinery, it was often used as a manual code, generated by hand on a
297:
alphabet (who still constituted the majority of computer users), those additional bits were a colossal waste of then-scarce and expensive computing resources (as they would always be zeroed out for such users). In 1985, the average personal computer user's
1091:
As a result of having many character encoding methods in use (and the need for backward compatibility with archived data), many computer programs have been developed to translate data between character encoding schemes, a process known as
454:
Despite no longer referring to specific page numbers in a standard, many character encodings are still referred to by their code page number; likewise, the term "code page" is often still used to refer to character encodings in general.
175:, 1869). With the adoption of electrical and electro-mechanical techniques these earliest codes were adapted to the new capabilities and limitations of the early machines. The earliest well-known electrically transmitted character code,
458:
The term "code page" is not used in Unix or Linux, where "charmap" is preferred, usually in the larger context of locales. IBM's
Character Data Representation Architecture (CDRA) designates entities with coded character set identifiers
221:
has been erroneously applied to ITA2 and its many variants. ITA2 suffered from many shortcomings and was often improved by many equipment manufacturers, sometimes creating compatibility issues. In 1959 the U.S. military defined its
754:
In
Unicode, a character can be referred to as 'U+' followed by its codepoint value in hexadecimal. The range of valid code points (the codespace) for the Unicode standard is U+0000 to U+10FFFF, inclusive, divided in 17
404:
size of the character encoding). For example, common code units include 7-bit, 8-bit, 16-bit, and 32-bit. In some encodings, some characters are encoded using multiple code units; such an encoding is referred to as a
556:, there are two distinct approaches that can be taken to encode them: they can be encoded either as a single unified character (known as a precomposed character), or as separate characters that combine into a single
301:
could store only about 10 megabytes, and it cost approximately US$ 250 on the wholesale market (and much higher if purchased separately at retail), so it was very important at the time to make every bit count.
2289:
649:(CES) is the mapping of code units to a sequence of octets to facilitate storage on an octet-based file system or transmission over an octet-based network. Simple character encoding schemes include
206:(ASCII) and Unicode. Unicode, a well-defined and extensible encoding system, has supplanted most earlier character encodings, but the path of code development to the present is fairly well known.
3938:
119:) which represent most of the characters used in many written languages. Character encoding using internationally accepted standards permits worldwide interchange of text in electronic form.
517:
A code point is represented by a sequence of code units. The mapping is defined by the encoding. Thus, the number of code units required to represent a code point depends on the encoding:
746:
The
Unicode model uses the term "character map" for other systems which directly assign a sequence of characters to a sequence of bytes, covering all of the CCS, CEF and CES layers.
1342:
Central, Eastern and
Southern European languages (Albanian, Bosnian, Croatian, Hungarian, Polish, Romanian, Serbian and Slovenian, but also French, German, Italian and Irish Gaelic)
2783:
1507:
605:(ACR) is the full set of abstract characters that a system supports. Unicode has an open repertoire, meaning that new characters will be added to the repertoire over time.
2620:
571:
variants is a choice that must be made when constructing a particular character encoding. Some writing systems, such as Arabic and Hebrew, need to accommodate things like
203:
889:). This string has several Unicode representations which are logically equivalent, yet while each is suited to a diverse set of circumstances or range of requirements:
1962:
1721:, part of the TRON project, is an encoding system that does not use Han Unification; instead, it uses "control codes" to switch between 16-bit "planes" of characters.
2728:
1497:
286:
2803:
2317:
1502:
217:
in 1870, patented in 1874, modified by Donald Murray in 1901, and standardized by CCITT as
International Telegraph Alphabet No. 2 (ITA2) in 1930. The name
1181:
Especially, a comparison of the advantages and disadvantages of the few 3-5 most common character encodings (e.g. UTF-8, UTF-16 and UTF-32). You can help by
2155:
2243:
1421:
for
Central European languages that use Latin script, (Polish, Czech, Slovak, Hungarian, Slovene, Serbian, Croatian, Bosnian, Romanian and Albanian)
4347:
4030:
3784:
2222:
690:
534:
GB 18030: multiple code units per code point are common, because of the small code units. Code points are mapped to one, two, or four code units.
4020:
1910:
1859:
694:
3769:
2723:
2229:
2024:
3903:
1128:
327:
unit. However, due to the emergence of more sophisticated character encodings, the distinction between these terms has become important.
164:
2191:
2173:
739:
character, particularly where there are regional variants that have been 'unified' in
Unicode as the same character. An example is the
4297:
3808:
3611:
2355:
1724:
763:(BMP). This plane contains most commonly-used characters. Characters in the range U+10000 to U+10FFFF in the other planes are called
3853:
3469:
3464:
2967:
2798:
2310:
1522:
2080:
2887:
4105:
4040:
3794:
3774:
725:
115:
only. The low cost of digital representation of data in modern computer systems allows more elaborate character codes (such as
3858:
2451:
2345:
2290:
The
Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
560:. The former simplifies the text handling system, but the latter allows any letter/diacritic combination to be used in text.
3972:
3943:
3593:
2094:
2076:
642:
65,536 to 1.4 million) could be represented by using multiple 16-bit units. This correspondence is defined by a CEF.
32:
4035:
3923:
3883:
2303:
1204:
1155:
1110:
401:
123:
1710:– a system ("glyph set") that includes over 100,000 Chinese character drawings, modern and ancient, popular and obscure
431:
in the IBM standard character set manual, which would define a particular character encoding. Other vendors, including
4342:
4277:
3888:
3804:
3789:
3693:
3606:
3578:
3544:
1682:
1677:
3554:
3549:
4251:
4196:
4117:
3898:
3893:
2902:
250:
represented date internally by the timing of pulses relative to the motion of the cards through the machine. When
3958:
3913:
3749:
3298:
3002:
2947:
2912:
1830:
860:
359:
is a character set mapped to set of unique numbers. For historical reasons, this is also often referred to as a
3473:
2982:
2962:
2957:
2897:
2892:
2401:
1574:
764:
266:
4323:
4307:
4234:
4229:
4191:
4162:
4127:
3559:
3293:
2992:
2877:
1688:
590:, together constitute a unified standard for character encoding. Rather than mapping characters directly to
587:
406:
315:
191:
use. Most codes are of fixed per-character length or variable-length sequences of fixed-length codes (e.g.
89:
43:. Presence and absence of a hole represents 1 and 0, respectively; for example, "W" is encoded as "1010111".
258:
Electronic Multiplier, it used a variety of binary encoding schemes that were tied to the punch card code.
3918:
3908:
3764:
3754:
3288:
2997:
2442:
2429:
2365:
2213:
721:
713:
613:
188:
168:
759:, identified by the numbers 0 to 16. Characters in the range U+0000 to U+FFFF are in plane 0, called the
724:
with fixed-length UCS-2BE and maps Unicode code points to variable-length sequences of 16-bit words. See
4095:
3933:
3868:
3744:
3283:
2437:
894:
544:
531:
UTF-32: the 32-bit code unit is large enough that every code point is represented as a single code unit.
333:
54:
3313:
4256:
3928:
3818:
3688:
3308:
2133:
1903:
1564:
1312:
Western Europe with rationalised character set for Nordic languages, including complete Icelandic set
1216:
561:
262:
135:
3729:
1770:
575:
that are joined in different ways in different contexts, but represent the same semantic character.
318:
where an escape sequence would signal that subsequent bits should be parsed as a higher code point.
4211:
3838:
3323:
3208:
3198:
3193:
1798:
627:
594:, Unicode separately defines a coded character set that maps characters to unique natural numbers (
278:
74:
4292:
4140:
3953:
3948:
3873:
2872:
2846:
2370:
1713:
1615:
1551:
1528:
440:
247:
66:
716:
with fixed-length ASCII and maps Unicode code points to variable-length sequences of octets, or
1515:
is a widely deployed standard for Japanese character encoding that has several encoding forms.
1145:
MultiByteToWideChar/WideCharToMultiByte – to convert from ANSI to Unicode & Unicode to ANSI
4282:
4221:
4201:
3863:
3843:
3823:
3451:
2927:
2907:
2419:
2235:
2225:
2020:
1412:
1135:
463:), each of which is variously called a "charset", "character set", "code page", or "CHARMAP".
444:
372:
234:
156:
1928:
4239:
3813:
3779:
3489:
3318:
2118:
2114:
1729:
1667:
1220:
1131:– A set of C and Java libraries to perform charset conversion. uconv can be used from ICU4C.
397:
242:
172:
139:
100:
4287:
4206:
2937:
2932:
2922:
2867:
2552:
2542:
2537:
2532:
2527:
2522:
2517:
2284:
2011:
1718:
1339:
1333:
1327:
1321:
1315:
1309:
760:
756:
686:
682:
396:
is the minimum bit combination that can represent a character in a character encoding (in
298:
282:
443:, also published their own sets of code pages; the most well-known code page suites are "
214:
2072:
1882:
1846:
in reality, you usually just assume UTF-8 since that is by far the most common encoding.
1125:– a program that converts encoding of input and output to programs running interactively
289:(usually abbreviated as EBCDIC), an eight-bit encoding scheme developed in 1963 for the
3739:
3734:
3724:
3719:
3714:
3709:
3673:
3668:
3661:
3656:
3651:
3646:
3641:
3636:
3631:
3626:
3621:
3616:
3484:
3441:
3436:
3431:
3426:
3421:
3416:
3411:
3406:
3401:
3396:
3391:
3386:
3381:
3376:
3371:
3278:
3273:
3268:
3263:
3258:
3253:
3248:
3243:
3238:
3233:
3228:
3223:
3007:
2987:
2592:
2512:
2507:
2502:
2497:
2492:
2487:
2482:
2477:
2472:
2340:
1303:
1297:
1291:
1285:
1279:
1273:
1267:
1261:
1255:
1208:
1099:
623:
349:
345:
290:
127:
108:
1164:
689:; compressing schemes try to minimize the number of bytes used per code unit (such as
4336:
4059:
3479:
3366:
3361:
3356:
3351:
3346:
3341:
3218:
3213:
3203:
3188:
3183:
3178:
3173:
3168:
3163:
3158:
3153:
3148:
3143:
3138:
3133:
3128:
3123:
3118:
3113:
3108:
3103:
3098:
3093:
3088:
3083:
3078:
3073:
3068:
3063:
3058:
3053:
3048:
3043:
3038:
3033:
3028:
3023:
2942:
2917:
2882:
2841:
2587:
1592:
1492:
1407:
1403:
1399:
1395:
1391:
1387:
1383:
1379:
1375:
1371:
1367:
1363:
1359:
1355:
1351:
1347:
471:
The code unit size is equivalent to the bit measurement for the particular encoding:
448:
184:
180:
1904:"IBM Electronic Data-Processing Machines Type 702 Preliminary Manual of Information"
1749:
863:
of the letters "ab̲c𐐀"—that is, a string containing a Unicode combining character (
4079:
4074:
4069:
4064:
3799:
3539:
3534:
3529:
3524:
3519:
3514:
3509:
3504:
3499:
3494:
2977:
2972:
2952:
2836:
2828:
2461:
1654:
1620:
1556:
1533:
1474:
1466:
1460:
1454:
1448:
1442:
1436:
1430:
1424:
1418:
1249:
678:
62:
36:
17:
2268:
1963:"What's the difference between an Encoding, Code Page, Character Set and Unicode?"
2394:
2377:
2279:
2110:
1694:
1232:
1106:
1093:
428:
199:
112:
104:
246:
data was encoded by allowing more than one punch per column. Electromechanical
4244:
4185:
4152:
4005:
3683:
2778:
2748:
2743:
2738:
2733:
2698:
2582:
2577:
2567:
2562:
2360:
2350:
1707:
1540:
1512:
950:
618:
595:
311:
176:
96:
81:
1732:– used in some applications when character encoding metadata is not available
4132:
4110:
4015:
3828:
2857:
2788:
2768:
2763:
2688:
2683:
1609:
1546:
1518:
735:
which supplies additional information to select the particular variant of a
553:
432:
419:
360:
85:
70:
521:
UTF-8: code points map to a sequence of one, two, three or four code units.
4302:
4157:
4122:
4100:
4010:
3833:
2773:
2758:
2718:
2713:
2708:
2693:
2652:
2647:
2642:
2637:
2632:
2627:
2424:
2414:
2410:
2384:
2285:
Decimal, Hexadecimal Character Codes in HTML Unicode – Encoding converter
2102:
1860:"Ancient Computer Character Code Tables – and Why They're Still Relevant"
1700:
1672:
1580:
1276:
Western Europe and Baltic countries (Lithuania, Estonia, Latvia and Lapp)
920:
717:
705:
701:
666:
662:
658:
654:
572:
549:
Exactly what constitutes a character varies between character encodings.
491:
223:
77:
58:
51:
2050:
4172:
3968:
3878:
3759:
3333:
2703:
2678:
2668:
2406:
2269:
Character sets registered by Internet Assigned Numbers Authority (IANA)
1627:
1569:
736:
583:
274:
270:
269:) six-bit character encoding schemes, starting as early as 1953 in its
255:
192:
160:
116:
80:. The numerical values that make up a character encoding are known as "
2295:
198:
Common examples of character encoding systems include Morse code, the
4177:
4167:
4145:
4025:
4000:
3995:
3848:
3678:
3569:
3459:
2818:
2808:
2793:
2610:
1643:
1638:
1630:(and subsets thereof, such as the 16-bit 'Basic Multilingual Plane')
1483:
1479:
1244:
1224:
1113:. On Firefox 3, for example, see the View/Character Encoding submenu.
674:
670:
630:
all cover the same repertoire but map them to different code points.
526:
505:
498:
487:
436:
344:
is a collection of elements used to represent text. For example, the
228:
143:
293:
that featured a larger character set, including lower case letters.
1270:
Western Europe and South European (Turkish, Maltese plus Esperanto)
4272:
3990:
3985:
3980:
3597:
3303:
2813:
2753:
2615:
2389:
2106:
1650:
1633:
1599:
1237:
1212:
1116:
709:
650:
568:
557:
483:
476:
460:
389:
is the range of numerical values spanned by a coded character set.
233:
131:
40:
31:
3583:
2673:
2239:
1993:
1588:
1487:
1122:
1036:
708:
are simpler CESes, most systems working with Unicode use either
591:
95:
Early character codes associated with the optical or electrical
2299:
382:
is a value or position of a character in a coded character set.
4050:
2098:
1159:
740:
251:
210:
2273:
2101:
character sets, which conform to the structure and rules of
1771:"Usage Survey of Character Encodings broken down by Ranking"
1336:
Added the Euro sign and other rationalisations to ISO 8859-1
424:"Code page" is a historical name for a coded character set.
1831:"Character encoding for iOS developers. Or UTF-8 what now?"
1814:
Android note: The Android platform default is always UTF-8.
1215:, used in 98.2% of surveyed web sites, as of May 2024. In
134:, used in 98.2% of surveyed web sites, as of May 2024. In
770:
The following table shows examples of code point values:
238:
Hollerith 80-column punch card with EBCDIC character set
99:
could only represent a subset of the characters used in
1182:
167:, and the 4-digit encoding of Chinese characters for a
2280:
Unicode Technical Report #17: Character Encoding Model
2013:
The Unicode Standard Version 15.0 – Core Specification
305:
The compromise solution that was eventually found and
1119:– a program and standardized API to convert encodings
1765:
1763:
447:" (based on Windows-1252) and "IBM"/"DOS" (based on
4265:
4220:
4088:
4049:
3967:
3702:
3592:
3568:
3450:
3332:
3016:
2855:
2827:
2661:
2603:
2460:
2333:
1988:
1986:
1984:
1982:
1980:
1978:
1976:
1685:– articles related to character encoding in general
1612:
is a Korean double-byte character encoding standard
1074:
1070:
1066:
1062:
1058:
1054:
1050:
1046:
1042:
1028:
1024:
1020:
1016:
1012:
1008:
998:
994:
990:
986:
982:
972:
968:
964:
960:
956:
942:
938:
934:
930:
926:
912:
908:
904:
900:
681:, switch between several simple schemes by using a
2049:Whistler, Ken; Freytag, Asmus (11 November 2022).
337:is a minimal unit of text that has semantic value.
204:American Standard Code for Information Interchange
1691:– articles detailing specific character encodings
1306:Western Europe with amended Turkish character set
254:went to electronic processing, starting with the
4252:Unicode control, format and separator characters
1330:Celtic languages (Irish Gaelic, Scottish, Welsh)
979:Five UTF-32 code units (32-bit integer values):
2221:. The Systems Programming Series (1 ed.).
2192:"WideCharToMultiByte function (stringapiset.h)"
2174:"MultiByteToWideChar function (stringapiset.h)"
669:; compound character encoding schemes, such as
427:Originally, a code page referred to a specific
84:" and collectively comprise a "code space", a "
1883:"An annotated history of some character codes"
287:Extended Binary-Coded Decimal Interchange Code
2311:
2215:Coded Character Sets, History and Development
2073:"VT510 Video Terminal Programmer Information"
1109:– most modern web browsers feature automatic
586:and its parallel standard, the ISO/IEC 10646
8:
2005:
2003:
1956:
1954:
1952:
1950:
2113:in IBM's standard character set manual) in
39:with the word "Knowledge (XXG)" encoded in
2318:
2304:
2296:
2051:"UTR#17: Unicode Character Encoding Model"
772:
27:Using numbers to represent text characters
183:and decipherable by ear, and persists in
2109:supports a number of IBM PC code pages (
2044:
2042:
2040:
2038:
2036:
1929:"IBM Drives Hard Disks to New Standards"
1035:Nine UTF-8 code units (8-bit values, or
1005:Six UTF-16 code units (16-bit integers)
2223:Addison-Wesley Publishing Company, Inc.
2079:(DEC). 7.1. Character Sets - Overview.
1935:. Popular Computing Inc. pp. 29–33
1741:
637:(CEF) is the mapping of code points to
57:, especially the written characters of
50:is the process of assigning numbers to
2160:Microsoft .NET Framework Class Library
2019:. Unicode Consortium. September 2022.
1543:is an extended version of JIS X 0208.
1175: with: Popularity and comparison:
876:) as well a supplementary character (
371:with Unicode and to a limited extent
7:
2083:from the original on 26 January 2016
1916:from the original on 9 October 2022.
1824:
1822:
1793:
1791:
1591:(a more famous variant is Microsoft
1129:International Components for Unicode
165:international maritime signal flags
3662:Norwegian and Danish (alternative)
2134:"Terminology (The Java Tutorials)"
1725:Universal Character Set characters
25:
2249:from the original on May 26, 2016
1829:Galloway, Matt (9 October 2012).
1096:. Some of these are cited below.
4319:
4318:
1927:Strelho, Kevin (15 April 1985).
1163:
4106:Digital encoding of APL symbols
4041:Comparison of Unicode encodings
2559:Proposed but not approved
1909:. 1954. p. 80. 22-6173-1.
1858:Tom Henderson (17 April 2014).
1750:"Character Encoding Definition"
726:comparison of Unicode encodings
4348:Natural language and computing
2292:by Joel Spolsky (Oct 10, 2003)
2212:Mackenzie, Charles E. (1980).
1961:Shawn Steele (15 March 2005).
552:For example, for letters with
1:
2077:Digital Equipment Corporation
1881:Tom Jennings (1 March 2010).
887:DESERET CAPITAL LETTER LONG I
603:abstract character repertoire
1324:Baltic languages plus Polish
1205:most used character encoding
1156:Popularity of text encodings
1111:character encoding detection
277:computers, and in its later
124:most used character encoding
4278:Character encodings in HTML
3612:National Replacement (NRCS)
3579:Japanese language in EBCDIC
2093:In addition to traditional
1994:"Glossary of Unicode Terms"
1695:Hexadecimal representations
1683:Category:Character encoding
1678:Character encodings in HTML
1142:Encoding.Convert – .NET API
728:for a detailed discussion.
626:and IBM code pages 037 and
4364:
2010:"Chapter 3: Conformance".
1525:is a dialect of Shift_JIS)
1264:Western and Central Europe
1153:
1150:Common character encodings
831:Inverted exclamation mark
542:
527:"Unicode surrogate pairs".
417:
103:, sometimes restricted to
4316:
2156:"Encoding.Convert Method"
2121:of industry-standard PCs.
1754:The Tech Terms Dictionary
1577:(Microsoft Code page 936)
1413:MS-Windows character sets
647:character encoding scheme
316:variable-length encodings
213:encoding, was created by
4308:Variable-length encoding
4089:Miscellaneous code pages
2847:Extended Unix Code / EUC
2538:-15 (New Western Europe)
2334:Early telecommunications
2274:Characters and encodings
1178:Statistics on popularity
765:supplementary characters
761:Basic Multilingual Plane
731:Finally, there may be a
616:that maps characters to
352:are both character sets.
209:The Baudot code, a five-
4235:C0 and C1 control codes
1689:Category:Character sets
635:character encoding form
588:Universal Character Set
564:pose similar problems.
407:variable-width encoding
2483:-3 (Maltese/Esperanto)
2434:World System Teletext
1704:– character set mismap
1427:for Cyrillic alphabets
1223:tasks, both UTF-8 and
579:Unicode encoding model
567:Exactly how to handle
307:developed into Unicode
239:
169:Chinese telegraph code
142:tasks, both UTF-8 and
61:, allowing them to be
44:
4257:Whitespace characters
3934:Ventura International
1996:. Unicode Consortium.
1433:for Western languages
1227:are popular options.
733:higher-level protocol
545:Character (computing)
237:
146:are popular options.
35:
3652:Norwegian and Danish
2117:mode to emulate the
2053:. Unicode Consortium
1756:. 24 September 2010.
1463:for Baltic languages
1217:application programs
743:attribute xml:lang.
508:consists of 32 bits.
501:consists of 16 bits;
368:character repertoire
263:Binary Coded Decimal
136:application programs
4212:Unified Hangul Code
3884:PostScript Standard
3607:Multinational (MCS)
2478:-2 (Central Europe)
2473:-1 (Western Europe)
2327:Character encodings
895:composed characters
779:Unicode code point
750:Unicode code points
722:backward compatible
714:backward compatible
610:coded character set
494:consists of 8 bits;
479:consists of 7 bits;
357:coded character set
248:tabulating machines
18:Coded character set
4343:Character encoding
4293:Hardware code page
4053:typesetting system
3889:PostScript Latin 1
3545:Cyrillic + Finnish
3452:Windows code pages
3334:IBM AIX code pages
2662:National standards
2593:Ukrainian Cyrillic
2276:, by Jukka Korpela
2180:. 13 October 2021.
1835:www.galloway.me.uk
1803:Android Developers
1714:Presentation layer
874:COMBINING LOW LINE
441:Oracle Corporation
373:Windows code pages
240:
105:upper case letters
48:Character encoding
45:
4330:
4329:
4283:Charset detection
4222:Control character
3904:Sharp calculators
3775:Casio calculators
3703:Platform specific
3555:Cyrillic + German
3550:Cyrillic + French
2968:Maltese/Esperanto
2604:Bibliographic use
2488:-4 (North Europe)
2420:T.51/ISO/IEC 6937
2378:Baudot and Murray
2231:978-0-201-14460-4
2026:978-1-936213-32-0
1282:Cyrillic alphabet
1201:
1200:
852:
851:
400:terms, it is the
261:IBM used several
101:written languages
16:(Redirected from
4355:
4322:
4321:
3814:DG International
3689:Special Graphics
3490:Extended Latin-8
2888:Central European
2878:Barents Cyrillic
2583:Barents Cyrillic
2553:-12 (Devanagari)
2549:Abandoned parts
2320:
2313:
2306:
2297:
2258:
2256:
2254:
2248:
2220:
2200:
2199:
2198:. 9 August 2022.
2188:
2182:
2181:
2170:
2164:
2163:
2152:
2146:
2145:
2143:
2141:
2130:
2124:
2123:
2119:console terminal
2090:
2088:
2069:
2063:
2062:
2060:
2058:
2046:
2031:
2030:
2018:
2007:
1998:
1997:
1990:
1971:
1970:
1958:
1945:
1944:
1942:
1940:
1924:
1918:
1917:
1915:
1908:
1900:
1894:
1893:
1891:
1889:
1878:
1872:
1871:
1869:
1867:
1855:
1849:
1848:
1843:
1841:
1826:
1817:
1816:
1811:
1809:
1795:
1786:
1785:
1783:
1781:
1767:
1758:
1757:
1746:
1730:Charset sniffing
1668:Percent-encoding
1557:ISO-2022-JP-2004
1221:operating system
1194:
1191:
1167:
1160:
1076:
1072:
1068:
1064:
1060:
1056:
1052:
1048:
1044:
1030:
1026:
1022:
1018:
1014:
1010:
1000:
996:
992:
988:
984:
974:
970:
966:
962:
958:
944:
940:
936:
932:
928:
914:
910:
906:
902:
888:
885:
882:
880:
875:
872:
869:
867:
773:
687:escape sequences
398:computer science
308:
243:Herman Hollerith
173:Hans Schjellerup
140:operating system
21:
4363:
4362:
4358:
4357:
4356:
4354:
4353:
4352:
4333:
4332:
4331:
4326:
4312:
4288:Han unification
4261:
4216:
4084:
4045:
3963:
3785:Compucolor 8001
3698:
3694:Technical (TCS)
3617:French Canadian
3588:
3564:
3560:Polytonic Greek
3446:
3328:
3012:
2998:Turkic Cyrillic
2913:Font X (Kermit)
2908:Farsi (Persian)
2860:
2851:
2823:
2657:
2599:
2469:Approved parts
2456:
2329:
2324:
2265:
2252:
2250:
2246:
2232:
2218:
2211:
2208:
2206:Further reading
2203:
2190:
2189:
2185:
2172:
2171:
2167:
2154:
2153:
2149:
2139:
2137:
2132:
2131:
2127:
2086:
2084:
2071:
2070:
2066:
2056:
2054:
2048:
2047:
2034:
2027:
2016:
2009:
2008:
2001:
1992:
1991:
1974:
1960:
1959:
1948:
1938:
1936:
1926:
1925:
1921:
1913:
1906:
1902:
1901:
1897:
1887:
1885:
1880:
1879:
1875:
1865:
1863:
1857:
1856:
1852:
1839:
1837:
1828:
1827:
1820:
1807:
1805:
1797:
1796:
1789:
1779:
1777:
1769:
1768:
1761:
1748:
1747:
1743:
1739:
1664:
1659:
1197:
1189:
1186:
1173:needs expansion
1158:
1152:
1089:
886:
883:
878:
877:
873:
870:
865:
864:
857:
752:
683:byte order mark
581:
547:
541:
515:
504:A code unit in
497:A code unit in
482:A code unit in
475:A code unit in
469:
422:
416:
324:
306:
299:hard disk drive
152:
28:
23:
22:
15:
12:
11:
5:
4361:
4359:
4351:
4350:
4345:
4335:
4334:
4328:
4327:
4324:Character sets
4317:
4314:
4313:
4311:
4310:
4305:
4300:
4295:
4290:
4285:
4280:
4275:
4269:
4267:
4266:Related topics
4263:
4262:
4260:
4259:
4254:
4249:
4248:
4247:
4242:
4232:
4230:Morse prosigns
4226:
4224:
4218:
4217:
4215:
4214:
4209:
4204:
4199:
4194:
4189:
4182:
4181:
4180:
4175:
4170:
4160:
4155:
4150:
4149:
4148:
4143:
4135:
4130:
4125:
4120:
4115:
4114:
4113:
4103:
4098:
4092:
4090:
4086:
4085:
4083:
4082:
4077:
4072:
4067:
4062:
4056:
4054:
4047:
4046:
4044:
4043:
4038:
4033:
4028:
4023:
4018:
4013:
4008:
4003:
3998:
3993:
3988:
3983:
3977:
3975:
3965:
3964:
3962:
3961:
3956:
3951:
3946:
3941:
3936:
3931:
3926:
3924:TI calculators
3921:
3916:
3911:
3906:
3901:
3896:
3891:
3886:
3881:
3876:
3871:
3866:
3861:
3856:
3851:
3846:
3841:
3836:
3831:
3826:
3821:
3816:
3811:
3802:
3797:
3792:
3787:
3782:
3777:
3772:
3767:
3762:
3757:
3752:
3747:
3742:
3737:
3732:
3727:
3722:
3717:
3712:
3706:
3704:
3700:
3699:
3697:
3696:
3691:
3686:
3681:
3676:
3671:
3666:
3665:
3664:
3659:
3654:
3649:
3644:
3639:
3634:
3632:United Kingdom
3629:
3624:
3619:
3609:
3603:
3601:
3590:
3589:
3587:
3586:
3581:
3575:
3573:
3566:
3565:
3563:
3562:
3557:
3552:
3547:
3542:
3537:
3532:
3527:
3522:
3517:
3512:
3507:
3502:
3497:
3492:
3487:
3482:
3477:
3467:
3462:
3456:
3454:
3448:
3447:
3445:
3444:
3439:
3434:
3429:
3424:
3419:
3414:
3409:
3404:
3399:
3394:
3389:
3384:
3379:
3374:
3369:
3364:
3359:
3354:
3349:
3344:
3338:
3336:
3330:
3329:
3327:
3326:
3321:
3316:
3311:
3306:
3301:
3296:
3291:
3286:
3281:
3276:
3271:
3266:
3261:
3256:
3251:
3246:
3241:
3236:
3231:
3226:
3221:
3216:
3211:
3206:
3201:
3196:
3191:
3186:
3181:
3176:
3171:
3166:
3161:
3156:
3151:
3146:
3141:
3136:
3131:
3126:
3121:
3116:
3111:
3106:
3101:
3096:
3091:
3086:
3081:
3076:
3071:
3066:
3061:
3056:
3051:
3046:
3041:
3036:
3031:
3026:
3020:
3018:
3017:DOS code pages
3014:
3013:
3011:
3010:
3005:
3000:
2995:
2990:
2985:
2980:
2975:
2970:
2965:
2963:Latin (Kermit)
2960:
2955:
2950:
2945:
2940:
2935:
2930:
2925:
2920:
2915:
2910:
2905:
2900:
2895:
2890:
2885:
2880:
2875:
2870:
2864:
2862:
2853:
2852:
2850:
2849:
2844:
2839:
2833:
2831:
2825:
2824:
2822:
2821:
2816:
2811:
2806:
2801:
2796:
2791:
2786:
2781:
2776:
2771:
2766:
2761:
2756:
2751:
2746:
2741:
2736:
2731:
2726:
2721:
2716:
2711:
2706:
2701:
2696:
2691:
2686:
2681:
2676:
2671:
2665:
2663:
2659:
2658:
2656:
2655:
2650:
2645:
2640:
2635:
2630:
2625:
2624:
2623:
2618:
2607:
2605:
2601:
2600:
2598:
2597:
2596:
2595:
2590:
2585:
2580:
2572:
2571:
2570:
2565:
2563:KOI-8 Cyrillic
2557:
2556:
2555:
2547:
2546:
2545:
2543:-16 (Romanian)
2540:
2535:
2530:
2525:
2520:
2515:
2510:
2505:
2500:
2495:
2490:
2485:
2480:
2475:
2466:
2464:
2458:
2457:
2455:
2454:
2449:
2448:
2447:
2446:
2445:
2440:
2432:
2427:
2422:
2404:
2399:
2398:
2397:
2387:
2382:
2381:
2380:
2375:
2374:
2373:
2368:
2363:
2358:
2348:
2341:Telegraph code
2337:
2335:
2331:
2330:
2325:
2323:
2322:
2315:
2308:
2300:
2294:
2293:
2287:
2282:
2277:
2271:
2264:
2263:External links
2261:
2260:
2259:
2230:
2207:
2204:
2202:
2201:
2196:Microsoft Docs
2183:
2178:Microsoft Docs
2165:
2147:
2125:
2064:
2032:
2025:
1999:
1972:
1967:Microsoft Docs
1946:
1919:
1895:
1873:
1850:
1818:
1787:
1759:
1740:
1738:
1735:
1734:
1733:
1727:
1722:
1716:
1711:
1705:
1697:
1692:
1686:
1680:
1675:
1670:
1663:
1660:
1658:
1657:
1648:
1647:
1646:
1641:
1636:
1625:
1624:
1623:
1618:
1613:
1604:
1603:
1602:
1585:
1584:
1583:
1578:
1572:
1561:
1560:
1559:
1554:
1549:
1547:Shift_JIS-2004
1538:
1537:
1536:
1531:
1526:
1510:
1505:
1500:
1495:
1490:
1477:
1472:
1471:
1470:
1469:for Vietnamese
1464:
1458:
1452:
1446:
1440:
1434:
1428:
1422:
1410:
1345:
1344:
1343:
1337:
1331:
1325:
1319:
1313:
1307:
1301:
1295:
1289:
1283:
1277:
1271:
1265:
1259:
1258:Western Europe
1247:
1242:
1241:
1240:
1229:
1199:
1198:
1196:
1195:
1179:
1170:
1168:
1154:Main article:
1151:
1148:
1147:
1146:
1143:
1133:
1132:
1126:
1120:
1114:
1100:Cross-platform
1088:
1085:
1080:
1079:
1078:
1077:
1033:
1032:
1031:
1003:
1002:
1001:
977:
976:
975:
947:
946:
945:
917:
916:
915:
856:
853:
850:
849:
846:
843:
839:
838:
835:
832:
828:
827:
824:
821:
817:
816:
813:
810:
806:
805:
802:
799:
798:Latin sharp S
795:
794:
791:
788:
784:
783:
780:
777:
751:
748:
624:ISO/IEC 8859-1
580:
577:
543:Main article:
540:
537:
536:
535:
532:
529:
522:
514:
511:
510:
509:
502:
495:
480:
468:
465:
418:Main article:
415:
412:
411:
410:
390:
383:
376:
364:
353:
350:Greek alphabet
346:Latin alphabet
338:
323:
320:
291:IBM System/360
157:Bacon's cipher
151:
148:
59:human language
26:
24:
14:
13:
10:
9:
6:
4:
3:
2:
4360:
4349:
4346:
4344:
4341:
4340:
4338:
4325:
4315:
4309:
4306:
4304:
4301:
4299:
4296:
4294:
4291:
4289:
4286:
4284:
4281:
4279:
4276:
4274:
4271:
4270:
4268:
4264:
4258:
4255:
4253:
4250:
4246:
4243:
4241:
4238:
4237:
4236:
4233:
4231:
4228:
4227:
4225:
4223:
4219:
4213:
4210:
4208:
4205:
4203:
4200:
4198:
4195:
4193:
4190:
4188:
4187:
4183:
4179:
4176:
4174:
4171:
4169:
4166:
4165:
4164:
4161:
4159:
4156:
4154:
4151:
4147:
4144:
4142:
4139:
4138:
4136:
4134:
4131:
4129:
4126:
4124:
4121:
4119:
4116:
4112:
4109:
4108:
4107:
4104:
4102:
4099:
4097:
4094:
4093:
4091:
4087:
4081:
4078:
4076:
4073:
4071:
4068:
4066:
4063:
4061:
4058:
4057:
4055:
4052:
4048:
4042:
4039:
4037:
4034:
4032:
4029:
4027:
4024:
4022:
4019:
4017:
4014:
4012:
4009:
4007:
4004:
4002:
3999:
3997:
3994:
3992:
3989:
3987:
3984:
3982:
3979:
3978:
3976:
3974:
3973:ISO/IEC 10646
3970:
3966:
3960:
3957:
3955:
3952:
3950:
3947:
3945:
3942:
3940:
3937:
3935:
3932:
3930:
3927:
3925:
3922:
3920:
3917:
3915:
3912:
3910:
3907:
3905:
3902:
3900:
3897:
3895:
3892:
3890:
3887:
3885:
3882:
3880:
3877:
3875:
3872:
3870:
3867:
3865:
3862:
3860:
3857:
3855:
3852:
3850:
3847:
3845:
3842:
3840:
3837:
3835:
3832:
3830:
3827:
3825:
3822:
3820:
3817:
3815:
3812:
3810:
3806:
3803:
3801:
3798:
3796:
3793:
3791:
3790:Compucolor II
3788:
3786:
3783:
3781:
3778:
3776:
3773:
3771:
3768:
3766:
3763:
3761:
3758:
3756:
3753:
3751:
3748:
3746:
3745:Acorn RISC OS
3743:
3741:
3738:
3736:
3733:
3731:
3728:
3726:
3723:
3721:
3718:
3716:
3713:
3711:
3708:
3707:
3705:
3701:
3695:
3692:
3690:
3687:
3685:
3682:
3680:
3677:
3675:
3674:8-bit Turkish
3672:
3670:
3667:
3663:
3660:
3658:
3655:
3653:
3650:
3648:
3645:
3643:
3640:
3638:
3635:
3633:
3630:
3628:
3625:
3623:
3620:
3618:
3615:
3614:
3613:
3610:
3608:
3605:
3604:
3602:
3599:
3595:
3591:
3585:
3582:
3580:
3577:
3576:
3574:
3571:
3567:
3561:
3558:
3556:
3553:
3551:
3548:
3546:
3543:
3541:
3538:
3536:
3533:
3531:
3528:
3526:
3523:
3521:
3518:
3516:
3513:
3511:
3508:
3506:
3503:
3501:
3498:
3496:
3493:
3491:
3488:
3486:
3483:
3481:
3478:
3475:
3471:
3468:
3466:
3463:
3461:
3458:
3457:
3455:
3453:
3449:
3443:
3440:
3438:
3435:
3433:
3430:
3428:
3425:
3423:
3420:
3418:
3415:
3413:
3410:
3408:
3405:
3403:
3400:
3398:
3395:
3393:
3390:
3388:
3385:
3383:
3380:
3378:
3375:
3373:
3370:
3368:
3365:
3363:
3360:
3358:
3355:
3353:
3350:
3348:
3345:
3343:
3340:
3339:
3337:
3335:
3331:
3325:
3322:
3320:
3317:
3315:
3312:
3310:
3307:
3305:
3302:
3300:
3297:
3295:
3292:
3290:
3287:
3285:
3282:
3280:
3277:
3275:
3272:
3270:
3267:
3265:
3262:
3260:
3257:
3255:
3252:
3250:
3247:
3245:
3242:
3240:
3237:
3235:
3232:
3230:
3227:
3225:
3222:
3220:
3217:
3215:
3212:
3210:
3207:
3205:
3202:
3200:
3197:
3195:
3192:
3190:
3187:
3185:
3182:
3180:
3177:
3175:
3172:
3170:
3167:
3165:
3162:
3160:
3157:
3155:
3152:
3150:
3147:
3145:
3142:
3140:
3137:
3135:
3132:
3130:
3127:
3125:
3122:
3120:
3117:
3115:
3112:
3110:
3107:
3105:
3102:
3100:
3097:
3095:
3092:
3090:
3087:
3085:
3082:
3080:
3077:
3075:
3072:
3070:
3067:
3065:
3062:
3060:
3057:
3055:
3052:
3050:
3047:
3045:
3042:
3040:
3037:
3035:
3032:
3030:
3027:
3025:
3022:
3021:
3019:
3015:
3009:
3006:
3004:
3001:
2999:
2996:
2994:
2991:
2989:
2986:
2984:
2981:
2979:
2976:
2974:
2971:
2969:
2966:
2964:
2961:
2959:
2956:
2954:
2951:
2949:
2946:
2944:
2941:
2939:
2936:
2934:
2931:
2929:
2926:
2924:
2921:
2919:
2916:
2914:
2911:
2909:
2906:
2904:
2901:
2899:
2896:
2894:
2891:
2889:
2886:
2884:
2881:
2879:
2876:
2874:
2871:
2869:
2866:
2865:
2863:
2859:
2854:
2848:
2845:
2843:
2842:ISO/IEC 10367
2840:
2838:
2835:
2834:
2832:
2830:
2826:
2820:
2817:
2815:
2812:
2810:
2807:
2805:
2802:
2800:
2797:
2795:
2792:
2790:
2787:
2785:
2782:
2780:
2777:
2775:
2772:
2770:
2767:
2765:
2762:
2760:
2757:
2755:
2752:
2750:
2747:
2745:
2742:
2740:
2737:
2735:
2732:
2730:
2727:
2725:
2722:
2720:
2717:
2715:
2712:
2710:
2707:
2705:
2702:
2700:
2697:
2695:
2692:
2690:
2687:
2685:
2682:
2680:
2677:
2675:
2672:
2670:
2667:
2666:
2664:
2660:
2654:
2651:
2649:
2646:
2644:
2641:
2639:
2636:
2634:
2631:
2629:
2626:
2622:
2619:
2617:
2614:
2613:
2612:
2609:
2608:
2606:
2602:
2594:
2591:
2589:
2586:
2584:
2581:
2579:
2576:
2575:
2573:
2569:
2566:
2564:
2561:
2560:
2558:
2554:
2551:
2550:
2548:
2544:
2541:
2539:
2536:
2534:
2531:
2529:
2526:
2524:
2521:
2519:
2516:
2514:
2511:
2509:
2506:
2504:
2501:
2499:
2496:
2494:
2493:-5 (Cyrillic)
2491:
2489:
2486:
2484:
2481:
2479:
2476:
2474:
2471:
2470:
2468:
2467:
2465:
2463:
2459:
2453:
2450:
2444:
2441:
2439:
2436:
2435:
2433:
2431:
2428:
2426:
2423:
2421:
2418:
2417:
2416:
2412:
2408:
2405:
2403:
2400:
2396:
2393:
2392:
2391:
2388:
2386:
2383:
2379:
2376:
2372:
2369:
2367:
2364:
2362:
2359:
2357:
2354:
2353:
2352:
2349:
2347:
2344:
2343:
2342:
2339:
2338:
2336:
2332:
2328:
2321:
2316:
2314:
2309:
2307:
2302:
2301:
2298:
2291:
2288:
2286:
2283:
2281:
2278:
2275:
2272:
2270:
2267:
2266:
2262:
2245:
2241:
2237:
2233:
2227:
2224:
2217:
2216:
2210:
2209:
2205:
2197:
2193:
2187:
2184:
2179:
2175:
2169:
2166:
2161:
2157:
2151:
2148:
2135:
2129:
2126:
2122:
2120:
2116:
2112:
2108:
2104:
2100:
2096:
2082:
2078:
2074:
2068:
2065:
2052:
2045:
2043:
2041:
2039:
2037:
2033:
2028:
2022:
2015:
2014:
2006:
2004:
2000:
1995:
1989:
1987:
1985:
1983:
1981:
1979:
1977:
1973:
1968:
1964:
1957:
1955:
1953:
1951:
1947:
1934:
1930:
1923:
1920:
1912:
1905:
1899:
1896:
1884:
1877:
1874:
1861:
1854:
1851:
1847:
1836:
1832:
1825:
1823:
1819:
1815:
1804:
1800:
1794:
1792:
1788:
1776:
1772:
1766:
1764:
1760:
1755:
1751:
1745:
1742:
1736:
1731:
1728:
1726:
1723:
1720:
1717:
1715:
1712:
1709:
1706:
1703:
1702:
1698:
1696:
1693:
1690:
1687:
1684:
1681:
1679:
1676:
1674:
1671:
1669:
1666:
1665:
1661:
1656:
1652:
1649:
1645:
1642:
1640:
1637:
1635:
1632:
1631:
1629:
1626:
1622:
1619:
1617:
1614:
1611:
1608:
1607:
1605:
1601:
1597:
1596:
1594:
1593:Code page 950
1590:
1586:
1582:
1579:
1576:
1573:
1571:
1568:
1567:
1566:
1562:
1558:
1555:
1553:
1550:
1548:
1545:
1544:
1542:
1539:
1535:
1532:
1530:
1527:
1524:
1523:Code page 932
1520:
1517:
1516:
1514:
1511:
1509:
1506:
1504:
1501:
1499:
1496:
1494:
1491:
1489:
1485:
1481:
1478:
1476:
1473:
1468:
1465:
1462:
1459:
1456:
1453:
1450:
1447:
1444:
1441:
1438:
1435:
1432:
1429:
1426:
1423:
1420:
1417:
1416:
1414:
1411:
1409:
1405:
1401:
1397:
1393:
1389:
1385:
1381:
1377:
1373:
1369:
1365:
1361:
1357:
1353:
1349:
1346:
1341:
1338:
1335:
1332:
1329:
1326:
1323:
1320:
1317:
1314:
1311:
1308:
1305:
1302:
1299:
1296:
1293:
1290:
1287:
1284:
1281:
1278:
1275:
1272:
1269:
1266:
1263:
1260:
1257:
1254:
1253:
1251:
1248:
1246:
1243:
1239:
1236:
1235:
1234:
1231:
1230:
1228:
1226:
1222:
1218:
1214:
1210:
1206:
1193:
1184:
1180:
1177:
1176:
1174:
1171:This section
1169:
1166:
1162:
1161:
1157:
1149:
1144:
1141:
1140:
1139:
1137:
1130:
1127:
1124:
1121:
1118:
1115:
1112:
1108:
1105:
1104:
1103:
1101:
1097:
1095:
1086:
1084:
1041:
1040:
1038:
1034:
1007:
1006:
1004:
981:
980:
978:
955:
954:
952:
949:Five Unicode
948:
925:
924:
922:
918:
899:
898:
896:
892:
891:
890:
862:
854:
847:
844:
842:Section sign
841:
840:
836:
833:
830:
829:
825:
822:
819:
818:
814:
811:
809:Han for East
808:
807:
803:
800:
797:
796:
792:
789:
786:
785:
781:
778:
775:
774:
771:
768:
766:
762:
758:
749:
747:
744:
742:
738:
734:
729:
727:
723:
719:
715:
711:
707:
703:
698:
696:
692:
688:
684:
680:
676:
672:
668:
664:
660:
656:
652:
648:
643:
640:
636:
631:
629:
625:
621:
620:
615:
611:
606:
604:
599:
597:
593:
589:
585:
578:
576:
574:
570:
565:
563:
559:
555:
550:
546:
538:
533:
530:
528:
523:
520:
519:
518:
512:
507:
503:
500:
496:
493:
489:
485:
481:
478:
474:
473:
472:
466:
464:
462:
456:
452:
450:
449:code page 437
446:
442:
438:
434:
430:
425:
421:
413:
408:
403:
399:
395:
391:
388:
384:
381:
377:
374:
369:
365:
362:
358:
354:
351:
347:
343:
342:character set
339:
336:
335:
330:
329:
328:
321:
319:
317:
313:
303:
300:
294:
292:
288:
284:
280:
276:
272:
268:
264:
259:
257:
253:
249:
244:
236:
232:
230:
225:
220:
216:
212:
207:
205:
201:
196:
194:
190:
186:
185:amateur radio
182:
181:telegraph key
178:
174:
170:
166:
162:
158:
149:
147:
145:
141:
137:
133:
129:
125:
120:
118:
114:
110:
106:
102:
98:
93:
91:
90:character map
87:
83:
79:
76:
72:
68:
64:
60:
56:
53:
49:
42:
38:
34:
30:
19:
4240:ISO/IEC 6429
4197:Stanford/ITS
4184:
4118:ARIB STD-B24
3899:Sega SC-3000
3800:DEC RADIX 50
2837:ISO/IEC 8859
2829:ISO/IEC 2022
2574:Adaptations
2533:-14 (Celtic)
2528:-13 (Baltic)
2518:-10 (Nordic)
2513:-9 (Turkish)
2462:ISO/IEC 8859
2326:
2251:. Retrieved
2214:
2195:
2186:
2177:
2168:
2159:
2150:
2138:. Retrieved
2128:
2111:page numbers
2092:
2085:. Retrieved
2067:
2055:. Retrieved
2012:
1966:
1937:. Retrieved
1932:
1922:
1898:
1886:. Retrieved
1876:
1864:. Retrieved
1853:
1845:
1838:. Retrieved
1834:
1813:
1806:. Retrieved
1802:
1778:. Retrieved
1774:
1753:
1744:
1699:
1655:ISO/IEC 6937
1552:EUC-JIS-2004
1475:Mac OS Roman
1467:Windows-1258
1461:Windows-1257
1455:Windows-1256
1449:Windows-1255
1443:Windows-1254
1437:Windows-1253
1431:Windows-1252
1425:Windows-1251
1419:Windows-1250
1202:
1187:
1183:adding to it
1172:
1134:
1107:Web browsers
1098:
1090:
1081:
858:
769:
753:
745:
732:
730:
699:
679:ISO/IEC 2022
646:
644:
638:
634:
632:
617:
609:
607:
602:
600:
582:
566:
551:
548:
516:
470:
457:
453:
426:
423:
393:
386:
379:
367:
356:
341:
332:
325:
304:
295:
260:
241:
218:
215:Émile Baudot
208:
197:
189:aeronautical
153:
121:
94:
47:
46:
37:Punched tape
29:
3959:ZX Spectrum
3914:Sinclair QL
3750:Amstrad CPC
3669:8-bit Greek
3596:terminals (
3309:Iran System
2861:("scripts")
2508:-8 (Hebrew)
2498:-6 (Arabic)
2395:ISO/IEC 646
2087:15 February
1939:10 November
1862:. Smartbear
1621:ISO-2022-KR
1534:ISO-2022-JP
1521:(Microsoft
1445:for Turkish
1340:ISO 8859-16
1334:ISO 8859-15
1328:ISO 8859-14
1322:ISO 8859-13
1316:ISO 8859-11
1310:ISO 8859-10
1094:transcoding
1087:Transcoding
951:code points
859:Consider a
720:, which is
712:, which is
619:code points
612:(CCS) is a
596:code points
513:Code points
429:page number
322:Terminology
312:code points
283:1400 series
279:7000 Series
200:Baudot code
113:punctuation
82:code points
71:transformed
67:transmitted
4337:Categories
4245:JIS X 0211
4153:ISO-IR-169
4006:UTF-EBCDIC
3572:code pages
3299:CSX+ Indic
2903:Devanagari
2858:Code pages
2779:LST 1590-4
2749:JIS X 0213
2744:JIS X 0212
2739:JIS X 0208
2734:JIS X 0201
2699:GOST 10859
2621:CCCII/EACC
2523:-11 (Thai)
2503:-7 (Greek)
2438:background
2361:Wabun/Kana
2253:August 25,
1888:1 November
1737:References
1598:Hong Kong
1541:JIS X 0213
1513:JIS X 0208
1457:for Arabic
1451:for Hebrew
1304:ISO 8859-9
1298:ISO 8859-8
1292:ISO 8859-7
1286:ISO 8859-6
1280:ISO 8859-5
1274:ISO 8859-4
1268:ISO 8859-3
1262:ISO 8859-2
1256:ISO 8859-1
999:0x00010400
995:0x00000063
991:0x00000332
987:0x00000062
983:0x00000061
820:Ampersand
776:Character
639:code units
554:diacritics
539:Characters
467:Code units
414:Code pages
387:code space
380:code point
231:standard.
177:Morse code
55:characters
4298:MICR code
4133:IEC-P27-1
4111:ISO-IR-68
4016:DIN 91379
3894:SAM Coupé
3829:GSM 03.38
3819:Galaksija
3314:Kamenický
3294:CSX Indic
3003:Ukrainian
2789:Shift JIS
2769:KS X 1002
2764:KS X 1001
2689:DIN 66003
2684:CNS 11643
2452:Transcode
2430:ITU T.101
2356:Non-Latin
2057:12 August
1933:InfoWorld
1840:2 January
1808:2 January
1799:"Charset"
1610:KS X 1001
1519:Shift JIS
1439:for Greek
1190:June 2024
921:graphemes
884:𐐀
700:Although
573:graphemes
562:Ligatures
433:Microsoft
420:Code page
394:code unit
361:code page
334:character
111:and some
97:telegraph
88:", or a "
86:code page
78:computers
52:graphical
4303:Mojibake
4158:ISO 2033
4123:Fieldata
4101:ASMO 449
4011:GB 18030
3971: /
3919:Teletext
3909:Sharp MZ
3839:HP FOCAL
3834:HP Roman
3765:Atari ST
3755:Apple II
3289:CS Indic
2983:Romanian
2958:Keyboard
2938:Gurmukhi
2933:Gujarati
2923:Georgian
2898:Cyrillic
2893:Croatian
2868:Armenian
2774:LST 1564
2759:KPS 9566
2719:GB 18030
2714:GB 12052
2709:GB 12345
2694:ELOT 927
2628:ISO 5426
2588:Estonian
2425:ITU T.61
2415:Teletext
2411:Videotex
2385:Fieldata
2371:Cyrillic
2244:Archived
2240:77-90165
2140:25 March
2136:. Oracle
2103:ISO 2022
2081:Archived
1911:Archived
1866:29 April
1780:29 April
1701:Mojibake
1673:Alt code
1662:See also
1581:GB 18030
1563:Chinese
1250:ISO 8859
787:Latin A
718:UTF-16BE
706:UTF-32LE
702:UTF-32BE
667:UTF-32LE
663:UTF-16LE
659:UTF-32BE
655:UTF-16BE
614:function
492:GB 18030
224:Fieldata
109:numerals
4192:SEASCII
4186:Mojikyō
4173:KOI8-RU
4096:ABICOMP
3969:Unicode
3879:PETSCII
3869:NEC APC
3805:DEC MCS
3760:ATASCII
3657:Swedish
3642:Finnish
3627:Spanish
3319:Mazovia
3284:ABICOMP
2993:Turkish
2948:Iceland
2856:Mac OS
2799:TIS-620
2704:GB 2312
2679:BraSCII
2669:ArmSCII
2407:Teletex
2366:Chinese
1775:W3Techs
1708:Mojikyō
1628:Unicode
1606:Korean
1587:Taiwan
1570:GB 2312
1565:Guobiao
1233:ISO 646
1207:on the
1136:Windows
973:U+10400
879:U+10400
871:̲
855:Example
845:U+00A7
834:U+00A1
823:U+0026
812:U+6771
801:U+00DF
790:U+0041
737:Unicode
584:Unicode
445:Windows
256:IBM 603
193:Unicode
161:Braille
150:History
126:on the
117:Unicode
75:digital
4202:Symbol
4178:KOI8-U
4168:KOI8-R
4036:TACE16
4026:CESU-8
4021:BOCU-1
4001:UTF-32
3996:UTF-16
3939:WISCII
3929:TRS-80
3849:SQUOZE
3844:HP RPL
3684:Hebrew
3679:SI 960
3647:French
3570:EBCDIC
3460:CER-GS
2943:Hebrew
2918:Gaelic
2883:Celtic
2873:Arabic
2819:YUSCII
2809:VISCII
2794:SI 960
2784:PASCII
2633:5426-2
2611:MARC-8
2346:Needle
2238:
2228:
2115:PCTerm
2105:, the
2023:
1644:UTF-32
1639:UTF-16
1616:EUC-KR
1529:EUC-JP
1508:VISCII
1484:KOI8-U
1480:KOI8-R
1300:Hebrew
1288:Arabic
1245:EBCDIC
1225:UTF-16
1029:0xDC00
1025:0xD801
1021:0x0063
1017:0x0332
1013:0x0062
1009:0x0061
969:U+0063
965:U+0332
961:U+0062
957:U+0061
881:
868:
866:U+0332
861:string
826:&
782:Glyph
757:planes
675:UTF-32
671:UTF-16
665:, and
506:UTF-32
499:UTF-16
488:EBCDIC
461:CCSIDs
439:, and
229:ECMA-6
219:baudot
202:, the
144:UTF-16
73:using
69:, and
63:stored
4273:CCSID
4146:8-bit
4141:7-bit
4137:INIS
3991:UTF-8
3986:UTF-7
3981:UTF-1
3859:LMBCS
3795:CP/M+
3637:Dutch
3622:Swiss
3304:CWI-2
3008:VT100
2978:Roman
2973:Ogham
2953:Inuit
2928:Greek
2814:VSCII
2804:TSCII
2754:KOI-7
2729:ISCII
2724:HKSCS
2616:ANSEL
2578:Welsh
2402:BCDIC
2390:ASCII
2351:Morse
2247:(PDF)
2219:(PDF)
2107:VT510
2017:(PDF)
1914:(PDF)
1907:(PDF)
1651:ANSEL
1634:UTF-8
1600:HKSCS
1503:TSCII
1498:ISCII
1408:CP872
1404:CP869
1400:CP866
1396:CP865
1392:CP863
1388:CP862
1384:CP861
1380:CP860
1376:CP858
1372:CP857
1368:CP855
1364:CP852
1360:CP850
1356:CP737
1352:CP720
1348:CP437
1294:Greek
1238:ASCII
1213:UTF-8
1117:iconv
1037:bytes
919:Five
893:Four
710:UTF-8
651:UTF-8
592:bytes
569:glyph
558:glyph
484:UTF-8
477:ASCII
132:UTF-8
41:ASCII
4207:TRON
4060:Cork
4031:SCSU
3954:ZX81
3949:ZX80
3944:XCCS
3874:NeXT
3854:LICS
3809:NRCS
3770:BICS
3740:1058
3735:1057
3730:1056
3725:1055
3720:1054
3715:1053
3710:1052
3584:DKOI
3540:1270
3535:1258
3530:1257
3525:1256
3520:1255
3515:1254
3510:1253
3505:1252
3500:1251
3495:1250
3485:1169
3442:1133
3437:1124
3432:1046
3427:1019
3422:1018
3417:1017
3412:1016
3407:1015
3402:1014
3397:1013
3392:1012
3387:1010
3382:1009
3377:1008
3372:1006
3279:3846
3274:1127
3269:1118
3264:1117
3259:1116
3254:1115
3249:1098
3244:1044
3239:1043
3234:1042
3229:1040
3224:1034
2988:Sámi
2674:Big5
2653:6862
2648:6438
2643:5428
2638:5427
2568:Sámi
2443:sets
2409:and
2255:2019
2236:LCCN
2226:ISBN
2142:2018
2097:and
2089:2017
2059:2023
2021:ISBN
1941:2020
1890:2018
1868:2014
1842:2021
1810:2021
1782:2024
1719:TRON
1589:Big5
1488:KOI7
1318:Thai
1219:and
1203:The
1123:luit
1075:0x80
1071:0x90
1067:0x90
1063:0xF0
1059:0x63
1055:0xB2
1051:0xCC
1047:0x62
1043:0x61
704:and
695:BOCU
693:and
691:SCSU
677:and
490:and
402:word
348:and
281:and
273:and
187:and
138:and
122:The
4163:KOI
4080:OT1
4075:OMS
4070:OML
4065:LY1
4051:TeX
3864:MSX
3824:GEM
3780:CDC
3598:VTx
3594:DEC
3480:950
3474:GBK
3470:936
3465:932
3367:922
3362:921
3357:915
3352:912
3347:896
3342:895
3324:MIK
3219:951
3214:950
3209:949
3204:942
3199:936
3194:932
3189:904
3184:903
3179:899
3174:897
3169:869
3164:868
3159:867
3154:866
3149:865
3144:864
3139:863
3134:862
3129:861
3124:860
3119:859
3114:858
3109:857
3104:856
3099:855
3094:853
3089:852
3084:851
3079:850
3074:778
3069:777
3064:776
3059:775
3054:773
3049:770
3044:737
3039:720
3034:708
3029:668
3024:437
2099:ISO
2095:DEC
1653:or
1575:GBK
1493:MIK
1211:is
1209:web
1185:.
741:XML
697:).
685:or
628:500
601:An
451:).
437:SAP
275:704
271:702
267:BCD
252:IBM
211:bit
195:).
130:is
128:web
92:".
4339::
4128:HZ
2242:.
2234:.
2194:.
2176:.
2158:.
2091:.
2075:.
2035:^
2002:^
1975:^
1965:.
1949:^
1931:.
1844:.
1833:.
1821:^
1812:.
1801:.
1790:^
1773:.
1762:^
1752:.
1595:)
1486:,
1482:,
1415::
1406:,
1402:,
1398:,
1394:,
1390:,
1386:,
1382:,
1378:,
1374:,
1370:,
1366:,
1362:,
1358:,
1354:,
1350:,
1252::
1138::
1102::
1073:,
1069:,
1065:,
1061:,
1057:,
1053:,
1049:,
1045:,
1039:)
1027:,
1023:,
1019:,
1015:,
1011:,
997:,
993:,
989:,
985:,
971:,
967:,
963:,
959:,
953::
943:𐐀
941:,
937:,
933:,
929:,
923::
913:𐐀
911:,
907:,
905:b̲
903:,
897::
848:§
837:¡
815:東
804:ß
793:Α
767:.
673:,
661:,
657:,
653:,
645:A
633:A
608:A
486:,
435:,
392:A
385:A
378:A
375:).
366:A
355:A
340:A
331:A
163:,
159:,
107:,
65:,
3807:/
3600:)
3476:)
3472:(
2413:/
2319:e
2312:t
2305:v
2257:.
2162:.
2144:.
2061:.
2029:.
1969:.
1943:.
1892:.
1870:.
1784:.
1192:)
1188:(
939:c
935:_
931:b
927:a
909:c
901:a
459:(
409:.
363:.
265:(
171:(
20:)
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.