Term for computer data consisting only of unformatted characters of readable material

[Figure: Text file with portion of The Human Side of Animals by Royal Dixon, displayed by the command cat in an xterm window]

In computing, plain text is a loose term for data (e.g. file contents) that represents only characters of readable material but not its graphical representation nor other objects (floating-point numbers, images, etc.). It may also include a limited number of "whitespace" characters that affect simple arrangement of text, such as spaces, line breaks, or tabulation characters. Plain text is different from formatted text, where style information is included; from structured text, where structural parts of the document such as paragraphs, sections, and the like are identified; and from binary files, in which some portions must be interpreted as binary objects (encoded integers, real numbers, images, etc.).

The term is sometimes used quite loosely, to mean files that contain only "readable" content (or simply files containing nothing that the speaker would rather exclude). For example, that could exclude any indication of fonts or layout (such as markup, markdown, or even tabs); characters such as curly quotes, non-breaking spaces, soft hyphens, em dashes, and/or ligatures; or other things.

In principle, plain text can be in any encoding, but occasionally the term is taken to imply ASCII. As Unicode-based encodings such as UTF-8 and UTF-16 become more common, that usage may be shrinking.

Plain text is also sometimes used only to exclude "binary" files: those in which at least some parts of the file cannot be correctly interpreted via the character encoding in effect. For example, a file or string consisting of "hello" (in any encoding), followed by 4 bytes that express a binary integer that is not a character, is a binary file. Converting a plain text file to a different character encoding does not change the meaning of the text, as long as the correct character encoding is used. However, converting a binary file to a different format may alter the interpretation of the non-textual data.

Plain text and rich text

According to The Unicode Standard:

"Plain text is a pure sequence of character codes; plain Unicode-encoded text is therefore a sequence of Unicode character codes. In contrast, styled text, also known as rich text, is any text representation containing plain text plus added information such as a language identifier, font size, color, hypertext links, and so on. SGML, RTF, HTML, XML, and TeX are examples of rich text fully represented as plain text streams, interspersing plain text data with sequences of characters that represent the additional data structures."

According to other definitions, however, files that contain markup or other meta-data are generally considered plain text, so long as the markup is also in a directly human-readable form (as in HTML, XML, and so on). Thus, representations such as SGML, RTF, HTML, XML, wiki markup, and TeX, as well as nearly all programming language source code files, are considered plain text. The particular content is irrelevant to whether a file is plain text. For example, an SVG file can express drawings or even bitmapped graphics, but is still plain text.

The use of plain text rather than binary files enables files to survive much better "in the wild", in part by making them largely immune to computer architecture incompatibilities. For example, all the problems of endianness can be avoided (with encodings such as UCS-2 rather than UTF-8, endianness matters, but uniformly for every character, rather than for potentially-unknown subsets of it).

Usage

The purpose of using plain text today is primarily independence from programs that require their very own special encoding or formatting or file format. Plain text files can be opened, read, and edited with ubiquitous text editors and utilities.

A command-line interface allows people to give commands in plain text and get a response, also typically in plain text.

Many other computer programs are also capable of processing or creating plain text, such as countless programs in DOS, Windows, classic Mac OS, and Unix and its kin; as well as web browsers (a few browsers such as Lynx and the Line Mode Browser produce only plain text for display) and other e-text readers.

Plain text files are almost universal in programming; a source code file containing instructions in a programming language is almost always a plain text file. Plain text is also commonly used for configuration files, which are read for saved settings at the startup of a program.

Plain text is used for much e-mail.

A comment, a ".txt" file, or a TXT Record generally contains only plain text (without formatting) intended for humans to read.

The best format for storing knowledge persistently is plain text, rather than some binary format.

Encoding

Character encodings

Main article: Character encoding

Before the early 1960s, computers were mainly used for number-crunching rather than for text, and memory was extremely expensive. Computers often allocated only 6 bits for each character, permitting only 64 characters: assigning codes for A-Z, a-z, and 0-9 would leave only 2 codes, nowhere near enough. Most computers opted not to support lower-case letters. Thus, early text projects such as Roberto Busa's Index Thomisticus, the Brown Corpus, and others had to resort to conventions such as keying an asterisk preceding letters actually intended to be upper-case.

Fred Brooks of IBM argued strongly for going to 8-bit bytes, because someday people might want to process text, and won. Although IBM used EBCDIC, most text from then on came to be encoded in ASCII, using values from 0 to 31 for (non-printing) control characters, and values from 32 to 127 for graphic characters such as letters, digits, and punctuation. Most machines stored characters in 8 bits rather than 7, ignoring the remaining bit or using it as a checksum.

The near-ubiquity of ASCII was a great help, but failed to address international and linguistic concerns. The dollar-sign ("$") was not as useful in England, and the accented characters used in Spanish, French, German, Portuguese, and many other languages were entirely unavailable in ASCII (not to mention characters used in Greek, Russian, and most Eastern languages). Many individuals, companies, and countries defined extra characters as needed, often reassigning control characters or using values in the range from 128 to 255. Using values of 128 and above conflicts with using the 8th bit as a checksum, but the checksum usage gradually died out.

These additional characters were encoded differently in different countries, making texts impossible to decode without figuring out the originator's rules. For instance, a browser might display ¬A rather than ` if it tried to interpret one character set as another. The International Organization for Standardization (ISO) eventually developed several code pages under ISO 8859, to accommodate various languages. The first of these (ISO 8859-1) is also known as "Latin-1", and covers the needs of most (not all) European languages that use Latin-based characters (there was not quite enough room to cover them all). ISO 2022 then provided conventions for "switching" between different character sets in mid-file. Many other organisations developed variations on these, and for many years Windows and Macintosh computers used incompatible variations.

The text-encoding situation became more and more complex, leading to efforts by ISO and by the Unicode Consortium to develop a single, unified character encoding that could cover all known (or at least all currently known) languages. After some conflict, these efforts were unified. Unicode currently allows for 1,114,112 code values, and assigns codes covering nearly all modern text writing systems, as well as many historical ones, and for many non-linguistic characters such as printer's dingbats, mathematical symbols, etc.

Text is considered plain text regardless of its encoding. To properly understand or process it the recipient must know (or be able to figure out) what encoding was used; however, they need not know anything about the computer architecture that was used, or about the binary structures defined by whatever program (if any) created the data.

Perhaps the most common way of explicitly stating the specific encoding of plain text is with a MIME type. For email and HTTP, the default MIME type is "text/plain" (plain text without markup). Another MIME type often used in both email and HTTP is "text/html; charset=UTF-8" (plain text represented using the UTF-8 character encoding with HTML markup). Another common MIME type is "application/json" (plain text represented using the UTF-8 character encoding with JSON markup).

When a document is received without any explicit indication of the character encoding, some applications use charset detection to attempt to guess what encoding was used.

Control codes

Main article: C0 and C1 control codes

ASCII reserves the first 32 codes (numbers 0–31 decimal) for control characters known as the "C0 set": codes originally intended not to represent printable information, but rather to control devices (such as printers) that make use of ASCII, or to provide meta-information about data streams such as those stored on magnetic tape. They include common characters like the newline and the tab character.

In 8-bit character sets such as Latin-1 and the other ISO 8859 sets, the first 32 characters of the "upper half" (128 to 159) are also control codes, known as the "C1 set". They are rarely used directly; when they turn up in documents which are ostensibly in an ISO 8859 encoding, their code positions generally refer instead to the characters at that position in a proprietary, system-specific encoding, such as Windows-1252 or Mac OS Roman, that use the codes to instead provide additional graphic characters.

Main article: Unicode control characters

Unicode defines additional control characters, including bi-directional text direction override characters (used to explicitly mark right-to-left writing inside left-to-right writing and the other way around) and variation selectors to select alternate forms of CJK ideographs, emoji and other characters.

See also

Binary data
Binary file
Binary protocol
Source code
Text file
Text-based protocol
Line wrap and word wrap

References

1. "The Unicode Standard, version 14.0" (PDF). pp. 18–19.
2. Andrew Hunt, David Thomas. "The Pragmatic Programmer". 1999. Chapter 14: "The Power of Plain Text". p. 73.
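The "hello" example from the introduction and the endianness remark can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper name `decodes_as` and the sample integer `0xFFFE0080` are invented here, and "decodes without error" is a rough stand-in for "is plain text", not a definition taken from the article.

```python
import struct


def decodes_as(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if every byte of `data` is valid in `encoding`.

    Hypothetical helper: a heuristic only, not a formal test for plain text.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


text = "hello"

# Re-encoding plain text changes the bytes but not the meaning.
utf8_bytes = text.encode("utf-8")      # one byte per character here
utf16_le = text.encode("utf-16-le")    # two bytes per character, low byte first
utf16_be = text.encode("utf-16-be")    # two bytes per character, high byte first
round_trips = (
    utf8_bytes.decode("utf-8")
    == utf16_le.decode("utf-16-le")
    == utf16_be.decode("utf-16-be")
    == text
)

# "hello" followed by a 4-byte big-endian integer (sample value chosen so
# that its bytes are not valid UTF-8): the result is binary data.
blob = utf8_bytes + struct.pack(">I", 0xFFFE0080)

print(decodes_as(utf8_bytes), decodes_as(blob), round_trips)
```

Byte order matters for the two UTF-16 variants, but in the same way for every character, while UTF-8 has no byte-order question at all. The check is deliberately weak: a small integer such as 1 packs to bytes that happen to decode as control characters, and Latin-1 accepts every byte value, so a successful decode does not prove that data was intended as text.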
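The points made under "Encoding" and "Control codes" (a recipient must know the encoding, a MIME type can state it, and bytes 128 to 159 mean different things in ISO 8859-1 and Windows-1252) can be sketched in Python using only the standard library. The helper `decode_body`, its `fallback` parameter, and the sample bytes are assumptions made for illustration; real email and HTTP clients apply more elaborate rules.

```python
import unicodedata
from email.message import Message


def decode_body(content_type: str, body: bytes, fallback: str = "utf-8") -> str:
    """Decode `body` using the charset parameter of a Content-Type value.

    Hypothetical helper: falls back to `fallback` when no charset is given.
    """
    header = Message()
    header["Content-Type"] = content_type
    charset = header.get_content_charset(failobj=fallback)
    return body.decode(charset)


# An explicit charset tells the recipient how to read the bytes.
page = decode_body("text/html; charset=UTF-8", "<p>café</p>".encode("utf-8"))

# The same two bytes around the word "quoted", read under two 8-bit encodings.
raw = b"\x93quoted\x94"
as_latin1 = raw.decode("latin-1")   # 0x93 and 0x94 become C1 control codes
as_cp1252 = raw.decode("cp1252")    # 0x93 and 0x94 become curly quotes

first_latin1 = unicodedata.category(as_latin1[0])  # general category of U+0093
first_cp1252 = unicodedata.name(as_cp1252[0])      # name of the cp1252 reading

print(page)
print(repr(as_latin1), first_latin1)
print(as_cp1252, first_cp1252)
```

Nothing in the bytes themselves says which reading is intended. That is why a declared charset, or failing that a charset-detection guess, is needed before the text can be interpreted.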
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.