Text file

A text file (sometimes spelled textfile; an old alternative name is flatfile) is a kind of computer file that is structured as a sequence of lines of electronic text. A text file is stored as data within a computer file system.

Filename extension: .txt
Internet media type: text/plain
Type code: TEXT
Uniform Type Identifier (UTI): public.plain-text
UTI conformation: public.text
Type of format: Document file format, Generic container format

In operating systems such as CP/M, where the operating system does not keep track of the file size in bytes, the end of a text file is denoted by placing one or more special characters, known as an end-of-file (EOF) marker, as padding after the last line in a text file. In modern operating systems such as DOS, Microsoft Windows and Unix-like systems, text files do not contain any special EOF character, because file systems on those operating systems keep track of the file size in bytes.

Some operating systems, such as Multics, Unix-like systems, CP/M, DOS, the classic Mac OS, and Windows, store text files as a sequence of bytes, with an end-of-line delimiter at the end of each line. Other operating systems, such as OpenVMS and OS/360 and its successors, have record-oriented filesystems, in which text files are stored as a sequence either of fixed-length records or of variable-length records with a record-length value in the record header.

"Text file" refers to a type of container, while plain text refers to a type of content.

At a generic level of description, there are two kinds of computer files: text files and binary files.

Data storage

Because of their simplicity, text files are commonly used for storage of information. They avoid some of the problems encountered with other file formats, such as endianness, padding bytes, or differences in the number of bytes in a machine word. Further, when data corruption occurs in a text file, it is often easier to recover and continue processing the remaining contents. A disadvantage of text files is that they usually have low entropy, meaning that the information occupies more storage than is strictly necessary.

A simple text file may need no additional metadata (other than knowledge of its character set) to assist the reader in interpretation. A text file may contain no data at all, which is the case of a zero-byte file.

Encoding

The ASCII character set is the most common compatible subset of character sets for English-language text files, and is generally assumed to be the default file format in many situations. It covers American English, but for the British pound sign, the euro sign, or characters used outside English, a richer character set must be used. In many systems, this is chosen based on the default locale setting on the computer it is read on. Prior to UTF-8, this was traditionally single-byte encodings (such as ISO-8859-1 through ISO-8859-16) for European languages and wide character encodings for Asian languages.

Because encodings necessarily have only a limited repertoire of characters, often very small, many are only usable to represent text in a limited subset of human languages. Unicode is an attempt to create a common standard for representing all known languages, and most known character sets are subsets of the very large Unicode character set. Although there are multiple character encodings available for Unicode, the most common is UTF-8, which has the advantage of being backwards-compatible with ASCII; that is, every ASCII text file is also a UTF-8 text file with identical meaning. UTF-8 also has the advantage that it is easily auto-detectable. Thus, a common operating mode of UTF-8 capable software, when opening files of unknown encoding, is to try UTF-8 first and fall back to a locale-dependent legacy encoding when it definitely is not UTF-8.

Formats

On most operating systems, the name text file refers to a file format that allows only plain text content with very little formatting (e.g., no bold or italic types). Such files can be viewed and edited on text terminals or in simple text editors. Text files usually have the MIME type text/plain, usually with additional information indicating an encoding.

Microsoft Windows text files

DOS and Microsoft Windows use a common text file format, with each line of text separated by a two-character combination: carriage return (CR) and line feed (LF). It is common for the last line of text not to be terminated with a CR-LF marker, and many text editors (including Notepad) do not automatically insert one on the last line.

On Microsoft Windows operating systems, a file is regarded as a text file if the suffix of the name of the file (the "filename extension") is .txt. However, many other suffixes are used for text files with specific purposes. For example, source code for computer programs is usually kept in text files that have file name suffixes indicating the programming language in which the source is written.

Most Microsoft Windows text files use ANSI, OEM, Unicode or UTF-8 encoding. What Microsoft Windows terminology calls "ANSI encodings" are usually single-byte ISO/IEC 8859 encodings (that is, "ANSI" in the Microsoft Notepad menus really means "System Code Page", a non-Unicode legacy encoding), except in locales such as Chinese, Japanese and Korean that require double-byte character sets. ANSI encodings were traditionally used as default system locales within Microsoft Windows, before the transition to Unicode. By contrast, OEM encodings, also known as DOS code pages, were defined by IBM for use in the original IBM PC text mode display system. They typically include graphical and line-drawing characters common in DOS applications. "Unicode"-encoded Microsoft Windows text files contain text in the UTF-16 Unicode Transformation Format. Such files normally begin with a byte order mark (BOM), which communicates the endianness of the file content. Although UTF-8 does not suffer from endianness problems, many Microsoft Windows programs (e.g. Notepad) prepend a BOM to the contents of UTF-8-encoded files, to differentiate UTF-8 encoding from other 8-bit encodings.

Unix text files

On Unix-like operating systems, the text file format is precisely defined: POSIX defines a text file as a file that contains characters organized into zero or more lines, where lines are sequences of zero or more non-newline characters plus a terminating newline character, normally LF.

Additionally, POSIX defines a printable file as a text file whose characters are printable or space or backspace according to regional rules. This excludes most control characters, which are not printable.

Apple Macintosh text files

Prior to the advent of macOS, the classic Mac OS system regarded the content of a file (the data fork) to be a text file when its resource fork indicated that the type of the file was "TEXT". Lines of classic Mac OS text files are terminated with CR characters.

Being a Unix-like system, macOS uses Unix format for text files. The Uniform Type Identifier (UTI) used for text files in macOS is "public.plain-text"; additional, more specific UTIs are "public.utf8-plain-text" for UTF-8-encoded text, "public.utf16-external-plain-text" and "public.utf16-plain-text" for UTF-16-encoded text, and "com.apple.traditional-mac-plain-text" for classic Mac OS text files.

Rendering

When opened by a text editor, human-readable content is presented to the user. This often consists of the file's plain text visible to the user. Depending on the application, control codes may be rendered either as literal instructions acted upon by the editor, or as visible escape characters that can be edited as plain text. Though there may be plain text in a text file, control characters within the file (especially the end-of-file character) can leave that plain text unseen when the file is read by a particular method.

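The "try UTF-8 first, then fall back" approach described in this article, together with the byte order mark conventions it mentions, can be sketched in Python roughly as follows. This is a minimal illustration rather than a definitive detector: the function name `decode_text` and the default fallback of `cp1252` are assumptions made for the example, and a real program would more likely take the fallback from the system locale.

```python
import codecs
from typing import Tuple


def decode_text(data: bytes, legacy_encoding: str = "cp1252") -> Tuple[str, str]:
    """Decode raw file bytes, returning (text, name of the encoding used).

    Hypothetical helper: the name and the cp1252 default are illustrative.
    """
    # A UTF-16 byte order mark communicates the endianness of the content.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order, and drops it.
        return data.decode("utf-16"), "utf-16"
    try:
        # "utf-8-sig" decodes UTF-8 and strips a leading BOM if one is present.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Definitely not UTF-8: fall back to the assumed legacy encoding.
        return data.decode(legacy_encoding, errors="replace"), legacy_encoding


if __name__ == "__main__":
    print(decode_text(b"plain ASCII is also valid UTF-8"))
    print(decode_text(b"\xef\xbb\xbfsaved with a UTF-8 BOM"))
    print(decode_text(b"caf\xe9"))  # not valid UTF-8, so the fallback is used
```

Because every ASCII file is also valid UTF-8, the first call takes the UTF-8 path; only byte sequences that cannot be UTF-8 reach the legacy decoder.
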
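The three line-ending conventions mentioned in this article (CR-LF on DOS and Windows, LF on Unix-like systems and macOS, CR on classic Mac OS) can be told apart by counting byte sequences. The sketch below is illustrative only; the helper names `detect_newline` and `to_unix_newlines` are invented for the example, and files with mixed endings are simply reported by their most frequent convention.

```python
def detect_newline(data: bytes) -> str:
    """Return "CRLF", "LF", "CR", or "none" for the dominant line ending."""
    crlf = data.count(b"\r\n")
    # Subtract the CR-LF pairs so that only lone LF and lone CR are counted.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    counts = {"CRLF": crlf, "LF": lf, "CR": cr}
    if not any(counts.values()):
        return "none"
    return max(counts, key=counts.get)


def to_unix_newlines(data: bytes) -> bytes:
    """Convert CR-LF and lone CR line endings to LF."""
    # Replace the two-byte sequence first so its CR is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    sample = b"first line\r\nsecond line\r\n"
    print(detect_newline(sample))
    print(to_unix_newlines(sample))
```

Working on bytes rather than decoded text keeps the check independent of the character encoding, which is reasonable for ASCII-compatible encodings but not for UTF-16.
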
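The POSIX definitions quoted in this article can be approximated with two small checks. This is a sketch under stated assumptions: the function names are hypothetical, only the rule given here (zero or more lines, each ending in a newline) is tested, and Python's Unicode-based `str.isprintable` and `str.isspace` stand in for the locale-dependent "regional rules" that POSIX refers to.

```python
def is_posix_text_file(data: bytes) -> bool:
    """True if the data is empty or its last line ends with a newline (LF)."""
    # Zero lines (an empty file) is allowed; otherwise every line, including
    # the last one, must be terminated by a newline character.
    return data == b"" or data.endswith(b"\n")


def is_printable_file(data: bytes, encoding: str = "utf-8") -> bool:
    """Approximate the POSIX "printable file" test (encoding is an assumption)."""
    if not is_posix_text_file(data):
        return False
    try:
        text = data.decode(encoding)
    except UnicodeDecodeError:
        return False
    # Allowed: printable characters, space-class characters, and backspace.
    return all(ch.isprintable() or ch.isspace() or ch == "\b" for ch in text)


if __name__ == "__main__":
    print(is_posix_text_file(b"no trailing newline"))
    print(is_printable_file(b"tabs\tand spaces are fine\n"))
    print(is_printable_file(b"a bell character \x07 is not\n"))
```

A file whose last line lacks a terminating newline, which the article notes is common on Windows, fails the first check even though most editors open it without complaint.
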
855:Archived 748:See also 608:DOS and 532:through 441:Encoding 427:metadata 224:flatfile 220:textfile 1364:Linking 794:Unicode 774:Newline 626:Notepad 560:Formats 545:Unicode 481:removed 466:sources 420:entropy 404:storage 357:removed 342:sources 295:, have 289:OpenVMS 274:Multics 94:scholar 1414:Backup 1391:Shadow 825:  759:EBCDIC 715:, the 669:UTF-16 661:IBM PC 635:") is 581:italic 526:locale 520:, the 96:  89:  82:  75:  67:  1386:Alias 1355:Write 1345:Close 1112:Types 754:ASCII 713:macOS 689:POSIX 596:type 550:UTF-8 232:lines 101:JSTOR 87:books 1350:Read 1340:Open 1322:Path 823:ISBN 637:.txt 594:MIME 575:bold 511:The 464:any 462:cite 340:any 338:cite 291:and 265:and 251:CP/M 176:TEXT 153:.txt 73:news 683:On 657:IBM 622:not 578:or 475:by 393:CSV 351:by 259:DOS 234:of 56:by 1490:: 1121:/ 1042:. 1038:. 1027:^ 1010:. 1006:. 995:^ 979:. 973:. 949:. 943:. 919:. 913:. 890:no 886:. 849:. 845:. 437:. 313:. 261:, 246:. 214:A 205:, 1097:e 1090:t 1083:v 1053:. 1021:. 989:. 959:. 929:. 864:. 831:. 502:) 496:( 491:) 487:( 483:. 469:. 378:) 372:( 367:) 363:( 359:. 345:. 123:) 117:( 112:) 108:( 98:· 91:· 84:· 77:· 50:. 20:)

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.