Method of encoding characters in a URI

URL encoding, officially known as percent-encoding, is a method to encode arbitrary data in a uniform resource identifier (URI) using only the US-ASCII characters legal within a URI. Although it is known as URL encoding, it is also used more generally within the main Uniform Resource Identifier (URI) set, which includes both Uniform Resource Locator (URL) and Uniform Resource Name (URN). Consequently, it is also used in the preparation of data of the application/x-www-form-urlencoded media type, as is often used in the submission of HTML form data in HTTP requests.

Percent-encoding in a URI

Types of URI characters

The characters allowed in a URI are either reserved or unreserved (or a percent character as part of a percent-encoding). Reserved characters are those characters that sometimes have special meaning. For example, forward slash characters are used to separate different parts of a URL (or more generally, a URI). Unreserved characters have no such meanings. Using percent-encoding, reserved characters are represented using special character sequences. The sets of reserved and unreserved characters and the circumstances under which certain reserved characters have special meaning have changed slightly with each revision of specifications that govern URIs and URI schemes.

RFC 3986 section 2.2 Reserved Characters (January 2005)
!  #  $  &  '  (  )  *  +  ,  /  :  ;  =  ?  @  [  ]

RFC 3986 section 2.3 Unreserved Characters (January 2005)
A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
0  1  2  3  4  5  6  7  8  9  -  .  _  ~

Other characters in a URI must be percent-encoded.

Reserved characters

When a character from the reserved set (a "reserved character") has a special meaning (a "reserved purpose") in a certain context, and a URI scheme says that it is necessary to use that character for some other purpose, then the character must be percent-encoded. Percent-encoding a reserved character involves converting the character to its corresponding byte value in ASCII and then representing that value as a pair of hexadecimal digits (if there is a single hex digit, a leading zero is added). The digits, preceded by a percent sign (%) as an escape character, are then used in the URI in place of the reserved character. (A non-ASCII character is typically converted to its byte sequence in UTF-8, and then each byte value is represented as above.)

The reserved character /, for example, if used in the "path" component of a URI, has the special meaning of being a delimiter between path segments. If, according to a given URI scheme, / needs to be in a path segment, then the three characters %2F or %2f must be used in the segment instead of a raw /.

Reserved characters after percent-encoding
!    #    $    &    '    (    )    *    +    ,    /    :    ;    =    ?    @    [    ]
%21  %23  %24  %26  %27  %28  %29  %2A  %2B  %2C  %2F  %3A  %3B  %3D  %3F  %40  %5B  %5D

Reserved characters that have no reserved purpose in a particular context may also be percent-encoded but are not semantically different from those that are not.

In the "query" component of a URI (the part after a ? character), for example, / is still considered a reserved character but it normally has no reserved purpose, unless a particular URI scheme says otherwise. The character does not need to be percent-encoded when it has no reserved purpose.

URIs that differ only by whether a reserved character is percent-encoded or appears literally are normally not considered equivalent (that is, denoting the same resource) unless it can be determined that the reserved characters in question have no reserved purpose. This determination is dependent upon the rules established for reserved characters by individual URI schemes.

Unreserved characters

Characters from the unreserved set never need to be percent-encoded.

URIs that differ only by whether an unreserved character is percent-encoded or appears literally are equivalent by definition, but URI processors, in practice, may not always recognize this equivalence. For example, URI consumers should not treat %41 differently from A or %7E differently from ~, but some do. For maximal interoperability, URI producers are discouraged from percent-encoding unreserved characters.

Percent character

Because the percent character (%) serves to indicate percent-encoded octets, it must itself be percent-encoded as %25 to be used as data within a URI.

Arbitrary data

Most URI schemes involve the representation of arbitrary data, such as an IP address or file system path, as components of a URI. URI scheme specifications should, but often do not, provide an explicit mapping between URI characters and all possible data values being represented by those characters.

Binary data

Since the publication of RFC 1738 in 1994, it has been specified that schemes that provide for the representation of binary data in a URI must divide the data into 8-bit bytes and percent-encode each byte in the same manner as above. Byte value 0x0F, for example, should be represented by %0F, but byte value 0x41 can be represented by A, or %41. The use of unencoded characters for alphanumeric and other unreserved characters is typically preferred, as it results in shorter URLs.

Character data

The procedure for percent-encoding binary data has often been extrapolated, sometimes inappropriately or without being fully specified, to apply to character-based data. In the World Wide Web's formative years, when dealing with data characters in the ASCII repertoire and using their corresponding bytes in ASCII as the basis for determining percent-encoded sequences, this practice was relatively harmless; it was just assumed that characters and bytes mapped one-to-one and were interchangeable. The need to represent characters outside the ASCII range, however, grew quickly, and URI schemes and protocols often failed to provide standard rules for preparing character data for inclusion in a URI. Web applications consequently began using different multi-byte, stateful, and other non-ASCII-compatible encodings as the basis for percent-encoding, leading to ambiguities and difficulty interpreting URIs reliably.

For example, many URI schemes and protocols based on RFCs 1738 and 2396 presume that the data characters will be converted to bytes according to some unspecified character encoding before being represented in a URI by unreserved characters or percent-encoded bytes. If the scheme does not allow the URI to provide a hint as to what encoding was used, or if the encoding conflicts with the use of ASCII to percent-encode reserved and unreserved characters, then the URI cannot be reliably interpreted. Some schemes fail to account for encoding at all and instead just suggest that data characters map directly to URI characters, which leaves it up to implementations to decide whether and how to percent-encode data characters that are in neither the reserved nor unreserved sets.

Common characters after percent-encoding (ASCII or UTF-8 based)
␣    "    %    -    .    <    >    \    ^    _    `    {    |    }    ~    £       €
%20  %22  %25  %2D  %2E  %3C  %3E  %5C  %5E  %5F  %60  %7B  %7C  %7D  %7E  %C2%A3  %E2%82%AC

Arbitrary character data is sometimes percent-encoded and used in non-URI situations, such as for password-obfuscation programs or other system-specific translation protocols.

Current standard

Main article: Internationalized Resource Identifier

The generic URI syntax recommends that new URI schemes that provide for the representation of character data in a URI should, in effect, represent characters from the unreserved set without translation and should convert all other characters to bytes according to UTF-8, and then percent-encode those values. This suggestion was introduced in January 2005 with the publication of RFC 3986. URI schemes introduced before this date are not affected.

Not addressed by the current specification is what to do with encoded character data. For example, in computers, character data manifests in encoded form, at some level, and thus could be treated as either binary or character data when being mapped to URI characters. Presumably, it is up to the URI scheme specifications to account for this possibility and require one or the other, but in practice, few, if any, actually do.

Non-standard implementations

There exists a non-standard encoding for Unicode characters: %uxxxx, where xxxx is a UTF-16 code unit represented as four hexadecimal digits. This behavior is not specified by any RFC and has been rejected by the W3C. The 13th edition of ECMA-262 still includes an escape function that uses this syntax. The encodeURI and encodeURIComponent functions defined in the same specification instead apply UTF-8 encoding to a string and then percent-escape the resulting bytes.

The application/x-www-form-urlencoded type

When data that has been entered into HTML forms is submitted, the form field names and values are encoded and sent to the server in an HTTP request message using method GET or POST, or, historically, via email. The encoding used by default is based on an early version of the general URI percent-encoding rules, with a number of modifications such as newline normalization and replacing spaces with + instead of %20. The media type of data encoded this way is application/x-www-form-urlencoded, and it is currently defined in the HTML and XForms specifications. In addition, the CGI specification contains rules for how web servers decode data of this type and make it available to applications.

When HTML form data is sent in an HTTP GET request, it is included in the query component of the request URI using the same syntax described above. When sent in an HTTP POST request or via email, the data is placed in the body of the message, and application/x-www-form-urlencoded is included in the message's Content-Type header.

See also

Internationalized Resource Identifier
Punycode
Binary-to-text encoding, for a comparison of various encoding algorithms
Shellcode
Base64

References

RFC 1738 §2.2; RFC 2396 §2.4; RFC 3986 §1.2.1, 2.1, 2.5.
"ECMAScript 2017 Language Specification (ECMA-262, 8th edition, June 2017)". Ecma International. Archived from the original on 2018-07-02. Retrieved 2018-06-20.
User-agent support for email based HTML form submission, using a 'mailto' URL as the form action, was proposed in RFC 1867 section 5.6, during the HTML 3.2 era. Various web browsers implemented it by invoking a separate email program or using their own rudimentary SMTP capabilities. Although sometimes unreliable, it was briefly popular as a simple way to transmit form data without involving a web server or CGI scripts.
Berners-Lee, T. (June 1994). "RFC 1630". IETF Tools. IETF. Archived from the original on 21 June 2016. Retrieved 29 June 2016.

External links

The following specifications all discuss and define reserved characters, unreserved characters, and percent-encoding, in some form or other:

RFC 3986 / STD 66 (plus errata), the current generic URI syntax specification.
RFC 2396 (obsolete, plus errata) and RFC 2732 (plus errata) together comprised the previous version of the generic URI syntax specification.
RFC 1738 (mostly obsolete) and RFC 1808 (obsolete), which define URLs.
RFC 1630 (obsolete), the first generic URI syntax specification.
W3C Guidelines on Naming and Addressing: URIs, URLs, ...
W3C explanation of UTF-8 in URIs
W3C HTML form content types

Various implementations:

DevPal URL encoder: online developer tools that support URL encoding.
Online URL encoder and decoder: encodes or decodes URLs within the browser.
URL Encoder online: a website with various options to convert files or texts into URL-encoded format.
URL Encode and Decode - Online: a website with various options to convert files or texts into URL-encoded format.
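
As a rough illustration of the RFC 3986 recommendation (leave unreserved characters untouched, convert every other character to its UTF-8 bytes, and write each byte as a percent sign followed by two hexadecimal digits), the following Python sketch may help. The names UNRESERVED, percent_encode and percent_decode are hypothetical helpers introduced only for this example, and the decoder assumes well-formed input; it does not validate malformed escapes.

```python
# Unreserved characters as listed in RFC 3986 section 2.3.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode(text: str) -> str:
    """Keep unreserved characters; write all others as %XX per UTF-8 byte."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Two uppercase hex digits per byte, with a leading zero if needed.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


def percent_decode(text: str) -> str:
    """Turn %XX triplets back into bytes, then decode the bytes as UTF-8.

    Assumes well-formed input: every % is followed by two hex digits.
    """
    raw = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            raw.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            raw.extend(text[i].encode("utf-8"))
            i += 1
    return raw.decode("utf-8")


print(percent_encode("a/b c"))   # the slash and the space are both escaped
print(percent_encode("100%"))    # the percent character itself becomes %25
print(percent_decode("%41%7E"))  # escaped unreserved characters decode to A and ~
```

This sketch escapes every reserved character unconditionally, which is only appropriate for a single data value such as one path segment or one query value. A real URI producer would leave delimiters that carry their reserved purpose unescaped, according to the rules of the scheme in use.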
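
The difference between generic URI percent-encoding and the application/x-www-form-urlencoded variant (a space written as + rather than %20) can be seen with Python's standard urllib.parse module. This is only a small demonstration using one library; the sample value and field names are made up for the example, and other form rules such as newline normalization are not shown here.

```python
from urllib.parse import parse_qs, quote, quote_plus, urlencode

value = "a b&c=d/\u00e9"  # sample data: space, reserved characters, one non-ASCII letter

# Generic URI percent-encoding: the space becomes %20.
uri_style = quote(value, safe="")

# Form-style encoding: the space becomes "+", other characters are escaped as before.
form_style = quote_plus(value)

# urlencode builds a complete form-urlencoded string from name/value pairs.
body = urlencode({"q": value, "lang": "en"})

# parse_qs reverses the process and turns "+" back into a space.
decoded = parse_qs(body)

print(uri_style)
print(form_style)
print(body)
print(decoded)
```

The same encoded string would appear after the ? in the request URI for a GET submission, or in the message body for a POST submission.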
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.