IDN homograph attack

, in that both forms of attack use a name that looks similar to a more established domain to fool a user. The major difference is that in typosquatting the perpetrator attracts victims by relying on natural typographical errors commonly made when manually entering a URL, while in homograph spoofing the perpetrator deceives the victims by presenting visually indistinguishable hyperlinks. Indeed, it would be a rare accident for a web user to type, for example, a Cyrillic letter within an otherwise English word, turning "b

) have later been accepted. Three-letter TLDs are considered safer than two-letter TLDs, since they are harder to match to normal Latin ISO-3166 country domains; although the potential to match new generic domains remains, registering such generic domains is far more expensive than registering a second- or third-level domain address, making it cost-prohibitive to try to register a homoglyphic TLD for the sole purpose of making fraudulent domains (which itself would draw ICANN scrutiny).

ք, which can either resemble p or f depending on the font; ա can resemble Cyrillic ш. However, the use of Armenian is less reliable: not all standard fonts feature Armenian glyphs (whereas Greek and Cyrillic glyphs are widely included); Windows prior to Windows 7 rendered Armenian in a distinct font,
In their 2019 study, Suzuki et al. introduced ShamFinder, a framework for detecting IDN homographs, shedding light on their prevalence in real-world scenarios. Similarly, Chiba et al. (2019) designed DomainScouter, a system for detecting diverse homograph IDNs; by analyzing an estimated 4.4
Hebrew spoofing is generally rare. Only three letters from that alphabet can reliably be used: samekh (ס), which sometimes resembles o, vav with diacritic (וֹ), which resembles an i, and heth (ח), which resembles the letter n. Less accurate approximants for some other alphanumerics can also be found,
can also contribute critical characters: several Armenian characters like օ, ո, ս, as well as capital Տ and Լ, are often completely identical to Latin characters in modern fonts, and symbols which are similar enough to pass off, such as ցհոօզս which look like ghnoqu, յ which resembles j (albeit dotless), and
provide a backward-compatible way for domain names to use the full Unicode character set, and this standard is already widely supported. However, this system expanded the character repertoire from a few dozen characters in a single alphabet to many thousands of characters in many scripts; this greatly
just like that of a legitimate website, but in which some of the letters have been replaced by homographs in another alphabet. The attacker could then send e-mail messages purporting to come from the original site, but directing people to the bogus site. The spoof site could then record information
a Latin "a"; there is no difference in the glyphs for these characters in most fonts. However, the computer treats them differently when processing the character string as an identifier. Thus, the user's assumption of a one-to-one correspondence between the visual appearance of a name and the named
Problems of this kind were anticipated before IDN was introduced, and guidelines were issued to registries to try to avoid or reduce the problem. For example, it was advised that registries only accept characters from the Latin alphabet and that of their own country, not all Unicode characters,
As an additional defense, Internet Explorer 7, Firefox 2.0 and above, and Opera 9.10 include phishing filters that attempt to alert users when they visit malicious websites. As of April 2017, several browsers (including Chrome, Firefox and Opera) were displaying IDNs consisting purely of Cyrillic
; both ì and í are included in most standard character sets and fonts) that can only be detected with close inspection. In most top-level domain registries, wíkipedia.tld (xn--wkipedia-c2a.tld) and wikipedia.tld are two different names which may be held by different registrants. One exception is
versions 7 and later allow IDNs except for labels that mix scripts for different languages. Labels that mix scripts are displayed in Punycode. There are exceptions for locales where ASCII characters are commonly mixed with localized scripts. Internet Explorer 7 was capable of using IDNs, but it
, which did not connect the vertical columns on the letters i, m, n, or u, making them difficult to distinguish when several were in a row. The latter, as well as "rn"/"m"/"rri" ("RN"/"M"/"RRI") confusion, is still possible for the human eye even with modern computer technology.
The following alphabets have characters that can be used for spoofing attacks (please note, these are only the most obvious and common, given artistic license and how much risk the spoofer will take of getting caught; the possibilities are far more numerous than can be listed here):
The simplest defense is for web browsers not to support IDNA or other similar mechanisms, or for users to turn off whatever support their browsers have. That could mean blocking access to IDNA sites, but generally browsers permit access and just display IDNs in
, published a paper titled "The Homograph Attack", which described an attack that used Unicode URLs to spoof a website URL. To prove the feasibility of this kind of attack, the researchers successfully registered a variant of the domain name

), and, for a number of reasons, similar-looking characters such as Greek Ο, Latin O, and Cyrillic О were not assigned the same code. Their incorrect or malicious usage opens the door to security attacks. Thus, for example, a regular user of
may be lured to click on it unquestioningly as an apparently familiar link, unaware that the third letter is not the Latin character "a" but rather the Cyrillic character "а" and is thus an entirely different domain from the intended one.
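
The point that look-alike letters carry different codes can be checked directly. The following is a minimal Python sketch using only the standard library; the helper name `describe` and the sample strings are illustrative assumptions, not part of any cited tool.

```python
import unicodedata

# Letters that render almost identically in most fonts.
lookalikes = ["a", "\u0430", "o", "\u03bf", "\u043e"]

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in lookalikes:
    print(ch, describe(ch))

latin = "wikipedia"
spoofed = "wikipedi\u0430"  # final letter is Cyrillic, not Latin

# The two strings look alike on screen but compare as different identifiers.
print(latin == spoofed)
```

To software the two names are simply different strings, which is why a registry or resolver treats them as unrelated domains.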
imposes restrictions on displaying non-ASCII domain names based on a user-defined list of allowed languages and provides an anti-phishing filter that checks suspicious Web sites against a remote database of known phishing sites.
versions 22 and later display IDNs if either the TLD prevents homograph attacks by restricting which characters can be used in domain names or labels do not mix scripts for different languages. Otherwise IDNs are displayed in
million registered IDNs across 570 top-level domains (TLDs), it successfully identified 8,284 IDN homographs, including many previously unidentified cases targeting brands in languages other than English.
or other Web sites without being detected until the user actually clicks the link. While the fake link will show in Punycode when it is clicked, by this point the page has already begun loading into the browser.
attack) is a method used by malicious parties to deceive computer users about what remote system they are communicating with, by exploiting the fact that many different characters look alike (i.e., they rely on
versions 51 and later use an algorithm similar to the one used by Firefox. Previous versions display an IDN only if all of its characters belong to one (and only one) of the user's preferred languages.
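
The "labels do not mix scripts" rule that browsers apply can be approximated in a few lines. This is only a sketch: Python's standard library does not expose the Unicode script property, so the first word of each character name (for example `LATIN` or `CYRILLIC`) is used here as a stand-in, and the function names are hypothetical. Real browsers use considerably more elaborate rules.

```python
import unicodedata

def scripts_in_label(label: str) -> set[str]:
    """Approximate the set of scripts used by the letters of one label.

    Assumption: the first word of the Unicode character name stands in
    for the script property, which the standard library does not provide.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }

def mixes_scripts(domain: str) -> bool:
    """True if any dot-separated label contains letters from two or more scripts."""
    return any(len(scripts_in_label(label)) > 1 for label in domain.split("."))

print(mixes_scripts("wikipedi\u0430.org"))          # Latin label with one Cyrillic letter
print(mixes_scripts("wikipedia.org"))               # plain Latin
print(mixes_scripts("\u0441\u0430\u0440\u0435.com"))  # label made only of Cyrillic letters
```

A label built entirely from one script passes this check, so a mixed-script test alone does not catch whole-script look-alikes.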
such as passwords or account details, while passing traffic through to the real site. The victims may never notice the difference, until suspicious or criminal activity occurs with their accounts.
has implemented a policy prohibiting any potential internationalized TLD from choosing letters that could resemble an existing Latin TLD and thus be used for homograph attacks. Proposed IDN TLDs
The problem arises from the different treatment of the characters in the user's mind and the computer's programming. From the viewpoint of the user, a Cyrillic "а" within a Latin string
also italicizes the same as its Latin counterpart, making it possible to substitute it for alpha or vice versa. The lunate form of sigma, Ϲϲ, resembles both Latin Cc and Cyrillic Сс.
Cyrillic is, by far, the most commonly used alphabet for homoglyphs, largely because it contains 11 lowercase glyphs that are identical or nearly identical to Latin counterparts.
These methods of defense only extend to within a browser. Homographic URLs that house malicious software can still be distributed, without being displayed as Punycode, through
, but actually led to a spoofed web site with different content. Popular browsers continued to have problems properly displaying international domain names through April 2017.
has been used as an amusement or attention-grabber and "Volapuk encoding", in which Cyrillic script is represented by similar Latin characters, was used in early days of the
) bear strong resemblance to Latin d, h, l and v, these letters are either rare or archaic and are not widely supported in most standard fonts (they are not included in the
that appears almost identical to an existing domain but goes somewhere else. For example, the domain "rnicrosoft.com" begins with "r" and "n", not "m". Other examples are
, the list expands greatly. Greek ΑΒΕΗΙΚΜΝΟΡΤΧΥΖ looks identical to Latin ABEHIKMNOPTXYZ. Greek ΑΓΒΕΗΚΜΟΠΡΤΦΧ looks similar to Cyrillic АГВЕНКМОПРТФХ (as do Cyrillic
can have both Cyrillic (for domestic usage in Cyrillic script countries) and Latin (for international driving) with the same letters. Registration plates that are
a single Chinese-language IDN registration delivers both variants as active domains (which must have the same domain name server and the same registrant).
strongly resembling Latin letters. ค (A), ท (n), น (u), บ (U), ป (J), พ (W), ร (S), and ล (a) are among the Thai glyphs that can closely resemble Latin.
In September 2017, security researcher Ankit Anubhav discovered an IDN homograph attack where the attackers registered adoḅe.com to deliver the Betabot
, was the confusion between "l" (lowercase letter "L") / "1" (the number "one") and "O" (capital letter for vowel "o") / "0" (the number "zero"). Some
characters normally (not as punycode), allowing spoofing attacks. Chrome tightened IDN restrictions in version 59 to prevent this attack.
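
The Punycode form that browsers fall back to can be reproduced with Python's built-in `punycode` codec. The sketch below is an assumption-laden simplification: the helper `to_ace` is hypothetical, and it skips the normalization and validity checks that a full IDNA implementation performs before encoding.

```python
def to_ace(domain: str) -> str:
    """Convert each non-ASCII label to its 'xn--' Punycode form.

    Simplification: no IDNA mapping or validation is applied; each label
    is encoded exactly as given.
    """
    out = []
    for label in domain.split("."):
        if label.isascii():
            out.append(label)  # ASCII labels are left untouched
        else:
            out.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(out)

# Accented i (U+00ED) in place of the plain Latin i.
print(to_ace("w\u00edkipedia.tld"))
# Cyrillic а (U+0430) in place of the final Latin a.
print(to_ace("wikipedi\u0430.org"))
```

Shown in this form, the spoofed names no longer resemble the names they imitate, which is the reason browsers display suspicious IDNs this way.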
In 2011, an unknown source (registering under the name "Completely Anonymous") registered a domain name homographic to television station
(Л) and Greek Λ in certain geometric sans-serif fonts), Greek letters κ and ο look similar to Cyrillic к and о. Besides this Greek τ,
Two names which differ only in an accent on one character may look very similar, particularly when the substitution involves the
Two letters in Armenian (Ձշ) also can resemble the number 2, Յ resembles 3, while another (վ) sometimes resembles the number 4.
Intentional look-alike character substitution with different alphabets has also been known in various contexts. For example,
would type them correctly. Unicode may contribute to this greatly with its combining characters, accents, several types of
(Greece) have been rejected or stalled because of their perceived resemblance to Latin letters. All three (and Serbian

, of which the mixing of Armenian with Latin would appear obviously different if using a font other than Sylfaen or a

; users had to type a lowercase L when the number one was needed. The zero/o confusion gave rise to the tradition of
CHIBA, Daiki; AKIYAMA HASEGAWA, Ayako; KOIDE, Takashi; SAWABE, Yuta; GOTO, Shigeki; AKIYAMA, Mitsuaki (2020-07-01).
that check whether the user is visiting a website which is a homograph of another domain from a user-defined list.
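
A check against a user-defined list of protected domains could work roughly as follows. This is a hedged sketch, not the method of any particular extension: the `CONFUSABLES` table is a tiny illustrative subset (seven Cyrillic lowercase letters), and the names `skeleton` and `spoofed_target` are hypothetical.

```python
from typing import Iterable, Optional

# Illustrative subset only: Cyrillic lowercase letters and their Latin look-alikes.
CONFUSABLES = {
    "\u0430": "a", "\u0441": "c", "\u0435": "e", "\u043e": "o",
    "\u0440": "p", "\u0445": "x", "\u0443": "y",
}

def skeleton(domain: str) -> str:
    """Map every known look-alike to its Latin counterpart."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in domain.lower())

def spoofed_target(domain: str, protected: Iterable[str]) -> Optional[str]:
    """Return the protected domain that `domain` imitates, or None."""
    candidate = skeleton(domain)
    for real in protected:
        # A homograph has the same skeleton but is not the same string.
        if domain.lower() != real.lower() and candidate == skeleton(real):
            return real
    return None

watchlist = ["paypal.com", "wikipedia.org"]
print(spoofed_target("p\u0430ypal.com", watchlist))  # first "a" is Cyrillic
print(spoofed_target("paypal.com", watchlist))       # the genuine name
```

A practical tool would need a far larger mapping covering the other scripts discussed in this article.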
, used in Windows 7, supports Armenian (previous versions did not). Furthermore, this font differentiates Latin
in the modern era, has increasingly adopted a simplified style in which Thai characters are represented with
An example of an IDN homograph attack; the Latin letters "e" and "a" are replaced with the Cyrillic letters "
nk". There are cases in which a registration can be both typosquatting and homograph spoofing; the pairs of
treats them as equivalent), as can Greek end-of-word-variant sigma ς for ç; accented Greek substitutes
This list increases if close matches are also allowed (such as Greek εικηρτυωχγ for eiknptuwxy). Using
version of the domain prevents another registrant from claiming an accented version of the same name.
. Web browsers supporting IDNA appeared to direct the URL http://www.pаypal.com/, in which the first
The IDN homographs database is a Python library that allows developers to defend against this using
is written from right to left and trying to mix it with left-to-right glyphs may cause problems.
ν appear identical to a Latin alphabet letter in the lowercase used for URLs. Fonts that are in
computer systems, different logical characters may have identical appearances. For example,
as a way to overcome the lack of support for the Cyrillic alphabet. Another example is that
Suzuki, Hiroaki; Chiba, Daiki; Yoneya, Yoshiro; Mori, Tatsuya; Goto, Shigeki (2019-10-21).
provided rich opportunities for confusion. A notable example is the etymology of the word "
Top: Thai glyphs rendered in a modern font (IBM Plex) in which they resemble Latin glyphs.
has historically had a distinct look with numerous loops and small flourishes, modern
If an IDN itself is being spoofed, Greek beta β can be a substitute for German eszett
"A Comprehensive Survey of Recent Internet Measurement Techniques for Cyber Security"

".CA takes on a French accent | Canadian Internet Registration Authority (CIRA)"
prohibits any domain with these characters from being registered, regardless of TLD.
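
Characters that are not displayed by default, such as the zero-width space, can be flagged before a name is shown or registered. The sketch below assumes that the Unicode general category `Cf` (format characters) is an adequate proxy for "invisible"; that is a simplification, and the function name is hypothetical.

```python
import unicodedata

def invisible_chars(name: str) -> list[str]:
    """List code points in `name` whose general category is Cf (format)."""
    return [
        f"U+{ord(ch):04X}"
        for ch in name
        if unicodedata.category(ch) == "Cf"
    ]

print(invisible_chars("exam\u200bple.com"))  # contains a zero-width space
print(invisible_chars("example.com"))        # nothing hidden
```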
have optical counterparts in the basic Latin alphabet and look close or identical to
ASCII has several characters or pairs of characters that look alike and are known as
Safaei Pour, Morteza; Nader, Christelle; Friday, Kurt; Bou-Harb, Elias (May 2023).
can be similar to Cyrillic т, ф in some fonts, Greek δ resembles Cyrillic б in the
"DomainScouter: Analyzing the Risks of Deceptive Internationalized Domain Names"
Chrome and Firefox Phishing Attack Uses Domains Identical to Known Safe Sites
domain, registering one variant renders the other unavailable to anyone; in
and other varieties of fraud. An attacker could register a domain name that
support, especially with smaller font sizes and the wide variety of fonts.
, forbidding a mix with Latin or Greek characters. However, the problem in
includes many characters which are not displayed by default, such as the
small letter a, ("a") which is the lowercase "a" used in English. Hence
In a typical example of a hypothetical attack, someone could register a
but these are usually only accurate enough to use for the purposes of
small letter a ("а"), can look identical to Unicode character U+0061,
, in addition to the capitals for the lowercase Cyrillic homoglyphs.
that has the same shape but different meaning from its counterparts.
in many fonts, with the last of these (alpha) again only resembling
can be problematic for homographs as many characters exist as both
was a target of a phishing scam exploiting this, using the domain
in some fonts. Using a mix of uppercase and lowercase characters,
does not attempt to unify these glyphs and instead separates most
"ShamFinder: An Automated Framework for Detecting IDN Homographs"

. Either way, this amounts to abandoning non-ASCII domain names.
reported that this exploit was disclosed by 3ric Johanson at the
An early nuisance of this kind, pre-dating the Internet and even
Other Unicode scripts in which homographs can be found include
are all both close together on keyboards and, depending on the
Emoji to Zero-Day: Latin Homoglyphs in Domains and Subdomains
1532: 786:
can also be used if an IDN itself is being spoofed, to fake
; however, in most mainstream fonts, д instead resembles a

, may be difficult or impossible to distinguish visually.
can be used, since its italic form resembles a lowercase
"Upcoming update with IDN homograph phishing fix - Blog"
Bottom: The same glyphs rendered with traditional loops.
The registration of homographic domain names is akin to
"Internationalized Domain Names (IDN) in Google Chrome"
Boise TV news website targeted with Justin Bieber prank
"Changes to IDN in IE7 to now allow mixing of scripts"
Old Microsoft Edge converts all Unicode into Punycode.
, Communications of the ACM, 45(2):128, February 2002
only accepts Cyrillic names for the top-level domain
approach is to render problematic character sets as
2249: 2223: 2142: 1449: 1105:; the tittle (dot) on the i can be replaced with a 1077:(certain abbreviations), Latin (certain digraphs), 114: 102: 1995:Proceedings of the Internet Measurement Conference 1238:. This can be changed by altering the settings in 152:are common in the three major European alphabets: 1953:IDN ccTLD Fast Track String Evaluation Completion 1657:"IDN Homograph Attack Spreading Betabot Backdoor" 1777:"About Safari International Domain Name support" 1160:. The sole purpose of the site was to spread an 1168:issuing a supposed ban on the sale of music by 242:that have homoglyphs in the Latin alphabet, as 773:bear some resemblance to each other. Cyrillic 246:regulations require the use of Latin letters. 89:This kind of spoofing attack is also known as 2231:Uniform Domain-Name Dispute-Resolution Policy 2115: 2032:IEICE Transactions on Information and Systems 709:Cyrillic non-Russian problematic letters are 459:, to the site of the well known payment site 8: 1997:. New York, NY, USA: ACM. pp. 449–462. 393:This opens a rich vein of opportunities for 343:Homographs in internationalized domain names 979:and not for substitution. Furthermore, the 390:increased the scope for homograph attacks. 2236:Anticybersquatting Consumer Protection Act 2122: 2108: 2100: 2080: 2039: 1473:Evgeniy Gabrilovich and Alex Gontmakher, 311:. In certain narrow-spaced fonts such as 268:based on these similarities are known as 1287:Server-side/registry operator mitigation 838:). Attempting to use them could cause a 428:which incorporated Cyrillic characters. 27:Visually similar letters in domain names 1463: 1395: 432:but this advice was neglected by major 70:to deceive visitors). 
For example, the 1505:: CS1 maint: archived copy as title ( 1498: 1263:like No Homo-Graphs are available for 1469: 1467: 7: 455:character is replaced by a Cyrillic 238:are limited to using letters of the 1533:IDN hacking disclosure by shmoo.com 371:; the Cyrillic version) instead of 315:(the default in the address bar in 1523:, Technical Report #36, 2010-04-28 25: 1617:Fake website URL not from KBOI-TV 1581:https://www.hkdnr.hk/idn_conv.jsp 1521:"Unicode Security Considerations" 1083:Mathematical Alphanumeric Symbols 1626:. KBOI-TV. Retrieved 2011-04-01. 620:If capital letters are counted, 335:will produce homoglyphs such as 200:, etc., often due to inadequate 1849:"Firefox 2 Phishing Protection" 1667:from the original on 2023-10-17 1374:Duplicate characters in Unicode 1075:Enclosed CJK Letters and Months 1053:(.香港) also adopts this policy. 1013:in 1973 and continuing through 97:incorporates numerous scripts ( 2293:Internationalized domain names 1655:Mimoso, Michael (2017-09-06). 1326:Coordination Center for TLD RU 1324:The Russian registry operator 1295:-based character recognition. 588:in standard type), resembling 387:Internationalized domain names 1: 1901:Phishing with Unicode Domains 1645:. KTVB. Retrieved 2011-04-01. 1364:Internationalized domain name 1087:Alphabetic Presentation Forms 1039:simplified Chinese characters 184:in the pre-computer era even 82:alphabets each have a letter 48:internationalized domain name 2041:10.1587/transinf.2019icp0002 1872:. Opera Software. 2006-12-18 1822:Sharif, Tariq (2005-09-09). 1796:Sharif, Tariq (2006-07-31). 1228:also use the same algorithm. 1183:Defending against the attack 1121:, where reserving the plain- 908:in some fonts (and in fact, 1579:converters online, such as 565:generates more homoglyphs: 232:vehicle registration plates 2339: 2082:10.1016/j.cose.2023.103123 1425:GREEK SMALL LETTER OMICRON 1359:Security issues in Unicode 1344:Research based mitigations 1129:Non-displayable characters 955:.) 
The current version of 270:homograph spoofing attacks 186:combined the L and the one 2210:Domain name front running 2034:. E103.D (7): 1493–1511. 1934:"IDN Homographs Database" 1732:chromium.googlesource.com 866:will feature Greek alpha 18:Homograph spoofing attack 2262:"Catchall" typosquatting 2150:Reverse domain hijacking 2069:Computers & Security 1870:"Opera Fraud Protection" 1824:"Phishing Filter in IE7" 918:can usually be used for 2185:Domain name warehousing 2165:Domain name speculation 2003:10.1145/3355369.3355587 1686:"IDN Display Algorithm" 1412:CYRILLIC SMALL LETTER O 1218:Chromium-based browsers 1148:Known homograph attacks 1709:. Bugzilla.mozilla.org 1188:Client-side mitigation 998: 951:. (This is known as a 549:resemble the numerals 283:which looks much like 173: 43: 2318:Web security exploits 2160:Domain name drop list 1091:typographic ligatures 1037:(regular script) and 994: 870:looking like a Latin 439:On February 7, 2005, 375:(the Latin version). 148: 33: 2298:Nonstandard spelling 2205:IDN homograph attack 1942:. 25 September 2021. 1438:LATIN SMALL LETTER O 611:partial differential 383:entity breaks down. 369:xn--wikipedi-86g.org 2267:Wildcard DNS record 2241:PROTECT Act of 2003 2190:Doppelganger domain 1915:. em_te. 2018-06-28 1379:Unicode equivalence 1164:joke regarding the 1097:Accented characters 895:, and the Cyrillic 407:Evgeniy Gabrilovich 250:Homographs in ASCII 1976:2020-12-09 at the 1958:2014-10-17 at the 1641:2012-03-15 at the 1622:2011-04-05 at the 1575:There are various 1538:2005-03-20 at the 1261:Browser extensions 999: 953:ransom note effect 840:ransom note effect 355:character U+0430, 299:) looks much like 174: 58:(often written as 44: 2280: 2279: 2012:978-1-4503-6948-0 1280:social networking 1246:Internet Explorer 1224:(since 2019) and 1166:Governor of Idaho 1158:fake news website 1071:CJK Compatibility 1009:, beginning with 963:from Armenian ց. 
940:Armenian alphabet 405:In December 2001 194:computer operator 84:⟨o⟩ 16:(Redirected from 2330: 2124: 2117: 2110: 2101: 2095: 2094: 2084: 2060: 2054: 2053: 2043: 2023: 2017: 2016: 1986: 1980: 1968: 1962: 1950: 1944: 1943: 1930: 1924: 1923: 1921: 1920: 1913:"No Homo-Graphs" 1909: 1903: 1898: 1892: 1887: 1881: 1880: 1878: 1877: 1866: 1860: 1859: 1857: 1856: 1845: 1839: 1838: 1836: 1835: 1819: 1813: 1812: 1810: 1809: 1793: 1787: 1786: 1784: 1783: 1773: 1767: 1766: 1764: 1763: 1748: 1742: 1741: 1739: 1738: 1724: 1718: 1717: 1715: 1714: 1703: 1697: 1696: 1694: 1693: 1682: 1676: 1675: 1673: 1672: 1652: 1646: 1633: 1627: 1614: 1608: 1607: 1605: 1604: 1595:. Archived from 1589: 1583: 1573: 1567: 1566: 1564: 1563: 1548: 1542: 1530: 1524: 1518: 1512: 1510: 1504: 1496: 1494: 1493: 1487: 1481:. Archived from 1480: 1471: 1452: 1451: 1446: 1440: 1439: 1436: 1433: 1431: 1426: 1423: 1420: 1418: 1413: 1410: 1407: 1405: 1400: 1293:machine learning 1242:'s system files. 1162:April Fool's Day 1138:zero-width space 1079:Currency Symbols 1031:Chinese language 977:foreign branding 962: 949:Unicode typeface 930:in italic type. 929: 922: 916: 899: 893:Serbian alphabet 890: 886: 874: 858:ο and sometimes 570: 480:Cyrillic letters 374: 370: 366: 338: 334: 330: 326: 322: 303:in some fonts. 266:Spoofing attacks 236:issued in Greece 132: 128: 124: 104: 85: 56:homoglyph attack 21: 2338: 2337: 2333: 2332: 2331: 2329: 2328: 2327: 2283: 2282: 2281: 2276: 2245: 2219: 2195:Type-in traffic 2138: 2128: 2098: 2062: 2061: 2057: 2025: 2024: 2020: 2013: 1988: 1987: 1983: 1978:Wayback Machine 1969: 1965: 1960:Wayback Machine 1951: 1947: 1932: 1931: 1927: 1918: 1916: 1911: 1910: 1906: 1899: 1895: 1888: 1884: 1875: 1873: 1868: 1867: 1863: 1854: 1852: 1851:. Mozilla. 
2006 1847: 1846: 1842: 1833: 1831: 1821: 1820: 1816: 1807: 1805: 1795: 1794: 1790: 1781: 1779: 1775: 1774: 1770: 1761: 1759: 1750: 1749: 1745: 1736: 1734: 1726: 1725: 1721: 1712: 1710: 1705: 1704: 1700: 1691: 1689: 1684: 1683: 1679: 1670: 1668: 1654: 1653: 1649: 1643:Wayback Machine 1634: 1630: 1624:Wayback Machine 1615: 1611: 1602: 1600: 1591: 1590: 1586: 1574: 1570: 1561: 1559: 1550: 1549: 1545: 1540:Wayback Machine 1531: 1527: 1519: 1515: 1497: 1491: 1489: 1485: 1478: 1476:"Archived copy" 1474: 1472: 1465: 1461: 1456: 1455: 1447: 1443: 1437: 1434: 1429: 1428: 1424: 1421: 1416: 1415: 1411: 1408: 1403: 1402: 1401: 1397: 1392: 1355: 1346: 1289: 1202:Mozilla Firefox 1190: 1185: 1156:'s to create a 1150: 1131: 1103:dotted letter i 1099: 1059: 1027: 1007:Thai typography 996: 989: 981:Hebrew alphabet 972: 960: 936: 925: 920: 914: 897: 888: 884: 881:capital letters 872: 848: 663:can substitute 601:(in some fonts 566: 473: 411:Alex Gontmakher 372: 368: 364: 345: 336: 332: 328: 324: 320: 252: 143: 130: 126: 122: 99:writing systems 91:script spoofing 83: 28: 23: 22: 15: 12: 11: 5: 2336: 2334: 2326: 2325: 2320: 2315: 2310: 2305: 2300: 2295: 2285: 2284: 2278: 2277: 2275: 2274: 2269: 2264: 2259: 2253: 2251: 2247: 2246: 2244: 2243: 2238: 2233: 2227: 2225: 2221: 2220: 2218: 2217: 2215:Drop registrar 2212: 2207: 2202: 2197: 2192: 2187: 2182: 2180:Domain tasting 2177: 2175:Domain parking 2172: 2170:Domain sniping 2167: 2162: 2157: 2155:Cybersquatting 2152: 2146: 2144: 2140: 2139: 2129: 2127: 2126: 2119: 2112: 2104: 2097: 2096: 2055: 2018: 2011: 1981: 1963: 1945: 1925: 1904: 1893: 1882: 1861: 1840: 1814: 1788: 1768: 1756:Opera Security 1743: 1719: 1698: 1677: 1647: 1628: 1609: 1584: 1568: 1543: 1525: 1513: 1462: 1460: 1457: 1454: 1453: 1441: 1394: 1393: 1391: 1388: 1387: 1386: 1381: 1376: 1371: 1366: 1361: 1354: 1351: 1345: 1342: 1340:remains open. 
An example of an IDN homograph attack; the Latin letters "e" and "a" are replaced with the Cyrillic letters "е" and "а".
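
The substitution described in the caption above can be sketched in a few lines of Python. This is a rough illustration, not a production detector. The helper names `scripts_in` and `looks_mixed` are hypothetical, and taking the first word of each character's Unicode name is only a heuristic stand-in for the real Unicode Script property. The `idna` codec used at the end is the one shipped with the Python standard library.

```python
import unicodedata


def scripts_in(label):
    """Return the script-like prefixes of the letters in one domain label.

    Heuristic (an assumption of this sketch): Unicode character names begin
    with the script, e.g. "LATIN SMALL LETTER A" or "CYRILLIC SMALL LETTER A".
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


def looks_mixed(domain):
    """True if any dot-separated label mixes letters from more than one script."""
    return any(len(scripts_in(label)) > 1 for label in domain.split("."))


latin = "example.com"
# Same visual shape, but with Cyrillic а (U+0430) and е (U+0435) swapped in.
spoof = "ex\u0430mpl\u0435.com"

print(latin == spoof)          # the two strings are different code point sequences
print(looks_mixed(latin))      # all-Latin labels
print(looks_mixed(spoof))      # Latin and Cyrillic letters in one label
print(spoof.encode("idna"))    # ASCII-compatible (Punycode) form of the spoof
```

Comparing the raw strings or their ASCII-compatible forms exposes the difference that the rendered glyphs hide, which is why browsers may fall back to showing the `xn--` form for mixed-script labels.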

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.