Knowledge (XXG)

Regular expression

4296:. Like old typewriters, plain base characters (white spaces, punctuation characters, symbols, digits, or letters) can be followed by one or more non-spacing symbols (usually diacritics, like accent marks modifying letters) to form a single printable character; but Unicode also provides a limited set of precomposed characters, i.e. characters that already include one or more combining characters. A sequence of a base character + combining characters should be matched with the identical single precomposed character (only some of these combining sequences can be precomposed into a single Unicode character, but infinitely many other combining sequences are possible in Unicode, and needed for various languages, using one or more combining characters after an initial base character; these combining sequences 4503: 494:, 'b' is a literal character that matches just 'b', while '.' is a metacharacter that matches every character except a newline. Therefore, this regex matches, for example, 'b%', or 'bx', or 'b5'. Together, metacharacters and literal characters can be used to identify text of a given pattern or process a number of instances of it. Pattern matches may vary from a precise equality to a very general similarity, as controlled by the metacharacters. For example, 608: 4002:). The explicit approach is called the DFA algorithm and the implicit approach the NFA algorithm. Adding caching to the NFA algorithm is often called the "lazy DFA" algorithm, or just the DFA algorithm without making a distinction. These algorithms are fast, but using them for recalling grouped subexpressions, lazy quantification, and similar features is tricky. Modern implementations include the re1- 3913:, used since at least 1970, as well as some more sophisticated extensions like lookaround that appeared in 1994. Lookarounds define the surrounding of a match and do not spill into the match itself, a feature only relevant for the use case of string searching . 
Some of them can be simulated in a regular language by treating the surroundings as a part of the language as well. 9072: 186: 3810:"Regular expressions" are only marginally related to real regular expressions. Nevertheless, the term has grown with the capabilities of our pattern matching engines, so I'm not going to try to fight linguistic necessity here. I will, however, generally call them "regexes" (or "regexen", when I'm in an Anglo-Saxon mood). 4572:, processing them across the entire database could consume excessive computer resources depending on the complexity and design of the regex. Although in many cases system administrators can run regex-based queries internally, most search engines do not offer regex support to the public. Notable exceptions include 4045:
kind of backtracking. Some implementations try to provide the best of both algorithms by first running a fast DFA algorithm, and revert to a potentially slower backtracking algorithm only when a backreference is encountered during the match. GNU grep (and the underlying gnulib DFA) uses such a strategy.
4250:, do not allow character ranges to cross Unicode blocks. A range like is valid since both endpoints fall within the Basic Latin block, as is since both endpoints fall within the Armenian block, but a range like is invalid since it includes multiple Unicode blocks. Other engines, such as that of the 3584:
support multiple regex flavors. Perl-derivative regex implementations are not identical and usually implement a subset of features found in Perl 5.0, released in 1994. Perl sometimes does incorporate features initially found in other languages. For example, Perl 5.10 implements syntactic extensions
1727:
or not); a logical OR character, which offers a set of alternatives, and a logical NOT character, which negates an atom's existence; and backreferences to refer to previous atoms of a completing pattern of atoms. A match is made, not when all the atoms of the string are matched, but rather when all
4044:
Although backtracking implementations only give an exponential guarantee in the worst case, they provide much greater flexibility and expressive power. For example, any implementation which allows the use of backreferences, or implements the various extensions introduced by Perl, must include some
1921:
in BRE. Furthermore, as long as the POSIX standard syntax for regexes is adhered to, there can be, and often is, additional syntax to serve specific (yet POSIX compliant) applications. Although POSIX.2 leaves some implementation specifics undefined, BRE and ERE provide a "standard" which has since
1843:, where it forms part of the syntax distinct from normal string literals. In some cases, such as sed and Perl, alternative delimiters can be used to avoid collision with contents, and to avoid having to escape occurrences of the delimiter character in the contents. For example, in sed the command 2315:
According to Russ Cox, the POSIX specification requires ambiguous subexpressions to be handled in a way different from Perl's. The committee replaced Perl's rules with one that is simple to explain, but the new "simple" rules are actually more complex to implement: they were incompatible with
3727:
IETF RFC 9485 describes "I-Regexp: An Interoperable Regular Expression Format". It specifies a limited subset of regular-expression idioms designed to be interoperable, i.e. produce the same effect, in a large number of regular-expression libraries. I-Regexp is also limited to matching, i.e.
4060:
A few theoretical alternatives to backtracking for backreferences exist, and their "exponents" are tamer in that they are only related to the number of backreferences, a fixed property of some regexp languages such as POSIX. One naive method that duplicates a non-backtracking NFA for each
4603:
Because regexes can be difficult to both explain and understand without examples, interactive websites for testing regexes are a useful resource for learning regexes by experimentation. This section provides a basic description of some of the properties of regexes by way of illustration.
486:, is often used to mean the specific, standard textual syntax for representing patterns for matching text, as distinct from the mathematical notation described below. Each character in a regular expression (that is, each character in the string describing its pattern) is either a 4300:
include a base character or combining characters partially precomposed, but not necessarily in canonical order and not necessarily using the canonical precompositions). The process of standardizing sequences of a base character + combining characters by decomposing these
2040:, and exactly which characters are considered newlines is flavor-, character-encoding-, and platform-specific, but it is safe to assume that the line feed character is included). Within POSIX bracket expressions, the dot character matches a literal dot. For example, 505:
is a precise pattern (matches just 'b'). The metacharacter syntax is designed specifically to represent prescribed targets in a concise and flexible way to direct the automation of text processing of a variety of input data, in a form easy to type using a standard
1042:
Regular expressions consist of constants, which denote sets of strings, and operator symbols, which denote operations over these sets. The following definition is standard, and found as such in most textbooks on formal language theory. Given a finite
4246:). The natural extension of such character ranges to Unicode would simply change the requirement that the endpoints lie in to the requirement that they lie in . However, in practice this is often not the case. Some implementations, such as that of 4167:
space for a haystack of length n and k backreferences in the RegExp. A very recent theoretical work based on memory automata gives a tighter bound based on "active" variable nodes used, and a polynomial possibility for some backreferenced regexps.
1526: 3993:
An alternative approach is to simulate the NFA directly, essentially building each DFA state on demand and then discarding it at the next step. This keeps the DFA implicit and avoids the exponential construction cost, but running cost rises to
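The on-demand simulation described here can be sketched in a few lines. This is a minimal illustration, not any engine's actual implementation: the NFA below is hand-built (an assumption for the example) and accepts strings over {a, b} whose second-to-last symbol is "a". The function name `nfa_accepts` is hypothetical.

```python
# Hand-built NFA (illustrative assumption): accepts strings over {a, b}
# whose second-to-last symbol is "a".
DELTA = {
    (0, "a"): {0, 1},  # stay in 0, or guess that this "a" is second-to-last
    (0, "b"): {0},
    (1, "a"): {2},
    (1, "b"): {2},
}
START = {0}
ACCEPTING = {2}


def nfa_accepts(text):
    """Simulate the NFA by tracking the set of currently active states.

    Each set plays the role of one DFA state; it is built when needed
    and dropped as soon as the next symbol has been consumed.
    """
    current = set(START)
    for symbol in text:
        current = {t for s in current for t in DELTA.get((s, symbol), ())}
        if not current:  # no active state left, so no match is possible
            return False
    return bool(current & ACCEPTING)


for sample in ["ab", "aa", "ba", "bab", ""]:
    print(repr(sample), nfa_accepts(sample))
```

Only one set of states is kept at any time, so memory stays proportional to the size of the NFA rather than to the number of possible subsets.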
2316:
pre-existing tooling and made it essentially impossible to define a "lazy match" (see below) extension. As a result, very few programs actually implement the POSIX subexpression rules (even when they implement other parts of the POSIX syntax).
4260:. Some case-insensitivity flags affect only the ASCII characters. Other flags affect all characters. Some engines have two different flags, one for ASCII, the other for Unicode. Exactly which characters belong to the POSIX classes also varies. 397:
element group syntax. Prior to the use of regular expressions, many search languages allowed simple wildcards, for example "*" to match any sequence of characters, and "?" to match a single character. Relics of this can be found today in the
36: 4552:
software has the ability to use regexes to automatically apply text styling, saving the person doing the layout from laboriously doing this by hand for anything that can be matched by a regex. For example, by defining a
1262:. In principle, the complement operator is redundant, because it does not grant any more expressive power. However, it can make a regular expression much more concise—eliminating a single complement operator can cause a 1671:
over finite words. This is a surprisingly difficult problem. As simple as the regular expressions are, there is no method to systematically rewrite them to some normal form. The lack of axiom in the past led to the
4052:
and related DFA optimization techniques such as the reverse scan. GNU grep, which supports a wide variety of POSIX syntaxes and extensions, uses BM for a first-pass prefiltering, and then uses an implicit DFA. Wu
1571:
Finally, it is worth noting that many real-world "regular expression" engines implement features that cannot be described by the regular expressions in the sense of formal language theory; rather, they implement
1728:
the pattern atoms in the regex have matched. The idea is to make a small pattern of characters stand for a large number of possible strings, rather than compiling a large list of all the literal possibilities.
239:
Regular expressions came into popular use from 1968 in two applications: pattern matching in a text editor and lexical analysis in a compiler. Among the first appearances of regular expressions in program form was when
3471:) since in many programming languages the characters that can begin an identifier are not the same as those that can occur in other positions: numbers are generally excluded, so an identifier would look like 7764: 2176:
is a digit from 1 to 9. This construct is defined in the POSIX standard. Some tools allow referencing more than nine capturing groups. Also known as a back-reference, this feature is supported in BRE mode.
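As a rough illustration of a back-reference, the sketch below uses Python's `re` module (an assumption for the example; the text describes POSIX tools). The pattern `(.+)\1` captures some text in group 1 and then requires the same text to appear again immediately. The helper name `is_doubled` is hypothetical.

```python
import re

# Group 1 captures one or more characters; \1 must repeat exactly that text.
DOUBLED = re.compile(r"(.+)\1")


def is_doubled(word):
    """Return True if the whole word is some string written twice."""
    return DOUBLED.fullmatch(word) is not None


for word in ["papa", "WikiWiki", "regex"]:
    print(word, is_doubled(word))
```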
1150:
under string concatenation. This is the set of all strings that can be made by concatenating any finite number (including zero) of strings from the set described by R. For example, if R denotes {"0", "1"},
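The definition can be explored with a bounded enumeration. The sketch below is only an approximation of the Kleene star, since the full set is infinite: it lists every string obtainable by concatenating at most `max_parts` members of the set. The function name `kleene_star_up_to` is a hypothetical helper.

```python
from itertools import product


def kleene_star_up_to(strings, max_parts):
    """Return all concatenations of 0..max_parts strings taken from `strings`."""
    result = set()
    for count in range(max_parts + 1):
        # repeat=0 yields one empty tuple, which contributes the empty string
        for parts in product(sorted(strings), repeat=count):
            result.add("".join(parts))
    return result


print(sorted(kleene_star_up_to({"0", "1"}, 2), key=lambda s: (len(s), s)))
```

For the set {"0", "1"}, raising the bound yields ever longer binary strings, with the empty string always included.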
6537:), but many can. For example, the set of examples {1, 10, 100}, and negative set (of counterexamples) {11, 1001, 101, 0} can be used to induce the regular expression 1⋅0* (1 followed by zero or more 0s). 1099:
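A candidate expression can be checked against both example sets mechanically. The sketch below uses Python's `re` module, writing the induced expression as `10*`. It only verifies consistency with the examples; it does not perform the induction itself. The helper name `consistent` is hypothetical.

```python
import re

POSITIVES = ["1", "10", "100"]
NEGATIVES = ["11", "1001", "101", "0"]


def consistent(pattern, positives, negatives):
    """True if the pattern accepts every positive and rejects every negative."""
    regex = re.compile(pattern)
    accepts_all = all(regex.fullmatch(s) for s in positives)
    rejects_all = not any(regex.fullmatch(s) for s in negatives)
    return accepts_all and rejects_all


print(consistent(r"10*", POSITIVES, NEGATIVES))
```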
denotes the set of strings that can be obtained by concatenating a string accepted by R and a string accepted by S (in that order). For example, let R denote {"ab", "c"} and S denote {"d", "ef"}. Then,
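The concatenation of the two example sets can be computed directly. This is a minimal sketch that models each language as a finite Python set; the function name `concatenate` is a hypothetical helper.

```python
def concatenate(r, s):
    """Every string formed by a member of r followed by a member of s."""
    return {x + y for x in r for y in s}


R = {"ab", "c"}
S = {"d", "ef"}
print(sorted(concatenate(R, S)))
```

The order of the operands matters: `concatenate(S, R)` generally gives a different set.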
4325:. Block properties are much less useful than script properties, because a block can have code points from several different scripts, and a script can have code points from several different blocks. In 1708:. An atom is a single point within the regex pattern which it tries to match to the target string. The simplest atom is a literal, but grouping parts of the pattern to match an atom will require using 1392: 1167:
To avoid parentheses, it is assumed that the Kleene star has the highest priority followed by concatenation, then alternation. If there is no ambiguity, then parentheses may be omitted. For example,
6533:
in that regular language, it is possible to induce a grammar for the language, i.e., a regular expression that generates that language. Not all regular languages can be induced in this way (see
4037:
that contain both alternation and unbounded quantification and force the algorithm to consider an exponentially increasing number of sub-cases. This behavior can cause a security problem called
4013:. This algorithm is commonly called NFA, but this terminology can be confusing. Its running time can be exponential, which simple implementations exhibit when matching against expressions like 4057:, which implements approximate matching, combines the prefiltering into the DFA in BDM (backward DAWG matching). NR-grep's BNDM extends the BDM technique with Shift-Or bit-level parallelism. 1779:
When a regex is entered in a programming language, it may be represented as an ordinary string literal and is therefore usually quoted; this is common in C, Java, and Python, for instance, where the regex
1557:
requires computing the modulus of the integer base 11, and can be easily implemented with an 11-state DFA. However, converting it to a regular expression results in a 2.14-megabyte file.
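To show why 11 states suffice for remainder tracking, the sketch below builds the transition table of a DFA whose state is the remainder, modulo 11, of the decimal number read so far. It illustrates the state-counting idea only and is not the full validity check mentioned above, whose details are not given here. The names `TRANSITIONS` and `remainder_mod_11` are hypothetical.

```python
# TRANSITIONS[state][digit] is the next state: reading one more decimal
# digit multiplies the number so far by 10 and adds the digit.
TRANSITIONS = [
    [(state * 10 + digit) % 11 for digit in range(10)]
    for state in range(11)
]


def remainder_mod_11(digits):
    """Run the 11-state DFA over a string of decimal digits."""
    state = 0
    for ch in digits:
        state = TRANSITIONS[state][int(ch)]
    return state


print(remainder_mod_11("121"))  # 121 = 11 * 11
```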
In theoretical terms, any token set can be matched by regular expressions as long as it is pre-defined. In terms of historical implementations, regexes were originally written to use
4163: 4110: 1553:
In the opposite direction, there are many languages easily described by a DFA that are not easily described by a regular expression. For instance, determining the validity of a given
1542:(NFAs) that does not lead to such a blowup in size; for this reason NFAs are often used as alternative representations of regular languages. NFAs are a simple variation of the type-3 4632:
programming language, release 5.8.8, January 31, 2006. This means that other implementations may lack support for some parts of the syntax shown here (e.g. basic vs. extended regex,
4217:, that is, the characters which can be encoded with only 16 bits. Currently (as of 2016) only a few regex engines (e.g., Perl's and Java's) can handle the full 21-bit Unicode range. 1003:
These constructions can be combined to form arbitrarily complex expressions, much like one can construct arithmetical expressions from numbers and the operations +, −, ×, and ÷.
3728:
providing a true or false match between a regular expression and a given piece of text. Thus, it lacks advanced features such as capture groups, lookahead, and backreferences.
2493:), the computer's locale settings determine the contents by the numeric ordering of the character encoding. They could store digits in that sequence, or the ordering could be 1410: 374:, which are used to define Raku grammar as well as provide a tool to programmers in the language. These rules maintain existing features of Perl 5.x regexes, but also allow 709:
or members. However, there are often more concise ways: for example, the set containing the three strings "Handel", "Händel", and "Haendel" can be specified by the pattern
4177: 1864: 1273:. There is, however, a significant difference in compactness. Some classes of regular languages can only be described by deterministic finite automata whose size grows 385:
The use of regexes in structured information standards for document and database modeling started in the 1960s and expanded in the 1980s when industry standards like
3968:(DFA). The DFA can be constructed explicitly and then run on the resulting input string one symbol at a time. Constructing the DFA for a regular expression of size 2459:
The character class is the most basic regex concept after a literal match. It makes one small sequence of characters match a larger set of characters. For example,
1213:
denotes the set of binary numbers that are multiples of 3: { ε, "0", "00", "11", "000", "011", "110", "0000", "0011", "0110", "1001", "1100", "1111", "00000", ... }
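The membership claim can be tested by brute force. The expression itself is not shown on this line, so the sketch assumes the commonly cited form `(0|(1(01*0)*1))*`. It compares that pattern against ordinary arithmetic for all binary strings up to a chosen length. The function names are hypothetical.

```python
import re
from itertools import product

# Assumed expression for "binary multiples of 3" (not taken from this line).
MULTIPLE_OF_3 = re.compile(r"(0|(1(01*0)*1))*")


def matches(bits):
    return MULTIPLE_OF_3.fullmatch(bits) is not None


def agrees_with_arithmetic(max_len):
    """Check the pattern against int(bits, 2) % 3 for every short bit string."""
    for length in range(max_len + 1):
        for digits in product("01", repeat=length):
            bits = "".join(digits)
            # The empty string is in the set by convention.
            expected = bits == "" or int(bits, 2) % 3 == 0
            if matches(bits) != expected:
                return False
    return True


print(agrees_with_arithmetic(8))
```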
7843:
If the scanner detects a transition on backref, it returns a kind of "semi-success" indicating that the match will have to be verified with a backtracking matcher.
1956:
are treated as metacharacters unless escaped; other metacharacters are known to be literal or symbolic based on context alone. Additional functionality includes
323:(which has its own, incompatible syntax and behavior). Regexes were subsequently adopted by a wide range of programs, with these early forms standardized in the 2129:
Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
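Both behaviours of the end anchor can be observed in Python's `re` module, used here as one example flavor (other engines may differ). By default `$` matches at the end of the string or just before a final newline; with `re.MULTILINE` it also matches at the end of every line. The helper name `end_positions` is hypothetical.

```python
import re


def end_positions(text, flags=0):
    """Start offsets of every occurrence of "end" that is followed by $."""
    return [m.start() for m in re.finditer(r"end$", text, flags)]


two_lines = "first end\nsecond end\n"
print(end_positions(two_lines))                # string-level anchor only
print(end_positions(two_lines, re.MULTILINE))  # one hit per line
```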
1948:
Perl regexes have become a de facto standard, having a rich and powerful set of atomic expressions. Perl has no "basic" or "extended" levels. As in POSIX EREs,
1660:
between regular-expression terms with variables holds if, and only if, its instantiation with different variables replaced by different symbol constants holds.
meaning "Global search for Regular Expression and Print matching lines"). Around the same time that Thompson developed QED, a group of researchers including
7827: 3655:
by appending a plus sign, which disables backing off (in a backtracking engine), even if doing so would allow the overall match to succeed: While the regex
similar names in a list of files, whereas regexes are usually employed in applications that pattern-match text strings in general. For example, the regex
1596: 9657: 4274:
is not applicable. For scripts like Chinese, another distinction seems logical: between traditional and simplified. In Arabic scripts, insensitivity to
4266:. As ASCII has case distinction, case insensitivity became a logical feature in text searching. Unicode introduced alphabetic scripts without case like 2501:. So the POSIX standard defines a character class, which will be known by the regex processor installed. Those definitions are in the following table: 6870: 1606:
Algebraic laws for regular expressions can be obtained using a method by Gischer, which is best explained through an example: in order to check whether (
354:
implementation with improved performance characteristics. Software projects that have adopted Spencer's Tcl regular expression implementation include
The choice (also known as alternation or set union) operator matches either the expression before or the expression after the operator. For example,
Another common extension serving the same function is atomic grouping, which disables backtracking for a parenthesized group. The typical syntax is
1201:
denotes the set of all strings with no symbols other than "a" and "b", including the empty string: {ε, "a", "b", "aa", "ab", "ba", "bb", "aaa", ...}
plus underscore. This reflects the fact that in many programming languages these are the characters that may be used in identifiers. The editor
9357: 9136: 2305:
matches any single character surrounded by "[" and "]" since the brackets are escaped, for example "[]]" and "[ ]" (bracket space bracket).
1207:
denotes the set of strings starting with "a", then zero or more "b"s and finally optionally a "c": {"a", "ac", "ab", "abc", "abb", "abbc", ...}
3740:. For example, many implementations allow grouping subexpressions with parentheses and recalling the value they match in the same expression ( 3540:
Because of its expressive power and (relative) ease of reading, many other utilities and programming languages have adopted syntax similar to
1755:
be primarily literal, and "escape" this usual meaning to become metacharacters. Common standards implement both. The usual metacharacters are
9632: 8992: 8973: 8941: 8890: 8783: 8750: 8714: 8648: 8623: 8419: 7303: 6864: 6669: 6631: 420:(Perl Compatible Regular Expressions), which attempts to closely mimic Perl's regex functionality and is used by many modern tools including 4610:
metacharacter(s) ;; the metacharacters column specifies the regex syntax being demonstrated =~ m//  ;; indicates a regex
1688:
axioms. Already in 1964, Redko had proved that no finite set of purely equational axioms can characterize the algebra of regular languages.
9282: 8087: 6534: 4433: 4038: 1109: 386: 1561: 640: 611: 8553: 7011: 6637: 4188:. In most respects it makes no difference what the character set is, but some issues do arise when extending regexes to support Unicode. 9372: 7990: 6675: 3535: 1811:
can be used to specify a range of lines (matching the pattern), which can be combined with other commands on either side, most famously
417: 362:(formerly named Perl 6) is to improve Perl's regex integration, and to increase their scope and capabilities to allow the definition of 9617: 8837:
Kleene, Stephen C. (1951). "Representation of Events in Nerve Nets and Finite Automata". In Shannon, Claude E.; McCarthy, John (eds.).
7448: 4195:. Some regex libraries expect to work on some particular encoding instead of on abstract Unicode characters. Many of these require the 1595:
that, for two given regular expressions, decides whether the described languages are equal; the algorithm reduces each expression to a
9843: 9297: 7310:
This property need not hold for extended regular expressions, even if they describe no larger class than regular languages; cf. p.121.
matches any three-character string ending with "at", including "hat", "cat", "bat", "4at", "#at" and " at" (starting with a space).
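The listed strings can be checked against the dot metacharacter. The pattern is not shown on this line, so the sketch assumes it is `.at`, and it uses Python's `re` module as the engine. The helper name `is_three_char_at` is hypothetical.

```python
import re

# Assumed pattern: any one character followed by the literal "at".
DOT_AT = re.compile(r".at")


def is_three_char_at(text):
    return DOT_AT.fullmatch(text) is not None


for sample in ["hat", "cat", "bat", "4at", "#at", " at", "at", "chat"]:
    print(repr(sample), is_three_char_at(sample))
```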
2119:
matches any single character that is not a lowercase letter from "a" to "z". Likewise, literal characters and ranges can be mixed.
1922:
been adopted as the default syntax of many tools, where the choice of BRE or ERE modes is usually a supported option. For example,
717:
each of the three strings. However, there can be many ways to write a regular expression for the same set of strings: for example,
655:(DFA) is run on the target text string to recognize substrings that match the regular expression. The picture shows the NFA scheme 5435:
Matches a zero-width boundary between a word-class character (see next) and either a non-word class character or an edge; same as
1747:, they have a metacharacter escape to a literal mode; starting out, however, they instead have the four bracketing metacharacters 9465: 9450: 4554: 1588:
As seen in many of the examples above, there is more than one way to construct a regular expression to achieve the same results.
1539: 8670: 7908: 4565:
to apply that style, any word of four or more consecutive capital letters will be automatically rendered as small caps instead.
804:, character, or group) specifies how many times the preceding element is allowed to repeat. The most common quantifiers are the 528:
also achieve this, but are more limited in what they can pattern, as they have fewer metacharacters and a simple language-base.
matches the entire line (because the entire line begins and ends with a double-quote) instead of matching only the first part,
1310: 635:
translates a regular expression in the above syntax into an internal representation that can be executed and matched against a
9739: 7563: 3719:
Possessive quantifiers are easier to implement than greedy and lazy quantifiers, and are typically more efficient at runtime.
1743:. Modern and POSIX extended regexes use metacharacters more often than their literal meaning, so to avoid "backslash-osis" or 228:"prehensible", but admitted "We would welcome any suggestions as to a more descriptive term.") Other early implementations of 9116: 9108: 9100: 7737: 7346: 7246: 6518: 6512: 4455: 3965: 1732: 652: 351: 261: 127: 8756: 6551: 272:'s use of regular expressions ("grep" is a word derived from the command for regular expression searching in the ed editor: 8485: 7647: 8371: 6526: 6517:
Regular expressions can often be created ("induced" or "learned") based on a set of example strings. This is known as the
4485: 3990:). Note that the size of the expression is the size after abbreviations, such as numeric quantifiers, have been expanded. 3557: 2311:
matches s followed by zero or more characters, for example: "s", "saw", "seed", "s3w96.7", and "s6#h%(>>>m n mQ".
1270: 444: 9744: 9599: 6953: 9874: 9343: 9268: 7396: 4763:, ... later to refer to the previously matched pattern. Some implementations may use a backslash notation instead, like 3748:). This means that, among other things, a pattern can match strings of repeated words like "papa" or "WikiWiki", called 3553: 2332:) syntax. With this syntax, a backslash causes the metacharacter to be treated as a literal character. So, for example, 1965: 678: 547:
matches excess whitespace at the beginning or end of a line. An advanced regular expression that matches any numeral is
209: 104: 8592: 8332:. The 'm' is only necessary if the user wishes to specify a match operation without using a forward-slash as the regex 6982: 6802: 5084:
The non-greedy match with 'l' followed by one or more characters is 'llo' rather than 'llo Wo'.
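The difference between the two quantifiers can be reproduced in Python. The input string "Hello World" is an assumption inferred from the two results quoted above, and the patterns `l.+o` and `l.+?o` are likewise illustrative.

```python
import re

TEXT = "Hello World"  # assumed sample input

# Greedy: .+ takes as much as it can, then backs off to the last "o".
greedy = re.search(r"l.+o", TEXT).group()

# Lazy: .+? takes as little as it can, stopping at the first "o".
lazy = re.search(r"l.+?o", TEXT).group()

print(repr(greedy), repr(lazy))
```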
9879: 9823: 9549: 9440: 8362: 7943: 7522:"Essential classes: Regular Expressions: Quantifiers: Differences Among Greedy, Reluctant, and Possessive Quantifiers" 6892: 6769: 4490: 4460: 3736:
Many features found in virtually all modern regular expression libraries provide an expressive power that exceeds the
3561: 3545: 1263: 1044: 440: 359: 9673: 2139:
Defines a marked subexpression. The string matched within the parentheses can be recalled later (see the next entry,
8136: 2026:
Matches the starting position within the string. In line-based tools, it matches the starting position of any line.
1731:
Depending on the regex processor there are about fourteen metacharacters, characters that may or may not have their
1121:
of sets described by R and S. For example, if R describes {"ab", "c"} and S describes {"ab", "d", "ef"}, expression
9828: 9706: 9609: 9336: 7723:
Reprinted as "QED Text Editor Reference Manual", MHCC-004, Murray Hill Computing, Bell Laboratories (October 1972).
7481: 6566: 3772: 1744: 363: 92: 29: 9579: 5166:
There is an 'e' followed by zero or more 'l' characters followed by an 'o' (e.g., eo, elo, ello, elllo).
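A short check of the star quantifier, assuming the pattern behind this description is `el*o` and that the sample input is "Hello World" (neither is shown on this line):

```python
import re

STAR = re.compile(r"el*o")  # assumed pattern: e, then any number of l, then o


def matches_whole(text):
    return STAR.fullmatch(text) is not None


found = STAR.search("Hello World").group()
print(found, [matches_whole(s) for s in ["eo", "elo", "ello", "elllo", "eleo"]])
```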
3716:
because the engine is forbidden from backtracking and so cannot try setting the group to "w" after matching "wi".
2231:
matches only "aaa", "aaaa", and "aaaaa". This is not found in a few older instances of regexes. BRE mode requires
1085:
Given regular expressions R and S, the following operations over them are defined to produce regular expressions:
358:. Perl later expanded on Spencer's original library to add many new features. Part of the effort in the design of 9894: 9889: 9838: 9734: 9678: 9533: 9261: 8344: 7044: 4569: 96: 88: 9637: 7965:
Schmid, Markus L. (March 2019). "Regular Expressions with Backreferences: Polynomial-Time Matching Techniques".
7787: 4653:
The syntax and conventions used in these examples coincide with that of other programming environments as well.
1564:
computes an equivalent nondeterministic finite automaton. A conversion in the opposite direction is achieved by
1269:
Regular expressions in this sense can express the regular languages, exactly the class of languages accepted by
169:. Regular expressions are supported in many programming languages. Library implementations are often called an " 9833: 9569: 9413: 9408: 8838: 7111: 4506: 4445: 4214: 3786:
for their patterns. This has led to a nomenclature where the term regular expression has different meanings in
797: 394: 379: 253: 4119: 4066: 3779:, and the execution time for known algorithms grows exponentially by the number of backreference groups used. 516:
A very simple case of a regular expression in this syntax is to locate a word spelled two different ways in a
9494:
Any language in each category is generated by a grammar and by an automaton in the category in the same line.
1538:
must have at least 2^k states. Luckily, there is a simple mapping from regular expressions to the more general
705:
of strings required for a particular purpose. A simple way to specify a finite set of strings is to list its
9782: 8264: 464: 4207:. In contrast, Perl and Java are agnostic on encodings, instead operating on decoded characters internally. 9424: 9362: 9287: 9123: 8568: 8516: 7677: 7521: 6848: 6774: 3905:
Other features not found in describing regular languages include assertions. These include the ubiquitous
1887: 759: 702: 225: 9082: 8112: 7245:, a regular expression of length about 850 such that its complement has a length about 2 can be found at 7003: 490:, having a special meaning, or a regular character that has a literal meaning. For example, in the regex 9787: 9729: 9589: 9517: 9367: 9315: 9128: 6916: 6914: 6529:. Formally, given examples of strings in a regular language, and perhaps also given examples of strings 3760: 3577: 1147: 648: 80: 8909: 8677:
Proceedings of the 25th International Symposium on Theoretical Aspects of Computer Science (STACS 2008)
7855:
Kearns, Steven (August 2013). "Sublinear Matching With Finite Automata Using Reverse Suffix Scanning".
$$(a\mid b)^{*}\,a\,\underbrace{(a\mid b)(a\mid b)\cdots(a\mid b)}_{k-1\text{ times}}.$$
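The growth in DFA size for this family can be observed with a small subset construction. The sketch below is an illustration under stated assumptions: the NFA is hand-built with k + 1 states for "the k-th symbol from the end is a", and the function names are hypothetical. It counts the DFA states reachable from the start state.

```python
def build_nfa(k):
    """Transition map of a (k+1)-state NFA for (a|b)* a (a|b)^(k-1)."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for state in range(1, k):
        delta[(state, "a")] = {state + 1}
        delta[(state, "b")] = {state + 1}
    return delta


def reachable_dfa_states(k):
    """Count the subsets of NFA states reachable by subset construction."""
    delta = build_nfa(k)
    start = frozenset({0})
    seen = {start}
    todo = [start]
    while todo:
        current = todo.pop()
        for symbol in "ab":
            nxt = frozenset(
                t for s in current for t in delta.get((s, symbol), ())
            )
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


for k in range(1, 7):
    print(k, reachable_dfa_states(k))
```

The count doubles each time k grows by one, while the NFA gains only a single state.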
in the size of the shortest equivalent regular expressions. The standard example here is the languages
375: 8865:
Kozen, Dexter (1991). "A completeness theorem for Kleene algebras and the algebra of regular events".
7485: 4502: 4323:
Introduction of character classes for Unicode blocks, scripts, and numerous other character properties
9594: 9460: 9435: 9292: 9253: 7602: 6561: 4589: 3230: 2056:
A bracket expression. Matches a single character that is contained within the brackets. For example,
431:
Today, regexes are widely supported in programming languages, text processing programs (particularly
300: 8735:
Proceedings of the 35th International Colloquium on Automata, Languages and Programming (ICALP 2008)
7259: 9792: 8521: 8337: 4597: 4593: 4542: 4437: 1673: 264:, an important early example of JIT compilation. He later added this capability to the Unix editor 197: 189: 115: 9094:
Information technology – Portable Operating System Interface (POSIX) – Part 2: Shell and Utilities
7368:
Information technology – Portable Operating System Interface (POSIX) – Part 2: Shell and Utilities
6722: 236:
language, which did not use regular expressions, but instead its own pattern matching constructs.
9721: 9445: 9387: 9331: 9118:
Information technology – Portable Operating System Interface (POSIX) Base Specifications, Issue 7
9028: 8896: 8825: 8680: 8534: 7966: 7900: 7856: 7529: 7376:
Information technology – Portable Operating System Interface (POSIX) Base Specifications, Issue 7
6790:
The concept of regular events was introduced by Kleene via the definition of regular expressions.
6556: 4573: 4549: 4247: 2443:
POSIX Extended Regular Expressions can often be used with modern Unix utilities by including the
1668: 1274: 636: 532: 525: 425: 399: 9110:
Information technology – Portable Operating System Interface (POSIX) – Part 2: System Interfaces
9102:
Information technology – Portable Operating System Interface (POSIX) – Part 2: System Interfaces
8654: 7372:
Information technology – Portable Operating System Interface (POSIX) – Part 2: System Interfaces
2062:
specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed:
1531:
On the other hand, it is known that every deterministic finite automaton accepting the language
9076: 3782:
However, many tools, libraries, and engines that provide such constructions still use the term
9853: 9180: 8988: 8969: 8937: 8886: 8779: 8746: 8710: 8644: 8640: 8619: 8501: 8415: 8411: 8365: 7299: 7283:
Gischer, Jay L. (1984). (Title unknown) (Technical Report). Stanford Univ., Dept. of Comp. Sc.
with an unbounded number of backreferences, as supported by numerous modern tools, is still
character meaning, depending on context, or whether they are "escaped", i.e. preceded by an
with a backslash is reversed for some characters in the POSIX Extended Regular Expression (
"There are one or more consecutive letter \"l\"'s in $string1.\n"
There is an 'H' and a 'e' separated by 0-1 characters (e.g., He Hue Hee).
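The optional-character behaviour reported in this output can be sketched in Python. This is an illustrative sketch rather than the article's own Perl example: the pattern `H.?e` and the sample words are assumptions chosen to fit the "He Hue Hee" hint.

```python
import re

# Assumed pattern: 'H', then zero or one arbitrary character, then 'e'.
optional_gap = re.compile(r"H.?e")

def first_match(text):
    """Return the first substring matching H.?e, or None if there is none."""
    found = optional_gap.search(text)
    return found.group(0) if found else None

for word in ["He", "Hue", "Hee", "Hello World", "Huge"]:
    print(word, "->", first_match(word))
```

The `?` quantifier lets the middle character be absent, so "He" matches as well as "Hue" and "Hee", while "Huge" does not match because two characters separate the 'H' and the 'e'.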
have been attested since at least 1994, starting with Perl 5. The look-behind assertions
Many textbooks use the symbols ∪, +, or ∨ for alternation instead of the vertical bar.
Larry Wall, author of the Perl programming language, writes in an essay about the design of Raku:
The formal definition of regular expressions is minimal on purpose, and avoids defining
(empty string) ε denoting the set containing only the "empty" string, which has no characters at all.
a.*b matches any string that contains an "a", and then the character "b" at some later point.
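As a hedged illustration of the wildcard description above, the following Python sketch assumes the pattern in question is `a.*b` and contrasts it with `a.b`, which allows exactly one character in between. The helper name `contains` is hypothetical.

```python
import re

# Assumed pattern: "a", then any run of characters, then "b".
later_b = re.compile(r"a.*b")
# Assumed companion pattern: exactly one character between "a" and "b".
one_between = re.compile(r"a.b")

def contains(pattern, text):
    """True if the pattern matches anywhere inside text."""
    return pattern.search(text) is not None

print(contains(later_b, "xaqqqby"))   # "a" followed later by "b"
print(contains(later_b, "ba"))        # "b" comes before "a"
print(contains(one_between, "a-b"))   # one character in between
print(contains(one_between, "ab"))    # nothing in between
```

In Python's `re` module the dot does not match a newline unless `re.DOTALL` is given, so `a.*b` does not bridge two lines by default.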
The concept of regular expressions began in the 1950s, when the American mathematician
editor, allow block-crossing but the character values must not be more than 256 apart.
The oldest and fastest relies on a result in formal language theory that allows every
character is treated as a literal character if it is the last or the first (after the
, it is necessary and sufficient to check whether the particular regular expressions (
Johnson, Walter L.; Porter, James H.; Ackley, Stephanie I.; Ross, Douglas T. (1968).
and text direction markers. These codes might have to be dealt with in a special way.
for regular expressions varies among tools and with context; more detail is given in
implemented a tool based on regular expressions that is used for lexical analysis in
"Automatic generation of efficient lexical processors using finite state techniques"
. Sometimes it is useful to specify an alternate regex delimiter in order to avoid "
there are TWO non-whitespace characters, which may be separated by other characters.
. Thus, possessive quantifiers are most useful with negated character classes, e.g.
In Python and some other implementations (e.g. Java), the three common quantifiers (
[1991] Proceedings Sixth Annual IEEE Symposium on Logic in Computer Science
sequences, before reordering them into canonical order (and optionally recomposing
characters as their token set though regex libraries have supported numerous other
in other regex flavors which support them. With most other regex flavors, the term
Matches a single character that is not contained within the brackets. For example,
1677: 1060: 734: 724:
Most formalisms provide the following operations to construct regular expressions.
in use. Additionally, the functionality of regex implementations can vary between
Regexes are useful in a wide variety of text processing tasks, and more generally
POSIX character classes can only be used within bracket expressions. For example,
[hc]+at matches "hat", "cat", "hhat", "chat", "hcat", "cchchat", and so on, but not "at".
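The list of matches above is consistent with a bracket expression followed by a "one or more" quantifier. The sketch below assumes the pattern is `[hc]+at` and compares it with the "zero or more" variant `[hc]*at`; the sample list is taken from the text, and the variable names are illustrative only.

```python
import re

samples = ["at", "hat", "cat", "hhat", "chat", "hcat", "cchchat"]

# Assumed pattern: one or more of "h"/"c", followed by "at".
one_or_more = re.compile(r"[hc]+at")
# Same character class, but the prefix may be empty.
zero_or_more = re.compile(r"[hc]*at")

# fullmatch requires the whole sample to match, not just a substring.
plus_matches = [s for s in samples if one_or_more.fullmatch(s)]
star_matches = [s for s in samples if zero_or_more.fullmatch(s)]

print("plus:", plus_matches)
print("star:", star_matches)
```

Only "at" separates the two results: `+` demands at least one character from the class, whereas `*` also accepts none.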
there are TWO whitespace characters, which may be separated by other characters.
and pattern matching. For this reason, some people have taken to using the term
character can be included in a bracket expression if it is the first (after the
), advanced text editors, and some other programs. Regex support is part of the
"'l' followed by 'o' (e.g., eo, elo, ello, elllo).\n"
combining characters into the leading base character) is called normalization.
Handbook of Theoretical Computer Science, volume A: Algorithms and Complexity
There exists a substring with at least 1 and at most 2 l's in Hello World
a.b matches any string that contains an "a", and then any character and then "b".
"more characters is 'llo' rather than 'llo Wo'.\n"
There are one or more consecutive letter "l"'s in Hello World.
. For example, in ASCII-based implementations, character ranges of the form
Many variations of these original forms of regular expressions were used in
249: 130:
for writing regular expressions have existed since the 1980s, one being the
"Jumbo Regexp Patch Applied (with Minor Fix-Up Tweaks): Perl/perl5@c277df4"
Also worth noting is that these regexes are all Perl-like syntax. Standard
For R = {"ab", "c"}, R* denotes {ε, "ab", "c", "abab", "abc", "cab", "cc", "ababab", "abcab", ...}.
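The set listed above is the Kleene star of a two-word set. A minimal sketch, assuming R = {"ab", "c"}, that enumerates all members of R* up to a length bound; the function name `kleene_star` and the bound are hypothetical choices for illustration.

```python
def kleene_star(words, max_len):
    """Return every concatenation of zero or more words whose length is <= max_len."""
    result = {""}      # ε: the concatenation of zero words
    frontier = {""}    # strings discovered in the previous round
    while frontier:
        next_frontier = set()
        for prefix in frontier:
            for word in words:
                candidate = prefix + word
                if len(candidate) <= max_len and candidate not in result:
                    result.add(candidate)
                    next_frontier.add(candidate)
        frontier = next_frontier
    return result

# Sort by length, then alphabetically, to make the listing easy to read.
members = sorted(kleene_star({"ab", "c"}, 3), key=lambda s: (len(s), s))
print(members)
```

The full set R* is infinite; the length bound only truncates the enumeration so that it terminates.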
highlights show the match results of the regular expression pattern:
which in ASCII are tab, line feed, form feed, carriage return, and space;
and patterns can be joined with a comma to specify a range of lines as in
The specific syntax rules vary depending on the specific implementation,
[hc]*at matches "at", "hat", "cat", "hhat", "chat", "hcat", "cchchat", and so on.
^[hc]at matches "hat" and "cat", but only at the beginning of the string or line.
The third algorithm is to match the pattern against the input string by
, matching as few characters as possible, by appending a question mark:
by default because they match as many characters as possible. The regex
gray|grey and gr(a|e)y are equivalent patterns which both describe the set of "gray" or "grey".
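The equivalence stated above can be checked mechanically. The sketch below assumes the two patterns are `gray|grey` and `gr(a|e)y` and compares them on a handful of sample words; the word list is an arbitrary illustration, not an exhaustive proof.

```python
import re

alternation = re.compile(r"gray|grey")   # two complete alternatives
grouped = re.compile(r"gr(a|e)y")        # alternation limited to one letter

words = ["gray", "grey", "griy", "gry", "graey", "GRAY"]

# For each word, record whether each pattern matches the whole word.
verdicts = {
    word: (alternation.fullmatch(word) is not None,
           grouped.fullmatch(word) is not None)
    for word in words
}

for word, (by_alternation, by_grouping) in verdicts.items():
    print(word, by_alternation, by_grouping)
```

Parentheses narrow the scope of the vertical bar, which is why the grouped form can factor out the shared "gr" and "y".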
"In $string1 there are TWO non-whitespace characters, which"
"In $string1 there are TWO whitespace characters, which may"
are attested since 1997 in a commit by Ilya Zakharevich to Perl 5.005.
2037: 324: 8672:
Succinctness of the Complement and Intersection of Regular Expressions
"How to simulate lookaheads and lookbehinds in finite state automata?"
representing the text being searched in. One possible approach is the
. Lecture Notes in Computer Science. Vol. 5126. pp. 39–50.
"The non-greedy match with 'l' followed by one or "
"Ganymede," he continued, "is the largest moon in the Solar System."
). A marked subexpression is also called a block or capturing group.
denotes a simpler regular expression in turn, which has already been
(preceded by ANSI "GCA 101-1983") consolidated. The kernel of the
. Upper Saddle River, New Jersey: Addison Wesley. pp. 117–120.
Hopcroft, John E.; Motwani, Rajeev & Ullman, Jeffrey D. (2003).
"Regular Expression Tutorial - Learn How to Use Regular Expressions"
could mean any digit. Character classes apply to both POSIX levels.
99 is the first number in '99 bottles of beer on the wall.'
"There is an 'H' and a 'e' separated by "
[hc]at$ matches "hat" and "cat", but only at the end of the string or line.
1763:. The usual characters that become metacharacters when escaped are 501:(match all lower case letters from 'a' to 'z') is less general and 9418: 8727:
Finite Automata, Digraph Connectivity, and Regular Expression Size
"UTS#18 on Unicode Regular Expressions, Annex A: Character Blocks"
"$string1 contains at least one of Hello, Hi, or Pogo."
Unless otherwise indicated, the following examples conform to the
, where the data need not be textual. Common applications include
. The general problem of matching any number of backreferences is
could stand for any uppercase letter in the English alphabet, and
"$string1 starts with the characters 'He'.\n"
When you match a pattern within parentheses, you can use any of
Hopcroft, John E.; Motwani, Rajeev; Ullman, Jeffrey D. (2000).
Matches the preceding element zero or more times. For example,
Every regular expression can be written solely in terms of the
Σ, the following constants are defined as regular expressions:
Matches the beginning of a string (but not an internal line).
'd regex that comes before to match as few times as possible.
Matches the preceding element one or more times. For example,
1923: 1828: 721:
also specifies the same set of three strings in this example.
456: 421: 403: 339: 308: 304: 162: 158: 9007:"Programming Techniques: Regular expression search algorithm" 5139:"There is an 'e' followed by zero to many " 4614:
operation in Perl =~ s///  ;; indicates a regex
. However, Google Code Search was shut down in January 2012.
Matches the preceding element zero or one time. For example,
. This notation is particularly well known due to its use in
, and is built into the syntax of others, including Perl and
Each category of languages, except those marked by a , is a
, ISO/IEC 9945-2:2003, and currently ISO/IEC/IEEE 9945:2009
"$1 is the first number in '$string1'\n"
. Many modern regex engines offer at least some support for
that decide whether and how a given regex matches a string.
in formal language theory. The pattern for these strings is
[xyz]* matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on.
Introduction to Automata Theory, Languages, and Computation
"Regular Expression Matching: the Virtual Machine Approach"
Introduction to Automata Theory, Languages, and Computation
There is at least one alphanumeric character in Hello World
"There is a word that ends with 'llo'.\n"
backreferences and the following metacharacters are added:
, which originally derived from a regex library written by
Regular expressions originated in 1951, when mathematician
(6). The Open Group. 2004. IEEE Std 1003.1, 2004 Edition.
Matches the preceding pattern element zero or more times.
matches any character in the Armenian script. In general,
An additional non-POSIX class understood by some tools is
"Chapter 10. Patterns, Automata, and Regular Expressions"
(1990). "Algorithms for finding patterns in strings". In
"On defining relations for the algebra of regular events"
Matches every character except the ones inside brackets.
Matches the preceding pattern element one or more times.
"We matched '$1' and '$2'.\n"
Groups a series of pattern elements to a single element.
matches any uppercase letter. Binary properties that are
matches the uppercase letters and lowercase "a" and "b".
ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".
$(a\mid b)^{*}a(a\mid b)(a\mid b)(a\mid b)$
ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".
standards consists of regexes. Its use is evident in the
International Journal of Foundations of Computer Science
The Open Group Base Specifications Issue 7, 2018 edition
7079: 6338:
Matches the end of a string (but not an internal line).
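The difference between string anchors and line anchors can be sketched in Python. This is an illustration under assumptions, not the article's Perl example: Python's `re` module spells the end-of-string anchor `\Z`, and the sample text "Hello\nWorld\n" is chosen to have an internal line break.

```python
import re

text = "Hello\nWorld\n"

# With re.MULTILINE, ^ also matches right after each internal newline.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \A matches only at the very start of the string, even with re.MULTILINE.
string_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)

# \Z matches only at the very end of the string, after the final newline here.
ends_with_d_newline = re.search(r"d\n\Z", text) is not None
ends_with_d = re.search(r"d\Z", text) is not None

print(line_starts, string_start, ends_with_d_newline, ends_with_d)
```

Because the sample text ends with a newline, the last character before the end of the string is "\n", not "d", which is why the final check fails.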
"There is at least one character in $string1"
"character in $string1 (A-Z, a-z, 0-9, _).\n"
Matches the preceding pattern element zero or one time.
Matches any single character (many applications exclude
(including the empty string). If R denotes {"ab", "c"},
Cezar Câmpeanu; Kai Salomaa & Sheng Yu (Dec 2003).
SRE: Atomic Grouping (?>...) is not supported #34627
6831: 3623:. The aforementioned quantifiers may, however, be made 1909:, and it removes the need to escape the metacharacters 1807:
is the editor command for searching, and an expression
Liger, François; McQueen, Craig; Wilton, Paul (2002).
The character 'm' is not always required to specify a
7104:"GRegex – Faster Analytics for Unstructured Text Data" 6812: 6810: 6467:"$ string1 contains a character other than " 5708:
The space between Hello and World is not alphanumeric.
matches any character with either the binary property
Sublinear runtime algorithms have been achieved using
Sams Teach Yourself Regular Expressions in 10 Minutes
7196: 5244:"There exists a substring with at least 1 " 4513:
which uses regular expressions to identify bad titles
compatible regex engines that are faster compared to
Regular Expression, IEEE Std 1003.1-2017, Open Group
"Regular Expression Matching Can Be Simple and Fast"
"NR-grep: a fast and flexible pattern-matching tool"
I-Regexp: An Interoperable Regular Expression Format
" may be separated by other characters.\n"
"$string1 contains one or more vowels.\n"
Denotes the minimum M and the maximum N match count.
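A short sketch of the bounded quantifier described above, written in Python rather than the article's Perl. The patterns `l{1,2}`, `l{2}`, and `l{2,}` and the sample strings are assumptions chosen for illustration.

```python
import re

text = "Hello World"

# l{1,2}: at least one and at most two consecutive "l" characters.
bounded = re.search(r"l{1,2}", text)

# {2} means exactly two repetitions; {2,} means two or more.
exactly_two = re.fullmatch(r"l{2}", "ll")
at_least_two = re.fullmatch(r"l{2,}", "lllll")

# Three "l" characters exceed the upper bound when the whole string must match.
too_many = re.fullmatch(r"l{1,2}", "lll")

print(bounded.group(0), exactly_two is not None,
      at_least_two is not None, too_many is None)
```

The quantifier is greedy by default, so the search over "Hello World" takes both consecutive "l" characters rather than stopping after the first.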
The following conventions are used in the examples.
(including the double-quotes) applied to the string
in Σ denoting the set containing only the character
In the 1980s, the more complicated regexes arose in
9806: 9753: 9720: 9687: 9666: 9608: 9540: 6923:, 10.11 Bibliographic Notes for Chapter 10, p. 589. 6569:– converts a regular expression into an equivalent 5531:property contains more than Latin letters, and the 4436:support regex capabilities, either natively or via 4278:may be desired. In Japanese, insensitivity between 4014: 3910: 3906: 3490: 3472: 3463: 3454: 3439: 3436: 3424: 3416: 3397: 3387: 3381: 3372: 3353: 3343: 3337: 3328: 3318: 3308: 3298: 3277: 3259: 3249: 3239: 3235: 3224: 3205: 3197: 3188: 3169: 3159: 3153: 3144: 3125: 3115: 3109: 3100: 3081: 3073: 3064: 3054: 3044: 3034: 3023: 3013: 2995: 2985: 2975: 2971: 2962: 2943: 2935: 2873: 2863: 2797: 2787: 2772: 2761: 2742: 2732: 2726: 2717: 2698: 2688: 2682: 2673: 2663: 2653: 2643: 2632: 2622: 2612: 2602: 2591: 2572: 2564: 2555: 2536: 2487: 2479: 2475: 2463: 2460: 2116:matches any character other than "a", "b", or "c". 1300:. On the one hand, a regular expression describing 893:occurrences of the preceding element. For example, 873:occurrences of the preceding element. For example, 853:occurrences of the preceding element. For example, 767: 758:are used to define the scope and precedence of the 738: 548: 536: 46: 30:
Pointer (computer science) § Pointer-to-member
8955: 5514:Matches an alphanumeric character, including "_"; 4355:matches code points not in that block. Similarly, 4157: 4104: 3759:The language of squares is not regular, nor is it 3651:In Java and Python 3.11+, quantifiers may be made 1819:("global regex print"), which is included in most 1787:. However, they are often written with slashes as 1520: 1386: 268:, which eventually led to the popular search tool 220:, motivated by Kleene's attempt to describe early 16:Sequence of characters that forms a search pattern 7628:"A Formal Study of Practical Regular Expressions" 6237:is a line or string that ends with 'rld'. 4675:Normally matches any character except a newline. 2065:matches "a", "b", "c", "x", "y", or "z", as does 877:matches "ac", "abc", "abbc", "abbbc", and so on. 103:. Regular expression techniques are developed in 8064:"Regular expressions library - cppreference.com" 6623:The Oxford Handbook of Computational Linguistics 5780:" be separated by other characters.\n" 5253:"and at most 2 l's in $ string1\n" 4968:"0-1 characters (e.g., He Hue Hee).\n" 1644:) denote the same language over the alphabet Σ={ 19:"Regex" redirects here. For the comic book, see 8088:"Regular Expression Language - Quick Reference" 7260:"Regular expressions for deciding divisibility" 5578:"There is at least one alphanumeric " 3808: 3677:consumes the entire input, including the final 2474:When specifying a range of characters, such as 2199:matches "", "ab", "abab", "ababab", and so on. 1827:distributions. A similar convention is used in 531:The usual context of wildcard characters is in 95:for "find" or "find and replace" operations on 28:".*" redirects here. 
For the C++ operator, see 7370:, successively revised as ISO/IEC 9945-2:2002 7063: 7061: 6893:"An incomplete history of the QED Text Editor" 6073:There is at least one character in Hello World 4221:Extending ASCII-oriented constructs to Unicode 3510:Note that what the POSIX regex standards call 1258:matches all strings over Σ* that do not match 338:(1986), who later wrote an implementation for 9518: 9144: 8846:. Princeton University Press. pp. 3–42. 7698:Ritchie, D. M.; Thompson, K. L. (June 1970). 7670:"Perl Regular Expression Matching is NP-Hard" 5501:There is a word that ends with 'llo'. 5283:Denotes a set of possible character matches. 4276:initial, medial, final, and isolated position 2078:, if present) character within the brackets: 1712:as metacharacters. Metacharacters help form: 1284:consisting of all strings over the alphabet { 224:. (Kleene introduced it as an alternative to 134:standard and another, widely used, being the 8: 8930:Visual Basic .NET Text Manipulation Handbook 8776:Real World Regular Expressions with Java 1.4 7230: 7080:"PCRE - Perl Compatible Regular Expressions" 6979:"New Regular Expression Features in Tcl 8.1" 6497:contains a character other than a, b, and c. 6389:"that ends with 'd\\n'.\n" 5422:contains at least one of Hello, Hi, or Pogo. 1720:telling how many atoms (and whether it is a 9466:Counter-free (with aperiodic finite monoid) 7242: 7218: 6301:"that starts with 'H'.\n" 6216:"that ends with 'rld'.\n" 6089:Matches the beginning of a line or string. 5934:"99 bottles of beer on the wall." 4831:We matched 'Hel' and 'o W'. 4719:"$ string1 has length >= 5.\n" 4677:Within square brackets the dot is literal. 3978:(2), but it can be run on a string of size 3150:Visible characters and the space character 1704:. The pattern is composed of a sequence of 1584:Deciding equivalence of regular expressions 58:followed by one or more lower-case vowels). 
9525: 9511: 9503: 9176: 9151: 9137: 9129: 8009: 8007: 6207:"$ string1 is a line or string " 5727:break spaces, next line, and the variable- 4568:While regexes would be useful on Internet 3818: 2503: 1597:minimal deterministic finite state machine 524:matches both "serialise" and "serialize". 9022: 8962:Introduction to the Theory of Computation 8819: 8705:Goyvaerts, Jan; Levithan, Steven (2009). 8684: 8520: 8389:All the if statements return a TRUE value 7970: 7860: 7815: 7596: 7397:"9.3.6 BREs Matching Multiple Characters" 7386:The Single Unix Specification (Version 2) 6920: 6413:is a string that ends with 'd\n'. 6325:is a string that starts with 'H'. 5535:property contains more than Arab digits. 4137: 4124: 4123: 4121: 4084: 4071: 4070: 4068: 3585:originally developed in PCRE and Python. 2087:. Backslash escapes are not allowed. The 1517: 1507: 1497: 1440: 1430: 1412: 1330: 1312: 1030:. They have the same expressive power as 439:of many programming languages, including 9697:Comparison of regular-expression engines 8724:Gruber, Hermann; Holzer, Markus (2008). 8549:The Single UNIX Specification, Version 2 7731: 7729: 7516: 7514: 7468: 7466: 6816: 6626:. Oxford University Press. p. 754. 6547:Comparison of regular expression engines 6152:starts with the characters 'He'. 5690:"World is not alphanumeric.\n" 5681:"The space between Hello and " 5625:-alphanumeric character, excluding "_"; 4655: 4440:. Comprehensive support is included in: 4411:. Examples of non-binary properties are 4158:{\displaystyle {\mathrm {O} }(n^{2k+1})} 4105:{\displaystyle {\mathrm {O} }(n^{2k+2})} 2357: 2009: 244:built Kleene's notation into the editor 7445:"Perl Regular Expression Documentation" 7163: 7161: 7159: 7157: 6762:"Regular Languages and Finite Automata" 6582: 6571:nondeterministic finite automaton (NFA) 5918:property, which itself the same as the 4061:backreference note has a complexity of 2348:. 
Additionally, support is removed for 2219:Matches the preceding element at least 2172:th marked subexpression matched, where 1870:standard has three sets of compliance: 1831:, where search and replace is given by 1195:denotes {ε, "a", "b", "bb", "bbb", ...} 957:The preceding item is matched at least 204:using his mathematical notation called 9358:Linear context-free rewriting language 7946:from the original on 14 September 2020 7451:from the original on December 31, 2009 7207: 7004:"Documentation: 9.3: Pattern Matching" 6932: 6747: 6521:and is part of the general problem of 4213:. Many regex engines support only the 909:The preceding item is matched exactly 9658:Zhu–Takaoka string matching algorithm 9283:Linear context-free rewriting systems 8669:Gelade, Wouter; Neven, Frank (2008). 7876:Navarro, Gonzalo (10 November 2001). 7321: 7197:Hopcroft, Motwani & Ullman (2000) 6698:"How a Regex Engine Works Internally" 6678:from the original on 27 February 2017 6658:Lawson, Mark V. (17 September 2003). 6165:Matches the end of a line or string. 4434:general-purpose programming languages 4315:. Unicode introduced amongst others, 3893:Look-behind and look-ahead assertions 3522:is used to describe what POSIX calls 1882:(Simple Regular Expressions). SRE is 1680:axiomatized regular expressions as a 1104:denotes {"abd", "abef", "cd", "cef"}. 737:separates alternatives. For example, 697:A regular expression, often called a 666:obtained from the regular expression 126:text-processing utilities. Different 7: 8404:"Regular Expressions, End of String" 8302:from the original on 21 October 2018 8233:"re – Regular expression operations" 7067: 6535:language identification in the limit 4199:encoding, while others might expect 4039:Regular expression Denial of Service 4006:-sregex family based on Cox's code. 1897:BRE and ERE work together. ERE adds 1890:. 
The subsection below covering the 1878:(Extended Regular Expressions), and 1230:—these can be expressed as follows: 985:matches any character. For example, 941:The preceding item is matched up to 91:. Usually such patterns are used by 9623:Boyer–Moore string-search algorithm 8985:Regular Expression Pocket Reference 8374:Scripting for Computational Science 7914:from the original on 7 October 2020 7804: 7767:from the original on 7 October 2020 7591:. Internet Engineering Task Force. 7566:from the original on 7 October 2020 7536:from the original on 7 October 2020 6055:" that is not a digit.\n" 5359:Separates alternate possibilities. 4625:regular expressions are different. 3953:There are at least three different 3665:matches the entire line, the regex 3576:. Some languages and tools such as 3536:Perl Compatible Regular Expressions 2191:matches "ac", "abc", "abbbc", etc. 2003:, whereas Extended Regular Syntax ( 1961: 1886:, in favor of BRE, as both provide 857:matches both "color" and "colour". 145:, in search and replace dialogs of 9491:of the category directly above it. 8573:The Open Group Base Specifications 8462:. The MIT Press. pp. 255–300. 8137:"Regular expressions - JavaScript" 7339:Ukrainskii Matematicheskii Zhurnal 6380:"$ string1 is a string " 6292:"$ string1 is a string " 4125: 4072: 3949:Implementations and running times 3732:Patterns for non-regular languages 1957: 1823:-based operating systems, such as 1721: 1599:, and determines whether they are 14: 9712:Nondeterministic finite automaton 9653:Two-way string-matching algorithm 8502:"A brief history of just-in-time" 7885:Software: Practice and Experience 6760:Leung, Hing (16 September 2010). 5181:N can be omitted and M can be 0: 4050:Boyer-Moore (BM) based algorithms 3962:nondeterministic finite automaton 3689:when applied to the same string. 2597:Alphanumeric characters plus "_" 1562:Thompson's construction algorithm 1125:describes {"ab", "c", "d", "ef"}. 762:(among other uses). 
For example, 645:nondeterministic finite automaton 641:Thompson's construction algorithm 402:syntax for filenames, and in the 122:. They came into common use with 9070: 9045:"Apocalypse 5: Pattern Matching" 8569:"Chapter 9: Regular Expressions" 7786:Zakharevich, Ilya (1997-11-19). 6723:"How Do You Actually Use Regex?" 5721:Matches a whitespace character, 4335:library, properties of the form 3972:has the time and memory cost of 2320:Metacharacters in POSIX extended 1980:standard, Basic Regular Syntax ( 1540:nondeterministic finite automata 1296:th-from-last letter equals  1218:Expressive power and compactness 391:structure specification language 315:, and in other programs such as 248:as a means to match patterns in 141:Regular expressions are used in 9051:from the original on 2010-01-12 8853:from the original on 2020-10-07 8797:(2nd ed.). Addison-Wesley. 8762:from the original on 2011-07-11 8657:from the original on 2005-08-30 8579:from the original on 2011-12-02 8556:from the original on 2020-10-07 8488:from the original on 2020-10-07 8481:Foundations of Computer Science 8428:from the original on 2020-10-07 8267:from the original on 2022-11-29 8113:"Pattern (Java Platform SE 7 )" 8021:from the original on 2020-10-07 7993:from the original on 2020-10-07 7761:Computer Science Stack Exchange 7680:from the original on 2020-10-07 7650:from the original on 2015-07-04 7349:from the original on 2018-03-29 7143:from the original on 2020-10-07 7114:from the original on 2020-10-07 7047:from the original on 2009-12-31 7014:from the original on 2020-10-07 6985:from the original on 2020-10-07 6873:from the original on 2020-10-07 6849:"A Regular Expressions Matcher" 6640:from the original on 2017-02-28 5729:width spaces (amongst others). 4545:systems, and many other tasks. 3964:(NFA) to be transformed into a 2421:matches "at", "hat", and "cat". 2283:matches all strings matched by 2273:matches all strings matched by 2047:matches only "a", ".", or "c". 1652:}. 
More generally, an equation 9628:Boyer–Moore–Horspool algorithm 9618:Apostolico–Giancarlo algorithm 8957:"Chapter 1: Regular Languages" 8633:Friedl, Jeffrey E. F. (2002). 8324:match operation. For example, 8040:"regex(3) - Linux manual page" 7707:. MM-70-1373-3. Archived from 7247:File:RegexComplementBlowup.png 6664:. CRC Press. pp. 98–100. 6519:induction of regular languages 6513:Induction of regular languages 4152: 4130: 4099: 4077: 3966:deterministic finite automaton 3438:, which is usually defined as 2324:The meaning of metacharacters 1937:" for BRE (the default), and " 1855:, using commas as delimiters. 1739:, in this case, the backslash 1488: 1476: 1470: 1458: 1455: 1443: 1427: 1414: 1381: 1369: 1366: 1354: 1351: 1339: 1327: 1314: 1252:generalized regular expression 1155:denotes the set of all finite 925:The preceding item is matched 653:deterministic finite automaton 346:. The Tcl library is a hybrid 262:Compatible Time-Sharing System 1: 8636:Mastering Regular Expressions 7585:Bormann, Carsten; Bray, Tim. 7432:Digression: POSIX Submatching 7170:"grep(1) - Linux manual page" 6527:computational learning theory 4264:Cousins of case insensitivity 1894:applies to both BRE and ERE. 1874:(Basic Regular Expressions), 1397:Generalizing this pattern to 1271:deterministic finite automata 1250:operator is added, to give a 1022:Regular expressions describe 9633:Knuth–Morris–Pratt algorithm 9560:Damerau–Levenshtein distance 8707:Regular Expressions Cookbook 7987:"Vim documentation: pattern" 5725:in Unicode, also matches no- 5346:contains one or more vowels. 5189:matches "at least" M times; 4413:\p{Bidi_Class=Right_to_Left} 4238:in the range and codepoint( 3514:are commonly referred to as 3453:classes (using the notation 2378:matches only "ac" or "abc". 
1964:, named capture groups, and 1929:has the following options: " 1560:Given a regular expression, 849:The question mark indicates 800:after an element (such as a 344:Advanced Regular Expressions 210:theoretical computer science 192:, who introduced the concept 118:formalized the concept of a 105:theoretical computer science 75:), sometimes referred to as 9824:Compressed pattern matching 9550:Approximate string matching 8964:. PWS Publishing. pp.  8743:10.1007/978-3-540-70583-3_4 7474:"Regular Expression Syntax" 6770:New Mexico State University 5185:matches "exactly" M times; 4395:general categories include 2287:other than "hat" and "cat". 1591:It is possible to write an 1011: 749:can match "gray" or "grey". 713:; we say that this pattern 498:is a very general pattern, 364:parsing expression grammars 93:string-searching algorithms 9916: 9829:Longest common subsequence 9740:Needleman–Wunsch algorithm 9610:String-searching algorithm 9373:Deterministic context-free 9298:Deterministic context-free 8983:Stubblebine, Tony (2003). 8500:Aycock, John (June 2003). 8328:could also be rendered as 7989:. Vimdoc.sourceforge.net. 7736:Wall, Larry (1994-10-18). 7482:Python Software Foundation 7478:Python 3.5.0 documentation 7231:Gruber & Holzer (2008) 7037:"Perl Regular Expressions" 6510: 6476:"a, b, and c.\n" 6350:"Hello\nWorld\n" 6262:"Hello\nWorld\n" 5193:matches "at most" N times. 4343:match characters in block 3533: 3293:Non-whitespace characters 2367: 2202: 2180: 2158: 2132: 2122: 2107: 2059:matches "a", "b", or "c". 2050: 2029: 2019: 1745:leaning toothpick syndrome 222:artificial neural networks 27: 18: 9839:Sequential pattern mining 9679:Commentz-Walter algorithm 9667:Multiple string searching 9600:Wagner–Fischer algorithm 9484: 9446:Nondeterministic pushdown 9174: 9011:Communications of the ACM 8908:Laurikari, Ville (2009). 
8808:Communications of the ACM 7644:10.1142/S012905410300214X 7243:Gelade & Neven (2008) 6954:"Jargon File 4.4.7: grep" 6552:Extended Backus–Naur form 6437:"Hello World\n" 6177:"Hello World\n" 6101:"Hello World\n" 6016:"Hello World\n" 5830:"Hello World\n" 5741:"Hello World\n" 5651:"Hello World\n" 5548:"Hello World\n" 5453:"Hello World\n" 5371:"Hello World\n" 5295:"Hello World\n" 5214:"Hello World\n" 5109:"Hello World\n" 5027:"Hello World\n" 4929:"Hello World\n" 4856:"Hello World\n" 4783:"Hello World\n" 4689:"Hello World\n" 4561:and then using the regex 3892: 2095:, if present) character: 2044:matches "abc", etc., but 961:times, but not more than 623:* means "zero or more of 520:, the regular expression 177:are available for reuse. 9849:String rewriting systems 9834:Longest common substring 9745:Smith–Waterman algorithm 9570:Gestalt pattern matching 8875:10.1109/LICS.1991.151646 8552:. The Open Group. 1997. 7219:Gelade & Neven (2008 6702:regular-expressions.info 6596:Regular-Expressions.info 6491: 6428: 6404: 6341: 6316: 6253: 6231: 6168: 6146: 6092: 6070: 6007: 5979: 5925: 5920:\p{Numeric_Type=Decimal} 5910:in Unicode, same as the 5884: 5821: 5795: 5732: 5705: 5642: 5602: 5539: 5498: 5444: 5416: 5362: 5340: 5286: 5268: 5205: 5163: 5100: 5081: 5018: 4983: 4920: 4901: 4847: 4828: 4774: 4734: 4680: 4375:or the general category 4215:Basic Multilingual Plane 3802:to describe the latter. 2561:Alphanumeric characters 2406:matches "abc" or "def". 2267:matches "hat" and "cat". 1972:POSIX basic and extended 1142:of the set described by 889:The plus sign indicates 380:recursive descent parser 299:in the 1970s, including 254:just-in-time compilation 9783:Generalized suffix tree 9707:Thompson's construction 9115:ISO/IEC/IEEE 9945:2009 8774:Habibi, Mehran (2004). 8546:"Regular Expressions". 7934:"travisdowns/polyregex" 7221:, p. 332, Thm.4.1) 6620:Mitkov, Ruslan (2003). 
7932: 7931: 7927: 7917: 7915: 7911: 7897:10.1002/spe.411 7880: 7875: 7874: 7870: 7854: 7853: 7849: 7837: 7835: 7826: 7825: 7821: 7814: 7810: 7803: 7799: 7785: 7784: 7780: 7770: 7768: 7754: 7753: 7749: 7735: 7734: 7727: 7717: 7715: 7711: 7704: 7701:QED Text Editor 7697: 7696: 7692: 7683: 7681: 7674:perl.plover.com 7668: 7667: 7663: 7659:Theorem 3 (p.9) 7653: 7651: 7625: 7624: 7620: 7610: 7608: 7584: 7583: 7579: 7569: 7567: 7554: 7553: 7549: 7539: 7537: 7520: 7519: 7512: 7505: 7501: 7491: 7489: 7488:on 18 July 2018 7472: 7471: 7464: 7454: 7452: 7443: 7442: 7438: 7421: 7420: 7416: 7406: 7404: 7395: 7394: 7390: 7385: 7381: 7365: 7361: 7352: 7350: 7332: 7331: 7327: 7320: 7316: 7306: 7293: 7292: 7288: 7282: 7281: 7277: 7268: 7266: 7258: 7257: 7253: 7240: 7236: 7229: 7225: 7217: 7213: 7206: 7202: 7195: 7188: 7178: 7176: 7167: 7166: 7155: 7146: 7144: 7137:bkase.github.io 7131: 7130: 7126: 7117: 7115: 7102: 7101: 7097: 7088: 7086: 7078: 7077: 7073: 7066: 7059: 7050: 7048: 7031: 7030: 7026: 7017: 7015: 7002: 7001: 6997: 6988: 6986: 6977: 6976: 6972: 6963: 6961: 6944: 6943: 6939: 6931: 6927: 6919: 6912: 6902: 6900: 6890: 6889: 6885: 6876: 6874: 6867: 6843: 6842: 6838: 6830: 6823: 6815: 6808: 6800: 6796: 6783: 6781: 6777: 6764: 6759: 6758: 6754: 6746: 6742: 6732: 6730: 6729:. 
5812: 5806: 5805: 5798:In Hello World 5796: 5733: 5730: 5719: 5713: 5712: 5706: 5643: 5640: 5636: 5635: 5630:in ASCII, and 5619: 5613: 5612: 5603: 5540: 5537: 5533:Decimal_Number 5525: 5524: 5519:in ASCII, and 5512: 5506: 5505: 5499: 5445: 5442: 5433: 5427: 5426: 5417: 5363: 5360: 5357: 5351: 5350: 5341: 5287: 5284: 5281: 5276: 5275: 5269: 5206: 5203: 5177: 5171: 5170: 5164: 5101: 5098: 5095: 5089: 5088: 5082: 5019: 5016: 4997: 4991: 4990: 4984: 4921: 4918: 4915: 4909: 4908: 4902: 4848: 4845: 4842: 4836: 4835: 4829: 4801:m/(H..).(o..)/ 4775: 4772: 4751: 4745: 4744: 4735: 4681: 4678: 4673: 4667: 4666: 4663: 4660: 4609: 4585: 4582: 4570:search engines 4548:Some high-end 4535:data wrangling 4499: 4496: 4494: 4493: 4488: 4483: 4478: 4473: 4468: 4463: 4458: 4453: 4448: 4442: 4429: 4426: 4425: 4424: 4401:\p{Alphabetic} 4361:\p{IsArmenian} 4320: 4310: 4292:. Unicode has 4287: 4261: 4255: 4242:) ≤ codepoint( 4218: 4208: 4182:character sets 4173: 4170: 4154: 4149: 4146: 4143: 4140: 4136: 4132: 4127: 4101: 4096: 4093: 4090: 4087: 4083: 4079: 4074: 3950: 3947: 3901: 3900: 3890: 3889: 3875: 3863: 3859: 3858: 3846: 3834: 3830: 3829: 3826: 3823: 3816: 3813: 3744:backreferences 3733: 3730: 3724: 3721: 3672: 3661: 3648: 3645: 3615: 3590: 3587: 3570:.NET Framework 3568:, Microsoft's 3531: 3528: 3419: 3418: 3414: 3395: 3385: 3383: 3379: 3375: 3374: 3370: 3351: 3341: 3339: 3335: 3331: 3330: 3326: 3316: 3306: 3296: 3294: 3290: 3289: 3285: 3257: 3247: 3237: 3233: 3227: 3226: 3222: 3203: 3201: 3199: 3195: 3191: 3190: 3186: 3167: 3157: 3155: 3151: 3147: 3146: 3142: 3123: 3113: 3111: 3107: 3103: 3102: 3098: 3079: 3077: 3075: 3071: 3067: 3066: 3062: 3052: 3042: 3032: 3030: 3026: 3025: 3021: 2993: 2983: 2973: 2969: 2965: 2964: 2960: 2941: 2939: 2937: 2933: 2927: 2926: 2871: 2861: 2859: 2857: 2855: 2851: 2850: 2795: 2785: 2780: 2770: 2768: 2764: 2763: 2759: 2740: 2730: 2728: 2724: 2723:Space and tab 2720: 2719: 2715: 2696: 2686: 2684: 2680: 2676: 2675: 2671: 2661: 2651: 2641: 2639: 2635: 
8642: 8638: 8637: 8631: 8627: 8621: 8617: 8613: 8609: 8599:on 2010-01-01 8598: 8594: 8589: 8578: 8574: 8570: 8566: 8555: 8551: 8550: 8544: 8540: 8536: 8532: 8528: 8523: 8518: 8515:(2): 97–113. 8514: 8510: 8503: 8498: 8487: 8483: 8482: 8474: 8470: 8465: 8461: 8457: 8453: 8449: 8448: 8443: 8427: 8423: 8417: 8413: 8409: 8405: 8401: 8395: 8392: 8386: 8383: 8379: 8375: 8373: 8368: 8367: 8366:in a Nutshell 8364: 8357: 8354: 8350: 8346: 8343: 8339: 8335: 8323: 8317: 8314: 8301: 8297: 8296: 8291: 8287: 8281: 8278: 8266: 8262: 8258: 8252: 8249: 8238: 8234: 8228: 8225: 8214: 8210: 8204: 8201: 8190: 8186: 8180: 8177: 8166: 8162: 8156: 8153: 8142: 8138: 8132: 8129: 8118: 8114: 8108: 8105: 8093: 8092:microsoft.com 8089: 8083: 8080: 8069: 8065: 8059: 8056: 8045: 8041: 8035: 8032: 8020: 8016: 8010: 8008: 8004: 7992: 7988: 7982: 7979: 7973: 7968: 7961: 7958: 7945: 7941: 7940: 7935: 7929: 7926: 7910: 7906: 7902: 7898: 7894: 7890: 7886: 7879: 7872: 7869: 7863: 7858: 7851: 7848: 7844: 7834:on 2021-08-18 7833: 7829: 7823: 7820: 7817: 7812: 7809: 7806: 7801: 7798: 7793: 7789: 7782: 7779: 7766: 7762: 7758: 7751: 7748: 7743: 7739: 7732: 7730: 7726: 7714:on 2015-02-03 7710: 7703: 7702: 7694: 7691: 7679: 7675: 7671: 7665: 7662: 7649: 7645: 7641: 7637: 7633: 7629: 7622: 7619: 7607: 7604: 7599: 7594: 7590: 7589: 7581: 7578: 7565: 7561: 7557: 7551: 7548: 7535: 7531: 7527: 7523: 7517: 7515: 7511: 7508: 7503: 7500: 7487: 7483: 7479: 7475: 7469: 7467: 7463: 7450: 7446: 7440: 7437: 7433: 7429: 7425: 7418: 7415: 7402: 7398: 7392: 7389: 7383: 7380: 7377: 7373: 7369: 7363: 7360: 7348: 7344: 7340: 7336: 7329: 7326: 7323: 7318: 7315: 7311: 7307: 7301: 7297: 7290: 7287: 7279: 7276: 7265: 7261: 7255: 7252: 7248: 7244: 7238: 7235: 7232: 7227: 7224: 7220: 7215: 7212: 7209: 7208:Sipser (1998) 7204: 7201: 7198: 7193: 7191: 7187: 7175: 7171: 7164: 7162: 7160: 7158: 7154: 7142: 7138: 7134: 7128: 7125: 7113: 7109: 7105: 7099: 7096: 7085: 7081: 7075: 7072: 7069: 7064: 7062: 7058: 7046: 7042: 7038: 7034: 7028: 
7025: 7013: 7009: 7005: 6999: 6996: 6984: 6980: 6974: 6971: 6960:on 2011-06-05 6959: 6955: 6951: 6947: 6941: 6938: 6935:, p. 98. 6934: 6929: 6926: 6922: 6917: 6915: 6911: 6899:on 1999-02-21 6898: 6894: 6887: 6884: 6872: 6868: 6862: 6858: 6854: 6850: 6846: 6840: 6837: 6833: 6828: 6826: 6822: 6818: 6817:Thompson 1968 6813: 6811: 6807: 6804: 6801:Kleene 1951, 6798: 6795: 6791: 6776: 6772: 6771: 6763: 6756: 6753: 6749: 6744: 6741: 6728: 6727:howtogeek.com 6724: 6718: 6715: 6703: 6699: 6693: 6690: 6677: 6673: 6667: 6663: 6662: 6654: 6651: 6639: 6635: 6629: 6625: 6624: 6616: 6613: 6602:on 2016-11-01 6601: 6597: 6593: 6586: 6583: 6576: 6572: 6568: 6565: 6563: 6560: 6558: 6555: 6553: 6550: 6548: 6545: 6544: 6540: 6538: 6536: 6532: 6528: 6524: 6520: 6514: 6506: 6490: 6489: 6427: 6424: 6420: 6419: 6403: 6402: 6340: 6337: 6332: 6331: 6315: 6314: 6252: 6249: 6244: 6243: 6230: 6229: 6167: 6164: 6159: 6158: 6145: 6144: 6091: 6088: 6083: 6082: 6069: 6068: 6006: 5994: 5989: 5988: 5978: 5977: 5924: 5902: 5897: 5896: 5883: 5882: 5820: 5817: 5813: 5808: 5807: 5794: 5793: 5731: 5720: 5715: 5714: 5704: 5703: 5641: 5639: 5632: 5631: 5624: 5620: 5615: 5614: 5601: 5600: 5538: 5536: 5521: 5520: 5513: 5508: 5507: 5497: 5496: 5443: 5441: 5434: 5429: 5428: 5415: 5414: 5361: 5358: 5353: 5352: 5339: 5338: 5285: 5282: 5278: 5277: 5267: 5266: 5204: 5178: 5173: 5172: 5162: 5161: 5099: 5096: 5091: 5090: 5080: 5079: 5017: 4999:Modifies the 4998: 4993: 4992: 4982: 4981: 4919: 4916: 4911: 4910: 4900: 4899: 4846: 4843: 4838: 4837: 4827: 4826: 4773: 4752: 4747: 4746: 4733: 4732: 4679: 4674: 4669: 4668: 4664: 4661: 4658: 4657: 4654: 4651: 4647: 4640:, or lack of 4631: 4626: 4624: 4617: 4613: 4608: 4605: 4601: 4599: 4595: 4591: 4583: 4581: 4579: 4575: 4571: 4566: 4560: 4556: 4551: 4546: 4544: 4540: 4536: 4532: 4528: 4527:data scraping 4524: 4520: 4512: 4508: 4504: 4497: 4492: 4489: 4487: 4484: 4482: 4479: 4477: 4474: 4472: 4469: 4467: 4464: 4462: 4459: 4457: 4454: 4452: 4449: 4447: 4444: 4443: 4441: 4439: 
4435: 4427: 4394: 4378: 4374: 4346: 4333: 4328: 4324: 4321: 4318: 4314: 4311: 4308: 4304: 4299: 4295: 4291: 4290:Normalization 4288: 4285: 4281: 4277: 4273: 4270:. For these, 4269: 4265: 4262: 4259: 4256: 4253: 4249: 4245: 4241: 4237: 4233: 4229: 4222: 4219: 4216: 4212: 4209: 4206: 4202: 4198: 4194: 4191: 4190: 4189: 4187: 4183: 4179: 4171: 4169: 4147: 4144: 4141: 4138: 4134: 4094: 4091: 4088: 4085: 4081: 4058: 4056: 4051: 4046: 4042: 4040: 4012: 4007: 4005: 4001: 3997: 3991: 3989: 3985: 3981: 3977: 3976: 3971: 3967: 3963: 3958: 3956: 3948: 3946: 3914: 3898: 3891: 3882: 3876: 3869: 3864: 3861: 3860: 3852: 3847: 3840: 3835: 3832: 3831: 3827: 3824: 3821: 3820: 3814: 3811: 3807: 3805: 3801: 3797: 3793: 3789: 3785: 3780: 3778: 3774: 3770: 3766: 3765:pumping lemma 3763:, due to the 3762: 3757: 3751: 3747: 3739: 3731: 3729: 3723:IETF I-Regexp 3722: 3720: 3717: 3712:only matches 3700:matches both 3690: 3670: 3660: 3654: 3646: 3644: 3639:matches only 3634: 3630: 3626: 3614: 3608: 3589:Lazy matching 3588: 3586: 3583: 3579: 3575: 3571: 3567: 3563: 3559: 3555: 3551: 3547: 3543: 3537: 3530:Perl and PCRE 3529: 3527: 3525: 3521: 3517: 3513: 3508: 3452: 3448: 3444: 3433: 3415: 3396: 3386: 3384: 3380: 3377: 3376: 3371: 3352: 3342: 3340: 3336: 3333: 3332: 3327: 3317: 3307: 3297: 3295: 3292: 3291: 3286: 3258: 3248: 3238: 3234: 3232: 3229: 3228: 3223: 3204: 3202: 3200: 3196: 3193: 3192: 3187: 3168: 3158: 3156: 3152: 3149: 3148: 3143: 3124: 3114: 3112: 3108: 3105: 3104: 3099: 3080: 3078: 3076: 3072: 3069: 3068: 3063: 3053: 3043: 3033: 3031: 3028: 3027: 3022: 2994: 2984: 2974: 2970: 2967: 2966: 2961: 2942: 2940: 2938: 2934: 2932: 2929: 2928: 2872: 2862: 2860: 2858: 2856: 2853: 2852: 2796: 2786: 2781: 2771: 2769: 2766: 2765: 2760: 2741: 2731: 2729: 2725: 2722: 2721: 2716: 2697: 2687: 2685: 2681: 2678: 2677: 2672: 2662: 2652: 2642: 2640: 2637: 2636: 2631: 2621: 2611: 2601: 2599: 2596: 2595: 2590: 2571: 2569: 2567: 2563: 2560: 2559: 2554: 2535: 2533: 2531: 2529: 2526: 2525: 2521: 2518: 
2515: 2512: 2509: 2506: 2505: 2502: 2500: 2496: 2492: 2485:to uppercase 2484: 2472: 2454: 2452: 2446: 2435: 2429: 2423: 2417: 2416: 2415: 2414: 2401: 2396: 2395: 2387: 2382: 2381: 2373: 2368: 2363: 2360: 2359: 2356: 2353: 2331: 2327: 2319: 2317: 2307: 2301: 2295: 2289: 2279: 2277:except "bat". 2269: 2263: 2257: 2256: 2255: 2254: 2241: 2237: 2226: 2222: 2218: 2213: 2209: 2203: 2186: 2181: 2175: 2171: 2167: 2164: 2159: 2155: 2144: 2138: 2133: 2128: 2123: 2112: 2108: 2104: 2055: 2051: 2039: 2035: 2030: 2025: 2020: 2015: 2012: 2011: 2008: 2006: 1987: 1983: 1979: 1971: 1969: 1967: 1963: 1959: 1958:lazy matching 1946: 1944: 1925: 1920: 1895: 1893: 1889: 1885: 1881: 1877: 1873: 1869: 1866: 1858: 1856: 1842: 1830: 1826: 1822: 1818: 1802: 1790: 1774: 1772: 1746: 1738: 1734: 1729: 1726: 1724: 1719: 1715: 1707: 1703: 1699: 1691: 1689: 1687: 1683: 1679: 1675: 1670: 1666: 1661: 1659: 1655: 1651: 1647: 1643: 1640: 1636: 1632: 1628: 1624: 1620: 1617: 1613: 1609: 1604: 1602: 1598: 1594: 1589: 1583: 1581: 1579: 1575: 1569: 1567: 1563: 1558: 1556: 1551: 1549: 1545: 1541: 1537: 1514: 1504: 1501: 1498: 1492: 1485: 1482: 1479: 1473: 1467: 1464: 1461: 1452: 1449: 1446: 1436: 1431: 1423: 1420: 1417: 1407: 1406: 1405: 1403: 1395: 1378: 1375: 1372: 1363: 1360: 1357: 1348: 1345: 1342: 1336: 1331: 1323: 1320: 1317: 1303: 1299: 1295: 1291: 1287: 1283: 1276: 1275:exponentially 1272: 1267: 1265: 1261: 1257: 1253: 1249: 1217: 1209: 1203: 1197: 1191: 1190: 1189: 1188: 1184: 1158: 1149: 1145: 1141: 1133: 1132: 1127: 1120: 1112: 1111: 1106: 1094: 1093: 1092:concatenation 1088: 1087: 1086: 1080: 1072: 1071: 1066: 1063: 1062: 1057: 1054: 1050: 1049: 1048: 1046: 1037: 1035: 1033: 1029: 1025: 1017: 1015: 1013: 1012:§ Syntax 1009: 1004: 993: 987: 986: 983: 979:The wildcard 978: 976: 973: 972: 964: 960: 956: 953: 949: 948: 944: 940: 937: 933: 932: 928: 924: 921: 917: 916: 912: 908: 905: 901: 900: 892: 888: 885: 881: 880: 872: 868: 865: 861: 860: 852: 848: 845: 841: 840: 837: 836: 832: 825: 821: 814: 807: 
806:question mark 803: 799: 795: 792: 761: 757: 754: 751: 736: 732: 730: 727: 726: 725: 722: 716: 708: 704: 700: 692: 690: 688: 684: 680: 676: 670: 663: 659: 654: 650: 646: 642: 638: 634: 626: 622: 617: 613: 609: 605: 534: 529: 527: 519: 514: 512: 509: 489: 488:metacharacter 485: 481: 473: 471: 470: 466: 462: 458: 454: 450: 446: 442: 438: 434: 429: 427: 423: 419: 415: 410: 405: 401: 396: 392: 388: 383: 381: 377: 373: 369: 368:mini-language 365: 361: 357: 353: 349: 345: 341: 337: 336:Henry Spencer 333: 328: 326: 322: 318: 314: 310: 306: 302: 298: 294: 289: 287: 283: 277: 271: 267: 263: 259: 255: 251: 247: 243: 237: 235: 231: 227: 223: 219: 215: 211: 207: 203: 199: 191: 187: 180: 178: 176: 175:many of these 172: 168: 164: 160: 156: 152: 148: 144: 139: 137: 133: 129: 125: 121: 117: 112: 110: 106: 102: 98: 94: 90: 86: 85:match pattern 82: 78: 74: 70: 66: 57: 37: 31: 24: 23: 9768:Suffix array 9688: 9674:Aho–Corasick 9585:Lee distance 9425:Nested stack 9368:Context-free 9293:Context-free 9254:Unrestricted 9117: 9109: 9101: 9093: 9053:. Retrieved 9014: 9010: 8987:. O'Reilly. 8984: 8961: 8929: 8918:. Retrieved 8914:the original 8866: 8855:. Retrieved 8839: 8811: 8807: 8794: 8778:. Springer. 8775: 8764:. Retrieved 8734: 8726: 8706: 8695:. Retrieved 8691:the original 8676: 8671: 8659:. Retrieved 8635: 8615: 8601:. Retrieved 8597:the original 8581:. Retrieved 8572: 8558:. Retrieved 8548: 8512: 8508: 8490:. Retrieved 8480: 8459: 8430:. Retrieved 8407: 8394: 8385: 8370: 8361: 8356: 8316: 8304:. Retrieved 8293: 8280: 8269:. Retrieved 8260: 8251: 8240:. Retrieved 8236: 8227: 8216:. Retrieved 8212: 8203: 8192:. Retrieved 8188: 8179: 8168:. Retrieved 8165:v2.ocaml.org 8164: 8155: 8144:. Retrieved 8140: 8131: 8120:. Retrieved 8116: 8107: 8096:. Retrieved 8091: 8082: 8071:. Retrieved 8067: 8058: 8047:. Retrieved 8043: 8034: 8023:. Retrieved 7995:. Retrieved 7981: 7960: 7948:. Retrieved 7937: 7928: 7916:. Retrieved 7888: 7884: 7871: 7850: 7842: 7836:. 
Retrieved 7832:the original 7822: 7811: 7800: 7791: 7781: 7769:. Retrieved 7760: 7750: 7741: 7716:. Retrieved 7709:the original 7700: 7693: 7682:. Retrieved 7673: 7664: 7652:. Retrieved 7635: 7631: 7621: 7609:. Retrieved 7587: 7580: 7568:. Retrieved 7559: 7550: 7538:. Retrieved 7525: 7502: 7490:. Retrieved 7486:the original 7477: 7453:. Retrieved 7439: 7431: 7427: 7417: 7407:December 10, 7405:. Retrieved 7400: 7391: 7382: 7375: 7371: 7367: 7362: 7351:. Retrieved 7342: 7338: 7328: 7322:Kozen (1991) 7317: 7309: 7295: 7289: 7278: 7267:. Retrieved 7263: 7254: 7237: 7226: 7214: 7203: 7177:. Retrieved 7173: 7145:. Retrieved 7136: 7127: 7116:. Retrieved 7107: 7098: 7087:. Retrieved 7084:www.pcre.org 7083: 7074: 7049:. Retrieved 7040: 7027: 7016:. Retrieved 7007: 6998: 6987:. Retrieved 6973: 6962:. Retrieved 6958:the original 6940: 6928: 6901:. Retrieved 6897:the original 6886: 6875:. Retrieved 6852: 6839: 6797: 6789: 6782:. Retrieved 6775:the original 6768: 6755: 6743: 6731:. Retrieved 6726: 6717: 6705:. Retrieved 6701: 6692: 6680:. Retrieved 6660: 6653: 6642:. Retrieved 6622: 6615: 6604:. Retrieved 6600:the original 6595: 6585: 6530: 6516: 6487: 6486: 6400: 6399: 6312: 6311: 6227: 6226: 6142: 6141: 6066: 6065: 6004:in Unicode. 6000:in ASCII or 5975: 5974: 5880: 5879: 5815: 5791: 5790: 5701: 5700: 5638:in Unicode. 
5637: 5622: 5598: 5597: 5526: 5494: 5493: 5436: 5412: 5411: 5336: 5335: 5264: 5263: 5159: 5158: 5077: 5076: 4979: 4978: 4897: 4896: 4824: 4823: 4730: 4729: 4662:Description 4652: 4627: 4620: 4616:substitution 4615: 4611: 4606: 4602: 4587: 4567: 4547: 4531:web scraping 4529:(especially 4516: 4431: 4392: 4376: 4372: 4357:\p{Armenian} 4344: 4322: 4312: 4306: 4302: 4297: 4289: 4263: 4257: 4243: 4239: 4231: 4227: 4220: 4210: 4192: 4175: 4059: 4047: 4043: 4011:backtracking 4008: 3999: 3995: 3992: 3987: 3983: 3979: 3974: 3969: 3959: 3952: 3915: 3904: 3880: 3867: 3850: 3838: 3809: 3799: 3798:, or simply 3795: 3791: 3783: 3781: 3761:context-free 3758: 3749: 3741: 3735: 3726: 3718: 3694:(?>group) 3691: 3664: 3652: 3650: 3632: 3628: 3624: 3618: 3592: 3539: 3523: 3519: 3515: 3511: 3509: 3450: 3446: 3434: 3422: 2507:Description 2498: 2494: 2486: 2478: 2473: 2458: 2445:command line 2442: 2412: 2411: 2364:Description 2351: 2329: 2323: 2314: 2252: 2251: 2239: 2235: 2224: 2220: 2211: 2207: 2173: 2169: 2162: 2147: 2142: 2069: 2016:Description 2007:) does not. 2004: 1981: 1975: 1947: 1918: 1917:, which are 1896: 1891: 1879: 1875: 1871: 1862: 1778: 1757:{}()^$ .|*+? 
1730: 1722: 1717: 1713: 1705: 1701: 1697: 1695: 1678:Dexter Kozen 1662: 1657: 1653: 1649: 1645: 1641: 1638: 1634: 1630: 1626: 1622: 1618: 1615: 1611: 1607: 1605: 1590: 1587: 1573: 1570: 1559: 1552: 1532: 1530: 1398: 1396: 1307:is given by 1301: 1297: 1293: 1289: 1285: 1278: 1268: 1259: 1255: 1251: 1221: 1186: 1185: 1166: 1143: 1129: 1117:denotes the 1108: 1090: 1084: 1078: 1068: 1061:empty string 1059: 1052: 1041: 1021: 1006:The precise 1005: 1002: 981: 962: 958: 951: 942: 935: 926: 919: 910: 903: 890: 883: 871:zero or more 870: 863: 850: 843: 735:vertical bar 728:Boolean "or" 723: 714: 711:H(ä|ae?)ndel 698: 696: 686: 682: 674: 668: 661: 657: 632: 630: 624: 620: 530: 515: 483: 479: 477: 468: 430: 414:Philip Hazel 411: 384: 343: 329: 295:programs at 290: 275: 260:code on the 242:Ken Thompson 238: 232:include the 205: 195: 151:text editors 140: 113: 76: 72: 68: 64: 62: 55: 54:(lower case 21: 9778:Suffix tree 9434:restricted 9041:Wall, Larry 8295:Google Blog 8213:www.php.net 7950:21 November 7918:21 November 7771:24 November 7570:24 November 7540:23 December 7133:"CUDA grep" 7068:Wall (2002) 7033:Wall, Larry 6933:Aycock 2003 6748:Kleene 1951 6733:24 February 6707:24 February 6494:Hello World 6234:Hello World 6149:Hello World 5419:Hello World 5343:Hello World 4737:Hello World 4644:instead of 4507:A blacklist 4353:\P{Block=X} 4341:\p{Block=X} 4236:code points 3942:(?<!...) 3936:(?<=...) 3777:NP-complete 3767:. However, 3687:"Ganymede," 3641:"Ganymede," 3621:"Ganymede," 3029:Non-digits 2783:\< \> 2499:aAbBcC...zZ 1837:/re1/,/re2/ 1718:quantifiers 1686:Horn clause 1676:. 
In 1991, 1665:Kleene star 1509: times 1131:Kleene star 1110:alternation 891:one or more 851:zero or one 831:Kleene plus 822:), and the 820:Kleene star 756:Parentheses 679:recursively 616:Kleene star 612:Translating 518:text editor 478:The phrase 9869:Categories 9055:2006-10-11 8934:Wrox Press 8920:2009-04-01 8857:2017-12-10 8766:2011-02-03 8697:2009-06-15 8661:2005-04-26 8612:Forta, Ben 8603:2008-04-27 8583:2011-12-13 8560:2011-12-13 8492:2013-12-14 8444:References 8432:2017-09-10 8369:, p. 213; 8360:E.g., see 8271:2023-02-24 8242:2023-02-24 8218:2023-02-04 8194:2023-02-04 8170:2022-08-21 8146:2022-04-27 8122:2022-04-27 8098:2024-02-20 8073:2022-04-27 8049:2022-04-27 8025:2010-02-05 7997:2013-09-25 7972:1903.05896 7838:2022-02-12 7805:Cox (2007) 7718:2022-09-05 7684:2019-11-21 7654:2015-07-03 7492:10 October 7455:January 8, 7353:2018-03-28 7269:2024-02-21 7179:31 January 7147:2019-10-22 7118:2019-10-22 7089:2024-04-07 7051:2006-10-10 7018:2013-10-12 7008:PostgreSQL 6989:2013-10-11 6964:2009-02-17 6877:2013-05-15 6644:2016-07-25 6606:2016-10-31 5922:property. 5908:in ASCII; 5621:Matches a 5529:Alphabetic 5045:m/(l.+?o)/ 4559:small caps 4466:JavaScript 4268:Devanagari 3955:algorithms 3828:Lookahead 3825:Lookbehind 3815:Assertions 3804:Larry Wall 3698:^(wi|w)i$ 3673:, because 3653:possessive 3574:XML Schema 3550:JavaScript 3534:See also: 2342:\{ \} 2334:\( \) 2151:\( \) 1968:patterns. 1884:deprecated 1789:delimiters 1775:Delimiters 1725:quantifier 1669:set unions 1601:isomorphic 1248:complement 798:quantifier 449:ECMAScript 416:developed 409:operator. 372:Raku rules 356:PostgreSQL 250:text files 200:described 81:characters 9388:Star-free 9342:Positive 9332:Decidable 9267:Positive 9191:Languages 8883:1813/6963 8686:0802.2869 8517:CiteSeerX 8380:, p. 106. 
8334:delimiter 8261:Crates.io 7862:1308.3822 7428:swtch.com 7241:Based on 7108:grovf.com 6903:9 October 6784:13 August 6507:Induction 6449:$ string1 6431:$ string1 6362:$ string1 6344:$ string1 6274:$ string1 6256:$ string1 6189:$ string1 6171:$ string1 6113:$ string1 6095:$ string1 6028:$ string1 6010:$ string1 6002:\P{Digit} 5946:$ string1 5928:$ string1 5912:\p{Digit} 5848:m/\S.*\S/ 5842:$ string1 5824:$ string1 5759:m/\s.*\s/ 5753:$ string1 5735:$ string1 5663:$ string1 5645:$ string1 5560:$ string1 5542:$ string1 5465:$ string1 5447:$ string1 5383:$ string1 5365:$ string1 5307:$ string1 5289:$ string1 5232:m/l{1,2}/ 5226:$ string1 5208:$ string1 5121:$ string1 5103:$ string1 5039:$ string1 5021:$ string1 4941:$ string1 4923:$ string1 4868:$ string1 4850:$ string1 4795:$ string1 4777:$ string1 4701:$ string1 4683:$ string1 4537:, simple 4438:libraries 4389:\p{GC=Lu} 4114:time and 4041:(ReDoS). 3862:Negative 3833:Positive 3822:Assertion 3633:reluctant 3451:word-head 2413:Examples: 2253:Examples: 1966:recursive 1945:regexes. 1593:algorithm 1502:− 1493:⏟ 1483:∣ 1474:⋯ 1465:∣ 1450:∣ 1432:∗ 1421:∣ 1376:∣ 1361:∣ 1346:∣ 1332:∗ 1321:∣ 1187:Examples: 1177:a|(b(c*)) 1119:set union 1053:empty set 952:{min,max} 824:plus sign 764:gray|grey 760:operators 297:Bell Labs 256:(JIT) to 165:, and in 99:, or for 9186:Grammars 9049:Archived 9043:(2002). 9033:21260384 9005:(1968). 8954:(1998). 8901:19875225 8848:Archived 8830:17253809 8757:Archived 8655:Archived 8641:O'Reilly 8618:. Sams. 8614:(2004). 8577:Archived 8554:Archived 8539:15345671 8486:Archived 8471:(1992). 8426:Archived 8412:O'Reilly 8402:(2005). 8345:Archived 8340:". See ' 8300:Archived 8265:Archived 8185:"perlre" 8044:man7.org 8019:Archived 7991:Archived 7944:Archived 7909:Archived 7765:Archived 7678:Archived 7648:Archived 7611:11 March 7564:Archived 7534:Archived 7449:Archived 7347:Archived 7174:man7.org 7141:Archived 7112:Archived 7045:Archived 7035:(2006). 7012:Archived 6983:Archived 6952:(2003). 
6871:Archived 6676:Archived 6638:Archived 6541:See also 6368:m/d\n\z/ 6195:m/rld$ / 5997:same as 5952:m/(\d+)/ 5905:same as 5627:same as 5516:same as 5471:m/llo\b/ 5196:x* y+ z? 4707:m/...../ 4665:Example 4598:versions 4584:Examples 4409:\p{Dash} 4405:\p{Math} 4329:and the 4284:katakana 4280:hiragana 3982:in time 2513:Perl/Tcl 2346:{ } 2338:( ) 2038:newlines 1993:{ } 1989:( ) 1954:{ } 1950:( ) 1919:required 1915:{ } 1911:( ) 1851:with an 1803:, where 1791:, as in 1753:{ } 1749:( ) 1710:( ) 1696:A regex 1544:grammars 1292:} whose 1205:ab*(c|ε) 1140:superset 1045:alphabet 974:Wildcard 813:asterisk 752:Grouping 707:elements 673:, where 533:globbing 522:serialie 511:keyboard 474:Patterns 387:ISO SGML 288:design. 286:compiler 258:IBM 7094 138:syntax. 128:syntaxes 111:theory. 9844:Sorting 9814:Parsing 9534:Strings 9409:Decider 9383:Regular 9350:Indexed 9308:Regular 9275:Indexed 8458:(ed.). 7905:3175806 6948:citing 6682:25 July 6488:Output: 6401:Output: 6313:Output: 6228:Output: 6143:Output: 6067:Output: 5976:Output: 5881:Output: 5792:Output: 5702:Output: 5599:Output: 5495:Output: 5413:Output: 5337:Output: 5265:Output: 5160:Output: 5127:m/el*o/ 5078:Output: 4980:Output: 4947:m/H.?e/ 4898:Output: 4825:Output: 4731:Output: 4594:library 4578:Exalead 4539:parsing 4349:\P{InX} 4337:\p{InX} 4186:Unicode 4172:Unicode 4165:⁠ 4116:⁠ 4112:⁠ 4063:⁠ 3930:(?!...) 3924:(?=...) 3885:pattern 3871:pattern 3854:pattern 3842:pattern 3800:pattern 3750:squares 3629:minimal 2968:Digits 2437:cat|dog 2404:abc|def 2344:is now 2336:is now 2326:escaped 1976:In the 1939:grep -P 1935:grep -G 1931:grep -E 1733:literal 1698:pattern 1637:) and ( 1614:) and ( 1574:regexes 1546:of the 1254:; here 965:times. 945:times. 913:times. 
855:colou?r 715:matches 699:pattern 484:regexes 370:called 342:called 325:POSIX.2 181:History 173:", and 97:strings 9461:Finite 9393:Finite 9238:Type-3 9229:Type-2 9211:Type-1 9205:Type-0 9087:Curlie 9031:  8991:  8972:  8940:  8899:  8889:  8828:  8782:  8749:  8713:  8647:  8622:  8537:  8519:  8418:  8372:Python 7939:GitHub 7903:  7792:GitHub 7742:GitHub 7530:Oracle 7302:  7041:perlre 6863:  6668:  6630:  6280:m/\AH/ 6119:m/^He/ 4486:Python 4419:, and 4407:, and 4381:\p{Lu} 4205:UTF-32 4201:UTF-16 3796:regexp 3754:(.+)\1 3607:greedy 3605:) are 3572:, and 3558:Python 3408:XDigit 2522:ASCII 2229:a{3,5} 1941:" for 1905:, and 1845:s,/,X, 1815:as in 1813:g/re/p 1765:dswDSW 1723:greedy 1702:string 1692:Syntax 1576:. See 1238:, and 1199:(a|b)* 1175:, and 1148:closed 1008:syntax 936:{,max} 920:{min,} 811:, the 637:string 445:Python 433:lexers 319:, and 311:, and 234:SNOBOL 171:engine 73:regexp 22:Re:Gex 9807:Other 9763:DAFSA 9730:BLAST 9419:PTIME 9077:Regex 9029:S2CID 8966:31–90 8897:S2CID 8851:(PDF) 8844:(PDF) 8826:S2CID 8760:(PDF) 8731:(PDF) 8681:arXiv 8535:S2CID 8505:(PDF) 8476:(PDF) 8306:4 May 7967:arXiv 7912:(PDF) 7901:S2CID 7881:(PDF) 7857:arXiv 7712:(PDF) 7705:(PDF) 6778:(PDF) 6765:(PDF) 6577:Notes 6473:print 6464:print 6410:World 6407:Hello 6386:print 6377:print 6322:World 6319:Hello 6298:print 6289:print 6213:print 6204:print 6128:print 6052:print 6043:print 6034:m/\D/ 5961:print 5866:print 5857:print 5777:print 5768:print 5687:print 5678:print 5669:m/\W/ 5584:print 5575:print 5566:m/\w/ 5480:print 5398:print 5322:print 5250:print 5241:print 5191:{0,N} 5175:{M,N} 5145:print 5136:print 5063:print 5054:print 5013:{M,N} 4965:print 4956:print 4883:print 4874:m/l+/ 4810:print 4716:print 4646:POSIX 4634:\( \) 4623:POSIX 4612:match 4592:, or 4471:OCaml 4432:Most 4387:, or 4369:\p{X} 4363:, or 4234:have 4203:, or 4197:UTF-8 4178:ASCII 4055:agrep 3868:<! 3839:<= 3792:regex 3669:does 3667:".*+" 3637:".+?" 
3578:Boost 3554:Julia 3364:Upper 3270:Space 3216:Punct 3180:Print 3136:Lower 3092:Graph 3006:Digit 2954:Cntrl 2753:Blank 2709:Alpha 2583:Alnum 2547:ASCII 2510:POSIX 2497:, or 2447:flag 2197:(ab)* 1978:POSIX 1868:POSIX 1825:Linux 1714:atoms 1706:atoms 1578:below 1244:(a|ε) 1181:a|bc* 1169:(ab)c 1123:(R|S) 1115:(R|S) 802:token 508:ASCII 482:, or 321:Emacs 153:, in 132:POSIX 69:regex 9798:Trie 9788:Rope 9166:and 8989:ISBN 8970:ISBN 8938:ISBN 8887:ISBN 8780:ISBN 8747:ISBN 8711:ISBN 8709:. . 8645:ISBN 8620:ISBN 8416:ISBN 8363:Java 8322:Perl 8308:2019 7952:2019 7920:2019 7773:2019 7613:2024 7606:9485 7572:2019 7542:2016 7494:2015 7457:2012 7409:2023 7300:ISBN 7181:2023 6905:2013 6861:ISBN 6803:pg46 6786:2019 6735:2024 6709:2024 6684:2016 6666:ISBN 6628:ISBN 5313:m/+/ 5187:{M,} 4636:vs. 4630:Perl 4576:and 4563:{4,} 4498:Uses 4491:Rust 4476:Perl 4461:Java 4347:and 4327:Perl 4307:some 4282:and 4248:gawk 4230:and 3939:and 3927:and 3916:The 3909:and 3897:Perl 3704:and 3683:"*+" 3657:".*" 3625:lazy 3611:".+" 3601:and 3580:and 3562:Ruby 3546:Java 3542:Perl 3462:and 3449:and 3447:word 2914:)(?= 2902:< 2899:)|(? 2890:)(?= 2878:< 2838:)(?= 2826:< 2823:)|(? 
2814:)(?= 2802:< 2519:Java 2390:ab+c 2376:ab?c 2340:and 2297:at$ 2189:ab*c 2101:abc] 2097:abc] 2070:The 2001:\{\} 1999:and 1997:\(\) 1991:and 1952:and 1943:Perl 1927:grep 1913:and 1865:IEEE 1863:The 1841:Perl 1821:Unix 1817:grep 1809:/re/ 1793:/re/ 1785:"re" 1767:and 1759:and 1751:and 1667:and 1555:ISBN 1226:and 1193:a|b* 1161:(R*) 1153:(R*) 1136:(R*) 1102:(RS) 1097:(RS) 995:a.*b 895:ab+c 875:ab*c 766:and 746:grey 740:gray 614:the 574:*)?| 461:PCRE 453:FPGA 443:and 441:Java 424:and 418:PCRE 407:LIKE 400:glob 360:Raku 332:Perl 313:expr 293:Unix 270:grep 161:and 149:and 136:Perl 124:Unix 107:and 89:text 48:/r+/ 42:Blue 9085:at 9019:doi 8879:hdl 8871:doi 8816:doi 8739:doi 8527:doi 8378:PHP 8326:m// 8141:MDN 7893:doi 7640:doi 7603:RFC 7593:doi 6531:not 6525:in 6455:m// 5914:or 5816:but 5623:non 5183:{M} 5011:or 4761:$ 2 4757:$ 1 4749:( ) 4650:). 4533:), 4509:on 4481:PHP 4451:C++ 4393:not 4351:or 4339:or 4298:may 4252:Vim 4004:re2 3895:in 3714:wii 3706:wii 3675:.*+ 3631:or 3627:or 3582:PHP 3489:or 3443:Vim 3276:or 3012:or 2516:Vim 2431:+at 2425:*at 2419:?at 2330:ERE 2309:s.* 2291:^at 2285:.at 2275:.at 2259:.at 2135:( ) 2042:a.c 2005:ERE 1982:BRE 1924:GNU 1880:SRE 1876:ERE 1872:BRE 1829:sed 1236:aa* 1173:abc 1026:in 989:a.b 963:max 959:min 943:max 927:min 904:{n} 703:set 689:). 601:+)? 589:+)( 544:+$ 465:CPU 457:GPU 422:PHP 404:SQL 395:DTD 376:BNF 352:DFA 348:NFA 340:Tcl 309:AWK 305:sed 301:lex 246:QED 163:AWK 159:sed 87:in 71:or 9871:: 9162:: 9047:. 9027:. 9015:11 9013:. 9009:. 8968:. 8960:. 8936:. 8932:. 8895:. 8885:. 8877:. 8824:. 8812:11 8810:. 8806:. 8755:. 8745:. 8733:. 8675:. 8653:. 8643:. 8639:. 8571:. 8533:. 8525:. 8513:35 8511:. 8507:. 8484:. 8478:. 8424:. 8410:. 8406:. 8330:// 8298:. 8292:. 8263:. 8259:. 8235:. 8211:. 8187:. 8163:. 8139:. 8115:. 8090:. 8066:. 8042:. 8017:. 8006:^ 7936:. 7907:. 7899:. 7889:31 7887:. 7883:. 7841:. 7790:. 7763:. 7759:. 7740:. 7728:^ 7676:. 7672:. 7646:. 7636:14 7634:. 7630:. 7601:. 7562:. 7558:. 7532:. 7528:. 7524:. 7513:^ 7480:. 
7476:. 7465:^ 7430:. 7426:. 7399:. 7343:16 7337:. 7308:. 7262:. 7189:^ 7172:. 7156:^ 7139:. 7135:. 7110:. 7106:. 7082:. 7060:^ 7043:. 7039:. 7010:. 7006:. 6981:. 6913:^ 6869:. 6855:. 6851:. 6824:^ 6809:^ 6788:. 6767:. 6725:. 6700:. 6674:. 6636:. 6594:. 6452:=~ 6443:if 6365:=~ 6356:if 6334:\z 6277:=~ 6268:if 6246:\A 6192:=~ 6183:if 6161:$ 6116:=~ 6107:if 6031:=~ 6022:if 5991:\D 5949:=~ 5940:if 5899:\d 5845:=~ 5836:if 5810:\S 5756:=~ 5747:if 5717:\s 5666:=~ 5657:if 5617:\W 5563:=~ 5554:if 5510:\w 5468:=~ 5459:if 5440:. 5431:\b 5386:=~ 5377:if 5310:=~ 5301:if 5229:=~ 5220:if 5202:. 5124:=~ 5115:if 5042:=~ 5033:if 5007:, 5003:, 4944:=~ 4935:if 4871:=~ 4862:if 4798:=~ 4789:if 4771:. 4769:\2 4767:, 4765:\1 4759:, 4704:=~ 4695:if 4642:\d 4638:() 4600:. 4525:, 4456:C# 4415:, 4403:, 4399:, 4383:, 4359:, 4025:aa 4000:mn 3911:$ 3878:(? 3866:(? 3849:(? 3837:(? 3794:, 3756:. 3708:, 3702:wi 3643:. 3597:, 3566:Qt 3564:, 3560:, 3556:, 3552:, 3548:, 3526:. 3426:ab 3254:_s 2875:(? 2799:(? 2451:. 2449:-E 2281:at 2271:at 2265:at 2246:. 2234:\{ 2125:$ 2103:. 2099:, 2084:, 2081:, 2068:. 1960:, 1901:, 1801:ed 1797:re 1781:re 1771:. 1716:; 1625:, 1568:. 1550:. 1394:. 1242:= 1240:a? 1234:= 1232:a+ 1134:) 1113:) 1095:) 1073:) 1034:. 1014:. 833:). 796:A 769:gr 733:A 664:*) 631:A 627:") 604:. 559:+( 550:?( 541:+| 513:. 492:b. 455:, 428:. 317:vi 307:, 303:, 278:/p 276:re 274:g/ 266:ed 63:A 9526:e 9519:t 9512:v 9354:— 9312:— 9279:— 9244:— 9241:— 9235:— 9232:— 9226:— 9223:— 9220:— 9217:— 9214:— 9208:— 9152:e 9145:t 9138:v 9058:. 9035:. 9021:: 8997:. 8978:. 8946:. 8923:. 8903:. 8881:: 8873:: 8860:. 8832:. 8818:: 8788:. 8769:. 8741:: 8719:. 8700:. 8683:: 8664:. 8628:. 8606:. 8586:. 8563:. 8541:. 8529:: 8495:. 8435:. 8310:. 8274:. 8245:. 8221:. 8197:. 8173:. 8149:. 8125:. 8101:. 8076:. 8052:. 8028:. 8000:. 7975:. 7969:: 7954:. 7922:. 7895:: 7865:. 7859:: 7794:. 7775:. 7744:. 7721:. 7687:. 7657:. 7642:: 7615:. 7595:: 7574:. 7544:. 7496:. 7459:. 7411:. 7356:. 7272:. 7249:. 7183:. 7150:. 7121:. 7092:. 
7054:. 7021:. 6992:. 6967:. 6907:. 6880:. 6834:. 6819:. 6750:. 6737:. 6711:. 6686:. 6647:. 6609:. 6482:} 6479:; 6470:; 6461:{ 6458:) 6446:( 6440:; 6434:= 6395:} 6392:; 6383:; 6374:{ 6371:) 6359:( 6353:; 6347:= 6307:} 6304:; 6295:; 6286:{ 6283:) 6271:( 6265:; 6259:= 6222:} 6219:; 6210:; 6201:{ 6198:) 6186:( 6180:; 6174:= 6137:} 6134:; 6125:{ 6122:) 6110:( 6104:; 6098:= 6085:^ 6061:} 6058:; 6049:; 6040:{ 6037:) 6025:( 6019:; 6013:= 5970:} 5967:; 5958:{ 5955:) 5943:( 5937:; 5931:= 5875:} 5872:; 5863:; 5854:{ 5851:) 5839:( 5833:; 5827:= 5786:} 5783:; 5774:; 5765:{ 5762:) 5750:( 5744:; 5738:= 5696:} 5693:; 5684:; 5675:{ 5672:) 5660:( 5654:; 5648:= 5593:} 5590:; 5581:; 5572:{ 5569:) 5557:( 5551:; 5545:= 5489:} 5486:; 5477:{ 5474:) 5462:( 5456:; 5450:= 5407:} 5404:; 5395:{ 5392:) 5380:( 5374:; 5368:= 5355:| 5331:} 5328:; 5319:{ 5316:) 5304:( 5298:; 5292:= 5259:} 5256:; 5247:; 5238:{ 5235:) 5223:( 5217:; 5211:= 5154:} 5151:; 5142:; 5133:{ 5130:) 5118:( 5112:; 5106:= 5093:* 5072:} 5069:; 5060:; 5051:{ 5048:) 5036:( 5030:; 5024:= 5009:? 5005:+ 5001:* 4995:? 4974:} 4971:; 4962:; 4953:{ 4950:) 4938:( 4932:; 4926:= 4913:? 4892:} 4889:; 4880:{ 4877:) 4865:( 4859:; 4853:= 4840:+ 4819:} 4816:; 4807:{ 4804:) 4792:( 4786:; 4780:= 4725:} 4722:; 4713:{ 4710:) 4698:( 4692:; 4686:= 4671:. 4446:C 4423:. 4377:X 4373:X 4345:X 4244:y 4240:x 4232:y 4228:x 4153:) 4148:1 4145:+ 4142:k 4139:2 4135:n 4131:( 4126:O 4100:) 4095:2 4092:+ 4089:k 4086:2 4082:n 4078:( 4073:O 4034:b 4031:* 4028:) 4022:| 4019:a 4016:( 3998:( 3996:O 3988:n 3986:( 3984:O 3980:n 3975:O 3970:m 3907:^ 3887:) 3881:! 3873:) 3856:) 3851:= 3844:) 3679:" 3603:? 
3599:+ 3595:* 3504:* 3501:] 3498:_ 3495:] 3492:_ 3486:* 3483:w 3480:\ 3477:h 3474:\ 3468:h 3465:\ 3459:w 3456:\ 3429:] 3411:} 3405:{ 3402:p 3399:\ 3392:x 3389:\ 3367:} 3361:{ 3358:p 3355:\ 3348:u 3345:\ 3323:S 3320:\ 3313:S 3310:\ 3303:S 3300:\ 3282:s 3279:\ 3273:} 3267:{ 3264:p 3261:\ 3251:\ 3244:s 3241:\ 3219:} 3213:{ 3210:p 3207:\ 3183:} 3177:{ 3174:p 3171:\ 3164:p 3161:\ 3139:} 3133:{ 3130:p 3127:\ 3120:l 3117:\ 3095:} 3089:{ 3086:p 3083:\ 3059:D 3056:\ 3049:D 3046:\ 3039:D 3036:\ 3018:d 3015:\ 3009:} 3003:{ 3000:p 2997:\ 2990:d 2987:\ 2980:d 2977:\ 2957:} 2951:{ 2948:p 2945:\ 2923:) 2920:w 2917:\ 2911:w 2908:\ 2905:= 2896:W 2893:\ 2887:W 2884:\ 2881:= 2868:B 2865:\ 2847:) 2844:W 2841:\ 2835:w 2832:\ 2829:= 2820:w 2817:\ 2811:W 2808:\ 2805:= 2792:b 2789:\ 2777:b 2774:\ 2756:} 2750:{ 2747:p 2744:\ 2737:s 2734:\ 2712:} 2706:{ 2703:p 2700:\ 2693:a 2690:\ 2668:W 2665:\ 2658:W 2655:\ 2648:W 2645:\ 2627:w 2624:\ 2617:w 2614:\ 2607:w 2604:\ 2586:} 2580:{ 2577:p 2574:\ 2550:} 2544:{ 2541:p 2538:\ 2489:Z 2481:a 2468:d 2465:\ 2398:| 2384:+ 2370:? 2352:n 2350:\ 2303:\ 2244:} 2242:\ 2240:n 2238:, 2236:m 2225:n 2221:m 2214:} 2212:n 2210:, 2208:m 2206:{ 2193:* 2183:* 2174:n 2170:n 2163:n 2161:\ 2154:. 2143:n 2141:\ 2093:^ 2089:] 2076:^ 2072:- 2032:. 2022:^ 1907:| 1903:+ 1899:? 1853:X 1849:/ 1805:/ 1769:N 1761:\ 1741:\ 1658:F 1656:= 1654:E 1650:b 1648:, 1646:a 1642:b 1639:a 1635:b 1633:+ 1631:a 1627:Y 1623:X 1619:Y 1616:X 1612:Y 1610:+ 1608:X 1535:k 1533:L 1515:. 1505:1 1499:k 1489:) 1486:b 1480:a 1477:( 1471:) 1468:b 1462:a 1459:( 1456:) 1453:b 1447:a 1444:( 1437:a 1428:) 1424:b 1418:a 1415:( 1401:k 1399:L 1382:) 1379:b 1373:a 1370:( 1367:) 1364:b 1358:a 1355:( 1352:) 1349:b 1343:a 1340:( 1337:a 1328:) 1324:b 1318:a 1315:( 1305:4 1302:L 1298:a 1294:k 1290:b 1288:, 1286:a 1281:k 1279:L 1260:R 1256:R 1228:+ 1224:? 1144:R 1128:( 1107:( 1089:( 1081:. 1079:a 1075:a 1067:( 1058:( 1051:( 982:. 911:n 884:+ 864:* 844:? 829:( 827:+ 816:* 809:? 
787:y 784:) 781:e 778:| 775:a 772:( 743:| 687:s 685:( 683:N 675:s 671:* 669:s 662:s 660:( 658:N 625:s 621:s 619:( 598:d 595:\ 592:? 586:d 583:\ 580:. 577:\ 571:d 568:\ 565:. 562:\ 556:d 553:\ 538:^ 503:b 496:. 469:. 350:/ 56:r 51:g 32:. 25:.


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.