{"id":269,"date":"2007-10-08T00:00:00","date_gmt":"2007-10-08T00:00:00","guid":{"rendered":"http:\/\/www.strongd.net\/?p=269"},"modified":"2007-10-08T00:00:00","modified_gmt":"2007-10-08T00:00:00","slug":"Perl Regular Expression Tutorial","status":"publish","type":"post","link":"https:\/\/www.strongd.net\/?p=269","title":{"rendered":"Perl Regular Expression Tutorial"},"content":{"rendered":"<p><DIV><FONT face=\u7d30\u660e\u9ad4 size=5><STRONG>Perl Regular Expression Tutorial<\/STRONG><\/FONT><\/DIV><br \/>\n<DIV>&nbsp;<\/DIV><br \/>\n<DIV><br \/>\n<HR><br \/>\n<\/DIV><br \/>\n<DIV> <\/DIV><br \/>\n<H2>Contents<\/H2><br \/>\n<OL><br \/>\n<LI><A href=\"#2\">Overview<\/A><br \/>\n<LI><A href=\"#2.2\">Simple Regular Expressions<\/A><br \/>\n<LI><A href=\"#2.3\">Metacharacters<\/A><br \/>\n<LI><A href=\"#2.4\">Forbidden Characters<\/A><br \/>\n<LI><A href=\"#2.5\">Things To Remember<\/A> <\/LI><\/OL><br \/>\n<P><br \/>\n<DIV><br \/>\n<HR><br \/>\n<\/DIV><br \/>\n<P><A name=2><br \/>\n<H2>Overview<\/H2><br \/>\n<DIV>A regular expression is a string of characters which tells the searcher which string (or strings) you are looking for. The following explains the format of regular expressions in detail. If you are familiar with Perl, you already know the syntax. If you are familiar with Unix, you should know that there are subtle differences between Perl&#8217;s regular expressions and Unix&#8217; regular expressions. <A name=2.2><\/DIV><br \/>\n<H3>Simple Regular Expressions<\/H3><br \/>\n<DIV>In its simplest form, a regular expression is just a word or phrase to search for. For example, <\/DIV><PRE>  gauss<\/PRE><br \/>\n<DIV>would match any subject with the string &#8220;gauss&#8221; in it, or which mentioned the word &#8220;gauss&#8221; in the subject line. 
Thus, subjects with &#8220;gauss&#8221;, &#8220;gaussian&#8221; or &#8220;degauss&#8221; would all be matched, as would a subject containing the phrases &#8220;de-gauss the monitor&#8221; or &#8220;gaussian elimination.&#8221; Here are some more examples: <\/DIV><PRE>  carbon<\/PRE><br \/>\n<DL><br \/>\n<DD>Finds any subject with the string &#8220;carbon&#8221; in its name, or which mentions carbon (or carbonization or hydrocarbons or carbon-based life forms) in the subject line. <\/DD><\/DL><PRE>  hydro<\/PRE><br \/>\n<DL><br \/>\n<DD>Finds any subject with the string &#8220;hydro&#8221; in its name or contents. Subjects with &#8220;hydro&#8221;, &#8220;hydrogen&#8221; or &#8220;hydrodynamics&#8221; are found, as well as subjects containing the words &#8220;hydroplane&#8221; or &#8220;hydroelectric&#8221;. <\/DD><\/DL><PRE>  oxy<\/PRE><br \/>\n<DL><br \/>\n<DD>Finds any subject with the string &#8220;oxy&#8221; in the subject line. This could be used to find subjects on oxygen, boxy houses or oxymorons. <\/DD><\/DL><PRE>  top ten<\/PRE><br \/>\n<DL><br \/>\n<DD>Note that spaces may be part of the regular expression. The above expression could be used to find top ten lists. (Note that it would also find articles on how to stop tension.) <\/DD><\/DL><br \/>\n<P><A name=2.3><br \/>\n<H3>Metacharacters<\/H3><br \/>\n<DIV>Some characters have a special meaning to the searcher. These characters are called <EM>metacharacters<\/EM>. Although they may seem confusing at first, they add a great deal of flexibility and convenience to the searcher. <\/DIV><br \/>\n<P>The <EM>period<\/EM> (<STRONG>.<\/STRONG>) is a commonly used metacharacter. It matches exactly one character, regardless of what the character is. For example, the regular expression: <PRE>  2,.-Dimethylbutane<\/PRE><br \/>\n<DIV>will match &#8220;2,2-Dimethylbutane&#8221; and &#8220;2,3-Dimethylbutane&#8221;. 
Note that the period matches <EM>exactly one<\/EM> character; it will not match a string of characters, nor will it match the null string. Thus, &#8220;2,200-Dimethylbutane&#8221; and &#8220;2,-Dimethylbutane&#8221; will <EM>not<\/EM> be matched by the above regular expression. <\/DIV><br \/>\n<P>But what if you wanted to search for a string containing a period? For example, suppose we wished to search for references to pi. The following regular expression would <EM>not<\/EM> work: <PRE>  3.14     <STRONG>(THIS IS WRONG!)<\/STRONG><\/PRE><br \/>\n<DIV>This would indeed match &#8220;3.14&#8221;, but it would also match &#8220;3514&#8221;, &#8220;3f14&#8221;, or even &#8220;3+14&#8221;. In short, any string of the form &#8220;3x14&#8221;, where x is any character, would be matched by the regular expression above. <\/DIV><br \/>\n<P>To get around this, we introduce a second metacharacter, the <EM>backslash<\/EM> (<STRONG>\\<\/STRONG>). The backslash can be used to indicate that the character immediately to its right is to be taken literally. Thus, to search for the string &#8220;3.14&#8221;, we would use: <PRE>  3\\.14    (This <EM>will<\/EM> work.)<\/PRE><br \/>\n<DIV>This is called &#8220;quoting&#8221;. We would say that the period in the regular expression above has been quoted. In general, whenever the backslash is placed before a metacharacter, the searcher treats the metacharacter literally rather than invoking its special meaning. <\/DIV><br \/>\n<P>(Unfortunately, the backslash is used for other things besides quoting metacharacters. Many &#8220;normal&#8221; characters take on special meanings when preceded by a backslash. The rule of thumb is, quoting a metacharacter turns it into a normal character, and quoting a normal character <EM>may<\/EM> turn it into a metacharacter.)<br \/>\n<P>Let&#8217;s look at some more common metacharacters. We consider first the <EM>question mark<\/EM> (<STRONG>?<\/STRONG>). 
The question mark indicates that the character immediately preceding it occurs either zero times or one time. Thus <PRE>  m?ethane<\/PRE><br \/>\n<DIV>would match either &#8220;ethane&#8221; or &#8220;methane&#8221;. Similarly, <\/DIV><PRE>  comm?a<\/PRE><br \/>\n<DIV>would match either &#8220;coma&#8221; or &#8220;comma&#8221;. <\/DIV><br \/>\n<P>Another metacharacter is the <EM>star<\/EM> (<STRONG>*<\/STRONG>). This indicates that the character immediately to its left may be repeated any number of times, including zero. Thus <PRE>  ab*c<\/PRE><br \/>\n<DIV>would match &#8220;ac&#8221;, &#8220;abc&#8221;, &#8220;abbc&#8221;, &#8220;abbbc&#8221;, &#8220;abbbbbbbbc&#8221;, and any string that starts with an &#8220;a&#8221;, is followed by a sequence of &#8220;b&#8221;&#8217;s, and ends with a &#8220;c&#8221;. <\/DIV><br \/>\n<P>The <EM>plus<\/EM> (<STRONG>+<\/STRONG>) metacharacter indicates that the character immediately preceding it may be repeated one or more times. It is just like the star metacharacter, except it doesn&#8217;t match the null string. Thus <PRE>  ab+c<\/PRE><br \/>\n<DIV>would <EM>not<\/EM> match &#8220;ac&#8221;, but it <EM>would<\/EM> match &#8220;abc&#8221;, &#8220;abbc&#8221;, &#8220;abbbc&#8221;, &#8220;abbbbbbbbc&#8221; and so on. <\/DIV><br \/>\n<P>Metacharacters may be combined. A common combination includes the period and star metacharacters, with the star immediately following the period. This is used to match an arbitrary string of any length, including the null string. For example: <PRE>  cyclo.*ane<\/PRE><br \/>\n<DIV>would match &#8220;cyclodecane&#8221;, &#8220;cyclohexane&#8221; and even &#8220;cyclones drive me insane.&#8221; Any string that starts with &#8220;cyclo&#8221;, is followed by an arbitrary string, and ends with &#8220;ane&#8221; will be matched. Note that the null string will be matched by the period-star pair; thus, &#8220;cycloane&#8221; would be matched by the above expression. 
<\/DIV><br \/>\n<P>If you wanted to search for articles on cyclodecane and cyclohexane, but didn&#8217;t want to match articles about how cyclones drive one insane, you could string together three periods, as follows: <PRE>  cyclo...ane<\/PRE><br \/>\n<DIV>This would match &#8220;cyclodecane&#8221; and &#8220;cyclohexane&#8221;, but would not match &#8220;cyclones drive me insane.&#8221; Only strings eleven characters long which start with &#8220;cyclo&#8221; and end with &#8220;ane&#8221; will be matched. (Note that &#8220;cyclopentane&#8221; would not be matched, however, since cyclopentane has twelve characters, not eleven.) <\/DIV><br \/>\n<P>Here are some more examples. These involve the backslash. Note that the placement of backslash is important. <PRE>  a\\.*z<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches any string starting with &#8220;a&#8221;, followed by a series of periods (including the &#8220;series&#8221; of length zero), and terminated by &#8220;z&#8221;. Thus, &#8220;az&#8221;, &#8220;a.z&#8221;, &#8220;a..z&#8221;, &#8220;a...z&#8221; and so forth are all matched.<\/DD><\/DL><PRE>  a.\\*z<\/PRE><br \/>\n<DL><br \/>\n<DD>(Note that the backslash and period are reversed in this regular expression.)<br \/>\n<P>Matches any string starting with an &#8220;a&#8221;, followed by one arbitrary character, and terminated with &#8220;*z&#8221;. Thus, &#8220;ag*z&#8221;, &#8220;a5*z&#8221; and &#8220;a@*z&#8221; are all matched. Only strings of length four, where the first character is &#8220;a&#8221;, the third &#8220;*&#8221;, and the fourth &#8220;z&#8221;, are matched.<\/P><\/DD><\/DL><PRE>  a\\++z<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches any string starting with &#8220;a&#8221;, followed by a series of plus signs, and terminated by &#8220;z&#8221;. There must be at least one plus sign between the &#8220;a&#8221; and the &#8220;z&#8221;. Thus, &#8220;az&#8221; is <EM>not<\/EM> matched, but &#8220;a+z&#8221;, &#8220;a++z&#8221;, &#8220;a+++z&#8221;, etc. 
will be matched.<\/DD><\/DL><PRE>  a\\+\\+z<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches only the string &#8220;a++z&#8221;.<\/DD><\/DL><PRE>  a+\\+z<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches any string starting with a series of &#8220;a&#8221;&#8217;s, followed by a single plus sign and ending with a &#8220;z&#8221;. There must be at least one &#8220;a&#8221; at the start of the string. Thus &#8220;a+z&#8221;, &#8220;aa+z&#8221;, &#8220;aaa+z&#8221; and so on will match, but &#8220;+z&#8221; will not.<\/DD><\/DL><PRE>  a.?e<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches &#8220;ace&#8221;, &#8220;ale&#8221;, &#8220;axe&#8221; and any other three-character string beginning with &#8220;a&#8221; and ending with &#8220;e&#8221;; will also match &#8220;ae&#8221;.<\/DD><\/DL><PRE>  a\\.?e<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches &#8220;ae&#8221; and &#8220;a.e&#8221;. No other string is matched.<\/DD><\/DL><PRE>  a.\\?e<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches any four-character string starting with &#8220;a&#8221; and ending with &#8220;?e&#8221;. Thus, &#8220;ad?e&#8221;, &#8220;a1?e&#8221; and &#8220;a%?e&#8221; will all be matched.<\/DD><\/DL><PRE>  a\\.\\?e<\/PRE><br \/>\n<DL><br \/>\n<DD>Matches only &#8220;a.?e&#8221; and nothing else.<\/DD><\/DL><br \/>\n<DIV>Earlier it was mentioned that the backslash can turn ordinary characters into metacharacters, as well as the other way around. One such use of this is the <EM>digit<\/EM> metacharacter, which is invoked by following a backslash with a lower-case &#8220;d&#8221;, like this: &#8220;<STRONG>\\d<\/STRONG>&#8221;. The &#8220;d&#8221; <EM>must be lower case<\/EM>, for reasons explained later. The digit metacharacter matches exactly one digit; that is, exactly one occurrence of &#8220;0&#8221;, &#8220;1&#8221;, &#8220;2&#8221;, &#8220;3&#8221;, &#8220;4&#8221;, &#8220;5&#8221;, &#8220;6&#8221;, &#8220;7&#8221;, &#8220;8&#8221; or &#8220;9&#8221;. 
For example, the regular expression: <\/DIV><PRE>  2,\\d-Dimethylbutane<\/PRE><br \/>\n<DIV>would match &#8220;2,2-Dimethylbutane&#8221;, &#8220;2,3-Dimethylbutane&#8221; and so forth. Similarly, <\/DIV><PRE>  1\\.\\d\\d\\d\\d\\d<\/PRE><br \/>\n<DIV>would match any six-digit floating-point number from 1.00000 to 1.99999 inclusive. We could combine the digit metacharacter with other metacharacters; for instance, <\/DIV><PRE>  a\\d+z<\/PRE><br \/>\n<DIV>matches any string starting with &#8220;a&#8221;, followed by a string of digits, followed by a &#8220;z&#8221;. (Note that the plus is used, and thus &#8220;az&#8221; is not matched.) <\/DIV><br \/>\n<P>The letter &#8220;d&#8221; in the string &#8220;<STRONG>\\d<\/STRONG>&#8221; must be lower-case. This is because there is another metacharacter, the <EM>non-digit<\/EM> metacharacter, which uses the uppercase &#8220;D&#8221;. The non-digit metacharacter looks like &#8220;<STRONG>\\D<\/STRONG>&#8221; and matches any character <EM>except<\/EM> a digit. Thus, <PRE>  a\\Dz<\/PRE><br \/>\n<DIV>would match &#8220;abz&#8221;, &#8220;aTz&#8221; or &#8220;a%z&#8221;, but would <EM>not<\/EM> match &#8220;a2z&#8221;, &#8220;a5z&#8221; or &#8220;a9z&#8221;. Similarly, <\/DIV><PRE>  \\D+<\/PRE><br \/>\n<DIV>matches any non-null string which contains <EM>no<\/EM> numeric characters. <\/DIV><br \/>\n<P>Notice that in changing the &#8220;d&#8221; from lower-case to upper-case, we have reversed the meaning of the digit metacharacter. This holds true for most other metacharacters of the format backslash-letter.<br \/>\n<P>There are three other metacharacters in the backslash-letter format. The first is the <EM>word<\/EM> metacharacter, which matches exactly one letter, one number, or the underscore character (<CODE>_<\/CODE>). It is written as &#8220;<STRONG>\\w<\/STRONG>&#8221;. Its opposite, &#8220;<STRONG>\\W<\/STRONG>&#8221;, matches any one character <EM>except<\/EM> a letter, a number or the underscore. 
Thus, <PRE>  a\\wz<\/PRE><br \/>\n<DIV>would match &#8220;abz&#8221;, &#8220;aTz&#8221;, &#8220;a5z&#8221;, &#8220;a_z&#8221;, or any three-character string starting with &#8220;a&#8221;, ending with &#8220;z&#8221;, and whose second character was either a letter (upper- or lower-case), a number, or the underscore. Similarly, <\/DIV><PRE>  a\\Wz<\/PRE><br \/>\n<DIV>would <EM>not<\/EM> match &#8220;abz&#8221;, &#8220;aTz&#8221;, &#8220;a5z&#8221;, or &#8220;a_z&#8221;. It <EM>would<\/EM> match &#8220;a%z&#8221;, &#8220;a{z&#8221;, &#8220;a?z&#8221; or any three-character string starting with &#8220;a&#8221; and ending with &#8220;z&#8221; and whose second character was not a letter, number, or underscore. (This means the second character must either be a symbol or a whitespace character.) <\/DIV><br \/>\n<P>The <EM>whitespace<\/EM> metacharacter matches exactly one character of whitespace. (Whitespace is defined as spaces, tabs, newlines, or any character which would not use ink if printed on a printer.) The whitespace metacharacter looks like this: &#8220;<STRONG>\\s<\/STRONG>&#8221;. Its opposite, which matches any character that is <EM>not<\/EM> whitespace, looks like this: &#8220;<STRONG>\\S<\/STRONG>&#8221;. Thus, <PRE>  a\\sz<\/PRE><br \/>\n<DIV>would match any three-character string starting with &#8220;a&#8221; and ending with &#8220;z&#8221; and whose second character was a space, tab, or newline. Likewise, <\/DIV><PRE>  a\\Sz<\/PRE><br \/>\n<DIV>would match any three-character string starting with &#8220;a&#8221; and ending with &#8220;z&#8221; whose second character was <EM>not<\/EM> a space, tab or newline. (Thus, the second character could be a letter, number or symbol.) <\/DIV><br \/>\n<P><\/P><br \/>\n<DIV>The <EM>word boundary<\/EM> metacharacter matches the boundaries of words; that is, it matches the position between a word character and a non-word character, such as the point where a word meets whitespace, punctuation, or the very beginning or end of the text. It does not match a character itself. It looks like &#8220;<STRONG>\\b<\/STRONG>&#8221;. 
Its opposite, &#8220;<STRONG>\\B<\/STRONG>&#8221;, matches any position that is <EM>not<\/EM> a word boundary. Thus: <\/DIV><PRE>  \\bcomput<\/PRE><br \/>\n<DIV>will match &#8220;computer&#8221; or &#8220;computing&#8221;, but not &#8220;supercomputer&#8221; since there are no spaces or punctuation between &#8220;super&#8221; and &#8220;computer&#8221;. Similarly, <\/DIV><PRE>  \\Bcomput<\/PRE><br \/>\n<DIV>will <EM>not<\/EM> match &#8220;computer&#8221; or &#8220;computing&#8221;, unless it is part of a bigger word such as &#8220;supercomputer&#8221; or &#8220;recomputing&#8221;. <\/DIV><br \/>\n<P>Note that the underscore (<CODE>_<\/CODE>) is considered a &#8220;word&#8221; character. Thus, <PRE>  super\\bcomputer<\/PRE><br \/>\n<DIV>will <EM>not<\/EM> match &#8220;super_computer&#8221;. <\/DIV><br \/>\n<P>There is one other metacharacter starting with a backslash, the <EM>octal<\/EM> metacharacter. The octal metacharacter looks like this: &#8220;<STRONG>\\nnn<\/STRONG>&#8221;, where &#8220;n&#8221; is a number from zero to seven. This is used for specifying control characters that have no typed equivalent. For example, <PRE>  \\007<\/PRE><br \/>\n<DIV>would find all subjects with an embedded ASCII &#8220;bell&#8221; character. (The bell is specified by an ASCII value of 7.) You will rarely need to use the octal metacharacter. <\/DIV><br \/>\n<P>There are three other metacharacters that may be of use. The first is the <EM>braces<\/EM> metacharacter. This metacharacter follows a normal character and contains two numbers separated by a comma (<STRONG>,<\/STRONG>) and surrounded by braces (<STRONG>{}<\/STRONG>). It is like the star metacharacter, except the length of the string it matches must be within the minimum and maximum length specified by the two numbers in braces. Thus, <PRE>  ab{3,5}c<\/PRE><br \/>\n<DIV>will match &#8220;abbbc&#8221;, &#8220;abbbbc&#8221; or &#8220;abbbbbc&#8221;. No other string is matched. 
Likewise, <\/DIV><PRE>  .{3,5}pentane<\/PRE><br \/>\n<DIV>will match &#8220;cyclopentane&#8221;, &#8220;isopentane&#8221; or &#8220;neopentane&#8221;, but not &#8220;n-pentane&#8221;, since &#8220;n-&#8221; is only two characters long. <\/DIV><br \/>\n<P>The alternative metacharacter is represented by a vertical bar (<STRONG>|<\/STRONG>). It indicates an either\/or behavior by separating two or more possible choices. For example: <PRE>  isopentane|cyclopentane<\/PRE><br \/>\n<DIV>will match any subject containing the strings &#8220;isopentane&#8221; or &#8220;cyclopentane&#8221; or both. However, it will not match &#8220;pentane&#8221; or &#8220;n-pentane&#8221; or &#8220;neopentane.&#8221; The last metacharacter is the <EM>brackets<\/EM> metacharacter. The bracket metacharacter matches one occurrence of any character inside the brackets (<STRONG>[]<\/STRONG>). For example, <\/DIV><PRE>  \\s[cmt]an\\s<\/PRE><br \/>\n<DIV>will match &#8220;can&#8221;, &#8220;man&#8221; and &#8220;tan&#8221;, but not &#8220;ban&#8221;, &#8220;fan&#8221; or &#8220;pan&#8221;. Similarly, <\/DIV><PRE>  2,[23]-dimethylbutane<\/PRE><br \/>\n<DIV>will match &#8220;2,2-dimethylbutane&#8221; or &#8220;2,3-dimethylbutane&#8221;, but not &#8220;2,4-dimethylbutane&#8221;, &#8220;2,23-dimethylbutane&#8221; or &#8220;2,-dimethylbutane&#8221;. Ranges of characters can be used by using the dash (<STRONG>-<\/STRONG>) within the brackets. For example, <\/DIV><PRE>  a[a-d]z<\/PRE><br \/>\n<DIV>will match &#8220;aaz&#8221;, &#8220;abz&#8221;, &#8220;acz&#8221; or &#8220;adz&#8221;, and nothing else. Likewise, <\/DIV><PRE>  textfile0[3-5]<\/PRE><br \/>\n<DIV>will match &#8220;textfile03&#8221;, &#8220;textfile04&#8221;, or &#8220;textfile05&#8221; and nothing else. <\/DIV><br \/>\n<P>If you wish to include a dash within brackets as one of the characters to match, instead of to denote a range, put the dash immediately before the right bracket. 
Thus: <PRE>  a[1234-]z<\/PRE><br \/>\n<DIV>and <\/DIV><PRE>  a[1-4-]z<\/PRE><br \/>\n<DIV>both do the same thing. They both match &#8220;a1z&#8221;, &#8220;a2z&#8221;, &#8220;a3z&#8221;, &#8220;a4z&#8221; or &#8220;a-z&#8221;, and nothing else. <\/DIV><br \/>\n<P>The bracket metacharacter can also be inverted by placing a caret (<STRONG>^<\/STRONG>) immediately after the left bracket. Thus, <PRE>  textfile0[^02468]<\/PRE><br \/>\n<DIV>matches any ten-character string starting with &#8220;textfile0&#8221; and ending with anything except an even digit. Inversion and ranges can be combined, so that <\/DIV><PRE>  \\W[^f-h]ood\\W<\/PRE><br \/>\n<DIV>matches any four-letter word ending in &#8220;ood&#8221; <EM>except<\/EM> for &#8220;food&#8221;, &#8220;good&#8221; or &#8220;hood&#8221;. (Thus &#8220;mood&#8221; and &#8220;wood&#8221; would both be matched.) <\/DIV><br \/>\n<P>Note that within brackets, ordinary quoting rules do not apply and other metacharacters are not available. The only characters that can be quoted in brackets are &#8220;<CODE>[<\/CODE>&#8221;, &#8220;<CODE>]<\/CODE>&#8221;, and &#8220;<CODE>\\<\/CODE>&#8221;. Thus, <PRE>  [\\[\\\\\\]]abc<\/PRE><br \/>\n<DIV>matches any four-character string ending with &#8220;abc&#8221; and starting with &#8220;<CODE>[<\/CODE>&#8221;, &#8220;<CODE>]<\/CODE>&#8221;, or &#8220;<CODE>\\<\/CODE>&#8221;. <A name=2.4><\/DIV><br \/>\n<H3>Forbidden Characters<\/H3><br \/>\n<DIV>Because of the way the searcher works, the following metacharacters should <EM>not<\/EM> be used, even though they are valid Perl metacharacters. They are: <\/DIV><br \/>\n<DL><br \/>\n<DD><STRONG>^<\/STRONG> (allowed within brackets)<br \/>\n<DD><STRONG>$<\/STRONG> (allowed within brackets)<br \/>\n<DD><STRONG>\\n<\/STRONG><br \/>\n<DD><STRONG>\\r<\/STRONG><br \/>\n<DD><STRONG>\\t<\/STRONG><br \/>\n<DD><STRONG>\\f<\/STRONG><br \/>\n<DD><STRONG>\\b<\/STRONG><br \/>\n<DD><STRONG>( )<\/STRONG> (allowed within brackets. 
Note that if you wish to search for parentheses within text outside of brackets, you should quote the parentheses.)<br \/>\n<DD><STRONG>\\1<\/STRONG>, <STRONG>\\2<\/STRONG> &#8230; <STRONG>\\9<\/STRONG><br \/>\n<DD><STRONG>\\B<\/STRONG><br \/>\n<DD><STRONG>:<\/STRONG><br \/>\n<DD><STRONG>!<\/STRONG> <\/DD><\/DL><A name=2.5><br \/>\n<H3>Things To Remember<\/H3><br \/>\n<DIV>Here are some other things you should know about regular expressions. <\/DIV><br \/>\n<OL><br \/>\n<LI>The archive search software searches only subject lines, and all articles within the same thread will also be displayed.<br \/>\n<P><\/P><br \/>\n<LI>Regular expressions should be a last resort. Because they are complex, it can be more work mastering a search than just sifting through a long list of matches (unless you&#8217;re already familiar with regular expressions).<br \/>\n<P><\/P><br \/>\n<LI>We limit the number of articles which can be shown to 200 or fewer. This is to minimize load on our system.<br \/>\n<P><\/P><br \/>\n<LI>The search is case insensitive; thus <PRE>  mopac<\/PRE>and <PRE>  Mopac<\/PRE>and <PRE>  MOPAC<\/PRE>all search for the same set of strings. Each will match &#8220;mopac&#8221;, &#8220;MOPAC&#8221;, &#8220;Mopac&#8221;, &#8220;mopaC&#8221;, &#8220;MoPaC&#8221;, &#8220;mOpAc&#8221; and so forth. Thus you need not worry about capitalization. (Note, however, that metacharacters must still have the proper case. This is especially important for metacharacters whose case determines whether their meaning is reversed or not.)<br \/>\n<P><\/P><br \/>\n<LI>Outside of the brackets metacharacter, you must quote parentheses, brackets and braces to get the searcher to take them literally. 
<\/LI><\/OL><\/A><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Perl Regular Expression Tutorial &nbsp; Contents Overview Simple Regular Expressions Metacharacters Forbidden Characters Things To Remember Overview A regular expression is a string of characters which tells the searcher which string (or strings) you are looking for. The following explains the format of regular expressions in detail. If you are familiar with Perl, you already &hellip; <a href=\"https:\/\/www.strongd.net\/?p=269\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Perl Regular Expression Tutorial<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-269","post","type-post","status-publish","format-standard","hentry"],"_links":{"self":[{"href":"https:\/\/www.strongd.net\/index.php?rest_route=\/wp\/v2\/posts\/269","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.strongd.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.strongd.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.strongd.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.strongd.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=269"}],"version-history":[{"count":0,"href":"https:\/\/www.strongd.net\/index.php?rest_route=\/wp\/v2\/posts\/269\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.strongd.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=269"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.strongd.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=269"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.strongd.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=269"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org
\/{rel}","templated":true}]}}