Regular expressions
Regular expressions are enclosed in double quotes. Within them, the following are supported:
- a for character a, for any a (special characters need escaping).
- \n for the new line character (Unicode U+0A).
- \r for the carriage return character (Unicode U+0D).
- \t for the tab character (Unicode U+09).
- \a for character a, for any a (especially useful for escaping special characters).
- \\ for character \ (escaped).
- \" for character " (escaped).
- (x) for regular expression x (allows for grouping).
- xy for regular expression x followed by regular expression y.
- x* for zero or more times regular expression x.
- x+ for one or more times regular expression x.
- x? for zero or one times regular expression x.
- . for any ASCII character except \n (new line, Unicode U+0A).
- x|y for either regular expression x or regular expression y (but not both).
- [abc] for exactly one of the characters a, b or c.
- [a-z] for exactly one of the characters a, b, …, or z. This notation is called a character class. Note that the ranges of characters are based on their ASCII character codes.
- [^a] for any ASCII character except for character a. This notation is called a negated character class.
- {s} for the regular expression defined by shortcut s.
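The operators listed above behave much like their counterparts in most regular expression engines. As a rough illustration only, the sketch below uses Python's re module, which is an assumption made for demonstration and not the implementation behind this notation, to check which inputs a few patterns accept as a whole. The helper name accepts is hypothetical. Python's . is not limited to ASCII and Python has no {s} shortcuts, so both are left out.

```python
import re


def accepts(pattern: str, text: str) -> bool:
    """Return True if the entire text is matched by the pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, input text, whether the whole input is accepted)
EXAMPLES = [
    (r"ab", "ab", True),       # xy: x followed by y
    (r"a*", "", True),         # x*: zero or more times
    (r"a+", "", False),        # x+: needs at least one occurrence
    (r"a?", "aa", False),      # x?: at most one occurrence
    (r"(ab)+", "abab", True),  # (x): grouping, then repetition
    (r"a|b", "b", True),       # x|y: either alternative
    (r"[abc]", "d", False),    # exactly one of a, b or c
    (r"[a-z]", "q", True),     # range based on character codes
    (r"[^a]", "b", True),      # negated character class
    (r"\n", "\n", True),       # escape sequence for the new line character
]

for pattern, text, expected in EXAMPLES:
    assert accepts(pattern, text) == expected
```

Each entry pairs one construct from the list with an input that either is or is not accepted in full, which makes the difference between *, + and ? easy to compare side by side.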
Special characters must always be escaped, wherever they occur in the regular expression. For instance, the regular expression [a\^] recognizes either character a or character ^ (but not both). Here the ^ character is escaped because it is a special character: at the beginning of a character class, it inverts the character class.
New lines are not allowed in the regular expressions themselves. To recognize a new line in the input, use the \n escape sequence instead.
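The escaping rules can be applied mechanically when a regular expression must match some fixed text. The following is a minimal sketch in Python, under stated assumptions: the helper name quote_literal is hypothetical, the input is ASCII only, and any non-alphanumeric character may safely be written as \a, as the list of supported notations suggests. It is not part of any tool described here.

```python
def quote_literal(text: str) -> str:
    """Build a double-quoted regular expression that matches `text` literally."""
    # New line, carriage return and tab have dedicated escape sequences,
    # so a raw new line never ends up inside the regular expression.
    named = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}
    parts = []
    for ch in text:
        if not ch.isascii():
            # Assumption: only ASCII input is handled by this sketch.
            raise ValueError(f"non-ASCII character not supported: {ch!r}")
        if ch in named:
            parts.append(named[ch])
        elif ch.isalnum():
            # Letters and digits stand for themselves.
            parts.append(ch)
        else:
            # Everything else is escaped, whether or not it is special.
            parts.append("\\" + ch)
    return '"' + "".join(parts) + '"'


print(quote_literal("a^b"))    # prints "a\^b" (including the double quotes)
print(quote_literal("x\ty"))   # prints "x\ty" with a two-character \t escape
```

Escaping every non-alphanumeric character is deliberately conservative: it avoids having to know the exact set of special characters, at the cost of a few unnecessary backslashes.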