About regular expressions

Regular expressions are patterns that describe character combinations in text. Use them in your searches to help describe concepts such as "sentences that begin with 'The'" and "attribute values that contain a number." The following table lists the special characters in regular expressions, their meanings, and usage examples.

To search for text containing one of the special characters in the table, "escape" the special character by preceding it with a backslash. For example, to search for the actual asterisk in the phrase "some conditions apply*", your search pattern might look like this: apply\*. If you don't escape the asterisk, the pattern matches "appl" followed by any number of y's, so you'll find all the occurrences of "apply" (as well as any of "appl", "applyy", and "applyyy"), not just the ones followed by an asterisk.
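The effect of escaping can be checked outside the Find and Replace dialog box. The following is a minimal TypeScript sketch, assuming a JavaScript-style regular expression engine that treats these patterns the same way; the sample text and variable names are made up for illustration.

```typescript
// Illustrative sample text (not from the documentation).
const sampleText = "some conditions apply* but rules apply, and appl is a typo";

// Escaped: \* stands for a literal asterisk, so only "apply*" is found.
const escapedMatches = sampleText.match(/apply\*/g);

// Unescaped: y* means "zero or more y", so "appl" plus any run of y's is found.
const unescapedMatches = sampleText.match(/apply*/g);

console.log(escapedMatches);   // ["apply*"]
console.log(unescapedMatches); // ["apply", "apply", "appl"]
```

The unescaped pattern never includes the asterisk itself in a match, because the asterisk is read as a quantifier rather than as a character to find.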
Character Matches Example

^

Beginning of input or line.

^T matches "T" in "This good earth" but not in "Uncle Tom's Cabin"

$

End of input or line.

h$ matches "h" in "teach" but not in "teacher"

*

The preceding character 0 or more times.

um* matches "um" in "rum", "umm" in "yummy", and "u" in "huge"

+

The preceding character 1 or more times.

um+ matches "um" in "rum" and "umm" in "yummy" but nothing in "huge"

?

The preceding character at most once (that is, indicates that the preceding character is optional).

st?on matches "son" in "Johnson" and "ston" in "Johnston" but nothing in "Appleton" or "tension"

.

Any single character except newline.

.an matches "ran" and "can" in the phrase "bran muffins can be tasty"

x|y

Either x or y.

FF0000|0000FF matches "FF0000" in bgcolor="#FF0000" and "0000FF" in font color="#0000FF"

{n}

Exactly n occurrences of the preceding character.

o{2} matches "oo" in "loom" and the first two o's in "mooooo" but nothing in "money"

{n,m}

At least n and at most m occurrences of the preceding character.

F{2,4} matches "FF" in "#FF0000" and the first four F's in "#FFFFFF"

[abc]

Any one of the characters enclosed in the brackets. Specify a range of characters with a hyphen (for example, [a-f] is equivalent to [abcdef]).

[e-g] matches "e" in "bed", "f" in "folly", and "g" in "guard"

[^abc]

Any character not enclosed in the brackets. Specify a range of characters with a hyphen (for example, [^a-f] is equivalent to [^abcdef]).

[^aeiou] initially matches "r" in "orange", "b" in "book", and "k" in "eek!"

\b

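A word boundary (the position between a word character and a nonword character such as a space or carriage return, or the beginning or end of the input).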

\bb matches "b" in "book" but nothing in "goober" or "snob"

\B

A nonword boundary.

\Bb matches "b" in "goober" but nothing in "book"

\d

Any digit character. Equivalent to [0-9].

\d matches "3" in "C3PO" and "2" in "apartment 2G"

\D

Any nondigit character. Equivalent to [^0-9].

\D matches "S" in "900S" and "Q" in "Q45"

\f

Form feed.

\n

Line feed.

\r

Carriage return.

\s

Any single white-space character, including space, tab, form feed, or line feed.

\sbook matches " book" (including the leading space) in "blue book" but nothing in "notebook"

\S

Any single non-white-space character.

\Sbook matches "ebook" (including the preceding "e") in "notebook" but nothing in "blue book"

\t

A tab.

\w

Any alphanumeric character, including underscore. Equivalent to [A-Za-z0-9_].

b\w* matches "barking" in "the barking dog" and both "big" and "black" in "the big black dog"

\W

Any non-alphanumeric character. Equivalent to [^A-Za-z0-9_].

\W matches "&" in "Jake & Mattie" and "%" in "100%"

Control+Enter or Shift+Enter (Windows), or Control+Return, Shift+Return, or Command+Return (Macintosh)

Return character. Be sure to deselect the Ignore Whitespace Differences option when searching for this, if not using regular expressions. Note that this matches a particular character, not the general notion of a line break; for instance, it doesn't match a <br> tag or a <p> tag. Return characters appear as spaces in the Document window, not as line breaks.
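Several rows of the table can be verified with a short script. This is a sketch in TypeScript, assuming a JavaScript-style regular expression engine that handles these patterns the same way; firstMatch is a hypothetical helper written for this illustration, not part of any product.

```typescript
// Hypothetical helper: returns the first substring the pattern matches,
// or null if the pattern matches nothing in the input.
function firstMatch(pattern: RegExp, input: string): string | null {
  const result = pattern.exec(input);
  return result === null ? null : result[0];
}

console.log(firstMatch(/^T/, "This good earth"));   // "T"
console.log(firstMatch(/^T/, "Uncle Tom's Cabin")); // null ("T" is not at the beginning)
console.log(firstMatch(/um*/, "yummy"));            // "umm"
console.log(firstMatch(/um*/, "huge"));             // "u" (zero m's is allowed)
console.log(firstMatch(/um+/, "huge"));             // null (at least one m is required)
console.log(firstMatch(/st?on/, "Johnston"));       // "ston"
console.log(firstMatch(/F{2,4}/, "#FFFFFF"));       // "FFFF" (at most four)
console.log(firstMatch(/[^aeiou]/, "orange"));      // "r"
console.log(firstMatch(/\bb/, "snob"));             // null (the b does not start a word)
console.log(firstMatch(/\Bb/, "goober"));           // "b"
console.log(firstMatch(/\sbook/, "blue book"));     // " book" (the space is part of the match)
```

Note that a match includes every character the pattern consumes, which is why the \sbook result begins with a space.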


Use parentheses to set off groupings within the regular expression to be referred to later. In the Replace field, use $1, $2, $3, and so on to refer to the first, second, third, and later parenthetical groupings; within the Find field itself, refer back to a grouping with a backslash instead (\1, \2, and so on). For example, searching for (\d+)\/(\d+)\/(\d+) and replacing it with $2/$1/$3 swaps the day and month in a date separated by slashes (to convert between American-style dates and European-style dates).
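The date example can be tried in a script as well. The following TypeScript sketch uses the same search pattern and replacement string, assuming a JavaScript-style engine in which $1, $2, and $3 in the replacement refer to the parenthetical groupings; swapDayAndMonth is a hypothetical name chosen for this illustration.

```typescript
// Hypothetical helper: swaps the first two numbers of a slash-separated date.
// The pattern and the replacement string are the ones given in the text.
function swapDayAndMonth(date: string): string {
  return date.replace(/(\d+)\/(\d+)\/(\d+)/, "$2/$1/$3");
}

console.log(swapDayAndMonth("12/31/2024"));    // "31/12/2024"
console.log(swapDayAndMonth("Due 7/4/1999.")); // "Due 4/7/1999."
```

Text outside the matched date is left untouched, and only the first date in the string is changed because the pattern is applied once.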