Regular expression

From Wikipedia, the free encyclopedia

A regular expression (abbreviated regexp or regex) is a way to describe sets of strings (sequences of characters) using syntactic rules. Many programming languages use or support regular expressions. A regular expression is processed by a special program or by part of a programming language. This program either generates a parser that can match text against the expression, or it performs the matching itself.

A regular expression processor interprets a regular expression as a statement in the grammar of a given formal language, and uses it to examine a text string.

A few examples of what can be matched with regular expressions:

  • The sequence of characters "car" appearing consecutively in any context, such as in "car", "cartoon", or "bicarbonate"
  • The sequence of characters "car" occurring in that order with other characters between them, such as in "Icelander" or "chandler"
  • The word "car" when it appears as an isolated word
  • The word "car" when preceded by the word "blue" or "red"
  • The word "car" when not preceded by the word "motor"
  • A dollar sign immediately followed by one or more digits, and then optionally a period and exactly two more digits (for example, "$10" or "$245.99"). This does not match "$ 5", because of the space between the dollar sign and the digit, nor "€25", because there is no dollar sign.
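
The examples above can be sketched in code. The following is only an illustration, assuming Python's re module and its syntax; other regular expression languages may write the same patterns differently. The names PATTERNS, found and is_dollar_amount are made up for this sketch, and the "motor" pattern assumes that "motor" and "car" are separated by exactly one space.

```python
import re

# Hypothetical names; each pattern mirrors one item in the list above.
PATTERNS = {
    # "car" appearing consecutively in any context
    "consecutive": re.compile(r"car"),
    # "c", "a", "r" in that order, with any characters between them
    "in_order": re.compile(r"c.*a.*r"),
    # "car" as an isolated word (\b marks a word boundary)
    "whole_word": re.compile(r"\bcar\b"),
    # "car" preceded by "blue" or "red"
    "after_colour": re.compile(r"\b(?:blue|red) car\b"),
    # "car" not preceded by "motor" (assumes a single space between them)
    "not_after_motor": re.compile(r"(?<!motor )\bcar\b"),
    # dollar sign, one or more digits, optionally a period and two digits
    "dollar_amount": re.compile(r"\$\d+(?:\.\d{2})?"),
}


def found(name, text):
    """Return True if the named pattern occurs anywhere in text."""
    return PATTERNS[name].search(text) is not None


def is_dollar_amount(text):
    """Return True if the whole of text is a dollar amount."""
    return PATTERNS["dollar_amount"].fullmatch(text) is not None


if __name__ == "__main__":
    for word in ("car", "cartoon", "bicarbonate", "Icelander", "chandler"):
        print(word, found("consecutive", word), found("in_order", word))
    for amount in ("$10", "$245.99", "$ 5", "€25"):
        print(amount, is_dollar_amount(amount))
```

The dollar example uses fullmatch so that the whole string must fit the pattern. With search, the same pattern would also find "$10" inside a longer piece of text.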

Regular expressions can be much more complex than these examples.

Many regular expression languages also support "wildcard" characters, which stand in for other characters (for example, for any single character).
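
As a small illustration, again assuming Python's re module: there the period acts as a wildcard that matches any single character except a newline. The names WILDCARD and wildcard_hits are made up for this sketch.

```python
import re

# "c.r" means: "c", then any one character, then "r".
WILDCARD = re.compile(r"c.r")


def wildcard_hits(text):
    """Return every non-overlapping piece of text that fits the pattern."""
    return WILDCARD.findall(text)


if __name__ == "__main__":
    print(wildcard_hits("car cur c-r cr"))
```

In this sketch "car", "cur" and "c-r" fit the pattern, but "cr" does not, because the period must stand for exactly one character.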