Formal language

From Wikipedia, the free encyclopedia

In mathematics, logic, and computer science, a formal language is a language that is defined in a precise mathematical way. A language is defined using a set called the alphabet of the language. The members of the alphabet are usually called symbols of the language. The language is a set of sequences of symbols of the alphabet. The sequences usually have finite length. The sequences that are members of the language are called the words of the language or strings.

More precisely, a formal language L is usually described as an ordered pair of sets, L = (A, F). A is the alphabet, and each element of F is a sequence of elements of A.
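The pair (A, F) can be written down directly as a small data structure. The following is only a minimal sketch for a finite language: the name FormalLanguage and the method is_well_formed are made up for this illustration, and it assumes that every symbol is a single character so that a word can be stored as an ordinary Python string.

```python
from typing import FrozenSet, NamedTuple


class FormalLanguage(NamedTuple):
    """A finite formal language as the pair (A, F). Hypothetical helper type."""

    alphabet: FrozenSet[str]  # A: the symbols (assumed to be single characters)
    words: FrozenSet[str]     # F: the sequences of symbols

    def is_well_formed(self) -> bool:
        # Every word must be built only from symbols of the alphabet.
        return all(set(word) <= self.alphabet for word in self.words)


L = FormalLanguage(frozenset("ab"), frozenset({"", "a", "ab", "bba"}))
print(L.is_well_formed())
```

A language with infinitely many words cannot be stored as a set like this. It would need a rule that decides membership instead.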

Formal languages have many uses. A formal language is often thought of as either:

  • a collection of words, or
  • a collection of sentences.

In the first case, the set A is called the alphabet of L, and the elements of F are called words. In the second, the set A is called the lexicon or the vocabulary of L, while the elements of F are then called sentences. The mathematical theory that treats formal languages in general is known as formal language theory.

Although it is common to hear the term formal language meaning natural language that is more stilted, disciplined or precise than everyday speech, this article refers to the meaning in formal language theory, in maths, logic or computer science.

As an example of a formal language, an alphabet might be {a, b}. One string over that alphabet is ababba.

A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols a and b.
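Membership in this example language can be checked by counting symbols. The function below is an illustrative sketch; the name has_equal_a_and_b is an assumption made here, and words are represented as Python strings.

```python
def has_equal_a_and_b(word: str) -> bool:
    """Return True if word is over {a, b} and has as many a's as b's."""
    if set(word) - {"a", "b"}:
        return False  # the word uses a symbol outside the alphabet
    return word.count("a") == word.count("b")


print(has_equal_a_and_b("ababba"))  # True: three a's and three b's
print(has_equal_a_and_b("aab"))     # False: two a's but only one b
```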

The empty word (that is, the string of length zero) is allowed and is often denoted by e, ε or Λ. While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings. This is because the length of words belonging to it may be unbounded.
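To see why a finite alphabet still gives infinitely many strings, the strings can be listed in order of length, starting with the empty word. This is a sketch only; the generator name all_words is chosen for this example, and symbols are again assumed to be single characters.

```python
from itertools import count, islice, product
from typing import Iterator


def all_words(alphabet: str) -> Iterator[str]:
    """Yield every string over the alphabet, shortest first, without end."""
    for length in count(0):  # lengths 0, 1, 2, ... with no upper bound
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)


# The generator never stops by itself, so take only the first few words.
first_seven = list(islice(all_words("ab"), 7))
print(first_seven)  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The first word printed is the empty word. Every string has finite length, but there is always a longer one still to come.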

A question often asked about formal languages is "how difficult is it to decide whether a given word belongs to a particular language?" This is the kind of question that computability theory and complexity theory ask.

Examples

Some examples of formal languages:

  • the set of all words over {a, b}
  • the set {a^n}, where n is a natural number and a^n means a repeated n times
  • finite languages, such as {a, aa, bba} over the alphabet {a, b}
  • the set of syntactically correct programs in a given programming language
  • the set of inputs upon which a certain Turing machine halts
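The first three examples can each be written as a short membership test. This is a hedged sketch: all function names are invented for the illustration, the finite language is taken to be {a, aa, bba}, and 0 is counted as a natural number, so the empty word is treated as a member of {a^n}. The last two examples are left out because they do not reduce to a few lines of code.

```python
def in_all_words(word: str) -> bool:
    """Every string built only from the symbols a and b."""
    return set(word) <= {"a", "b"}


def in_a_power_n(word: str) -> bool:
    """Strings of the form a^n. Assumption: n may be 0, so '' is included."""
    return set(word) <= {"a"}


FINITE_LANGUAGE = frozenset({"a", "aa", "bba"})  # assumed example language


def in_finite_language(word: str) -> bool:
    """A finite language can be checked by simple lookup."""
    return word in FINITE_LANGUAGE


for w in ["", "aaa", "bba", "abc"]:
    print(repr(w), in_all_words(w), in_a_power_n(w), in_finite_language(w))
```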

Specification

A formal language can be specified in a great variety of ways.
