## Definitions of Regular Language and Regular Expression

### Subjects to be Learned

• regular language
• regular expression

### Contents

Here we are going to learn one type of language called regular language which is the simplest of the four Homsky formal languages, and regular expression which is one of the ways to describe regular languages.

1. Regular language

The set of regular languages over an alphabet is defined recursively as below. Any language belonging to this set is a regular language over .

Definition of Set of Regular Languages :

Basis Clause: , { } and {a} for any symbol a  are regular languages.
Inductive Clause: If Lr and Ls are regular languages, then Lr Ls , LrLs and Lr* are regular languages.

Extremal Clause: Nothing is a regular language unless it is obtained from the above two clauses.

For example, let = {a, b}. Then since {a} and {b} are regular languages, {a, b} ( = {a} {b} ) and {ab} ( = {a}{b} ) are regular languages. Also since {a} is regular, {a}* is a regular language which is the set of strings consisting of a's such as , a, aa, aaa, aaaa etc. Note also that *, which is the set of strings consisting of a's and b's, is a regular language because {a, b} is regular.

2. Regular expression

Regular expressions are used to denote regular languages. They can represent regular languages and operations on them succinctly.
The set of regular expressions over an alphabet is defined recursively as below. Any element of that set is a regular expression.

Basis Clause: , and a are regular expressions corresponding to languages , { } and {a}, respectively, where a is an element of .
Inductive Clause: If r and s are regular expressions corresponding to languages Lr and Ls , then ( r + s ) , ( rs ) and ( r*) are regular expressions corresponding to languages Lr Ls , LrLs and Lr* , respectively.

Extremal Clause: Nothing is a regular expression unless it is obtained from the above two clauses.

Conventions on regular expressions

(1) When there is no danger of confusion, bold face may not be used for regular expressions.
So for example, ( r + s ) is used in stead of ( r + s ).
(2) The operation * has precedence over concatenation, which has precedence over union ( + ).
Thus the regular expression   ( a + ( b( c*) ) )  is written as   a + bc*,
(3) The concatenation of k r's , where r is a regular expression, is written as rk. Thus for example rr = r2 .
The language corresponding to rk is Lrk, where Lr is the language corresponding to the regular expression r.
(4) We use ( r+) as a regular expression to represent Lr+ .

Examples of regular expression and regular languages corresponding to them

• ( a + b )2 corresponds to the language {aa, ab, ba, bb}, that is the set of strings of length 2 over the alphabet {a, b}.
In general ( a + b )k corresponds to the set of strings of length k over the alphabet {a, b}. ( a + b )* corresponds to the set of all strings over the alphabet {a, b}.
• a*b* corresponds to the set of strings consisting of zero or more a's followed by zero or more b's.
• a*b+a* corresponds to the set of strings consisting of zero or more a's followed by one or more b's followed by zero or more a's.
• ( ab )+ corresponds to the language {ab, abab, ababab, ... }, that is, the set of strings of repeated ab's.

Note:A regular expression is not unique for a language. That is, a regular language, in general, corresponds to more than one regular expressions. For example ( a + b )* and ( a*b* )* correspond to the set of all strings over the alphabet {a, b}.

Definition of Equality of Regular Expressions

Regular expressions are equal if and only if they correspond to the same language.

Thus for example ( a + b )* = ( a*b* )* , because they both represent the language of all strings over the alphabet {a, b}.

In general, it is not easy to see by inspection whether or not two regular expressions are equal.

### Test Your Understanding of Regular Language and Regular Expression

Indicate which of the following statements are correct and which are not.
Click Yes or No , then Submit.
There are two sets of questions.

In the questions below the following notations are used:

\Lambda   for L^*   for   L*
a^*   for   a* , etc.

Next -- Examples of Regular Expression

Back to Study Schedule