Regular Language
Definitions of Regular Language and Regular Expression
Subjects to be Learned
- regular language
- regular expression
Contents
Here we are going to learn one type of language called regular language which is
the simplest of the four Homsky formal languages,
and regular expression which is one of the ways to describe regular languages.
1. Regular language
The set of regular languages over an alphabet
is defined recursively
as below. Any language
belonging to this set is a regular language over
.
Definition of Set of Regular Languages :
Basis Clause:
, {
}
and {a} for any symbol a
are
regular languages.
Inductive Clause: If Lr and Ls
are regular languages, then
Lr
Ls ,
LrLs
and Lr*
are regular languages.
Extremal Clause: Nothing is a regular language unless it is obtained from the above two clauses.
For example, let
= {a, b}.
Then since {a} and {b} are regular languages, {a, b}
( = {a}
{b} )
and {ab} ( = {a}{b} ) are regular languages. Also since {a} is regular, {a}*
is a regular language which is the set of strings consisting of a's such as
, a, aa, aaa, aaaa etc.
Note also that
*, which is the set of strings
consisting of a's and b's, is a regular language
because {a, b} is regular.
2. Regular expression
Regular expressions are used to denote regular languages.
They can represent regular languages and operations on them succinctly.
The set of regular expressions over an alphabet
is defined recursively
as below.
Any element of that set is a regular expression.
Basis Clause:
,
and
a are regular expressions corresponding to languages
,
{
} and {a}, respectively, where a is an element of
.
Inductive Clause: If r and s are regular expressions
corresponding to languages Lr and Ls ,
then ( r + s ) , ( rs ) and ( r*) are regular expressions
corresponding to languages Lr
Ls ,
LrLs
and Lr* ,
respectively.
Extremal Clause: Nothing is a regular expression unless it is obtained from the above two clauses.
Conventions on regular expressions
(1) When there is no danger of confusion, bold face may not be used for regular expressions.
So for example, ( r + s ) is used in stead of ( r + s ).
(2) The operation * has precedence over concatenation, which has precedence over union ( + ).
Thus the regular expression ( a + ( b( c*) ) ) is written as
a + bc*,
(3) The concatenation of k r's , where r is a regular expression,
is written as rk.
Thus for example rr = r2 .
The language corresponding to rk is
Lrk, where Lr
is the language corresponding to the regular expression r.
For a recursive definition of Lrk
click here.
(4) We use ( r+) as a regular expression
to represent Lr+ .
Examples of regular expression and regular languages corresponding to them
-
( a + b )2 corresponds to the language {aa, ab, ba, bb}, that is the set of strings of length 2
over the alphabet {a, b}.
In general ( a + b )k corresponds to
the set of strings of length k
over the alphabet {a, b}.
( a + b )* corresponds to
the set of all strings
over the alphabet {a, b}.
-
a*b* corresponds to the set of strings consisting of zero or more a's
followed by zero or more b's.
-
a*b+a* corresponds to the set of strings consisting of zero or more a's
followed by one or more b's followed by zero or more a's.
-
( ab )+ corresponds to the language {ab, abab, ababab, ... }, that is,
the set of strings of
repeated ab's.
Note:A regular expression is not unique for a language. That is, a regular language, in general,
corresponds to more than one regular expressions. For example ( a + b )*
and ( a*b* )* correspond to the set of all strings
over the alphabet {a, b}.
Definition of Equality of Regular Expressions
Regular expressions are equal if and only if they correspond
to the same language.
Thus for example ( a + b )*
= ( a*b* )* , because they both represent the language
of all strings
over the alphabet {a, b}.
In general, it is not easy to see by inspection whether or not two regular expressions are equal.
Test Your Understanding of Regular Language and Regular Expression
Indicate which of the following statements are correct and which are not.
Click Yes or No , then Submit.
There are two sets of questions.
In the questions below the following notations are used:
\Lambda for
L^* for L*
a^* for a* , etc.
Next -- Examples of Regular Expression
Back to Study Schedule
Back to Table of Contents