## Exercise Questions on Regular Language and Regular Expression

Ex. 1: Find the shortest string that is not in the language represented by the regular expression a*(ab)*b*.

Solution: It can easily be seen that Λ, a and b, the strings of length 1 or less, are all in the language. Of the strings with length 2, aa, bb and ab are in the language. However, ba is not. Thus the answer is ba.
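
The search can also be done mechanically. The following is a minimal sketch, assuming Python's `re` syntax for the expression; the function name `shortest_non_member` and the length bound are illustrative choices.

```python
import re
from itertools import product

# a*(ab)*b* written in Python's regular expression syntax
PATTERN = re.compile(r"a*(ab)*b*")

def shortest_non_member(max_len=5):
    """Return the first string over {a, b}, shortest first, not in the language."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if PATTERN.fullmatch(s) is None:
                return s
    return None  # every string up to max_len is in the language

print(shortest_non_member())  # ba
```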

Ex. 2: For the two regular expressions given below,
(a) find a string corresponding to r2 but not to r1 and
(b) find a string corresponding to both r1 and r2.

r1 = a* + b*     r2 = ab* + ba* + b*a + (a*b)*

Solution: (a) Any string consisting of only a's or only b's and the empty string are in r1. So we need to find strings of r2 which contain at least one a and at least one b. For example ab and ba are such strings.
(b) A string corresponding to r1 consists of only a's or only b's, or is the empty string. The only strings corresponding to r2 that consist of only a's or only b's are Λ, a, b and the strings consisting of only b's (Λ and the strings of b's come from (a*b)*). Any of these, for example a or bb, corresponds to both r1 and r2.
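
Both parts can be checked by enumeration. This is a small sketch, assuming the two expressions are transcribed into Python's `re` syntax with `|` for +; the helper names are illustrative.

```python
import re
from itertools import product

R1 = re.compile(r"a*|b*")
R2 = re.compile(r"ab*|ba*|b*a|(a*b)*")

def strings(max_len, alphabet="ab"):
    """Yield every string over the alphabet, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def member(pattern, s):
    return pattern.fullmatch(s) is not None

# (a) strings of r2 that are not strings of r1
only_r2 = [s for s in strings(3) if member(R2, s) and not member(R1, s)]
# (b) strings of both r1 and r2
both = [s for s in strings(3) if member(R2, s) and member(R1, s)]

print(only_r2)
print(both)  # ['', 'a', 'b', 'bb', 'bbb']
```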

Ex. 3: Let r1 and r2 be arbitrary regular expressions over some alphabet. Find a simple (the shortest and with the smallest nesting of * and +) regular expression which is equal to each of the following regular expressions.

(a) (r1 + r2 + r1r2 + r2r1)*
(b) (r1(r1 + r2)*)+

Solution: One general strategy for this type of question is to check whether the given expressions are equal to simple regular expressions that are familiar to us, such as a, a*, a+, (a + b)*, (a + b)+ etc. (here a trailing + denotes one or more repetitions).
(a) Since (r1 + r2)* represents all strings consisting of strings of r1 and/or r2, the part r1r2 + r2r1 in the given regular expression is redundant, that is, it does not produce any strings that are not represented by (r1 + r2)*. Thus (r1 + r2 + r1r2 + r2r1)* is reduced to (r1 + r2)*.
(b) (r1(r1 + r2)*)+ means that all the strings represented by it must consist of one or more strings of (r1(r1 + r2)*). However, the strings of (r1(r1 + r2)*) start with a string of r1 followed by any number of strings taken arbitrarily from r1 and/or r2. Thus anything that comes after the first r1 in (r1(r1 + r2)*)+ is represented by (r1 + r2)*. Hence (r1(r1 + r2)*) also represents the strings of (r1(r1 + r2)*)+, and conversely (r1(r1 + r2)*)+ represents the strings represented by (r1(r1 + r2)*). Hence (r1(r1 + r2)*)+ is reduced to (r1(r1 + r2)*).
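
The two simplifications can be spot-checked on concrete instances. The sketch below substitutes sample expressions for r1 and r2 (the samples are assumptions chosen for illustration) and compares the languages on all short strings; agreement on a few instances is evidence, not a proof.

```python
import re
from itertools import product

def strings(max_len, alphabet="ab"):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def same_language(p, q, max_len=6):
    """True if p and q accept exactly the same strings up to max_len."""
    cp, cq = re.compile(p), re.compile(q)
    return all((cp.fullmatch(s) is None) == (cq.fullmatch(s) is None)
               for s in strings(max_len))

def check(r1, r2):
    g1, g2 = f"(?:{r1})", f"(?:{r2})"          # group the sample expressions
    a_given = f"(?:{g1}|{g2}|{g1}{g2}|{g2}{g1})*"
    a_simple = f"(?:{g1}|{g2})*"
    b_given = f"(?:{g1}(?:{g1}|{g2})*)+"
    b_simple = f"{g1}(?:{g1}|{g2})*"
    return same_language(a_given, a_simple), same_language(b_given, b_simple)

for r1, r2 in [("a", "b"), ("ab", "b"), ("aa", "ba")]:
    print(r1, r2, check(r1, r2))
```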

Ex. 4: Find a regular expression corresponding to the language L over the alphabet { a , b } defined recursively as follows:

Basis Clause: Λ ∈ L.
Inductive Clause: If x ∈ L, then aabx ∈ L and xbb ∈ L.
Extremal Clause: Nothing is in L unless it can be obtained from the above two clauses.

Solution: Let us see what kind of strings are in L. First of all Λ ∈ L. Then, starting with Λ, strings of L are generated one by one by prepending aab or appending bb to any of the already generated strings. Hence a string of L consists of zero or more aab's in front and zero or more bb's following them. Thus (aab)*(bb)* is a regular expression for L.
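
The recursive definition can be run directly and compared with the expression. This is a sketch under the assumption that it suffices to compare all strings up to a fixed length; `generate` and `matched` are illustrative names.

```python
import re
from itertools import product

def generate(max_len):
    """Every string of L of length <= max_len, built from the recursive definition."""
    language = {""}                 # basis clause: the empty string is in L
    pending = [""]
    while pending:
        x = pending.pop()
        for y in ("aab" + x, x + "bb"):   # inductive clause
            if len(y) <= max_len and y not in language:
                language.add(y)
                pending.append(y)
    return language

def matched(max_len):
    """Every string over {a, b} of length <= max_len matched by (aab)*(bb)*."""
    pattern = re.compile(r"(aab)*(bb)*")
    candidates = ("".join(t) for n in range(max_len + 1)
                  for t in product("ab", repeat=n))
    return {s for s in candidates if pattern.fullmatch(s)}

print(generate(9) == matched(9))  # True
```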

Ex. 5: Find a regular expression corresponding to the language L defined recursively as follows:

Basis Clause: Λ ∈ L and a ∈ L.
Inductive Clause: If x ∈ L, then aabx ∈ L and bbx ∈ L.
Extremal Clause: Nothing is in L unless it can be obtained from the above two clauses.

Solution: Let us see what kind of strings are in L. First of all Λ and a are in L. Then, starting with Λ or a, strings of L are generated one by one by prepending aab or bb to any of the already generated strings. Hence a string of L has zero or more aab's and bb's in front, possibly followed by a at the end. Thus (aab + bb)*(a + Λ) is a regular expression for L.
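
As a check, the definition can be unfolded up to a length bound and compared with the expression. In this sketch the optional final a is written `a?` in Python's `re` syntax; the function names are illustrative.

```python
import re
from itertools import product

def generate(max_len):
    """Strings of L of length <= max_len, built by prepending aab or bb."""
    language = {"", "a"}            # basis clause
    pending = ["", "a"]
    while pending:
        x = pending.pop()
        for y in ("aab" + x, "bb" + x):   # inductive clause
            if len(y) <= max_len and y not in language:
                language.add(y)
                pending.append(y)
    return language

def matched(max_len):
    """Strings of length <= max_len matched by (aab + bb)*(a + empty string)."""
    pattern = re.compile(r"(?:aab|bb)*a?")
    candidates = ("".join(t) for n in range(max_len + 1)
                  for t in product("ab", repeat=n))
    return {s for s in candidates if pattern.fullmatch(s)}

print(generate(9) == matched(9))  # True
```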

Ex. 6: Find a regular expression corresponding to the language of all strings over the alphabet { a, b } that contain exactly two a's.

Solution: A string in this language must have exactly two a's. Since any string of b's can be placed in front of the first a, behind the second a and between the two a's, and since an arbitrary string of b's can be represented by the regular expression b*, b*a b*a b* is a regular expression for this language.
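
A quick enumeration confirms that the expression agrees with the property "contains exactly two a's" on all short strings. The helper `mismatches` is an illustrative name, and the length bound is an arbitrary choice.

```python
import re
from itertools import product

PATTERN = re.compile(r"b*ab*ab*")

def mismatches(max_len=8):
    """Strings on which the expression and the property disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if (PATTERN.fullmatch(s) is not None) != (s.count("a") == 2):
                bad.append(s)
    return bad

print(mismatches())  # []
```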

Ex. 7: Find a regular expression corresponding to the language of all strings over the alphabet { a, b } that do not end with ab.

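Solution: Any nonempty string over { a, b } ends in a or b. If a string does not end with ab, then either it ends with a, or it ends with b and that last b is preceded by another b or by nothing at all (the string b itself). Any string can appear in front of the last a or bb, so ( a + b )*( a + bb ) covers the strings that end with a or bb. The two remaining strings, Λ and b, do not end with ab either, so Λ + b + ( a + b )*( a + bb ) is a regular expression for the language.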
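
A brute-force comparison of a candidate expression with the defining property is a useful safeguard here, because the short strings are easy to overlook. The sketch below (helper names are illustrative) shows that ( a + b )*( a + bb ) on its own misses exactly Λ and b, and that adding them as alternatives removes the mismatch.

```python
import re
from itertools import product

def mismatches(pattern, max_len=8):
    """Strings on which the expression and 'does not end with ab' disagree."""
    compiled = re.compile(pattern)
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if (compiled.fullmatch(s) is not None) != (not s.endswith("ab")):
                bad.append(s)
    return bad

print(mismatches(r"(?:a|b)*(?:a|bb)"))          # ['', 'b']
print(mismatches(r"(?:(?:a|b)*(?:a|bb)|b?)"))   # []
```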

Ex. 8: Find a regular expression corresponding to the language of all strings over the alphabet { a, b } that contain no more than one occurrence of the string aa.

Solution: If there is one substring aa in a string of the language, then that aa can be followed by any number of b's. If an a comes after that aa, then that a must be preceded by b, because otherwise there are two occurrences of aa. Hence any string that follows aa is represented by ( b + ba )*. On the other hand, if an a precedes the aa, then it must be followed by b. Hence a string preceding the aa can be represented by ( b + ab )*. Hence if a string of the language contains aa, then it corresponds to the regular expression ( b + ab )*aa( b + ba )* .
If there is no aa but at least one a exists in a string of the language, then applying the same argument as for aa to a, ( b + ab )*a( b + ba )* is obtained as a regular expression corresponding to such strings.
If there need not be any a in a string of the language, then applying the same argument as for aa to Λ, ( b + ab )*( b + ba )* is obtained as a regular expression corresponding to such strings.
Altogether ( b + ab )*( Λ + a + aa )( b + ba )* is a regular expression for the language.
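
The case analysis can be checked by counting occurrences directly. In this sketch occurrences of aa are counted with overlaps (so aaa contains two), which is the reading the argument above relies on; the middle factor is written `(?:aa|a)?` in Python's `re` syntax, and the function names are illustrative.

```python
import re
from itertools import product

PATTERN = re.compile(r"(?:b|ab)*(?:aa|a)?(?:b|ba)*")

def occurrences_of_aa(s):
    """Count occurrences of aa, overlapping ones included."""
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "aa")

def mismatches(max_len=10):
    """Strings on which the expression and the counting test disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if (PATTERN.fullmatch(s) is not None) != (occurrences_of_aa(s) <= 1):
                bad.append(s)
    return bad

print(mismatches())  # []
```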

Ex. 9: Find a regular expression corresponding to the language of strings of even lengths over the alphabet of { a, b }.

Solution: Since any string of even length can be expressed as the concatenation of strings of length 2, and since the strings of length 2 are aa, ab, ba, bb, a regular expression corresponding to the language is ( aa + ab + ba + bb )*. Note that 0 is an even number. Hence the string Λ is in this language.
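
A short enumeration, sketched here with an illustrative helper name, confirms that the expression matches exactly the strings of even length, the empty string included.

```python
import re
from itertools import product

PATTERN = re.compile(r"(?:aa|ab|ba|bb)*")

def mismatches(max_len=9):
    """Strings on which the expression and 'length is even' disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if (PATTERN.fullmatch(s) is not None) != (len(s) % 2 == 0):
                bad.append(s)
    return bad

print(mismatches())                         # []
print(PATTERN.fullmatch("") is not None)    # True
```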

Ex. 10: Describe as simply as possible in English the language corresponding to the regular expression a*b(a*ba*b)*a* .

Solution: A string in the language can start and end with a or b, it has at least one b, and after the first b all the b's in the string appear in pairs. Any number of a's can appear anywhere in the string. Thus, simply put, it is the set of strings over the alphabet { a, b } that contain an odd number of b's.
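
The English description can be tested against the expression by enumeration. This is a minimal sketch with an illustrative helper name and an arbitrary length bound.

```python
import re
from itertools import product

PATTERN = re.compile(r"a*b(?:a*ba*b)*a*")

def mismatches(max_len=9):
    """Strings on which the expression and 'odd number of b's' disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if (PATTERN.fullmatch(s) is not None) != (s.count("b") % 2 == 1):
                bad.append(s)
    return bad

print(mismatches())  # []
```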

Ex. 11: Describe as simply as possible in English the language corresponding to the regular expression (( a + b )^3)*( Λ + a + b ) .

Solution: (( a + b )^3) represents the strings of length 3. Hence (( a + b )^3)* represents the strings whose length is a multiple of 3. Since (( a + b )^3)*( a + b ) represents the strings of length 3n + 1, where n is a natural number (0 included), the given regular expression represents the strings of length 3n or 3n + 1.
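
In Python's `re` syntax the third power can be written with `{3}` and the optional final symbol with `?`. The sketch below, with an illustrative helper name, compares the expression with the length condition.

```python
import re
from itertools import product

PATTERN = re.compile(r"(?:(?:a|b){3})*(?:a|b)?")

def mismatches(max_len=10):
    """Strings on which the expression and 'length is 3n or 3n + 1' disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if (PATTERN.fullmatch(s) is not None) != (len(s) % 3 in (0, 1)):
                bad.append(s)
    return bad

print(mismatches())  # []
```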

Ex. 12: Describe as simply as possible in English the language corresponding to the regular expression ( b + ab )*( a + ab )*.

Solution: ( b + ab )* represents Λ and the strings which do not contain any substring aa and which end in b, and ( a + ab )* represents Λ and the strings which do not contain any substring bb and which begin with a. Hence altogether it represents any string consisting of a (possibly empty) part with no aa that ends in b, followed by a (possibly empty) part with no bb that begins with a.
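
A description of this kind can be tested by trying every way of splitting a string in two. The predicates below encode one reading of the description, and that reading is an assumption of the sketch: each part may be empty, a nonempty first part has no aa and ends in b, and a nonempty second part has no bb and begins with a. All function names are illustrative.

```python
import re
from itertools import product

PATTERN = re.compile(r"(?:b|ab)*(?:a|ab)*")

def first_part_ok(u):
    # empty, or no aa and ends in b
    return u == "" or ("aa" not in u and u.endswith("b"))

def second_part_ok(v):
    # empty, or no bb and begins with a
    return v == "" or ("bb" not in v and v.startswith("a"))

def described(s):
    """True if s splits into an acceptable first part and second part."""
    return any(first_part_ok(s[:i]) and second_part_ok(s[i:])
               for i in range(len(s) + 1))

def mismatches(max_len=9):
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if (PATTERN.fullmatch(s) is not None) != described(s):
                bad.append(s)
    return bad

print(mismatches())  # []
```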
