Finite State Automata

CS390, Spring 2024

Last modified: Jan 3, 2023

Abstract

Finite automata are a simple, but nonetheless useful, mathematical model of computation. In this module, we look at this model and at the languages that finite automata can accept.

These lecture notes are intended to be read in concert with the assigned portions of Chapter 2 of the text (Hopcroft). I will offer my own commentary on the text as appropriate.

1 Introduction

Finite automata (FA), also widely known as finite state automata (FSA), are a mathematical model of computation based on the ideas of

This is, as we will see, a computation model of somewhat limited power. Not all things that we regard as “computation” can be done with FAs. It is, however, powerful enough to have quite a few practical applications, while being simple enough to be easily understood.

2 An Opening Example

A traditional puzzle:

A man stands on the side of a small river. He has with him a head of cabbage, a goose, and a dog. On the shore in front of him is a small rowboat. The boat is so small that he can take only himself and one of his accompanying items at a time.

But:

  • If he leaves the goose alone on either shore with the cabbage, the goose will eat the cabbage.

  • If he leaves the dog alone on either shore with the goose, the dog will kill the goose.

How can the man get across the river with all his items intact?

We can model this by labeling situations using the characters M C G D | to denote the man, the cabbage, the goose, the dog, and the river, respectively.

For example, we start with everyone on one side of the river: CDGM|. (We will choose to always write the characters on either side of the river in alphabetic order.) We want to end up with everything on the other side of the river: |CDGM.

Starting from CDGM|, we can consider four possibilities:

  1. The man rows across the stream with the cabbage: DG|CM.
  2. The man rows across the stream with the goose: CD|GM.
  3. The man rows across the stream with the dog: CG|DM.
  4. The man rows across the stream alone: CDG|M.

 

We can diagram that like this:

The circles (states) are labeled with the positions of the man and his items. The connecting arrows (transitions) are labeled with what the man took with him on his trip (using ‘x’ to indicate that he rowed alone).

Looking at this, we can see that case 1 ends badly: the dog kills the goose. In case 3, the goose eats the cabbage. In case 4, one or both of those things happen (depending on how fast the goose is). None of these outcomes is desirable, so I will mark them as “final”.


We can read the series of steps necessary to solve the puzzle by reading off the labels of all the transitions that take us from CDGM| to |CDGM:

G x D G C x G
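The same search can be carried out mechanically. The following is a minimal sketch, not part of the original notes: it represents each state by the set of characters on the starting shore and runs a breadth-first search over safe states only. The function names (`is_safe`, `step`, `solve`, `replay`) are my own, chosen for illustration. The puzzle has two shortest solutions, so `solve` may return either one; `replay` can be used to confirm that the sequence given above is valid.

```python
from collections import deque

EVERYONE = frozenset("CDGM")  # cabbage, dog, goose, man


def is_safe(bank):
    """A shore is safe if the man is there, or if no forbidden pair is left alone."""
    if "M" in bank:
        return True
    return not ({"C", "G"} <= bank or {"D", "G"} <= bank)


def step(left, cargo):
    """Row across with `cargo` ('x' means alone).

    `left` is the set of characters on the starting shore. Returns the new
    set, or None if the move is impossible or leaves an unsafe shore.
    """
    here = left if "M" in left else EVERYONE - left
    moved = {"M"} if cargo == "x" else {"M", cargo}
    if not moved <= here:
        return None
    new_left = left - moved if "M" in left else left | moved
    if is_safe(new_left) and is_safe(EVERYONE - new_left):
        return frozenset(new_left)
    return None


def solve():
    """Breadth-first search from CDGM| to |CDGM; returns a string of labels."""
    start, goal = EVERYONE, frozenset()
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        left, path = queue.popleft()
        if left == goal:
            return path
        for cargo in "xCDG":
            nxt = step(left, cargo)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + cargo))
    return None


def replay(path):
    """Check that a sequence of labels leads safely from CDGM| to |CDGM."""
    left = EVERYONE
    for cargo in path:
        left = step(left, cargo)
        if left is None:
            return False
    return left == frozenset()
```

For example, `replay("GxDGCxG")` checks the sequence read off the diagram, and `len(solve())` gives the number of crossings in a shortest solution.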

3 Finite Automata: Definition

What we have just built is a Finite Automaton (FA), a collection of states in which we make transitions based upon input symbols.

Definition: A Finite Automaton

A finite automaton (FA) is a 5-tuple $(Q, \Sigma, q_0, A, \delta)$ where

  • $Q$ is a finite set of states;
  • $\Sigma$ is a finite input alphabet;
  • $q_0 \in Q$ is the initial state;
  • $A \subseteq Q$ is the set of accepting states; and
  • $\delta : Q \times \Sigma \rightarrow Q$ is the transition function.

For any element q of Q and any symbol $\sigma \in \Sigma$, we interpret $\delta(q,\sigma)$ as the state to which the FA moves, if it is in state q and receives the input $\sigma$.

This is the key definition in this chapter, and it is worth a careful look to be sure you understand it.
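One way to check your understanding of the definition is to write it down as a data structure. The sketch below is illustrative only: the class name `FA`, its field names, and the sample automaton `even_ones` (strings over $\{0,1\}$ with an even number of 1s) are assumptions of mine and do not correspond to any figure in these notes.

```python
from typing import NamedTuple


class FA(NamedTuple):
    states: frozenset     # Q
    alphabet: frozenset   # Sigma
    start: str            # q0
    accepting: frozenset  # A
    delta: dict           # maps (state, symbol) -> state


def accepts(fa, string):
    """Run the FA on `string` and report whether it ends in an accepting state."""
    state = fa.start
    for symbol in string:
        if symbol not in fa.alphabet:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        state = fa.delta[(state, symbol)]
    return state in fa.accepting


# Hypothetical example: accept strings containing an even number of 1s.
even_ones = FA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset("01"),
    start="even",
    accepting=frozenset({"even"}),
    delta={
        ("even", "0"): "even",
        ("even", "1"): "odd",
        ("odd", "0"): "odd",
        ("odd", "1"): "even",
    },
)
```

Here `accepts(even_ones, "0110")` is true and `accepts(even_ones, "111")` is false. Note that `delta` has an entry for every pair in $Q \times \Sigma$, as the definition of the transition function requires.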

 

For example, for the FA shown here, we would say that:

4 Some Examples

 

Look at this automaton.

What language do you think this accepts?

 

Look at this automaton.

What language do you think this accepts?

 

A little bit trickier.

What language do you think this accepts?

 

This automaton accepts strings over $\{0,1\}$, so we can think of it as accepting certain binary numbers.

What binary numbers do you think this accepts?

5 Creating FSAs

Creating an FSA is much like writing a program. That is, it is a creative process with no simple “recipe” that can be followed to always lead you to a correct design. That said, many of the skills you apply to programming will work when creating FSAs as well.

5.1 Testing

You can test your FSAs, either by desk-checking them or by actually executing them in an FA simulator (as you will do in the next reading).

As when testing a program, you will get the best results by choosing a variety of inputs, with attention paid to the various distinct “cases” that the FSA/program must satisfy, and by paying attention to boundary cases (e.g., the empty string).
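For small alphabets, one systematic way to choose inputs is to try every string up to some length, starting with the empty string, and compare the automaton against an independent description of the language. The sketch below assumes a hypothetical target language (strings over $\{0,1\}$ ending in 01) and hypothetical state names `s0`, `s1`, `s2`; it is a stand-in for whatever simulator you actually use, not a replacement for it.

```python
from itertools import product


def run_dfa(delta, start, accepting, string):
    """Simulate a DFA whose transition table is a dict keyed by (state, symbol)."""
    state = start
    for symbol in string:
        state = delta[(state, symbol)]
    return state in accepting


def find_counterexample(delta, start, accepting, oracle, alphabet="01", max_len=6):
    """Return the first string on which the DFA and the oracle disagree, or None."""
    for n in range(max_len + 1):  # n = 0 covers the empty string
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if run_dfa(delta, start, accepting, s) != oracle(s):
                return s
    return None


def ends_in_01(s):
    """Independent description of the intended language."""
    return s.endswith("01")


# s0: no useful suffix yet, s1: last symbol was 0, s2: last two symbols were 01.
GOOD = {
    ("s0", "0"): "s1", ("s0", "1"): "s0",
    ("s1", "0"): "s1", ("s1", "1"): "s2",
    ("s2", "0"): "s1", ("s2", "1"): "s0",
}

# A plausible design slip: after "01", a 0 is sent back to s0 instead of s1.
BUGGY = dict(GOOD)
BUGGY[("s2", "0")] = "s0"
```

Calling `find_counterexample(GOOD, "s0", {"s2"}, ends_in_01)` finds no disagreement, while the same call with `BUGGY` returns a short string that the flawed automaton rejects even though it ends in 01. That is the kind of surprise you would rather find before submitting.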

If you find yourself frequently “surprised” to learn that an automaton you submitted for grading is actually incorrect, you probably aren’t doing a good job of testing.

5.2 Decomposition

Every programmer quickly learns that the key to success is breaking down complicated problems into simpler ones that can be combined to form an overall solution. The same can be said of designing automata.

If I asked you, for example, to create an FA for, say, $\{01,101\}^*$, that breaks down into

So you could…

 

… start by writing out the straight-line sequences for the two concatenations.

 

Then tie them together at the beginning by merging their starting states (q0 and q3) so that we branch into the appropriate sequence depending on whether the first character is a 0 or 1. That gives us our “choice”. Note that because we have merged those two states, the total number of states is reduced by one.

 

Then, knowing that the * must loop us back to the beginning of its “body”, merge the endpoints back to the beginning. Again, this is really a merge - we lose the states q2 and q6 because we have merged them into q0.

 

Finally, choose the initial and final states. In this case, q0 is both.
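The result of these steps can be written out as a transition table and checked against the regular expression. This sketch assumes the straight-line sequences were numbered q0, q1, q2 for 01 and q3, q4, q5, q6 for 101, so that q1, q4, and q5 survive the merges; the diagrams may number them differently. Missing transitions are treated as an implicit dead state, which a strict reading of the definition of $\delta$ would require you to draw explicitly.

```python
import re
from itertools import product

# Transitions left after merging q3 into q0, and then q2 and q6 into q0.
DELTA = {
    ("q0", "0"): "q1",  # start of 01
    ("q1", "1"): "q0",  # end of 01, loop back
    ("q0", "1"): "q4",  # start of 101
    ("q4", "0"): "q5",
    ("q5", "1"): "q0",  # end of 101, loop back
}


def accepts(string):
    """q0 is both the initial state and the only accepting state."""
    state = "q0"
    for symbol in string:
        state = DELTA.get((state, symbol))
        if state is None:  # no transition: the implicit dead state
            return False
    return state == "q0"


def agrees_with_regex(max_len=8):
    """Compare the FA with the regular expression on all short strings."""
    pattern = re.compile(r"(01|101)*")
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            s = "".join(symbols)
            if accepts(s) != (pattern.fullmatch(s) is not None):
                return False
    return True
```

For instance, `accepts("01101")` is true (01 followed by 101), while `accepts("0110")` is false.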

Not every problem can be solved this way, but many can be approached like this in part at least. Sometimes this does not yield the simplest possible FA for the language (though it tends to produce one where the correctness of the FA is easily judged), but it’s often easier to simplify a known correct FA than to come up with the simplest possible one from scratch.

This whole approach will be easier for the NFAs we discuss later than it is for DFAs, because they allow sub-solutions to be linked together more easily. In fact, we will use this construction approach to prove that every regular expression can be represented as an FA.

5.3 What’s in a State?

Although we think of each FSA, as a whole, as a means of recognizing a particular language, we can actually think of each state as being associated with a language, a set of strings that would take us to that particular state.

A good rule of thumb in creating FAs is that every state should have an obvious “meaning”, a set of strings that are “accepted” by that state.

 

For example, in the problem above, we clearly intend q0 to be associated with the set of strings $\{01,101\}^*$ – that’s what we designed this FSA to recognize.

But we can also say that

Try to only introduce new states into an FSA because they have a specific purpose. If you don’t know what set of strings would take you to a state, you should consider carefully whether you want or need that state at all.
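If you are unsure what a state “means”, you can experiment: enumerate short strings and group them by the state they lead to. The sketch below does this for the $\{01,101\}^*$ automaton, again assuming the surviving states are named q0, q1, q4, and q5, and using the label `dead` (my own) for strings that fall off the diagram.

```python
from itertools import product

DELTA = {
    ("q0", "0"): "q1",
    ("q1", "1"): "q0",
    ("q0", "1"): "q4",
    ("q4", "0"): "q5",
    ("q5", "1"): "q0",
}


def state_reached(string):
    """Return the state the automaton is in after reading `string`."""
    state = "q0"
    for symbol in string:
        # Any missing transition, including those out of "dead", leads to "dead".
        state = DELTA.get((state, symbol), "dead")
    return state


def strings_by_state(max_len):
    """Group every string of length <= max_len by the state it reaches."""
    groups = {}
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            s = "".join(symbols)
            groups.setdefault(state_reached(s), []).append(s)
    return groups
```

With `max_len=3`, the strings reaching q1 are `0` and `010`, which suggests that q1 stands for “a string of $\{01,101\}^*$ followed by a 0”. Reading the other groups the same way gives each remaining state a similar description.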

6 Closing Discussion

6.1 Variations

This chapter has looked at a very “pure” form of FSA. There are some common variations that allow us to “do” things with an FA without substantially altering the computational power beyond that of a regular FA.

One worth mentioning is the Mealy machine, which adds to each state transition an optional string to be emitted.

 

Here, for example, is a Mealy machine that translates binary numbers into octal (base 8).
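To make the idea of emitting output on transitions concrete, here is one possible way to build such a translator. It is a sketch under stated assumptions and may differ from the machine in the diagram: the input is read most significant bit first, its length is a multiple of three, and each state simply records the bits of the current three-bit group seen so far. The names `TRANSITIONS` and `binary_to_octal` are my own.

```python
# States are the bits of the current group read so far: "", "0", "1", "00", ...
STATES = ["", "0", "1", "00", "01", "10", "11"]

# Each transition maps (state, input bit) to (next state, emitted string).
TRANSITIONS = {}
for state in STATES:
    for bit in "01":
        group = state + bit
        if len(group) == 3:
            # Third bit of the group: emit one octal digit and start over.
            TRANSITIONS[(state, bit)] = ("", str(int(group, 2)))
        else:
            # Not enough bits yet: emit nothing (the output is optional).
            TRANSITIONS[(state, bit)] = (group, "")


def binary_to_octal(bits):
    """Run the Mealy machine and concatenate everything it emits."""
    state = ""
    output = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a binary digit: {bit!r}")
        state, emitted = TRANSITIONS[(state, bit)]
        output.append(emitted)
    if state != "":
        raise ValueError("input length must be a multiple of three")
    return "".join(output)
```

For example, `binary_to_octal("101110")` reads the groups 101 and 110 and emits `56`. Note that the machine still has only finitely many states (seven here); the output does not add any memory.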

Some of the more complicated automata that we will examine in later chapters can be thought of as associating some form of I/O device with a Mealy machine “controller”. Thinking of an FA as a controller may make a little more sense if we believe that the FA can issue output other than simply accepting or not accepting a completed string, output that could be signals to some other device.

6.2 Applications

Although FAs are limited in computational power, they have the virtue of being relatively easy to understand. Consequently, they are often used by software developers and non-software system designers as a means of summarizing the behavior of a complex system.

 

Here, for example, is a state diagram summarizing the behavior of processes in a typical operating system. Most operating systems will be running more processes than can be simultaneously accommodated on the number of physical CPUs. Therefore newly started processes start in a Ready state and have to wait until the OS scheduler assigns a CPU to them. At that moment, the process starts Running. It stays in this state until either the scheduler decides to take back the CPU (because a “time slice” has expired) or the process initiates an I/O operation that, compared to the CPU speed, may take a substantial amount of time. In the latter case, the process moves into a Blocked state, surrendering the CPU back to the scheduler. When the I/O operation is completed, the process returns to the Ready state until the scheduler decides to give it some more CPU time.
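The process life cycle just described translates directly into a transition table. In this sketch the state names come from the description above, but the event names (`dispatch`, `timeout`, `io_request`, `io_complete`) are hypothetical labels of mine and may not match those in the diagram.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("Ready", "dispatch"): "Running",     # scheduler assigns a CPU
    ("Running", "timeout"): "Ready",      # time slice expired
    ("Running", "io_request"): "Blocked", # process starts an I/O operation
    ("Blocked", "io_complete"): "Ready",  # I/O finished, wait for a CPU again
}


def run(events, state="Ready"):
    """Follow a sequence of events, rejecting any that is illegal in the current state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state
```

For example, `run(["dispatch", "io_request", "io_complete"])` ends in `Ready`, while `run(["io_complete"])` raises an error because a Ready process has no pending I/O.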

The notation used in this diagram is that of a UML state diagram. But the kinship to the formal model of FAs should be pretty obvious. The very fact that the notation for such state diagrams is part of an industry standard notation should give you a sense of how common it is for developers to use FAs as a basis for reasoning about software.

Pushing a bit further on this theme, model checking is an approach to validating complex systems, particularly concurrent systems. An FSA “specification” is given for various portions of a complex system. Combinations of states that are considered “dangerous” or that would indicate a failure (e.g., a system going into deadlock) are identified. Analysis, some algorithmic and some human, of those state diagrams is then performed to see if it is possible to reach any of those undesired states. The process for doing this is not unlike the process of minimizing the states of an FSA from section 2.6.
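The algorithmic part of such an analysis is, at its core, a reachability search over combined states. The sketch below is a toy illustration, not a real model checker: it assumes a hypothetical system of two processes, where process 1 acquires lock A and then lock B, and process 2 acquires B and then A. A combined state is a pair of counters, each 0 (holds nothing), 1 (holds its first lock), or 2 (holds both).

```python
from collections import deque


def successors(state):
    """All combined states reachable in one step from `state`."""
    p1, p2 = state
    moves = []
    # Process 1 wants A, then B. A is taken only when p2 == 2; B is free only when p2 == 0.
    if p1 == 0 and p2 != 2:
        moves.append((1, p2))
    elif p1 == 1 and p2 == 0:
        moves.append((2, p2))
    elif p1 == 2:
        moves.append((0, p2))  # finished: release both locks
    # Process 2 mirrors process 1 with the locks swapped.
    if p2 == 0 and p1 != 2:
        moves.append((p1, 1))
    elif p2 == 1 and p1 == 0:
        moves.append((p1, 2))
    elif p2 == 2:
        moves.append((p1, 0))
    return moves


def reachable(start):
    """Breadth-first search for every combined state reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def deadlocks(start):
    """Reachable states from which no process can make a move."""
    return {s for s in reachable(start) if not successors(s)}
```

Here `deadlocks((0, 0))` reports the state `(1, 1)`, in which each process holds one lock and waits forever for the other. Real systems have far more combined states, which is why much of the effort in model checking goes into keeping this search tractable.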