Language

Introduction to Language



Our first and one of the main topic for this course is language. A language is, in this course, a set of strings of symbols. Programming langauges we use are a language in that sense. Others such as languages of logics, languages of mathematics, natural languages etc. are all languages in that sense.

What we are going to study on languages in this course are four classes of languages called (Chomsky) formal languages and their properties. The four classes are regular (or type 3) languages, context-free (or type 2) languages, context-sensitive (or type 1) languages and phrase structure (or type 0) languages. Type 3 is a subset of type 2 which is a subset of type 1 and type 0 is the most general including the other three as a subset.

These formal languages are characterized by grammars which are essentially a set of rewrite rules for generating strings belonging to a language as we see later. Also there are various kinds of computing devices called automata which process these types of languages Thus formal languages can also be characterized by the computing devices which process them.

These formal languages and automata capture the essense of various computing devices and computation in a very simple way. Also for some important classes of problems, solving them can be seen as recognizing languages i.e. checking whether or not a string is in a language. Using automata and formal languages we can study limitations of computer and computation. It can be rigorously shown that some problems can not be solved by computers in any finite amount of time and that some others are practically unsolvable because of the time it takes to solve them. In asddition two of the formal languages, regular and context-free languages, are quite useful for modeling systems used in practice such as co9mputer network communication protocols, lexical analyzers and parser for compilers for programming languages.

In the following chapters we first learn about languages. Then we study regular languages, the simplest of the four formal languages, together with regular expressions which are a method of representing regular languages.
Then we investigate various kinds of finite automata: deterministic finite automata (DFA), nondeterministic finite automata (NFA) and nondeterministic finite automata with transitions (NFA-). They are devices that recognize regular languages. NFA and NFA- are conceptually simpler and easier to use when modeling a system because there are no restrictions on transitions for them unlike for DFA. On the other hand DFAs are suited for writing a simulator program because there is no nondeterminism such as going to two or more states from a state upon reading one input symbol. We are going to see an algorithm for converting NFA- to NFA which recognizes the same language and another for NFA to DFA conversion.
As we are going to learn next, in general there are more than one NFAs and DFAs that reconize one language. However, if the number of states of DFA is minimized, then the resulting DFA is unique up to the state names for a given regular language.
Then after seeing yet another way of representing regular laguages; regular grammars, we are going to learn modeling of systems finite automata.
Our last topic on regular language is testing of languages for non-regularity.



Next -- Definitions on Language
Back to Study Schedule
Back to Table of Contents