Notes on Models of Computation

H. Conrad Cunningham

14 April 2022

Copyright (C) 2015, 2022, H. Conrad Cunningham
Professor of Computer and Information Science
University of Mississippi
214 Weir Hall
P.O. Box 1848
University, MS 38677
(662) 915-7396 (dept. office)

Browser Advisory: The HTML version of this textbook requires a browser that supports the display of MathML. A good choice as of April 2022 is a recent version of Firefox from Mozilla.

1 Introduction to the Theory of Computation

Why study theory?

  1. To understand the concepts and principles underlying the fundamental nature of computing
  2. To learn to apply the theory to practical areas of computing
  3. To have fun!

In this course, we study the following models:

  1. automaton (automata)
  2. formal language
  3. algorithm

1.1 Mathematical Preliminaries and Notation

The mathematical concepts used in the Linz textbook include:

1.1.1 Sets

Students in this course should be familiar with the following set concepts from previous study in mathematics:

Laws for operations on the empty set:

DeMorgan’s Laws:

1.1.2 Functions

Function f: D → R means that

Function f is a

1.1.3 Relations

A relation on X and Y is any subset of X × Y.

An equivalence relation ≡ is a generalization of equality. It is:

  1. reflexive: x ≡ x for all x
  2. symmetric: if x ≡ y then y ≡ x
  3. transitive: if x ≡ y and y ≡ z then x ≡ z

1.1.4 Graphs

A graph ⟨V, E⟩ is a mathematical entity where

A directed graph (or digraph) is a graph in which each edge ei = (vj, vk) has a direction from vertex vj to vertex vk.

Edge ei = (vj, vk) on a digraph is an outgoing edge from vertex vj and an incoming edge to vertex vk.

If there are no directions associated with the edges, then the graph is undirected.

Graphs may be labeled by assigning names or other information to vertices or edges.

We can visualize a graph with a diagram in which the vertices are shown as circles and edges as lines connecting a pair of vertices. For directed graphs, the direction of an edge is shown by an arrow.

Linz Figure 1.1 shows a digraph ⟨V, E⟩ where V = {v1, v2, v3} and E = {(v1,v3), (v3,v1), (v3,v2), (v3,v3)}.

Linz Fig. 1.1: Diagram of a Digraph

A sequence of edges (vi,vj), (vj,vk), …, (vm,vn) is a walk from vi to vn. The length of the walk is the total number of edges traversed.

A path is a walk with no edge repeated. A path is simple if no vertex is repeated.

A walk from some vertex vi to itself is a cycle with base vi. If no vertex other than the base is repeated, then the cycle is simple.

In Linz Figure 1.1:

If the edges of a graph are labelled, then the label of a walk (or path) is the sequence of edges encountered on a traversal.

1.1.5 Trees

A tree is a directed graph with no cycles and a distinct root vertex such that there is exactly one path from the root to every other vertex.

The root of a tree has no incoming edges.

A vertex of a tree without any outgoing edges is a leaf of the tree.

If there is an edge from vi to vj in a tree, then:

The level associated with each vertex is the number of edges in the path from the root to the vertex.

The height of a tree is the largest level number of any vertex.

If we associate an ordering with the vertices at each level, then the tree is an ordered tree.

The above terminology is illustrated in Linz Figure 1.2.

Linz Fig. 1.2: A Tree

1.1.6 Proof Techniques

Students in this course should be familiar with the following proof techniques from previous study in mathematics:

We will see an example of an inductive proof in the next section.

1.2 Three Basic Concepts

Three fundamental ideas are the major themes of this course:

  1. languages
  2. grammars
  3. automata

1.2.1 Languages

Our concept of language is an abstraction of the concept of a natural language.

1.2.1.1 Language Concepts

Linz Definition (Alphabet): An alphabet, denoted by Σ, is a finite, nonempty set of symbols.

By convention, we use lowercase letters near the beginning of the English alphabet (a, b, c, …) to represent elements of Σ.

For example, if Σ = {a, b}, then the alphabet has two unique symbols denoted by a and b.

Linz Definition (String): A string is a finite sequence of symbols from the alphabet.

By convention, we use lowercase letters near the end of the English alphabet (…, u, v, w, x, y, z) to represent strings. We write strings left to right. That is, symbols appearing to the left are before those appearing to the right.

For example, w = baabaa is a string from the above alphabet. The string begins with a b and ends with an a.

Linz Definition (Concatenation): The concatenation of strings u and v means appending the symbols of v to the right end of (i.e., after) the symbols of u, denoted by uv.

If u = a1a2a3 and v = b1b2b3, then uv = a1a2a3b1b2b3.

Definition (Associativity): Operation ⊕ is associative on set S if, for all x, y, and z in S, (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z). We often write associative expressions without explicit parentheses, e.g., x ⊕ y ⊕ z.

String concatenation is associative; that is, (uv)w = u(vw).

Thus we normally just write uvw without parentheses.

Definition (Commutativity): Operation ⊕ is commutative on set S if, for all x and y in S, x ⊕ y = y ⊕ x.

String concatenation is not commutative. That is, in general uv ≠ vu.

Linz Definition (Reverse): The reverse of a string w, denoted by w^R, is the string with the same symbols but with the order reversed.

If w = a1a2a3, then w^R = a3a2a1.

Linz Definition (Length): The length of a string w, denoted by |w|, is the number of symbols in string w.

Linz Definition (Empty String): The empty string, denoted by λ, is the string with no symbols; that is, |λ| = 0.

Definition (Identity Element): An operation ⊕ has an identity element e on set S if, for all x ∈ S, x ⊕ e = x = e ⊕ x.

The empty string λ is the identity element for concatenation. That is, λw = wλ = w.

1.2.1.2 Formal Interlude: Inductive Definitions and Induction

We can define the length of a string with the following inductive definition:

  1. |λ| = 0 (base case)
  2. |wa| = |w| + 1 (inductive case)

Note: This inductive definition and proof differ from the textbook's. Here we begin with the empty string.

Using the fact that λ is the identity element and the above definition, we see that

|a| = |λa| = |λ| + 1 = 0 + 1 = 1.

Prove |uv| = |u| + |v|.

Noting the definition of length above, we choose to do an induction over string v (or, if you prefer, over the length of v, basing the induction at 0).

Base case: v = λ (that is, length is 0)

|uλ|
= { identity for concatenation }   ← justification for step given in braces
|u|
= { identity for + }
|u| + 0
= { definition of length }
|u| + |λ|

Inductive case: v = wa (that is, length is greater than 0)
Induction hypothesis: |uw| = |u| + |w|

|u(wa)|
= { associativity of concatenation }
|(uw)a|
= { definition of length }
|uw| + 1
= { induction hypothesis }
(|u| + |w|) + 1
= { associativity of + }
|u| + (|w| + 1)
= { definition of length (right to left) }
|u| + |wa|

Thus we have proved |uv| = |u| + |v|. QED.
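The property just proved is easy to spot-check mechanically. A minimal sketch in Python (the helper `strings_up_to` and the alphabet {a, b} are my own choices for illustration):

```python
# Spot-check the theorem |uv| = |u| + |v| over all short strings on {a, b}.
def strings_up_to(alphabet, max_len):
    """All strings over `alphabet` of length 0 through max_len."""
    result = [""]
    frontier = [""]
    for _ in range(max_len):
        frontier = [w + a for w in frontier for a in alphabet]
        result.extend(frontier)
    return result

for u in strings_up_to("ab", 3):
    for v in strings_up_to("ab", 3):
        assert len(u + v) == len(u) + len(v)
print("checked", len(strings_up_to("ab", 3)) ** 2, "pairs")   # checked 225 pairs
```

Of course a finite check is not a proof; the induction above is what establishes the property for all strings.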

1.2.1.3 More Language Concepts

Linz Definition (Substring): A substring of a string w is any string of consecutive symbols in w.

If w = a1a2a3, then the substrings are λ, a1, a1a2, a1a2a3, a2, a2a3, a3.

Linz Definition (Prefix, Suffix): If w = vu, then v is a prefix of w and u is a suffix.

If w = a1a2a3, the prefixes are λ, a1, a1a2, a1a2a3.

Linz Definition (w^n): w^n, for any string w and n ≥ 0, denotes n repetitions of string (or symbol) w. We further define w^0 = λ.

Linz Definition (Star-Closure): Σ*, for alphabet Σ, is the set of all strings obtained by concatenating zero or more symbols from the alphabet.

Note: An alphabet must be a finite set.

Linz Definition (Positive Closure): Σ+ = Σ* − {λ}

Although Σ is finite, Σ* and Σ+ are infinite.

For a string w, we also write w* and w+ to denote zero or more repetitions of the string and one or more repetitions of the string, respectively.

Linz Definition (Language): A language, for some alphabet Σ, is a subset of Σ*.

Linz Definition (Sentence): A sentence of some language L is any string from L (i.e., from Σ*).

1.2.1.4 Linz Example 1.9: Example Languages

Let Σ = {a, b}.

Since the language has a finite number of sentences, it is a finite language.

Sentences aabb and aaaabbbb are in L, but aaabb is not.

As with most interesting languages, L is an infinite language.

1.2.1.5 Operations on Languages

Languages are represented as sets. Operations on languages can be defined in terms of set operations.

Linz Definition (Union, Intersection, and Difference): Language union, intersection, and difference are defined directly as the corresponding operations on sets.

Linz Definition (Complementation): Language complementation with respect to Σ* is defined such that L̄ = Σ* − L.

Linz Definition (Reverse): Language reverse is defined such that L^R = { w^R : w ∈ L }. (That is, reverse all strings.)

Linz Definition (Concatenation): Language concatenation is defined such that L1L2 = { xy : x ∈ L1, y ∈ L2 }.

Linz Definition (L^n): L^n means L concatenated with itself n times.

L^0 = {λ} and L^{n+1} = L^n L

Definition (Star-Closure): Star-closure (Kleene star) is defined such that L* = L^0 ∪ L^1 ∪ L^2 ∪ ⋯.

Definition (Positive Closure): Positive closure is defined such that L+ = L^1 ∪ L^2 ∪ ⋯.
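For finite languages, these operations can be tried out directly. A small sketch in Python, with languages represented as sets of strings (the function names are mine, and star-closure is truncated since L* is infinite whenever L contains a nonempty string):

```python
# Language operations on finite languages represented as Python sets of strings.
def concat(L1, L2):
    """Concatenation: { xy : x in L1, y in L2 }."""
    return {x + y for x in L1 for y in L2}

def power(L, n):
    """L^n, defined by L^0 = {lambda} and L^(n+1) = L^n L."""
    result = {""}                      # the empty string plays the role of lambda
    for _ in range(n):
        result = concat(result, L)
    return result

def star_up_to(L, n):
    """Finite approximation of L*: the union L^0 | L^1 | ... | L^n."""
    result = set()
    for i in range(n + 1):
        result |= power(L, i)
    return result

L = {"a", "bb"}
print(sorted(concat(L, L)))        # ['aa', 'abb', 'bba', 'bbbb']
print(sorted(star_up_to(L, 2)))    # ['', 'a', 'aa', 'abb', 'bb', 'bba', 'bbbb']
```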

1.2.1.6 Language Operation Examples

Let L = { a^n b^n : n ≥ 0 }.

How would we express L̄ and L* in set notation?

Although set notation is useful, it is not a convenient notation for expressing complicated languages.

1.2.2 Grammars

1.2.2.1 Grammar Concepts

Linz Definition 1.1 (Grammar): A grammar G is a quadruple G = (V, T, S, P) where

V is a finite set of objects called variables.
T is a finite set of objects called terminal symbols.
S ∈ V is a special symbol called the start symbol.
P is a finite set of productions.
V and T are nonempty and disjoint.

Linz Definition (Productions): Productions have the form x → y where:

x ∈ (V ∪ T)+, i.e., x is some non-null string of terminals and variables
y ∈ (V ∪ T)*, i.e., y is some, possibly null, string of terminals and variables

Consider application of productions, given w = uxv:

Linz Definition (Derives): w1 ⇒* wn means that w1 derives wn in zero or more production steps.

Linz Definition (Language Generated): Let G = (V, T, S, P) be a grammar. Then L(G) = { w ∈ T* : S ⇒* w } is the language generated by G.

That is, L(G) is the set of all strings that can be generated from the start symbol S using the productions P.

Linz Definition (Derivation): A derivation of some sentence w ∈ L(G) is a sequence S ⇒ w1 ⇒ w2 ⇒ w3 ⇒ ⋯ ⇒ wn ⇒ w.

The strings S, w1, …, wn above are sentential forms of the derivation of sentence w.

1.2.2.2 Linz Example 1.11 (Grammar)

Consider G = ({S}, {a, b}, S, P), where P is the set of productions

S → aSb
S → λ

Consider S ⇒ aSb ⇒ aaSbb ⇒ aabb. Hence, S ⇒* aabb.

aabb is a sentence of the language; the other strings in the derivation are sentential forms.

Conjecture: The language generated by this grammar, L(G), is { a^n b^n : n ≥ 0 }.

Usually, however, it is difficult to construct an explicit set definition for a language generated by a grammar.

Now prove the conjecture.

First, prove that all sentential forms of the language have the structure wi = a^i S b^i for i ≥ 0, by induction on i.

Basis step:

Clearly, w0 = S is a sentential form, the start symbol.

Inductive step:

Assume wm = a^m S b^m is a sentential form; show that w_{m+1} = a^{m+1} S b^{m+1} is as well.

Case 1: If we begin with the assumption and apply production S → aSb, we get sentential form w_{m+1} = a^{m+1} S b^{m+1}.

Case 2: If we begin with the assumption and apply production S → λ, we get the sentence a^m b^m rather than a sentential form.

Hence, all sentential forms have the form a^i S b^i.

Given that S → λ is the only production whose right side contains no variables, we must apply it to derive any sentence. As we noted in case 2 above, applying this production to any sentential form a^m S b^m gives a sentence of the form a^m b^m. QED.
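The derivations of this grammar can also be animated. A minimal Python sketch (my own encoding) that applies S → aSb some number of times and then finishes with S → λ:

```python
# Sentences of the grammar S -> aSb | lambda, by direct string rewriting.
def sentence(n):
    """Apply S -> aSb exactly n times, then S -> lambda."""
    form = "S"
    for _ in range(n):
        form = form.replace("S", "aSb")   # one production step
    return form.replace("S", "")          # finish with S -> lambda

print([sentence(n) for n in range(4)])   # ['', 'ab', 'aabb', 'aaabbb']
```

Every output has the shape a^n b^n, matching the conjecture proved above.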

1.2.2.3 Linz Example 1.12: Finding a Grammar for a Language

Given L = { a^n b^{n+1} : n ≥ 0 }.

A slightly different grammar might introduce nonterminal A as follows:

1.2.2.4 More Grammar Concepts

To show that a language LL is generated by a grammar GG, we must prove:

  1. For every w ∈ L, there is a derivation using G.

  2. Every string derived from G is in L.

Linz Definition (Equivalence): Two grammars are equivalent if they generate the same language.

For example, the two grammars given above for the language L = { a^n b^{n+1} : n ≥ 0 } are equivalent.

1.2.2.5 Linz Example 1.13

Let Σ = {a, b} and let na(w) and nb(w) denote the number of a's and b's in the string w.

Let grammar G have productions

S → SS
S → λ
S → aSb
S → bSa

Let L = { w : na(w) = nb(w) }.

Prove L(G) = L.

An informal argument follows. An actual proof would be an induction over the length of w.

Consider cases for w.

  1. Case w ∈ L(G). Show w ∈ L.

    Any production that adds an a also adds a b. Thus a derived string has the same number of a's and b's.

  2. Case w ∈ L. Show w ∈ L(G).

1.2.3 Automata

An automaton is an abstract model of a computer.

Linz Fig. 1.4: Schematic Representation of a General Automaton

As shown in Linz Figure 1.4, an automaton:

  1. reads input from an input file – one symbol from each cell – left to right

  2. produces output

  3. may use storage – unlimited number of cells (may be different alphabet)

  4. has a control unit

A configuration is a state of the control unit, input, and storage.

A move is a transition from one state configuration to another.

Automata can be categorized based on control:

Automata can also be categorized based on output:

Various models differ in

1.3 Applications

1.3.1 Linz Example 1.15: C Identifiers

The syntax rules for identifiers in the language C are as follows:

Formally, we can describe these rules with the grammar:

    <id>       -> <letter><rest>   | <underscr><rest>
    <rest>     -> <letter><rest>   | <digit><rest>    |
                  <underscr><rest> | <lambda>
    <letter>   -> a|b|c|...|z|A|B|C|...|Z
    <digit>    -> 0|1|2|...|9
    <underscr> -> _

Above, <lambda> represents the symbol λ, -> is the → of productions, and | denotes alternative right-hand sides of the productions.

The variables are <id>, <letter>, <digit>, <underscr>, and <rest>. The other alphanumeric symbols are literals.

Linz Figure 1.6 shows a drawing of an automaton that accepts all legal C identifiers as defined above.

Linz Fig. 1.6: Automaton to Accept C Identifiers

We can interpret the automaton in Linz Figure 1.6 as follows:
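The identifier rules above describe a regular language, so they can also be checked with a simple pattern. A sketch in Python (the regular expression is my own rendering of the grammar, not from Linz):

```python
import re

# <id> -> (<letter> | <underscr>) <rest>, where <rest> is any sequence of
# letters, digits, and underscores -- rendered as one regular expression.
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def is_c_identifier(s):
    return C_IDENTIFIER.match(s) is not None

print(is_c_identifier("_count1"))   # True
print(is_c_identifier("9lives"))    # False
```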

1.3.2 Linz Example 1.17: Binary Adder

Let x = a0 a1 a2 ⋯ an, where the ai are bits.

Then value(x) = Σ_{i=0}^{n} ai 2^i.

This is the usual binary representation in reverse.

A serial adder processes two such numbers x and y, bit by bit, starting at the left end. Each bit addition creates a digit for the sum and a carry bit, as shown in Linz Figure 1.7.

Linz Fig. 1.7: Binary Addition Table

A block diagram for the machine is shown in Linz Figure 1.8.

Linz Fig. 1.8: Binary Adder Block Diagram

A transducer automaton to carry out the addition of two numbers is shown in Linz Figure 1.9.

The pair on the edges represents the two inputs. The value following the slash is the output.

Linz Fig. 1.9: Binary Adder Transducer Automaton
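The transducer's behavior can be sketched as a small program whose single state variable is the carry bit. This is my own rendering; inputs are bit strings with the lowest-order bit first, matching the reversed representation above:

```python
# Serial adder as a transducer: the control state is the carry bit,
# and each move outputs one sum bit.  Bit strings are lowest-order first.
def serial_add(x, y):
    n = max(len(x), len(y))
    x, y = x.ljust(n, "0"), y.ljust(n, "0")   # pad the shorter input with zeros
    carry, out = 0, []
    for a, b in zip(x, y):
        total = int(a) + int(b) + carry
        out.append(str(total % 2))            # output bit on the edge label
        carry = total // 2                    # next state: carry or no carry
    if carry:
        out.append("1")
    return "".join(out)

def value(x):
    """value(x) = sum of a_i * 2^i, as in the text."""
    return sum(int(bit) << i for i, bit in enumerate(x))

print(serial_add("11", "01"))   # '101'  (3 + 2 = 5)
```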

2 Finite Automata

In chapter 2, we approach automata and languages more precisely and formally than in chapter 1.

A finite automaton is an automaton that has no temporary storage (and, like all automata we consider in this course, a finite number of states and a finite input alphabet).

2.1 Deterministic Finite Accepters

2.1.1 Accepters

Linz Definition (DFA): A deterministic finite accepter, or dfa, is defined by the tuple M = (Q, Σ, δ, q0, F) where

A dfa operates as described in the following pseudocode:

    currentState := q0
    position input at left end of string
    while more input exists
        currentInput := next_input_symbol
        advance input to right
        currentState := δ(currentState, currentInput)
    if currentState ∈ F then ACCEPT else REJECT

2.1.2 Transition Graphs

To visualize a dfa, we use a transition graph constructed as follows:

2.1.3 Linz Example 2.1

The graph pictured in Linz Figure 2.1 represents the dfa M = ({q0, q1, q2}, {0, 1}, δ, q0, {q1}), where δ is represented by

δ(q0,0) = q0, δ(q0,1) = q1
δ(q1,0) = q0, δ(q1,1) = q2
δ(q2,0) = q2, δ(q2,1) = q1

Note that q0 is the initial state and q1 is the only final state in this dfa.

Linz Fig. 2.1: DFA Transition Graph

The dfa in Linz Figure 2.1:

What about 0111? 1100?
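Running the dfa's pseudocode on these strings settles the question. A minimal sketch in Python using the transition table above:

```python
# The dfa of Linz Figure 2.1 as a Python dictionary.
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
final = {"q1"}

def accepts(w):
    state = "q0"                         # initial state
    for symbol in w:
        state = delta[(state, symbol)]   # one move per input symbol
    return state in final

print(accepts("0111"))   # True:  q0 -0-> q0 -1-> q1 -1-> q2 -1-> q1
print(accepts("1100"))   # False: q0 -1-> q1 -1-> q2 -0-> q2 -0-> q2
```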

2.1.4 Extended Transition Function for a DFA

Linz Definition: The extended transition function δ* : Q × Σ* → Q is defined recursively:

δ*(q, λ) = q
δ*(q, wa) = δ(δ*(q, w), a)

The extended transition function gives the state of the automaton after reading a string.
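The recursion translates directly into code. A sketch in Python (the table-as-dictionary encoding is mine); the last symbol is peeled off first, exactly as in δ*(q, wa) = δ(δ*(q, w), a):

```python
def delta_star(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    if w == "":                                            # delta*(q, lambda) = q
        return q
    return delta[(delta_star(delta, q, w[:-1]), w[-1])]    # delta(delta*(q, w), a)

# Illustration with the dfa of Linz Figure 2.1:
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
print(delta_star(delta, "q0", "011"))   # 'q2'
```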

2.1.5 Language Accepted by a DFA

Linz Definition 2.2 (Language Accepted by DFA): The language accepted by a dfa M = (Q, Σ, δ, q0, F) is L(M) = { w ∈ Σ* : δ*(q0, w) ∈ F }.

That is, L(M) is the set of all strings on the input alphabet accepted by automaton M.

Note that above δ and δ* are total functions (i.e., defined for all strings).

2.1.6 Linz Example 2.2

The automaton in Linz Figure 2.2 accepts all strings consisting of an arbitrary number of a's followed by a single b.

In set notation, the language accepted by the automaton is L = { a^n b : n ≥ 0 }.

Note that q2 has two self-loop edges, each with a different label. We write these compactly as one edge with multiple labels.

Linz Fig. 2.2: DFA Transition Graph with Trap State

A trap state is a state from which the automaton can never “escape”.

Note that q2 is a trap state in the dfa transition graph shown in Linz Figure 2.2.

Transition graphs are quite convenient for understanding finite automata.

For other purposes, such as representing finite automata in programs, a tabular representation of the transition function δ may also be convenient (as shown in Linz Fig. 2.3).

Linz Fig. 2.3: DFA Transition Table

2.1.7 Linz Example 2.3

Find a deterministic finite accepter that recognizes the set of all strings on Σ = {a, b} starting with the prefix ab.

Linz Figure 2.4 shows a transition graph for a dfa for this example.

Linz Fig. 2.4

The dfa must accept ab and then continue until the string ends.

This dfa has a final trap state q2 (accepts) and a non-final trap state q3 (rejects).

2.1.8 Linz Example 2.4

Find a dfa that accepts all strings on {0, 1}, except those containing the substring 001.

Linz Figure 2.5 shows a dfa for this example.

Linz Fig. 2.5

Accepts: λ, 0, 00, 01, 010000

Rejects: 001, 000001, 0010101010101

2.1.9 Regular Languages

Linz Definition 2.3 (Regular Language): A language L is called regular if and only if there exists a dfa M such that L = L(M).

Thus dfas define the family of languages called regular.

2.1.10 Linz Example 2.5

Show that the language L = { awa : w ∈ {a, b}* } is regular.

Linz Figure 2.6 shows a dfa for this example.

Linz Fig. 2.6

Question: How would we prove that a language is not regular?

We will come back to this question in chapter 4.

2.1.11 Linz Example 2.6

Let LL be the language in the previous example (Linz Example 2.5).

Show that L² is regular.

L² = { aw1aaw2a : w1, w2 ∈ {a, b}* }.

Linz Figure 2.7 shows a dfa for this example.

Linz Fig. 2.7

The last example suggests the conjecture that if a language L is regular, then so are L², L³, etc.

We will come back to this issue in chapter 4.

2.2 Nondeterministic Finite Accepters

2.2.1 Nondeterministic Accepters

Linz Definition 2.4 (NFA): A nondeterministic finite accepter, or nfa, is defined by the tuple M = (Q, Σ, δ, q0, F) where Q, Σ, q0, and F are defined as for deterministic finite accepters, but

δ : Q × (Σ ∪ {λ}) → 2^Q.

Remember for dfas:

The key differences between dfas and nfas are:

  1. dfa: δ yields a single state
    nfa: δ yields a set of states

  2. dfa: consumes input on each move
    nfa: can move without consuming input (a λ-transition)

  3. dfa: has a move defined for every input in every state
    nfa: some situations have no defined move

An nfa accepts a string if some possible sequence of moves ends in a final state.

An nfa rejects a string if no possible sequence of moves ends in a final state.
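One way to decide acceptance without backtracking over the possible move sequences is to track the set of states the nfa could be in after each symbol. A sketch in Python; the transition table at the bottom is a made-up nfa accepting (10)^n, not one of Linz's figures, and λ-moves are encoded with the empty-string symbol:

```python
# Simulate an nfa by tracking the SET of states reachable so far.
# delta maps (state, symbol) -> set of states; symbol "" encodes a lambda-move.
def lambda_closure(delta, states):
    """All states reachable from `states` using lambda-moves alone."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, finals, w):
    current = lambda_closure(delta, {start})
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())   # undefined moves yield {}
        current = lambda_closure(delta, moved)
    return bool(current & finals)   # accept if SOME reachable state is final

# A made-up nfa accepting (10)^n for n >= 0:
delta = {("q0", "1"): {"q1"}, ("q1", "0"): {"q0"}}
print(nfa_accepts(delta, "q0", {"q0"}, "1010"))   # True
print(nfa_accepts(delta, "q0", {"q0"}, "11"))     # False
```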

2.2.2 Linz Example 2.7

Consider the transition graph shown in Linz Figure 2.8.

Linz Fig. 2.8

Note the nondeterminism in state q0, with two possible transitions for input a.

Also, state q3 has no transition for any input.

2.2.3 Linz Example 2.8

Consider the transition graph for an nfa shown in Linz Figure 2.9.

Linz Fig. 2.9

Note the nondeterminism and the λ-transition.

Note: Here λ means the move takes place without consuming any input symbol. This is different from accepting the empty string.

Transitions:

Accepts: λ, 10, 1010, 101010

Rejects: 0, 11

2.2.4 Extended Transition Function for an NFA

As with dfas, the transition function can be extended so its second argument is a string.

Requirement: δ*(qi, w) = Qj, where Qj is the set of all possible states the automaton may be in, having started in state qi and read string w.

Linz Definition (Extended Transition Function): For an nfa, the extended transition function is defined so that δ*(qi, w) contains qj if there is a walk in the transition graph from qi to qj labelled w.

2.2.5 Language Accepted by an NFA

Linz Definition 2.6 (Language Accepted by NFA): The language L accepted by the nfa M = (Q, Σ, δ, q0, F) is defined as

L(M) = { w ∈ Σ* : δ*(q0, w) ∩ F ≠ ∅ }.

That is, L(M) is the set of all strings w for which there is a walk labeled w from the initial vertex of the transition graph to some final vertex.

2.2.6 Linz Example 2.10 (Example 2.8 Revisited)

Let’s again examine the automaton given in Linz Figure 2.9 (Example 2.8).

Linz Fig. 2.9 (Repeated)

This nfa, call it M:

Note that q2 is a dead configuration because δ*(q0, 110) = ∅.

2.2.7 Why Nondeterminism

Why study nondeterminism when real computers are deterministic?

2.3 Equivalence of DFAs and NFAs

2.3.1 Meaning of Equivalence

When are two mechanisms (e.g., programs) equivalent?

The last seems to be the best approach.

Linz Definition 2.7 (DFA-NFA Equivalence): Two finite accepters M1 and M2 are said to be equivalent if L(M1) = L(M2), that is, if they both accept the same language.

Here “same language” refers to the input and “both accept” refers to the output.

Often there are many accepters for a given language.

Thus there are many equivalent dfas and nfas.

2.3.2 Linz Example 2.11

Again consider the nfa represented by the graph in Linz Fig. 2.9. Call this M1.

Linz Fig. 2.9 (Repeated): An NFA

As we saw, L(M1) = { (10)^n : n ≥ 0 }.

Now consider the dfa represented by the graph in Linz Figure 2.11. Call this M2.

Linz Fig. 2.11: DFA Equivalent to Fig. 2.9 NFA

L(M2) = { (10)^n : n ≥ 0 }.

Thus, M1 is equivalent to M2.

2.3.3 Power of NFA versus DFA

Which is more powerful, dfa or nfa?

Clearly, any dfa D can be made into an nfa N.

Can any nfa N be made into a dfa D?

Thus, dfas and nfas are “equally powerful”.

Linz Theorem 2.2 (Existence of DFA Equivalent to NFA): Let L be the language accepted by the nfa MN = (QN, Σ, δN, q0, FN). Then there exists a dfa MD = (QD, Σ, δD, {q0}, FD) such that L = L(MD).

A pure mathematician would be content with an existence proof.

But, as computing scientists, we want an algorithm for construction of MDM_{D} from MNM_{N}. The proof of the theorem follows from the correctness of the following algorithm.

Key ideas:

Remember from the earlier definitions in these notes and from discrete mathematics:

Procedure nfa_to_dfa

Given a transition graph GN for nfa MN = (QN, Σ, δN, q0, FN), construct a transition graph GD for a dfa MD = (QD, Σ, δD, {q0}, FD). Label each vertex in GD with a subset of the vertices in GN.

  1. Initialize graph GD to have an initial vertex {q0}, where q0 is the initial state of GN.

  2. Repeat the following steps until no more edges are missing from GD:

    1. Take any vertex of GD labeled {qi, qj, …, qk} that has no outgoing edge for some a ∈ Σ.

    2. For this vertex and input, compute δN*(qi, a), δN*(qj, a), …, δN*(qk, a). (Each of these is a set of states from QN.)

    3. Let {ql, qm, …, qn} be the union of all the δN* sets formed in the previous step.

    4. If the vertex {ql, qm, …, qn} constructed in the previous step (step 2c) is not already in GD, then add it to GD.

    5. Add an edge to GD from vertex {qi, qj, …, qk} (the vertex selected in step 2a) to vertex {ql, qm, …, qn} (the vertex possibly created in step 2d) and label the new edge with a (the input selected in step 2a).

  3. Make every vertex of GD whose label contains any vertex qf ∈ FN a final vertex of GD.

  4. If MN accepts λ, then vertex {q0} in GD is also a final vertex.

Thus, if the loop terminates, it constructs the dfa corresponding to the nfa.

Does the loop terminate?

What is the loop invariant? (This is a property that must always hold at the loop test.)
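The procedure above can be sketched directly in Python for nfas without λ-moves (a simplification I am making here); each dfa state is a frozenset of nfa states, and the worklist plays the role of the loop in step 2:

```python
# Subset construction: build a dfa transition table from an nfa's table.
# delta_n maps (state, symbol) -> set of nfa states; missing entries mean "no move".
def nfa_to_dfa(delta_n, sigma, q0, finals_n):
    start = frozenset({q0})
    delta_d, seen, worklist = {}, {start}, [start]
    while worklist:                  # terminates: at most 2^|Q_N| distinct vertices
        S = worklist.pop()
        for a in sigma:              # step 2: fill in every missing outgoing edge
            T = frozenset(r for q in S for r in delta_n.get((q, a), set()))
            delta_d[(S, a)] = T
            if T not in seen:        # step 2d: add the vertex if it is new
                seen.add(T)
                worklist.append(T)
    finals_d = {S for S in seen if S & finals_n}   # step 3
    return delta_d, start, finals_d

# A made-up nfa accepting (10)^n, used only as an illustration:
delta_n = {("q0", "1"): {"q1"}, ("q1", "0"): {"q0"}}
delta_d, start, finals_d = nfa_to_dfa(delta_n, "01", "q0", {"q0"})
print(len({S for (S, _) in delta_d}))   # 3 dfa states: {q0}, {q1}, and an empty trap
```

Step 4 of the procedure is unnecessary here only because λ-moves are excluded from this sketch.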

2.3.4 Linz Example 2.12

Convert the nfa in Linz Figure 2.12 to an equivalent dfa.

Intermediate steps are shown in Figures 2.12-1 and 2.12-2, with the final results in Linz Figure 2.13.

Linz Fig. 2.12: An NFA
Intermediate Fig. 2.12-1
Intermediate Fig. 2.12-2
Linz Fig. 2.13: Corresponding DFA

2.3.5 Linz Example 2.13

Convert the nfa shown in Linz Figure 2.14 into an equivalent dfa.

Linz Fig. 2.14: An NFA

δD({q0},0) = δN*(q0,0) = {q0, q1}

δD({q0},1) = δN*(q0,1) = {q1}

δD({q0,q1},0) = δN*(q0,0) ∪ δN*(q1,0) = {q0, q1, q2}

δD({q0,q1,q2},1) = δN*(q0,1) ∪ δN*(q1,1) ∪ δN*(q2,1) = {q1, q2}

The above gives us the partially constructed dfa shown in Linz Figure 2.15.

Linz Fig. 2.15

δD({q1},0) = δN*(q1,0) = {q2}

δD({q1},1) = δN*(q1,1) = {q2}

δD({q2},0) = δN*(q2,0) = ∅

δD({q2},1) = δN*(q2,1) = {q2}

δD({q0,q1},1) = δN*(q0,1) ∪ δN*(q1,1) = {q1, q2}

δD({q0,q1,q2},0) = δN*(q0,0) ∪ δN*(q1,0) ∪ δN*(q2,0) = {q0, q1, q2}

δD({q1,q2},0) = δN*(q1,0) ∪ δN*(q2,0) = {q2}

δD({q1,q2},1) = δN*(q1,1) ∪ δN*(q2,1) = {q2}

Now, the above gives us the dfa shown in Linz Figure 2.16.

Linz Fig. 2.16: Corresponding DFA for NFA

2.4 Reduction in the Number of States in Finite Automata

This section is not covered in this course.

3 Regular Languages and Regular Grammars

Regular languages

Thus we will examine other notations for representing regular languages.

3.1 Regular Expressions

3.1.1 Syntax

We define the syntax (or structure) of regular expressions with an inductive definition.

Linz Definition 3.1 (Regular Expression): Let Σ\Sigma be a given alphabet. Then:

  1. \emptyset, λ\lambda, and aΣa \in \Sigma are all regular expressions. These are called primitive regular expressions.

  2. If r1r_{1} and r2r_{2} are regular expressions, then r1+r2r_{1} + r_{2}, r1r2r_{1} \cdot r_{2}, r1*r_{1}^{*}, and (r1)(r_{1}) are also regular expressions.

  3. A string is a regular expression if and only if it can be derived from the primitive regular expressions by a finite number of applications of the rules in (2).

We use the the regular expression operators as follows:

For example, consider regular expression (a+(bc))*(a + (b \cdot c))^{*} over the alphabet {a,b,c}\{a,b,c\}. Note the use of parentheses.

As with arithmetic expressions, precedence rules and conventions can be used to relax the need for parentheses.

The string (a+b+)(a + b +) is not a regular expression. It cannot be generated using the above definition (as augmented by the precedence rules and conventions).

3.1.2 Languages Associated with Regular Expressions

But what do we “mean” by a regular expression? That is, what is its semantics?

In particular, what languages do regular expressions describe?

Consider the regular expression (a+(bc))*(a + (b \cdot c))^{*} from above. As implied by the names for the operators, we intend this regular expression to represent the language ({a}{bc})*(\{a\} \cup \{bc\})^{*} which is {λ,a,bc,aa,abc,bca,bcbc,aaa,aabc,bcaa,}\{\lambda, a, bc, aa, abc, bca, bcbc, aaa, aabc,bcaa, \ldots \}.

We again give an inductive definition for the language described by a regular expression. It must consider all the cases given in the definition of regular expression itself.

Linz Definition 3.2: The language L(r)L(r) denoted by any regular expression rr is defined (inductively) by the following rules.

Base cases:

  1. \emptyset is a regular expression denoting the empty set.
  2. λ\lambda is a regular expression denoting {λ}\{ \lambda \}.
  3. For every aΣa \in \Sigma, aa is a regular expression denoting {a}\{ a \}.

Inductive cases: If r1r_{1} and r2r_{2} are regular expressions, then

  4. L(r1+r2)=L(r1)L(r2)L(r_{1} + r_{2}) = L(r_{1}) \cup L(r_{2})
  5. L(r1r2)=L(r1)L(r2)L(r_{1} \cdot r_{2}) = L(r_{1})L(r_{2})
  6. L((r1))=L(r1)L((r_{1})) = L(r_{1})
  7. L(r1*)=(L(r1))*L(r_{1}^{*}) = (L(r_{1}))^{*}
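To make the inductive definition concrete, here is a Python sketch that computes a length-bounded approximation of L(r)L(r). The tuple encoding of regular expressions is our own assumption, not from the Linz text; because L(r)L(r) may be infinite, we compute only the strings of length at most nn.

```python
# Regular expressions encoded as nested tuples (a hypothetical encoding):
#   ('empty',)  ('lambda',)  ('sym', 'a')
#   ('union', r1, r2)  ('concat', r1, r2)  ('star', r1)

def lang(r, n):
    """The strings of L(r) with length at most n, per the definition above."""
    tag = r[0]
    if tag == 'empty':                     # denotes the empty set
        return set()
    if tag == 'lambda':                    # denotes { λ }
        return {''}
    if tag == 'sym':                       # denotes { a }
        return {r[1]}
    if tag == 'union':                     # L(r1 + r2) = L(r1) ∪ L(r2)
        return lang(r[1], n) | lang(r[2], n)
    if tag == 'concat':                    # L(r1 · r2) = L(r1)L(r2)
        return {u + v
                for u in lang(r[1], n) for v in lang(r[2], n)
                if len(u + v) <= n}
    if tag == 'star':                      # L(r1*) = (L(r1))*, iterated to a fixed point
        result, inner = {''}, lang(r[1], n)
        while True:
            new = {u + v for u in result for v in inner if len(u + v) <= n}
            if new <= result:
                return result
            result |= new
```

For the Linz Example 3.2 expression a*(a+b)a^{*} \cdot (a + b), `lang` at bound 3 produces exactly the strings of length at most 3 from the set derived in that example.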

3.1.3 Linz Example 3.2

Show the language L(a*(a+b))L(a^{*} \cdot (a + b)) in set notation.

L(a*(a+b))L(a^{*} \cdot (a + b))
== { Rule 5 }
L(a*)L(a+b)L(a^{*})L(a+b)
== { Rule 7 }
(L(a))*L(a+b)(L(a))^{*}L(a+b)
== { Rule 4 }
(L(a))*(L(a)L(b))(L(a))^{*}(L(a) \cup L(b))
== { definition of star-closure of languages }
{λ,a,aa,aaa,}{a,b}\{\lambda, a, aa, aaa, \ldots \}\{a, b\}
== { definition of concatenation of languages }
{a,aa,aaa,...,b,ab,aab,aaab,}\{a, aa, aaa, ..., b, ab, aab, aaab, \ldots \}

3.1.4 Examples of Languages for Regular Expressions

Consider the languages for the following regular expressions.

L(a*ba*b(a+b)*)L(a^{*} \cdot b \cdot a^{*} \cdot b \cdot (a + b)^{*})
== {a}*{b}{a}*{b}{a,b}*\{a\}^{*} \{b\} \{a\}^{*} \{b\} \{a,b\}^{*}
== {w:w{a,b}*,nb(w)2}\{ w : w \in \{ a, b \}^{*}, n_{b}(w) \geq 2 \}
L((a+b)*ba*ba*)L((a + b)^{*} \cdot b \cdot a^{*} \cdot b \cdot a^{*})
== {a,b}*{b}{a}*{b}{a}*\{a,b\}^{*} \{b\} \{a\}^{*} \{b\} \{a\}^{*}
== same as above
L((a+b)*b(a+b)*b(a+b)*)L((a+b)^{*} \cdot b \cdot (a + b)^{*} \cdot b \cdot (a + b)^{*})
== {a,b}*{b}{a,b}*{b}{a,b}*\{a,b\}^{*} \{b\} \{a,b\}^{*} \{b\} \{a,b\}^{*}
== same as above

3.1.5 Linz Example 3.4

Consider the regular expression r=(aa)*(bb)*br = (aa)^{*} (bb)^{*} b.

3.1.6 Linz Example 3.5

For Σ={0,1}\Sigma = \{ 0, 1 \}, give a regular expression rr such that
L(r)={wΣ*:wL(r) = \{w \in \Sigma^{*} : w has at least one pair of consecutive zeros }.

3.1.7 Examples of Regular Expressions for Languages

Show regular expressions on the alphabet {a,b}\{ a, b \} for the following languages.

3.2 Connection Between Regular Expressions and Regular Languages

3.2.1 Regular Expressions Denote Regular Languages

Regular expressions provide a convenient and concise notation for describing regular languages.

Linz Theorem 3.1 (NFAs for Regular Expressions): Let rr be a regular expression. Then there exists some nondeterministic finite accepter (nfa) that accepts L(r)L(r). Consequently, L(r)L(r) is a regular language.

Proof Sketch: Show that any regular expression generated from the inductive definition corresponds to an nfa. Here we proceed informally.

Linz Figure 3.1 diagrammatically demonstrates that there are nfas that correspond to the primitive regular expressions.

  1. nfa accepts \emptyset
  2. nfa accepts {λ}\{ \lambda \}
  3. nfa accepts {a}\{ a \}
Linz Fig. 3.1: Primitive Regular Expressions as NFA

Linz Figure 3.2 shows a general scheme for a nondeterministic finite accepter (nfa) that accepts L(r)L(r), with an initial state and one final state.

Linz Fig. 3.2: Scheme for NFA Accepting L(r)

Linz Figure 3.3 gives an nfa for L(r1+r2)L(r_{1} + r_{2}). Note the use of λ\lambda-transitions to connect the two machines to the new initial and final states.

Linz Fig. 3.3: NFA for Union

Linz Figure 3.4 shows an nfa for L(r1r2)L(r_{1} r_{2}). Again note the use of λ\lambda-transitions to connect the two machines to the new initial and final states.

Linz Fig. 3.4: NFA for Concatenation

Linz Figure 3.5 shows an nfa for L(r1*)L(r_{1}^{*}). Note the use of λ\lambda-transitions to represent zero-or-more repetitions of the machine and to connect it to the new initial and final states.

Linz Fig. 3.5: NFA for Star-Closure

Thus, Linz Figures 3.3 to 3.5 illustrate how to compose an nfa for any regular expression from the nfas for its subexpressions. Of course, the initial and final states of the components are replaced by the initial and final states of the composite nfa.
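The composition scheme of Figures 3.1 to 3.5 can be sketched in Python. The encoding of an nfa as a set of labeled edges with fresh integer states is our own assumption, not Linz's notation; 'λ' labels the λ-transitions.

```python
import itertools

_fresh = itertools.count()   # supply of fresh state numbers

def nfa_sym(a):              # Fig. 3.1: accepts {a} (pass 'λ' for {λ})
    s, f = next(_fresh), next(_fresh)
    return {'edges': {(s, a, f)}, 'start': s, 'final': f}

def nfa_union(m1, m2):       # Fig. 3.3: λ-edges glue both machines in parallel
    s, f = next(_fresh), next(_fresh)
    edges = m1['edges'] | m2['edges'] | {
        (s, 'λ', m1['start']), (s, 'λ', m2['start']),
        (m1['final'], 'λ', f), (m2['final'], 'λ', f)}
    return {'edges': edges, 'start': s, 'final': f}

def nfa_concat(m1, m2):      # Fig. 3.4: λ-edge chains the machines in series
    edges = m1['edges'] | m2['edges'] | {(m1['final'], 'λ', m2['start'])}
    return {'edges': edges, 'start': m1['start'], 'final': m2['final']}

def nfa_star(m):             # Fig. 3.5: λ-edges allow zero or more repetitions
    s, f = next(_fresh), next(_fresh)
    edges = m['edges'] | {(s, 'λ', m['start']), (m['final'], 'λ', f),
                          (m['final'], 'λ', m['start']), (s, 'λ', f)}
    return {'edges': edges, 'start': s, 'final': f}

def accepts(m, w):
    """Simulate the nfa on w, taking λ-closures between input symbols."""
    def closure(states):
        states = set(states)
        while True:
            more = {t for (q, a, t) in m['edges'] if a == 'λ' and q in states}
            if more <= states:
                return states
            states |= more
    current = closure({m['start']})
    for c in w:
        current = closure({t for (q, a, t) in m['edges'] if a == c and q in current})
    return m['final'] in current
```

For instance, composing the machine for (a+bb)*(a + bb)^{*} takes one call each to `nfa_star`, `nfa_union`, and `nfa_concat` over three `nfa_sym` machines.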

3.2.2 Linz Example 3.7

Show an nfa that accepts r=(a+bb)*(ba*+λ)r = (a + bb)^{*}(ba^{*} + \lambda).

Linz Figure 3.6, part (a), shows M1M_{1} that accepts L(a+bb)L(a+bb). Part (b) shows M2M_{2} that accepts L(ba*+λ)L(ba^{*} + \lambda).

Linz Fig. 3.6: Toward a Solution to Ex. 3.7

Linz Figure 3.7 shows an nfa that accepts L((a+bb)*(ba*+λ))L((a + bb)^{*}(ba^{*} + \lambda)).

Linz Fig. 3.7: Solution for Ex. 3.7

3.2.3 Converting Regular Expressions to Finite Automata

The construction in the proof sketch and the example above suggests an algorithm for converting regular expressions to nfas.

This algorithm is adapted from pages 273-4 of the book: James L. Hein, Theory of Computation: An Introduction, Jones and Bartlett, 1996.

The diagrams in this section are from the Hein book, which uses a slightly different notation than the Linz book. In particular, these diagrams use capital letters for the expressions.

Algorithm to convert a regular expression to an nfa

End of Algorithm

Rule 2 in the above algorithm can result in an unbounded number of edges originating at the same state. This makes the algorithm difficult to implement. To remedy this situation, replace Rule 2 as follows.

  1. If an edge is labeled with r+sr + s, then replace the edge with subgraphs for each of rr and ss. The subgraph for rr consists of a new source state connected to a new destination state by an edge labeled rr. Add edges labeled λ\lambda to connect the original source state to the new source state and the new destination state to the original destination state. Proceed similarly for ss.

3.2.4 Example Conversion of Regular Expression to NFA

This example is from page 275 of the Hein textbook cited above.

Construct an nfa for a*+aba^{*} + a \cdot b.

Start with the two-state initial diagram.

Next, apply Rule 2 to a*+aba^{*} + a \cdot b.

Next, apply Rule 4 to a*a^{*}.

Finally, apply Rule 3 to aba \cdot b.

3.2.5 Converting Finite Automata to Regular Expressions

The construction in the proof sketch and example above suggests an algorithm for converting finite automata to regular expressions.

This algorithm is adapted from page 276 of the book: James L. Hein, Theory of Computation: An Introduction, Jones and Bartlett, 1996.

Algorithm to convert a finite automaton to a regular expression

Begin with a dfa or an nfa.

  1. Create a new start state ss and connect this to the original start state with an edge labeled λ\lambda.

  2. Create a new final state ff and connect the original final states to this state by edges labeled λ\lambda.

  3. For each pair of states ii and jj that has more than one edge connecting them, replace all the edges with the regular expression formed using union (++) to combine the labels on the previous edges.

  4. Construct a sequence of new machines by eliminating one state at a time until the only states remaining are ss and ff. To eliminate some state kk, construct a new machine as follows.

After eliminating all states except ss and ff, the regular expression is the label on the one edge remaining.

End of Algorithm

3.2.6 Example Conversion of Finite Automata to Regular Expressions

This example is from pages 277-8 of the Hein textbook cited above.

Consider the following dfa.

After applying Rule 1 (new start state), Rule 2 (new final state), and Rule 3 (create union), we get the following machine.

We can eliminate state 2 readily because it is a trap state. That is, there is no path through 2 between edges adjacent to 2, so new(i,j)(i,j) == old(i,j)(i,j) for any states i2i \neq 2 and j2j \neq 2. The resulting machine is as follows.

To eliminate state 0, we construct a new edge that is labeled as follows:

Thus, we can eliminate state 0 and its edges and add a new edge (s,1)(s,1) labeled aa.

We can eliminate state 1 by adding a new edge (s,f)(s,f) labeled as follows:

Thus the regular expression is a(a+b)*a(a + b)^{*}.

3.2.7 Another Example Conversion of Finite Automata to Regular Expressions

This example is from pages 277-8 of the Hein textbook cited above.

Consider the following dfa. Verify that it corresponds to the regular expression (a+b)*abb(a+b)^{*}abb.

Applying Rules 1 and 2 (adding new start and final states), we get the following machine.

To eliminate state 0, we add the following new edges.

We can eliminate either state 2 or state 3 next. Let’s choose 3. Thus we create the following new edges.

Now we eliminate state 2 and thus create the new edges.

Finally, we remove state 1 by creating a new edge.

3.2.8 Regular Expressions for Describing Simple Patterns

Pascal integer constants

Regular expression sdd*sdd^{*}, where ss denotes a sign ('+', '-', or λ\lambda) and dd denotes a decimal digit.

Pattern matching

Program for Pattern Matching

We can convert a regular expression to an equivalent nfa, the nfa to a dfa, and the dfa to a transition table. We can use the transition table to drive a program for pattern matching.

For a more efficient program, we can apply the state reduction algorithm to the dfa before converting it to a transition table. Linz Section 2.4, which we did not cover this semester, discusses this algorithm.
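The last step of this pipeline can be sketched as a table-driven matcher in Python. The table encoding (a dict from (state, symbol) pairs to states, with missing entries meaning rejection) is our own assumption.

```python
def matches(table, start, finals, w):
    """Drive a dfa from its transition table, accepting w iff the
    run ends in a final state without falling off the table."""
    state = start
    for c in w:
        state = table.get((state, c))
        if state is None:           # no transition: the dfa rejects
            return False
    return state in finals
```

For example, a hypothetical table for the Pascal integer pattern sdd*sdd^{*} uses state 0 (start), state 1 (after a sign), and state 2 (after one or more digits, final).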

3.3 Regular Grammars

We have studied two ways of describing regular languages: finite automata (i.e., dfas and nfas) and regular expressions. Here we examine a third way: regular grammars.

Linz Definition 3.3 (Right-Linear Grammar): A grammar G=(V,T,S,P)G = (V,T,S,P) is said to be right-linear if all productions are of one of the forms

AxBA \rightarrow xB
AxA \rightarrow x

where A,BVA, B \in V and xT*x \in T^{*}.

Similarly, a grammar is said to be left-linear if all productions are of the form ABxA \rightarrow Bx or AxA \rightarrow x.

A regular grammar is one that is either right-linear or left-linear.

3.3.1 Linz Example 3.13

The grammar G1=({S},{a,b},S,P1)G_{1} = (\{ S \}, \{ a, b \}, S, P_{1}), with P1P_{1} given as

is right-linear.

The grammar G2=({S,S1,S2},{a,b},S,P2)G_{2} = ( \{ S, S_{1}, S_{2} \}, \{ a, b \}, S, P_{2} ), with productions

is left-linear. Both G1G_{1} and G2G_{2} are regular grammars.

L(G1)=L((ab)*a)L(G_{1}) = L((ab)^{*}a)

L(G2)=L(aab(ab)*)L(G_{2}) = L(aab(ab)^{*})

3.3.2 Linz Example 3.14

The grammar G=({S,A,B},{a,b},S,P)G = (\{ S, A, B \}, \{a, b \}, S, P) with productions

is not regular.

Although every production is either in right-linear or left-linear form, the grammar itself is neither right-linear nor left-linear, and therefore is not regular. The grammar is an example of a linear grammar.

Definition (Linear Grammar): A linear grammar is a grammar in which at most one variable can appear on the right side of any production.

3.3.3 Right-Linear Grammars Generate Regular Languages

Linz Theorem 3.3 (Regular Languages for Right-Linear Grammars): Let G=(V,T,S,P)G = (V, T, S, P) be a right-linear grammar. Then L(G)L(G) is a regular language.

Strategy: Because a regular language is any language accepted by a dfa or nfa, we seek to construct an nfa that simulates the derivations of the right-linear grammar.

The algorithm below incorporates this idea. It is based on the algorithm given on page 314 of the book: James L. Hein, Theory of Computation: An Introduction, Jones and Bartlett, 1996.

Algorithm to convert a regular grammar to an nfa

Start with a right-linear grammar and construct an equivalent nfa. We label the nfa’s states primarily with variables from the grammar and label edges with terminals in the grammar or λ\lambda.

  1. If necessary, transform the grammar so that all productions have the form AxA \rightarrow x or AxBA \rightarrow xB, where xx is either a terminal in the grammar or λ\lambda.

  2. Label the start state of the nfa with the start symbol of the grammar.

  3. For each production IaJI \rightarrow aJ, add a state transition (edge) from a state II to a state JJ with the edge labeled with the symbol aa.

  4. For each production IJI \rightarrow J, add a state transition (edge) from a state II to a state JJ with the edge labeled with λ\lambda.

  5. If there exist productions of the form IaI \rightarrow a, then add a single new state symbol FF. For each production of the form IaI \rightarrow a, add a state transition from II to FF labeled with symbol aa.

  6. The final states of the nfa are FF plus all II such that there is a production of the form IλI \rightarrow \lambda.

End of algorithm
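The steps above can be sketched in Python. The (head, body) encoding of productions is our own assumption; it presumes single-capital-letter variables and a grammar already in step-1 form.

```python
def grammar_to_nfa(productions, start):
    """Build an nfa (edge set, start state, final states) from a
    right-linear grammar given as (head, body) pairs."""
    edges, finals = set(), set()
    for head, body in productions:
        if body == 'λ':                    # step 6: I -> λ makes I final
            finals.add(head)
        elif len(body) == 2:               # step 3: I -> aJ gives edge I -a-> J
            edges.add((head, body[0], body[1]))
        elif body.isupper():               # step 4: I -> J gives edge I -λ-> J
            edges.add((head, 'λ', body))
        else:                              # step 5: I -> a goes to new final F
            edges.add((head, body, 'F'))
            finals.add('F')
    return {'edges': edges, 'start': start, 'finals': finals}
```

For instance, the hypothetical right-linear grammar SaS|bAS \rightarrow aS\ |\ bA, AaA|aA \rightarrow aA\ |\ a (which generates L(a*ba*a)L(a^{*}ba^{*}a)) yields an nfa with edges for steps 3 and 5 and the single final state FF.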

3.3.4 Example: Converting Regular Grammar to NFA

Construct an nfa for the following regular grammar GG:

The grammar is already in the correct form, so step 1 of the algorithm is not applicable. The following sequence of diagrams shows the use of steps 2, 3 (three times), 5, and 6 of the algorithm. Step 4 is not applicable to this grammar.

Note that L(G)=L(a*ba*a)L(G) = L(a^{*}ba^{*}a).

3.3.5 Linz Example 3.15

This is similar to the example in the Linz textbook, but we apply the algorithm as stated above.

Construct an nfa for the regular grammar GG:

First, let’s transform the grammar according to step 1 of the regular grammar to nfa algorithm above.

Applying steps 2, 3 (three times), 5, and 6 of the algorithm as shown below, we construct the following nfa. Step 4 is not applicable in this problem.

Note that L(G)=L((aab)*ab)L(G) = L((aab)^{*}ab).

3.3.6 Right-Linear Grammars for Regular Languages

Linz Theorem 3.4 (Right-Linear Grammars for Regular Languages): If LL is a regular language on the alphabet Σ\Sigma, then there exists a right-linear grammar G=(V,Σ,S,P)G = (V, \Sigma, S, P) such that L=L(G)L = L(G).

Strategy: Reverse the construction of an nfa from a regular grammar given above.

The algorithm below incorporates this idea. It is based on the algorithm given on page 312 of the Hein textbook cited above.

Algorithm to convert an nfa to a regular grammar

Start with an nfa and construct a regular grammar.

  1. Relabel the states of the nfa with capital letters.

  2. Make the start state label the start symbol for the grammar.

  3. For each transition (edge) from a state II to a state JJ labeled with an alphabetic symbol aa, add a production IaJI \rightarrow aJ to the grammar.

  4. For each transition (edge) from a state II to a state JJ labeled with λ\lambda, add a production IJI \rightarrow J to the grammar.

  5. For each final state labeled KK, add a production KλK \rightarrow \lambda to the grammar.

End of algorithm
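The steps above can be sketched in Python, with the nfa given as a set of (state, symbol, state) edges and states assumed already relabeled as capital letters (step 1). The encoding of productions as (head, body) pairs is our own assumption.

```python
def nfa_to_grammar(edges, start, finals):
    """Build a right-linear grammar (start symbol, production list)
    from an nfa given as labeled edges, a start state, and final states."""
    productions = []
    for i, a, j in sorted(edges):
        # step 3: I -a-> J gives I -> aJ;  step 4: a λ-edge gives I -> J
        productions.append((i, j) if a == 'λ' else (i, a + j))
    for k in sorted(finals):               # step 5: each final K gets K -> λ
        productions.append((k, 'λ'))
    return start, productions
```

Running it on the two-state nfa with edges SaSS \stackrel{a}{\rightarrow} S and SbFS \stackrel{b}{\rightarrow} F (final FF) gives the productions SaSS \rightarrow aS, SbFS \rightarrow bF, FλF \rightarrow \lambda.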

3.3.7 Example: Converting NFA to Regular Grammar

Consider the following nfa (adapted from the Hein textbook page 313). (The Hein book uses Λ\Lambda instead of λ\lambda to label silent moves and empty strings.)

We apply the steps of the algorithm as follows.

  1. The nfa states are already labeled as specified.

  2. Choose SS as start symbol for grammar.

  3. Add the following productions:

  4. Add the following production:

  5. Add the following production:

So, combining the above productions, we get the final grammar:

3.3.8 Equivalence Between Regular Languages and Regular Grammars

Linz Theorem 3.5 (Equivalence of Regular Languages and Left-Linear Grammars): A language LL is regular if and only if there exists a left-linear grammar GG such that L=L(G)L=L(G).

Linz Theorem 3.6 (Equivalence of Regular Languages and Regular Grammars): A language LL is regular if and only if there exists a regular grammar GG such that L=L(G)L = L(G).

The four theorems from this section enable us to convert back and forth between finite automata and regular grammars, as shown in Linz Figure 3.19. Remember that Linz Theorem 2.2 enables us to translate from an nfa to a dfa.

Linz Fig. 3.19: Equivalence of Regular Languages and Regular Grammars

4 Properties of Regular Languages

The questions answered in this chapter include:

The concepts introduced in this chapter are:

4.1 Closure Properties of Regular Languages

4.1.1 Mathematical Interlude: Operations and Closure

Definition (Operation): An operation is a function p:VYp : V \rightarrow Y where VX1×X2××XkV \subseteq X_{1} \times X_{2} \times \cdots \times X_{k} for some sets XiX_{i} with 1ik1 \leq i \leq k. kk is the number of operands (or arguments) of the operation.

We often use special notation and conventions for unary and binary operations. For example:

Often we consider operations on a set, where all the operands and the result are drawn from the same set.

Definition (Closure): A set SS is closed under a unary operation pp if, for all xSx \in S, p(x)Sp(x) \in S. Similarly, a set SS is closed under a binary operation \odot if, for all xSx \in S and ySy \in S, xySx \odot y \in S.

Examples: consider arithmetic on the set of natural numbers (={0,1,...}\mathbb{N} = \{0, 1, ...\}). The natural numbers are closed under addition and multiplication, but not under subtraction, negation, or division.

However, the set of integers is closed under subtraction and negation. But it is not closed under division or square root (as we normally define the operations).

Now, let’s consider closure of the set of regular languages with respect to the simple set operations.

4.1.2 Closure under Simple Set Operations

Linz Theorem 4.1 (Closure under Simple Set Operations): If L1L_{1} and L2L_{2} are regular languages, then so are L1L2L_{1} \cup L_{2}, L1L2L_{1} \cap L_{2}, L1L2L_{1}L_{2}, L1\bar{L_{1}}, and L1*L_{1}^{*}.

That is, we say that the family of regular languages is closed under union, intersection, concatenation, complementation, and star-closure.

Proof of L1L2L_{1} \cup L_{2}

Let L1L_{1} and L2L_{2} be regular languages.

L1L2L_{1} \cup L_{2}
== { Th. 3.2: there exist regular expressions r1r_{1}, r2r_{2} }
L(r1)L(r2)L(r_{1}) \cup L(r_{2})
== { Def. 3.2, rule 4 }
L(r1+r2)L(r_{1} + r_{2})

Thus, by Theorem 3.1 (regular expressions describe regular languages), the union is a regular language.

Thus L1L2L_{1} \cup L_{2} is a regular language. QED.

Proofs of L1L2L_{1}L_{2} and L1*L_{1}^{*}

Similar to the proof of L1L2L_{1} \cup L_{2}.

Proof of L1\bar{L_{1}}

Strategy: Given a dfa MM for the regular language, construct a new dfa M̂\widehat{M} that accepts everything rejected and rejects everything accepted by the given dfa.

L1L_{1} is a regular language on Σ\Sigma.
\equiv { Def. 2.3 }
\exists dfa M=(Q,Σ,δ,q0,F)M = (Q,\Sigma,\delta,q_{0},F) such that L(M)=L1L(M)=L_{1}.

Thus

ωΣ*\omega \in \Sigma^{*}
\Rightarrow { by the properties of dfas and sets }
Either δ*(q0,ω)F\delta^{*}(q_{0},\omega)\in F or δ*(q0,ω)QF\delta^{*}(q_{0},\omega)\in Q-F
\Rightarrow { Def. 2.2: language accepted by dfa }
Either ωL(M)\omega \in L(M) or ωL(M̂)\omega \in L(\widehat{M}) for some dfa M̂\widehat{M}

Let’s construct dfa M̂=(Q,Σ,δ,q0,QF)\widehat{M} = (Q, \Sigma, \delta, q_{0}, Q-F).

Clearly, L(M̂)=L1L(\widehat{M}) = \bar{L_{1}}. Thus L1\bar{L_{1}} is a regular language. QED.

Proof of L1L2L_{1} \cap L_{2}

Strategy: Given two dfas for the two regular languages, construct a new dfa that accepts a string if and only if both original dfas accept the string.

Let L1=L(M1)L_{1} = L(M_{1}) and L2=L(M2)L_{2} = L(M_{2}) for dfas:

M1=(Q,Σ,δ1,q0,F1)M_{1} = (Q, \Sigma, \delta_{1}, q_{0}, F_{1})

M2=(P,Σ,δ2,p0,F2)M_{2} = (P, \Sigma, \delta_{2}, p_{0}, F_{2})

Construct M̂=(Q̂,Σ,δ̂,(q0,p0),F̂)\widehat{M} = (\widehat{Q}, \Sigma, \widehat{\delta}, (q_{0}, p_{0}), \widehat{F}), where

Q̂=Q×P\widehat{Q} = Q \times P

δ̂((qi,pj),a)=(qk,pl)\widehat{\delta}((q_{i}, p_{j}), a) = (q_{k}, p_{l}) when

δ1(qi,a)=qk\delta_{1}(q_{i}, a) = q_{k}

δ2(pj,a)=pl\delta_{2}(p_{j}, a) = p_{l}

F̂={(q,p):qF1,pF2}\widehat{F} = \{ (q,p) : q \in F_{1}, p \in F_{2} \}

Clearly, ωL1L2\omega \in L_{1} \cap L_{2} if and only if ω\omega accepted by M̂\widehat M.

Thus, L1L2L_{1} \cap L_{2} is regular. QED.
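The product construction from this proof can be sketched in Python. The encoding of each dfa by a transition dict mapping (state, symbol) to a state is our own assumption.

```python
def intersect_dfas(Q, P, sigma, d1, d2, q0, p0, F1, F2):
    """Build the product dfa accepting L(M1) ∩ L(M2): states are pairs,
    transitions move both components, finals require both components final."""
    delta = {((q, p), a): (d1[q, a], d2[p, a])
             for q in Q for p in P for a in sigma}
    finals = {(q, p) for q in F1 for p in F2}
    return delta, (q0, p0), finals

def dfa_accepts(delta, start, finals, w):
    """Run a dfa given as a transition dict on the string w."""
    state = start
    for a in w:
        state = delta[state, a]
    return state in finals
```

As a check, intersecting a dfa for "even number of aa's" with one for "ends in bb" accepts exactly the strings with both properties.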

The previous proof is constructive.

Sometimes nonconstructive proofs are shorter and easier to understand. But they provide no algorithm.

Alternate (nonconstructive) proof for L1L2L_{1} \cap L_{2}

L1L_{1} and L2L_{2} are regular.
\equiv { previously proved part of Theorem 4.1 }
L1\bar{L_{1}} and L2\bar{L_{2}} are regular.
\Rightarrow { previously proved part of Theorem 4.1 }
L1L2\bar{L_{1}} \cup \bar{L_{2}} is regular
\Rightarrow { previously proved part of Theorem 4.1 }
L1L2¯\overline{\bar{L_{1}} \cup \bar{L_{2}}} is regular
\equiv { deMorgan’s Law for sets }
L1L2L_{1} \cap L_{2} is regular

QED.

4.1.3 Closure under Difference (Linz Example 4.1)

Consider the difference between two regular languages L1L_{1} and L2L_{2}, written L1L2L_{1} - L_{2}.

But this is just set difference, which is defined L1L2=L1L2L_{1} - L_{2} = L_{1} \cap \bar{L_{2}}.

From Theorem 4.1 above, we know that regular languages are closed under both complementation and intersection. Thus, regular languages are closed under difference as well.

4.1.4 Closure under Reversal

Linz Theorem 4.2 (Closure under Reversal): The family of regular languages is closed under reversal.

Proof (constructive)

Strategy: Construct an nfa for the regular language and then reverse all the edges and exchange roles of the initial and final states.

Let L1L_{1} be a regular language. Construct an nfa MM such that L1=L(M)L_{1} = L(M) and MM has a single final state. (We can add λ\lambda transitions from the previous final states to create a single new final state.)

Now construct a new nfa M̂\hat{M} as follows.

Thus nfa M̂\hat{M} accepts ωRΣ*\omega^{R} \in \Sigma^{*} if and only if the original nfa accepts ωΣ*\omega \in \Sigma^{*}. QED.
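The construction itself is tiny when sketched in code. Assuming an nfa already converted to have a single final state and encoded as a set of (state, symbol, state) edges (our own encoding), the Theorem 4.2 machine is:

```python
def reverse_nfa(edges, start, final):
    """Flip every edge and exchange the roles of the initial and
    (single) final state, per the proof of Theorem 4.2."""
    return {(j, a, i) for (i, a, j) in edges}, final, start
```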

4.1.5 Homomorphism Definition

In mathematics, a homomorphism is a mapping between two mathematical structures that preserves the essential structure.

Linz Definition 4.1 (Homomorphism): Suppose Σ\Sigma and Γ\Gamma are alphabets. A function

h:ΣΓ*h: \Sigma \rightarrow \Gamma^{*}

is called a homomorphism.

In words, a homomorphism is a substitution in which a single letter is replaced with a string.

We can extend the domain of a function hh to strings in an obvious fashion. If

w=a1a2anw = a_{1}a_{2} \ \cdots\ a_{n} for n0n \geq 0

then

h(w)=h(a1)h(a2)h(an)h(w) = h(a_{1})h(a_{2}) \ \cdots\ h(a_{n}).

If LL is a language on Σ\Sigma, then we define its homomorphic image as

h(L)={h(w):wL}h(L) = \{ h(w) : w \in L \}.

Note: The homomorphism function hh preserves the essential structure of the language. In particular, it preserves operation concatenation on strings, i.e., h(λ)=λh(\lambda) = \lambda and h(uv)=h(u)h(v)h(uv) = h(u)h(v).

4.1.6 Linz Example 4.2

Let Σ={a,b}\Sigma = \{ a, b \} and Γ={a,b,c}\Gamma = \{ a, b, c \}.

Define hh as follows:

h(a)=abh(a) = ab,

h(b)=bbch(b) = bbc

Then h(aba)=abbbcabh(aba) = abbbcab.

The homomorphic image of L={aa,aba}L = \{ aa, aba \} is the language h(L)={abab,abbbcab}h(L) = \{ abab, abbbcab \}.
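Definition 4.1 and its extension to strings and languages can be sketched in Python, with hh given as a dict from symbols of Σ\Sigma to strings over Γ\Gamma (our own encoding) and LL restricted to a finite language for this sketch.

```python
def h_image(h, L):
    """The homomorphic image h(L) of a finite language L."""
    def h_str(w):
        return ''.join(h[a] for a in w)    # h(w) = h(a1) h(a2) ... h(an)
    return {h_str(w) for w in L}
```

With the Example 4.2 mapping h(a)=abh(a) = ab, h(b)=bbch(b) = bbc, the image of {aa,aba}\{ aa, aba \} is {abab,abbbcab}\{ abab, abbbcab \} as computed above.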

If we have a regular expression rr for a language LL, then a regular expression for h(L)h(L) can be obtained by simply applying the homomorphism to each Σ\Sigma symbol of rr. We show this in the next example.

4.1.7 Linz Example 4.3

For Σ={a,b}\Sigma = \{ a,b \} and Γ={b,c,d}\Gamma = \{ b, c, d \}, define hh:

h(a)=dbcch(a) = dbcc

h(b)=bdch(b) = bdc

If LL is a regular language denoted by the regular expression

r=(a+b*)(aa)*r = (a + b^{*})(aa)^{*}

then

r1=(dbcc+(bdc)*)(dbccdbcc)*r_{1} = (dbcc + (bdc)^{*})(dbccdbcc)^{*}

denotes the regular language h(L)h(L).

The general result on the closure of regular languages under any homomorphism follows from this example in an obvious manner.

4.1.8 Closure under Homomorphism Theorem

Linz Theorem 4.3 (Closure under Homomorphism): Let hh be a homomorphism. If LL is a regular language, then its homomorphic image h(L)h(L) is also regular.

Proof: Similar to the argument in Example 4.3. See Linz textbook for full proof.

The family of regular languages is therefore closed under arbitrary homomorphisms.

4.1.9 Right Quotient Definition

Linz Definition 4.2 (Right Quotient): Let L1L_{1} and L2L_{2} be languages on the same alphabet. Then the right quotient of L1L_{1} with L2L_{2} is defined as

L1/L2={x:xyL1L_{1} / L_{2} = \{ x : xy \in L_{1} for some yL2}y \in L_{2} \}

4.1.10 Linz Example 4.4

Given languages L1L_{1} and L2L_{2} such that

L1={anbm:n1,m0}{ba}L_{1} = \{ a^{n}b^{m} : n \geq 1, m \geq 0 \} \cup \{ ba \}

L2={bm:m1}L_{2} = \{ b^{m} : m \geq 1 \}

Then

L1/L2={anbm:n1,m0}L_{1} / L_{2} = \{ a^{n}b^{m} : n \geq 1, m \geq 0 \}.

The strings in L2L_{2} consist of one or more bb’s. Therefore, we arrive at the answer by removing one or more bb’s from those strings in L1L_{1} that terminate with at least one bb as a suffix.

Note that in this example L1L_{1}, L2L_{2}, and L1/L2L_{1} / L_{2} are regular.

Can we construct a dfa for L1/L2L_{1} / L_{2} from dfas for L1L_{1} and L2L_{2}?

Linz Figure 4.1 shows a dfa M1M_{1} that accepts L1L_{1}.

Linz Fig. 4.1: DFA for Example 4.4 L_{1}

An automaton for L1/L2L_{1} / L_{2} must accept any xx such that xyL1xy \in L_{1} and yL2y \in L_{2}.

For all states qM1q \in M_{1}, if there exists a walk labeled vv from qq to a final state qfq_{f} such that vL2v \in L_{2}, then make qq a final state of the automaton for L1/L2L_{1} / L_{2}.

In this example, we check states to see if there is a bb*bb^{*} walk to any of the final states q1q_{1}, q2q_{2}, or q4q_{4}.

The resulting automaton is shown in Linz Figure 4.2.

Linz Fig. 4.2: DFA for Example 4.4 L_{1} / L_{2} EXCEPT q_{4} NOT FINAL

The next theorem generalizes this construction.

4.1.11 Closure under Right Quotient

Linz Theorem 4.4 (Closure under Right Quotient): If L1L_{1} and L2L_{2} are regular languages, then L1/L2L_{1} / L_{2} is also regular. We say that the family of regular languages is closed under right quotient with a regular language.

Proof

Let dfa M=(Q,Σ,δ,q0,F)M = (Q, \Sigma, \delta, q_{0}, F) such that L(M)=L1L(M) = L_{1}.

Construct dfa M̂=(Q,Σ,δ,q0,F̂)\widehat{M} = (Q, \Sigma, \delta, q_{0}, \widehat{F}) for L1/L2L_{1}/L_{2} as follows.

For all qiQq_{i} \in Q, let dfa Mi=(Q,Σ,δ,qi,F)M_{i} = (Q, \Sigma, \delta, q_{i}, F). That is, dfa MiM_{i} is the same as MM except that it starts at qiq_{i}.

Does L(M̂)=L1/L2L(\widehat{M}) = L_{1} / L_{2}?

First, let xL1/L2x \in L_{1} / L_{2}.

Now, let xx be accepted by M̂\widehat{M}.

Thus L(M̂)=L1/L2L(\widehat{M}) = L_{1} / L_{2}, which means L1/L2L_{1} / L_{2} is regular.
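The construction of F̂\widehat{F} can be sketched in Python, deciding whether L(Mi)L2L(M_{i}) \cap L_{2} \neq \emptyset by a reachability search over pairs of states of MM and a dfa for L2L_{2}. The dict encoding of transition functions is our own assumption.

```python
def right_quotient_finals(Q, sigma, d1, F1, d2, p0, F2):
    """F̂ for the Theorem 4.4 construction: qi ∈ F̂ iff some string y
    with y ∈ L2 drives M from qi into F1 (i.e., L(Mi) ∩ L2 ≠ ∅)."""
    finals = set()
    for qi in Q:
        seen, stack = {(qi, p0)}, [(qi, p0)]
        while stack:                       # explore the product automaton
            q, p = stack.pop()
            if q in F1 and p in F2:        # a witness string reaches both finals
                finals.add(qi)
                break
            for a in sigma:
                nxt = (d1[q, a], d2[p, a])
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return finals
```

Checking it against Example 4.5 below (with states numbered 0-3 for the a*baa*a^{*}baa^{*} dfa, state 3 a trap, and 0-2 for an ab*ab^{*} dfa, state 2 a trap) marks exactly states 1 and 2 as final, matching the MiM_{i} computations there.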

4.1.12 Linz Example 4.5

Find L1/L2L_{1} / L_{2} for

L1=L(a*baa*)L_{1} = L(a^{*}baa^{*})

L2=L(ab*)L_{2} = L(ab^{*})

We apply the construction (algorithm) used in the proof of Theorem 4.4.

Linz Figure 4.3 shows a dfa for L1L_{1}.

Linz Fig. 4.3: DFA for Example 4.5 L_{1}

Let M=(Q,Σ,δ,q0,F)M = (Q, \Sigma, \delta, q_{0}, F).

Thus if we construct the sequence of machines MiM_{i}

L(M0)L2=L(M_{0}) \cap L_{2} = \emptyset

L(M1)L2={a}L(M_{1}) \cap L_{2} = \{a\} \neq \emptyset

L(M2)L2={a}L(M_{2}) \cap L_{2} = \{a\} \neq \emptyset

L(M3)L2=L(M_{3}) \cap L_{2} = \emptyset

then the resulting dfa for L1/L2L_{1} / L_{2} is shown in Linz Figure 4.4.

Linz Fig. 4.4: DFA for Example 4.5 L_{1} / L_{2}

The automaton shown in Figure 4.4 accepts the language denoted by the regular expression

a*b+a*baa*a^{*}b + a^{*}baa^{*}

which can be simplified to

a*ba*a^{*}ba^{*}

4.2 Elementary Questions about Regular Languages

4.2.1 Membership?

Fundamental question: Is wLw \in L?

It is difficult to find a membership algorithm for languages in general. But it is relatively easy to do for regular languages.

A regular language is given in a standard representation if and only if it is described by one of a finite automaton, a regular expression, or a regular grammar.

Linz Theorem 4.5 (Membership): Given a standard representation of any regular language LL on Σ\Sigma and any wΣ*,w \in \Sigma^{*}, there exists an algorithm for determining whether or not ww is in LL.

Proof

We represent the language by some dfa, then test ww to see if it is accepted by this automaton. QED.

4.2.2 Finite or Infinite?

Linz Theorem 4.6 (Finiteness): There exists an algorithm for determining whether a regular language, given in standard representation, is empty, finite, or infinite.

Proof

Represent LL as a transition graph of a dfa. Then LL is nonempty if and only if there is a walk from the initial state to some final state, and LL is infinite if and only if some such walk contains a cycle.

QED.

4.2.3 Equality?

Consider the question L1=L2L_{1} = L_{2}?

This is practically important. But it is a difficult issue because there are many ways to represent L1L_{1} and L2L_{2}.

Linz Theorem 4.7 (Equality): Given a standard representation of two regular languages L1L_{1} and L2L_{2}, there exists an algorithm to determine whether or not L1=L2L_{1} = L_{2}.

Proof

Let L3=(L1L2)(L1L2)L_{3} = (L_{1} \cap \bar{L_{2}}) \cup (\bar{L_{1}} \cap L_{2}).

By closure, L3L_{3} is regular. Hence, there is a dfa MM that accepts L3L_{3}.

Because of Theorem 4.6, we can determine whether L3L_{3} is empty or not.

But from Exercise 8, Section 1.1, we see that L3=L_{3} = \emptyset if and only if L1=L2L_{1} = L_{2}. QED.

4.3 Identifying Nonregular Languages

A regular language may be infinite.

In processing a string, the amount of information that the automaton must “remember” is strictly limited (finite and bounded).

4.3.1 Using the Pigeonhole Principle

In mathematics, the pigeonhole principle refers to the following simple observation:

If we put nn objects into mm boxes (pigeonholes) and n>mn > m, then at least one box must hold more than one item.

This is obvious, but it has deep implications.

4.3.2 Linz Example 4.6

Is the language L={anbn:n0}L = \{ a^{n}b^{n} : n \geq 0 \} regular?

The answer is no, as we show below.

Proof that LL is not regular

Strategy: Use proof by contradiction. Assume that what we want to prove is false. Show that this introduces a contradiction. Hence, the original assumption must be true.

Assume LL is regular.

Thus there exists a dfa M=(Q,{a,b},δ,q0,F)M = (Q,\{ a,b \},\delta,q_{0},F) such that L(M)=LL(M) = L.

Machine MM has a specific number of states. However, the number of aa’s in a string in L(M)L(M) is finite but unbounded (i.e., no maximum value for the length). If nn is larger than the number of states in MM, then, according to the pigeonhole principle, there must be some state qq such that

δ*(q0,an)=q\delta^{*}(q_{0},a^{n}) = q

and

δ*(q0,am)=q\delta^{*}(q_{0}, a^{m}) = q

with nmn \neq m. But, because MM accepts anbna^{n}b^{n},

δ*(q,bn)=qfF\delta^{*}(q,b^{n}) = q_{f} \in F

for some qfFq_{f} \in F.

From this we reason as follows:

δ*(q0,ambn)\delta^{*}(q_{0}, a^{m}b^{n})
== δ*(δ*(q0,am),bn)\delta^{*}(\delta^{*}(q_{0}, a^{m}), b^{n})
== δ*(q,bn)\delta^{*}(q, b^{n})
== qfq_{f}

But this contradicts the assumption that MM accepts ambna^{m}b^{n} only if n=mn = m. Therefore, LL cannot be regular. QED

We can use the pigeonhole principle to make “finite memory” precise.

4.3.3 Pumping Lemma for Regular Languages

Linz Theorem 4.8 (Pumping Lemma for Regular Languages): Let LL be an infinite regular language. There exists some m>0m > 0 such that any wLw \in L with |w|m|w| \geq m can be decomposed as

w=xyzw = xyz

with

|xy|m|xy| \leq m

and

|y|1|y| \geq 1

such that

wi=xyizw_{i} = xy^{i}z

is also in LL for all i0i \geq 0.

That is, we can break every sufficiently long string from LL into three parts in such a way that an arbitrary number of repetitions of the middle part yields another string in LL.

We can “pump” the middle string, which gives us the name pumping lemma for this theorem.

Proof

Let LL be an infinite regular language. Thus there exists a dfa MM that accepts LL. Let MM have states q0,q1,q2,qnq_{0}, q_{1}, q_{2}, \cdots\ q_{n}.

Consider a string wLw \in L such that |w|m=n+1|w| \geq m = n + 1. Such a string exists because LL is infinite.

Consider the sequence of states q0,qi,qj,qfq_{0}, q_{i}, q_{j}, \cdots\ q_{f} that MM traverses as it processes ww.

This sequence has exactly |w|+1|w| + 1 entries, but MM has only n+1n + 1 distinct states. Thus, according to the pigeonhole principle, at least one state must be repeated, and the repetition must start no later than the nnth move.

Thus the sequence is of the form

q0,qi,qj,,qr,,qr,,qfq_{0}, q_{i}, q_{j}, \cdots, q_{r}, \cdots, q_{r}, \cdots, q_{f}.

Then there are substrings xx, yy, and zz of ww such that

δ*(q0,x)=qr\delta^{*}(q_{0}, x) = q_{r}

δ*(qr,y)=qr\delta^{*}(q_{r}, y) = q_{r}

δ*(qr,z)=qf\delta^{*}(q_{r}, z) = q_{f}

with |xy|n+1=m|xy| \leq n + 1 = m and |y|1|y| \geq 1. Thus, for any k0k \geq 0,

δ*(q0,xykz)=qf\delta^{*}(q_{0}, xy^{k}z) = q_{f}

QED.
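The proof is constructive: simulate the dfa on a sufficiently long string and split it at the first repeated state. A minimal Python sketch (the dict-based dfa encoding is our own illustration, not from the text):

```python
def pump_split(delta, q0, w):
    """Split w into (x, y, z) such that y is the substring read
    between the first two visits to some repeated state.
    delta: dict mapping (state, symbol) -> state."""
    q = q0
    seen = {q0: 0}           # state -> input position where first reached
    for i, c in enumerate(w):
        q = delta[(q, c)]
        if q in seen:        # pigeonhole: a state repeated
            r = seen[q]
            return w[:r], w[r:i + 1], w[i + 1:]
        seen[q] = i + 1
    return None              # no repetition while reading w

# A one-state dfa for L(a*): the loop on 'a' is the pumpable part.
delta = {(0, 'a'): 0}
x, y, z = pump_split(delta, 0, 'aaa')
# x + y*i + z stays in L(a*) for every i >= 0.
```

Because the first repetition happens within the first (number of states) moves, the returned xx and yy automatically satisfy the |xy|m|xy| \leq m bound of the Pumping Lemma below.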

We can use the pumping lemma to show that languages are not regular. Each of these is a proof by contradiction.

4.3.4 Linz Example 4.7

Show that L={anbn:n0}L = \{ a^{n}b^{n} : n \geq 0 \} is not regular.

Assume that LL is regular, so that the Pumping Lemma must hold.

If, for some n0n \geq 0 and i0i \geq 0, xyz=anbnxyz = a^{n}b^{n} and xyizxy^{i}z are both in LL, then yy must be all aa’s or all bb’s.

We do not know what mm is, but, whatever mm is, the Pumping Lemma enables us to choose the string w=ambmw = a^{m}b^{m}. Because |xy|m|xy| \leq m, both xx and yy must lie within the first mm symbols of ww, which are all aa’s. Thus yy must consist entirely of aa’s.

Let |y|=k1|y| = k \geq 1. We must decompose w=xyzw = xyz as follows for some pp with p+kmp+k \leq m:

x=apx = a^{p}

y=aky = a^{k}

z=ampkbmz =a^{m-p-k} b^{m}

By the Pumping Lemma with i=0i = 0, the following string must also be in LL:

w0=amkbmw_{0} = a^{m-k}b^{m}.

But, because k1k \geq 1, this string has fewer aa’s than bb’s, so it is not in LL. This contradicts the Pumping Lemma.

Hence, the assumption that LL is regular is false. Thus {anbn:n0}\{ a^{n}b^{n}: n \geq 0 \} is not regular.

4.3.5 Using the Pumping Lemma (Viewed as a Game)

The Pumping Lemma guarantees the existence of mm and a decomposition xyzxyz for any sufficiently long string in a regular language.

The Pumping Lemma holds for all wLw \in L and for all i0i \geq 0 (i.e., xyizLxy^{i}z \in L for all ii).

We can thus conceptualize a proof as a game against an opponent.

Strategy:

  1. The opponent picks mm.

  2. Given mm, we pick a string wLw \in L with |w|m|w| \geq m.

  3. The opponent chooses the decomposition w=xyzw = xyz, subject to |xy|m|xy| \leq m and |y|1|y| \geq 1.

  4. We try to pick ii such that the pumped string wi=xyizw_{i} = xy^{i}z is not in LL. If we can always do so, no matter what the opponent chooses, then LL is not regular.

4.3.6 Linz Example 4.8

Let Σ={a,b}\Sigma = \{ a, b \}. Show that

L={wwR:wΣ*}L = \{ww^{R}: w \in \Sigma^{*}\}

is not regular.

We use the Pumping Lemma and assume LL is regular.

Whatever mm the opponent picks in step 1 (of the “game”), in step 2 we can choose the string w=ambmbmamw = a^{m}b^{m}b^{m}a^{m}, as shown below.

Linz Fig. 4.5

Because of this choice, and the requirement that |xy|m|xy| \leq m, in step 3 the opponent must choose a yy that consists entirely of aa’s. Consider

wi=xyizw_{i} = xy^{i}z

which, by the Pumping Lemma, must be in LL for all i0i \geq 0.

In step 4, we use i=0i = 0 in wi=xyizw_{i} = xy^{i}z. This string has fewer aa’s on the left than on the right and so cannot be of the form wwRww^{R}.

Therefore, the Pumping Lemma is violated. LL is not regular.

Warning: Be careful! There are ways we can go wrong in applying the Pumping Lemma.

4.3.7 Linz Example 4.9

For Σ={a,b}\Sigma = \{ a,b \}, show that the language

L={wΣ*:na(w)<nb(w)}L = \{ w \in \Sigma^{*}: n_{a}(w) < n_{b}(w) \}

is not regular.

We use the Pumping Lemma to show a contradiction. Assume LL is
regular.

Suppose the opponent gives us mm. Because we have complete freedom in choosing wLw \in L, we pick w=ambm+1w = a^{m}b^{m+1}. Now, because |xy||xy| cannot be greater than mm, the opponent cannot do anything but pick a yy with all aa’s, that is,

y=aky = a^{k} for 1km1 \leq k \leq m.

We now pump up, using i=2i = 2. The resulting string

w2=am+kbm+1w_{2} = a^{m+k}b^{m+1}

is not in LL. Therefore, the Pumping Lemma is violated. LL is not regular.
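For one fixed mm we can check mechanically that every legal decomposition of w = a^m b^(m+1) is pumped out of the language with i = 2. This is only an illustration for a specific mm, not a substitute for the general argument:

```python
def in_L(s):
    # membership test for L = { w : na(w) < nb(w) }
    return s.count('a') < s.count('b')

def pumping_fails_everywhere(w, m, i):
    """True if every decomposition w = xyz with |xy| <= m and |y| >= 1
    yields a pumped string x + y*i + z outside the language."""
    for j in range(0, m + 1):            # |x| = j
        for k in range(1, m - j + 1):    # |y| = k, so |xy| = j + k <= m
            x, y, z = w[:j], w[j:j + k], w[j + k:]
            if in_L(x + y * i + z):
                return False             # the opponent has a winning choice
    return True

m = 4
w = 'a' * m + 'b' * (m + 1)
print(pumping_fails_everywhere(w, m, i=2))
```

For this w every y lies in the a-prefix, so pumping with i = 2 adds at least one a and destroys the inequality, exactly as argued above.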

4.3.8 Linz Example 4.10

Show that

L={(ab)nak:n>k,k0}L = \{ (ab)^{n}a^{k}: n > k, k \geq 0 \}

is not regular

We use the Pumping Lemma to show a contradiction. Assume LL is
regular.

Given some mm, we pick as our string

w=(ab)m+1amw = (ab)^{m+1}a^{m}

which is in LL.

The opponent must decompose w=xyzw = xyz so that |xy|m|xy| \leq m and |y|1|y| \geq 1. Thus both xx and yy must be in the part of the string consisting of abab pairs. The choice of xx does not affect the argument, so we can focus on the yy part.

If our opponent picks y=ay = a, we can choose i=0i = 0 and get a string not in L((ab)*a*)L((ab)^{*}a^{*}) and, hence, not in LL. (There is a similar argument for y=by = b.)

If the opponent picks y=aby = ab, we can choose i=0i = 0 again. Now we get the string (ab)mam(ab)^{m}a^{m}, which is not in LL. (There is a similar argument for y=bay = ba.)

In a similar manner, we can counter any possible choice by the opponent. Thus, because of the contradiction, LL is not regular.

4.3.9 Linz Example (Factorial Length Strings)

Note: This example is adapted from an earlier edition of the Linz textbook.

Show that

L={an!:n0}L = \{a^{n!} : n \geq 0\}

is not regular.

We use the Pumping Lemma to show a contradiction. Assume LL is regular.

Given the opponent’s choice for mm, we pick ww to be the string am!a^{m!} (unless the opponent picks m<3m < 3, in which case we can use a3!a^{3!} as ww).

The possible decompositions w=xyzw = xyz (such that |xy|m|xy| \leq m) differ only in the lengths of xx and yy. Suppose the opponent picks yy such that

|y|=k|y| = k, where 1km1 \leq k \leq m.

According to the Pumping Lemma, xz=am!kLxz = a^{m!-k} \in L. But this string can only be in LL if there exists a jj such that

m!k=j!m! - k = j!.

But this is impossible, because for m3m \geq 3 and 1km1 \leq k \leq m we know (see argument below) that

(m1)!<m!k<m!(m-1)! < m! - k < m!

and no factorial value lies strictly between (m1)!(m-1)! and m!m!.

Therefore, LL is not regular.

Aside: To see that m!k>(m1)!m! - k > (m - 1)! for m3m \geq 3 and kmk \leq m, note that

m!km! - k \geq m!mm! - m == m(m1)!mm(m - 1)! - m == m((m1)!1)m((m-1)!- 1) >> (m1)!(m-1)!.
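We can also confirm numerically, for small mm, that m! − k is never a factorial when m ≥ 3 and 1 ≤ k ≤ m (an illustration of the aside, using only integer arithmetic):

```python
from math import factorial

def is_factorial(v):
    """Check whether v equals j! for some j >= 0."""
    j, f = 0, 1
    while f < v:
        j += 1
        f *= j
    return f == v

# For m >= 3 and 1 <= k <= m, m! - k lies strictly between
# (m-1)! and m!, so it is never a factorial.
for m in range(3, 8):
    for k in range(1, m + 1):
        assert not is_factorial(factorial(m) - k)
```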

4.3.10 Linz Example 4.12

Show that the language

L={anbkcn+k:n0,k0}L = \{a^{n}b^{k}c^{n+k}: n \geq 0, k \geq 0\}

is not regular.

Strategy: Instead of using the Pumping Lemma directly, we show that LL is related to another language we already know is nonregular. This may be an easier argument.

In this example, we use the closure property under homomorphism (Linz Theorem 4.3).

Let hh be defined such that

h(a)=a,h(b)=a,h(c)=ch(a) = a, h(b) = a, h(c) = c.

Then

h(L)h(L) == {an+kcn+k:n+k0}\{ a^{n+k}c^{n+k} : n + k \geq 0 \}
== {aici:i0}\{ a^{i}c^{i} : i \geq 0 \}

But we proved this language was not regular in Linz Example 4.6. Therefore, because of closure under homomorphism, LL cannot be regular either.

Alternative proof by contradiction

Assume LL is regular.

Thus h(L)h(L) is regular by closure under homomorphism (Linz Theorem 4.3).

But we know h(L)h(L) is not regular, so there is a contradiction.

Thus, LL is not regular.
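Applying a homomorphism is just symbol-by-symbol substitution, so the mapping h above is easy to express in code (a minimal sketch):

```python
def h(s, mapping):
    """Apply the homomorphism defined by `mapping` to string s."""
    return ''.join(mapping[c] for c in s)

# The homomorphism from the example: a -> a, b -> a, c -> c.
m = {'a': 'a', 'b': 'a', 'c': 'c'}
# a^2 b^3 c^5 maps to a^5 c^5, a string of the form a^i c^i.
print(h('aabbbccccc', m))   # aaaaaccccc
```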

4.3.11 Linz Example 4.13

Show that the language

L={anbl:nl}L = \{ a^{n}b^{l}: n \ne l \}

is not regular.

We use the Pumping Lemma, but this example requires more ingenuity to set up than previous examples.

Assume LL is regular.

Choosing a string wLw \in L with n=l+1n = l + 1 or n=l+2n = l + 2 will not lead to a contradiction.

In these cases, the opponent can always choose a decomposition w=xyzw = xyz (with |xy|m|xy| \leq m and |y|1|y| \geq 1) that will make it impossible to pump the string out of the language (that is, pump it so that it has an equal number of aa’s and bb’s). For w=al+1blw = a^{l+1}b^{l}, the opponent can choose yy to be an even number of aa’s. For w=al+2blw = a^{l+2}b^{l}, the opponent can choose yy to be an odd number of aa’s greater than 1.

We must be more creative. Suppose we choose wLw \in L where n=m!n = m! and l=(m+1)!l = (m + 1)!.

If the opponent decomposes w=xyzw = xyz (with |xy|m|xy| \leq m and |y|=k1|y| = k \geq 1), then yy must consist of all aa’s.

If we pump ii times, we generate string xyizxy^{i}z where the number of aa’s is m!+(i1)km! + (i-1)k.

We can contradict the Pumping Lemma if we can pick ii such that

m!+(i1)k=(m+1)!m! + (i - 1)k = (m + 1)!.

But we can do this, because it is always possible to choose

i=1+mm!/ki = 1 + mm!/k.

For 1km1 \leq k \leq m, the expression 1+mm!/k1 + mm!/k is an integer.

Thus the generated string has m!+((1+mm!/k)1)km! + ((1 + mm!/k) - 1)k occurrences of aa.

m!+((1+mm!/k)1)km! + ((1 + mm!/k) - 1)k
== m!+mm!m! + mm!
== m!(m+1)m!(m + 1)
== (m+1)!(m+1)!

This introduces a contradiction of the Pumping Lemma. Thus LL is not regular.
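The choice i = 1 + m·m!/k can be checked numerically for small mm and kk (an illustration only; the proof covers all mm):

```python
from math import factorial

def pumped_a_count(m, k, i):
    """Number of a's in x y^i z when |y| = k and w = a^(m!) b^((m+1)!)."""
    return factorial(m) + (i - 1) * k

for m in range(1, 7):
    for k in range(1, m + 1):
        i = 1 + m * factorial(m) // k   # an integer: k <= m divides m!
        assert pumped_a_count(m, k, i) == factorial(m + 1)
```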

Alternative argument (more elegant)

Suppose L={anbl:nl}L = \{ a^{n}b^{l}: n \ne l \} is regular.

Because of complementation closure, L\bar{L} is regular.

Let L1=LL(a*b*)L_{1} = \bar{L} \cap L(a^{*}b^{*}).

But L(a*b*)L(a^{*}b^{*}) is regular and thus, by intersection closure, L1L_{1} is also regular.

But L1={anbn:n0}L_{1} = \{ a^{n}b^{n} : n \geq 0 \}, which we have shown to be nonregular. Thus we have a contradiction, so LL is not regular.

4.3.12 Pitfalls in Using the Pumping Lemma

The Pumping Lemma is difficult to understand and, hence, difficult to apply.

Here are a few suggestions to avoid pitfalls in use of the Pumping Lemma:

  1. Do not assume a specific value for mm; the argument must work for every possible mm.

  2. Do not choose the decomposition w=xyzw = xyz yourself; the opponent chooses it, so the argument must cover every legal decomposition.

  3. Remember that the Pumping Lemma can only show that a language is not regular; it cannot be used to prove that a language is regular.

Like most interesting “games”, knowledge of the rules for use of the Pumping Lemma is necessary, but it is not sufficient to become a master “player”. To master the use of the Pumping Lemma, one must work problems of various difficulties. Practice, practice, practice.

5 Context-Free Languages

In Linz Section 4.3, we saw that not all languages are regular. We examined the Pumping Lemma for Regular Languages as a means to prove that a specific language is not regular.

In Linz Example 4.6, we proved that

L={anbn:n0}L = \{a^{n}b^{n} : n \geq 0\}

is not regular.

If we let aa = “(” and bb = “)”, then LL becomes a language of nested parentheses.

This language is in a larger family of languages called the context-free languages.

Context-free languages are very important because many practical aspects of programming languages are in this family.

In this chapter, we explore the context-free languages, beginning with context-free grammars.

One key process we explore is parsing, which is the process of determining the grammatical structure of a sentence generated from a grammar.

5.1 Context-Free Grammars

5.1.1 Definition of Context-Free Grammars

Remember the restrictions we placed on regular grammars in Linz Section 3.3:

  1. The left side of a production must be a single variable.

  2. The right side has a special form: it contains at most one variable which, for a right-linear grammar, must be the rightmost symbol.

To create more powerful grammars (i.e., that describe larger families of languages), we relax these restrictions.

For context-free grammars, we maintain the left-side restriction but relax the restriction on the right side.

Linz Definition 5.1 (Context-Free Grammar): A grammar G=(V,T,S,P)G = (V,T,S,P) is context-free if all productions in PP have the form

AxA \rightarrow x

where AVA \in V and x(VT)*x \in (V \cup T)^{*}. A language LL is context-free if and only if there is a context-free grammar GG such that L=L(G)L = L(G).

The family of regular languages is a subset of the family of context-free languages!

Thus, context-free grammars

  1. allow a production AxA \rightarrow x to be applied wherever AA occurs, regardless of the surrounding context (hence the name context-free)

  2. can describe languages, such as {anbn:n0}\{a^{n}b^{n} : n \geq 0\}, that no regular grammar can

5.1.2 Linz Example 5.1

Consider the grammar G=({S},{a,b},S,P)G = (\{S\}, \{a, b\}, S, P) with productions:

SaSaS \rightarrow aSa
SbSbS \rightarrow bSb
SλS \rightarrow \lambda

Note that this grammar satisfies the definition of context-free.

A possible derivation using this grammar is as follows:

SaSaaaSaaaabSbaaaabbaaS \Rightarrow aSa \Rightarrow aaSaa \Rightarrow aabSbaa \Rightarrow aabbaa

From this derivation, we see that

L(G)={wwR:w{a,b}*}L(G) = \{ww^{R} : w \in \{a,b\}^{*}\}.

The language is context-free, but, as we demonstrated in Linz Example 4.8, it is not regular.

This grammar is linear because each production has at most one variable on its right side.
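To see concretely that this grammar generates only strings of the form w w^R, we can enumerate its language up to a bounded length. A small Python sketch (the bounded expansion strategy is our own illustration, not from the text):

```python
def generate(max_len):
    """Enumerate the strings derivable from S -> aSa | bSb | lambda
    whose length is at most max_len."""
    results, forms = set(), ['S']
    while forms:
        f = forms.pop()
        if 'S' not in f:
            results.add(f)          # a fully terminal string: a sentence
            continue
        for rhs in ('aSa', 'bSb', ''):
            g = f.replace('S', rhs, 1)
            if len(g.replace('S', '')) <= max_len:   # bound terminal count
                forms.append(g)
    return results

strings = generate(4)
# Every generated string is even-length and reads the same in both
# directions: it is w followed by reversed(w).
assert all(s == s[::-1] and len(s) % 2 == 0 for s in strings)
```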

5.1.3 Linz Example 5.2

Consider the grammar GG with productions:

SabBS \rightarrow abB
AaaBbA \rightarrow aaBb
BbbAaB \rightarrow bbAa
AλA \rightarrow \lambda

Note that this grammar also satisfies the definition of context free.

A possible derivation using this grammar is:

SabBabbbAaabbbaaBbaabbbaabbAabaS \Rightarrow abB \Rightarrow abbbAa \Rightarrow abbbaaBba \Rightarrow abbbaabbAaba
abbbaabbaaBbabaabbbaabbaabbAababaabbbaabbaabbababa\Rightarrow abbbaabbaaBbaba \Rightarrow abbbaabbaabbAababa \Rightarrow abbbaabbaabbababa

We can see that:

L(G)={ab(bbaa)nbba(ba)n:n0}L(G) = \{ab(bbaa)^{n}bba(ba)^{n} : n \geq 0\}

This grammar is also linear (as defined in Linz Section 3.3). Although linear grammars are context-free, not all context-free grammars are linear.

5.1.4 Linz Example 5.3

Consider the language

L={anbm:nm}L = \{a^{n}b^{m} : n \neq m\}.

This language is context free. To show that this is the case, we must construct a context-free grammar that generates the language.

First, consider the n=mn = m case. This can be generated by the productions:

SaSb|λS \rightarrow aSb\ | \ \lambda

Now, consider the n>mn > m case. We can modify the above to generate extra aa’s on the left.

SAS1S \rightarrow AS_{1}
S1aS1b|λS_{1} \rightarrow aS_{1}b\ | \ \lambda
AaA|aA \rightarrow aA\ | \ a

Finally, consider the n<mn < m case. We can further modify the grammar to generate extra bb’s on right.

SAS1|S1BS \rightarrow AS_{1}\ | \ S_{1}B
S1aS1b|λS_{1} \rightarrow aS_{1}b\ | \ \lambda
AaA|aA \rightarrow aA\ | \ a
BbB|bB \rightarrow bB\ | \ b

This grammar is context free, but it is not linear because the productions with SS on the left are not in the required form.

Although this grammar is not linear, there exist other grammars for this language that are linear.

5.1.5 Linz Example 5.4

Consider the grammar with productions:

SaSb|SS|λS \rightarrow aSb\ | \ SS\ | \ \lambda

This grammar is also context-free but not linear.

Example strings in L(G)L(G) include abaabbabaabb, aababbaababb, and abababababab. Note that each string has an equal number of aa’s and bb’s and that no prefix has more bb’s than aa’s.

We can see that L(G)L(G) is

{ w{a,b}*:na(w)=nb(w)w \in \{a, b\}^{*} : n_{a}(w) = n_{b}(w) and na(v)nb(v)n_{a}(v) \geq n_{b}(v) for any prefix vv of ww }.

What is a programming language connection for this language?

5.1.6 Leftmost and Rightmost Derivations

Consider grammar G=({A,B,S},{a,b},S,P)G = (\{A, B, S\}, \{a, b\}, S, P) with productions:

SABS \rightarrow AB
AaaAA \rightarrow aaA
AλA \rightarrow \lambda
BBbB \rightarrow Bb
BλB \rightarrow \lambda

This grammar generates the language L(G)={a2nbm:n0,m0}L(G) = \{a^{2n}b^{m} : n \geq 0, m \geq 0\}.

Now consider the two derivations:

SABaaABaaBaaBbaabS \Rightarrow AB \Rightarrow aaAB \Rightarrow aaB \Rightarrow aaBb \Rightarrow aab
SABABbaaABbaaAbaabS \Rightarrow AB \Rightarrow ABb \Rightarrow aaABb \Rightarrow aaAb \Rightarrow aab

These derivations yield the same sentence using exactly the same productions. However, the productions are applied in different orders.

To eliminate such incidental factors, we often require that the variables be replaced in a specific order.

Linz Definition 5.2 (Leftmost and Rightmost Derivations): A derivation is leftmost if, in each step, the leftmost variable in the sentential form is replaced. If, in each step, the rightmost variable is replaced, then the derivation is rightmost.

5.1.7 Linz Example 5.5

Consider the grammar with productions:

SaABS \rightarrow aAB
AbBbA \rightarrow bBb
BA|λB \rightarrow A\ | \ \lambda

A leftmost derivation of the string abbbbabbbb is:

SaABabBbBabAbBabbBbbBabbbbBabbbbS \Rightarrow aAB \Rightarrow abBbB \Rightarrow abAbB \Rightarrow abbBbbB \Rightarrow abbbbB \Rightarrow abbbb

Similarly, a rightmost derivation of the string abbbbabbbb is:

SaABaAabBbabAbabbBbbabbbbS \Rightarrow aAB \Rightarrow aA \Rightarrow abBb \Rightarrow abAb \Rightarrow abbBbb \Rightarrow abbbb
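Derivations like the two above can be mechanized: repeatedly replace the leftmost occurrence of a chosen variable with a production's right side. A Python sketch (the hand-supplied production sequence is our own; this is a derivation checker, not a parser):

```python
def leftmost_derive(start, steps):
    """Apply (variable, rhs) productions to the leftmost occurrence
    of each variable, returning the list of sentential forms."""
    form, forms = start, [start]
    for var, rhs in steps:
        i = form.index(var)                # leftmost occurrence of var
        # verify this really is the leftmost variable in the form
        assert all(not c.isupper() for c in form[:i])
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms

# Leftmost derivation of abbbb in S -> aAB, A -> bBb, B -> A | lambda
steps = [('S', 'aAB'), ('A', 'bBb'), ('B', 'A'),
         ('A', 'bBb'), ('B', ''), ('B', '')]
print(leftmost_derive('S', steps)[-1])   # abbbb
```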

5.1.8 Derivation Trees

We can also use a derivation tree to show derivations in a manner that is independent of the order in which productions are applied.

A derivation tree is an ordered tree in which each interior node is labeled with the left side of a production and the node’s children are labeled, in order, with the symbols of the corresponding right side.

The production

AabABcA \rightarrow abABc

is shown as a derivation tree in Linz Figure 5.1.

Linz Fig. 5.1: Derivation Tree for Production A \rightarrow abABc

Linz Definition 5.3 (Derivation Tree): Let G=(V,T,S,P)G = (V, T, S, P) be a context-free grammar. An ordered tree is a derivation tree for GG if and only if it has the following properties:

  1. The root is labeled SS.

  2. Every leaf has a label from T{λ}T \cup \{\lambda\}.

  3. Every interior vertex (i.e., a vertex that is not a leaf) has a label from VV.

  4. If a vertex has a label AVA \in V, and its children are labeled (from left to right) a1,a2,,ana_{1}, a_{2}, \cdots, a_{n}, then PP must contain a production of the form Aa1a2anA \rightarrow a_{1} a_{2} \cdots\ a_{n}.

  5. A leaf labeled λ\lambda has no siblings, that is, a vertex with a child labeled λ\lambda can have no other children.

If properties 3, 4, and 5 and modified property 2 (below) hold for a tree, then it is a partial derivation tree.

  2. (modified) Every leaf has a label from VT{λ}V \cup T \cup \{\lambda\}.

If we read the leaves of a tree from left to right, omitting any λ\lambda’s encountered, we obtain a string called the yield of the tree.

The descriptive term from left to right means that we traverse the tree in a depth-first manner, always exploring the leftmost unexplored branch first. The yield is the string of terminals encountered in this traversal.

5.1.9 Linz Example 5.6

Consider the grammar GG with productions:

SaABS \rightarrow aAB
AbBbA \rightarrow bBb
BA|λB \rightarrow A\ | \ \lambda

Linz Figure 5.2 shows a partial derivation tree for GG with the yield abBbBabBbB. This is a sentential form of the grammar GG.

Linz Fig. 5.2: Partial Derivation Tree

Linz Figure 5.3 shows a derivation tree for GG with the yield abbbbabbbb. This is a sentence of L(G)L(G).

Linz Fig. 5.3: Derivation Tree

5.1.10 Relation Between Sentential Forms and Derivation Trees

Derivation trees give explicit (and visualizable) descriptions of derivations. They enable us to reason about context-free languages much as transition graphs enable us to reason about regular languages.

Linz Theorem 5.1 (Connection between Derivations and Derivation Trees): Let G=(V,T,S,P)G = (V, T, S, P) be a context-free grammar. Then the following properties hold:

  1. For every wL(G)w \in L(G), there exists a derivation tree of GG whose yield is ww.

  2. Conversely, the yield of any derivation tree of GG is in L(G)L(G).

  3. If tGt_{G} is any partial derivation tree for GG whose root is labeled SS, then the yield of tGt_{G} is a sentential form of GG.

Proof: See the proof in the Linz textbook.

Derivation trees:

  1. show which productions are used to derive a sentence, but not the order in which they are applied

  2. are thus independent of the order of application; a leftmost (or rightmost) derivation can always be recovered from the tree

5.2 Parsing and Ambiguity

5.2.1 Generation versus Parsing

The previous section concerns the generative aspects of grammars: using a grammar to generate strings in the language.

This section concerns the analytical aspects of grammars: processing strings from the language to determine their derivations. This process is called parsing.

For example, a compiler for a programming language must parse a program (i.e., a sentence in the language) to determine the derivation tree for the program.

5.2.2 Exhaustive Search Parsing

Given some wL(G)w \in L(G), we can parse ww with respect to grammar GG by:

  1. systematically constructing all possible (e.g., leftmost) derivations

  2. checking whether any of them matches ww

This is called exhaustive search parsing or brute force parsing. A more complete description of the algorithm is below.

This is a form of top-down parsing in which a derivation tree is constructed from the root downward.

Note: An alternative approach is bottom-up parsing in which the derivation tree is constructed from leaves upward. Bottom-up parsing techniques often have limitations in terms of the grammars supported but often give more efficient algorithms.

Exhaustive Search Parsing Algorithm

    – Add root and 1st level of all derivation trees
    F{x:SxF \leftarrow \{x : S \rightarrow x in PP of G}G \}
    while FF \neq \emptyset and wFw \notin F do
        FF' \leftarrow \emptyset
        – Add next level of all derivation trees
        for all xFx \in F do
            if xx can generate ww then
                VV \leftarrow leftmost variable of xx
                for all productions VyV \rightarrow y in GG do
                    FF{x}F' \leftarrow F' \cup \{x'\} where x=xx' = x with VyV \leftarrow y
        FFF \leftarrow F'

Note: The above algorithm determines whether a string ww is in L(G)L(G). It can be modified to build the actual derivation or derivation tree.
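As a rough illustration, the algorithm can be rendered in Python. This is a sketch, not the textbook's code: grammars are encoded as a dict from each variable to its list of right sides, and the "can generate w" test is simplified to two cheap pruning checks (the sentential form must be no longer than w, and its terminal prefix must match w), which is sound for grammars without λ- or unit productions:

```python
def exhaustive_parse(grammar, start, w):
    """Top-down exhaustive search parsing. grammar maps each
    single-character variable to a list of right-side strings."""
    F = list(grammar[start])                     # root and first level
    while F:
        if w in F:
            return True
        F2 = []
        for form in F:
            # position of the leftmost variable, if any
            i = next((j for j, c in enumerate(form) if c in grammar), None)
            if i is None:
                continue                         # all terminals, and != w
            if len(form) > len(w) or not w.startswith(form[:i]):
                continue                         # cannot generate w
            for rhs in grammar[form[i]]:         # expand leftmost variable
                F2.append(form[:i] + rhs + form[i + 1:])
        F = F2                                   # next level of all trees
    return False

g = {'S': ['SS', 'aSb', 'bSa', 'ab', 'ba']}      # grammar of Linz Example 5.8
print(exhaustive_parse(g, 'S', 'aabb'))          # True
print(exhaustive_parse(g, 'S', 'abb'))           # False
```

With this grammar (no λ- or unit productions) every expansion grows the sentential form, so the length check guarantees termination even for strings not in the language.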

5.2.3 Linz Example 5.7

Note: The presentation here uses the algorithm above, rather than the approach in the Linz textbook.

Consider the grammar GG with productions:

SSS|aSb|bSa|λS \rightarrow SS\ | \ aSb\ | \ bSa\ | \ \lambda

Parse the string w=aabbw = aabb.

After initialization: F={SS,aSb,bSa,λ}F = \{ SS, aSb, bSa, \lambda \} (from the righthand sides of the grammar’s four productions with SS on the left).

First iteration: The loop test is true because FF is nonempty and ww is not present.

The algorithm does not need to consider the sentential forms bSabSa and λ\lambda in FF because neither can generate ww.

The inner loop thus adds {SSS,aSbS,bSaS,S}\{ SSS, aSbS, bSaS, S \} from the leftmost derivations from sentential form SSSS and also adds {aSSb,aaSbb,abSab,ab}\{ aSSb, aaSbb, abSab, ab \} from the leftmost derivations from sentential form aSbaSb.

Thus F={SSS,aSbS,bSaS,S,aSSb,aaSbb,abSab,ab}F = \{ SSS, aSbS, bSaS, S, aSSb, aaSbb, abSab, ab \} at the end of the first iteration.

Second iteration: The algorithm enters the loop a second time because FF is nonempty and does not contain ww.

The algorithm does not need to consider any sentential form beginning with bb or abab, thus eliminating {bSaS,abSab,ab}\{ bSaS, abSab, ab \} and leaving {SSS,aSbS,S,aSSb,aaSbb}\{ SSS, aSbS, S, aSSb, aaSbb \} of interest.

This iteration generates 20 new sentential forms from applying each of the 4 productions to each of the 5 remaining sentential forms.

In particular, note that the sentential form aaSbbaaSbb yields the target string aabbaabb when production SλS \rightarrow \lambda is applied.

Third iteration: The loop terminates because ww is present in FF.

Thus we can conclude wL(G)w \in L(G).

5.2.4 Flaws in Exhaustive Search Parsing

Exhaustive search parsing has serious flaws:

  1. It is tedious and inefficient.

  2. It may never terminate for strings not in L(G)L(G).

For example, if we choose w=abbw = abb in the previous example, the algorithm goes into an infinite loop.

The fix for nontermination is to ensure that each production application increases either the length of the sentential form or the number of terminal symbols in it. That is, we eliminate productions of the forms:

AλA \rightarrow \lambda
ABA \rightarrow B

Chapter 6 of the Linz textbook (which we will not cover this semester) shows that this does not reduce the power of the grammar.

5.2.5 Linz Example 5.8

Consider the grammar with productions:

SSS|aSb|bSa|ab|baS \rightarrow SS\ | \ aSb\ | \ bSa\ | \ ab\ | \ ba

This grammar generates the same language as the one in Linz Example 5.7 above, but it satisfies the restrictions given in the previous subsection.

Given any nonempty string ww, exhaustive search will terminate in no more than |w||w| rounds for such grammars.

5.2.6 Toward Better Parsing Algorithms

Linz Theorem 5.2 (Exhaustive Search Parsing): Suppose that G=(V,T,S,P)G = (V, T, S, P) is a context-free grammar that does not have any rules of one of the forms

AλA \rightarrow \lambda
ABA \rightarrow B

where A,BVA, B \in V. Then the exhaustive search parsing method can be formulated as an algorithm which, for any wT*w \in T^{*}, either parses ww or tells us that parsing is impossible.

Proof outline

  1. Without λ\lambda- or unit productions, each production application increases either the length of the sentential form or the number of terminal symbols in it.

  2. Both quantities are bounded by |w||w|, so the method terminates after at most 2|w|2|w| rounds.

  3. If ww has not been produced by then, parsing is impossible.

But exhaustive search is still inefficient. The number of sentential forms to be generated is

i=12|w||P|i\sum_{i=1}^{2|w|} |P|^{i}.

That is, it grows exponentially with the length of the string.

Linz Theorem 5.3 (Efficient Parsing): For every context-free grammar there exists an algorithm that parses any wL(G)w \in L(G) in a number of steps proportional to |w|3|w|^{3}.

5.2.7 Simple Grammar Definition

Linz Definition 5.4 (Simple Grammar): A context-free grammar G=(V,T,S,P)G=(V,T,S,P) is said to be a simple grammar or s-grammar if all its productions are of the form

AaxA \rightarrow ax

where AV,aT,xV*A \in V, a \in T, x \in V^{*}, and any pair (A,a)(A, a) occurs at most once in PP.

5.2.8 Linz Example 5.9

The grammar

SaS|bSS|cS \rightarrow aS\ | \ bSS\ | \ c

is an s-grammar.

The grammar

SaS|bSS|aSS|cS \rightarrow aS\ | \ bSS\ | \ aSS\ | \ c

is not an s-grammar because (S,a)(S, a) occurs twice.

5.2.9 Parsing Simple Grammars

Although s-grammars are quite restrictive, many features of programming languages can be described with s-grammars (e.g., grammars for arithmetic expressions).

If GG is an s-grammar, then wL(G)w \in L(G) can be parsed in linear time.

To see this, consider string w=a1a2anw = a_{1}a_{2} \cdots a_{n} and use the exhaustive search parsing algorithm.

  1. The s-grammar has at most one rule with left side SS whose right side begins with a1a_{1}: Sa1A1A2S \rightarrow a_{1}A_{1}A_{2} \cdots. Choose it!

  2. Then the s-grammar has at most one rule with left side A1A_{1} whose right side begins with a2a_{2}: A1a2B1B2A_{1} \rightarrow a_{2}B_{1}B_{2} \cdots. Choose it!

  3. And so forth up to the nnth terminal.

The number of steps is proportional to |w||w| because each step consumes one symbol of ww.
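The deterministic procedure above translates directly into a linear-time parser. A Python sketch (the (variable, terminal)-keyed rule table is our own encoding of the s-grammar):

```python
def s_parse(rules, start, w):
    """Parse w with an s-grammar. rules maps (variable, terminal)
    to the variable string x in the production A -> a x; the
    s-grammar property makes each lookup unique."""
    stack = [start]                      # variables yet to be satisfied
    for c in w:
        if not stack:
            return False                 # input remains but nothing to derive
        key = (stack.pop(), c)
        if key not in rules:
            return False                 # no applicable production
        stack.extend(reversed(rules[key]))   # push x, leftmost variable on top
    return not stack                     # accept iff all variables consumed

# S -> aS | bSS | c   (the s-grammar of Linz Example 5.9)
rules = {('S', 'a'): 'S', ('S', 'b'): 'SS', ('S', 'c'): ''}
print(s_parse(rules, 'S', 'abcc'))   # True
print(s_parse(rules, 'S', 'abc'))    # False
```

Each input symbol triggers exactly one rule lookup, so the running time is proportional to |w|, as claimed above.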

5.2.10 Ambiguity in Grammars and Languages

A derivation tree for some string generated by a context-free grammar may not be unique.

Linz Definition 5.5 (Ambiguity): A context-free grammar GG is said to be ambiguous if there exists some wL(G)w \in L(G) that has at least two distinct derivation trees. Alternatively, ambiguity implies the existence of two or more leftmost or rightmost derivations.

5.2.11 Linz Example 5.10

Again consider the grammar in Linz Example 5.4. Its productions are

SaSb|SS|λS \rightarrow aSb\ | \ SS\ | \ \lambda.

The string w=aabbw = aabb has two derivation trees as shown in Linz Figure 5.4

Linz Fig. 5.4: Two Derivation Trees for aabb

The left tree corresponds to the leftmost derivation SaSbaaSbbaabbS \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb.

The right tree corresponds to the leftmost derivation SSSλSaSbaaSbbaabbS \Rightarrow SS \Rightarrow \lambda S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb.

Thus the grammar is ambiguous.

5.2.12 Linz Example 5.11

Consider the grammar G=(V,T,E,P)G = (V, T, E, P) with

V={E,I}V = \{E, I\}
T={a,b,c,+,*,(,)}T = \{a, b, c, +, *, (, )\}

and PP including the productions:

EIE \rightarrow I
EE+EE \rightarrow E + E
EE*EE \rightarrow E * E
E(E)E \rightarrow (E)
Ia|b|cI \rightarrow a\ | \ b\ | \ c

This grammar generates a subset of the arithmetic expressions for a language like C or Java. It contains strings such as (a+b)*c(a+b)*c and a*b+ca*b+c.

Linz Figure 5.5 shows two derivation trees for the string a+b*ca+b*c. Thus this grammar is ambiguous.

Linz Fig. 5.5: Two Derivation Trees for a+b*c

Why is ambiguity a problem?

Remember that the semantics (meaning) of the expression is also associated with the structure of the expression. The structure determines how the (machine language) code is generated to carry out the computation.

How do real programming languages resolve this ambiguity?

Often, they add precedence rules that give priority to “**” over “++”. That is, the multiplication operator binds more tightly than addition.

This solution is totally outside the world of the context-free grammar. It is, in some sense, a hack.

A better solution is to rewrite the grammar (or sometimes redesign the language) to eliminate the ambiguity.

5.2.13 Linz Example 5.12

To rewrite the grammar in Linz Example 5.11, we introduce new variables, making VV the set {E,T,F,I}\{E, T, F, I\}, and replacing the productions with the following:

ETE \rightarrow T
TFT \rightarrow F
FIF \rightarrow I
EE+TE \rightarrow E + T
TT*FT \rightarrow T * F
F(E)F \rightarrow (E)
Ia|b|cI \rightarrow a\ | \ b\ | \ c

Linz Figure 5.6 shows the only derivation tree for string a+b*ca+b*c in this revised grammar for arithmetic expressions.

Linz Fig. 5.6: Derivation Tree for a+b*c in Revised Grammar
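Because the revised grammar is unambiguous, a parse of a+b*c has only one shape, with * grouped below +. We can see the consequence by evaluating during a recursive-descent parse of the revised grammar, with the left recursion E → E + T rendered as iteration; the variable values for a, b, and c are our own example inputs:

```python
def evaluate(tokens, env):
    """Evaluate tokens from the grammar E -> E+T | T, T -> T*F | F,
    F -> (E) | I, with left recursion rendered as iteration."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():                       # F -> (E) | I
        nonlocal pos
        if peek() == '(':
            pos += 1                    # consume '('
            v = expr()
            pos += 1                    # consume ')'
            return v
        v = env[tokens[pos]]            # identifier a, b, or c
        pos += 1
        return v

    def term():                         # T -> F { * F }
        nonlocal pos
        v = factor()
        while peek() == '*':
            pos += 1
            v *= factor()
        return v

    def expr():                         # E -> T { + T }
        nonlocal pos
        v = term()
        while peek() == '+':
            pos += 1
            v += term()
        return v

    return expr()

env = {'a': 2, 'b': 3, 'c': 4}
print(evaluate(list('a+b*c'), env))     # 14: * binds more tightly than +
print(evaluate(list('(a+b)*c'), env))   # 20: parentheses override precedence
```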

5.2.14 Inherently Ambiguous

Linz Definition 5.6: If LL is a context-free language for which there exists an unambiguous grammar, then LL is said to be unambiguous. If every grammar that generates LL is ambiguous, then the language is called inherently ambiguous.

It is difficult to demonstrate that a language is inherently ambiguous. Often the best we can do is to give examples and argue informally that all grammars for the language must be ambiguous.

5.2.15 Linz Example 5.13

The language

L={anbncm}{anbmcm}L = \{a^{n}b^{n}c^{m}\} \cup \{a^{n}b^{m}c^{m}\},

with nn and mm non-negative, is an inherently ambiguous context-free language.

Note that L=L1L2L = L_{1} \cup L_{2}, where L1={anbncm}L_{1} = \{a^{n}b^{n}c^{m}\} and L2={anbmcm}L_{2} = \{a^{n}b^{m}c^{m}\}.

We can generate L1L_{1} with the context-free grammar:

S1S1c|AS_{1} \rightarrow S_{1}c\ | \ A
AaAb|λA \rightarrow aAb\ | \ \lambda

Similarly, we can generate L2L_{2} with the context-free grammar:

S2aS2|BS_{2} \rightarrow aS_{2}\ | \ B
BbBc|λB \rightarrow bBc\ | \ \lambda

We can thus construct the union of these two sublanguages by adding a new production:

SS1|S2S \rightarrow S_{1}\ | \ S_{2}

Thus this is a context-free language.

But consider a string of the form anbncna^{n}b^{n}c^{n} (i.e., n=mn=m). It has two derivations, one starting with

SS1S \Rightarrow S_{1}

and another starting with

SS2S \Rightarrow S_{2}.

Thus the grammar is ambiguous.

L1L_{1} and L2L_{2} have conflicting requirements. L1L_{1} places restrictions on the number of aa’s and bb’s while L2L_{2} places restrictions on the number of bb’s and cc’s. It is impossible to find production rules that satisfy the n=mn = m case uniquely.

5.3 Context-Free Grammars and Programming Languages

The syntax of practical programming languages is usually expressed with context-free grammars. Compilers and interpreters must parse programs in these languages to execute them.

The grammars of programming languages are often written in Backus-Naur Form (BNF).

For example, the language for arithmetic expressions in Linz Example 5.12 can be written in BNF as:

    <expression> ::= <term>   | <expression> + <term>
          <term> ::= <factor> | <term> * <factor>

The items in angle brackets are variables, symbols such as “+” and “*” are terminals, the “|” denotes alternatives, and “::=” separates the left and right sides of the productions.

Programming languages often use restricted grammars to get linear parsing: e.g., regular grammars, s-grammars, LL grammars, and LR grammars.

The aspects of programming languages that can be modeled by context-free grammars are called the syntax.

Aspects such as type-checking are not context-free. Such issues are sometimes considered (incorrectly in your instructor’s view) as part of the semantics of the language.

These are really still syntax, but they must be expressed in ways that are not context free.

6 OMIT Chapter 6

7 Pushdown Automata

Finite automata cannot recognize all context-free languages.

To recognize language {anbn:n0}\{ a^{n}b^{n} : n \geq 0 \}, an automaton must do more than just verify that all aa’s precede all bb’s; it must count an unbounded number of symbols. This cannot be done with the finite memory of a dfa.

To recognize language {wwR:wΣ*}\{ ww^{R}: w \in \Sigma^{*} \}, an automaton must do more than just count; it must remember a sequence of symbols and compare it to another sequence in reverse order.

Thus, to recognize context-free languages, an automaton needs unbounded memory. The second example above suggests using a stack as the “memory”.

Hence, the class of machines we examine in this chapter are called pushdown automata.

In this chapter, we examine the connection between context-free languages and pushdown automata.

Unlike the situation with finite automata, deterministic and nondeterministic pushdown automata differ in the languages they accept.

7.1 Nondeterministic Pushdown Automata

7.1.1 Schematic Drawing

Linz Figure 7.1 illustrates a pushdown automaton.

On each move, the control unit reads a symbol from the input file. Based on the input symbol and symbol on the top of the stack, the control unit changes the content of the stack and enters a new state.

Linz Fig. 7.1: Pushdown Automaton

7.1.2 Definition of a Pushdown Automaton

Linz Definition 7.1 (Nondeterministic Pushdown Accepter): A nondeterministic pushdown accepter (npda) is defined by the tuple

M=(Q,Σ,Γ,δ,q0,z,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, z, F),

where

QQ is a finite set of internal states of the control unit,
Σ\Sigma is the input alphabet,
Γ\Gamma is a finite set of symbols called the stack alphabet,
δ:Q×(Σ{λ})×Γ\delta: Q \times (\Sigma \cup \{\lambda\}) \times \Gamma \rightarrow finite subsets of Q×Γ*Q \times \Gamma^{*} is the transition function,
q0Qq_{0} \in Q is the initial state of the control unit,
zΓz \in \Gamma is the stack start symbol,
FQF \subseteq Q is the set of final states.

Note that the input and stack alphabets may differ and that the stack start symbol zz must be an element of Γ.\Gamma.

Consider the transition function δ:Q×(Σ{λ})×Γ\delta: Q \times (\Sigma \cup \{\lambda\}) \times \Gamma \rightarrow finite subsets of Q×Γ*Q \times \Gamma^{*}.

The machine is nondeterministic, but there are only a finite number of possible results for each step.

7.1.3 Linz Example 7.1

Suppose the set of transition rules of an npda contains

δ(q1,a,b)={(q2,cd),(q3,λ)}\delta(q_{1}, a, b) = \{(q_{2}, cd), (q_{3}, \lambda)\}.

A possible change in the state of the automaton from q1q_{1} to q2q_{2} is shown in the following diagram.

This transition function can be drawn as a transition graph as shown below. The triples marking the edges represent the input symbol, the symbol at the top of the stack, and the string pushed back on the stack. Here we use “/” to separate the elements of the triple on the edges; the book uses commas for this purpose.

7.1.4 Linz Example 7.2

Consider an npda (Q,Σ,Γ,δ,q0,z,F)(Q,\Sigma,\Gamma,\delta,q_{0},z,F) where

Q={q0,q1,q2,q3}Q = \{q_{0}, q_{1}, q_{2}, q_{3}\}
Σ={a,b}\Sigma = \{a, b\}
Γ={0,1}\Gamma = \{0, 1\}
z=0z = 0
F={q3}F = \{q_{3}\}

with the initial state q0q_{0} and the transition function defined as follows:

  1. δ(q0,a,0)={(q1,10),(q3,λ)}\delta(q_{0}, a, 0) = \{(q_{1}, 10), (q_{3},\lambda)\}
  2. δ(q0,λ,0)={(q3,λ)}\delta(q_{0}, \lambda, 0) = \{(q_{3}, \lambda)\}
  3. δ(q1,a,1)={(q1,11)}\delta(q_{1}, a, 1) = \{(q_{1}, 11)\}
  4. δ(q1,b,1)={(q2,λ)}\delta(q_{1}, b, 1) = \{(q_{2}, \lambda)\}
  5. δ(q2,b,1)={(q2,λ)}\delta(q_{2}, b, 1) = \{(q_{2}, \lambda)\}
  6. δ(q2,λ,0)={(q3,λ)}\delta(q_{2}, \lambda, 0) = \{(q_{3}, \lambda)\}.

There are no transitions defined for final state q3q_{3} or in the cases

(q0,b,0)(q_{0}, b, 0), (q2,a,1)(q_{2}, a, 1), (q0,a,1)(q_{0}, a, 1), (q0,b,1)(q_{0}, b, 1), (q1,a,0)(q_{1}, a, 0), and (q1,b,0)(q_{1}, b, 0).

These are dead configurations of the npda. In these cases, δ\delta maps the arguments to the empty set.

The machine executes as follows.

  1. The first transition rule is nondeterministic, with two choices for input symbol aa and stack top 0.0.
    1. The machine can push 11 on the stack and transition to state q1q_{1}. (This is the only choice that will allow the machine to accept additional input.)
    2. The machine can pop the start symbol 00 and transition to final state q3q_{3}. (This is the only choice that will allow the machine to accept a single aa.)
  2. For stack top zz, the machine can also transition from the initial state q0q_{0} to final state q3q_{3} without consuming any input. (This is the only choice that will allow the machine to accept an empty string.) Note that rule 2 overlaps with rule 1, giving additional nondeterminism.
  3. While the machine reads aa’s, it pushes a 11 on the stack and stays in state q1q_{1}.
  4. When the machine reads a bb (with stack top 11), it pops the 11 from the stack and transitions to state q2q_{2}.
  5. While the machine reads bb’s (with stack top 11), it pops the 11 from the stack and stays in state q2q_{2}.
  6. When the machine encounters the stack top 00, it pops the stack and transitions to final state q3q_{3}.

Acceptance or rejection?

The machine accepts:

Other strings will always end in dead configurations. For example:

Thus, it is not difficult to see that L={anbn:n0}{a}L = \{a^{n}b^{n}: n \geq 0\} \cup \{a\}.

Linz Figure 7.2 shows a transition graph for this npda. The triples marking the edges represent the input symbol, the symbol at the top of the stack, and the string pushed back on the stack.

Linz Fig. 7.2: Transition Graph for Example 7.2

7.1.5 Instantaneous Descriptions of Pushdown Automata

Transition graphs are useful for describing pushdown automata, but they are not useful in formal reasoning about them. For that, we use a concise notation for describing configurations of pushdown automata with tuples.

The triple (q,w,u)(q, w, u) where

qq is the control unit state
ww is the unread input string
uu is the stack as string, beginning with the top symbol

is called an instantaneous description of a pushdown automaton.

We introduce the symbol \vdash to denote a move from one instantaneous description to another such that

(q1,aw,bx)(q2,w,yx)(q_{1}, aw, bx) \vdash (q_{2}, w, yx)

is possible if and only if

(q2,y)δ(q1,a,b)(q_{2},y) \in \delta(q_{1}, a, b).

We also introduce the notation *\vdash^{*} to denote an arbitrary number of steps of the machine.

7.1.6 Language Accepted by an NPDA

Linz Definition 7.2 (Language Accepted by a Pushdown Automaton): Let M=(Q,Σ,Γ,δ,q0,z,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, z, F) be a nondeterministic pushdown automaton. The language accepted by MM is the set

L(M)={wΣ*:(q0,w,z)M*(p,λ,u),pF,uΓ*}L(M) = \{w \in \Sigma^{*}: (q_{0}, w, z) \vdash^{*}_{M} (p, \lambda, u), p \in F, u \in \Gamma^{*}\}.

In words, the language accepted by MM is the set of all strings that can put MM into a final state at the end of the string. The stack content uu is irrelevant to this definition of acceptance.
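To make Definition 7.2 concrete, here is a small simulator (an illustrative sketch, not from the textbook) that searches an npda’s possible move sequences breadth-first over instantaneous descriptions. The encoding of δ\delta as a dictionary, with '' standing for λ\lambda, is an assumption of this example.

```python
from collections import deque

def npda_accepts(delta, start_state, start_stack, finals, w, max_steps=10000):
    """Search all move sequences of an npda for an accepting one.

    delta maps (state, input symbol or '' for a lambda-move, stack top)
    to a set of (new state, string pushed back on the stack) pairs.
    A configuration is an instantaneous description (q, unread input, stack),
    with the stack stored top-first as a string.
    """
    frontier = deque([(start_state, w, start_stack)])
    seen = set()
    steps = 0
    while frontier and steps < max_steps:
        q, rest, stack = frontier.popleft()
        if (q, rest, stack) in seen:
            continue
        seen.add((q, rest, stack))
        steps += 1
        if rest == '' and q in finals:
            return True          # final state with all input consumed
        if not stack:
            continue             # empty stack: no further move possible
        top, below = stack[0], stack[1:]
        # try a lambda-move and, if input remains, a move on the next symbol
        for a in ([''] + ([rest[0]] if rest else [])):
            for (p, push) in delta.get((q, a, top), set()):
                frontier.append((p, rest[len(a):], push + below))
    return False
```

With the transition rules of Example 7.2 encoded this way, the simulator accepts exactly the strings in {anbn:n0}{a}\{a^{n}b^{n}: n \geq 0\} \cup \{a\} and rejects all others.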

7.1.7 Linz Example 7.4

Construct an npda for the language

L={w{a,b}*:na(w)=nb(w)}L = \{w \in \{a, b\}^{*}: n_{a}(w) = n_{b}(w)\}.

We must count aa’s and bb’s, but their relative order is not important.

The basic idea is:

But, what if nb>nan_{b} > n_{a} at some point?

We must allow a “negative” count. So we modify the above solution as follows:

So the solution is an npda.

M=({q0,qf},{a,b},{0,1,z},δ,q0,z,{qf})M = (\{q_{0}, q_{f}\}, \{a, b\}, \{0, 1, z\}, \delta, q_{0}, z, \{q_{f}\}), with δ\delta given as

  1. δ(q0,λ,z)={(qf,z)}\delta(q_{0}, \lambda, z) = \{(q_{f}, z)\}
  2. δ(q0,a,z)={(q0,0z)}\delta(q_{0}, a, z) = \{(q_{0}, 0z)\}
  3. δ(q0,b,z)={(q0,1z)}\delta(q_{0}, b, z) = \{(q_{0}, 1z)\}
  4. δ(q0,a,0)={(q0,00)}\delta(q_{0}, a, 0) = \{(q_{0}, 00)\}
  5. δ(q0,b,0)={(q0,λ)}\delta(q_{0}, b, 0) = \{(q_{0}, \lambda)\}
  6. δ(q0,a,1)={(q0,λ)}\delta(q_{0}, a, 1) = \{(q_{0}, \lambda)\}
  7. δ(q0,b,1)={(q0,11)}\delta(q_{0}, b, 1) = \{(q_{0}, 11)\}.

Linz Figure 7.3 shows a transition graph for this npda.

Linz Fig. 7.3: Transition Graph for Example 7.4

In processing the string baabbaab, the npda makes the following moves (as indicated by transition rule number):

(q0,baab,z)(q_{0}, baab, z)
(3) \vdash (q0,aab,1z)(q_{0}, aab, 1z)
(6) \vdash (q0,ab,z)(q_{0}, ab, z)
(2) \vdash (q0,b,0z)(q_{0}, b, 0z)
(5) \vdash (q0,λ,z)(q_{0}, \lambda, z)
(1) \vdash (qf,λ,z)(q_{f}, \lambda, z)

Hence, the string is accepted.

7.1.8 Linz Example 7.5

Construct an npda for accepting the language L={wwR:w{a,b}+}L = \{ww^{R}: w \in \{a, b\}^{+}\}.

The basic idea is:

Problem: How do we find the middle?

Solution: Use nondeterminism!

For L={wwR:w{a,b}+}L = \{ww^{R}: w \in \{a, b\}^{+}\}, a solution to the problem is given by M=(Q,Σ,Γ,δ,q0,z,F)M = (Q,\Sigma,\Gamma,\delta,q_{0},z,F), where:

Q={q0,q1,q2}Q = \{q_{0}, q_{1}, q_{2}\}
Σ={a,b}\Sigma = \{a, b\}
Γ={a,b,z}\Gamma = \{a, b, z\} – which is Σ\Sigma plus the stack start symbol
F={q2}F = \{q_{2}\}

The transition function can be visualized as having several parts.

  1. δ(q1,λ,z)=\delta(q_{1}, \lambda, z) = {(q2,z)}\{(q_{2}, z)\}

Remember that, to be accepted, a final state must be reached with no unprocessed input remaining.

The sequence of moves accepting abbaabba is as follows, where the number in parentheses gives the transition rule applied:

(q0,abba,z)(q_{0}, abba, z)
(5) \vdash (q0,bba,az)(q_{0}, bba, az)
(2) \vdash (q0,ba,baz)(q_{0}, ba, baz)
(8) \vdash (q1,ba,baz)(q_{1}, ba, baz)
(10) \vdash (q1,a,az)(q_{1}, a, az)
(9) \vdash (q1,λ,z)(q_{1}, \lambda, z)
(11) \vdash (q2,λ,z)(q_{2}, \lambda, z)

7.2 Pushdown Automata and Context-Free Languages

7.2.1 Pushdown Automata for CFGs

Underlying idea: Given a context-free language, construct an npda that simulates a leftmost derivation for any string in the language.

We assume the context-free language is represented as a grammar in Greibach Normal Form, as defined in Linz Chapter 6. We did not cover that chapter, but the definition and key theorem are shown below.

Greibach Normal Form restricts the positions at which terminals and variables can appear in a grammar’s productions.

Linz Definition 6.5 (Greibach Normal Form): A context-free grammar is said to be in Greibach Normal Form if all productions have the form

AaxA \rightarrow ax,

where aTa \in T and xV*x \in V^{*}.

The structure of a grammar in Greibach Normal Form is similar to that of an s-grammar except that, unlike s-grammars, the grammar does not restrict pairs (A,a)(A,a) to single occurrences within the set of productions.

Linz Theorem 6.7 (Existence of Greibach Normal Form Grammars): For every context-free grammar GG with λL(G)\lambda \notin L(G), there exists an equivalent grammar Ĝ\hat{G} in Greibach normal form.

Underlying idea, continued: Consider a sentential form, for example,

x1x2x3x4x5x6x_{1} x_{2} x_{3} x_{4} x_{5} x_{6}

where x1x2x3x_{1} x_{2} x_{3} are the terminals read from the input and x4x5x6x_{4} x_{5} x_{6} are the variables on the stack.

Consider a production AaxA \rightarrow ax.

If variable AA is on top of stack and terminal aa is the next input symbol, then remove AA from the stack and push back xx.

An npda transition function δ\delta for AaxA \rightarrow ax must be defined to have the move

(q,aw,Ay)(q,w,xy)(q, aw, Ay) \vdash (q, w, xy)

for some state qq, input string suffix ww, and stack yy. Thus, we define δ\delta such that

δ(q,a,A)={(q,x)}\delta(q, a, A) = \{(q, x)\}.

7.2.2 Linz Example 7.6

Construct a pda to accept the language generated by grammar with productions

SaSbb|aS \rightarrow aSbb \ |\ a.

First, we transform this grammar into Greibach Normal Form:

SaSA|aS \rightarrow aSA\ |\ a
AbBA \rightarrow bB
BbB \rightarrow b

We define the pda to have three states – an initial state q0q_{0}, a final state q2q_{2}, and an intermediate state q1q_{1}.

We define the initial transition rule to push the start symbol SS on the stack:

δ(q0,λ,z)={(q1,Sz)}\delta(q_{0}, \lambda, z) = \{(q_{1}, Sz)\}

We simulate the production SaSAS \rightarrow aSA with a transition that reads aa from the input and replaces SS on the top of the stack by SASA.

Similarly, we simulate the production SaS \rightarrow a with a transition that reads aa while simply removing SS from the top of the stack. We represent these two productions in the pda as the nondeterministic transition rule:

δ(q1,a,S)={(q1,SA),(q1,λ)}\delta(q_{1}, a, S) = \{(q_{1}, SA), (q_{1}, \lambda)\}

Doing the same for the other productions, we get transition rules:

δ(q1,b,A)={(q1,B)}\delta(q_{1}, b, A) = \{(q_{1}, B)\}
δ(q1,b,B)={(q1,λ)}\delta(q_{1}, b, B) = \{(q_{1}, \lambda)\}

When the stack start symbol appears at the top of the stack, the derivation is complete. We define a transition rule to move the pda into its final state:

δ(q1,λ,z)={(q2,λ)}\delta(q_{1}, \lambda, z) = \{(q_{2}, \lambda)\}
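The rule-by-rule translation used in this example can be sketched as a small routine (a hypothetical helper; the dictionary encoding of δ\delta, with '' marking λ\lambda-moves, is an assumption). It emits the startup rule, one transition per GNF production AaxA \rightarrow ax, and the shutdown rule.

```python
def gnf_to_npda(productions, start_variable):
    """Build npda transition rules from a Greibach Normal Form grammar.

    productions maps each variable to a list of right-hand sides, each a
    terminal followed by a (possibly empty) string of variables.
    Returns delta as a dict from (state, input or '', stack top) to a set
    of (state, pushed string) pairs, with states q0 (initial), q1
    (simulation), and q2 (final).
    """
    delta = {('q0', '', 'z'): {('q1', start_variable + 'z')},  # push S
             ('q1', '', 'z'): {('q2', '')}}                    # pop z, accept
    for A, rhss in productions.items():
        for rhs in rhss:
            a, x = rhs[0], rhs[1:]     # production A -> a x
            delta.setdefault(('q1', a, A), set()).add(('q1', x))
    return delta
```

Applied to the grammar of this example (S → aSA | a, A → bB, B → b), the routine reproduces the transition rules derived above.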

7.2.3 Constructing an NPDA for a CFG

Linz Theorem 7.1 (Existence of NPDA for Context-Free Language): For any context-free language LL, there exists an npda MM such that L=L(M)L = L(M).

Proof: The proof partly follows from the following construction (algorithm).

Algorithm to construct an npda for a context-free grammar

7.2.4 Linz Example 7.7

Consider the grammar:

SaAS \rightarrow aA
AaABC|bB|aA \rightarrow aABC\ |\ bB\ |\ a
BbB \rightarrow b
CcC \rightarrow c

This grammar is already in Greibach Normal Form. So we can apply the algorithm above directly.

In addition to the transition rules for the startup and shutdown, i.e.,

  1. δ(q0,λ,z)\delta(q_{0}, \lambda, z) == {(q1,Sz)}\{(q_{1}, Sz)\}
  2. δ(q1,λ,z)\delta(q_{1}, \lambda, z) == {(qf,z)}\{(q_{f}, z)\}

the npda has the following transition rules for the productions:

  3. δ(q1,a,S)\delta(q_{1}, a, S) == {(q1,A)}\{(q_{1}, A)\}
  4. δ(q1,a,A)\delta(q_{1}, a, A) == {(q1,ABC),(q1,λ)}\{(q_{1}, ABC), (q_{1}, \lambda)\}
  5. δ(q1,b,A)\delta(q_{1}, b, A) == {(q1,B)}\{(q_{1}, B)\}
  6. δ(q1,b,B)\delta(q_{1}, b, B) == {(q1,λ)}\{(q_{1}, \lambda)\}
  7. δ(q1,c,C)\delta(q_{1}, c, C) == {(q1,λ)}\{(q_{1}, \lambda)\}

The sequence of moves made by MM in processing aaabcaaabc is

(q0,aaabc,z)(q_{0}, aaabc, z)
(1) \vdash (q1,aaabc,Sz)(q_{1}, aaabc, Sz)
(3) \vdash (q1,aabc,Az)(q_{1}, aabc, Az)
(4a) \vdash (q1,abc,ABCz)(q_{1}, abc, ABCz)
(4b) \vdash (q1,bc,BCz)(q_{1}, bc, BCz)
(6) \vdash (q1,c,Cz)(q_{1}, c, Cz)
(7) \vdash (q1,λ,z)(q_{1}, \lambda, z)
(2) \vdash (qf,λ,z)(q_{f}, \lambda, z)

This corresponds to the derivation

SaAaaABCaaaBCaaabCaaabc.S \Rightarrow aA \Rightarrow aaABC \Rightarrow aaaBC \Rightarrow aaabC \Rightarrow aaabc.

The previous construction assumed Greibach Normal Form. This is not necessary, but the needed construction technique is more complex, as sketched below.

ABxA \rightarrow Bx

(q1,Bx)δ(q1,λ,A)(q_{1}, Bx) \in \delta(q_{1}, \lambda, A)

AabCxA \rightarrow abCx

e.g.,
(q2,λ)δ(q1,a,a)(q_{2}, \lambda) \in \delta(q_{1}, a, a)
(q3,λ)δ(q2,b,b)(q_{3}, \lambda) \in \delta(q_{2}, b, b)
(q1,Cx)δ(q3,λ,A)(q_{1}, Cx) \in \delta(q_{3}, \lambda, A)

etc.

7.2.5 Constructing a CFG for an NPDA

Linz Theorem 7.2 (Existence of a Context-Free Language for an NPDA): If L=L(M)L = L(M) for some npda MM, then LL is a context-free language.

Basic idea: To construct a context-free grammar from an npda, reverse the previous construction.

That is, construct a grammar to simulate npda moves:

This leads to a relatively complicated construction. This is described in the Linz textbook in more detail, but we will not cover it in this course.

7.3 Deterministic Pushdown Automata and Deterministic Context-Free Languages

7.3.1 Deterministic Pushdown Automata

A deterministic pushdown accepter (dpda) is a pushdown automaton that never has a choice in its move.

Linz Definition 7.3 (Deterministic Pushdown Automaton): A pushdown automaton M=(Q,Σ,Γ,δ,q0,z,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, z, F) is deterministic if it is an automaton as defined in Linz Definition 7.1, subject to the restrictions that, for every qQ,aΣ{λ}q \in Q, a \in \Sigma \cup \{\lambda\}, and bΓb \in \Gamma,

  1. δ(q,a,b)\delta(q, a, b) contains at most one element,

  2. if δ(q,λ,b)\delta(q, \lambda, b) is not empty, then δ(q,c,b)\delta(q, c, b) must be empty for every cΣc \in \Sigma.

Restriction 1 requires that, for any given input symbol and any stack top, at most one move can be made.

Restriction 2 requires that, when a λ\lambda-move is possible for some configuration, no input-consuming alternative is available.
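These two restrictions can be checked mechanically. The following sketch (an illustration, not from the textbook; the dictionary encoding of δ\delta with '' for a λ\lambda-move is an assumption) tests a transition table against Definition 7.3.

```python
def is_deterministic(delta, states, input_alphabet, stack_alphabet):
    """Check the two restrictions of Linz Definition 7.3.

    delta maps (state, input symbol or '' for a lambda-move, stack top)
    to a set of (state, pushed string) pairs.
    """
    for q in states:
        for b in stack_alphabet:
            # Restriction 1: at most one move for each configuration
            for a in list(input_alphabet) + ['']:
                if len(delta.get((q, a, b), set())) > 1:
                    return False
            # Restriction 2: a lambda-move excludes input-consuming moves
            if delta.get((q, '', b)):
                if any(delta.get((q, a, b)) for a in input_alphabet):
                    return False
    return True
```

For instance, the pda of Example 7.10 below passes this check, while adding an overlapping λ\lambda-move makes it fail.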

Consider the difference between this dpda definition and the dfa definition:

Linz Definition 7.4 (Deterministic Context-Free Language): A language LL is a deterministic context-free language if and only if there exists a dpda MM such that L=L(M)L = L(M).

7.3.2 Linz Example 7.10

The language L={anbn:n0}L = \{a^{n}b^{n}: n \geq 0\} is a deterministic context-free language.

The pda M=({q0,q1,q2},{a,b},{0,1},δ,q0,0,{q0})M = (\{q_{0}, q_{1}, q_{2}\}, \{a,b\}, \{0,1\}, \delta, q_{0}, 0, \{q_{0}\}) with transition rules

δ(q0,a,0)={(q1,10)}\delta(q_{0}, a, 0) = \{(q_{1}, 10)\}
δ(q1,a,1)={(q1,11)}\delta(q_{1}, a, 1) = \{(q_{1}, 11)\}
δ(q1,b,1)={(q2,λ)}\delta(q_{1}, b, 1) = \{(q_{2}, \lambda)\}
δ(q2,b,1)={(q2,λ)}\delta(q_{2}, b, 1) = \{(q_{2}, \lambda)\}
δ(q2,λ,0)={(q0,λ)}\delta(q_{2}, \lambda, 0) = \{(q_{0}, \lambda)\}

accepts the given language. This pda satisfies the restrictions of Linz Definition 7.3, so it is deterministic. Hence, by Linz Definition 7.4, LL is a deterministic context-free language.

7.3.3 Linz Example 7.5 Revisited

Consider language

L={wwR:w{a,b}+}L = \{ww^{R}: w \in \{a, b\}^{+}\}

and machine

M=(Q,Σ,Γ,δ,q0,z,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, z, F)

where:

Q={q0,q1,q2}Q = \{q_{0}, q_{1}, q_{2}\}
Σ={a,b}\Sigma = \{a, b\}
Γ={a,b,z}\Gamma = \{a, b, z\}
F={q2}F = \{q_{2}\}

The transition function can be visualized as having several parts:

δ(q0,a,a)={(q0,aa)}\delta(q_{0}, a, a) = \{(q_{0}, aa)\} \leftarrow Restriction 2 violation
δ(q0,b,a)={(q0,ba)}\delta(q_{0}, b, a) = \{(q_{0}, ba)\}
δ(q0,a,b)={(q0,ab)}\delta(q_{0}, a, b) = \{(q_{0}, ab)\}
δ(q0,b,b)={(q0,bb)}\delta(q_{0}, b, b) = \{(q_{0}, bb)\}
δ(q0,a,z)={(q0,az)}\delta(q_{0}, a, z) = \{(q_{0}, az)\}
δ(q0,b,z)={(q0,bz)}\delta(q_{0}, b, z) = \{(q_{0}, bz)\}

δ(q0,λ,a)={(q1,a)}\delta(q_{0}, \lambda, a) = \{(q_{1}, a)\} \leftarrow Restriction 2 violation
δ(q0,λ,b)={(q1,b)}\delta(q_{0}, \lambda, b) = \{(q_{1}, b)\}

δ(q1,a,a)={(q1,λ)}\delta(q_{1}, a, a) = \{(q_{1}, \lambda)\}
δ(q1,b,b)={(q1,λ)}\delta(q_{1}, b, b) = \{(q_{1}, \lambda)\}

δ(q1,λ,z)={(q2,z)}\delta(q_{1}, \lambda, z) = \{(q_{2}, z)\}

This machine violates Restriction 2 of Linz Definition 7.3 (Deterministic Pushdown Automaton) as indicated above. Thus, it is not deterministic.

Moreover, LL is itself not deterministic (which is not proven here).

7.4 Grammars for Deterministic Context-Free Grammars

Deterministic context-free languages are important because they can be parsed efficiently.

An LL-grammar is a generalization of the concept of s-grammar. This family of grammars generates the deterministic context-free languages.

Compilers for practical programming languages may use top-down parsers based on LL-grammars to parse the languages efficiently.

8 Properties of Context-Free Languages

Chapter 4 examines the closure properties of the family of regular languages, algorithms for determining various properties of regular languages, and methods for proving languages are not regular (e.g., the Pumping Lemma).

Chapter 8 examines similar aspects of the family of context-free languages.

8.1 Two Pumping Lemmas

Because of insufficient time and extensive coverage of the Pumping Lemma for regular languages, we will not cover the Pumping Lemmas for Context-Free Languages in this course. See section 8.1 of the Linz textbook if you are interested in this topic.

8.1.1 Context-Free Languages

Linz Section 8.1 includes the following language examples. The results of these are used in the remainder of this chapter.

  1. Linz Example 8.1 shows L={anbncn:n0}L = \{ a^{n}b^{n}c^{n} : n \geq 0 \} is not context free.

  2. Linz Example 8.2 shows L={ww:w{a,b}*}L = \{ ww : w \in \{a,b\}^{*} \} is not context free.

  3. Linz Example 8.3 shows L={an!:n0}L = \{ a^{n!} : n \geq 0 \} is not context free.

  4. Linz Example 8.4 shows L={anbj:n=j2}L = \{ a^{n}b^{j} : n = j^{2} \} is not context free.

8.1.2 Linear Languages

Linz Section 8.1 includes the following definitions. (The definition of linear grammar is actually from Chapter 3.)

Definition (Linear Grammar): A linear grammar is a grammar in which at most one variable can appear on the right side of any production.

A linear context-free grammar is thus a context-free grammar that is also a linear grammar.

Linz Definition 8.5 (Linear Language): A context-free language LL is linear if there exists a linear context-free grammar GG such that L=L(G)L = L(G).

Linz Section 8.1 also includes the following language examples.

  1. Linz Example 8.5 shows L={anbn:n0}L = \{ a^{n}b^{n} : n \geq 0 \} is a linear language.

  2. Linz Example 8.6 shows L={w:na(w)=nb(w)}L = \{ w : n_{a}(w) = n_{b}(w) \} is not linear.

8.2 Closure Properties and Decision Algorithms for Context-Free Languages

In most cases, the proofs and algorithms for the properties of regular languages rely upon manipulation of transition graphs for finite automata. Hence, they are relatively straightforward.

When we consider similar properties for context-free languages, we encounter more difficulties.

Let’s consider closure under the simple set operations as we did for regular languages in Linz Theorem 4.1.

8.2.1 Closure under Union, Concatenation, and Star-Closure

Linz Theorem 8.3 (Closure under Union, Concatenation, and Star-Closure): The family of context-free languages is closed under (a) union, (b) concatenation, and (c) star-closure.

(8.3a) Proof of Closure under Union:

Let L1L_{1} and L2L_{2} be context-free languages with the corresponding context-free grammars G1=(V1,T1,S1,P1)G_{1} = (V_{1}, T_{1}, S_{1}, P_{1}) and G2=(V2,T2,S2,P2)G_{2} = (V_{2}, T_{2}, S_{2}, P_{2}).

Assume V1V_{1} and V2V_{2} are disjoint. (If not, we can make them so by renaming.)

Consider L(G3)L(G_{3}) where

G3=(V1V2{S3},T1T2,S3,P3)G_{3} = (V_{1} \cup V_{2} \cup \{S_{3}\}, T_{1} \cup T_{2}, S_{3}, P_{3})

with:

S3V1V2S_{3} \notin V_{1} \cup V_{2} – i.e, S3S_{3} is a fresh variable
P3=P1P2{S3S1|S2}P_{3} = P_{1} \cup P_{2} \cup \{\ S_{3} \rightarrow S_{1}\ |\ \ S_{2}\ \}

Clearly, G3G_{3} is a context-free grammar. So L(G3)L(G_{3}) is a context-free language.

Now, we need to show that L(G3)=L1L2L(G_{3}) = L_{1} \cup L_{2}.

For wL1w \in L_{1}, there is a derivation in G3G_{3}:

  1. S3S1*wS_{3} \Rightarrow S_{1} \overset{*}{\Rightarrow} w

Similarly, for wL2w \in L_{2}, there is a derivation in G3G_{3}:

  1. S3S2*wS_{3} \Rightarrow S_{2} \overset{*}{\Rightarrow} w

Also, for wL(G3)w \in L(G_{3}), the first step of the derivation must be either (1) S3S1S_{3} \Rightarrow S_{1} or (2) S3S2S_{3} \Rightarrow S_{2}.

For choice 1, the sentential forms derived from S1S_{1} only have variables from V1V_{1}. But V1V_{1} is disjoint from V2V_{2}. Thus the derivation

S1*wS_{1} \overset{*}{\Rightarrow} w

can only involve productions from P1P_{1}. Hence, for choice 1, wL1w \in L_{1}.

Using a similar argument for choice 2, we conclude wL2w \in L_{2}.

Therefore, L(G3)=L1L2L(G_{3}) = L_{1} \cup L_{2}.

QED.

(8.3b) Proof of Closure under Concatenation:

Consider L(G4)L(G_{4}) where

G4=(V1V2{S4},T1T2,S4,P4)G_{4} = (V_{1} \cup V_{2} \cup \{ S_{4} \}, T_{1} \cup T_{2}, S_{4}, P_{4})

with:

S4V1V2S_{4} \notin V_{1} \cup V_{2}
P4=P1P2{S4S1S2}P_{4} = P_{1} \cup P_{2} \cup \{\ S_{4} \rightarrow S_{1} S_{2}\ \}

Then L(G4)=L1L2L(G_{4}) = L_{1} L_{2} follows from a similar argument to the one in part (a).

QED.

(8.3c) Proof of Closure under Star-Closure:

Consider L(G5)L(G_{5}) where

G5=(V1{S5},T1,S5,P5)G_{5} = (V_{1} \cup \{ S_{5} \}, T_{1}, S_{5}, P_{5})

with:

S5V1S_{5} \notin V_{1}
P5=P1{S5S1S5|λ}P_{5} = P_{1} \cup \{\ S_{5} \rightarrow S_{1} S_{5}\ |\ \lambda\ \}

Then L(G5)=L1*L(G_{5}) = L_{1}^{*} follows from a similar argument to the one in part (a).

QED.
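The three constructions can be sketched directly as grammar transformations. Here a grammar is assumed to be a dict from variables to lists of right-hand sides (each a list of symbols); disjointness of the variable sets and freshness of the new start symbol are preconditions, just as in the proofs.

```python
def union_grammar(g1, s1, g2, s2, s_new='S'):
    """Theorem 8.3a: add the productions S -> S1 | S2.

    Assumes the variable sets of g1 and g2 are disjoint and s_new is fresh.
    """
    return {s_new: [[s1], [s2]], **g1, **g2}

def concat_grammar(g1, s1, g2, s2, s_new='S'):
    """Theorem 8.3b: add the production S -> S1 S2."""
    return {s_new: [[s1, s2]], **g1, **g2}

def star_grammar(g1, s1, s_new='S'):
    """Theorem 8.3c: add S -> S1 S | lambda (the empty right-hand side)."""
    return {s_new: [[s1, s_new], []], **g1}
```

For example, combining grammars for {anbn}\{a^{n}b^{n}\} and {cm}\{c^{m}\} yields the new start productions without touching the original rules.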

8.2.2 Non-Closure under Intersection and Complementation

Linz Theorem 8.4 (Non-closure under Intersection and Complementation): The family of context-free languages is not closed under (a) intersection and (b) complementation.

(8.4a) Proof of Non-closure under Intersection:

Assume the family of context-free languages is closed under intersection. Show that this leads to a contradiction.

It is sufficient to find two context-free languages whose intersection is not context-free.

Consider languages L1L_{1} and L2L_{2} defined as follows:

L1={anbncm:n0,m0}L_{1} = \{ a^{n}b^{n}c^{m} : n \geq 0, m \geq 0 \}
L2={anbmcm:n0,m0}L_{2} = \{ a^{n}b^{m}c^{m} : n \geq 0, m \geq 0 \}

One way to show that a language is context-free is to find a context-free grammar that generates it. The following context-free grammar generates L1L_{1}:

SS1S2S \rightarrow S_{1}S_{2}
S1aS1b|λS_{1} \rightarrow aS_{1}b\ |\ \lambda
S2cS2|λS_{2} \rightarrow cS_{2}\ |\ \lambda

Alternatively, we could observe that L1L_{1} is the concatenation of two context-free languages and, hence, context-free by Linz Theorem 8.3 above.

Similarly, we can show that L2L_{2} is context free.

From the assumption, we thus have that L1L2L_{1} \cap L_{2} is context free.

But

L1L2={anbncn:n0}L_{1} \cap L_{2} = \{ a^{n}b^{n}c^{n} : n \geq 0 \},

which is not context free. Linz proves this in Linz Example 8.1 (which is in the part of this chapter we did not cover in this course).

Thus we have a contradiction. Therefore, the family of context-free languages is not closed under intersection.

QED.

(8.4b) Proof of Non-closure under Complementation:

Assume the family of context-free languages is closed under complementation. Show that this leads to a contradiction.

Consider arbitrary context-free languages L1L_{1} and L2L_{2}.

From set theory, we know that

L1L2L_{1} \cap L_{2} == L1L2¯\overline{\bar{L_{1}} \cup \bar{L_{2}}}.

From Linz Theorem 8.3 and the assumption that context-free languages are closed under complementation, we deduce that the right side (L1L2¯\overline{\bar{L_{1}} \cup \bar{L_{2}}}) is a context-free language for all L1L_{1} and L2L_{2}.

However, we know from part (a) that the left side (L1L2L_{1} \cap L_{2}) is not necessarily a context-free language for all L1L_{1} and L2L_{2}.

Thus we have a contradiction. Therefore, the family of context-free languages is not closed under complementation.

QED.

8.2.3 Closure under Regular Intersection

Although context-free languages are not, in general, closed under intersection, there is a useful special case that is closed.

Linz Theorem 8.5 (Closure Under Regular Intersection): Let L1L_{1} be a context-free language and L2L_{2} be a regular language. Then L1L2L_{1} \cap L_{2} is context free.

Proof:

Let M1=(Q,Σ,Γ,δ1,q0,z,F1)M_{1} = (Q,\Sigma,\Gamma,\delta_{1},q_{0},z,F_{1}) be an npda that accepts context-free language L1L_{1}.

Let M2=(P,Σ,δ2,p0,F2)M_{2} = (P,\Sigma,\delta_{2},p_{0},F_{2}) be a dfa that accepts regular language L2L_{2}.

We construct an npda

M̂=(Q̂,Σ,Γ,δ̂,q0̂,F̂)\widehat{M} = (\widehat{Q},\Sigma,\Gamma,\widehat{\delta},\widehat{q_{0}},\widehat{F})

that simulates M1M_{1} and M2M_{2} operating simultaneously (i.e., executes the moves of both machines for each input symbol).

We choose pairs of states from M1M_{1} and M2M_{2} to represent the states of M̂\widehat{M} as follows:

Q̂=Q×P\widehat{Q} = Q \times P
q0̂=(q0,p0)\widehat{q_{0}} = (q_{0},p_{0})
F̂=F1×F2\widehat{F} = F_{1} \times F_{2}

We specify δ̂\widehat{\delta} such that the moves of M̂\widehat{M} correspond to simultaneous moves of M1M_{1} and M2M_{2}. That is,

((qk,pl),x)δ̂((qi,pj),a,b)((q_{k},p_{l}),x) \in \widehat{\delta}((q_{i},p_{j}),a,b)

if and only if

(qk,x)δ1(qi,a,b)(q_{k},x) \in \delta_{1}(q_{i},a,b)

and

δ2(pj,a)=pl\delta_{2}(p_{j},a) = p_{l}.

For moves (qi,λ,b)(q_{i},\lambda,b) in δ1\delta_{1}, we extend δ2\delta_{2} so that δ2(pl,λ)=pl\delta_{2}(p_{l},\lambda) = p_{l} for all ll.

By induction on the length of the derivations, we can prove that

((q0,p0),w,z)M̂*((qr,ps),λ,x)((q_{0},p_{0}), w, z) \vdash^{*}_{\widehat{M}} ((q_{r},p_{s}), \lambda, x),

with qrF1q_{r} \in F_{1} and psF2p_{s} \in F_{2} if and only if

(q0,w,z)M1*(qr,λ,x)(q_{0}, w, z) \vdash^{*}_{M_{1}} (q_{r}, \lambda, x)

and

δ2*(p0,w)=ps\delta_{2}^{*}(p_{0},w) = p_{s}.

Therefore, a string is accepted by M̂\widehat{M} if and only if it is accepted by both M1M_{1} and M2M_{2}. That is, the string is in L(M1)L(M2)=L1L2L(M_{1}) \cap L(M_{2}) = L_{1} \cap L_{2}.

QED.
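The construction of δ̂\widehat{\delta} can be sketched as a routine that pairs each npda rule with the matching dfa move. This is an illustrative sketch with an assumed dictionary encoding (λ\lambda-moves written as ''), not code from the textbook.

```python
def product_delta(delta1, delta2):
    """Theorem 8.5 construction: run an npda and a dfa in lockstep.

    delta1 maps (q, a or '', stack top) to a set of (q', pushed string);
    delta2 maps (p, a) to p'. Lambda-moves of the npda leave the dfa
    state unchanged. Product states are (q, p) pairs.
    """
    dhat = {}
    dfa_states = {p for (p, _) in delta2} | set(delta2.values())
    for (q, a, b), moves in delta1.items():
        for p in dfa_states:
            if a == '':
                pl = p                    # extended: delta2(p, lambda) = p
            elif (p, a) in delta2:
                pl = delta2[(p, a)]
            else:
                continue                  # dfa has no move: dead pair
            dhat.setdefault(((q, p), a, b), set()).update(
                ((qk, pl), x) for (qk, x) in moves)
    return dhat
```

Each rule of the product machine fires exactly when both component machines can move, which is the heart of the proof above.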

8.2.4 Linz Example 8.7

Show that the language

L={anbn:n0,n100}L = \{ a^{n}b^{n} : n \geq 0, n \neq 100 \}

is context free.

We can construct an npda or context-free grammar for LL, but this is tedious. Instead, we use closure of regular intersection (Linz Theorem 8.5).

Let L1={a100b100}L_{1} = \{ a^{100}b^{100} \}.

L1L_{1} is finite, and thus also regular. Hence, L1\bar{L_{1}} is regular because regular languages are closed under complementation.

From previous results, we know that L={anbn:n0}L = \{ a^{n}b^{n} : n \geq 0 \} is context free.

Clearly, L={anbn:n0}L1L = \{ a^{n}b^{n} : n \geq 0 \} \cap \bar{L_{1}}.

By the closure of context-free languages under regular intersection, LL is a context-free language.

8.2.5 Linz Example 8.8

Show that

L={w{a,b,c}*:na(w)=nb(w)=nc(w)}L = \{ w \in \{a,b,c\}^{*} : n_{a}(w) = n_{b}(w) = n_{c}(w) \}

is not context free.

Although we could use the Pumping Lemma for Context-Free Languages, we again use closure of regular intersection (Linz Theorem 8.5).

Assume that LL is context free. Show that this leads to a contradiction.

Thus

LL(a*b*c*)={anbncn:n0}L \cap L( a^{*} b^{*} c^{*} ) = \{ a^{n} b^{n} c^{n} : n \geq 0 \}

is also context free. But we have previously proved that this language is not context free.

Thus we have a contradiction. Therefore, LL is not context free.

8.2.6 Some Decidable Properties of Context Free Languages

There exist algorithms for determining whether a context-free language is empty or nonempty and finite or infinite.

These algorithms process the context-free grammars for the languages. They assume that the grammars are first transformed using various algorithms from Linz Chapter 6 (which we did not cover in this course).

The algorithms from Chapter 6 include the removal of

Linz Theorem 8.6 (Determining Empty Context-Free Languages): Given a context-free grammar G=(V,T,S,P)G = (V,T,S,P), there exists an algorithm for determining whether or not L(G)L(G) is empty.

Basic idea of algorithm: Assuming λL\lambda \notin L, remove the useless productions. If the start symbol is useless, then LL is empty. Otherwise, LL is nonempty.

Linz Theorem 8.7 (Determining Infinite Context-Free Languages): Given a context-free grammar G=(V,T,S,P)G = (V,T,S,P), there exists an algorithm for determining whether or not L(G)L(G) is infinite.

Basic idea of algorithm: Remove useless symbols, λ\lambda-productions, and unit productions. If there are variables AA that repeat as in

A*xAyA \overset{*}{\Rightarrow} xAy

then the language is infinite. Otherwise, the language is finite. To determine repeated variables, we can build a graph of the dependencies of the variables on each other. If this graph has a cycle, then the variable at the base of the cycle is repeated.
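Under the stated assumption that useless symbols, λ\lambda-productions, and unit productions have already been removed, the cycle test can be sketched as a depth-first search over the variable dependency graph (a hypothetical encoding; right-hand sides are lists of symbols).

```python
def is_infinite(productions):
    """Sketch of the Theorem 8.7 test: does the variable dependency
    graph have a cycle?

    productions maps each variable to a list of right-hand sides, each
    a list of symbols. Edge A -> B exists when B occurs on some
    right-hand side of A.
    """
    graph = {A: {s for rhs in rhss for s in rhs if s in productions}
             for A, rhss in productions.items()}

    WHITE, GRAY, BLACK = 0, 1, 2
    color = {A: WHITE for A in graph}

    def has_cycle(A):
        color[A] = GRAY
        for B in graph[A]:
            if color[B] == GRAY:
                return True        # back edge: some A =>* xAy is possible
            if color[B] == WHITE and has_cycle(B):
                return True
        color[A] = BLACK
        return False

    return any(color[A] == WHITE and has_cycle(A) for A in graph)
```

For example, S → aSb | ab has the self-dependency S → S (infinite language), while S → aA, A → b has an acyclic dependency graph (finite language).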

Unfortunately, other simple properties are not as easy as the above.

For example, there is no algorithm to determine whether two context-free grammars generate the same language.

9 Turing Machines

A finite accepter (nfa, dfa)

A pushdown accepter (npda, dpda)

The family of regular languages is a subset of the deterministic context-free languages, which is a subset of the context-free languages.

But, as we saw in Chapter 8, not all languages of interest are context-free. To accept languages like {anbncn:n0}\{ a^{n}b^{n}c^{n} : n \geq 0 \} and {ww:w{a,b}*}\{ ww : w \in \{a,b\}^{*} \}, we need an automaton with a more flexible internal storage mechanism.

What kind of internal storage is needed to allow the machine to accept languages such as these? multiple stacks? a queue? some other mechanism?

More ambitiously, what is the most powerful automaton we can define? What are the limits of mechanical computation?

This chapter introduces the Turing machine to explore these theoretical questions. The Turing machine is a fundamental concept in the theoretical study of computation.

The Turing machine

Although Turing machines are simple mechanisms, the Turing thesis (also known as the Church-Turing thesis) maintains that any computation that can be carried out on present-day computers can be done on a Turing machine.

Note: Much of the work on computability was published in the 1930s, before the advent of electronic computers a decade later. It included work by Austrian (and later American) logician Kurt Goedel on primitive recursive function theory, American mathematician Alonzo Church on lambda calculus (a foundation of functional programming), British mathematician Alan Turing (also later a PhD student of Church’s) on Turing machines, and American mathematician Emil Post on Post machines.

9.1 The Standard Turing Machine

9.1.1 What is a Turing Machine?

9.1.1.1 Schematic Drawing of Turing Machine

Linz Figure 9.1 shows a schematic drawing of a standard Turing machine.

This deviates from the general scheme given in Chapter 1 in that the input file, internal storage, and output mechanism are all represented by a single mechanism, the tape. The input is on the tape at initiation and the output is on that tape at termination.

On each move, the tape’s read-write head reads a symbol from the current tape cell, writes a symbol back to that cell, and moves one cell to the left or right.

Linz Fig. 9.1: Standard Turing Machine

9.1.1.2 Definition of Turing Machine

Turing machines were first defined by British mathematician Alan Turing in 1937, while he was a graduate student at Cambridge University.

Linz Definition 9.1 (Turing Machine): A Turing machine MM is defined by

M=(Q,Σ,Γ,δ,q0,,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, \Box, F)

where

  1. QQ is the set of internal states
  2. Σ\Sigma is the input alphabet
  3. Γ\Gamma is a finite set of symbols called the tape alphabet
  4. δ\delta is the transition function
  5. Γ\Box \in \Gamma is a special symbol called the blank
  6. q0Qq_{0} \in Q is the initial state
  7. FQF \subseteq Q is the set of final states

We also require

  8. ΣΓ{}\Sigma \subseteq \Gamma - \{\Box\}

and define

  9. δ:Q×ΓQ×Γ×{L,R}\delta : Q \times \Gamma \rightarrow Q \times \Gamma \times \{L, R\}.

Requirement 8 means that the blank symbol \Box cannot be either an input or an output of a Turing machine. It is the default content for any cell that has no meaningful content.

From requirement 9, we see that the arguments of the transition function δ\delta are the current state of the control unit and the current tape symbol.

The result of the transition function δ\delta gives the new state, the symbol written to the current tape cell, and the direction (LL or RR) in which the read-write head moves.

In general, δ\delta is a partial function. That is, not all configurations have a next move defined.
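Because the transition function is a finite partial function, a standard Turing machine can be simulated directly. The Python sketch below is our own scaffolding, not from the text: 'B' stands in for the blank, the transition function is a dictionary, and the machine halts exactly when no transition is defined for the current configuration.

```python
def run_tm(delta, w, state='q0', max_steps=10_000):
    """Simulate a standard Turing machine.

    delta maps (state, symbol) -> (new_state, written_symbol, 'L' or 'R').
    The tape is a dict from cell index to symbol; absent cells hold the
    blank 'B'.  The machine halts when delta is undefined for the current
    (state, symbol) pair; we then return (state, tape, head_position).
    Returns None if the step budget runs out (a presumed infinite loop).
    """
    tape = {i: s for i, s in enumerate(w)}
    pos = 0
    for _ in range(max_steps):
        sym = tape.get(pos, 'B')
        if (state, sym) not in delta:          # delta is partial: halt
            return state, tape, pos
        state, tape[pos], move = delta[(state, sym)]
        pos += 1 if move == 'R' else -1
    return None

# Linz Example 9.1: the single move delta(q0, a) = (q1, d, R).
state, tape, pos = run_tm({('q0', 'a'): ('q1', 'd', 'R')}, 'a')
# Halts in q1 with the a rewritten as d and the head one cell to the right.
```

The dictionary representation makes the partial nature of the transition function explicit: a missing key is exactly an undefined move.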

9.1.1.3 Linz Example 9.1

Consider a Turing machine with a move defined as follows:

δ(q0,a)=(q1,d,R)\delta(q_{0}, a) = (q_{1}, d, R)

Linz Figure 9.2 shows the situation (a) before the move and (b) after the move.

Linz Fig. 9.2: One Move of a Turing Machine

9.1.1.4 A Simple Computer

A Turing machine is a simple computer. It has

The Turing machine can

The transition function δ\delta determines the behavior of the machine, i.e., it is the machine’s program.

The Turing machine starts in initial state q0q_{0} and then goes through a sequence of moves defined by δ\delta. A cell on the tape may be read and written many times.

Eventually the Turing machine may enter a configuration for which δ\delta is undefined. When it enters such a configuration, the machine halts. Hence, such a state is called a halt state.

Typically, no transitions are defined on any final state.

9.1.1.5 Linz Example 9.2

Consider the Turing machine defined by

Q={q0,q1}Q = \{ q_{0}, q_{1} \},
Σ={a,b}\Sigma = \{ a, b\},
Γ={a,b,}\Gamma = \{a, b, \Box \},
F={q1}F = \{ q_{1} \}

where δ\delta is defined as follows:

  1. δ(q0,a)=(q0,b,R)\delta(q_{0}, a) = (q_{0}, b, R),
  2. δ(q0,b)=(q0,b,R)\delta(q_{0}, b) = (q_{0}, b, R),
  3. δ(q0,)=(q1,,L)\delta(q_{0}, \Box) = (q_{1}, \Box, L).

Linz Figure 9.3 shows a sequence of moves for this Turing machine:

Linz Fig. 9.3: A Sequence of Moves of a Turing Machine

9.1.1.6 Transition Graph for Turing Machine

As with finite and pushdown automata, we can use transition graphs to represent Turing machines. We label the edges of the graph with a triple giving (1) the current tape symbol, (2) the symbol that replaces it, and (3) the direction in which the read-write head moves.

Linz Figure 9.4 shows a transition graph for the Turing machine given in Linz Example 9.2.

Linz Fig. 9.4: Transition Graph for Example 9.2

9.1.1.7 Linz Example 9.3 (Infinite Loop)

Consider the Turing machine defined in Linz Figure 9.5.

Linz Fig. 9.5: Infinite Loop

Suppose the tape initially contains aba b \ldots with the read-write head positioned over the aa and in state q0q_{0}. Then the Turing machine executes the following sequence of moves:

  1. The machine reads symbol aa, leaves it unchanged, moves right (now over symbol bb), and enters state q1q_{1}.

  2. The machine reads bb, leaves it unchanged, moves back left (now over aa again), and enters state q0q_{0} again.

  3. The machine then repeats steps 1 and 2.

Clearly, regardless of the tape configuration, this machine does not halt. It goes into an infinite loop.
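The two moves described above are easy to reproduce in a direct simulation. The Python sketch below (our own scaffolding, with 'B' for the blank) bounds the number of steps and reports None when the budget is exhausted, which is the best a simulator can do in the face of a possible infinite loop.

```python
def run_bounded(delta, w, state='q0', max_steps=1_000):
    """Run a Turing machine for at most max_steps moves.
    Returns the halting state, or None if the machine is still running."""
    tape, pos = dict(enumerate(w)), 0
    for _ in range(max_steps):
        sym = tape.get(pos, 'B')
        if (state, sym) not in delta:
            return state                     # halted normally
        state, tape[pos], move = delta[(state, sym)]
        pos += 1 if move == 'R' else -1
    return None                              # step budget exhausted

# The machine of Linz Example 9.3: bounce between the a and the b forever.
loop_delta = {('q0', 'a'): ('q1', 'a', 'R'),
              ('q1', 'b'): ('q0', 'b', 'L')}
print(run_bounded(loop_delta, 'ab'))         # None: it never halts
```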

9.1.1.8 Standard Turing Machine

Because we can define a Turing machine in several different ways, it is useful to summarize the main features of our model.

A standard Turing machine:

  1. has a tape that is unbounded in both directions, allowing any number of left and right moves

  2. is deterministic in that δ\delta defines at most one move for each configuration

  3. has no special input or output files. At the initial time, the tape has some specified content, some of which is considered input. Whenever the machine halts, some or all of the contents of the tape is considered output.

These definitions are chosen for convenience in this chapter. Chapter 10 (which we do not cover in this course) examines alternative versions of the Turing machine concept.

9.1.1.9 Instantaneous Description of Turing Machine

As with pushdown automata, we use instantaneous descriptions to examine the configurations in a sequence of moves. The notation (using strings)

x1qx2x_{1} q x_{2}

or (using individual symbols)

a1a2ak1qakak+1ana_{1} a_{2} \cdots a_{k-1} q a_{k} a_{k+1} \cdots a_{n}

gives the instantaneous description of a Turing machine in state qq with the tape as shown in Linz Figure 9.6.

By convention, the read-write head is positioned over the symbol to the right of the state (i.e., aka_{k} above).

Linz Fig. 9.6: Tape Configuration a_{1} a_{2} \cdots a_{k-1} q a_{k} a_{k+1} \cdots a_{n}

A tape cell contains \Box if not otherwise defined to have a value.

Example: The diagrams in Linz Figure 9.3 (above) show the instantaneous descriptions q0aaq_{0} a a, bq0ab q_{0} a, bbq0b b q_{0} \Box, and bq1bb q_{1} b.

As with pushdown automata, we use \vdash to denote a move.

Thus, for transition rule

δ(q1,c)=(q2,e,R)\delta(q_{1}, c) = (q_{2}, e, R)

we can have the move

abq1cdabeq2da b q_{1} c d \vdash a b e q_{2} d.

As usual we denote the transitive closure of move (i.e., arbitrary number of moves) using:

*\vdash^*

We also use subscripts to distinguish among machines:

M\vdash_{M}

9.1.1.10 Computation of Turing Machine

Now let’s summarize the above discussion with the following definitions.

Linz Definition 9.2 (Computation): Let M=(Q,Σ,Γ,δ,q0,,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, \Box, F) be a Turing machine. Then any string a1ak1q1akak+1ana_{1} \cdots a_{k-1} q_{1} a_{k} a_{k+1} \cdots a_{n} with aiΓa_{i} \in \Gamma and q1Qq_{1} \in Q, is an instantaneous description of MM.

A move

a1ak1q1akak+1ana_{1} \cdots a_{k-1} q_{1} a_{k} a_{k+1} \cdots a_{n} \vdash a1ak1bq2ak+1ana_{1} \cdots a_{k-1} b q_{2} a_{k+1} \cdots a_{n}

is possible if and only if

δ(q1,ak)=(q2,b,R)\delta(q_{1}, a_{k}) = (q_{2}, b, R).

A move

a1ak1q1akak+1ana_{1} \cdots a_{k-1} q_{1} a_{k} a_{k+1} \cdots a_{n} \vdash a1ak2q2ak1bak+1ana_{1} \cdots a_{k-2} q_{2} a_{k-1} b a_{k+1} \cdots a_{n}

is possible if and only if

δ(q1,ak)=(q2,b,L)\delta(q_{1}, a_{k}) = (q_{2}, b, L).

MM halts starting from some initial configuration x1qix2x_{1} q_{i} x_{2} if

x1qix2*y1qjay2x_{1} q_{i} x_{2} \ \vdash^*\ y_{1} q_{j} a y_{2}

for some qjq_{j} and aa for which δ(qj,a)\delta(q_{j}, a) is undefined.

The sequence of configurations leading to a halt state is a computation.

If a Turing machine does not halt, we use the following special notation to describe its computation:

x1qx2*x_{1} q x_{2} \vdash^* \infty

9.1.2 Turing Machines as Language Acceptors

Can a Turing machine accept a string ww?

Yes, using the following setup: write ww on the tape surrounded by blanks, position the read-write head over the leftmost symbol of ww, start the machine in state q0q_{0}, and accept ww if the machine halts in a final state.

Linz Definition 9.3 (Language Accepted by Turing Machine): Let M=(Q,Σ,Γ,δ,q0,,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, \Box, F) be a Turing machine. Then the language accepted by MM is

L(M)={wΣ+:q0w*x1qfx2,qfF,x1,x2Γ*}L(M) = \{ w \in \Sigma^{+} : q_{0} w \vdash^{*} x_{1} q_{f} x_{2}, q_{f} \in F, x_{1}, x_{2} \in \Gamma^{*} \}.

Note: The finite string ww must be written to the tape with blanks on both sides. No blanks are embedded within the input string ww itself.

Question: What if wL(M)w \not\in L(M)?

The Turing machine might:

  1. halt in a nonfinal state
  2. never halt

Any string for which the machine does not halt is, by definition, not in L(M)L(M).

9.1.2.1 Linz Example 9.6

For Σ={0,1}\Sigma = \{ 0, 1 \}, design a Turing machine that accepts the language denoted by the regular expression 00*00^{*}.

We use two internal states Q={q0,q1}Q = \{ q_{0}, q_{1} \}, one final state F={q1}F = \{ q_{1} \}, and transition function:

δ(q0,0)=(q0,0,R)\delta(q_{0}, 0) = (q_{0}, 0, R),
δ(q0,)=(q1,,R)\delta(q_{0}, \Box) = (q_{1}, \Box, R)

The transition graph shown below implements this machine.

The Turing machine also halts in a final state if started in state q0q_{0} on a blank. We could interpret this as acceptance of λ\lambda, but for technical reasons the empty string is not included in Linz Definition 9.3.
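This two-rule machine can be checked mechanically. Below is a small Python sketch (scaffolding our own, with 'B' for the blank) that runs the machine and applies Linz Definition 9.3: a string is accepted exactly when the machine halts in a final state.

```python
BLANK = 'B'

def accepts(delta, finals, w, max_steps=10_000):
    """Accept w iff the machine, started in q0 on the leftmost symbol of w,
    halts (delta undefined) in a state belonging to finals."""
    tape, pos, state = dict(enumerate(w)), 0, 'q0'
    for _ in range(max_steps):
        sym = tape.get(pos, BLANK)
        if (state, sym) not in delta:
            return state in finals
        state, tape[pos], move = delta[(state, sym)]
        pos += 1 if move == 'R' else -1
    return False   # did not halt within the budget: not accepted here

# Linz Example 9.6: accept the language of the regular expression 00*.
delta_00 = {('q0', '0'): ('q0', '0', 'R'),
            ('q0', BLANK): ('q1', BLANK, 'R')}

print(accepts(delta_00, {'q1'}, '00'))   # True
print(accepts(delta_00, {'q1'}, '01'))   # False: halts in nonfinal q0
```

Note that on input 01 the machine halts simply because no transition is defined for (q0, 1); rejection by a Turing machine needs no explicit "dead" transitions.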

9.1.2.2 Linz Example 9.7

For Σ={a,b}\Sigma = \{ a, b \}, design a Turing machine that accepts

L={anbn:n1}L = \{a^{n} b^{n} : n \geq 1 \}.

We can design a machine that incorporates the following algorithm:

      While both aa’s and bb’s remain
          replace leftmost aa by xx
          replace leftmost bb by yy
      If no aa’s or bb’s remain
          accept
      else
          reject

Filling in the details, we get the following Turing machine for which:

Q={q0,q1,q2,q3,q4}Q = \{ q_{0}, q_{1}, q_{2}, q_{3}, q_{4} \}
F={q4}F = \{q_4\}
Σ={a,b}\Sigma = \{a, b\}
Γ={a,b,x,y,}\Gamma = \{a, b, x, y, \Box\}

The transitions can be broken into several sets.

The first set

  1. δ(q0,a)=(q1,x,R)\delta(q_{0}, a) = (q_{1}, x, R)
  2. δ(q1,a)=(q1,a,R)\delta(q_{1}, a) = (q_{1}, a, R)
  3. δ(q1,y)=(q1,y,R)\delta(q_{1}, y) = (q_{1}, y, R)
  4. δ(q1,b)=(q2,y,L)\delta(q_{1}, b) = (q_{2}, y, L)

replaces the leftmost aa with an xx, then causes the read-write head to travel right to the first bb, replacing it with a yy. The machine then enters a state q2q_{2}, indicating that an aa has been successfully paired with a bb.

The second set

  5. δ(q2,y)=(q2,y,L)\delta(q_{2}, y) = (q_{2}, y, L)
  6. δ(q2,a)=(q2,a,L)\delta(q_{2}, a) = (q_{2}, a, L)
  7. δ(q2,x)=(q0,x,R)\delta(q_{2}, x) = (q_{0}, x, R)

reverses the direction of movement until an xx is encountered, repositions the read-write head over the leftmost aa, and returns control to the initial state.

The machine is now back in the initial state q0q_{0}, ready to process the next aa-bb pair.

After one pass through this part of the computation, the machine has executed the partial computation:

q0aaabbbq_{0} a a \cdots a b b \cdots b *\vdash^{*} xq0aaybbx q_{0} a \cdots a y b \cdots b

So, it has matched a single aa with a single bb.

The machine continues this process until no aa is found on leftward movement.

If all aa’s have been replaced, then state q0q_{0} detects a yy instead of an aa and changes to state q3q_{3}. This state must verify that all bb’s have been processed as well.

  8. δ(q0,y)=(q3,y,R)\delta(q_{0}, y) = (q_{3}, y, R)
  9. δ(q3,y)=(q3,y,R)\delta(q_{3}, y) = (q_{3}, y, R)
  10. δ(q3,)=(q4,,R)\delta(q_{3}, \Box) = (q_{4}, \Box, R)

The input aabbaabb makes the moves shown below. (The bold number in parentheses gives the rule applied in that step.)

q0aabbq_{0}aabb – start at left end
(1) \vdash xq1abbxq_{1}abb – process 1st a-b pair
(2) \vdash xaq1bbxaq_{1}bb – moving to right
(4) \vdash xq2aybxq_{2}ayb
(6) \vdash q2xaybq_{2}xayb – move back to left
(7) \vdash xq0aybxq_{0}ayb
(1) \vdash xxq1ybxxq_{1}yb – process 2nd a-b pair
(3) \vdash xxyq1bxxyq_{1}b – moving to right
(4) \vdash xxq2yyxxq_{2}yy
(5) \vdash xq2xyyxq_{2}xyy – move back to left
(7) \vdash xxq0yyxxq_{0}yy
(8) \vdash xxyq3yxxyq_{3}y – no a’s
(9) \vdash xxyyq3xxyyq_{3}\Box – check for extra b’s
(10) \vdash xxyyq4xxyy\Box q_{4}\Box – done, move to final

The Turing machine halts in final state q4q_{4}, thus accepting the string aabbaabb.

If the input is not in the language, the Turing machine will halt in a nonfinal state.

For example, consider inputs such as aabaab (an extra aa) or abbabb (an extra bb): in either case the machine eventually reaches a configuration with no applicable transition and halts in a nonfinal state.
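The ten rules above form a complete transition table, so the machine can be run mechanically. A Python sketch (simulator scaffolding our own, with 'B' for the blank):

```python
def accepts(delta, finals, w, max_steps=10_000):
    """Accept w iff the machine halts in a final state (Definition 9.3)."""
    tape, pos, state = dict(enumerate(w)), 0, 'q0'
    for _ in range(max_steps):
        sym = tape.get(pos, 'B')
        if (state, sym) not in delta:
            return state in finals
        state, tape[pos], move = delta[(state, sym)]
        pos += 1 if move == 'R' else -1
    return False

# Rules (1)-(10) for L = { a^n b^n : n >= 1 } from Linz Example 9.7.
anbn = {('q0', 'a'): ('q1', 'x', 'R'),   # (1) mark an a
        ('q1', 'a'): ('q1', 'a', 'R'),   # (2)
        ('q1', 'y'): ('q1', 'y', 'R'),   # (3)
        ('q1', 'b'): ('q2', 'y', 'L'),   # (4) pair it with a b
        ('q2', 'y'): ('q2', 'y', 'L'),   # (5)
        ('q2', 'a'): ('q2', 'a', 'L'),   # (6)
        ('q2', 'x'): ('q0', 'x', 'R'),   # (7) back to the start state
        ('q0', 'y'): ('q3', 'y', 'R'),   # (8) no a's left
        ('q3', 'y'): ('q3', 'y', 'R'),   # (9)
        ('q3', 'B'): ('q4', 'B', 'R')}   # (10) no extra b's: accept

print([w for w in ('ab', 'aabb', 'aab', 'abb', 'ba') if accepts(anbn, {'q4'}, w)])
# ['ab', 'aabb']
```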

9.1.3 Turing Machines as Transducers

Turing machines are more than just language accepters. They provide a simple abstract model for computers in general. Computers transform data. Hence, Turing machines are transducers (as we defined them in Chapter 1). For a computation, the input is the nonblank content of the tape at initiation, and the output is the content of the tape when the machine halts.

Thus, we can view a Turing machine transducer MM as an implementation of a function ff defined by

ŵ=f(w)\hat{w} = f(w)

provided that

q0wM*qfŵq_{0} w \vdash^{*}_{M} q_{f} \hat{w},

for some final state qfq_{f}.

Linz Definition 9.4 (Turing Computable): A function ff with domain DD is said to be Turing-computable, or just computable, if there exists some Turing machine M=(Q,Σ,Γ,δ,q0,,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, \Box, F) such that

q0wM*qff(w)q_{0} w \vdash^{*}_{M} q_{f} f(w), qfFq_f \in F,

for all wDw \in D.

Note: A transducer Turing machine must start on the leftmost symbol of the input and stop on the leftmost symbol of the output.

9.1.3.1 Linz Example 9.9

Compute x+yx + y for positive integers xx and yy.

We use unary notation to represent the positive integers, i.e., a positive integer is represented by a sequence of 1’s whose length is equal to the value of the integer. For example:

1111=41111 \ =\ 4

The computation is

q0w(x)0w(y)q_{0} w(x) 0 w(y) *\vdash^{*} qfw(x+y)0q_{f} w(x + y) 0

where 00 separates the two numbers at initiation and after the result at termination.

Key idea: Move the separating 00 to the right end.

To achieve this, we construct M=(Q,Σ,Γ,δ,q0,,F)M = (Q, \Sigma, \Gamma, \delta, q_{0}, \Box, F) with

Q={q0,q1,q2,q3,q4}Q = \{ q_{0}, q_{1}, q_{2}, q_{3}, q_{4} \}
F={q4}F = \{ q_{4} \}
δ(q0,1)=(q0,1,R)\delta(q_{0}, 1) = (q_{0}, 1, R)
δ(q0,0)=(q1,1,R)\delta(q_{0}, 0) = (q_{1}, 1, R)
δ(q1,1)=(q1,1,R)\delta(q_{1}, 1) = (q_{1}, 1, R)
δ(q1,)=(q2,,L)\delta(q_{1}, \Box) = (q_{2}, \Box, L)
δ(q2,1)=(q3,0,L)\delta(q_{2}, 1) = (q_{3}, 0, L)
δ(q3,1)=(q3,1,L)\delta(q_{3}, 1) = (q_{3}, 1, L)
δ(q3,)=(q4,,R)\delta(q_{3}, \Box) = (q_{4}, \Box, R)

The sequence of instantaneous descriptions for adding 111 to 11 is shown below.

q0111011q_{0}111011 \;\vdash\; 1q0110111q_{0}11011 \;\vdash\; 11q0101111q_{0}1011 \;\vdash\; 111q0011111q_{0}011
\;\vdash\; 1111q11111111q_{1}111 \;\vdash\; 11111q1111111q_{1}1 \;\vdash\; 111111q1111111q_{1}\Box
\;\vdash\; 11111q2111111q_{2}1 \;\vdash\; 1111q3101111q_{3}10 \;\vdash\; 111q3110111q_{3}110
\;\vdash\; 11q3111011q_{3}1110 \;\vdash\; 1q3111101q_{3}11110 \;\vdash\; q3111110q_{3}111110
\;\vdash\; q3111110q_{3}\Box 111110 \;\vdash\; q4111110q_{4}111110
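The adder’s transition table is small enough to run directly. The sketch below (Python scaffolding of our own, with 'B' for the blank) reproduces the trace above, returning the halting state and the nonblank tape content.

```python
def run(delta, w, state='q0', max_steps=10_000):
    """Run until delta is undefined; return (state, nonblank tape content)."""
    tape, pos = dict(enumerate(w)), 0
    for _ in range(max_steps):
        sym = tape.get(pos, 'B')
        if (state, sym) not in delta:
            break
        state, tape[pos], move = delta[(state, sym)]
        pos += 1 if move == 'R' else -1
    return state, ''.join(tape[i] for i in sorted(tape)).strip('B')

# The unary adder of Linz Example 9.9: move the separating 0 to the right end.
add = {('q0', '1'): ('q0', '1', 'R'),
       ('q0', '0'): ('q1', '1', 'R'),    # fill the gap with a 1 ...
       ('q1', '1'): ('q1', '1', 'R'),
       ('q1', 'B'): ('q2', 'B', 'L'),
       ('q2', '1'): ('q3', '0', 'L'),    # ... and turn the last 1 into the 0
       ('q3', '1'): ('q3', '1', 'L'),
       ('q3', 'B'): ('q4', 'B', 'R')}

print(run(add, '111011'))   # ('q4', '111110'), i.e., 3 + 2 = 5
```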

9.1.3.2 Linz Example 9.10

Construct a Turing machine that copies strings of 11’s. More precisely, find a machine that performs the computation

q0w*qfwwq_{0} w \vdash^{*} q_{f} ww,

for any w{1}+w \in \{1\}^{+}.

To solve the problem, we implement the following procedure:

  1. Replace every 11 by an xx.
  2. Find the rightmost xx and replace it with 11.
  3. Travel to the right end of the current nonblank region and create a 11 there.
  4. Repeat steps 2 and 3 until there are no more xx’s.

A Turing machine transition function for this procedure is as follows:

δ(q0,1)=(q0,x,R)\delta(q_{0}, 1) = (q_{0}, x, R)
δ(q0,)=(q1,,L)\delta(q_{0}, \Box) = (q_{1}, \Box, L)
δ(q1,x)=(q2,1,R)\delta(q_{1}, x) = (q_{2}, 1, R)
δ(q2,1)=(q2,1,R)\delta(q_{2}, 1) = (q_{2}, 1, R)
δ(q2,)=(q1,1,L)\delta(q_{2}, \Box) = (q_{1}, 1, L)
δ(q1,1)=(q1,1,L)\delta(q_{1}, 1) = (q_{1}, 1, L)
δ(q1,)=(q3,,R)\delta(q_{1}, \Box) = (q_{3}, \Box, R)

where q3q_{3} is the only final state.

Linz Figure 9.7 shows a transition graph for this Turing machine.

Linz Fig. 9.7: Transition Graph for Example 9.10

This is not easy to follow, so let us trace the program with the string 11. The computation performed is as shown below.

q011q_{0}11 \;\vdash\; xq01xq_{0}1 \;\vdash\; xxq0xxq_{0}\Box \;\vdash\; xq1xxq_{1}x
\;\vdash\; x1q2x1q_{2}\Box \;\vdash\; xq111xq_{1}11 \;\vdash\; q1x11q_{1}x11
\;\vdash\; 1q2111q_{2}11 \;\vdash\; 11q2111q_{2}1 \;\vdash\; 111q2111q_{2}\Box
\;\vdash\; 11q11111q_{1}11 \;\vdash\; 1q11111q_{1}111
\;\vdash\; q11111q_{1}1111 \;\vdash\; q11111q_{1}\Box 1111 \;\vdash\; q31111q_{3}1111
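Running the seven rules confirms this trace. A minimal Python harness (our own, with 'B' for the blank):

```python
def run(delta, w, state='q0', max_steps=10_000):
    """Run until delta is undefined; return (state, nonblank tape content)."""
    tape, pos = dict(enumerate(w)), 0
    for _ in range(max_steps):
        sym = tape.get(pos, 'B')
        if (state, sym) not in delta:
            break
        state, tape[pos], move = delta[(state, sym)]
        pos += 1 if move == 'R' else -1
    return state, ''.join(tape[i] for i in sorted(tape)).strip('B')

# The string doubler of Linz Example 9.10 (q3 is the only final state).
copy = {('q0', '1'): ('q0', 'x', 'R'),   # step 1: replace every 1 by x
        ('q0', 'B'): ('q1', 'B', 'L'),
        ('q1', 'x'): ('q2', '1', 'R'),   # step 2: rightmost x becomes 1
        ('q2', '1'): ('q2', '1', 'R'),
        ('q2', 'B'): ('q1', '1', 'L'),   # step 3: create a 1 at the right end
        ('q1', '1'): ('q1', '1', 'L'),
        ('q1', 'B'): ('q3', 'B', 'R')}   # no x's remain: halt

print(run(copy, '11'))    # ('q3', '1111')
print(run(copy, '111'))   # ('q3', '111111')
```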

9.1.3.3 Linz Example 9.11

Suppose xx and yy are positive integers represented in unary notation.

Construct a Turing machine that halts in a final state qyq_{y} if xyx \geq y and in a nonfinal state qnq_{n} if x<yx < y.

That is, the machine must perform the computation:

q0w(x)0w(y)q_{0} w(x) 0 w(y) *\vdash^{*} qyw(x)0w(y)q_{y} w(x) 0 w(y), if xyx \geq y
q0w(x)0w(y)q_{0} w(x) 0 w(y) *\vdash^{*} qnw(x)0w(y)q_{n} w(x) 0 w(y), if x<yx < y

We can adapt the approach from Linz Example 9.7. Instead of matching aa’s and bb’s, we match each 1 on the left of the dividing 0 with a corresponding 1 on the right. At the end of the matching, we will have on the tape either

xx110xxxx x \cdots 110xx \cdots x \Box

or

xxxx0xxx11x x \cdots xx0xx \cdots x11 \Box,

depending on whether x>yx > y or y>xy > x (if x=yx = y, no unmatched 1’s remain on either side).

A transition graph for the machine is shown below.
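One possible concrete transition table for such a comparer is sketched below in Python. The state names and the markers xx and yy are our own choices, and this version only reports the halting state; restoring the tape to w(x)0w(y)w(x)0w(y), as the specification requires, would need extra cleanup passes.

```python
def halt_state(delta, w, state='q0', max_steps=10_000):
    """Run until delta is undefined; return the halting state."""
    tape, pos = dict(enumerate(w)), 0
    for _ in range(max_steps):
        sym = tape.get(pos, 'B')
        if (state, sym) not in delta:
            return state
        state, tape[pos], move = delta[(state, sym)]
        pos += 1 if move == 'R' else -1
    return None

# Match 1's across the dividing 0; qy and qn have no outgoing moves.
cmp_ = {('q0', '1'): ('q1', 'x', 'R'),   # mark a 1 on the left
        ('q1', '1'): ('q1', '1', 'R'),
        ('q1', '0'): ('q2', '0', 'R'),   # cross the separator
        ('q2', 'y'): ('q2', 'y', 'R'),
        ('q2', '1'): ('q3', 'y', 'L'),   # match a 1 on the right
        ('q2', 'B'): ('qy', 'B', 'L'),   # right side ran out first: x > y
        ('q3', 'y'): ('q3', 'y', 'L'),
        ('q3', '0'): ('q3', '0', 'L'),
        ('q3', '1'): ('q3', '1', 'L'),
        ('q3', 'x'): ('q0', 'x', 'R'),   # back for the next left 1
        ('q0', '0'): ('q4', '0', 'R'),   # left side exhausted
        ('q4', 'y'): ('q4', 'y', 'R'),
        ('q4', '1'): ('qn', '1', 'R'),   # an unmatched right 1: x < y
        ('q4', 'B'): ('qy', 'B', 'R')}   # all matched: x = y, so x >= y

print(halt_state(cmp_, '1101'))  # qy  (x = 2 >= y = 1)
print(halt_state(cmp_, '1011'))  # qn  (x = 1 <  y = 2)
```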

9.2 Combining Turing Machines for Complicated Tasks

9.2.1 Introduction

How can we compose simpler operations on Turing machines to form more complex operations?

Techniques discussed in this section include the use of:

  1. block diagrams
  2. pseudocode (macroinstructions and subprograms)

9.2.2 Using Block Diagrams

In the block diagram technique, we define high-level computations in boxes without internal details on how computation is done. The details are filled in on a subsequent refinement.

To explore the use of block diagrams in the design of complex computations, consider Linz Example 9.12, which builds on Linz Examples 9.9 and 9.11 (above).

9.2.2.1 Linz Example 9.12

Design a Turing machine that computes the following function:

f(x,y)=x+yf(x, y) = x + y, if xyx \geq y,
f(x,y)=0f(x, y) = 0, if x<yx < y.

For simplicity, we assume xx and yy are positive integers in unary representation and the value zero is represented by 0, with the rest of the tape blank.

Linz Figure 9.8 shows a high-level block diagram of this computation. This computation consists of a network of three simpler machines:

  1. a comparer CC that decides whether xyx \geq y
  2. an adder AA that computes x+yx + y
  3. an eraser EE that changes the tape content to 00

Linz Fig. 9.8: Block Diagram

We use such high-level diagrams in subsequent discussions of large computations. How can we justify that practice?

We can implement: the comparer CC using the technique of Linz Example 9.11, the adder AA using the machine of Linz Example 9.9, and the eraser EE using a Turing machine that replaces every 1 with a blank, leaving only a single 0.

Comparer CC carries out the computations

qC,0w(x)0w(y)*qA,0w(x)0w(y)q_{C,0} w(x) 0 w(y) \vdash^{*} q_{A,0} w(x) 0 w(y), if xyx \geq y,

and

qC,0w(x)0w(y)*qE,0w(x)0w(y)q_{C,0} w(x) 0 w(y) \vdash^{*} q_{E,0} w(x) 0 w(y), if x<yx < y.

If qA,0q_{A,0} and qE,0q_{E, 0} are the initial states of computations AA and EE, respectively, then CC starts either AA or EE.

Adder AA carries out the computation

qA,0w(x)0w(y)*qA,fw(x+y)0q_{A,0} w(x) 0 w(y) \vdash^{*} q_{A,f} w(x + y) 0.

And, Eraser EE carries out the computation

qE,0w(x)0w(y)*qE,f0q_{E,0} w(x) 0 w(y) \vdash^{*} q_{E,f} 0.

The outer diagram in Linz Figure 9.8 thus represents a single Turing machine that combines the actions of machines CC, AA, and EE as shown.

9.2.3 Using Pseudocode

In the pseudocode technique, we outline a computation using high-level descriptive phrases understandable to people. We refine and translate it to lower-level implementations later.

9.2.3.1 Macroinstructions

A simple kind of pseudocode is the macroinstruction. A macroinstruction is a single statement shorthand for a sequence of lower-level statements.

We first define the macroinstructions in terms of the lower-level language. Then we compose macroinstructions into a larger program, assuming the relevant substitutions will be done.

9.2.3.2 Linz Example 9.13

For this example, consider the macroinstruction

if aa then qjq_{j} else qkq_{k}.

This means: in state qiq_{i}, if the Turing machine reads an aa, it goes into state qjq_{j}; if it reads any other symbol, it goes into state qkq_{k}. In either case, the tape content and head position are unchanged afterward.

We can implement this macroinstruction with several steps of a Turing machine:

δ(qi,a)=(qj0,a,R)\delta(q_{i}, a) = (q_{j0}, a, R) for all qiQq_{i} \in Q
δ(qj0,c)=(qj,c,L)\delta(q_{j0},c) = (q_{j}, c, L) for all cΓc \in \Gamma

δ(qi,b)=(qk0,b,R)\delta(q_{i}, b) = (q_{k0}, b, R) for all qiQq_{i} \in Q and all bΓ{a}b \in \Gamma - \{a\}
δ(qk0,c)=(qk,c,L)\delta(q_{k0},c) = (q_{k}, c, L) for all cΓc \in \Gamma

States qj0q_{j0} and qk0q_{k0} just back up the Turing machine’s read-write head one tape cell.

Macroinstructions are expanded at each occurrence.
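Because the expansion is purely mechanical, it can itself be written as a small program. The Python sketch below (function name and representation our own) adds the concrete transitions for one occurrence of the macroinstruction at state qi; the fresh backup states are formed by appending '0' to the target state names, which assumes those names are not already in use.

```python
def expand_if(delta, tape_syms, a, qi, qj, qk):
    """Expand `if a then qj else qk` at state qi into concrete moves.
    qj0 and qk0 are the fresh backup states from the construction above."""
    qj0, qk0 = qj + '0', qk + '0'        # assumed not to clash with Q
    delta[(qi, a)] = (qj0, a, 'R')       # saw a: head toward qj
    for b in tape_syms - {a}:
        delta[(qi, b)] = (qk0, b, 'R')   # any other symbol: toward qk
    for c in tape_syms:
        delta[(qj0, c)] = (qj, c, 'L')   # back up one cell, enter qj
        delta[(qk0, c)] = (qk, c, 'L')   # back up one cell, enter qk
    return delta

d = expand_if({}, {'a', 'b', 'B'}, 'a', 'q5', 'q7', 'q9')
# d[('q5', 'a')] == ('q70', 'a', 'R');  d[('q5', 'b')] == ('q90', 'b', 'R')
```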

9.2.3.3 Subprograms

While each occurrence of a macroinstruction is expanded into actual code, a subprogram is a single piece of code that is invoked repeatedly.

As in higher-level language programs, we must be able to call a subprogram and then, after execution, return to the calling point and resume execution without any unwanted effects.

How can we do this with Turing machines?

We must be able to:

We can do this by partitioning the tape into several regions. Linz Figure 9.9 illustrates this technique for a program AA (a Turing machine) that calls a subprogram BB (another Turing machine).

  1. AA executes in its own workspace.
  2. Before transferring control to BB, AA writes information about its configuration and inputs for BB into some separate region TT.
  3. After transfer, BB finds its input in TT.
  4. BB executes in its own separate workspace.
  5. When BB completes, it writes relevant results into TT.
  6. BB transfers control back to AA, which resumes and gets the needed results from TT.
Linz Fig. 9.9: Tape Regions for Subprograms

Note: This is similar to what happens in an actual computer for a subprogram (function, procedure) call. The region TT is normally a segment pushed onto the program’s runtime stack or dynamically allocated from the heap memory.

9.2.3.4 Linz Example 9.14

Design a Turing machine that multiplies xx and yy, positive integers represented in unary notation.

Assume the initial and final tape configurations are as shown in Linz Figure 9.10.

We can multiply xx by yy by adding yy to itself xx times as described in the algorithm below.

      Repeat until xx contains no more 11’s
          Find a 11 in xx and replace it with another symbol aa
          Replace the leftmost 00 by 0y0y
      Replace all aa’s with 1’s
Linz Fig. 9.10: Multiplication

Although the above description of the pseudocode approach is imprecise, the idea is sufficiently simple that it is clear we can implement it.
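As a sanity check on the pseudocode, here is a direct high-level rendering in Python. It mirrors only the control flow, not the tape layout of Linz Figure 9.10; the string representation of the product is our own simplification.

```python
def multiply_unary(wx, wy):
    """Add y to the product once for each 1 in x, as in the pseudocode."""
    x_region = list(wx)
    product = ''
    for i, sym in enumerate(x_region):
        if sym == '1':
            x_region[i] = 'a'    # find a 1 in x, replace it with an a
            product += wy        # "replace the leftmost 0 by 0y"
    x_region = ['1' if s == 'a' else s for s in x_region]  # restore x
    return product

print(multiply_unary('111', '11'))   # '111111': 3 * 2 = 6 in unary
```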

We have not proved that the block diagram, macroinstruction, or subprogram approaches will work in all cases. But the discussion in this section shows that it is plausible to use Turing machines to express complex computations.

9.3 Turing’s Thesis

The Turing thesis is the hypothesis that any computation that can be carried out by mechanical means can be performed by some Turing machine.

This is a broad assertion. It is not something we can prove!

The Turing thesis is actually a definition of mechanical computation: a computation is mechanical if and only if it can be performed by some Turing machine.

Some arguments for accepting the Turing thesis as the definition of mechanical computation include:

  1. Anything that can be computed by any existing digital computer can also be computed by a Turing machine.

  2. There are no known problems that are solvable by what we intuitively consider an algorithm for which a Turing machine program cannot be written.

  3. No alternative model for mechanical computation is more powerful than the Turing machine model.

The Turing thesis is to computing science as, for example, classical Newtonian mechanics is to physics. Newton’s “laws” of motion cannot be proved, but they could possibly be invalidated by observation. The “laws” are plausible models that have enabled humans to explain much of the physical world for several centuries.

Similarly, we accept the Turing thesis as a basic “law” of computing science. The conclusions we draw from it agree with what we know about real computers.

The Turing thesis enables us to formalize the concept of algorithm.

Linz Definition 9.5 (Algorithm): An algorithm for a function f:DRf: D \rightarrow R is a Turing machine MM which, given any dDd \in D as input on its tape, eventually halts with the correct answer f(d)Rf(d) \in R on its tape. Specifically, we can require that

q0dM*qff(d),qfFq_{0} d \vdash^{*}_{M} q_{f} f(d), q_{f} \in F,

for all dDd \in D.

To prove that “there exists an algorithm”, we can construct a Turing machine that computes the result.

However, this is difficult in practice for such a low-level machine.

An alternative is, first, to appeal to the Turing thesis, arguing that anything we can compute with a digital computer we can compute with a Turing machine. Thus a program in a suitable high-level language or precise pseudocode can compute the result. If unsure, we can validate this by actually implementing the computation on a computer.

Note: A higher-level language is Turing-complete if it can express any algorithm that can be expressed with a Turing machine. If we can write a Turing machine simulator in that language, we consider the language Turing complete.

10 OMIT Chapter 10

11 A Hierarchy of Formal Languages and Automata

The kinds of questions addressed in this chapter:

Note: We assume the languages in this chapter are λ\lambda-free unless otherwise stated.

11.1 Recursive and Recursively Enumerable Languages

Here we make a distinction between languages accepted by Turing machines and languages for which there is a membership algorithm.

11.1.1 Aside: Countability

Definition (Countable and Countably Infinite): A set is countable if it has the same cardinality as a subset of the natural numbers. A set is countably infinite if it can be placed into one-to-one correspondence with the set of all natural numbers.

Thus there is some ordering on any countable set.

Also note that, for any finite set of symbols Σ\Sigma, the set Σ*\Sigma^{*} and any of its subsets are countable. The same holds for Σ+\Sigma^{+}.
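Countability of Σ* can be made concrete by generating the strings in proper order: shortest first, and lexicographically within each length. A Python sketch:

```python
from itertools import count, islice, product

def canonical_order(sigma):
    """Yield the strings of sigma* in proper order: by length, then
    lexicographically within each length.  This ordering is what puts
    sigma* into one-to-one correspondence with the natural numbers."""
    yield ''                                  # lambda, the empty string
    for n in count(1):
        for tup in product(sorted(sigma), repeat=n):
            yield ''.join(tup)

print(list(islice(canonical_order({'a', 'b'}), 7)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Dropping the first `yield ''` gives the corresponding enumeration of Σ+.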

From Linz Section 10.4 (not covered in this course), we also have the following theorem about the set of Turing machines.

Linz Theorem 10.3 (Turing Machines are Countable): The set of all Turing machines is countably infinite.

11.1.2 Definition of Recursively Enumerable Language

Linz Definition 11.1 (Recursively Enumerable Language): A language LL is recursively enumerable if there exists a Turing machine that accepts it.

This definition implies there is a Turing machine MM such that for every wLw \in L

q0wM*x1qfx2q_{0}w \ \vdash^{*}_{M}\ x_{1}q_{f}x_{2}

with the initial state q0q_{0} and a final state qfq_{f}.

But what if wLw \notin L?

11.1.3 Definition of Recursive Language

Linz Definition 11.2 (Recursive Language): A language LL on Σ\Sigma is recursive if there exists a Turing machine MM that accepts LL and that halts on every ww in Σ*\Sigma^{*}.

That is, a language is recursive if and only if there exists a membership algorithm for it.

11.1.4 Enumeration Procedure for Recursive Languages

If a language is recursive, then there exists an enumeration procedure, that is, a method for counting and ordering the strings in the language.

Let MM be a Turing machine that accepts LL and halts on every input, and let M̂\hat{M} be a Turing machine that generates all strings in Σ+\Sigma^{+} in proper order w1,w2,w_{1}, w_{2}, \ldots. Combine the two into a machine MM'.

Thus M̂\hat{M} generates the candidate strings wiw_{i} in order, MM tests each one for membership, and MM' writes the accepted strings to its tape in order.

11.1.5 Enumeration Procedure for Recursively Enumerable Languages

Problem: A Turing machine MM might not halt on some strings.

Solution: Construct M̂\hat{M} to advance “all” strings simultaneously, one move at a time. The order of string generation and moves is illustrated in Linz Figure 11.1.

Now machine M̂\hat{M} advances each candidate string wiw_{i} (columns of Linz Figure 11.1) one MM-move at a time.

Linz Fig. 11.1: Enumeration Procedure for Recursively Enumerable Languages

Because each string is generated by M̂\hat{M} and each accepted string is accepted by MM in a finite number of steps, every string in LL is eventually produced by this procedure. The procedure does not get stuck in an infinite loop on a wiw_{i} that is not accepted.

Note: Turing machine M̂\hat{M} does not terminate and strings for which MM does not halt will never complete processing, but any string that can be accepted by MM will be accepted within a finite number of steps.
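This dovetailing idea can be sketched in Python using generators: each candidate string’s computation is advanced one move per round, so a non-halting computation can never block the others. The scaffolding here (make_run, the toy machine) is our own.

```python
def dovetail(make_run, strings, max_rounds=50):
    """Advance every candidate computation one move per round, diagonally,
    as in Linz Fig. 11.1.  make_run(w) returns a generator that yields once
    per move and returns only if w is accepted; accepted strings are
    yielded as they are found."""
    active, pending = [], iter(strings)
    for _ in range(max_rounds):
        w = next(pending, None)
        if w is not None:
            active.append((w, make_run(w)))   # bring in the next string
        for w, g in list(active):
            try:
                next(g)                       # one more move of M on w
            except StopIteration:             # generator returned: accept
                active.remove((w, g))
                yield w

# Toy stand-in for M: accepts even-length strings, loops forever otherwise.
def make_run(w):
    def moves():
        if len(w) % 2 == 0:
            yield                # one move ...
            return               # ... then accept
        while True:              # never halts on odd-length input
            yield
    return moves()

print(list(dovetail(make_run, ['a', 'aa', 'aaa', 'aaaa'])))
# ['aa', 'aaaa']: the odd-length strings never block the enumeration
```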

11.1.6 Languages That are Not Recursively Enumerable

Linz Theorem 11.1 (Powerset of Countable Set not Countable): Let SS be a countably infinite set. Then its powerset 2S2^{S} is not countable.

Proof: Let S={s1,s2,s3,}S = \{\ s_{1}, s_{2}, s_{3}, \cdots\ \} be a countably infinite set.

Let t2St \in 2^{S}. Then tt can be represented by a bit vector b1b2b_{1}b_{2}\cdots such that bi=1b_{i} = 1 if and only if sits_{i} \in t.

Assume 2S2^{S} is countable. Thus 2S2^{S} can be written in order t1,t2,t_{1}, t_{2}, \cdots and put into a table as shown in Linz Figure 11.2.

Linz Fig. 11.2: Cantor’s Diagonalization

Consider the main diagonal of the table (circled in Linz Figure 11.2). Complement the bits along this diagonal and let tdt_{d} be a set represented by this bit vector.

Clearly td2St_{d} \in 2^{S}. But tdtit_{d} \neq t_{i} for any ii, because they differ at least at sis_{i}. This contradicts the assumption that 2S2^{S} is countable.

So the assumption is false. Therefore, 2S2^{S} is not countable. QED.

This is Cantor’s diagonalization argument.
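The diagonal step is easy to illustrate on a finite corner of such a table; the particular bit patterns below are arbitrary, made-up rows.

```python
# Rows: bit vectors of sets t1..t4 over s1..s4 (arbitrary example values).
table = [[1, 0, 0, 0],
         [0, 1, 1, 0],
         [1, 1, 0, 1],
         [0, 0, 0, 1]]

# Complement the main diagonal to get the bit vector of t_d.
t_d = [1 - row[i] for i, row in enumerate(table)]

# t_d differs from row i at position i, so it equals none of the t_i --
# the heart of Cantor's argument.
assert all(t_d[i] != table[i][i] for i in range(len(table)))
print(t_d)   # [0, 0, 1, 0]
```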

Linz Theorem 11.2 (Existence of Languages Not Recursively Enumerable): For any nonempty Σ\Sigma, there exist languages that are not recursively enumerable.

Proof: Any LΣ*L \subseteq \Sigma^{*} is a language on Σ\Sigma. Thus 2Σ*2^{\Sigma^{*}} is the set of all languages on Σ\Sigma.

Because Σ*\Sigma^{*} is infinite and countable, Linz Theorem 11.1 implies that the set of all languages on Σ\Sigma is not countable. From Linz Theorem 10.3 (see above), we know the set of Turing machines can be enumerated. Hence, the recursively enumerable languages are countable.

Therefore, some languages on Σ\Sigma are not recursively enumerable. QED.

11.1.7 A Language That is Not Recursively Enumerable

Linz Theorem 11.3: There exists a recursively enumerable language whose complement is not recursively enumerable.

Proof: Let Σ={a}\Sigma = \{ a \}.

Consider the set of all Turing machines with input alphabet Σ\Sigma, i.e., {M1,M2,M3,}\{ M_{1}, M_{2}, M_{3}, \cdots \}.

By Linz Theorem 10.3 (see above), we know that this set is countable. So it has some order.

For each MiM_{i} there exists a recursively enumerable language L(Mi)L(M_{i}).

Also, for each recursively enumerable language on Σ\Sigma, there is some Turing machine that accepts it.

Let L={ai:aiL(Mi)}L = \{ a^{i} : a^{i} \in L(M_{i}) \}.

LL is recursively enumerable because there is a Turing machine that accepts it. For example, given input aia^{i}, such a machine counts the aa’s to determine ii, uses the enumeration of Turing machines to construct MiM_{i}, and then simulates MiM_{i} on input aia^{i}, accepting if and when MiM_{i} accepts.

Now consider L={ai:aiL(Mi)}\bar{L} = \{ a^{i} : a^{i} \notin L(M_{i}) \}.

Assume L\bar{L} is recursively enumerable.

There must be some Turing machine MkM_{k}, for some kk, that accepts L\bar{L}. Hence, L=L(Mk)\bar{L} = L(M_{k}).

Consider aka^{k}. Is it in LL? Or in L\bar{L}?

Consider the case akLa^{k} \in \bar{L}. Thus akL(Mk)a^{k} \in L(M_{k}). Hence, akLa^{k} \in L by the definition of LL. This is a contradiction.

Consider the case akLa^{k} \in L, i.e., akLa^{k}\notin \bar{L}. Because L=L(Mk)\bar{L} = L(M_{k}), we have akL(Mk)a^{k} \notin L(M_{k}). But then, by the definition of L\bar{L}, akLa^{k} \in \bar{L}. This is also a contradiction.

In all cases, we have a contradiction, so the assumption is false. Therefore, L\bar{L} is not recursively enumerable. QED.

11.1.8 A Language That is Recursively Enumerable but Not Recursive

Linz Theorem 11.4: If a language LL and its complement L\bar{L} are both recursively enumerable, then both languages are recursive. If LL is recursive, then L\bar{L} is also recursive, and consequently both are recursively enumerable.

Proof: See Linz Section 11.2 for the details.

Linz Theorem 11.5: There exists a recursively enumerable language that is not recursive; that is, the family of recursive languages is a proper subset of the family of recursively enumerable languages.

Proof: Consider the language LL of Linz Theorem 11.3.

This language is recursively enumerable, but its complement is not. Therefore, by Linz Theorem 11.4, it is not recursive, giving us the required example. QED.

There are well-defined languages that have no membership algorithms.

11.2 Unrestricted Grammars

Linz Definition 11.3 (Unrestricted Grammar): A grammar G=(V,T,S,P)G = (V, T, S, P) is an unrestricted grammar if all the productions are of the form

uvu \rightarrow v,

where uu is in (VT)+(V \cup T)^{+} and vv is in (VT)*(V \cup T)^{*}.

Note: λ\lambda is not allowed on the left side of a production, but otherwise the use of symbols is unrestricted.

Linz Theorem 11.6 (Recursively Enumerable Language for Unrestricted Grammar): Any language generated by an unrestricted grammar is recursively enumerable.

Proof: See Linz Section 11.2 for the details.

The grammar defines an enumeration procedure for all strings.
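The enumeration idea can be sketched as a breadth-first search over sentential forms: starting from $S$, apply every production at every position, and output each sentential form that contains only terminals. The encoding of productions as string pairs is my own illustration:

```python
from collections import deque

def enumerate_language(productions, start, terminals, limit):
    """Breadth-first enumeration of strings generated by a grammar.
    productions: list of (u, v) pairs meaning u -> v.
    Yields at most `limit` terminal strings; the language may be
    infinite, so this sketch must bound the search."""
    seen = {start}
    queue = deque([start])
    count = 0
    while queue and count < limit:
        form = queue.popleft()
        if all(c in terminals for c in form):
            yield form          # sentential form with only terminals
            count += 1
            continue
        for u, v in productions:
            i = form.find(u)
            while i != -1:      # apply u -> v at every occurrence of u
                new = form[:i] + v + form[i + len(u):]
                if new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(u, i + 1)

# Toy grammar for {a^n b^n : n >= 0}:
prods = [("S", "aSb"), ("S", "")]
print(list(enumerate_language(prods, "S", set("ab"), 4)))
# ['', 'ab', 'aabb', 'aaabbb']
```

Because the search is breadth-first, every string of the language is eventually produced, which is exactly what an enumeration procedure requires.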

Linz Theorem 11.7 (Unrestricted Grammars for Recursively Enumerable Language): For every recursively enumerable language $L$, there exists an unrestricted grammar $G$, such that $L = L(G)$.

Proof: See Linz Section 11.2 for the details.

11.3 Context-Sensitive Grammars and Languages

Between the restricted context-free grammars and the unrestricted grammars lie a number of “somewhat restricted” families of grammars.

Linz Definition 11.4 (Context-Sensitive Grammar): A grammar $G = (V, T, S, P)$ is said to be context-sensitive if all productions are of the form

$x \rightarrow y$,

where $x, y \in (V \cup T)^{+}$ and

$|x| \leq |y|$.

This type of grammar is noncontracting in that the length of successive sentential forms can never decrease.
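The noncontracting condition is easy to check mechanically. A minimal sketch, assuming productions are encoded as (left, right) string pairs (my encoding, not Linz's):

```python
def is_context_sensitive(productions):
    """Check Linz Definition 11.4: every production x -> y must have
    nonempty x and |x| <= |y|, so sentential forms never shrink."""
    return all(len(x) >= 1 and len(x) <= len(y) for x, y in productions)

# The productions of Linz Example 11.2 satisfy the condition:
linz_11_2 = [("S", "abc"), ("S", "aAbc"), ("Ab", "bA"),
             ("Ac", "Bbcc"), ("bB", "Bb"), ("aB", "aa"), ("aB", "aaA")]
print(is_context_sensitive(linz_11_2))       # True

# A contracting production such as aB -> a violates it:
print(is_context_sensitive([("aB", "a")]))   # False
```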

All such grammars can be rewritten in a normal form in which all productions are of the form

$xAy \rightarrow xvy$.

This is equivalent to saying that the production

$A \rightarrow v$

can only be applied in a context where $A$ occurs with string $x$ on the left and string $y$ on the right.

Linz Definition 11.5 (Context-Sensitive Language): A language $L$ is said to be context-sensitive if there exists a context-sensitive grammar $G$, such that $L = L(G)$ or $L = L(G) \cup \{ \lambda \}$.

Note the special case for $\lambda$. This enables us to say that the family of context-free languages is a subset of the family of context-sensitive languages.

11.3.1 Linz Example 11.2

The language $L = \{ a^{n}b^{n}c^{n} : n \geq 1 \}$ is a context-sensitive language. We show this by defining a context-sensitive grammar for the language, such as the following:

$S \rightarrow abc\ |\ aAbc$
$Ab \rightarrow bA$
$Ac \rightarrow Bbcc$
$bB \rightarrow Bb$
$aB \rightarrow aa\ |\ aaA$

Consider a derivation of $a^{3}b^{3}c^{3}$:

$S \Rightarrow aAbc \Rightarrow abAc \Rightarrow abBbcc$
$\Rightarrow aBbbcc \Rightarrow aaAbbcc \Rightarrow aabAbcc$
$\Rightarrow aabbAcc \Rightarrow aabbBbccc \Rightarrow aabBbbccc$
$\Rightarrow aaBbbbccc \Rightarrow aaabbbccc$
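The derivation above can be replayed mechanically by string rewriting. The helper below is my own sketch for applying a production at an occurrence of its left-hand side:

```python
def rewrite(form, u, v):
    """Apply production u -> v at the first occurrence of u
    in the sentential form."""
    i = form.find(u)
    assert i != -1, f"{u} does not occur in {form}"
    return form[:i] + v + form[i + len(u):]

# Replay the derivation of a^3 b^3 c^3 step by step.
form = "aAbc"                      # after S -> aAbc
form = rewrite(form, "Ab", "bA")   # abAc
form = rewrite(form, "Ac", "Bbcc") # abBbcc
form = rewrite(form, "bB", "Bb")   # aBbbcc
form = rewrite(form, "aB", "aaA")  # aaAbbcc
form = rewrite(form, "Ab", "bA")   # aabAbcc
form = rewrite(form, "Ab", "bA")   # aabbAcc
form = rewrite(form, "Ac", "Bbcc") # aabbBbccc
form = rewrite(form, "bB", "Bb")   # aabBbbccc
form = rewrite(form, "bB", "Bb")   # aaBbbbccc
form = rewrite(form, "aB", "aa")   # aaabbbccc
print(form)  # aaabbbccc
```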

The grammar uses the variables $A$ and $B$ as messengers.

The process is similar to how a Turing machine would work to accept the language $L$.

$L$ is not context-free.

11.3.2 Linear Bounded Automata (LBA)

In Linz Section 10.5 (not covered in this course), a linear bounded automaton is defined as a nondeterministic Turing machine that is restricted to the part of its tape occupied by its input (bounded on the left by $[$ and on the right by $]$):

$[\_ \_ \_ \_ \_ \_]$

Linz Theorem 11.8: For every context-sensitive language $L$ not including $\lambda$, there exists some linear bounded automaton $M$ such that $L = L(M)$.

Proof: See Linz Section 11.3 for the details.

Linz Theorem 11.9: If a language $L$ is accepted by some linear bounded automaton $M$, then there exists a context-sensitive grammar that generates $L$.

Proof: See Linz Section 11.3 for the details.

11.3.3 Relation Between Recursive and Context-Sensitive Languages

Linz Theorem 11.10: Every context-sensitive language $L$ is recursive.

Linz Theorem 11.11: There exists a recursive language that is not context-sensitive.

We have studied a number of automata in this course. Ordered by decreasing power these include:

11.4 The Chomsky Hierarchy

We have studied a number of types of languages in this course, including

  1. recursively enumerable languages $L_{RE}$
  2. context-sensitive languages $L_{CS}$
  3. context-free languages $L_{CF}$
  4. regular languages $L_{REG}$

One way of showing the relationship among these families of languages is the Chomsky hierarchy, in which the types are numbered as above and diagrammed in Linz Figures 11.3 and 11.4.

This classification was first described in 1956 by American linguist Noam Chomsky, a founder of formal language theory.

Linz Fig 11.3: Original Chomsky Hierarchy
Linz Fig 11.4: Extended Chomsky Hierarchy

12 Limits of Algorithmic Computation

In Linz Chapter 9, we studied the Turing thesis, which concerned what Turing machines can do.

In this chapter, we study what Turing machines cannot do.

This chapter considers the concepts:

12.1 Some Problems That Cannot Be Solved with Turing Machines

12.1.1 Computability

Recall the following definition from Chapter 9.

Linz Definition 9.4 (Turing Computable): A function $f$ with domain $D$ is said to be Turing-computable, or just computable, if there exists some Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_{0}, \Box, F)$ such that

$q_{0} w \vdash^{*}_{M} q_{f} f(w)$, $q_{f} \in F$,

for all $w \in D$.

Note:

12.1.2 Decidability

Here we work in a simplified setting: the result of a computation is either “yes” or “no”. In this context, the problem is considered either decidable or undecidable.

Problem: We have a set of related statements, each either true or false.

This problem is decidable if and only if there exists a Turing machine that gives the correct answer for every statement in the domain. Otherwise, the problem is undecidable.

Example problem statement: “Context-free grammar $G$ is ambiguous.” This statement is true for some $G$ and false for others.

If we can answer this question, with either the result true or false, for every context-free grammar, then the problem is decidable. If we cannot answer the question for some context-free grammar (i.e., the Turing machine does not halt), then the problem is undecidable.

(In Linz Theorem 12.8, we see that this question is actually undecidable.)

12.1.3 The Turing Machine Halting Problem

Given the description of a Turing machine $M$ and an input string $w$, does $M$, when started in the initial configuration $q_{0}w$, perform a computation that eventually halts?

What is the domain $D$?

We cannot solve this problem simply by simulating $M$ on $w$; if the Turing machine does not halt, the simulation runs forever.

We must analyze the Turing machine description to get an answer for any machine $M$ and string $w$. But no such algorithm exists!

Linz Definition 12.1 (Halting Problem): Let $w_{M}$ be a string that describes a Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_{0}, \Box, F)$ and let $w$ be a string in $M$’s alphabet. Assume that $w_{M}$ and $w$ are encoded as strings of 0’s and 1’s (as suggested in Linz Section 10.4). A solution to the halting problem is a Turing machine $H$, which for any $w_{M}$ and $w$, performs the computation

$q_{0}w_{M}w\ \vdash^{*}\ x_{1}q_{y}x_{2}$

if $M$ applied to $w$ halts, and

$q_{0}w_{M}w\ \vdash^{*}\ y_{1}q_{n}y_{2}$

if $M$ applied to $w$ does not halt. Here $q_{y}$ and $q_{n}$ are both final states of $H$.

Linz Theorem 12.1 (Halting Problem is Undecidable): There does not exist any Turing machine $H$ that behaves as required by Linz Definition 12.1. Thus the halting problem is undecidable.

Proof: Assume there exists such a Turing machine $H$ that solves the halting problem.

The input to $H$ is $w_{M}w$, where $w_{M}$ is a description of Turing machine $M$. $H$ must halt with a “yes” or “no” answer as indicated in Linz Figure 12.1.

Linz Fig. 12.1: Turing Machine H

We next modify $H$ to produce a Turing machine $H'$ with the structure shown in Linz Figure 12.2.

Linz Fig. 12.2: Turing Machine H'

When $H'$ reaches the state where $H$ halts with a “yes” answer, it enters an infinite loop instead; otherwise $H'$ behaves the same as $H$.

From $H'$ we construct Turing machine $\hat{H}$, which takes an input $w_{M}$ and copies it, ending in the initial state $q_{0}$ of $H'$. After that, it behaves the same as $H'$.

The behavior of $\hat{H}$ is

$q_{0}w_{M} \;\vdash^{*}_{\hat{H}}\; q_{0}w_{M}w_{M} \;\vdash^{*}_{\hat{H}}\; \infty$

if $M$ applied to $w_{M}$ halts, and

$q_{0}w_{M} \;\vdash^{*}_{\hat{H}}\; q_{0}w_{M}w_{M} \;\vdash^{*}_{\hat{H}}\; y_{1}q_{n}y_{2}$

if $M$ applied to $w_{M}$ does not halt.

Now $\hat{H}$ is itself a Turing machine, which can also be encoded as a string $\hat{w}$.

So, let’s apply $\hat{H}$ to its own description $\hat{w}$. The behavior is

$q_{0}\hat{w} \;\vdash^{*}_{\hat{H}}\; \infty$

if $\hat{H}$ applied to $\hat{w}$ halts, and

$q_{0}\hat{w} \;\vdash^{*}_{\hat{H}}\; y_{1}q_{n}y_{2}$

if $\hat{H}$ applied to $\hat{w}$ does not halt.

In the first case, $\hat{H}$ goes into an infinite loop (i.e., does not halt) if $\hat{H}$ halts. In the second case, $\hat{H}$ halts if $\hat{H}$ does not halt. This is clearly impossible!

Thus we have a contradiction. Therefore, no such Turing machine $H$ exists. The halting problem is undecidable. QED.
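The self-referential construction in this proof has a familiar programming analog: no function `halts(f, x)` can correctly decide whether `f(x)` halts. In the sketch below (the names `refute` and `h_hat` are my own, for illustration), any candidate decider is refuted by the program built exactly as $\hat{H}$ is built in the proof:

```python
def refute(halts):
    """Show that a candidate halting decider halts(f, x) must be wrong:
    construct h_hat as in the proof and examine its behavior on itself."""
    def h_hat(p):
        if halts(p, p):      # decider claims p(p) halts ...
            while True:      # ... so loop forever instead (the q_y branch)
                pass
        return "halted"      # decider claims p(p) loops: halt (q_n branch)

    if halts(h_hat, h_hat):
        # Decider claims h_hat(h_hat) halts, but by construction
        # h_hat(h_hat) would take the infinite-loop branch: wrong.
        return "decider wrongly claims h_hat(h_hat) halts"
    else:
        h_hat(h_hat)         # actually halts, contradicting the decider
        return "decider wrongly claims h_hat(h_hat) loops"

# Any concrete guess is refuted, e.g. "everything halts" or "nothing halts":
print(refute(lambda f, x: True))
print(refute(lambda f, x: False))
```

Whatever answer the candidate decider gives about `h_hat` applied to itself, the construction makes that answer wrong, mirroring the two impossible cases in the proof.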

It may be possible to determine whether a Turing machine halts in specific cases by analyzing the machine and its input.

However, this theorem says that there exists no algorithm to solve the halting problem for all Turing machines and all possible inputs.

Linz Theorem 12.2: If the halting problem were decidable, then every recursively enumerable language would be recursive. Consequently, the halting problem is undecidable.

Proof: Let $L$ be a recursively enumerable language on $\Sigma$, $M$ be a Turing machine that accepts $L$, and $w_{M}$ be an encoding of $M$ as a string.

Assume the halting problem is decidable and let $H$ be a Turing machine that solves it.

Consider the following procedure.

  1. Apply $H$ to $w_{M}w$.
  2. If $H$ says “no”, then $w \notin L$.
  3. If $H$ says “yes”, then apply $M$ to $w$, which will eventually tell us whether $w \in L$ or $w \notin L$.

The above is thus a membership algorithm, so $L$ must be recursive. But we know that there are recursively enumerable languages that are not recursive. So this is a contradiction.

Therefore, HH cannot exist and the halting problem is undecidable. QED.
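The three-step procedure in the proof above can be sketched directly. Here `H` is the hypothetical halting decider and `M` a toy machine modeled as a Python function; both are illustrative stand-ins, since a real `H` cannot exist:

```python
def membership(H, M, w):
    """The membership algorithm from the proof of Linz Theorem 12.2,
    assuming a (hypothetical, impossible) halting decider H(M, w).
    M is modeled as a function returning True/False (accept/reject)
    when it halts; H reports whether M(w) halts at all."""
    if not H(M, w):
        return False        # M never halts on w, so w is not in L
    return M(w)             # safe to simulate: M is known to halt on w

# Toy instance: M accepts strings starting with 'a' and (artificially)
# "loops" on the empty string; this H knows that about this M.
def M(w):
    return w.startswith("a")

H = lambda M, w: w != ""    # pretend decider, valid only for this toy M

print(membership(H, M, "abc"))  # True
print(membership(H, M, "bcd"))  # False
print(membership(H, M, ""))     # False (M "would not halt")
```

The sketch makes the reduction visible: given `H`, membership becomes decidable, which contradicts the existence of recursively enumerable languages that are not recursive.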

12.1.4 Reducing One Undecidable Problem to Another

In the above, the membership problem for recursively enumerable languages is reduced to the halting problem.

A problem $A$ is reduced to a problem $B$ if the decidability of $B$ implies the decidability of $A$. We transform a new problem $A$ into a problem $B$ whose decidability is already known.

Note: The Linz textbook gives three example reductions in Section 12.1.

12.2 Undecidable Problems for Recursively Enumerable Languages

Linz Theorem 12.3 (Emptiness for Unrestricted Grammars is Undecidable): Let $G$ be an unrestricted grammar. Then the problem of determining whether or not

$L(G) = \emptyset$

is undecidable.

Proof: See Linz Section 12.2 for the details of this reduction argument. The membership problem for recursively enumerable languages is reduced to the problem in this theorem: if this problem were decidable, the membership problem would be too.

Linz Theorem 12.4 (Finiteness of Turing Machine Languages is Undecidable): Let $M$ be a Turing machine. Then the question of whether or not $L(M)$ is finite is undecidable.

Proof: See Linz Section 12.2 for the details of this proof.

Rice’s theorem, a generalization of the above, states that any nontrivial property of a recursively enumerable language is undecidable. The adjective “nontrivial” refers to a property possessed by some but not all recursively enumerable languages.

12.3 The Post Correspondence Problem

This section is not covered in this course.

12.4 Undecidable Problems for Context-Free Languages

Linz Theorem 12.8: There exists no algorithm for deciding whether any given context-free grammar is ambiguous.

Proof: See Linz Section 12.4 for the details of this proof.

Linz Theorem 12.9: There exists no algorithm for deciding whether or not

$L(G_{1}) \cap L(G_{2}) = \emptyset$

for arbitrary context-free grammars $G_{1}$ and $G_{2}$.

Proof: See Linz Section 12.4 for the details of this proof.

Keep in mind that the above and other such undecidability results do not eliminate the possibility that there may be specific cases (perhaps even many interesting and important cases) for which there exist decision algorithms.

However, these theorems do say that there are no general algorithms to decide these problems. There are always some cases in which specific algorithms will fail to work.

12.5 A Question of Efficiency

This section is not covered in this course.

12.6 References