Models of Language Recognition: Automata
6. Variations of Automata

The four types of automata, TM, LBA, PDA, and FA (either deterministic or nondeterministic), are typical models of computation that have been extensively investigated in the theory of computation. Many variations of these automata have been introduced, including variations in the number of tapes or stacks, the tape dimension, the number of heads, head mobility, and so on. This chapter introduces some of these interesting variations and discusses their computational capability. By computational capability we mean only a model's capacity to produce a final result, with no regard to the amount of space or time used. If two models have the same computational capability, we say they are equivalent. Simulation is a popular technique for showing that two computational models are equivalent: if we can show that each model can simulate the other, then they are equivalent. We will see a couple of applications of this technique.
6.1 Transducers: automata with output (Mealy automata, Moore automata)
6.2 Variations of TM and LBA (multi-track TM, multi-tape TM, 2D-tape TM)
6.3 Variations of PDA (2-way PDA, multi-stack PDA, PDA accepting with empty stack)
6.4 Variations of FA (2-way FA, 2D-tape FA, multi-head FA, FA array: cellular automata, hidden Markov models (HMM))
6.5 Church's hypothesis
Exercises
6.1 Transducers: Automata with Output

The automata we have introduced produce (implicitly) only the simple output "yes," by entering an accepting state when the input string belongs to the language. By attaching an output port to the finite state control, we can extend this limited output capacity so that, for a given input string, the automaton produces an output string of finite length (see figure (a) below for an FA). We call such automata transducers. Depending on when the output is produced, there are two variations: one produces an output while taking a transition, and the other after the transition (i.e., on the next state). For FA in particular, the former models are called Mealy automata and the latter Moore automata (see figures (b) and (c), where b is an output symbol).
[Figure: (a) a transducer, an FA with an output port attached; (b) a Mealy automaton, with a transition labeled a/b (read a, output b); (c) a Moore automaton, with the output b attached to a state.]
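To make the distinction concrete, here is a minimal executable sketch of a Mealy machine in Python; the state names and the transition table are invented for illustration. A Moore machine would instead attach an output symbol to each state and emit it on arrival.

```python
# Hypothetical Mealy machine: (state, input) -> (next state, output).
MEALY = {
    ("q0", "a"): ("q1", "b"),
    ("q1", "a"): ("q0", "b"),
}

def run_mealy(table, start, inputs):
    """Emit one output symbol per transition taken."""
    state, out = start, []
    for sym in inputs:
        state, o = table[(state, sym)]
        out.append(o)
    return "".join(out)

print(run_mealy(MEALY, "q0", "aaa"))  # -> "bbb"
```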
6.2 Variations of TM

The TM with a one-dimensional tape and one read/write head can be extended to multi-track, multi-tape, or multi-dimensional tape TM's. We will see that these variations are equivalent to the standard TM. In other words, if one of these variations recognizes a language L, we can build a standard TM that recognizes L, and vice versa.

Multi-track TM

A multi-track TM has a read/write tape organized into multiple (i.e., some constant k ≥ 2) tracks. The input string is written on one of the tracks, and during the computation all the cells in one column are read and then written simultaneously.
[Figure: a 2-track TM; in each step the head, controlled by the finite state control (states p, r), scans one whole column of the two tracks at a time.]
The figures below illustrate that, by encoding the k symbols that can appear in a column of the k-track tape as a single symbol, we can transform a k-track TM M into a standard 1-track TM M′ that recognizes the same language. Conversely, for a given 1-track TM, it is trivial to construct a k-track TM that recognizes the same language. Hence, 1-track TM's and multi-track TM's are equivalent.
[Figure: the 2-track TM M, with a transition such as ([a, B], [a, b], R) in state p, and the equivalent 1-track TM M′ with the corresponding transition over fresh symbols that encode whole columns, e.g., [a, B] = a′, [b, B] = b′, [a, b] = c′, [b, c] = d′, and so on.]
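The encoding step itself is mechanical, as the following sketch suggests; the helper names and the tuple-based encoding are illustrative assumptions, since any injective map from columns to fresh symbols would do.

```python
def encode_tracks(tracks):
    """Pack k equal-length tracks into one 1-track tape of column symbols."""
    assert len({len(t) for t in tracks}) == 1, "tracks must have equal length"
    return list(zip(*tracks))  # each tuple is one column, i.e., one new symbol

def decode_tracks(tape):
    """Recover the individual tracks from the encoded tape."""
    return [list(track) for track in zip(*tape)]

two_track = [["a", "b", "a"], ["B", "b", "c"]]   # "B" stands for a blank cell
tape = encode_tracks(two_track)                  # [('a','B'), ('b','b'), ('a','c')]
assert decode_tracks(tape) == two_track
```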
Multi-tape TM

A multi-tape TM has some constant number k ≥ 1 of auxiliary tapes in addition to the input tape. Each tape head can move independently of the others. Given a (k+1)-tape TM M, we can build a (k+1)-track TM M′ which simulates M as follows. Put the k+1 tracks of M′ in correspondence with the k+1 tapes of M, as illustrated below. Use a check mark on each track to indicate the position of its corresponding tape head. To simulate the next move of M, M′ scans its tape and collects the k+1 check-marked symbols, applies the transition function of M, and then updates its tape accordingly. Given a multi-track TM, it is trivial to construct a multi-tape TM that recognizes the same language. It follows that the two TM models are equivalent. Since multi-track TM's are equivalent to standard TM's, multi-tape TM's are equivalent to standard TM's.
[Figure: the multi-tape TM M, with an input tape and auxiliary tapes, and the corresponding multi-track TM M′, whose tracks hold the tape contents together with check marks for the head positions.]
2D-tape TM

We can extend the tape of the standard TM to multiple dimensions. A two-dimensional (2D) tape TM is illustrated below. The finite state control, with a read/write head under it, can move up, down, to the right, or to the left. This model is widely used for pattern analysis and searching. We can show that the tape's dimension does not affect computational capability. For example, given a standard TM M, it is trivial to build a 2D-tape TM M′ which, given the input string on one of its tape rows, simply simulates M step by step.

Conversely, given a 2D-tape TM M, constructing a standard TM M′ that simulates M is a little bit tricky. TM M′, with the 2D input mapped onto its one-dimensional tape row by row with boundary markers, must keep track of the current position of M. Here we present a rough sketch of the idea.
[Figure: a 2D-tape TM; the finite state control has a 4-way read/write head, and a pattern of a's and b's is laid out on the two-dimensional tape.]
Given a 2D pattern, we first extend it to the smallest enclosing rectangle by filling the blank cells around the boundary with a special symbol (# in the figure below). (We need this rectangular form for the convenience of keeping track of the current position of M when it moves up or down.) Then we copy the rectangular pattern row by row and concatenate the rows into a one-dimensional string, inserting a boundary marker ($ in the figure) between them. Finally, we put this one-dimensional string onto the standard TM tape, with a check mark on every symbol appearing in the column where M is positioned. Now we let M′ simulate M step by step. The simulation needs tedious housekeeping operations, especially when M moves up, down, or off the pattern. We leave these challenging details to the reader.
[Figure: the 2D-tape TM M with its 4-way read/write head, and the standard TM M′ whose tape holds the #-padded rows joined by $ markers, e.g., a b a $ b b # $ b a # $ ...]
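The row-by-row mapping itself is easy to state in code. The sketch below, with invented helper names, pads the pattern with # to its smallest enclosing rectangle, joins the rows with $, and shows how a 2D head position maps to a 1D index; moving up or down then becomes moving the index by width + 1.

```python
def linearize(pattern, r, c, blank=" "):
    """Map a 2D pattern and head position (r, c) onto a 1D tape and index."""
    width = max(len(row) for row in pattern)
    rect = [row.ljust(width).replace(blank, "#") for row in pattern]
    tape = "$".join(rect)
    index = r * (width + 1) + c      # +1 accounts for the '$' boundary markers
    return tape, index

tape, i = linearize(["ab", "b", "ba"], r=1, c=0)
print(tape)      # ab$b#$ba
print(tape[i])   # b  (the cell under the head); up/down = index -/+ (width+1)
```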
Variations of LBA

We can extend the LBA model to multi-track, multi-tape, and multi-dimensional tape LBA's, as we did for the TM. In these extensions, we restrict the additional tape space available to the size of the input. Recall that the tape space of the standard LBA is restricted to the length of the input. So, for example, a two-tape LBA cannot use an auxiliary tape longer than the input. We can show that these LBA variations are all equivalent to the standard LBA.
Classic Teaching
Break Time

A ten-year-old boy was failing math. His parents tried everything from tutors to hypnosis, but to no avail. Finally, at the insistence of a family friend, they decided to enroll their son in a private Catholic school. After the first day, the boy's parents were surprised when he walked in after school with a stern, focused and very determined expression on his face, and went right past them straight to his room, where he quietly closed the door. For nearly two hours he toiled away in his room, with math books strewn about his desk and the surrounding floor. He emerged long enough to eat, and after quickly cleaning his plate, went straight back to his room, closed the door, and worked feverishly at his studies until bedtime. This pattern continued ceaselessly until it was time for the first quarter report card. (To be continued below)
6.3 Variations of PDA

The PDA model also has many variations, such as the two-way (input head) PDA, the multi-stack PDA, the finite-turn PDA, and the PDA accepting by empty stack. A finite-turn PDA may make only a constant number of turns from pushing operations to popping and vice versa during the computation. An empty-stack PDA accepts the input string by emptying its stack (instead of being in an accepting state), together with the condition of having read up to the last input symbol. Contrary to the TM case, these variations of the standard PDA show different computational capabilities, and in general it is not easy to prove the differences.

Classic Teaching
Break Time
The boy walked in with his report card -- unopened -- laid it on the dinner table and went straight to his room. Cautiously, his mother opened it, and to her amazement, she saw a bright red "A" under the subject of MATH. Overjoyed, she and her husband rushed into their son's room, thrilled at his remarkable progress. "Was it the nuns that did it?", the father asked. The boy only shook his head and said, "No." "Was it the one-on-one tutoring? The peer-mentoring?" "No." "The textbooks? The teachers? The curriculum?" "Nope," said the son. "On that first day, when I walked in the front door and saw that guy they nailed to the 'plus sign,' I just knew they meant business!" - Denise -
2-way PDA

The 1-way input head of the PDA can be extended to a 2-way head, with the same two accepting conditions. This two-way mobility gives more computational power. For example, we can easily construct a 2-way PDA M that recognizes our familiar context-sensitive language {a^i b^i c^i | i > 0}. To accept this language, M scans the input twice. During the first scan, M reads up to the first c and checks whether the prefix before c is a^i b^i for some i > 0. If this check succeeds, M moves back to the left until it sees the last a. Then, scanning right, M makes sure that the postfix is b^i c^i. (We leave the details to the reader.) In Chapter 12 we will prove that this language is not context-free, so no standard PDA can recognize it.
[Figure: a 2-way PDA on the input aa...abb...bcc...c, with a 2-way read-only input head, a stack, and the bottom-of-stack symbol Z0.]
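The two-scan strategy can be mirrored in ordinary code. The following sketch is not a PDA implementation, only the scanning logic: one stack, a first left-to-right scan checking the a^i b^i prefix, a move back to the last a, and a second scan checking the b^i c^i postfix.

```python
def accepts(w):
    """Two scans with one stack, mirroring the 2-way PDA sketch above."""
    stack, i = [], 0
    # First scan: push a's, pop one per b; the stack must empty exactly
    # where the c's begin, establishing the prefix a^i b^i.
    while i < len(w) and w[i] == "a":
        stack.append("a"); i += 1
    while i < len(w) and w[i] == "b":
        if not stack:
            return False
        stack.pop(); i += 1
    if stack or i == 0 or i == len(w):
        return False
    first_c = i
    # Move back to the last a, then scan right checking the postfix b^i c^i.
    i = first_c - 1
    while i >= 0 and w[i] == "b":
        i -= 1
    i += 1
    while i < len(w) and w[i] == "b":
        stack.append("b"); i += 1
    while i < len(w) and w[i] == "c":
        if not stack:
            return False
        stack.pop(); i += 1
    return i == len(w) and not stack

assert accepts("aabbcc") and not accepts("aabbc") and not accepts("abbcc")
```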
Multi-stack PDA

If we give one extra stack to the standard PDA, the computational capability of the machine jumps to that of the TM. In other words, given a TM, we can construct a 2-stack PDA which simulates the TM. Here we sketch how. Before the simulation begins, the PDA reads the whole input and pushes it into one of its stacks, so that the leftmost input symbol, the one under the TM head, appears at the stack top, as figure (b) below shows.
[Figure: (a) the TM with its 2-way read/write head on the input abbab...aab; (b) the 2-stack PDA with a 1-way read-only input head, where one stack holds the input reversed (Z0 baa...babba) so that the leftmost input symbol is on top.]
The 2-stack PDA can arrange this by pushing the whole input into one of its stacks and then transferring it to the other stack. Notice that, since the input has no end marker, the PDA needs a nondeterministic move to guess the last symbol of the input while shifting the whole input string into a stack. During the simulation, we can always keep the symbol currently read by the TM at one of the stack tops (stack 2 in figure (b) below) by popping a symbol from one stack and pushing it onto the other. Using this trick, it is easy to simulate the TM step by step. (The details are left to the reader.)
[Figure: (a) the TM in mid-computation on abbab...aab; (b) the 2-stack PDA, where stack 1 (Z0 abba) holds the tape contents to the left of the head and stack 2 (Z0 baa...b) holds the rest, with the symbol under the TM head on top.]
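The heart of the simulation is this tape representation: everything left of the TM head sits on one stack (nearest cell on top) and the rest on the other (the scanned cell on top), so each head move is a single pop/push. A minimal sketch, with invented names:

```python
def move_right(left, right, blank="B"):
    """Shift the head one cell right: the scanned symbol moves onto `left`."""
    left.append(right.pop() if right else blank)

def move_left(left, right, blank="B"):
    """Shift the head one cell left."""
    right.append(left.pop() if left else blank)

# Tape "abb" with the head on the leftmost symbol; list ends are stack tops.
left, right = [], ["b", "b", "a"]
move_right(left, right)
print(left, right)   # ['a'] ['b', 'b']  -- the head now scans the first 'b'
```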
PDA's accepting by empty stack

Recall that for a PDA to accept the input string, it must satisfy two conditions: (1) the whole input has been read, and (2) the machine is in an accepting state. Substituting the empty stack for condition (2), we get a PDA accepting by empty stack. Here, by empty stack we mean the stack completely emptied by popping even the bottom-of-stack symbol Z0. No computational move is possible with an empty stack. We can show that the standard PDA and the PDA accepting by empty stack are computationally equivalent. Let Ma denote a standard PDA and Me a PDA accepting by empty stack. Given an Me, we can construct an Ma which recognizes the language L(Me): we simply change every transition of Me that pops Z0 into a transition that enters an accepting state without emptying the stack, as the following figure shows.
[Figure: (a) Me, with transitions (a, Z0/ε) and (ε, Z0/ε) that pop Z0; (b) Ma, where those transitions become (a, Z0/Z0) and (ε, Z0/Z0) into a new accepting state.]
Now suppose that we are given a standard PDA Ma, and assume that Ma never pops the bottom-of-stack symbol Z0. Otherwise, if Ma has a transition such as δ(p, a, Z0) = (q, ε), we introduce a new stack symbol, say A0, and let the machine rewrite Z0 with A0 instead of popping it (see the figure below). This modification does not affect the language accepted by Ma. Notice that Ma then never empties its stack, because no transition is defined with A0 at the stack top.
[Figure: each transition of the form (a, Z0/ε) or (ε, Z0/ε) is replaced by (a, Z0/A0) or (ε, Z0/A0), respectively; the start state q0 and the rest of the machine are unchanged.]
With this Ma, which never pops the bottom-of-stack symbol Z0, we construct an Me that recognizes L(Ma) as follows. Introduce a new state, say qe, and add an ε-transition to this state from every accepting state of Ma. Then let the machine dump the stack contents in state qe. (In the figure below, X denotes any stack symbol, including Z0.)
[Figure: (a) Ma; (b) Me, with transitions (ε, X/ε) from every accepting state of Ma into the new state qe, and a self-loop (ε, X/ε) on qe that empties the stack.]
We have seen that substituting the empty-stack condition for the condition of being in an accepting state does not affect the class of languages accepted, i.e., the classes of languages accepted by the two models are the same. In the proof we paid no attention to the type (deterministic or nondeterministic) of the PDA's Ma and Me. If the given Me is a DPDA (or an NPDA), then the PDA Ma that we construct is also a DPDA (respectively, an NPDA). But this does not work the other way around. Here is an example. It is easy to construct a standard DPDA which recognizes the language a+ (see figure (a) below). If we transform this standard DPDA into a PDA accepting by empty stack, we get an NPDA Me, as shown in figure (b). In fact, it is impossible to construct a DPDA that recognizes this language by empty stack: to accept the string a, the machine must empty its stack after reading one a, and then no move is left for accepting aa. So we can say that the standard DPDA is more powerful than the DPDA accepting by empty stack.
[Figure: (a) the DPDA Ma with transitions (a, Z0/Z0) leading to and looping in an accepting state; (b) the NPDA Me, which on reading an a must nondeterministically choose between the transition (a, Z0/Z0) and the popping transition (a, Z0/ε).]
6.4 Variations of FA

The FA also has many variations. In this section we introduce the 2-way FA, the 2D-tape FA, the multi-head FA, cellular automata, and the hidden Markov model (HMM).

2-way FA

It is interesting to see that, unlike the 2-way PDA, the 2-way FA is computationally equivalent to the standard 1-way FA. This implies that for FA, two-way mobility of the head does not add any capability for recognizing languages. In fact, there is an algorithm that, given a 2-way FA, constructs a 1-way FA recognizing the same language. The algorithm is given in Appendix B with an example.
[Figure: a 2-way FA with a 2-way read-only head on the input a b a b b.]
2-dimensional (2D) tape FA

A 2D-tape FA is an FA with a read-only 2D tape. As in the 2D-tape TM model, the finite state control can move up, down, to the left, or to the right. However, it cannot write. This model is widely used in the area of pattern recognition. Sometimes the tape is restricted to some constant size.
[Figure: a 2D-tape FA; the finite state control moves over a read-only 2D tape holding a pattern of a's and b's.]
Multi-head FA

We can add extra read-only heads to the standard FA model. Initially, all the heads are assumed to be reading the leftmost input symbol. The heads' mobility is usually restricted to one-way, and we may (or may not) allow the heads to sense the presence of the others on the same cell.

A multi-head FA is more powerful than the standard FA. We can construct a 2-head FA that recognizes {a^i b^i | i > 0}, which is a non-regular context-free language. (In Chapter 12 we will prove that this language is not regular; the details of the construction are left to the reader, and a sketch of the idea appears below.) We can also construct a 1-way 2-head NFA that recognizes the context-sensitive language {ww$ | w ∈ {a, b}+}, which is not context-free. However, this does not mean that multi-head FA outperform PDA: it has been shown that no 1-way multi-head FA can recognize the context-free language {xx^R$ | x ∈ {a, b}+}.
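As a hint for the construction, here is the 2-head idea written as ordinary code rather than as a transition graph: head 2 is first parked on the first b, and then both heads advance in lockstep.

```python
def accepts(w):
    """1-way 2-head check for {a^i b^i | i > 0}; heads are plain indices."""
    h2 = 0
    while h2 < len(w) and w[h2] == "a":   # park head 2 on the first b
        h2 += 1
    if h2 == 0 or h2 == len(w):
        return False
    h1 = 0
    while w[h1] == "a" and h2 < len(w) and w[h2] == "b":
        h1 += 1                            # head 1 walks over the a's
        h2 += 1                            # head 2 walks over the b's
    return w[h1] == "b" and h2 == len(w)   # both runs end at matching spots

assert accepts("aabb") and not accepts("aab") and not accepts("abab")
```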
FA array: cellular automata

A cellular automaton (CA) is a regular array of FA's, called cells, connected by one-way or two-way communication lines between neighboring cells. All the cells operate in synchrony. The CA is a typical fine-grained parallel computer model that has been extensively investigated and implemented. An input string is either given to a dedicated cell, which reads it in sequentially, or supplied in parallel to all the boundary cells. Interested readers are referred to "Theory and Applications of Cellular Automata" (Stephen Wolfram, 1986, World Scientific) or "Cellular Automata: Theory and Experiment" (edited by Howard Gutowitz, 1991, MIT Press).
[Figure: (a) a 1D cellular automaton, a linear array of cells; (b) a 2D cellular automaton, a grid of cells.]
Hidden Markov Models (HMM)

A Markov model (MM) is a probabilistic model widely used in the areas of pattern matching and image recognition. The MM is a variation of the Moore automaton, with a transition probability on each edge (instead of an input symbol) and an output probability attached to each output symbol (see the figure below). In general, no start state is designated; every state can be the start state with a given probability.

An MM for a given system can be constructed by estimating the probabilities from the sequences of state transitions and outputs collected by observing the system. An HMM is the case where only the output sequence is available, while the state transitions are hidden and therefore not observable.
[Figure: a two-state HMM; a transition between the states carries probability 0.3, one state outputs a with probability 0.1 and b with probability 0.9, and the other outputs a and b with probability 0.5 each.]
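A standard computation on an HMM is the probability of an observed output sequence, obtained by summing over all hidden state sequences with dynamic programming (the forward algorithm). The sketch below uses probabilities of the kind suggested by the figure; all of the numbers are illustrative.

```python
states = ["s1", "s2"]
init   = {"s1": 0.5, "s2": 0.5}                  # assumed start probabilities
trans  = {"s1": {"s1": 0.7, "s2": 0.3},          # transition probabilities
          "s2": {"s1": 0.4, "s2": 0.6}}
emit   = {"s1": {"a": 0.1, "b": 0.9},            # output probabilities
          "s2": {"a": 0.5, "b": 0.5}}

def forward(obs):
    """P(obs) summed over all hidden state sequences (forward algorithm)."""
    alpha = {s: init[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
                 for s in states}
    return sum(alpha.values())

print(forward("ab"))   # probability that the hidden model emits "ab"
```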
6.5 Church's Hypothesis

In this chapter we learned that there are many variations of the four types of automata. Some of them (e.g., the multi-tape TM) are equivalent to the standard model, and others (e.g., the 2-way PDA and the multi-head FA) show upgraded computational capability. With regard to these variations and other possible computational models, we may ask the following questions: Is there any upper limit on computational capability? In particular, is there any computational model more powerful than the TM? To these questions Church proposed a thesis stating that all possible models of computation, if they are sufficiently broad, must be equivalent to the TM model. This is called Church's hypothesis, or the Church-Turing thesis. The hypothesis implies that there is an upper limit on computational capability, namely the TM's computability. Every computational model introduced so far has turned out to have no more power than the TM model, supporting the hypothesis that no more powerful model exists.
Exercises

6.1 Construct the state transition graph of a DPDA which recognizes each of the following languages by empty stack.
(a) L1 = {a^i b^j | i > j > 0}
(b) L2 = {a^i b^j | i > j > 0}
(c) L3 = {a^i b^k c^i | i, k > 0}
6.2 Construct the transition graph of a 2-head 1-way DFA which recognizes the language L below, and briefly describe how it works. The two heads can move independently to the left or right, or stay, after reading the input. When you construct the 2-head 1-way DFA, use the notation shown in the figure below. (In Chapter 12 we will prove that it is impossible for any standard FA to recognize L.)

L = {a^i b^i | i > 0}
[Figure: notation for 2-head moves. An edge from state p to state q labeled (a, b), (a, a) means: in state p, if head 1 reads a and head 2 reads b, or both heads read a, the 2-head DFA enters state q. An edge from state t to state s labeled (a, ε) means: in state t, head 2 does not read, and if head 1 reads a, the 2-head DFA enters state s.]
6.3 Construct the state transition graph of a 2D-tape DTM which draws a figure eight with the character 8 (see figure (b)), using as few states as possible. When the work is complete, your DTM should be in an accepting state.
[Figure: (a) the 2D-tape DTM with its finite state control; (b) the target pattern, a figure eight drawn with 8's.]

English Professor
Break Time

An English professor wrote the words "a woman without her man is nothing" on the blackboard and directed the students to punctuate it correctly. The men wrote: "A woman, without her man, is nothing." The women wrote: "A woman: without her, man is nothing." - Mike & Corinne -
Hierarchy of the Models
7. Hierarchy of the Models: Characterization

We have studied four classes of formal languages and their grammars, four classes of automata and their variations, and other models for representing or generating languages, such as regular expressions, syntax flow graphs, and L-systems. In this chapter we study the relations among these models, known as the Chomsky hierarchy. The Chomsky hierarchy reveals two important relationships, characterizations and containments, among the four classes of languages and automata. The characterizations relate the models for language generation (i.e., grammars) to those for language recognition (i.e., automata) or expression. For example, a language L can be generated by a regular grammar if and only if it is recognizable by an FA, and a language L is recognizable by an FA if and only if it is expressible by a regular expression. The containments give the set relations among the classes of languages generated (recognized) by the four types of grammars (respectively, automata). For example, the class of languages generated by context-free grammars (recognized by PDA) properly contains the class of languages generated by regular grammars (recognized by FA). The same relation holds among the classes of languages generated (recognized) by type 0, 1, and 2 grammars (respectively, TM, LBA, and PDA). In terms of computational capability, this containment relation among the four classes of languages recognized by TM, LBA, PDA, and FA implies that anything computable by an LBA is also computable by a TM, but not necessarily the other way around; anything computable by a PDA is also computable by an LBA; and so on. The Chomsky hierarchy therefore provides valuable information for designing an efficient computational model, as well as for analyzing the computational capability of a given model. This chapter proves the characterization of the models at the lowest level of the hierarchy, i.e., regular grammars, FA, and regular expressions. Laying the groundwork in the following chapters, we will complete the proof of the hierarchy in Chapters 12 and 15.
7.1 Chomsky Hierarchy
7.2 Proof of Characterization
    Constructing an FA from a regular grammar G that recognizes L(G)
    Constructing a regular grammar from an FA M that produces L(M)
    Constructing an FA from a regular expression R that recognizes L(R)
    Constructing a regular expression from an FA M that expresses L(M)
Rumination
Exercises
Today’s Quote
Break Time
You’re alive. Do something. The directive in life, the moral imperative was so uncomplicated. It could be expressed in single words, not complete sentences. It sounded like this: Look. Choose. Act. - Barbara Hall -
7.1 Chomsky Hierarchy

The Chomsky hierarchy shows the characterization relations between two different types of models, as illustrated below. For example, L is a language generated by a type 0 grammar if and only if it is recognized by a TM. In this chapter we prove only the characterization relations among regular grammars, FA's, and regular expressions; we defer the proofs of the other characterizations to Chapter 15. Those proofs are challenging and are usually included in a graduate course.
[Figure: characterization pairs. Type 0 grammars <-> TM; CSL <-> LBA; CFL <-> PDA; regular grammars <-> FA <-> regular expressions.]
For i ∈ {0, 1, 2, 3}, let TYPE-iL be the class of languages generated by type i grammars. The Chomsky hierarchy shows the proper containment relation among the four classes of languages, as illustrated by the Venn diagram below. The class of regular languages is properly contained in the class of context-free languages. The class of context-free languages is properly contained in the class of context-sensitive languages, which is in turn contained in the class of type 0 (phrase-structured) languages. In Chapter 12, we will see an elegant proof of these containment relations.
[Figure: Venn diagram. Type-3L ⊂ Type-2L ⊂ Type-1L ⊂ Type-0L.]
The characterization and containment relations of the four classes of languages imply a hierarchy in the computational capability of the four types of automata, as illustrated in the figure below. Every language recognized by an FA can be recognized by a PDA, every language recognized by a PDA can be recognized by an LBA, and so on.
[Figure: the capability hierarchy of the automata: FA ⊂ PDA ⊂ LBA ⊂ TM.]

The figure below summarizes these relations, called the Chomsky hierarchy (named after Noam Chomsky, who defined the four classes of languages), among the models we have studied, together with some other interesting models investigated by researchers. This hierarchy is a beautiful piece of knowledge that computer scientists have gained through the advancement of the field.
Languages (grammars)                      Automata                   Other models
Recursively enumerable sets (type 0)  <-> Turing machines        <-> Post systems, Markov algorithms, μ-recursive functions
Context-sensitive languages (type 1)  <-> Linear-bounded automata
Context-free languages (type 2)       <-> Pushdown automata
Regular languages (type 3)            <-> Finite state automata  <-> Regular expressions

(<-> denotes characterization; each row of languages properly contains the one below it.)
7.2 Proof of the Characterization

The theorem below states the characterization relations among the models at the lowest level of the Chomsky hierarchy, illustrated by the following figure. We will prove this theorem.

[Figure: regular languages, FA, and regular expressions characterize one another.]

Theorem. Let L be a language.
(1.1) If L can be generated by a regular grammar, then there is an FA that recognizes L. (This implication will be denoted by RG → FA.)
(1.2) If L is recognizable by an FA, then L can be generated by a regular grammar (FA → RG).
(2.1) If L is expressible by a regular expression, then there is an FA that recognizes L (Re → FA).
(2.2) If L is recognizable by an FA, then L is expressible by a regular expression (FA → Re).
Proof (1.1): RG → FA

Let G be a regular grammar. For the proof, we show how to construct an FA that recognizes L(G). Suppose that A and B are arbitrary nonterminal symbols and a and b are terminal symbols. The figure, summarized below, shows how the typical production rules appearing in a regular grammar are transformed into state transitions of an FA recognizing the language generated by the grammar (heavy circles denote accepting states).

• Rule A → abB: add a transition on a from state A to a new intermediate state, and a transition on b from there to state B.
• Rule A → ε: let the state with label A be an accepting state.
• S is the start symbol: let the state with label S be the start state.
• Rule A → a: put a transition on a from state A to a new accepting state.
Example. Transforming a regular grammar to an FA. The following example illustrates how to transform an arbitrary regular grammar into an FA that recognizes the language generated by the grammar. (The rules below are reconstructed from the original figure; ε marks an empty right side.)

S → abcS | cC
A → aS | B
B → aA | a | ε
C → aB | ac
[Figure: the resulting FA, with start state S, states A, B, C, and intermediate states for the multi-symbol rules.]

For the proof, let Σ be the set of terminal symbols of G. We must show that for every string x ∈ Σ*, x ∈ L(G) if and only if x is accepted by the FA constructed by the above idea. Suppose that string x is derivable by the grammar as follows:

S ⇒ w1V1 ⇒ w1w2V2 ⇒ . . . ⇒ w1w2 . . . wn-1Vn-1 ⇒ w1w2 . . . wn-1wn = x,

where each Vi is a nonterminal symbol and each wi is a terminal string (or ε) such that Vi → wi+1Vi+1 is a rule in G. We can prove that the string x is accepted by the FA, and the converse is also true. (We leave the detailed proof to the reader.)
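The construction is easy to mechanize. The sketch below assumes a right-linear grammar given as a dict from a nonterminal to its right-hand sides, with "" standing for ε, and with every non-empty rule containing at least one terminal before the optional trailing nonterminal; all names are illustrative.

```python
def grammar_to_nfa(rules, start="S"):
    """Return (transitions, accepting states, start state) of an NFA."""
    delta, accepting, counter = {}, set(), [0]

    def new_state():
        counter[0] += 1
        return f"t{counter[0]}"

    def add(p, a, q):
        delta.setdefault((p, a), set()).add(q)

    for A, rhss in rules.items():
        for rhs in rhss:
            if rhs == "":                 # A -> epsilon: state A accepts
                accepting.add(A)
                continue
            tail = rhs[-1] if rhs[-1] in rules else None
            body = rhs[:-1] if tail else rhs
            state = A
            for ch in body[:-1]:          # spell out multi-symbol bodies
                nxt = new_state()
                add(state, ch, nxt)
                state = nxt
            if tail:                      # A -> wB: last edge enters state B
                add(state, body[-1], tail)
            else:                         # A -> w: enter a new accepting state
                final = new_state()
                add(state, body[-1], final)
                accepting.add(final)
    return delta, accepting, start

nfa = grammar_to_nfa({"S": ["aS", "bA"], "A": ["a", ""]})  # a* b (a | eps)
```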
Proof (1.2): FA → RG

Given an arbitrary FA, we transform it into a regular grammar as follows. First, label the start state with the designated start symbol S and the other states with arbitrary (distinct) nonterminal symbols. Then transform each transition into a rule, as summarized below.
[Figure: transition-to-rule correspondence. A state A with a transition on a to itself and transitions on b and c to a state B yields the rules A → aA | bB | cB; the start state S makes S the start symbol; an accepting state A yields the rule A → ε.]
Example: Transforming an FA into a regular grammar.

[Figure: the given FA; its states are labeled S (start state), A, B, C, D, E, and each transition is read off as a rule.]

Transforming each transition into a rule gives the grammar G (reconstructed from the original figure; ε marks the rule of an accepting state):

S → aS | aA
A → bB
B → bB | bS | aD
C → aB | cE
D → aC
E → ε
To complete the proof, we need to show that for every string x, the FA accepts x if and only if the grammar G generates x. (The detailed proof is left to the reader.)
Proof (2.1): Re → FA

Let R be a regular expression which expresses a regular language L, and let Σ be the alphabet of L. Following the inductive definition of regular expressions, we show how to construct an FA that recognizes the language expressed by R.

(1) If R is the regular expression ∅, ε, or a for a symbol a ∈ Σ, which respectively express the language ∅ (the empty set), {ε}, or {a}, we construct an FA recognizing the respective language as follows.
[Figure: the three base FA's: for ∅, a start state with no accepting state; for ε, a start state that is itself accepting; for a, a transition on a from the start state to an accepting state.]
(2) Suppose that we have constructed two FA's M1 and M2 which recognize the languages expressed by the regular expressions r1 and r2, respectively: L(M1) = L(r1) and L(M2) = L(r2).
(3) Using M1 and M2, we construct FA's M1+2, M12, and M1*, which respectively recognize the languages expressed by the regular expressions r1 + r2, r1r2, and (r1)*, as follows.

To construct the FA M1+2, introduce a new start state and link it by ε-transitions to the start states of M1 and M2, as the following figure illustrates. Clearly, L(M1+2) = L(r1 + r2).
[Figure: M1+2; a new start state with ε-transitions to the start states of M1 and M2. L(M1+2) = L(r1 + r2).]
To construct the FA M12, we first convert all accepting states of M1 into non-accepting states. Then, from each of these states, we set up an ε-transition to the start state of M2. Clearly, L(M12) = L(r1r2).
[Figure: M12; M1's former accepting states are linked by ε-transitions to the start state of M2. Assuming L(M1) = L(r1) and L(M2) = L(r2), we get L(M12) = L(r1r2).]
Break Time

A car was involved in an accident. As one might expect, a large crowd gathered. A newspaper reporter, anxious to get his story, pushed and struggled to get near the car. Being a clever sort, he started shouting loudly, "Let me through! Let me through, please! I am the son of the victim." The crowd made way for him. Lying in front of the car was a donkey. - Anonymous -
Now we construct an FA M1* to recognize the language expressed by r1*. First introduce a new accepting start state; then set up ε-transitions from this new start state to the old start state, and from every accepting state back to the new start state (see the illustration below). Notice that we need the new start state to be accepting so that the FA accepts ε, which is in the language expressed by r1*.
[Figure: M1*; a new accepting start state with an ε-transition to M1's old start state, and ε-transitions from every accepting state of M1 back to the new start state. L(M1*) = L(r1*).]
Notice that if, when constructing M1*, we reused the old start state as illustrated in figure (a) below, without introducing a new one, the resulting FA could accept a string not in L(r1*). To see why, consider an FA whose start state lies on a cycle, as shown in figure (b). The string ab is not accepted by M1 and should not be in L(r1*); however, figure (c) shows that the flawed construction accepts ab.
[Figure: (a) the flawed construction that makes the old start state accepting and loops back to it; (b) an M1 whose start state is on a cycle reading a and then b; (c) the flawed M1*, which accepts ab.]
Example: Constructing an FA for a given regular expression. Based on the approach given above, we show a step-wise construction of an FA which recognizes the language expressed by the regular expression ((ab + ε)ba)*.
[Figure: the step-wise construction, building FA's for a, b, ab, ab + ε, ba, (ab + ε)ba, and finally ((ab + ε)ba)*.]
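These constructions can be written directly as combinators on NFA fragments, as in the sketch below; the fragment representation (transition list, start state, accepting set) and the global state counter are implementation choices of this sketch, not part of the proof.

```python
import itertools
_ids = itertools.count()

def sym(a):
    """Base FA for a single symbol; use a = "" for an epsilon edge."""
    s, f = next(_ids), next(_ids)
    return ([(s, a, f)], s, {f})

def union(m1, m2):                 # new start state with epsilon edges to both
    t1, s1, f1 = m1; t2, s2, f2 = m2
    s = next(_ids)
    return (t1 + t2 + [(s, "", s1), (s, "", s2)], s, f1 | f2)

def concat(m1, m2):                # epsilon edges from M1's accepts to M2's start
    t1, s1, f1 = m1; t2, s2, f2 = m2
    return (t1 + t2 + [(f, "", s2) for f in f1], s1, f2)

def star(m1):                      # new accepting start state; loop back into it
    t1, s1, f1 = m1
    s = next(_ids)
    return (t1 + [(s, "", s1)] + [(f, "", s) for f in f1], s, f1 | {s})

# ((ab + epsilon)ba)* assembled exactly as in the example above:
ab = concat(sym("a"), sym("b"))
m = star(concat(union(ab, sym("")), concat(sym("b"), sym("a"))))
```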
Proof (2.2): FA → Re

Let M be an FA. We show a method for systematically transforming the state transition graph of M into a regular expression which expresses L(M). We first view all the edge labels (i.e., the input symbols) of the transition graph as regular expressions (see figure (b) below). We then interpret the transition graph as follows: if there is an edge from a state p to a state q labeled with a regular expression r, then M, in state p reading any string in L(r), enters state q. Extending the transition function δ, we denote this by δ(p, r) = q. Clearly, labeling the edges with regular expressions this way does not affect the language accepted by M.
[Figure: (a) a transition graph with parallel edges labeled a, b and a, b, c; (b) the same graph with the parallel labels merged into the regular expressions a+b and a+b+c.]
Now let G be the state transition graph with its edges labeled by regular expressions. We eliminate the states of G one by one (except for the start state and the accepting states), manipulating the edges and their labels without affecting the language recognized by the automaton. The following example shows how.

[Figure: eliminating state 2 replaces the path through it by an edge labeled a(a+b)*(a+b+c) from state 1 to state 5; eliminating states 3 and 4 yields a second edge labeled baa; merging these parallel edges gives a single edge labeled a(a+b)*(a+b+c)+baa.]

Clearly, the label a(a+b)*(a+b+c)+baa on the edge from the start state to the accepting state 5 is a regular expression denoting the set of strings accepted at state 5 of the automaton.
The figures below show the typical case of eliminating a state and manipulating the edge labels. Suppose state q has a self-loop labeled f, incoming edges labeled a from state r and d from state s, and outgoing edges labeled b and c. Eliminating q introduces the direct edges labeled af*b, af*c, df*b, and df*c. Clearly, the same idea works when an arbitrary regular expression is substituted for each simple label in the graph.

If the state transition graph has k (≥ 1) accepting states, the language L(M) is the union of the languages accepted at these states. Suppose that M has k accepting states and, for each i, we have found a regular expression ri denoting the language accepted at the i-th accepting state. Then the regular expression r denoting the language accepted by M is

r = r1 + r2 + . . . + rk
Example: We compute a regular expression denoting the language accepted by the FA shown in figure (a) below. We compute regular expressions r4 and r0 denoting the languages accepted at the accepting states 4 and 0, respectively, as illustrated by figures (b) and (c), and then find the answer r = r0 + r4.

[Figure: (a) the FA, with start state 0, accepting states 0 and 4, and further states 1, 2, 3; (b) the graph reduced toward accepting state 4, yielding r4; (c) the graph reduced toward accepting state 0, yielding r0.]
To compute r4, we change state 0 (the start state) into a non-accepting state and, keeping only the start state and the accepting state 4, eliminate all the other states one by one, beginning with state 2. The order of elimination does not matter: although the resulting regular expressions may differ, they express the same language. It is convenient to eliminate first the states that involve fewer transition edges.

[Figure: the graph before and after eliminating state 2.]
Parallel transition edges (i.e., edges having the same origin and destination) are merged into one edge, with their labels joined by the operator +. In the figures below, notice how the two transitions from state 1 to state 4 and the two loop transitions on state 1 are each merged into one (for example, a and ba merge into a+ba, and b and ε into b+ε).

[Figure: the graph before and after merging the parallel edges.]
[Figure: eliminating state 3 and merging the resulting parallel edges; the new labels bba, b+bb, and ba appear on the edges among states 0, 1, and 4.]
[Figure: eliminating state 1 leaves only the start state 0 and the accepting state 4, with parallel edges such as the self-loops a(a+ba)*b and b on state 0, the edge a(a+ba)*(b+ε) from 0 to 4, and the edges bba(a+ba)*b and ba from 4 to 0.]
[Figure: after merging the parallel edges, state 0 carries the self-loop a(a+ba)*b+b, the edge from 0 to 4 is labeled a(a+ba)*(b+ε), the edge from 4 to 0 is labeled bba(a+ba)*b+ba, and state 4 carries the self-loop bba(a+ba)*(b+ε)+(b+bb).]
In general, with all the states eliminated except for the start state and one accepting state, we get a transition graph as shown in figure (i) below; one more step remains. Let r11 label the self-loop on the start state 1, r12 the edge from 1 to the accepting state 2, r21 the edge from 2 back to 1, and r22 the self-loop on 2. To enter the accepting state, the machine must follow a path described by (r11)*r12. Once in the accepting state, it has two ways of returning there: the self-loop r22, and the excursion r21(r11)*r12 through state 1 (figure (ii)). Merging these two loops into one (figure (iii)), we read off the regular expression denoting the language accepted at state 2 (figure (iv)):

r2 = (r11)*r12 (r22 + r21(r11)*r12)*
Now we go back to the reduced graph obtained in the series of state eliminations and apply the idea just developed. Using the short notation rij for the label on the edge from state i to state j (so r00 = a(a+ba)*b+b, r04 = a(a+ba)*(b+ε), r40 = bba(a+ba)*b+ba, and r44 = bba(a+ba)*(b+ε)+(b+bb)), the language accepted at state 4 is expressed by

r4 = (r00)*r04 (r44 + r40(r00)*r04)*
To compute the regular expression r0 denoting the language accepted at state 0 (the start state) of the FA, we can use the same reduced graph: we change the start state back into an accepting state and make state 4 non-accepting, as shown below.

[Figure: the same two-state graph, now with state 0 accepting and state 4 non-accepting.]
By eliminating state 4 and merging the resulting two self-loops on state 0 (the excursion through state 4 becomes the self-loop r04(r44)*r40, which is merged with r00), we finally get

r0 = (r00 + r04(r44)*r40)*

Finally, substituting the full expressions back for the short notations in r0 + r4, we get a regular expression r which denotes the language recognized by the given FA.
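The whole procedure can be automated over labels kept as strings, as the following sketch suggests; the edge-map representation is an assumption of this sketch, and the bracketing is conservative (it may insert parentheses the hand computation omits, e.g., (ba)a for baa).

```python
def paren(r):
    return r if len(r) == 1 else "(" + r + ")"

def merge(r1, r2):
    """Parallel edges are merged with the operator +."""
    return r2 if r1 is None else r1 + "+" + r2

def eliminate(edges, s):
    """Remove state s, rerouting every p -> s -> q path via a (self-loop)* label."""
    loop = edges.pop((s, s), None)
    mid = "" if loop is None else paren(loop) + "*"
    ins  = [(p, r) for (p, q), r in edges.items() if q == s]
    outs = [(q, r) for (p, q), r in edges.items() if p == s]
    for key in [k for k in edges if s in k]:
        del edges[key]
    for p, a in ins:
        for q, b in outs:
            edges[(p, q)] = merge(edges.get((p, q)), paren(a) + mid + paren(b))

# The first example above: states 1..5, start state 1, accepting state 5.
edges = {(1, 2): "a", (2, 2): "a+b", (2, 5): "a+b+c",
         (1, 3): "b", (3, 4): "a", (4, 5): "a"}
for s in (2, 3, 4):
    eliminate(edges, s)
print(edges[(1, 5)])   # a(a+b)*(a+b+c)+(ba)a, i.e., ...+baa
```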
Rumination (1): FA → Re

• The state elimination technique is easy to practice on a simple graph with paper and pencil. However, it becomes messy for a large graph. There is a beautiful algorithmic approach based on dynamic programming, called the CYK algorithm, which is presented in Appendix C.
• Depending on the order in which the states are eliminated, we may get different regular expressions. However, they are all equivalent, i.e., they denote the same language recognized by the FA.
• It is an intractable problem (solvable, but with no practical algorithm available) to tell whether two arbitrary regular expressions are equivalent. However, for particular cases, especially when the expressions are simple, we can solve the problem with techniques available in this text. The figures below prove the two equivalences (1) and (2); the same equivalences hold when an arbitrary regular expression is substituted for a or b. (In the following chapter, we will learn how to convert an NFA to a DFA and a method to minimize the number of states of a DFA.)

(1) a* = (a*)*    (2) (a + b)* = (a*b*)*

[Figure: each side of an identity is converted to an FA, the ε-transitions are eliminated, the result is converted to a DFA, and the states are minimized; both sides reduce to the same minimal DFA, a single accepting state looping on a for (1) and looping on a, b for (2).]
Exercises

7.1 Using the technique presented in Section 7.2, transform the following state transition graph of an FA into a regular grammar which generates the language recognized by the FA.

[Figure: the FA for Exercise 7.1, with transitions labeled a, b, and a,b.]
7.2 Using the technique presented in Section 7.2, transform the following regular grammar into the state transition graph of an FA which recognizes the language generated by the grammar (ε marks an empty right side):

S → abS | cC
A → aS | B | a
B → aA | ε
C → aB | abc

7.3 Compute a regular expression that denotes the language recognized by the FA shown in Problem 7.1 above. Show the procedure that you took to get your answer.

7.4 Let L be the language denoted by the following regular expression. (a) Construct the state transition graph of an FA that recognizes L, and (b) construct a regular grammar that generates L. Show the procedure that you took to find each answer.

((ba + b)* + (cd(a + b))*)bba
Models of Language Recognition: Automata
8. Manipulating FA

In the preceding chapter, when we proved the characterizations among the class of regular languages, FA's, and regular expressions, we used NFA's. Indeed, converting a regular grammar or a regular expression to an FA usually yields an NFA. The NFA is a conceptual model that cannot be implemented directly with any currently available technology. However, since every NFA can be converted to a DFA, NFA's serve as convenient tools for solving problems concerning regular languages and for designing DFA's, which are directly realizable. Figures (a) and (b) below show, respectively, an NFA and a DFA recognizing the same language {xa | x ∈ {a, b}+}. In this chapter, we learn how to convert an NFA to a DFA accepting the same language.

[Figure: (a) an NFA and (b) a DFA, both recognizing {xa | x ∈ {a, b}+}.]
When we designed automata earlier, we did not pay much attention to their size. However, when designing an automaton as a model for implementation, we must take the size into account, because it may directly affect the size and complexity of the implemented hardware or software. Figures (a) and (b) below show pairs of DFA's and DPDA's of different sizes, respectively accepting the languages a* and {a^i b^i | i > 0}. Given an arbitrary deterministic automaton (not to mention a nondeterministic one), it is a very difficult problem to reduce its number of states. No practical algorithms are available except for the DFA. For the TM and the LBA the problem is unsolvable, and for PDA's it remains open. We will shortly learn how to minimize a DFA.
[Figure: (a) two equivalent DFA's of different sizes accepting a*; (b) two equivalent DPDA's of different sizes accepting {a^i b^i | i > 0}, with transitions such as (a, Z0/aZ0), (a, a/aa), (b, a/ε), and (ε, Z0/Z0).]
8.1 Converting an NFA to a DFA
8.2 State Minimization
Rumination
Exercises
Family Gift
Break Time

Amanpreet heard a rumor that his father, grandfather and great-grandfather had all walked on water on their 21st birthdays. So, on his 21st birthday, Amanpreet and his good friend Brian headed out to the lake. "If they did it, I can too!" he insisted. When Amanpreet and Brian arrived at the lake, they rented a boat and began paddling. When they got to the middle of the lake, Amanpreet stepped off of the side of the boat . . . and nearly drowned. Furious and somewhat shamed, he and Brian headed for home. When Amanpreet arrived back at the family farm, he asked his grandmother for an explanation. "Grandma, why can I not walk on water like my father, and his father, and his father before him?" The feeble old grandmother took Amanpreet by the hands, looked into his eyes, and explained, "That's because your father, grandfather, and great-grandfather were born in January . . . you were born in July, dear." - Raman -
8.1 Converting an NFA to a DFA

In this section we study how to convert an NFA M = (Q, Σ, δ, q0, F) into a DFA M′ = (Q′, Σ, δ′, q0′, F′) which recognizes the same language. For t ∈ Σ and A, B ⊆ Q, extend δ so that δ(A, t) gives the set of all next states that M can enter after reading the symbol t in some state in A. To convert M to M′, first let {q0} be the start state of M′ and compute the set δ({q0}, t) = A for every t ∈ Σ. Then, in M′, create a new state labeled A and add the transition δ′({q0}, t) = A (see figure (b)). For each new state A created, compute δ(A, t) = B for each t ∈ Σ, create a state labeled B if it does not already exist, and add the transition δ′(A, t) = B, and so on. This procedure is repeated until no new state is added to M′. The idea is best understood through the following example.

[Figure: (a) an NFA M with states 0, 1, 2; (b) the DFA M′ under construction, with states such as {0}, {0,1}, {1,2}, and {2}.]
Now, finally, we define the accepting states of M′. Suppose that A ⊆ Q is defined as a state of M′. This implies that for some string x with δ′({q0}, x) = A, on reading x the NFA M simultaneously enters all the states in A. If there is an accepting state i in A, then by definition x is accepted by M, ignoring all the non-accepting cases. Thus, to make M′ accept the same language, we must define A as an accepting state, as figure (b) below illustrates. If A contains no accepting state, then state A in M′ remains non-accepting.
[Figure: (a) a nondeterministic transition of M on x into the states i, j, . . ., k; (b) the corresponding deterministic transition of M′ on x into the single state {. . , i, j, . . , k}.]
In the example, since state 2 is the only accepting state of M, in M′ we define every state whose label contains 2 as an accepting state. It is easy to prove that L(M) = L(M′). (For the proof, it is enough to show that for every string x ∈ Σ*, x is accepted by M if and only if x is accepted by M′; we leave this to the reader.)

[Figure: the NFA (a) is converted to deterministic transitions in (b); in (c), the DFA states {1,2} and {2}, whose labels contain the accepting state 2, are marked accepting.]
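The procedure is short in code. The sketch below represents the NFA as a dict mapping (state, symbol) to a set of states (no ε-moves, matching the example); the sample transitions are illustrative rather than a faithful copy of the figure, and the dead state is omitted, as in the text.

```python
def nfa_to_dfa(delta, start, accepting, alphabet):
    """Subset construction; DFA states are frozensets of NFA states."""
    q0 = frozenset([start])
    dfa, todo = {}, [q0]
    while todo:
        A = todo.pop()
        if A in dfa:
            continue
        dfa[A] = {}
        for t in alphabet:
            B = frozenset(q for p in A for q in delta.get((p, t), set()))
            if B:                          # omit the dead state, as in the text
                dfa[A][t] = B
                todo.append(B)
    final = {A for A in dfa if A & accepting}
    return dfa, q0, final

# An illustrative 3-state NFA with accepting state 2:
delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (0, "c"): {0},
         (1, "b"): {2}, (1, "c"): {2}, (2, "c"): {2}}
dfa, q0, final = nfa_to_dfa(delta, 0, {2}, "abc")
```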
8.2 State Minimization

This section presents a method for minimizing the number of states of a DFA. We need the following definition. Let M = (Q, Σ, δ, q0, F) be a DFA. Two states p, q ∈ Q are equivalent if they satisfy the following condition: for every w ∈ Σ*, δ(p, w) enters an accepting state if and only if δ(q, w) does. The state minimization technique finds all equivalent states and merges them into one state. To see how this works, consider the DFA in figure (a) below. Notice that for every input string w ∈ {a, b}*, δ(3, w) enters an accepting state if and only if δ(4, w) does. Hence, we can merge states 3 and 4 without affecting the language recognized by the DFA, as shown in figure (b).
[Figure: (a) a DFA with states 1, 2, 3, 4; (b) the same DFA with the equivalent states 3 and 4 merged into one state [3,4].]
Clearly, states 2 and [3, 4] in figure (b) are also equivalent, and hence we can merge them, as shown in figure (c) below. We cannot merge the two remaining states in figure (c), because one is an accepting state and the other is not: for w = ε, δ([2, 3, 4], w) = [2, 3, 4] is accepting, while δ(1, w) = 1 is not. In other words, accepting states and non-accepting states can never be equivalent. With this conceptual foundation, we are ready to study a systematic approach to minimizing a DFA.

[Figure: (a) the original DFA; (b) after merging states 3 and 4; (c) after merging states 2 and [3, 4] into [2, 3, 4].]
There are several algorithms for finding the equivalent states of a DFA so that they can be merged to minimize the number of states. We study one of them, called the state partitioning technique, which is easy to follow with an example.

Example 1. Consider the DFA M in figure (a) below. We iteratively partition the states of M into groups such that no two states belonging to different groups are equivalent. Since accepting and non-accepting states cannot be equivalent, we first partition the states into two groups, the accepting group P1 = {3, 4, 5} and the non-accepting group P2 = {0, 1, 2}, as shown in figure (b).

[Figure: (a) the DFA M with states 0 through 5; (b) the initial partition P1 = {3, 4, 5}, P2 = {0, 1, 2}.]
Now we analyze the response of each state to every input symbol in the alphabet {a, b}: for each state q and symbol t ∈ {a, b}, if δ(q, t) = r and state r belongs to group Pi, we record that the response of state q to the input t is Pi, as shown below. Clearly, no two states showing different responses can be equivalent, because on reading the same symbol one goes to an accepting group and the other to a non-accepting group. So we partition the groups P1 and P2 accordingly.

Response analysis (recovered from the figure):
           state 3   state 4   state 5  |  state 0   state 1   state 2
  on a:      P1        P1        P1     |    P2        P1        P1
  on b:      P2        P1        P1     |    P2        P1        P1

New partition: P11 = {3}, P12 = {4, 5}, P21 = {0}, P22 = {1, 2}.
Having partitioned P1 and P2 into the four new groups, we must again check that all the states in each group are equivalent by analyzing their responses to each input. Suppose that states i and j in a group show different responses. Then there must be an input string (of length 2 or longer) for which i enters an accepting state while j does not, or vice versa; they cannot be equivalent and should be put into different groups. For our example, the response analysis of the groups P11 = {3}, P12 = {4, 5}, P21 = {0}, and P22 = {1, 2} shows that all the states within each group respond identically, so the partitioning stops. Now let p and q be two states belonging to the same group of the final partition. For an arbitrary input string w ∈ {a, b}*, both δ(p, w) and δ(q, w) enter states belonging to the same group. Since the states in a group are either all accepting or all non-accepting, δ(p, w) enters an accepting state if and only if δ(q, w) does. The two states are equivalent, and hence we can merge them into one state to reduce the number of states.
Thus, for the example, merging the two states in {1, 2} and the two in {4, 5}, we get the minimized DFA shown in figure (b) below.

[Figure: (a) the DFA with the final partition {0}, {1, 2}, {3}, {4, 5}; (b) the reduced DFA with states 0, [1,2], 3, and [4,5].]
Example 2. The figure below shows the result of state partitioning for another DFA: no two states are equivalent, i.e., the DFA is already minimal. For this simple DFA we can reach the same conclusion by direct observation: δ(2, aaa) = 5 is accepting while δ(3, aaa) = 1 is not, so states 2 and 3 are not equivalent. Likewise, δ(3, aa) = 5 is accepting while δ(4, aa) = 1 is not, so states 3 and 4 are not equivalent, and so on.

[Figure: a DFA whose states 1 through 5 are linked by transitions on a, with state 5 accepting; the partition refines step by step from {1}, {2, 3, 4, 5} down to the singletons {1}, {2}, {3}, {4}, {5}.]
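The partitioning loop itself is only a few lines, as sketched below; the transition table here is a guess consistent with the response analysis of Example 1, since the original figure is only partially recoverable.

```python
def minimize_groups(states, alphabet, delta, accepting):
    """Split groups until every state in a group responds identically."""
    groups = [set(accepting), set(states) - set(accepting)]
    changed = True
    while changed:
        changed = False
        def response(q):          # which group each symbol leads to
            return tuple(next(i for i, g in enumerate(groups)
                               if delta[(q, t)] in g) for t in alphabet)
        new_groups = []
        for g in groups:
            by_resp = {}
            for q in g:
                by_resp.setdefault(response(q), set()).add(q)
            new_groups.extend(by_resp.values())
            changed |= len(by_resp) > 1
        groups = new_groups
    return groups

# A table consistent with Example 1 (illustrative reconstruction):
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 3, (1, "b"): 4,
         (2, "a"): 3, (2, "b"): 5, (3, "a"): 3, (3, "b"): 0,
         (4, "a"): 4, (4, "b"): 5, (5, "a"): 5, (5, "b"): 4}
print(minimize_groups(range(6), "ab", delta, {3, 4, 5}))
# -> the groups {3}, {4, 5}, {0}, {1, 2} (in some order)
```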
Rumination (1): State Minimization

• Here we present another algorithm for minimizing a DFA. This approach iteratively identifies state pairs which are not equivalent. For the same example as before (repeated in figure (a) below), the matrix in figure (b) shows the initial contents: an x mark at entry [i, j] for every pair of one accepting and one non-accepting state, since such pairs are never equivalent. Blank entries remain to be resolved.
[Figure: (a) the DFA; (b) the triangular matrix of state pairs, with x marks on all accepting/non-accepting pairs.]
To resolve the entry [4, 3], we examine δ(4, b) = 5 and δ(3, b) = 1. Since [5, 1] is marked with x, entry [4, 3] is also marked with x. (Notice that since δ(4, a) = 4 and δ(3, a) = 3 while entry [4, 3] itself is unresolved, that information is of no use.) Likewise, we find that the pairs [5, 3], [1, 0], and [2, 0] are not equivalent. Now we examine [2, 1]. Since δ(2, b) = 5, δ(1, b) = 4, and entry [5, 4] is not resolved yet, we defer the resolution until [5, 4] is resolved. Since δ(4, a) = 4 and δ(5, a) = 5, and δ(4, b) = 5 and δ(5, b) = 4, we mark entry [5, 4] with E, implying that states 4 and 5 are equivalent. Then, since δ(1, a) = δ(2, a) = 3, δ(1, b) = 4, δ(2, b) = 5, and [5, 4] is marked with E, entry [2, 1] is also marked with E.
[Figure: (a) the DFA; (b) the completed matrix, with x marks on all non-equivalent pairs and E marks on the equivalent pairs [5, 4] and [2, 1].]
• We have presented two methods for minimizing the number of states of a DFA, but no proof that the result is minimal. We must show that after the minimization, no further state reduction is possible. For the proof, let M be a DFA (figure (a)) obtained by applying one of the state minimization methods, and suppose there is a DFA M′ (figure (b)) which recognizes the same language with a smaller number of states than M.

[Figure: (a) M, where the strings w1 and w2 lead from the start state to two distinct states p and q; (b) M′, where w1 and w2 both lead to one state s.]
Since M′ is smaller than M and they recognize the same language, there must be two input strings w1 and w2 such that M′, reading w1 and w2, enters the same state s, while M enters two different states, p and q. Because the two machines recognize the same language, state s is accepting if and only if both p and q are accepting. Because p and q are distinct states of M, which was constructed by merging all equivalent states, they must not be equivalent.
Since states p and q of M are not equivalent, there is a string x such that, reading x in state p, M enters an accepting state, while reading the same string x in state q, it enters a non-accepting state (see figure (a) below). Now consider what M′ does on the input strings w1x and w2x. Since M′ recognizes the same language, reading w1x it must enter an accepting state, while reading w2x it must enter a non-accepting state. This implies that in state s, reading x, M′ takes two different transitions, as shown in figure (b); that is, M′ is an NFA, contradicting our assumption that M′ is a DFA.

[Figure: (a) in M, the string x leads from p to an accepting state and from q to a non-accepting state; (b) in M′, the string x would have to lead from s to both an accepting and a non-accepting state.]
• The state minimization methods that we have studied are applicable only to DFA's. For example, figure (a) below shows an NFA in which no two states are equivalent; nevertheless, figure (b) shows a smaller FA recognizing the same language.

[Figure: (a) an NFA with no two equivalent states; (b) a smaller FA for the same language.]
May You Have
Break Time

May you have enough happiness to make you sweet, enough trials to make you strong, enough sorrow to keep you human, enough hope to make you happy, and enough money to buy gifts! - Anonymous -
Exercises

8.1 Using the method presented in this chapter, convert each of the following NFA's to a DFA. Show the procedure that you took for your answer.

[Figure: (a) and (b), two NFA's with states 1 through 4 and transitions on a and a, b.]
8.2 Using the state partitioning technique, minimize the following DFA. Show how you partitioned the states to get your answer.

[Figure: a five-state DFA over the alphabet {0, 1}.]
8.3 Can we reduce the number of states of the following DFA without affecting the language it recognizes? Justify your answer.

[Figure: a DFA whose states 1, 2, 3, 4, 5 are chained by transitions on a.]
8.4 Let M be an NFA, and let M′ be the DFA obtained by the technique presented in Section 8.1 for converting an NFA to a DFA. Prove that L(M) = L(M′). (For the proof, show that for an arbitrary input string w, M accepts w if and only if M′ does.)
Models of Language Generation: Grammars
9. Properties of the Formal Languages

Every language has structural properties. For example, every string in the language {a^n b^n | n > 0} consists of a's followed by b's, in equal numbers. Since, in general, it is impractical to define a language in terms of set properties, we use grammars. However, it is a very difficult problem to figure out the structure of the language generated by a grammar G. In fact, it has been shown that the decision problem of whether L(G) = Σ*, where Σ is the set of terminal symbols of G, is unsolvable except for regular and deterministic context-free grammars. Given two grammars G1 and G2, the problem of deciding whether or not L(G1) = L(G2) is also proven unsolvable, except for the cases where both grammars are regular or deterministic context-free; in fact, the problem remains open for the grammars whose languages can be recognized by a DPDA.

We classify formal grammars (and their languages) into type 0 (phrase-structured), type 1 (context-sensitive), type 2 (context-free), and type 3 (regular), according to the format of their rules. An interesting question is: are there properties that each class of languages commonly exhibits? In this chapter we answer this question for regular and context-free languages by examining what happens under various set operations.

9.1 Properties of regular languages: union, concatenation, Kleene star, reverse, complement, intersection, set subtraction
9.2 Properties of context-free languages: union, concatenation, Kleene star, reverse, complement
Rumination
Exercises
9.1 Properties of regular languages
Theorem 9.1 Let L1 and L2 be regular languages. The result of each of the following set operations is also regular, i.e., the regular languages are closed under each of the following set operations. (We write ¬L for the complement of L.)
(1) L1 ∪ L2 (union)  (2) L1L2 (product)  (3) L1* (Kleene star)  (4) L1ᴿ (reversal)  (5) ¬L1 (complement)  (6) L1 ∩ L2 (intersection)  (7) L1 - L2 (set subtraction)
Proof. Let G1 and G2 be regular grammars generating the languages L1 and L2, respectively, and let M1 and M2 be DFA's recognizing L1 and L2, respectively. (Recall that every regular language is recognized by a DFA.) Let r1 and r2 be regular expressions denoting L1 and L2, respectively. (Recall that every language denoted by a regular expression is regular.)
(1) (union). The regular expression r1 + r2 denotes L1 ∪ L2. Since every language expressible by a regular expression is regular, L1 ∪ L2 is regular.
(2) (product). The regular expression r1r2 denotes L1L2. Hence, L1L2 is regular.
(3) (Kleene star). The regular expression (r1)* denotes L1*. Hence, L1* is regular.
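Since closure under union, product, and star is witnessed directly by regular-expression operators, these constructions are easy to mirror in code. The following Python sketch is an illustration added here, not part of the proof; the pattern strings r1 and r2 are example inputs.

import re

def union(r1: str, r2: str) -> str:
    # r1 + r2 in regular-expression notation is alternation.
    return f"(?:{r1})|(?:{r2})"

def product(r1: str, r2: str) -> str:
    # Concatenation of the two expressions.
    return f"(?:{r1})(?:{r2})"

def star(r1: str) -> str:
    # Kleene star of the expression.
    return f"(?:{r1})*"

r1, r2 = "ab*", "ba*"                      # example expressions
pattern = re.compile(f"^(?:{union(r1, r2)})$")
print(bool(pattern.match("abbb")))         # True: abbb is in L1
print(bool(pattern.match("aa")))           # False: aa is in neither language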
(4) (reverse). Using DFA M1, which recognizes L1, we can construct an FA that recognizes L1ᴿ. (The following example shows the procedure.) This implies that L1ᴿ is also regular. First, add a new accepting state, and link an ε-transition from every old accepting state to this new accepting state; this FA recognizes the same language. Then let the added state be the start state, let the old start state be the only accepting state, and reverse all the edges. The resulting NFA recognizes the reverse of the language recognized by the original FA.
[Figure: an example DFA over {a, b}; the intermediate FA with the new accepting state and ε-transitions; and the final reversed NFA.]
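As an illustrative sketch of this reversal construction (the transition-set representation and the fresh state name "new" are assumptions of the example):

def reverse_fa(states, delta, start, accepting):
    # delta: set of (p, symbol, q) triples of an FA recognizing L.
    # Returns an NFA (epsilon written as "") recognizing the reverse of L.
    new_start = "new"                                # fresh state, assumed not in states
    rev_delta = {(q, a, p) for (p, a, q) in delta}   # reverse every edge
    # epsilon-transitions from the new start to every old accepting state
    rev_delta |= {(new_start, "", f) for f in accepting}
    return states | {new_start}, rev_delta, new_start, {start}

states = {0, 1}
delta = {(0, "a", 1), (1, "b", 1)}                   # recognizes ab*
S, D, q0, F = reverse_fa(states, delta, 0, {1})
# The result recognizes b*a, the reverse of ab*.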
(5) (complement). Using DFA M1, which recognizes L1, we can construct a DFA that recognizes the complement of L1, as the following example shows. (Recall that, for convenience, we usually do not show the dead state and the transitions entering it.) First, show the dead state and all the transitions entering it. Then change the accepting states to non-accepting states, and the non-accepting states to accepting states.
[Figure: an example DFA over {a, b}; the same DFA with its dead state and all transitions made explicit; and the complement DFA obtained by swapping accepting and non-accepting states.]
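The two steps of this construction, completing the DFA with a dead state and then swapping the accepting and non-accepting states, translate directly into code. A minimal sketch, assuming a dict-based transition function and an invented fresh state name "dead":

def complement_dfa(states, alphabet, delta, start, accepting):
    # delta: dict mapping (state, symbol) -> state; it may be partial.
    DEAD = "dead"                           # assumed fresh state name
    total = dict(delta)
    for q in states | {DEAD}:
        for a in alphabet:
            total.setdefault((q, a), DEAD)  # route missing moves to the dead state
    new_states = states | {DEAD}
    new_accepting = new_states - accepting  # swap accepting / non-accepting
    return new_states, total, start, new_accepting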
(6) (intersection). By the identity shown below and properties (1) and (5) (i.e., closure under union and complementation), L1 ∩ L2 is also regular.
L1 ∩ L2 = ¬¬(L1 ∩ L2) = ¬(¬L1 ∪ ¬L2)
(7) (set subtraction). Since L1 - L2 = L1 ∩ ¬L2, by the closure properties (5) (closure under complementation) and (6) (closure under intersection), the language L1 - L2 is also regular.
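Intersection can also be shown directly by the standard product construction, which runs both DFA's in parallel. The sketch below is illustrative and assumes both transition dicts are total:

from itertools import product as cartesian

def intersect_dfas(states1, states2, alphabet, d1, d2, s1, s2, F1, F2):
    # d1, d2: dicts (state, symbol) -> state for two total DFA's.
    states = set(cartesian(states1, states2))
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for (p, q) in states for a in alphabet}   # move both machines at once
    accepting = {(p, q) for (p, q) in states if p in F1 and q in F2}
    return states, delta, (s1, s2), accepting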
Break Time: Pressure
Pressure is part of life: adapt to it and use it! I don't do without pressure. A physical therapist said, "Get in tune with your body." But if I listened to my body, I'd stay in bed all morning. - Stan Pottinger -
9.2 Properties of Context-free Languages
The properties of context-free languages are a little different from those of regular languages. As the following theorem shows, they are not closed under intersection and complementation.
Theorem 9.2 Context-free languages are closed under union, product, Kleene star, and reversal. They are not closed under intersection and complementation.
Proof: For the proof, we will use context-free grammars exclusively. (We could use PDA's, but in general those proofs become somewhat unwieldy.) Let L1 and L2 be two arbitrary context-free languages generated, respectively, by the context-free grammars G1 = (N1, T1, P1, S1) and G2 = (N2, T2, P2, S2). Assume that N1 ∩ N2 = ∅, i.e., the two grammars use different nonterminal symbols. (Otherwise, we can simply rename the nonterminals of one grammar to meet this condition without changing the language.) Now we prove the theorem.
(1) (union). We construct a context-free grammar G by merging all the rules of G1 and G2 and then adding a new start symbol S with the rules S → S1 | S2. Grammar G generates L1 ∪ L2.
(2) (product). We construct a context-free grammar G by merging all the rules of G1 and G2 and then adding a new start symbol S with the rule S → S1S2. This grammar generates the language L1L2.
(3) (Kleene star). We modify G1 by introducing a new start symbol S and adding the rules S → S1S | ε. This modified G1 generates the language (L1)*.
(4) (reversal). To make a context-free grammar that generates (L1)ᴿ, we reverse the right side of every rule in G1.
Now we prove that context-free languages are not closed under intersection and complementation. For this it is enough to show a counterexample.
(5) (intersection). We can show that the languages L1 and L2 below are context-free by constructing either a PDA or a CFG for each. However, the language L3 is not context-free, which we will prove in Chapter 12.
L1 = {aⁱbⁱcʲ | i, j ≥ 0},  L2 = {aᵏbⁿcⁿ | k, n ≥ 0}
L3 = L1 ∩ L2 = {aⁱbⁱcⁱ | i ≥ 0}
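The grammar constructions in (1)-(3) above are purely mechanical. A sketch, under an assumed representation of a CFG as a dict from nonterminals to lists of right-hand sides (tuples of symbols), with "S0" an assumed fresh start symbol:

def cfg_union(g1, g2, s1, s2):
    # Nonterminal sets are assumed disjoint, as in the proof.
    g = {**g1, **g2}
    g["S0"] = [(s1,), (s2,)]          # S0 -> S1 | S2
    return g, "S0"

def cfg_product(g1, g2, s1, s2):
    g = {**g1, **g2}
    g["S0"] = [(s1, s2)]              # S0 -> S1 S2
    return g, "S0"

def cfg_star(g1, s1):
    g = dict(g1)
    g["S0"] = [(s1, "S0"), ()]        # S0 -> S1 S0 | epsilon (empty tuple)
    return g, "S0"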
(6) (complement). To the contrary, suppose that the context-free languages are closed under complementation. Then both ¬L1 and ¬L2 are context-free. We use the following identity:
L1 ∩ L2 = ¬¬(L1 ∩ L2) = ¬(¬L1 ∪ ¬L2)
By property (1), the union ¬L1 ∪ ¬L2 is also context-free, and by the assumption its complement ¬(¬L1 ∪ ¬L2) = L1 ∩ L2 would be context-free. This contradicts the proven fact that context-free languages are not closed under intersection.
Break Time: Enjoy Sharing
A young man saw an elderly couple sitting down to lunch at McDonald's. He noticed that they had ordered one meal, and an extra drink cup. As he watched, the gentleman artfully divided the hamburger in half, then counted out the fries, one for him, one for her, until each had half of them. Then he poured half of the soft drink into the extra cup and set that in front of his wife. The old man then began to eat, and his wife sat watching, with her hands folded in her lap. The young man decided to ask if they would allow him to purchase another meal for them so that they didn't have to split theirs. The old gentleman said, "Oh no. We've been married 50 years, and everything has always been and will always be shared, 50/50." The young man then asked the wife if she was going to eat, and she replied, "It's his turn with the teeth." - Fred & Wanda -
Rumination (1): Language properties
• Property (4) of regular languages says that if L is regular, so is Lᴿ. Let G be a regular grammar whose language is L. As the following example shows, if we reverse the right side of every rule of G, the resulting grammar G' generates the reverse of L(G), which is also regular by the closure of regular languages under reversal.
G:  S → abA | ac      G':  S → Aba | ca
    A → bS | a             A → Sb | a
Recall that every rule of a regular grammar has the restricted form of either A → xB or A → y, where A, B ∈ VN and x, y ∈ (VT)*, and VN and VT are, respectively, the sets of nonterminal and terminal symbols. In other words, on the right side of every rule at most one nonterminal symbol can appear, positioned at the right end. We call such grammars right-linear. In contrast, left-linear grammars are those whose rules have the restricted form of either A → Bx or A → y. Right-linear grammars and left-linear grammars generate the same class of regular languages. We can prove this by showing that (1) every language generated by a left-linear grammar is regular, and (2) every regular language can be generated by a left-linear grammar, as follows.
Proof (1). Let G be a left-linear grammar. Construct a right-linear grammar G' by reversing the right side of every rule of G. Grammar G' generates the reverse of L(G), i.e., L(G) = (L(G'))ᴿ. Since G' is right-linear, L(G') is regular. Since the reverse of every regular language is also regular, L(G) is regular.
Proof (2). We argue by the same logic. Let L be a regular language. By the closure property under reversal, Lᴿ is regular. Let G be a right-linear grammar which generates Lᴿ. Construct a left-linear grammar G' by reversing the right side of every rule of G. Grammar G' generates L.
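Under the same assumed rule-dict representation used earlier, reversing the right side of every rule is a one-liner (illustrative sketch):

def reverse_grammar(g):
    # g: dict nonterminal -> list of right-hand sides (tuples of symbols).
    # The new grammar generates the reverse of the original language.
    return {A: [tuple(reversed(rhs)) for rhs in rules] for A, rules in g.items()}

G = {"S": [("a", "b", "A"), ("a", "c")], "A": [("b", "S"), ("a",)]}
G_rev = reverse_grammar(G)
# G_rev: S -> Aba | ca, A -> Sb | a, matching the example above.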
• We used a DFA to prove that the complement of a regular language is regular. The proof does not work with an NFA; here is an example. The language recognized by the NFA in figure (a) is {a, aa}, so neither a nor aa can be in its complement. However, the NFA in figure (c), obtained by adding the dead state (figure (b)) and then swapping accepting and non-accepting states, accepts both a and aa.
[Figure: (a) an NFA recognizing {a, aa}; (b) the same NFA with the dead state added; (c) the NFA obtained by converting accepting states to non-accepting states and vice versa, which incorrectly accepts both a and aa.]
• For the proofs of the closure properties of regular languages, we used various models that characterize the regular languages. To prove closure under union we used regular expressions; we could as well have used regular grammars or FA. For an efficient proof, however, we need to choose the right model; otherwise the proof often becomes difficult or even impossible. (For example, try to prove closure under complementation with regular grammars or regular expressions.) There is no guideline available for this choice other than building up related knowledge and experience.
• We saw the difference between the closure properties of regular languages and context-free languages. Another contrasting property is that the class of languages recognized by NPDA properly contains the class of languages recognized by DPDA, i.e., there are context-free languages that cannot be recognized by any DPDA. For example, it is impossible for a DPDA to recognize the context-free language {wwᴿ | w ∈ {a, b}⁺}, while we know how to construct an NPDA recognizing it. (Recall that, in contrast, a language is recognized by an NFA if and only if it is recognized by a DFA.) Appendix D shows that, like the class of regular languages, the class of deterministic context-free languages (i.e., the languages recognized by DPDA's) is closed under complementation.
Break Time: Communication Deadlock
"Who is calling?" was the answer to the telephone. "Watt." "What is your name, please?" "Watt's my name." "That's what I asked you. What's your name?" "That's what I told you. Watt's my name." A long pause, and then from Watt, "Is this James Brown?" "No, this is Knott." "Please, tell me your name." "Will Knott." Whereupon they both hung up. - Raman -
Exercises
9.1 Let L be the language generated by the following regular grammar G. Construct a regular grammar whose language is Lᴿ. You should also show the procedure that you took to get your answer.
G: S → aA | bB
   A → bD | ε
   B → aC | bE
   C → aC | bE | ε
   D → aE | bD | ε
   E → aE | bD
9.2 (a) Construct a regular grammar (i.e., a right-linear grammar) whose language is the complement of the language L generated by the grammar G above. You should also show the procedure that you took to get your answer. (b) Construct a left-linear grammar that generates the same language as the grammar G above.
9.3 For the two context-free grammars G1 and G2 below, (a) construct a context-free grammar whose language is the product L(G1)L(G2), and (b) construct a context-free grammar whose language is the union L(G1) ∪ L(G2). You should also show the procedure that you took to get your answer.
G1: S → aA | bB
    A → bD | ε
    B → aC | bE
    C → aC | bE | ε
    D → aE | bD | ε
    E → aE | bD
G2: S → aA
    A → Sb | b
Models of Language Generation: Grammars
10. Manipulating Context-free Grammars
Throughout the last three chapters we studied the characterizations among the three models of regular grammars, FA, and regular expressions, which belong to the bottom level of the Chomsky hierarchy, and how to manipulate them. In this chapter, we move one level up the hierarchy and study how to manipulate CFG's and PDA's. Since PDA's have a stack, they are much harder to manipulate than FA. Nondeterminism in FA makes no difference in the class of languages recognized, i.e., a language is recognizable by an NFA if and only if it is recognizable by a DFA, and there is an effective method to transform an NFA into a DFA recognizing the same language. In contrast, there are context-free languages that can only be recognized by an NPDA, and no algorithm is available for converting an NPDA to a DPDA or for minimizing the number of states of a DPDA. Our study will mainly focus on CFG's until Chapter 13, where we will come back to a variation of DPDA and a subclass of CFG's for an application.
10.1 Reducing ε-production rules 254
Algorithm and example
10.2 Eliminating unit production rules 262
Algorithm and example
10.3 Eliminating useless symbols 266
Algorithm and example
10.4 Normal forms of CFG's 277
Chomsky normal form (CNF), Greibach normal form (GNF)
Exercises 285
10.1 Minimizing the number of ε-production rules
Using ε-production rules may help in designing a succinct grammar. However, like ε-transitions in an FA, many ε-production rules can make it computationally intensive to automatically analyze the language generated by a grammar. In this section we will learn how to minimize the number of ε-production rules in a CFG. To understand the basic idea, let's see how we can reduce the number of ε-production rules in the CFG G shown below while preserving the language of the grammar.
Eliminating A → ε and B → ε, we need to expand the rule S → AB as shown in G' so that G and G' generate the same language. In G' we keep the rule S → AB for the case where A and B, respectively, produce a and b. In addition we need the rule S → A for the case where A and B, respectively, produce a and ε. Likewise, we need the rule S → B for the case where A and B, respectively, produce ε and b. Finally, we need S → ε for the case where both A and B produce ε. We claim that L(G) = L(G').
G:  S → AB          G': S → AB | A | B | ε
    A → a | ε           A → a
    B → b | ε           B → b
In the previous example, since the language contains ε, the grammar G' needs the one ε-production rule S → ε. If the language does not contain ε, we can eliminate the ε-production rules completely, as the following example shows.
G:  S → ABc          G': S → ABc | Ac | Bc | c
    A → a | ε            A → a
    B → b | ε            B → b
We apply the same idea when a nonterminal generates ε indirectly, as C and D do in the CFG G below. (Notice that C and D can derive ε, respectively, through A → ε and B → ε.)
G:  S → CD           G': S → CD | C | D | ε
    C → A | c            C → A | c
    D → B | d            D → B | d
    A → a | ε            A → a
    B → b | ε            B → b
Now, for the proof of the following theorem, we present an algorithm which, given a CFG G, minimizes the number of ε-production rules of G using the idea presented above.
Theorem 10.1 Given a CFG G, we can construct a CFG G' that satisfies the following conditions.
• L(G) = L(G').
• If ε ∉ L(G), then G' has no ε-production rule. Otherwise, S → ε is the only ε-production rule in G'.
Proof (Algorithm): Let G = (VT, VN, P, S) be the CFG. The following algorithm constructs a CFG G' = (VT, VN, P', S) which satisfies the conditions.
Algorithm (minimizing ε-production rules)
Input: a CFG G = (VT, VN, P, S)
Output: a CFG G' = (VT, VN, P', S) that satisfies the conditions of the theorem
(1) With the following procedure, find all nonterminal symbols in VN which directly (by line 1) or indirectly (by lines 2-5) generate ε, and collect them in the set W.
1. W = {A | A ∈ VN, A → ε ∈ P};
2. do {
3.   W' = W;
4.   W = W' ∪ {A | A ∈ VN, A → α ∈ P, α ∈ (W')⁺};
5. } while (W != W');
Algorithm (continued)
(2) Eliminate all ε-production rules from the rule set P, and let the resulting set be P1.
(3) From P1, construct the rule set P' of G' as follows. For each rule A → α in P1 (A ∈ VN), construct the following set R:
R = {α' | α' (≠ ε) is a string obtained from α by deleting zero or more occurrences of the symbols of α that belong to the set W constructed in step (1)}
For every α' ∈ R, add the rule A → α' to the set P'. (Notice that ε ∉ R, i.e., A → ε is excluded in this step, and every rule in P1 is included in P'.)
(4) If the start symbol S is in W, put the rule S → ε in P'. (Notice that if S ∈ W, then ε is in the language L(G), and hence G' needs the rule S → ε.)
Now we will show two examples of minimizing the number of ε-production rules using this algorithm.
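An illustrative sketch of steps (1)-(4) in Python, assuming the rule-dict representation used earlier, with right-hand sides as tuples and ε as the empty tuple:

from itertools import product

def minimize_epsilon_rules(g, start):
    # Step (1): compute W, the set of nullable nonterminals, as a fixed point.
    W = {A for A, rules in g.items() if () in rules}
    while True:
        W2 = W | {A for A, rules in g.items()
                  if any(rhs and all(s in W for s in rhs) for rhs in rules)}
        if W2 == W:
            break
        W = W2
    # Steps (2)-(3): drop epsilon rules, then expand each remaining rule by
    # optionally deleting nullable symbols, keeping only nonempty results.
    new_g = {}
    for A, rules in g.items():
        new_rules = set()
        for rhs in rules:
            if not rhs:
                continue                          # step (2): skip epsilon rules
            # each nullable symbol may be kept or deleted
            choices = [((s,), ()) if s in W else ((s,),) for s in rhs]
            for combo in product(*choices):
                alt = tuple(x for part in combo for x in part)
                if alt:                           # exclude epsilon
                    new_rules.add(alt)
        new_g[A] = sorted(new_rules)
    # Step (4): if the start symbol is nullable, epsilon is in the language.
    if start in W:
        new_g[start] = new_g.get(start, []) + [()]
    return new_g

G = {"S": [("A", "B")], "A": [("a",), ()], "B": [("b",), ()]}
print(minimize_epsilon_rules(G, "S"))
# S -> AB | A | B | epsilon, A -> a, B -> b, as in the first example above.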
Example 1. Minimize the number of ε-production rules of the following CFG G.
G:  S → CD
    C → A | c
    D → B | d
    A → a | ε
    B → b | ε
Applying steps (1)-(4) of the algorithm, we get the following.
(1): W = {A, B};
     W' = W; W = W' ∪ {C, D};   // W = {A, B, C, D}
     W' = W; W = W' ∪ {S};      // W = {A, B, C, D, S}
     W' = W; W = W' ∪ { };      // W = {A, B, C, D, S}
(2): P1 = { S → CD, C → A | c, D → B | d, A → a, B → b }
(3), (4):
G': S → CD | C | D | ε
    C → A | c
    D → B | d
    A → a
    B → b
Example 2. Minimize the number of ε-production rules of the following CFG G.
G:  S → ADC | EFg
    A → aA | ε
    E → a
    D → FGH | b
    C → c | ε
    F → f | ε
    G → Gg | H
    H → h | ε
(1): W = {A, C, F, H};
     W' = W; W = W' ∪ {G};   // W = {A, C, F, G, H}
     W' = W; W = W' ∪ {D};   // W = {A, C, D, F, G, H}
     W' = W; W = W' ∪ {S};   // W = {A, C, D, F, G, H, S}
     W' = W; W = W' ∪ { };   // W = {A, C, D, F, G, H, S}
(2): P1 = { S → ADC | EFg
            A → aA
            E → a
            F → f
            D → FGH | b
            C → c
            G → Gg | H
            H → h }
(3), (4):
G': S → ADC | AD | AC | DC | A | D | C | EFg | Eg | ε
    A → aA | a
    C → c
    D → FGH | FG | FH | GH | F | G | H | b
    E → a
    F → f
    G → Gg | g | H
    H → h
10.2 Eliminating unit production rules
A unit production rule is a rule of the form C → A, whose right side is a single nonterminal symbol. Unit production rules in a CFG can be eliminated without affecting the language. For example, minimizing the number of ε-production rules in Example 1 of the previous section, we got the following grammar.
G': S → CD | C | D | ε
    C → A | c
    D → B | d
    A → a
    B → b
The rules C → A and A → a can be reduced to the single rule C → a without affecting the language generated by the grammar. Hence, we can safely change the rules C → A | c to C → a | c. For the same reason, we can substitute S → a | c for the rules S → C and C → a | c to eliminate the unit production rule S → C. Finally, applying the same idea to the unit production rules S → C | D, we get the following grammar, which has no unit production rules. (Notice that A → a and B → b are removed because they become useless.)
G1: S → CD | a | c | b | d | ε
    C → a | c
    D → b | d
Now, for the proof of the following theorem, we present an algorithm which, given an arbitrary CFG G, systematically applies the idea presented above and eliminates all unit production rules, constructing a CFG G' without affecting the language.
Theorem 10.2 Let G be an arbitrary CFG. We can construct a CFG G' that generates L(G) with no unit production rules.
Proof (Algorithm): Let G = (VT, VN, P, S). The following algorithm constructs a CFG G' = (VT, VN, P', S), which generates L(G) with no unit production rules.
Break Time: Today's Quote
The simple solution for disappointment depression: Get up and get moving. Physically move. Do. Act. Get going. - Peter McWilliams -
Algorithm (eliminate unit production rules)
Input: a CFG G = (VT, VN, P, S)    // G has no ε-production rules except S → ε.
Output: a CFG G' = (VT, VN, P', S) // G' has no unit production rules and L(G') = L(G).
// Let A, B ∈ VN and α ∈ (VT ∪ VN)*.
while (there is a unit production rule in P) {
  find a unit production rule A → B in P such that B has no unit production rules;
  for every rule B → α, which is not a unit production, change A → B to A → α;
} // end while
P' = P;
G' = (VT, VN, P', S);
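A sketch of the same bottom-up substitution in code (illustrative, under the assumed rule-dict representation; like the algorithm above, it leaves cyclic unit rules untouched, which are handled by the useless-symbol elimination of Section 10.3):

def eliminate_unit_rules(g):
    # g: dict nonterminal -> list of right-hand sides (tuples of symbols).
    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in g   # single nonterminal on the right

    changed = True
    while changed:
        changed = False
        for A, rules in g.items():
            for rhs in list(rules):
                B = rhs[0] if is_unit(rhs) else None
                # pick A -> B only when B itself has no unit rules left
                if B is not None and not any(is_unit(r) for r in g[B]):
                    rules.remove(rhs)
                    for beta in g[B]:
                        if beta not in rules:
                            rules.append(beta)  # replace A -> B by A -> beta
                    changed = True
    return g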
This algorithm, going bottom-up, iteratively changes unit production rules into non-unit production rules until no unit production rule remains. The following example should be enough to understand how it works. (Notice that, by definition, rules generating a single terminal symbol (e.g., A → a) are not unit production rules.)
G:  S → CD | C | D | ε
    A → a
    B → bB | b
    C → A | c
    D → B | d
    E → F
    F → E
1. C → A | c  becomes  C → a | c
   D → B | d  becomes  D → bB | b | d
2. S → CD | C | D | ε  becomes  S → CD | a | c | bB | b | d | ε
G': S → CD | a | c | bB | b | d | ε
    A → a
    B → bB | b
    C → a | c
    D → bB | b | d
    E → F
    F → E
Notice that the algorithm fails to eliminate cyclic unit production rules; see the rules involving E and F in G' above. Together with A, these symbols are useless for generating the language. We will shortly present an algorithm for eliminating useless symbols from a CFG.
10.3 Eliminating Useless Symbols from a CFG
Manipulating a CFG may introduce useless symbols (and rules), as we saw in the previous section. The following grammar has typical examples of useless symbols: B, C, D, and E.
G:  S → AE | A | B
    A → aA | ε
    B → C
    C → B
    D → Db | b
    E → Ec
Symbol D is useless because no sentential form (i.e., string consisting of terminal and nonterminal symbols) containing D is derivable starting from S. Although it is possible to derive a sentential form containing E starting from S, this symbol is also useless because no terminal string can be generated starting from it. Symbols B and C are involved in cyclic unit production rules, so they are useless as well. Therefore B, C, D, and E, and the rules involving these nonterminal symbols, can be safely eliminated to get the following simple CFG G'.
G': S → A
    A → aA | ε
In the previous example we observed the following: in a CFG G = (VT, VN, P, S), a nonterminal symbol X ∈ VN is useful if it satisfies the two conditions (1) and (2) below. (Recall that by α ⇒* β we denote the derivation of β from α in zero or more steps, applying one rule at a time.)
(1) X ⇒* w, where w ∈ VT*
(2) S ⇒* αXβ, where α, β ∈ (VT ∪ VN)*
These two conditions are illustrated in the figure below: a nonterminal symbol X is useful if (1) a terminal string w (possibly ε) is derivable starting from X, and (2) a sentential form including X is derivable starting from S.
[Figure: S derives (⇒*) a sentential form containing X (condition (2)), and X derives (⇒*) a terminal string w (condition (1)).]
Now we present an algorithm for eliminating useless symbols from a CFG. The algorithm is presented in two parts by the following two lemmas: Lemma 1 and Lemma 2 give algorithms for eliminating all symbols that do not satisfy conditions (1) and (2), respectively.
Lemma 1. Given a CFG G = (VT, VN, P, S), it is possible to construct a CFG G' = (VT, V'N, P', S) that satisfies the following conditions.
(i) L(G) = L(G')
(ii) For every X ∈ V'N, X ⇒* w for some w ∈ (VT)*
Proof (algorithm 1). Let A be a nonterminal symbol. We compute V'N and P' of grammar G' with the following algorithm. (Notice that this algorithm uses a technique similar to the one we used for minimizing the number of ε-production rules.)
Algorithm for constructing V'N and P'
// Collect all nonterminal symbols directly generating a terminal string.
W = {A | A → w ∈ P, w ∈ (VT)*};
// Collect all nonterminal symbols which can derive a terminal string.
do {
  W' = W;
  W = W' ∪ {A | A → α ∈ P and α ∈ (VT ∪ W')*};
} while (W != W');
V'N = W;
P' = {A → α | A → α ∈ P and α ∈ (V'N ∪ VT)*};
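A sketch of this fixed-point computation (illustrative, with the grammar again represented as a dict from nonterminals to lists of right-hand-side tuples):

def keep_generating(g, terminals):
    # Keep only nonterminals that derive some terminal string (Lemma 1).
    W = set()
    while True:
        W2 = {A for A, rules in g.items()
              if any(all(s in terminals or s in W for s in rhs) for rhs in rules)}
        if W2 == W:
            break
        W = W2
    # Keep only rules whose symbols all lie in W or the terminal set.
    return {A: [rhs for rhs in rules
                if all(s in terminals or s in W for s in rhs)]
            for A, rules in g.items() if A in W}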
Lemma 2. Given a CFG G = (VT, VN, P, S), it is possible to construct a CFG G' = (V'T, V'N, P', S) that satisfies the following conditions.
(i) L(G) = L(G')
(ii) For every symbol X ∈ V'T ∪ V'N, it is possible to derive a sentential form containing X starting from S, i.e., S ⇒* αXβ, where α, β ∈ (V'T ∪ V'N)*
Proof (algorithm 2). Starting with empty sets V'T and V'N, the algorithm progressively builds them up as follows. Put the start symbol S in V'N. Suppose that A is a nonterminal symbol that has just been put in V'N. Then for every rule A → α and every symbol X that appears in α, if X is a nonterminal symbol, put it in V'N; otherwise, if it is a terminal symbol, put it in V'T. The algorithm repeats this process until no more nonterminal symbols are added to V'N. Then it constructs P' by collecting all the rules in P that use only the symbols in V'T ∪ V'N.
Algorithm (Lemma 2)
Input: G = (VT, VN, P, S)
Output: G' = (V'T, V'N, P', S)  // G' satisfies the two conditions of the lemma.
V'T = V'N = V = ∅;
do {
  V'N = V;
  V = V ∪ {S};                                                  // initialize
  // Iteratively build up V'T and V'N:
  V = V ∪ {B | A ∈ V'N, A → α ∈ P, and α contains B ∈ VN};
  V'T = V'T ∪ {a | A ∈ V'N, A → α ∈ P, and α contains a ∈ VT};
} while (V'N != V);
P' = {A → α | A → α ∈ P, A ∈ V'N, α ∈ (V'T ∪ V'N)*};
G' = (V'T, V'N, P', S);
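And a matching sketch for the reachability computation of Lemma 2 (illustrative; terminal symbols in the kept rules are retained automatically):

def keep_reachable(g, start):
    # Keep only symbols reachable from the start symbol (Lemma 2).
    reachable = set()
    frontier = {start}
    while frontier:                      # breadth-first over nonterminals
        A = frontier.pop()
        reachable.add(A)
        for rhs in g.get(A, []):
            for s in rhs:
                if s in g and s not in reachable:
                    frontier.add(s)      # newly reached nonterminal
    return {A: rules for A, rules in g.items() if A in reachable}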
Theorem 10.3 Given a CFG G = (VT, VN, P, S), it is possible to construct a CFG G' = (V'T, V'N, P', S) that satisfies the following conditions.
(1) L(G) = L(G')
(2) For every X ∈ V'N, X ⇒* w for some w ∈ (V'T)*
(3) For every X ∈ V'T ∪ V'N, S ⇒* αXβ, where α, β ∈ (V'N ∪ V'T)*
Proof. We construct such a CFG G' by applying the two algorithms presented in Lemmas 1 and 2, in that order.
Break Time: Today's Quote
He who receives a benefit should never forget it; he who bestows one should never remember it. - Pierre Charron -
Example 1. Eliminate all useless symbols from the following CFG.
G:  S → AD | EFg
    A → aGD
    D → FGd
    C → cCEc
    E → Ee
    F → Ff | f
    G → Gg | g
    H → hH | h
Step 1: Apply the algorithm of Lemma 1.
W = {F, G, H};
W' = W; W = W' ∪ {D};   // W = {D, F, G, H}
W' = W; W = W' ∪ {A};   // W = {A, D, F, G, H}
W' = W; W = W' ∪ {S};   // W = {A, D, F, G, H, S}
W' = W; W = W' ∪ { };   // W = {A, D, F, G, H, S}
V'N = W                  // V'N = {A, D, F, G, H, S}
G1: S → AD
    A → aGD
    D → FGd
    F → Ff | f
    G → Gg | g
    H → hH | h
Step 2: Apply the algorithm of Lemma 2 to G1.
1. V'T = V'N = V = ∅;
2. V = V ∪ {S};  V'N = V;                          // V'N = {S}
3. V = V ∪ {A, D};  V'T = V'T ∪ { };  V'N = V;     // V'N = {S, A, D}, V'T = { }
4. V = V ∪ {G, F};  V'T = V'T ∪ {a, d};  V'N = V;  // V'N = {S, A, D, G, F}, V'T = {a, d}
5. V = V ∪ { };  V'T = V'T ∪ {g, f};  V'N = V;     // V'N = {S, A, D, G, F}, V'T = {a, d, g, f}
Now V'N = {S, A, D, G, F} and V'T = {a, d, g, f}. Select the rules that contain only symbols in V'N ∪ V'T. Final result:
G': S → AD
    A → aGD
    D → FGd
    F → Ff | f
    G → Gg | g
To eliminate useless symbols, we applied the algorithm of Lemma 1 followed by the algorithm of Lemma 2. If we apply the algorithm of Lemma 2 first, we may fail to eliminate all useless symbols, as the following example shows.
G: S → AB | a
   A → a
Applying the algorithm of Lemma 1 first eliminates B, which derives no terminal string, together with the rule S → AB, giving G1: S → a, A → a. Applying the algorithm of Lemma 2 to G1 then eliminates the now unreachable A, leaving G': S → a.
If instead we apply the algorithm of Lemma 2 first, every symbol is reachable and nothing is eliminated, i.e., G1: S → AB | a, A → a. Applying the algorithm of Lemma 1 to G1 then eliminates B and the rule S → AB, leaving G': S → a, A → a, in which A is useless (it is no longer reachable) but has not been eliminated.
10.4 Normal Forms of CFG
CFG rules have no restriction on their right side: it can be any string (of finite length) of terminal and/or nonterminal symbols. This "free form" of CFG rules makes it more complex to solve certain problems concerning context-free languages. It is often desirable to simplify the form of the rules without affecting the language of the grammar. In this section we will study two such forms, the Chomsky normal form and the Greibach normal form, defined below.
Let G = (VN, VT, P, S) be a CFG, and let A, B, C ∈ VN, a ∈ VT, and α ∈ (VN)*.
• If every rule of G has the form A → BC or A → a, then G is called a Chomsky normal form (CNF) grammar.
• If every rule of G has the form A → aα, then G is called a Greibach normal form (GNF) grammar.
Note that these definitions are only concerned with CFG's whose language does not contain ε.
Converting a CFG to CNF
Given a CFG, we show how to convert the grammar to CNF. The conversion is done rule by rule; the following example should be enough to understand the idea.
• Converting A → aBCDbE to a set of rules in CNF:
A → A1B1,  A1 → a    // and let B1 derive BCDbE as follows,
B1 → BC1             // and let C1 derive CDbE as follows,
C1 → CD1             // and let D1 derive DbE as follows,
D1 → DE1             // and let E1 derive bE,
E1 → F1E,  F1 → b    // and finally let F1 derive b.
* The subscripted symbols are newly introduced nonterminals. Starting with A and applying the above CNF rules in order, we derive aBCDbE, which the original grammar derives with the single rule A → aBCDbE.
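An illustrative sketch of this per-rule conversion (the fresh-nonterminal naming scheme X1, X2, ... and the lowercase-for-terminals convention are assumptions of the example):

from itertools import count

_fresh = count(1)

def rule_to_cnf(A, rhs):
    # Convert one rule A -> rhs (a tuple of symbols, len(rhs) >= 2) to CNF rules.
    def fresh():
        return f"X{next(_fresh)}"          # assumed fresh-nonterminal scheme

    # First, replace every terminal on the right side by a new nonterminal.
    new_rules, symbols = [], []
    for s in rhs:
        if s.islower():                     # terminal, by the assumed convention
            T = fresh()
            new_rules.append((T, (s,)))     # T -> s
            symbols.append(T)
        else:
            symbols.append(s)
    # Then break the (now all-nonterminal) right side into binary rules.
    left = A
    while len(symbols) > 2:
        N = fresh()
        new_rules.append((left, (symbols[0], N)))   # left -> first-symbol N
        left, symbols = N, symbols[1:]
    new_rules.append((left, tuple(symbols)))
    return new_rules

for lhs, r in rule_to_cnf("A", ("a", "B", "C", "D", "b", "E")):
    print(lhs, "->", " ".join(r))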
Example 1. Convert the following CFG to CNF.
(1) S → AaBCb  (2) A → abb  (3) B → aC  (4) C → aCb | ac
For each rule, we apply the idea presented above.
(1) S → AA1, A1 → A2A3, A2 → a, A3 → BA4, A4 → CA5, A5 → b
(2) A → B1B2, B1 → a, B2 → B3B4, B3 → b, B4 → b
(3) B → C1C, C1 → a
(4) C → D1D2 | E1E2, D1 → a, D2 → CD3, D3 → b, E1 → a, E2 → c
We leave it to the reader to prove that the conversion does not affect the language of the grammar.
Converting a CFG to GNF
GNF: A → aα, where A ∈ VN, a ∈ VT, α ∈ (VN)*
The algorithm for converting a CFG to GNF is not simple, not to mention the proof of its correctness. We shall only present an example to show the basic idea, after introducing some terminology. We call a rule front-recursive if its right side starts with the same nonterminal symbol as the one on its left side (e.g., A → AB). A front-nonterminal is a nonterminal symbol located at the left end of the right side of a rule, and a non-front terminal is a terminal symbol that is not located at the left end of the right side. The algorithm works in the following three phases:
1. Eliminate front-recursions.
2. Eliminate front-nonterminals.
3. Eliminate non-front terminals.
The algorithm repeatedly applies phases 1 and 2 until it sees neither front-recursions nor front-nonterminals. Then it applies phase 3 to eliminate the non-front terminals, finally transforming the rules into GNF.
Phase 1. Consider the following rules with a front-recursion. We introduce a new nonterminal (N in the example) and eliminate the front-recursion as follows.
A → AB | CD    becomes    A → CDN | CD,  N → BN | B
Notice that a front-recursive rule must be accompanied by a non-front-recursive rule, for example A → CD above; otherwise the front-recursive rule is useless and can be eliminated. In general, let A → β be a non-front-recursive rule. Then the front-recursion A → Aα can be eliminated as follows.
A → Aα | β    becomes    A → βN | β,  N → αN | α
Phase 2. Find a rule with a front-nonterminal (B in the example below) such that every rule of B has a front terminal. Expand the front-nonterminal so that the rule gets a front terminal, as follows.
A → BbC | aA      becomes    A → cDAbC | aEbC | aA
B → cDA | aE                 B → cDA | aE
Phase 3. Convert every non-front terminal to a new nonterminal as follows.
A → cDAbC    becomes    A → cDANC,  N → b
We can show that these three phases of conversion do not affect the language of the CFG. (We omit the proof, which is challenging.)
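Phase 1, the removal of front-recursions (usually called elimination of immediate left recursion), is the most mechanical of the three phases; an illustrative sketch:

def remove_front_recursion(A, rules, new_nt):
    # rules: list of right-hand sides (tuples) of nonterminal A.
    # new_nt: an assumed fresh nonterminal name (N in the text's example).
    recursive = [rhs[1:] for rhs in rules if rhs and rhs[0] == A]   # A -> A alpha
    others = [rhs for rhs in rules if not rhs or rhs[0] != A]       # A -> beta
    if not recursive:
        return {A: rules}                   # no front-recursion: nothing to do
    # A -> beta N | beta,  N -> alpha N | alpha
    a_rules = [beta + (new_nt,) for beta in others] + others
    n_rules = [alpha + (new_nt,) for alpha in recursive] + recursive
    return {A: a_rules, new_nt: n_rules}

print(remove_front_recursion("A", [("A", "B"), ("C", "D")], "N"))
# {'A': [('C','D','N'), ('C','D')], 'N': [('B','N'), ('B',)]}, as in the text.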
Example: Converting a CFG to GNF.
CFG: S → SA | BeA
     A → AS | a
     B → b
Phase 1:
S → BeAC | BeA
C → AC | A
A → aD | a
D → SD | S
B → b
Phase 2:
S → beAC | beA
C → aDC | aC | aD | a
A → aD | a
D → SD | S
B → b
Phase 2 (again):
S → beAC | beA
C → aDC | aC | aD | a
A → aD | a
D → beACD | beAD | beAC | beA
B → b
Phase 3:
S → bEAC | bEA
C → aDC | aC | aD | a
A → aD | a
D → bEACD | bEAD | bEAC | bEA
B → b
E → e
The rule B → b, which is now useless, can be eliminated using the algorithm presented in Section 10.3.
Exercises
10.1 For the CFG G given below, follow steps (i)-(iii) to construct a CFG G' such that L(G) = L(G') with the smallest possible number of ε-production rules.
(i) Find the set of nonterminal symbols that derive ε. You should also show the procedure that you took for your answer.
(ii) Eliminate all ε-production rules from the grammar.
(iii) Find the CFG G' using the result of step (ii).
G:  S → ABCd | BEF
    A → aA | ε
    B → FGH | b
    C → cC | ε
    E → a | ε
    F → f | ε
    G → Gg | H
    H → h | ε
10.2 Using the algorithm presented in Section 10.2, eliminate all unit production rules from the following CFG G. You should also show how you got your answer.
G:  S → AB | A | B | ε
    A → C | c
    B → D | d
    C → a
    D → bD | B | b
10.3 This problem is designed to eliminate all useless symbols from the CFG G given below, in two steps (a) and (b). You should show the results of each step.
(a) Using the algorithm presented in Lemma 1 of Section 10.3, construct a CFG G1 that satisfies the following conditions: (i) L(G) = L(G1); (ii) every nonterminal symbol of G1 derives a terminal string.
(b) Apply the algorithm of Lemma 2 of Section 10.3 to construct a CFG G2 that satisfies the following conditions: (i) L(G) = L(G2); (ii) for every terminal or nonterminal symbol X of G2, it is possible to derive a sentential form containing X.
G:  S → BA | CDg
    B → aEA
    A → DEd
    E → Eg | g
    F → hF | h
    G → cGCc
    C → Ce
    D → Df | ε
    I → cDEFab
10.4 Transform the following CFG into Chomsky normal form.
S → abABCde
A → aAa
B → ba
C → cab