Chapter 12. Hierarchy Of Models (ii)

  • Uploaded by: kims3515354178
  • 0
  • 0
  • July 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Chapter 12. Hierarchy Of Models (ii) as PDF for free.

More details

  • Words: 7,274
  • Pages: 43
Hierarchy of the Models

1

12. Hierarchy of the Models: Proper Containment In Chapter 7 we saw the Chomsky hierarchy among languages, automata, and other models and proved the horizontal relations (i.e., the characterization) at the lowest level of the hierarchy. Figures (a) and (b), respectively, are the copies of the relations among the classes of languages and automata presented in Chapter 7 (see next slide). Recall that if a language is a member of a class of languages at a lower level, then it is also a member of an upper class. If a language is recognizable by an automaton at a lower level, then it is also recognizable by an automaton at an upper level. But the reverse of these relations does not hold. An upper lever language class has a member that does not belong to a lower level class, and there is an automaton belonging to an upper level whose language cannot be recognized by any automaton belonging to a lower level of the hierarchy.

TM Type-3L

Type-0L Type-1L

Type-2L

(a) Containment relations of the language classes

LBA PDA FA

(b) Containment relations of automata capability 2

Hierarchy: Proper Containment Review

The Chomsky Hierarchy

Languages (grammars) Recursively Enumerable Sets (type 0) Context-sensitive Languages(type 1)

Machines Turing Machines (TM)

Linear-bounded Automata (LBA)

Context-free Languages(type 2)

Pushdown Automata (PDA)

Regular Languages(type3)

Finite State Automata (FA)

Other Models Post System, Markov Algorithms, µ -recursive Functions . . . . . ↓ : Containment ↔ : Characterization

Regular Expression 3

Hierarchy: Proper Containment To prove the proper containment relations of the hierarchy it requires in-depth logical arguments that we have developed in Chapter 1. In this chapter we will prove the containment relations up to the class of context-sensitive languages. In Chapter 15 we will complete the proof of the hierarchy by proving the characterizations of context-free, context-sensitive and phrase structured languages, together with the proof of the last proper containment relation between type 0 and type 1. The logics involved in these proofs are so elegant that it worth a challenge.

12.1 Relationship of the language classes: proper containment 312 12.2 The pumping lemma 316 The pumping lemma and proof 12.3 Applications of the pumping lemma 323 Examples 12.4 The pumping lemma for context-free languages 328 12.5 Application of the pumping lemma for context-free languages 331 12.6 Ogden’s lemma 336 The lemma and an application 12.7 A proof of the pumping lemma for context-free languages 339 Rumination 346 Exercises 350

4

Hierarchy: Proper Containment

12.1 The containment relations Before the proof of the proper containment relations (⊃) between two levels of the hierarchy, we will prove the containment relations (⊇) by the following theorem. Theorem 12.1 Let Type-iL denote type i (i = 0, 1, 2, 3) language class. Then for all i < 3, the following relation holds. Type-iL ⊇ Type-(i+1)L Proof. Let α →β be a rule of a grammar G = (VT, VN, P, S). We know that by definition, type 3 (regular) grammar rules should have the form (i.e., |α | = 1) of type 2 (context-free) grammar rules with additional restriction that either β = xB or β = x , where x ∈ (VT)* and B ∈ VN. It follows that every type 3 grammar is a type 2 grammar, and hence, all type 3 languages belong to the class of type 2 languages, i.e., Type-2L ⊇ Type-3L.

5

Containment Relations

Hierarchy: Proper Containment

Proof (continued). By definition, type 1 grammar (CSG) rules are type 0 grammar rules with the additional restriction that the left side of each rule cannot be longer than the right side. It follows that every type 1 (context-sensitive) grammar is also a type 0 (phrase structured) grammar. That is, Type-0L ⊇ Type-1L Now, we prove that type 1 language class contains type 2 (context-free) language class. Recall that type 1 grammar rules are non-contracting. In other words, type 1 grammar rules are type 0 grammar rules with the restriction that for any rule α →β , the left side cannot be longer than the right side, i.e., |α | ≤ |β |. Type 2 grammar rules are type 0 grammar rules having one nonterminal symbol on their left side, i.e., |α | = 1. Both type 1 and type 2 grammars are defined by adding a restriction on type 0 grammar rules. So for the proof of the containment relation between type 1 and type 2 language classes, we cannot apply the simple logic that we used for the other relations between other language classes.

6

Containment Relations

Hierarchy: Proper Containment

Clearly, since every CFG rule has exactly one nonterminal symbol on its left side, as far as the left side of grammar rules are concerned, CFG rules belong to CSG rules. The only problem is the ε -production rules allowed in CFG’s, which violate the requirement that the right side of CSG rules cannot be shorter than the left side, except for S → ε . All the other CFG rules, which do not produce ε , are CSG rules. Fortunately, we have a way to overcome this problem. Recall Theorem 10.1 repeated below. Theorem 10.1 Given a CFG G, we can construct a CFG G' that satisfies the following conditions. • L(G) = L(G'). • If ε ∉ L(G), G' has no ε -production rule. Otherwise, S → ε is the only ε production rule in G'. Grammar G' is a CSG because every rule in G' is non-contracting except for the rule S → ε , if any. This implies that every CFL can be generated by a CSG. It follows that Type-1L ⊇ Type-2L. 7

Containment Relations

Hierarchy: Proper Containment

In summary, we showed the following relations. Type-0L ⊇ Type-1L ⊇ Type-2L ⊇ Type-3L Now, we will show that all these containment relations are proper, i.e., Type-0L ⊃ Type-1L ⊃ Type-2L ⊃ Type-3L For the proof it is enough to show that for each i = 0, 1, 2, there is a language in class Type-iL that does not belong to class Type-(i+1)L. In this chapter we will first show that the familiar type 2 language {aibi | i ≥ 0} does not belong to Type-3L, i.e., the class of regular languages. Then we will show that the language {aibici | i ≥ 0 }, which was presented as a typical type 1 language in Chapter 2, does not belong to Type-2L. These two results prove the following relations. Type-1L ⊃ Type-2L ⊃ Type-3L We will defer the proof of the last part Type-0L ⊃ Type-1L until Chapter 15.

8

12.2 The Pumping Lemma

Hierarchy: Proper Containment

To prove the relation Type-2L ⊃ Type-3L, it is enough to find a context-free language that is not regular. Our approach is to first find a common property of all regular languages, and then show a context-free language that does not satisfy this property. Since every finite language is regular (we leave the proof of this claim for the reader), the property we are looking for must be found among infinite regular languages. The following lemma presents such property. Pumping Lemma: For every infinite regular language L, there is a constant integer n that satisfies the following conditions: Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length is greater than or equal to n, there are u, v, w ∈ ∑* such that • z = uvw, |uv| ≤ n, |v| ≥ 1, and • for all i ≥ 0, uviw ∈ L

9

Pumping Lemma

Hierarchy: Proper Containment

Proof: For the proof we need the following two terminologies; a transition path is a directed path on a state transition graph and a path label is the string constructed by picking the input symbols along a transition path. For example, in the state transition graph shown below, aabba is the path label on the state transition path 1 → 2 → 4 → 3 → 2 → 4. Given the string aabba as an input, the DFA will take a sequence of transitions along the transition path 1 → 2 → 4 → 3 → 2 → 4. To help the reader understand the argument, we will use the following state transition graph throughout the proof, together with other figures when needed. 2

a start

11

a

a b

33

4

b

b 10

Pumping Lemma

Hierarchy: Proper Containment

Given an infinite regular language L, let M be a DFA that recognizes L and let n be the number of states of this automaton. For an input string z ∈ L, whose length is greater than or equal to n (i.e., |z| ≥ n), examine the state transition path with path label z. Notice that because M is deterministic, there is only one transition path with path label z. Since L is infinite and the number of states n is finite, there must be a string z in L such that |z| ≥ n. Otherwise, L cannot be infinite.

Let m be the length of z (i.e., |z| = m ≥ n ) and let z = a1a2 …am. 2

a start

11

a

a b

33

4

b

b

11

Pumping Lemma

Hierarchy: Proper Containment

Since z ∈ L, the transition path with label z must end in an accepting state.

Because |z| ≥ n, the path involves a cycle and hence, there should be a state, say q, visited a second time by reading some j-th symbol aj (j ≤ n) in z as illustrated in figure (b) below. (This can be proved by the pigeonhole principle. Refer the application example of the pigeonhole principle in Chapter 1.) 2

a start

11

|z| = m ≥ n

a

a b

33

4

b

b

a1 start

a2

. ..

ai ai+1

aj+1 . q

aj

.

. a m

M (a) Transition path with path label aabba

(b) Transition path with path label z = a1a2 …am, m ≥ n 12

Pumping Lemma

Hierarchy: Proper Containment

As shown in figure (b), let v be the path label of the cyclic transition path, and let u and w be, respectively, the prefix and the suffix of z partitioned by v. (If q is the start state, then u = ε , and if q is the last state in the path, w = ε .) Since the cycle involves at least one edge, |v| ≥ 1. Since j ≤ n, we have |uv| ≤ n. z = a1a2 …ai ai+1 . . . . aj aj+1 . . . .am

u = a, v = abb, w = a 2

a start

11

a

|z| = m ≥ n

w

|uv| ≤ n

|v| ≥ 1

a b

33

v

u

4

b

b (a) Transition path with path label aabba

a1 start

a2

. .. u

M

ai ai+1

aj+1 . q

.

aj

. a m w

v

(b) Transition path with path label z = a1a2 …am, m ≥ n 13

Pumping Lemma

Hierarchy: Proper Containment

Sine uvw is in L, M accepts uvw. It follows that all the following strings corresponding to the path labels of the transition paths with the cycle repeated zero or more times are also accepted by M. uw = uv0w, uvvw = uv2w, uvvvw = uv3w, . . . . That is, for all i ≥ 0, uviw ∈ L. (Refer to figure (a) for an example.) z = a1a2 …ai ai+1 . . . . aj aj+1 . . . .am

For all i ≥ 0, a(abb)ia ∈ L 2

a start

11

a

b

|z| = m ≥ n

a1 start

(a) Transition path with path label aabba

|v| ≥ 1

For all i ≥ 0, uviw ∈ L

4

b

w

|uv| ≤ n

a b

33

v

u

u = a, v = abb, w = a

a2

. .. u

M

ai ai+1

aj+1 . q

.

aj

. a m w

v

(b) Transition path with path label z = a1a2 …am, m ≥ n 14

Pumping Lemma

Hierarchy: Proper Containment

Summarizing the observations, we get the pumping lemma repeated here. Pumping Lemma: For every infinite regular language L, there is a constant integer n which satisfies the following conditions: Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length is greater than or equal to n, there are u, v, w ∈ ∑* such that • z = uvw, |uv| ≤ n, |v| ≥ 1, and • for all i ≥ 0, uviw ∈ L

Job Applicant Bloopers

Break Time

Interviewer: “Do you think you can handle a variety of tasks?” Applicant: “I should say so. I’ve had nine totally different jobs in the past five months.” - Jim -

15

Hierarchy: Proper Containment

12.3 Application of the Pumping Lemma Now, as an application of the pumping lemma we will show that there is a CFL which is not regular. Specifically, using the lemma, we will prove that the language {aibi | i ≥ 1} does not satisfy the pumping lemma to show that Type-2L ⊃ Type-3L, the proper containment relation between the classes of type 2 and type 3 in the Chomsky hierarchy. For the convenience of the application, the lemma is rewritten below to expose its logical structure. (Notice the quantified parts highlighted.) (1) For every infinite regular language L, // ∀ infinite regular language L, (2) there exists a constant n such that // ∃ a constant n such that . . . (3) for every string z ∈ L, such that | z | ≥ n, // ∀ string z ∈ L, | z | ≥ n (4) there exist strings u, v, w ∈ ∑* such that // ∃ u, v, w that satisfy . . . (i) z = uvw, (ii) |uv| ≤ n, (iii) |v| ≥ 1, (iv) for all i ≥ 0, uviw ∈ L. // ∀ i ≥ 0, uviw ∈ L.

16

Application of the Pumping Lemma

Hierarchy: Proper Containment

Before using the pumping lemma for the proof, recall the techniques (in Section 1.3) for dealing with existential quantification (∃ ) and universal quantification (∀) in a statement to prove. In this example, we are going to prove that for the given context-free language, the pumping lemma is not true. Hence, for the existentially quantified parts of the lemma, i.e., the constant n and strings u, v, and w, we should consider all possible cases. For the universally quantified parts, i.e., regular language L, string z ∈ L, and integer i ≥ 0, it is enough to consider a single case, which can be chosen for our convenience of the proof. The argument of the proof will run as a game between the proponent of the lemma and us who are going to show that the lemma does not hold for the given language.

(1) For every infinite regular language L, // ∀ infinite regular language L, (2) there exists a constant n such that // ∃ a constant n such that . . . (3) for every string z ∈ L, such that | z | ≥ n, // ∀ string z ∈ L, | z | ≥ n (4) there exist strings u, v, w ∈ ∑* such that // ∃ u, v, w that satisfy . . . (i) z = uvw, (ii) |uv| ≤ n, (iii) |v| ≥ 1, (iv) for all i ≥ 0, uviw ∈ L. // ∀ i ≥ 0, uviw ∈ L. 17

Application of the Pumping Lemma

Hierarchy: Proper Containment

Now, we show that our familiar CFL L = {aibi | i ≥ 1} does not satisfy the pumping lemma, and hence, it is not regular. Suppose that L is a regular language. (By this sentence the reader will promptly recognize that we are going to use the proof-bycontradiction technique.) We will carefully analyze the pumping lemma to find a contradicting part against our supposition. (The numbers in the following arguments match the ones that we put in the lemma.) Pumping lemma says (1) For every infinite regular language L, (2) there exists a constant n s.t. (3) . . . . . (4) . . . .

Our argument L is infinite regular language. Hence, there should be a constant n such that conditions (3) and (4) are satisfied. Let n be just that constant. Using n as a variable, we are going to consider all possible constant values of n in our argument.

18

Application of the Pumping Lemma Pumping lemma says (3) For all strings z ∈ L such that |z|≥ n, (4) there exist strings u, v and w that satisfy the following . (i) z = uvw, (ii) |uv| ≤ n, (iii) |v| ≥ 1, (iv) for all i ≥ 0, uviw ∈ L. n

n

aa . . . . . . . .a bb . . . . . . . . .b u v w |uv| ≤ n

Hierarchy: Proper Containment Our argument We choose string z = anbn in L, and examine whether for all strings u, v, w satisfying conditions (i) – (iii), the condition (iv) is satisfied. Let z = anbn = uvw according to (i). Then by (ii) and (iii), v contains only (and at least one) a’s. (See the figure at the bottom left.) It follows that string uv2w contains more a’s than b’s. That is, for i = 2, we claim that uv2w ∉ L. The language L does not satisfy the pumping lemma. This contradicts the assumption that L is regular. Therefore, L is not regular.

19

Application of the Pumping Lemma

Hierarchy: Proper Containment

We know that L = {aibi | i ≥ 1 } is a CFL. Since we have just shown that L is not regular, the containment relation Type-2L ⊇ Type-3L that we have established in the previous section can be refined to the proper containment relation, i.e., Type-2L ⊃ Type-3L.

Secrets of Success

Break Time

The other day I had the opportunity to drop by my department head's office. He's a friendly guy and on the rare opportunities that I have to pay him a visit, we have had enjoyable conversations. While I was in his office yesterday I asked him, "Sir, What is the secret of your success?" He said, "Two words." "And, Sir, what are they?" "Right decisions." "But how do you make right decisions?" "One word." He responded. "And, sir, What is that?" "Experience." "And how do you get Experience?" "Two words." "And, Sir, what are they?" "Wrong decisions." - Rubin -

20

Hierarchy: Proper Containment

12.4 The pumping lemma for context-free languages Now, we prove the proper containment relation between the next upper two

levels of the language classes, i.e., Type-1L ⊃ Type-2L. We use the same approach that we took for proving the containment relation Type-2L ⊃ Type-3L. With no proof we first present a common property of infinite context-free languages, called the pumping lemma for context-free languages (see below). Then, as an application of the lemma, we will show that the context-sensitive language {aibici | i ≥ 1 }, which we introduced in Section 2.3, does not satisfy the lemma. We shall present the proof of the lemma after showing the application. The pumping lemma for context-free languages. For every infinite CFL L, there is a constant integer p that satisfies the following conditions: Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length is greater than or equal to p, there are u, v, w ∈ ∑* such that • z = uvwxy, |vwx| ≤ p, |vx| ≥ 1 • for all i ≥ 0, uviwxiy ∈ L

21

Pumping Lemma for CFL’s

Hierarchy: Proper Containment

The pumping lemma for CFL’s looks very similar to the one for regular languages. The figure below shows the difference. Both lemmas claim the existence of a constant (i.e, n for regular languages and p for CFL’s). The lemma for regular languages divides string z into three segments (i.e., z = uvw), while the lemma for CFL’s divides it into five segments (i.e., z = uvwxy). The lemma for regular languages provides one site, named v, to pump (see the up arrow), while the lemma for CFL’s provides two sites v and x to pump simultaneously. The lemma for CFL’s

The lemma for regular language • z = uvw, |uv| ≤ n, |v| ≥ 1 • for all i ≥ 0, uviw ∈ L

• z = uvwxy, |vwx| ≤ p, |vx| ≥ 1 • for all i ≥ 0, uviwxiy ∈ L z

u

v

w vwx| ≤ p

x

y

u

v

w

|uv| ≤ n 22

Pumping Lemma for CFL’s

Hierarchy: Proper Containment

Recall that when we applied the pumping lemma for regular languages, we took advantage of the range of the pump site v that is restricted within the prefix of length n of string z, given by the condition |uv| ≤ n. Unfortunately, for the pumping lemma for CFL’s the segment vwx, which contains the two pump sites, can be flanked by u to the left and by y to the right. So the pump sites can be located anywhere in z in the window of width |vwx| ≤ p. When we apply the lemma to prove a language is not context-free, we must consider all possible locations of the window vwx, from the left end of z up to the right end. The lemma for CFL’s

The lemma for regular language • z = uvw, |uv| ≤ n, |v| ≥ 1 • for all i ≥ 0, uviw ∈ L

• z = uvwxy, |vwx| ≤ p, |vx| ≥ 1 • for all i ≥ 0, uviwxiy ∈ L z

u

v

w vwx| ≤ p

x

y

u

v

w

|uv| ≤ n 23

Hierarchy: Proper Containment

12.5 Application of the pumping lemma for CFL’s Again, for the clarity of the argument, we rewrite the lemma as follows. (1) For every infinite CFL L, // ∀ (2) there exists a constant p such that // ∃ (3) for every string z ∈ L, such that | z | ≥ p, (4) there exist strings u, v, w, x and y that satisfy the following conditions. (i) z = uvwxy, (ii) |vwx| ≤ p, (iii) |vx| ≥ 1, (iv) for all i ≥ 0, uviwxiy ∈ L.

infinite CFL L, a constant p such that . . . // ∀ string z ∈ L, | z | ≥ p // ∃ u, v, w, x, y that . . .

// ∀ i ≥ 0, uviwxiy ∈ L.

24

Application of CFL Pumping Lemma

Hierarchy: Proper Containment

Now, we show that the CSL L = {aibici | i ≥ 1}, that we have introduced in Section 2.3, does not satisfy the pumping lemma for CFL’s. Our argument is similar to the one for proving that language {aibi | i ≥ 1} is not regular. Suppose L is a CFL. It should satisfy the lemma. With L we walk along the lemma and come out with a contradiction. Pumping Lemma’s claim (1) For every infinite CFL L, (2) there exists a constant p s.t. (3) . . . . . (4) . . . .

Our claim L is a infinite CFL. Hence, there should be a constant p such that conditions (3) and (4) are satisfied. Let p be the constant. Using p as a variable, we are going to consider all possible constant values of p in our argument.

25

Application of CFL Pumping Lemma

Hierarchy: Proper Containment Our claim

Pumping Lemma’ claim (3) For all strings z ∈ L such that |z|≥ p, (4) there exist strings u, v, w, x, y that satisfy the following. (i) z = uvw, (ii) |uv| ≤ n, (iii) |v| ≥ 1, (iv) for all i ≥ 0, uviw ∈ L. p

We choose string z = apbpcp in L and examine whether, for all strings u, v, w satisfying conditions (i) – (iii), it satisfies condition (iv). By (i) let Let z = apbpbp = uvwxy, and examine what kinds of symbols can be in the two pump sites v and x. Recall that vwx, whose length ≤ p, can be anywhere in z. p

p

aa . . . . . . . . .abb . . . . . . . .bcc. . . . . . . . . c vwx y u

26

Application of CFL Pumping Lemma

Hierarchy: Proper Containment

As we can see in the figure below, by condition (ii) |vwx| ≤ p, the pump sites v and x together can contain no more than two different symbols (a’s and b’s or b’s and c’s). By condition (iii) |vx| ≤ 1, strings v and x together should contain at least one symbol. It follows that in uv2wx2y the numbers of a’s, b’s, and c’s cannot be the same, which implies that uv2wx2y ∉ L. The language does not satisfy condition (iv) of the lemma when i = 2. Hence, L is not context-free.

(i) z = uvwxy, (ii) |vwx| ≤ p, (iii) |vx| ≥ 1, (iv) for all i ≥ 0, uviwxiy ∈ L.

p

p

p

aa . . . . . . . . .abb . . . . . . . .bcc. . . . . . . . . c vwx y u |vwx| ≤ p

27

Application of CFL Pumping Lemma

Hierarchy: Proper Containment

For the language {aibici | i ≥ 1 }, it was fairly straightforward to apply the lemma and show that the language is not context-free. However, this lemma is so weak that there are many non-context-free languages for which it is either hard or impossible to apply it. For example, consider the following language L, which is not context-free. Whichever string z (of length greater than or equal to p) we pick from this language, it is impossible to pump and come to a contradiction. L = {xaix | i > 0, x ∈ {0,1}+} ∪ {x | x ∈ {0,1}+} p p

p p

For example, suppose that we picked string z = 1 0 aaa1 0 to see if z = uvwxy can be pumped to come up with a contradiction. For the case of v = a, w = a, and x = a, it p p p p satisfies that uviwxiy ∈ L, for all i ≥ 0. Even when there is one a (i.e., z = 1 0 a1 0 ), if we set v = a, and w = x = ε , we get uviwxiy ∈ L, for all i ≥ 0. Notice that picking a binary string for z from the second part of the language doesn’t work either. Ogden has introduced a lemma to strengthen this weakness.

28

12.6 Ogden’s Lemma

Hierarchy: Proper Containment

Ogden’s lemma, shown below, is very similar to the original lemma except for the additional (highlighted) parts restricting the locations of the pump sites. (1) For every infinite CFL L, (2) there exists a constant p such that (3) for every string z ∈ L such that |z| ≥ p, if we mark p symbols in z (4) there exist strings u, v, w, x and y that satisfy the following conditions. (i) z = uvwxy, (ii) |vwx| ≤ p, (iii) |vx| ≥ 1, and string vx contains at least one marked symbol, (iv) for all i ≥ 0, uviwxiy ∈ L. We omit the proof of this lemma. Interested readers are referred to the book “Introduction to Formal Languages, Automata Theory, Languages and Computation” by Hopcroft, Motwani, and Ullman (2001. Addison Wedley). 29

Ogden’s Lemma

Hierarchy: Proper Containment

Now, we can apply Ogden’s lemma and show that the following language is not CFL, which we failed to prove with the original lemma. L = {xaix | i > 0, x ∈ {0,1}+} ∪ {x | x ∈ {0,1}+} Suppose L is a CFL. Then it should satisfy Ogden’s lemma. Let p be the constant. p p p p p Pick string z = 1 0 a1 0 , let z = uvwxy, and mark the prefix 1 . According to the lemma, |vwx| ≤ p, and string vx should contain at least one marked symbol. It follows that string vx can never include the symbol a in the middle. We see that string z' = uv2wx2y ∉ L, because in z' the binary string on the left side of a must be different from the string on the right side of a. We have a contradiction. It follows that L is not context-free. By the two application examples of the pumping lemmas for regular languages and context-free languages, we have proven the relations of proper containment between the lower two levels of Chomsky hierarchy (repeated below). We will complete the remaining proofs of other relations in Chapter 15.

30

Review

Chomsky Hierarchy

Languages (grammars) Recursively Enumerable Sets (type 0) Context-sensitive Languages(type 1)

Machines Turing Machines (TM)

Linear-bounded Automata (LBA)

Context-free Languages(type 2)

Pushdown Automata (PDA)

Regular Languages(type3)

Finite State Automata (FA)

Other Models Post System, Markov Algorithms, µ -recursive Functions . . . ↓ : Containment . ↔ : Characterization . * Colored arrows indicate proven relations. Regular Expression 31

Hierarchy: Proper Containment

12.7 Proof of the pumping lemma for CFL’s In this section, we will present a proof of the lemma with a CFG to support the argument with some illustrations. (Detailed formal proof is omitted.) Consider the CFG below, which is in Chomsky normal form (CNF) and has no ε -production rule. (Recall that every CFL can be generated by a CFG in CNF with no ε -production rules, except for S → ε in the case when the language contains ε .) S → AB C → DB | c

A → DE F → HC

B → FD D →d

E → FC H →h

Let’s examine the parse tree of the grammar yielding the following string. d(hd)3hc(d)3dchdchd

32

Proof of the CFL Pumping Lemma

Hierarchy: Proper Containment S

A Notice that on the longest trunk, D E there appears a repeating F subsequence of nonteminals, d C H C S-A-E-F-C-B-F-C-B-F-C-B-F-C B h D D H B F (see the highlighted part). Also d D notice that in the yield of this tree F D d H h C 3 3 d(hd) hc(d) dchdchd, d H C d the highlighted substrings that h D B c h are produced by the repeating d F D nonterminals also repeat. d G: H C Since every nonterminal has h S AB D B two children, it can produce C DB | c d D “fruits” (i.e., a terminal string) F F HC either on its left side branches or d H C D d on its right side branches. In the h c example, F, C and B, yield: d(hd)3hc(d)3dchdchd respectively, produce h, d and d.

B F

D C

d

c

A DE B → FD E FC H h

33

Hierarchy: Proper Containment

Proof of the CFL Pumping Lemma

S

On the longest trunk, F-C-B repeats. This repeating segment can be inserted (or deleted), proportionally increasing (or decreasing) the “fruits” on both sides of the trunk. Notice that each repeating segment F-C-B bears hb to the left and d to the right.

For CFG G, it satisfies the condition that for all i ≥ 0, z= d(hd)ihc(d)idhcdhcd is in the language L(G).

A D d

H h D d H h D d

F C B F C B

E

B

C B

D D

d d

F

F

D

H

C

h

c

F D d G: H C h S D B C d D F F d H C D h c

D

H

C

h

c

d

d

AB DB | c HC d

A DE B → FD E FC H h

z = d ( h d )3 h c (d)3 d h c d h c d u v w x y 34

Proof of the CFL Pumping Lemma

Hierarchy: Proper Containment

We can make the trunk of the suffix tree grow by inserting more repeating subsequences of F-C-B, or cut it short by deleting them. Since the number of repeats can be arbitrary, there can be infinite number of parse trees. Every repeating trunk has a repeating pattern of branches to its left or right and every branch should bear “fruit” (i.e., a terminal symbol). (Recall that there is no ε -production rule except S → ε .) Hence, the parse tree can yield a repeating substring either to the left (bh in the example) or to the right (b in the example) of the trunk. For CFG G , we can claim that for all i ≥ 0, string d(hd)ihc(d)idchdchd is a yield of a parse tree of G, i.e., d(hd)ihc(d)idchdchd ∈ L(G). We can generalize the above observation as follows. Let G be a CFG in Chomsky normal form whose language is infinite. Consider a string z in L(G) whose length is long enough (i.e., some constant which is proportional to the size of the nonterminal alphabet) such that the parse tree yielding z has a trunk where a nonterminal symbol A repeats.

35

Hierarchy: Proper Containment

Proof of the CFL Pumping Lemma

Let Aα Aβ be a pattern appearing on a trunk of a parse tree, where A is a repeating nonterminal symbol, and α and β are strings of nonterminals. In general this parse tree will have the structure shown below. Let v and x be the yields, respectively, from the left branches and the right branches of Aα as illustrated in the figure below. Let w be the yield generated by the branches from the “tip” Aβ , and let u and y be, respectively, the yields from the left subtrees and the right subtrees, if any, of the trunk. S A u

α

y

v A x β w 36

Proof of the CFL Pumping Lemma

Hierarchy: Proper Containment

We see that string uvwxy ∈ L(G), and for all i ≥ 0, uviwxiy ∈ L(G), because if Aα Aβ is a legal pattern on a trunk, then for all i ≥ 0, (Aα )iAβ is also a legal pattern. Let VN be the nonterminal alphabet of the grammar. The length of string z for which every parse tree has a trunk with a repeating pattern is proportional to | VN|, i.e., for some constant p, |z| ≤ p = c|VN|. It follows that |vwx| ≤ p and |vx| ≥ 1. S S A A u

α

y

u v

v A x β w

y

α A

x

α v

A

x

β w

37

Proof of the CFL Pumping Lemma

Hierarchy: Proper Containment

Summarizing the above observations, we have the following pumping lemma for CFL’s. The pumping lemma for context-free languages. For every infinite CFL L, there is a constant integer p that satisfies the following conditions: Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length is greater than or equal to p, there are u, v, w ∈ ∑* such that • z = uvwxy, |vwx| ≤ p, |vx| ≥ 1 • for all i ≥ 0, uviwxiy ∈ L

38

Hierarchy: Proper Containment Rumination (1): Application of the Pumping Lemma



To prove that {aibi | i ≥ 1} is not regular, we showed that for this language the pumping lemma does not hold. Let’s review

how we dealt with the parts of the lemma quantified existentially (i.e., prefixed with “there is,” “there exist” or “there are”) or universally (i.e., prefixed with “for all” or “for every”). To show that the lemma does not hold for the language, we must consider all possible cases for each part existentially quantified. To contrary, for each part universally quantified, it is enough for us to pick just one case (for our convenience) that will lead to a contradiction. The lemma uses an existential quantification in the following two parts; (2) There is a constant n … (4) There are strings u, v and w . . . So, in the proof, by using n as a variable, we took into consideration of all possible constant values of n, and examined all possible strings u, v, and w that satisfy the conditions given in (i) – (iii) of the lemma. In the following three parts, the lemma uses a universal quantification; (1) For every infinite regular language, (3) For every string z ∈ L, (4)-(iv) For all i ≥ 0, For part (1) we picked language {aibi | i ≥ 1} that we supposed to be a regular language, for part (3) we picked z = anbn, and for part (4)-(iv) we picked i =2. (If we wanted, we could have chosen others than z = anbn and i = 2.)

39

Hierarchy

Rumination (1): Application of the Pumping Lemma

• Here is an application example of the pumping lemma with a logical bug that we often come across while grading homework assignments.

Question: Is the language L = {abcaiabc | i ≥ 0} regular? Justify your answer. Answer: L is not regular. Suppose L is regular. Then L should satisfy the pumping lemma. Let n be the constant of the lemma, and choose string z = abcanabc, which is in L. Since |z| ≥ n, string z satisfies condition (2) of the lemma. Let z = uvw, and consider the case where u = a, v = b, and w = canabc. Then we get uv2w = abbcanabc, which is not in L. It follows that L does not satisfy the pumping lemma. L is not regular language!! We can show that L is regular by showing an FA which recognizes L (see the figure below). What is wrong in the above argument? It is in the argument dealing with the existential quantification for strings u, v, and w in part (4) of the lemma. The argument took u, v, and w only for the specific values of u = a, v = b, and w = canabc, not considering other cases satisfying conditions (i)-(iii). We should show that uviw ∉ L, for all possible locations of the “pump site” v in the prefix of length n of string z and for all possible lengths of v (i.e., 1≤ |v| ≤ n). Actually, for the case where v contains some a’s in the substring an, it satisfies that for all i ≥ 0, uviw ∈ L. The proof fails. (4) There exist strings u, v and w that satisfy the following conditions. (i) z = uvw, (ii) |uv| ≤ n, (iii) |v| ≥ 1, (iv) for all i ≥ 0, uviw ∈ L.

a

b

c

a

b

c

a

40

Rumination (1): Application of the Pumping Lemma

Hierarchy

• The pumping lemma presents a common property of all infinite regular languages. Recall the last claim of the lemma, i.e., “for all i ≥ 0, uviw ∈ L”, which is based on the existence of a cyclic path in the state transition graph of every DFA which recognizes an infinite regular language. Notice that every finite language is regular. Let L = {w1, w2, . . . , wk} be a finite language. Then we can simply construct a regular grammar with rules S → w1 | w2 | . . .| wk. Therefore, every non-regular language is infinite. • We know that every infinite regular language satisfies the pumping lemma. Is the converse of this statement true? In other words, can we say that if an infinite language satisfies the pumping lemma, then the language is regular? We didn’t prove that every infinite language satisfies the pumping lemma. We proved it only for infinite regular languages. Actually, Appendix E shows a non-regular CFL which satisfies the lemma. Thus, it is wrong to prove the regularity of a language by showing that it satisfies the pumping lemma.

Application of the pumping lemma for CFL

• When we apply the pumping lemma for CFL, we should keep in mind the difference between the lemmas for regular languages and CFL’s. Recall that for regular languages, the region x, where we can pump, is limited in the prefix of length n of z, while for the lemma for CFL’s, the two regions v and x, where we can pump, can be anywhere in z within the window of width | vwx| ≤ p. Thus, for CFL’s it requires more cases to analyze.

41

Hierarchy: Proper Containment Rumination (2): Application of Ogden’s Lemma • Often we hear that programming languages are context-free. However, they are not clean context-free languages, and carry some context-dependency in their syntax. For example, in C it is not allowed to declare a variable more than once in a block, and in FORTRAN there are statements requiring a matching label (e.g., 10 in the following DO-loop). .... DO 10 I = 1, 100, 1 SUM = SUM + I 10 PRO = PRO * I .... The formal language L below has a similar context-dependency of programming languages. String x corresponds to the label (i.e., a string of decimal digits) and ai corresponds to the loop body. Using Ogden’s lemma we showed that L is not a context-free language. L = {xaix | i > 0, x ∈ {0, 1}+} ∪ {x | x ∈ {0, 1}+}

42

Exercises

Hierarchy: Proper Containment

12.1 Using the pumping lemma for regular languages, we are going to prove that L = {xxR | x ∈{a, b, c}* }is not regular as follows. Complete the proof by answering each of the following questions. Proof. Suppose that L is regular, and let n be the constant of the lemma. All the following strings are in L. (a) abccba

(b) a100 b100 b100 a100

(c ) an/2 bn/2 b n/2 an/2

(d) anbnbnan

Question 1: Which of the strings above are you going to pick as the string z for your proof? Why are other strings not your choice? If you don’t like any, choose another string in L and explain why. Question 2: Consider strings u, v, w ∈{a, b, c}* that satisfy the conditions (i) z = uvw, (ii) |uv| ≤ n, and (iii) |v| ≥ 1. What will be in v? Briefly explain all possible contents of string v. Question 3: To show that L is not regular, how are you going to use the last part of the lemma: “For all i ≥ 0, string uviw ∈ L.?” Write your argument. 12.2 Which of the following languages are regular? Justify your answer. L1 = {aibjck | i, j, k > 0 }

L2 = {aaaaibjci | i, j > 0 }

12.3 For the symbols a, b and c, and a string x, let #a(x), #b(x) and #c(x), respectively, be the number of a’s, b’s and c’s in string x. Which of the following languages are context-free? Justify your answer. L3 = { x | x ∈{a,b,c}+, and #a(x) > #b(x) } L4 = { x | x ∈{a,b,c}+, and #a(x) = #b(x) = #c(x) } L5 = {x | x ∈{a,b,c}+, and #a(x) < #b(x) < #c(x) } 12.4 We learned that language { w#wR | w ∈ {a, b}* } is context-free. The following languages are not context-free. Prove why. L6 = {w#w | w ∈ {a, b}* }

L7 = {xyx | x, y ∈ {a, b}* }

43

Related Documents


More Documents from "Malak Kinaan"