Models For Computation: Part I

  • Uploaded by: Sam Myo Kim
  • July 2020
Chapter 1. Preliminaries

A preliminary step for an abstraction is to introduce a proper notation that symbolically represents the objects in which we are interested. In particular, we will see that formal languages rely heavily on such symbolic representations, so this chapter begins by introducing our notational conventions. To follow the discussions presented in this book, we need the capacity for logical reasoning, which will in turn be strengthened through the course. The book often leaves out the routine and tedious details of a proof that can be completed by following one of the conventional proof techniques, such as proof by induction. This does not mean that proofs can be ignored. Thus, this chapter presents an overview of proof techniques with easy examples and helps the reader build confidence in validating and presenting his or her own proofs. Studying this chapter, the reader will see that "acquired proof-phobia" is curable.

1.1 Notation: symbols, strings, composite symbols; sets and set operations
1.2 Propositions and logical connectives
1.3 Proof techniques: proof by contrapositive, proof by cases, proof by contradiction; proof by example, proof by generalization; proof by induction, proof by the pigeonhole principle; proof by counting, the diagonalization technique
Rumination
Exercises


Preliminary

1.1 Notation In this book, with no further explanation, we shall use the following notation.

• The leading letters of the English alphabet, a, b, c, ..., will be used as symbols denoting objects.
• A string is a finite sequence of symbols, for which we shall use the trailing letters of the English alphabet, ..., u, v, w, x, y, z.
• For two strings x and y, xy denotes the concatenation of x and y.
• Epsilon (ε) denotes the null string, which, for any string x, has the property εx = xε = x.
• For a string x, |x| denotes the length of x, i.e., the number of symbols in x.
• For a string x, x^0 = ε, x^(i+1) = x x^i, and x^R denotes the reverse of the string x. Example: (ab)^3 = ababab, (ab)^0 = ε, (abcd)^R = dcba.
• We shall use capital letters to denote sets.
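These conventions map directly onto Python's string operations, with the empty string playing the role of ε. The following is an informal check of the examples above, not part of the book:

```python
# Illustrating the notation: concatenation xy, the null string epsilon,
# length |x|, repetition x^i, and reversal x^R.
x, y = "ab", "cd"
epsilon = ""                             # the null string (epsilon)

assert x + y == "abcd"                   # xy: concatenation
assert epsilon + x == x + epsilon == x   # epsilon x = x epsilon = x
assert len(x) == 2                       # |x| = 2
assert x * 3 == "ababab"                 # (ab)^3 = ababab
assert x * 0 == epsilon                  # (ab)^0 = epsilon
assert "abcd"[::-1] == "dcba"            # (abcd)^R = dcba
```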


• Composite symbols: To show the constituent entities that a symbol represents, composite symbols are used in addition to the conventional superscript, subscript, and primed notation. A composite symbol is a finite list of symbols in brackets, as follows: [a, b], [1, A, 5], [a, q_i, (c, d)]. Notice that we can always substitute a single symbol for a composite symbol, if needed.

• Set specification: There are two ways to specify a set in curly brackets: either by explicitly listing all the members or by describing the set's properties.
(a) Explicit. Examples: A = {2, 8}, Σ = {a, b, c}. In particular, the empty (or null) set is denoted by Φ.
(b) By set properties, in the form {x | properties of x}. Examples:
A = {n | n is an even number} = {2, 4, 6, ...},
B = {x x^R | x is a binary string} = {00, 11, 0110, 1001, 1111, ...},
C = {a^n b^n c^n | n ≥ 0} = {ε, abc, aabbcc, aaabbbccc, ...}


Set operations

• Operations on sets of strings: Let A = {aa, ab, abc} and B = {aa, bc}.
(a) Union (∪): A ∪ B = {aa, ab, abc, bc}, A ∪ Φ = A
(b) Intersection (∩): A ∩ B = {aa}, A ∩ Φ = Φ
(c) Subtraction (−): A − B = {ab, abc}
(d) Exclusive or (⊕): A ⊕ B = {ab, abc, bc}
(e) Product: AB = {aaaa, aabc, abaa, abbc, abcaa, abcbc}, AΦ = ΦA = Φ
(f) Complement: Let U be the universal set, i.e., the set of all possible elements. Then the complement of a set A, denoted by Ā, is Ā = U − A.
(g) De Morgan's laws: the complement of A ∪ B is Ā ∩ B̄, and the complement of A ∩ B is Ā ∪ B̄.


(h) Closure (star, or Kleene closure): For a set A, define A^0 = {ε}, and for i ≥ 1, A^i = A A^(i−1). The closure of the set A is defined as follows:

A* = ∪_{i=0}^{∞} A^i = A^0 ∪ A^1 ∪ A^2 ∪ ...

The closure A* is the set of all strings that we can get by picking an arbitrary number of strings in A (the same string may be picked an arbitrary number of times) and concatenating them in any order. For example, for the binary alphabet Σ = {0, 1}, the closure Σ* = {0, 1}* is the set of all binary strings, including the null string ε. If Σ is the English alphabet, Σ* is the set containing ε and all strings over Σ. Every closure contains the null string ε. Kleene plus, A+, denotes the closure without ε, i.e., A+ = A* − {ε}.

(i) For a set A, the power set, denoted by 2^A, is the set of all subsets of A. Example: Let A = {a, b, c}. Then 2^A = {Φ, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}}.


1.2 Propositions and Logical Connectives

A proposition is a declarative statement that is either true or false. By combining propositions with the logical connectives and, or, and not, denoted symbolically by ∧, ∨, and ¬, respectively, we can form another proposition. Let p and q be two propositions. The three compound propositions p ∧ q, p ∨ q, and ¬p have the truth values defined by the truth tables shown below.

p  q | p ∧ q      p  q | p ∨ q      p | ¬p
T  T |   T        T  T |   T        T |  F
T  F |   F        T  F |   T        F |  T
F  T |   F        F  T |   T
F  F |   F        F  F |   F
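These tables can be reproduced mechanically by enumerating all truth-value assignments, here in Python as an informal aside:

```python
from itertools import product

# Reproducing the truth tables for AND, OR, and NOT by enumerating
# every assignment of truth values to p and q.
rows = [(p, q, p and q, p or q) for p, q in product([True, False], repeat=2)]
assert rows == [(True,  True,  True,  True),
                (True,  False, False, True),
                (False, True,  False, True),
                (False, False, False, False)]

# Negation has only two rows.
assert [(p, not p) for p in (True, False)] == [(True, False), (False, True)]
```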


Another commonly used logical connective is the conditional (or if-then), denoted by →. The truth table for the proposition p → q (read "if p, then q") is shown below. Notice that the truth tables for p → q and ¬p ∨ q are identical; thus p → q is a short notation for the proposition ¬p ∨ q. Notice also that although we use the term conditional, implying an antecedent-consequent relation, this connective does not necessarily carry such relational meaning.

p  q | p → q | ¬p ∨ q
T  T |   T   |   T
T  F |   F   |   F
F  T |   T   |   T
F  F |   T   |   T

For example, let p be "Elephants fly," and q be "The earth is flat." By the definition, the compound statement p → q is (vacuously) true, with no regard to the truth value of statement q, and without appeal to their meanings.


There are several ways of reading the conditional statement p → q; among others are the following. (Notice that in (c) the order of p and q is reversed.)

(a) if p, then q
(b) p only if q
(c) q if p
(d) if p holds, q follows

The biconditional connective of two propositions p and q, denoted p ↔ q, is a short notation for the compound statement (p → q) ∧ (q → p). Often this proposition is written "p iff q" and read "p if and only if q" or "if p then q, and conversely." We usually prove p ↔ q by proving each of the two parts: p → q (called the "only if" part) and q → p (called the "if" part).


Often we come across a problem that asks us to justify a statement of the form "p implies q" (denoted p ⇒ q), meaning that q is true in every case in which p is true. Notice that the statement p ⇒ q asserts, in particular, that whenever p is true we are in the first row of the truth table of p → q (p true, q true, p → q true). In other words, if p → q is true whenever p is true, we say p implies q. The implication p ⇒ q is not a proposition, which would have a truth value, but an assertion about the relationship between two propositions. If p ⇒ q and q ⇒ p, we write p ⇔ q and say that p and q are equivalent.

Consider the truth tables of p → q, ¬p ∨ q, and ¬q → ¬p, which are all identical, as shown below. Clearly, we have (p → q) ≡ (¬p ∨ q) ≡ (¬q → ¬p) for every pair of truth values of p and q. Since ¬p ∨ q and ¬q → ¬p are true in every case in which p → q is true, and vice versa, we can also say (p → q) ⇔ (¬p ∨ q) ⇔ (¬q → ¬p).

p  q | p → q | ¬p ∨ q | ¬q → ¬p
T  T |   T   |   T    |    T
T  F |   F   |   F    |    F
F  T |   T   |   T    |    T
F  F |   T   |   T    |    T
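The three-way equivalence above is a finite claim over four assignments, so it can be verified by brute force. A small sketch:

```python
from itertools import product

# Exhaustively verifying (p -> q) == ((not p) or q) == ((not q) -> (not p)).
def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q

for p, q in product([True, False], repeat=2):
    # contrapositive: (not q) -> (not p)
    assert implies(p, q) == ((not p) or q) == implies(not q, not p)
```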


1.3 Proof Techniques

A proof is a convincing argument that a statement is true. Therefore, the statement to be proven must be a proposition, which can be either true or false. Usually it involves a statement of the form p → q. In this case the proof must show that p → q is true. Since p → q is vacuously true if p is false, the proof is only concerned with the cases in which p is true; in other words, the proof justifies the implication p ⇒ q. A theorem is a statement that has been proven true. However, we do not call every proven statement a theorem; we reserve the word theorem for important statements of interest. Occasionally we prove statements for use in the proofs of other, more important statements; we call such statements lemmas. Sometimes, having proven a theorem, we can easily show that a further statement is true; such statements are called corollaries of the theorem. Now we are ready for a quick overview of some important proof techniques.


1. Proof by contrapositive: To prove p ⇒ q, we use the following equivalence:

(p → q) ≡ (¬q → ¬p)

Example: Let p and q be, respectively, the statements "it rained" and "the lawn is wet." Then instead of proving that p → q (i.e., "if it rained, then the lawn is wet") is true, we can prove that ¬q → ¬p (i.e., "if the lawn is not wet, it did not rain") is true.

2. Proof by cases: When p is a compound statement, we prove p ⇒ q by considering each component of p in turn. Example: Suppose that p = p1 ∨ p2 ∨ p3. Using the equivalence

((p1 ∨ p2 ∨ p3) → q) ≡ ((p1 → q) ∧ (p2 → q) ∧ (p3 → q)),

we prove p ⇒ q by justifying each of p1 ⇒ q, p2 ⇒ q, and p3 ⇒ q.


3. Proof by contradiction: This technique shows that if p ⇒ q does not hold, i.e., if the statement p → q is false, then some statement that we know to be true (possibly p itself) is contradicted. Recall that the implication p ⇒ q presupposes that p is true. So, if we can show that p is false from the supposition that p → q is false, we also have a contradiction.

Example: Suppose that the statement "everyone who studies automata and formal languages gets a good job" is a proven fact. Let p be the statement "Tom failed to get a good job," and let q be the statement "Tom did not study automata and formal languages." To justify the statement p ⇒ q (i.e., that p → q is true), we suppose that p → q is false and show that this contradicts the above fact. Recall that (p → q) ≡ (¬p ∨ q). If p → q is false, ¬p ∨ q must be false. It follows that both ¬p and q are false, i.e., p ∧ ¬q is true, implying that Tom failed to get a good job and yet he studied automata and formal languages. This contradicts the proven fact that everybody who studies automata and formal languages gets a good job.

Proof by contradiction is quite popular in computer science. We will see more interesting examples in other chapters.


4. Dealing with quantified statements: Occasionally a statement contains variables bound by a phrase "for all," "for every," or "there exists," in the following forms:

(1) "for all (or for every) x, ..." (denoted ∀x ...)
(2) "there exists (or there is) x ..." (denoted ∃x ...)

The phrases "for all" and "there exists" are called, respectively, the universal quantifier and the existential quantifier.

(a) Proof by example: Let ∃x [p(x)] denote a statement with existential quantification bound to the variable x. To prove such a statement true, it is enough to show an example. Example: We can justify the statement "Among the students who took this course, there exists a student who got a grade of A" by exhibiting a student, say Mary, who got an A in the course.


(b) Proof by generalization: We apply this technique to justify a statement of the form ∀x [p(x)]. If the domain of the variable x is small, we may examine the statement for each value of x. Otherwise, we need to come up with a generalization, possibly using other proven facts. Example: Suppose that we want to justify the statement "all the students who take the language course are from abroad." If possible, it is a good strategy to find some information, such as the curriculum of the school, showing that only foreign students are admitted to the course, for example, to learn English as their second language.

Notice that to disprove a statement of the form ∀x [p(x)], you may use the same strategy as for proving a statement of the form ∃x [p(x)]: exhibit a single example against the statement, called a counterexample. Conversely, to disprove a statement of the form ∃x [p(x)], you must argue by generalization, as for proving a statement of the form ∀x [p(x)].


5. Proof by induction: This is a quite effective proof technique for justifying a statement that claims a generalization over all integers n ≥ n0, i.e., a statement of the form ∀n ≥ n0 [p(n)]. The statement may explicitly involve the natural numbers, as in the following example, or the involvement may be implicit (as we shall see in other examples).

∀n ≥ 0 [ Σ_{i=0}^{n} i = n(n+1)/2 ]

The proof is carried out in three steps: (1) basis, (2) induction hypothesis, and (3) induction. In step (1), we show that p(n0) is true for the lower bound n0. In step (2), we suppose that p(m) is true for every m with n0 ≤ m < n, and finally in step (3) we show that p(n) is true.

Break Time

Applicant bloopers. Interviewer: "Do you think you can handle a variety of tasks?" Applicant: "I should say so. I've had nine totally different jobs in the past five months."

The stern-faced personnel officer told an applicant that they needed an individual who is totally responsible. "I sure qualify then," replied the applicant. "Everywhere I've worked, whenever something went wrong, I was responsible." - Jim -


Example: Prove that ∀n ≥ 0 [ Σ_{i=0}^{n} i = n(n+1)/2 ].

Proof. (1) Basis (n = 0): We show that for n = 0 the equality Σ_{i=0}^{n} i = n(n+1)/2 holds. The left side of the equation is Σ_{i=0}^{0} i = 0, and the right side is 0(0+1)/2 = 0. Obviously, the statement is true.

(2) Induction hypothesis: Suppose that for every m < n, Σ_{i=0}^{m} i = m(m+1)/2.

(3) Induction: We show that for n = m + 1, the equality Σ_{i=0}^{n} i = n(n+1)/2 holds. The left side of the equation is Σ_{i=0}^{n} i = Σ_{i=0}^{m+1} i = Σ_{i=0}^{m} i + (m+1). Since by the induction hypothesis Σ_{i=0}^{m} i = m(m+1)/2, we get

Σ_{i=0}^{m} i + (m+1) = m(m+1)/2 + (m+1) = (m+1)(m+2)/2 = n(n+1)/2.

In the rumination section at the end of this chapter, we discuss this proof technique further with another example.
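A program cannot replace the induction, but it can spot-check the identity for many values of n. An informal sketch:

```python
# A finite spot-check of the identity sum_{i=0}^{n} i = n(n+1)/2.
# Checking finitely many cases is evidence, not a proof; the
# induction argument covers all n >= 0.
for n in range(1000):
    assert sum(range(n + 1)) == n * (n + 1) // 2
```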


6. Constructive and non-constructive proof: A constructive proof is another name for proof by example, used for proving a statement of the form ∃x [p(x)]: the technique exhibits a specific value of x that makes p(x) true. A non-constructive proof shows indirectly, using other facts, that such a value of x must exist.

Example: Prove that ∃n [ n is a prime number and n > 2 ].
Constructive proof: 5 is a prime number greater than 2.
Non-constructive proof: It is proven that every natural number greater than 1 is a product of prime numbers. If there were no prime number greater than 2, then every number would have to equal 2^i for some i, because 2 would be the only prime. However, there are natural numbers (e.g., 5) that are not a power of 2. This contradicts the proven fact that every natural number greater than 1 is a product of prime numbers.


7. Proof by the pigeonhole principle: The pigeonhole principle says that if m letters are distributed among n < m pigeonholes, there must be at least one pigeonhole that contains more than one letter.

(Figure: m letters being distributed into pigeonholes 1, 2, ..., n.)

The pigeonhole principle appears intuitively obvious, but an intuition by itself cannot be a proof; we need a sound logical argument. This principle can be proved by induction, as we will show in the rumination section at the end of this chapter.


Application example: Using the pigeonhole principle, show that in a directed graph with n nodes, any path of length greater than or equal to n involves a cycle. Proof: Let n be the number of nodes of the graph. We know that on any path of the graph, the number of nodes visited is greater than the path length, i.e., the number of edges. (Notice that we count the starting node of a path as visited.)

Let the nodes of the graph correspond to pigeonholes, and let visiting a node correspond to putting a letter in it. If the path length is greater than or equal to n, then more than n node visits occur. In terms of pigeonholes, this means we are putting m (> n) letters into the n pigeonholes. By the principle, some pigeonhole (i.e., some node) contains more than one letter (i.e., is visited more than once). This implies that the path contains a cycle.
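The pigeonhole argument above can be executed directly: a path over n nodes whose length is at least n makes more than n node visits, so some node repeats, which is exactly a cycle. The graph and path below are a toy example of our own, not from the book:

```python
# Pigeonhole in action on a path in a directed graph.
def repeats_a_node(path):
    """True if some node occurs more than once on the path,
    i.e., the path contains a cycle."""
    return len(path) != len(set(path))

n = 3                            # nodes {0, 1, 2} play the pigeonholes
path = [0, 1, 2, 1]              # length 3 >= n, so 4 visits (the letters)
assert repeats_a_node(path)      # node 1 is visited twice: a cycle
```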


8. Proof by counting: Many theorems concern the number of entities that satisfy a certain property. One way to prove such a theorem is to figure out that number, or a bound on it. (The technique does not necessarily count exactly; the name is somewhat misleading.)

Example: Prove that ∀n > 0 [it is impossible to put all the binary numbers of length n in the rows and columns of an n × n bit (0/1) matrix].

Proof: We prove the statement in two cases: (1) n > 2, and (2) n = 1 or 2.
Case (1), n > 2: There are 2^n binary numbers of length n, and the total number of rows and columns is 2n. Since 2^n > 2n for every n > 2, the statement is true.
Case (2), n = 1 or 2: We leave the proof as an exercise for the reader.


9. Diagonalization technique: Given a set of lists, this technique shows how to construct a list that is not in the set. This is a powerful tool for proving certain properties of formal languages presented in later chapters.

Example: Given an n × n bit matrix M, for some n > 0, construct a binary number of length n that does not appear in any row or column of M.

Answer: Let m_ii be the entry at the i-th row and i-th column of M, and let m̄_ii denote its complement. Construct the binary number v = m̄_11 m̄_22 ... m̄_nn.

Clearly, v does not appear in any row or column of M: for any i, the number appearing in the i-th row (or the i-th column) has m_ii as its i-th bit, while the i-th bit of v is the complement of m_ii.
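The diagonal construction above is short enough to program: flip the i-th bit of row i and check the result against every row and column. An informal sketch with a small example matrix of our own:

```python
# The diagonal construction: v differs from row i (and column i)
# at position i, so it appears nowhere in the matrix.
def diagonal_complement(M):
    """Given a square bit matrix as a list of equal-length '0'/'1'
    strings, return the complemented diagonal as a string."""
    return "".join("0" if M[i][i] == "1" else "1" for i in range(len(M)))

M = ["110",
     "101",
     "011"]
v = diagonal_complement(M)
assert v == "010"
assert all(v != row for row in M)                       # not a row
cols = ["".join(M[i][j] for i in range(3)) for j in range(3)]
assert all(v != col for col in cols)                    # not a column
```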


Rumination (1): Conditional connective and implication

Note that the truth table of the conditional connective (if-then) does not express an antecedent-consequent (or cause-effect) relationship; it is a logical definition. It is, however, a very useful template for deducing one proposition (the consequent) from another (the antecedent), but only if they are related. Otherwise we end up with nonsense. For example, let p be the statement "I watch TV" and q be "Our soccer team wins." By simple manipulations, we get the following:

(¬p ∨ p) ∨ (¬q ∨ q) ≡ (¬p ∨ q) ∨ (¬q ∨ p)

Since (¬p ∨ p) ∨ (¬q ∨ q) is true, either (¬p ∨ q) or (¬q ∨ p) must be true, i.e., (p → q) is true or (q → p) is true. A verbal interpretation of this last symbolic statement would be as follows: "It is true that if I watch TV, then our soccer team wins, or if our soccer team wins, then I watch TV."

p  q | p → q
T  T |   T
T  F |   F
F  T |   T
F  F |   T


Rumination (2): Proof by induction

Proof by induction is a powerful technique for proving that a statement is true for all integers n ≥ n0, i.e., a proposition with universal quantification of the form ∀n ≥ n0 [p(n)]. The statement to prove may clearly show the lower bound n0 and the integer variable n, as in the mathematical equation presented in this chapter as an example. However, occasionally we come across statements that do not explicitly describe those numbers; we should figure them out before applying the technique.

When the statement is presented as a mathematical formula, this proof technique works so routinely that we may fail to comprehend the underlying logic. To understand why the technique works, consider, as an analogy, how we walk up a stairway. To reach the n-th stair, (1) we must be able to get to the bottom (the 0-th stair) of the stairway, and (2) whenever we are on the i-th stair, for any i (0 ≤ i < n), we must be able to move up to the (i+1)-th stair.

Proof by induction works the same way. In the basis step, (1) we show that the statement is true for the lower bound n0 of the quantified variable. In the induction hypothesis, (2) we suppose that the statement is true for all m (n0 ≤ m < n), and in the induction step, (3) we show that the statement is then true for n. Notice that because the statement is true for n0 (by the basis step), it is also true for n0 + 1 by the induction step; since it is true for n0 + 1, it is also true for n0 + 2, and so on. We can therefore claim that the statement is true for all n ≥ n0. (Sometimes the proof is described in two steps, with the hypothesis included in the induction step.) The next slide shows how "walking up a stairway" and proof by induction correspond.


Rumination: Proof by induction, illustrated

Prove that ∀n ≥ 0 [ Σ_{i=0}^{n} i = n(n+1)/2 ].

(1) Basis (n = 0): Σ_{i=0}^{0} i = 0 and n(n+1)/2 = 0(0+1)/2 = 0, so the statement is true for n = 0.

(2) Hypothesis: For every m < n, suppose that Σ_{i=0}^{m} i = m(m+1)/2.

(3) Induction step: For n = m + 1, show that Σ_{i=0}^{n} i = n(n+1)/2:

0 + 1 + 2 + ... + m + (m+1) = m(m+1)/2 + (m+1) = (m+1)(m+2)/2 = n(n+1)/2

(Figure: the three steps drawn as a stairway, with the basis at the bottom stair and the induction step climbing from stair m to stair m + 1.)

Rumination: Proof by induction applied to the pigeonhole principle

As another application of the proof-by-induction technique, here we justify the pigeonhole principle, repeated below.

Pigeonhole principle: Suppose that there are n pigeonholes and m (> n) letters. If we distribute all the letters (in any way) into the pigeonholes, there will be at least one pigeonhole that contains more than one letter.

Proof: (1) Basis: The pigeonhole principle makes sense when there is at least one pigeonhole; hence the lower bound is n0 = 1. In this case, if we put all the m > 1 letters into the single pigeonhole, that pigeonhole clearly contains more than one letter. The principle is true.

(2) Hypothesis: Suppose that for every number of pigeonholes n′ < n, the principle is true.

(3) Induction: We prove that the principle is true for n′ + 1 pigeonholes and m (> n′ + 1) letters. We examine two cases, depending on the number of letters contained in one particular pigeonhole, say hole n′ + 1.

(a) When pigeonhole n′ + 1 has one letter or none: the remaining n′ pigeonholes hold at least m − 1 > n′ letters. By the induction hypothesis, among these n′ pigeonholes there is at least one that contains more than one letter. The principle is true.

(b) When pigeonhole n′ + 1 has two or more letters: obviously this pigeonhole has more than one letter. Again, the principle is true.


Exercises

1.1 What does each of the following expressions denote? If your answer is a finite set, show all the members. If it is infinite, use set property notation.
(a) a^5
(b) |ab a^7 b|
(c) |ε|
(d) x^R, where x = abab
(e) AB, where A = {a, b, c}, B = {aaaa, bbbb, cccc}
(f) A*BA*, where A and B are the sets shown in part (e) above.

1.2 Among the strings given in (1) – (7) below, identify all the members that belong to each of the following sets.
(a) {x y x^R | x ∈ {a, b, c}*, y = ccc}
(b) {xx | x ∈ {a, b, c}*}
(c) {x | x ∈ {a, b, c}* and in x the number of a's is strictly greater than the number of b's} ∩ {a^i b^j | i, j > 0}
(d) ({a, b, c}* − ({a^i b^j | i > j > 0} ∪ {a^i b^j | 0 < i < j})) ∩ {a^i b^j | i, j > 0}

(1) aaaabbbb  (2) aaaa  (3) aaaacccaaaa  (4) bbbaaaa  (5) abcccccba  (6) aaaaab  (7) abaaba

1.3 For the sets of strings given below, show the result of each of the following set operations. Write your answer in set property notation, as simply as possible.

L0 = {c^i b^j a^k | i, j, k > 0, j = k}    L1 = {c^i b^j a^k | i, j, k ≥ 0, i = j}
L2 = {a, b, c}*                            L3 = {x x^R | x ∈ {a, b, c}*}
L4 = {c^i b^j a^k | i, j, k ≥ 0}           L5 = {c^i b^j c^k | i, j, k ≥ 0}

(a) L0 ∪ L4   (b) L0 ∩ L1   (c) L3 ∩ L5   (d) (L2 − L1) ∩ L4   (e) L0 L5


1.4 Show that the following statement is true for any pair of integers a and b: a = b if and only if a ≥ b and a ≤ b.

1.5 Prove that for all positive integers n ≥ 2, it is impossible to put all binary numbers of length n in the n rows and n columns of an n × n bit matrix.

1.6 Let G be a directed graph with n ≥ 2 nodes, each node labeled with either 1 or 2. Let P12 be a path in G which starts at a node labeled 1 and ends at a node labeled 2. An edge is said to be "good" if it links two nodes with different labels (i.e., 1-2 or 2-1). For example, a path visiting nodes labeled 1 2 2 1 2 1 2 2 1 2 has 7 good edges. Prove that in any path P12 the number of good edges is always odd.

1.7 Let's change the definition of good edges (in problem 1.6) the other way around, i.e., an edge is "good" if it links two nodes with the same label. Can you claim the same property about the number of good edges? Justify your answer.


1.8 Using the pigeonhole principle, we proved that the following statement is true: [In a directed graph with n nodes, any path of length greater than or equal to n involves a cycle.]

Here, we want to justify the statement using the proof-by-induction technique. Answer each of the following questions.
(1) What is the basis of the statement? In particular, identify the lower bound n0, and show that the statement is true for it.
(2) Write the induction hypothesis.
(3) For the argument in the induction step, you may use the figures below, each showing a typical case of a path whose length is greater than or equal to n. Figures (a) – (c) are the cases where there is a node which the path does not go through, and figure (d) is the case where the path goes through every node. Complete this step by answering the following questions.
(i) How can you argue that in cases (a), (b) and (c) the path involves a cycle? (Hint: use the hypothesis.)
(ii) Why does the path involve a cycle in case (d)?

(a)

(b)

(c)

(d)


Models of Language Generation: Grammars


Chapter 2. Formal Languages and Grammars

In Chapter 1 we learned how to represent a set: if the set is small, we write its members out; otherwise, we describe it by a set property representing the characteristics of its elements. This approach is useful for sets whose members (i.e., strings) have properties that are simple to describe. For formal languages in real applications, like programming languages, it is impractical to represent a language in terms of set properties, because in general the properties are too complex to identify and describe succinctly. Hence the major part of the book is dedicated to languages represented by a set of rewriting rules, called a grammar, which derives the language. In this chapter we will learn how to derive a language with a given set of rewriting rules and, conversely, how to find a set of rewriting rules that derives a given language. We begin with a couple of rewriting rules to see how they derive a language.

2.1 Languages
2.2 Deriving a language with rewriting rules: examples
2.3 Definitions: formal languages and grammars — Type 0 (phrase-structured) grammars; Type 1 (context-sensitive) grammars; Type 2 (context-free) grammars; Type 3 (regular) grammars
2.4 Grammars and their languages: examples
Rumination
Exercises


Languages and Grammars

2.1 Languages

Webster's dictionary defines the word language as follows. (1) The words, their pronunciation, and the methods of combining them used and understood by a community. (2) A system of signs and symbols and rules for using them that is used to carry information.

In the above definitions we see two common factors: (a) language elements, such as words, their pronunciation, and signs and symbols, and (b) methods of combining (or rules for using) them. In natural languages the elements and the rules for using them evolved; they were not defined. Each sentence is constructed from some of the language elements combined according to the rules of the language. Since the language elements and the rules for using them give rise to all the sentences, we can simply define a natural language as a set of sentences. In formal languages we usually deal with strings of symbols (not pronunciations) and rules for constructing the strings, which correspond to the sentences of a natural language. Hence, in formal languages, we simply define a language as a set of strings of symbols.


According to the above definition, the following sets are languages. Notice that set A is finite and the others are infinite. Set D is the C++ programming language.

A = {aaba, bba, aaabbb}
B = {x x^R | x ∈ {0, 1}+} = {00, 11, 0110, 1001, 1111, ...}
C = {a^n b^n c^n | n ≥ 0} = {ε, abc, aabbcc, aaabbbccc, ...}
D = {x | x is a C++ program}

Break Time: Love

Never pretend to a love which you do not actually feel, for love is not ours to command. - Alan Watts -
We can only learn to love by loving. - Iris Murdoch -
Love is the triumph of imagination over intelligence. - H. L. Mencken -
There is no remedy for love but to love more. - H. D. Thoreau -


2.2 Deriving a language with rules

The following example shows a set of rewriting rules named spring-grammar.

spring-grammar:
<spring-sentence> → <spring-subject> <spring-verb>
<spring-subject> → birds | flowers | buds
<spring-verb> → sing | bloom | fly

A rewriting rule (or simply rule) has two sides, a left side and a right side, separated by an arrow. When there is more than one alternative on the right side, we separate the alternatives with a vertical bar (|). In the above example, every rule has one element on its left side; there can be more, as we shall see later. We will call each rule by the element on its left side. The elements delimited by a pair of angle brackets < ... > are used only during the generation of a "spring sentence," which consists of a "spring subject" (birds, flowers, or buds) followed by a "spring verb" (sing, bloom, or fly).

Let's see how we can derive a "spring sentence" with the above set of rules. We begin with a designated element, in this example <spring-sentence>, and rewrite it with one of the right-side alternatives of its rule. The choice is arbitrary if there is more than one alternative; here we have no choice except <spring-subject> <spring-verb>. So we get the following.

<spring-sentence> ⇒ <spring-subject> <spring-verb>

Deriving a language with rules

Languages and Grammars

spring-grammar: <spring-sentence>  <spring-subject> <spring-verb> <spring-subject>  birds | flowers | buds <spring-verb>  sing | bloom | fly We call the result of a rewriting step a sentential form. From the current sentential form, we pick an element that appears in the left side of a rule and rewrite it with one of its right side elements. We keep this rewriting step until we get a sentential form with no rule applicable as shown below.

<spring-sentence> ⇒ <spring-subject> <spring-verb>
                  ⇒ birds <spring-verb>
                  ⇒ birds sing

The last sentential form ("birds sing" in the example) is a member (a string, or sentence) of the language generated by the set of rules.


In the above derivation, if we had rewritten <spring-subject> with flowers instead of birds, we would have gotten the spring sentence "flowers sing." It follows that the rules of the spring-grammar can derive all 9 sentences that can be formed with the three subjects and the three verbs. Denoting this set of sentences (i.e., the language) by L(spring-grammar), we get

L(spring-grammar) = { birds sing, birds bloom, birds fly, flowers sing, flowers bloom, flowers fly, buds sing, buds bloom, buds fly }


If we are not concerned with the meaning of the sentences (i.e., strings), we can simplify the grammar by substituting a single symbol for each element, as shown below. (Studying formal languages, we are mainly concerned with the structural complexity of languages, not their meaning.) In this simplification we use lower-case letters for the words that appear in the language and upper-case letters for the others. In particular, we use S to denote the starting element for the derivation, and # to denote the blank between <spring-subject> and <spring-verb>.

  <spring-sentence> → S    <spring-subject> → A    <spring-verb> → B
  birds → a    flowers → b    buds → c    sing → d    bloom → e    fly → f

G:  S → A#B
    A → a | b | c
    B → d | e | f

L(G) = {a#d, a#e, a#f, b#d, b#e, b#f, c#d, c#e, c#f}
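The rewriting process for the simplified grammar G can be sketched mechanically. The following Python sketch is mine, not from the book; `derivations` is a hypothetical helper that replaces the leftmost nonterminal (an upper-case letter, by the convention above) with each of its alternatives, yielding every terminal string derivable from S.

```python
# Sketch (not from the book): exhaustively derive the language of
# G: S -> A#B, A -> a|b|c, B -> d|e|f by leftmost rewriting.
RULES = {"S": ["A#B"], "A": ["a", "b", "c"], "B": ["d", "e", "f"]}

def derivations(form="S"):
    """Yield every terminal string derivable from `form` (finite for G)."""
    i = next((k for k, c in enumerate(form) if c.isupper()), None)
    if i is None:              # no nonterminal left: a member of the language
        yield form
        return
    for rhs in RULES[form[i]]:
        yield from derivations(form[:i] + rhs + form[i + 1:])

print(sorted(derivations()))
# ['a#d', 'a#e', 'a#f', 'b#d', 'b#e', 'b#f', 'c#d', 'c#e', 'c#f']
```

The nine strings printed are exactly L(G) above.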


Conventionally, we use lower-case letters and special characters to denote the words in the language, and upper-case letters only for the elements used in generating the language. We call the lower-case letters terminal symbols and the upper-case letters nonterminal symbols. In our example (repeated below), a through f and # are terminal symbols, and S, A and B are nonterminal symbols. We call the set of rewriting rules a grammar of the language.

G:  S → A#B
    A → a | b | c
    B → d | e | f

L(G) = { a#d, a#e, a#f, b#d, b#e, b#f, c#d, c#e, c#f }

Break Time: Life
- Life is a long lesson in humility. - J. M. Barrie -
- Life is far too important a thing ever to talk seriously about. - Oscar Wilde -
- Life is like playing a violin in public and learning the instrument as one goes on. - Samuel Butler -
- Life is a foreign language; all men mispronounce it. - Christopher Morley -


The size of the language of the above grammar is finite. The following grammar shows how it is possible to generate an infinite language.

Example 1.  G1:  S → dingdongA
                 A → dengS | deng

S ⇒ dingdongA ⇒ dingdongdengS ⇒ dingdongdengdingdongA ⇒ dingdongdengdingdongdeng

In the above derivation, rule A → dengS allows us to derive a string containing "dingdongdeng" repeated an arbitrary number of times. To terminate the derivation we must finally apply rule A → deng.


We can see that the language of G1 is

L(G1) = { (dingdongdeng)^i | i ≥ 1 }

Here are more examples.

Example 2.  G2:  S → aaS | aa

S ⇒ aa
S ⇒ aaS ⇒ aaaa
S ⇒ aaS ⇒ aaaaS ⇒ aaaaaa
. . . . . . . . . .

L(G2) = { a^(2i) | i > 0 }


Example 3.  G3:  S → aSb | ab

S ⇒ ab
S ⇒ aSb ⇒ aabb
S ⇒ aSb ⇒ aaSbb ⇒ aaabbb
. . . . ⇒ aaaabbbb, aaaaabbbbb, . . . .

L(G3) = { a^i b^i | i ≥ 1 }
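The derivations of G3 are regular enough to carry out directly. The sketch below is mine, not from the book; `derive_g3` is a hypothetical helper that applies S → aSb a chosen number of times and then finishes with S → ab, reproducing the pattern a^i b^i shown above.

```python
# Sketch (not from the book): derive a^i b^i with G3: S -> aSb | ab.
def derive_g3(i):
    """Apply S -> aSb (i-1) times, then S -> ab; yields a^i b^i for i >= 1."""
    s = "S"
    for _ in range(i - 1):
        s = s.replace("S", "aSb")   # the recursive rule wraps one a and one b
    return s.replace("S", "ab")     # the terminating rule

print(derive_g3(3))   # aaabbb
```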


The following grammar looks challenging. The number of rules and their complexity, with more than one symbol on the left side of some rules, may seem overwhelming. However, once you understand how it derives a couple of strings, you will like it. Notice that the grammar has only one terminal symbol, a.

Example 4.  G4:  S → ACaB | a
                 Ca → aaC
                 CB → DB | E
                 aD → Da
                 AD → AC
                 aE → Ea
                 AE → ε

Using rule S → a, the grammar generates the string a, which is a member of the language. Let's see what other members it can generate, starting with rule S → ACaB.


For convenience, the grammar is repeated below with an id number assigned to each rule.

(1) S → ACaB   (2) S → a    (3) Ca → aaC   (4) CB → DB   (5) CB → E
(6) aD → Da    (7) AD → AC  (8) aE → Ea    (9) AE → ε

After applying rule (1), the only rule applicable next is rule (3). Notice that by rule (3), C keeps jumping rightward over an a, adding one more a, until it meets B, where we can apply either rule (4) or rule (5). If we apply rule (4), then by rule (6) D moves left until it meets A, where rule (7) converts it back to C, as follows. (We will shortly see what happens if we apply rule (5) instead.)

S (1)⇒ ACaB (3)⇒ AaaCB (4)⇒ AaaDB (6)⇒ AaDaB (6)⇒ ADaaB (7)⇒ ACaaB

Again, by applying rule (3) repeatedly until C meets B, we can double the number of a's.

ACa . . . aB (3)⇒ AaaCa . . . aB (3)⇒ . . . (3)⇒ Aaa . . . aaCB

If we want to derive a terminal string, i.e., a member of the language, we should apply rule (5) (CB → E) and then, by applying rule (8) repeatedly, bring E toward the left end until it meets A. Finally, applying rule (9), we get a string of the terminal symbol a.

Aaa . . . aaCB (5)⇒ Aaa . . . aaE (8)⇒ Aaa . . . aEa (8)⇒ . . . (8)⇒ AEaa . . . aa (9)⇒ aa . . . aa

If instead we want to double the number of a's once more in the sentential form Aaa . . . aaCB, we apply rule (4), bring D toward the left end using rule (6) until it meets A, and then change D to C by rule (7), as follows.

Aaa . . . aaCB (4)⇒ Aaa . . . aaDB (6)⇒ Aaa . . . aDaB (6)⇒ . . . (6)⇒ ADaa . . . aaB (7)⇒ ACaa . . . aaB

Now, by rule (3) again, we are ready to double the number of a's in this last sentential form.

Now we are ready to figure out the language of the grammar. By rule (2) (S → a) the string a can be derived. By applying rules (1), (3), (5), (8), (8) and (9) in this order, we can derive the string aa as follows.

S (1)⇒ ACaB (3)⇒ AaaCB (5)⇒ AaaE (8)⇒ AaEa (8)⇒ AEaa (9)⇒ aa

We know that if, instead of rule (5) in the above derivation, we apply rule (4) followed by rules (6), (6) and (7), we get the sentential form ACaaB, from which we are ready to derive the string aaaa.


G4:  (1) S → ACaB   (2) S → a    (3) Ca → aaC   (4) CB → DB   (5) CB → E
     (6) aD → Da    (7) AD → AC  (8) aE → Ea    (9) AE → ε

Now, we claim that the language of the grammar G4 is the set of strings of a whose length is some power of 2, i.e.,

L(G4) = { a^m | m = 2^n, n ≥ 0 }

We can prove this claim by induction. We have just shown that the basis is true. We leave the formal proof to the reader.
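The doubling strategy just described is mechanical enough to simulate. The sketch below is mine, not from the book; `derive_g4` and `exhaust` are hypothetical helpers that apply G4's rules with the fixed strategy above: rule (3) sweeps C rightward doubling the a's, rules (4), (6), (7) reset C for another pass, and rules (5), (8), (9) finish the derivation.

```python
# Sketch (not from the book): carry out G4's derivation strategy directly.
def exhaust(s, lhs, rhs):
    """Apply the rule lhs -> rhs repeatedly until it no longer matches."""
    while lhs in s:
        s = s.replace(lhs, rhs, 1)
    return s

def derive_g4(n):
    """Derive the terminal string a^(2^n) for n >= 1, starting from ACaB."""
    s = "ACaB"                        # rule (1): S => ACaB
    for _ in range(n - 1):            # each pass doubles the a's once more
        s = exhaust(s, "Ca", "aaC")   # (3): C sweeps right, doubling the a's
        s = s.replace("CB", "DB", 1)  # (4)
        s = exhaust(s, "aD", "Da")    # (6): D walks back to the left end
        s = s.replace("AD", "AC", 1)  # (7): ready for another pass
    s = exhaust(s, "Ca", "aaC")       # (3)
    s = s.replace("CB", "E", 1)       # (5)
    s = exhaust(s, "aE", "Ea")        # (8): E walks back to the left end
    return s.replace("AE", "", 1)     # (9): erase, leaving only a's

print([len(derive_g4(n)) for n in range(1, 6)])   # [2, 4, 8, 16, 32]
```

The lengths printed are consecutive powers of 2, matching the claim (the string a itself, via rule (2), covers n = 0).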


2.3 Definitions: Formal Languages and Grammars

To define a grammar we need the following information.

(1) The language alphabet, i.e., the set of terminal symbols that appear in the language. Conventionally, we use the lower-case English letters, together with special characters if needed.
(2) The set of nonterminal symbols (upper-case letters, by convention) used to construct the rules that derive the language.
(3) The set of rules.
(4) The start symbol (S, by convention).

In the previous examples we saw various kinds of rules. The rules in the first three examples have only one symbol on their left side; in general, the left side can be any string of finite length. We will later see that such variations may affect the characteristics of the language generated by the grammar. In this book we will study the following four types of grammars, which have been extensively investigated.


Type 0 (phrase structured) grammar: a type 0 grammar G is the quadruple

    G = < VT, VN, P, S >, where

• VT : the terminal alphabet (called morphemes by linguists). Following the convention, we use lower-case letters.
• VN : the nonterminal alphabet (also called variables or syntactic categories). Following the convention, we use upper-case letters.
• V = VT ∪ VN : the set of all symbols used in the grammar, called the total alphabet.
• S ∈ VN : the start symbol.
• P : a finite set of rules, each denoted by α → β and read "α generates β," where α ∈ V*VNV* and β ∈ V*.


Notice that in a rule α → β, the left side must be a string α ∈ V*VNV* and the right side must be a string β ∈ V*. More specifically, α is a string in V* containing at least one symbol from VN (i.e., a nonterminal symbol), and β can be any string in V*, including the null string ε.

For two strings w1 and w2, by w1 ⇒ w2 we mean that w1 derives w2 by applying one rule. We write w1 ⇒* w2 if w1 derives w2 by applying an arbitrary number (i.e., zero or more) of rules. Any string w ∈ V* that can be derived from the start symbol (i.e., S ⇒* w) is called a sentential form. The language of a grammar G, denoted by L(G), is the set of all strings of terminal symbols that can be derived by applying rules of G. That is,

L(G) = { x | x ∈ (VT)* and S ⇒* x }

All the grammars in Examples 1 – 4 are type 0. The languages of type 0 grammars are also called recursively enumerable (R.E.) languages.
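The name "recursively enumerable" hints that members of L(G) can be listed mechanically. The sketch below is mine, not from the book; `enumerate_language` is a hypothetical helper that searches sentential forms breadth-first, bounded in length so that the search terminates, following the book's convention that upper-case letters are nonterminals.

```python
# Sketch (not from the book): enumerate short members of L(G) by BFS
# over sentential forms for a small type 0 grammar.
from collections import deque

def enumerate_language(rules, start="S", max_len=8):
    """Return the members of L(G) up to length max_len (with slack for
    sentential forms that still carry nonterminals)."""
    seen, members, queue = {start}, set(), deque([start])
    while queue:
        w = queue.popleft()
        if not any(c.isupper() for c in w):    # terminal string: a member
            members.add(w)
            continue
        for lhs, rhs in rules:
            i = w.find(lhs)
            while i != -1:                     # apply the rule at every position
                nxt = w[:i] + rhs + w[i + len(lhs):]
                if len(nxt) <= max_len + 2 and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                i = w.find(lhs, i + 1)
    return sorted(m for m in members if len(m) <= max_len)

# G2 of Example 2: S -> aaS | aa, so L(G2) = { a^(2i) | i > 0 }.
print(enumerate_language([("S", "aaS"), ("S", "aa")]))
# ['aa', 'aaaa', 'aaaaaa', 'aaaaaaaa']
```

Without the length bound, such an enumeration runs forever; that is exactly the sense in which a type 0 language is enumerable but not, in general, decidable.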


The following three types of grammars (types 1, 2 and 3) are defined by adding restrictions to the rule form of type 0. Since these restrictions do not violate the type 0 form, every grammar of these restricted types is also type 0.

Type 1 (context-sensitive) grammars: grammars of this type have rules α → β, where α ∈ V*VNV* and β ∈ V*, with the restriction that |α| ≤ |β|. That is, the left side of a rule cannot be longer than its right side. The only exception is the rule S → ε, allowed under the condition that no rule in the grammar has S on its right side. (Notice that the compositions of α and β are otherwise the same as in type 0 rules.)

Type 2 (context-free) grammars: these grammars have rules α → β, where α ∈ V*VNV* and β ∈ V*, with the restriction that |α| = 1. Since α must contain at least one nonterminal symbol, this restriction implies that the left side is exactly one nonterminal symbol (i.e., an upper-case letter).


Type 3 (regular) grammars: these grammars have type 2 rules with the following additional restrictions on the right side:

• The right side of each rule can have at most one nonterminal symbol.
• The nonterminal symbol on the right side of a rule, if any, must be at the end of the string. That is, in a rule A → β it is required that β = xB or β = x, where A, B ∈ VN and x ∈ (VT)*.

For example, A → abB, B → aa | D, and D → ε are all legal type 3 rules. However, the rules A → aBa and B → aaAD cannot appear in a type 3 grammar: in the former, the nonterminal symbol B is not at the end of the right-side string, and in the latter, there is more than one nonterminal symbol.


2.4 Grammars and their languages: Examples

Here is a typical example of each of the four types of grammars and their languages, which we will often use in discussions in the following chapters.

Type 0 (phrase structured): G = < {a}, {S,A,B,C,D,E}, P, S >

P = { S → ACaB | a,  Ca → aaC,  CB → DB | E,  aD → Da,  AD → AC,  aE → Ea,  AE → ε }

L(G) = { a^m | m = 2^n, n ≥ 0 }

The above grammar is the one that we examined in Section 2.2. If you understand how this grammar generates its language, it will not be hard to see how the following type 1 grammar generates the language shown below.

Type 1 (context-sensitive): G = < {a,b,c}, {S,B,C}, P, S >

P = { S → aSBC | aBC,  CB → BC,  aB → ab,  bB → bb,  bC → bc,  cC → cc }

L(G) = { a^i b^i c^i | i ≥ 1 }


Type 2 (context-free): G = < {0,1}, {S,A,B}, P, S >

P = { S → ASB | ε,  A → 0,  B → 1 }

L(G) = { 0^i 1^i | i ≥ 0 }

Type 3 (regular): G = < {0,1}, {S,A}, P, S >

P = { S → 0S | A,  A → 1A | ε }

L(G) = { 0^i 1^j | i, j ≥ 0 }

From now on, to present a grammar we shall show only its set of rewriting (or production) rules, written according to the convention: lower-case letters for terminal symbols, upper-case letters for nonterminal symbols, and S for the start symbol.


Rumination (1): Grammars and production rules

The following remarks summarize subtle conceptual aspects of the formal grammars and their languages that we have defined. Let G = < VT, VN, P, S > be a grammar.

(a) Recall that the language L(G) is the set of terminal strings that can be generated by applying a finite sequence of production rules. There is no fixed order in which the grammar rules must be applied when a string is derived. However, depending on the order of the rules applied, we may end up with a string containing a nonterminal symbol from which no terminal (or null) string is derivable. For example, consider the following type 1 grammar with four rules.

(1) S → ABC    (2) AB → ab    (3) BC → bc    (4) bC → bc

Clearly, only rules (1), (2), (4) applied in this order will derive the terminal string abc, which is the only member of the language of this grammar. If you apply rule (1) followed by rule (3), you will be stuck with Abc, which cannot be a member of the language because the string contains a nonterminal symbol:

S (1)⇒ ABC (3)⇒ Abc ⇒ ??

(b) Rule (3) of the grammar above is useless in the sense that it does not contribute to the generation of the language. We can delete this rule from the grammar without affecting its language. In general, the decision problem of whether a rule in an arbitrary grammar is useless is unsolvable. However, if we restrict the problem to the class of context-free grammars (type 2), we can effectively clean up the useless rules, if any. We will learn such an algorithm in Chapter 9.

(c) In this chapter we defined a type 0 (or phrase structured) grammar as a grammar having production rules of the form α → β, where α ∈ V*VNV* and β ∈ V*. The left side of each rule must have at least one nonterminal symbol. In the literature we see some variations of this definition, in particular on the left side of the rules. For example, unrestricted grammars are defined as having rules with α ∈ V+, and some texts define type 0 grammars with α ∈ (VN)+. These variations are equivalent in the sense that they generate the same class of languages as the phrase structured grammars. In this text we will use the generic names and type numbers to refer to the grammars and their languages defined in this chapter.


(d) The grammars that we have defined are sequential in the sense that only one rule may be applied at a time. Notice that with the grammar below, if we apply rules (2) AB → ab and (3) BC → bc simultaneously to the string ABC derived by rule (1), we get the terminal string abbc, which is not a member of the language according to our definition. There are classes of grammars in which more than one rule can be applied simultaneously; we call such rules parallel rewriting rules. (In Chapter 3 we will see a class of grammars using parallel rewriting rules.) In general it is not easy to figure out the language of a parallel rewriting grammar.

(1) S → ABC    (2) AB → ab    (3) BC → bc    (4) bC → bc

(e) For context-free grammars we get the same language regardless of the mode of rule application, sequential or parallel. Why? The answer is left for the reader.

(f) A context-sensitive grammar is defined as a phrase structured grammar with non-contracting rules, except for S → ε under the condition that no rule has S on its right side. Because of the non-contracting rules, the sentential forms that we get along a derivation never shrink, which is a characteristic property of context-sensitive grammars. However, we need the exceptional rule S → ε, which is contracting, to include the null string in the language. If we include this rule in the grammar, we need the condition that S not appear on the right side of any rule, because otherwise (e.g., S → SSS | ε) the non-contracting property of derivations would be violated. In some texts, context-sensitive grammars are defined without this exceptional case, thus excluding ε from the language.

(g) Context-free grammars are defined as type 0 (not type 1) grammars with the restriction |α| = 1. It follows that a context-free grammar can have a contracting rule, such as A → ε (called an ε-production rule), while type 1 grammars are not allowed to have such rules except for S → ε. We will learn in Chapter 10 that all ε-production rules in a context-free grammar, if any, can be eliminated, leaving only S → ε when the null string is in the language.


(h) A context-free grammar is called linear if each rule is restricted to the following form:

    A → xBy or A → x, where A, B ∈ VN and x, y ∈ (VT)*.

Recall that each rule of a regular grammar is restricted to the form A → xB or A → x. Such grammars are also called right linear. By this definition, none of the following rules can be a regular grammar rule.

    A → bBC    A → abBa    A → Ba

(i) Following the same notion, a left linear grammar can be defined as a context-free grammar whose rules are restricted to the form A → Bx or A → x. We will show in Chapter 9 that left linear grammars and right linear grammars are equivalent in the sense that both generate the same class of regular languages.

(j) Notice that a regular grammar cannot mix right linear and left linear rules. For example, the following linear grammar, which mixes the two forms, generates the language L = { a^i b^i | i ≥ 1 }. In Chapter 12 we will show that no regular grammar can generate L.

    S → aB    B → Sb | b

(k) Formal grammars were defined in terms of the quadruple G = < VT, VN, P, S >, which lists all the information required to specify a grammar. This is a very general way of defining a system, not only in computer science but also in other fields: we commonly define a system by first listing the system's entities and then specifying each entity. We will follow the same approach when we define automata in Chapter 4.

Defining a system S:
(1) S = < E1, E2, . . ., Ek >    // a list of the system's entities
(2) Specify each of the entities E1, E2, . . ., Ek.


Exercises

2.1 For each of the grammars G1 – G4 below, answer the following questions.

(a) Which type is the grammar (phrase structured, context-sensitive, context-free, or regular)? (Recall that a grammar can belong to more than one type.)
(b) What is its language?

G1: S → aS | abcB
    B → bbB | ε

G2: S → ABC | ε
    AB → ab
    bC → bc

G3: S → AAA
    A → a | ε

G4: S → ABC
    AB → a
    BC → cd | b
    aC → c
    Ac → bc

2.2 Using the set property notation, show the language of each of the following grammars. (Review Example 4 in Section 2.2 before working out the answers for parts (g) and (h).)

(a) S → aS | bS | ε

(b) S → aS | Saa | ε

(c) S → aSa | bSb | aa | bb

(d) S → aA | bB
    A → Sa | a
    B → Sb | b

(e) S → aA | ε
    A → Sb

(f) S → aSb | A
    A → aA | a

(g) S → DG
    G → bAGc | bEc
    bA → Ab
    bE → Eb
    DA → aaD
    DE → aa

(h) S → DG
    G → bbAGcc | bbEcc
    bA → Ab
    bE → Eb
    DA → aaD
    DE → aa


2.3 For each of the following languages, show a regular (type 3) grammar that generates it.

(a) L1 = {a}*
(b) L2 = {ab, abb, abbb}
(c) L3 = { axb | x ∈ {0, 1}* }
(d) L4 = L2 ∪ L3
(e) L5 = L2L3
(f) L6 = { xy | x ∈ {0, 1}+, y ∈ {a, b}+ }
(g) L7 = { x | x is a decimal number, i.e., x ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}+ }
(h) L8 = { x | x ∈ {a, b}* and x has at least one a }

2.4 For each of the following languages, construct a context-free (type 2) grammar that generates the language, and briefly explain how your grammar generates it.

(a) L1 = { a^i b^j | i > j > 0 }
(b) L2 = { a^i b^j | i ≥ j > 0 }
(c) L3 = { a^i b^j | i ≥ j ≥ 0 }
(d) L4 = { a^i b^j | j > i > 0 }
(e) L5 = { a^i b^j | i, j > 0 and i ≠ j }
(f) L6 = { xx^R | x ∈ {a, b}* }

2.5 For each of the following languages, construct a grammar (of any type) that generates the language, and briefly explain how your grammar generates it. (Hint: try to apply the idea of the type 1 grammar given as an example in Section 2.4.)

(a) L1 = { a^(2n) b^n c^(2n) d^m | n, m ≥ 1 }
(b) L2 = { a^n b^n c^n d^n | n ≥ 1 }


Chapter 3. Other Models for Language Representation

In Chapter 2 we learned how to define a language in terms of a grammar, and saw several examples. Numerous other models for language representation have been introduced in the literature, among them graph grammars, picture language grammars, Lindenmayer systems (L-systems), syntax flow graphs, and regular expressions. This chapter briefly introduces L-systems, syntax flow graphs and regular expressions, and points to their applications.

3.1 L-systems 62
    Definitions and examples
    Application
3.2 Syntax flow graph 67
    Definition and examples
3.3 Regular Expressions 70
    Definition and examples
    Algebraic properties of regular expressions
Rumination 74
Exercises 75


3.1 L-systems

The L-system is a grammar model introduced by Lindenmayer to describe biological development, such as the growth of plants and cells. L-systems differ from the grammars that we studied in the last chapter: in an L-system, to derive the next string from the current string, all applicable rules are applied simultaneously (i.e., in parallel), and every string that can be derived belongs to the language of the system.

There are several variations of L-systems. Zero-sided L-systems correspond to context-free grammars in the sense that the production rules are not context dependent, i.e., there is only one symbol on the left side of each rule. Depending on whether the rules are context-sensitive to the left, to the right, or on both sides, the L-systems are called, respectively, left-sided, right-sided and two-sided L-systems. Because of the parallel rewriting rules, it is in general hard to identify the language of an L-system. However, because the languages generated by L-systems have unique characteristics that can be exploited in computer imagery, L-systems have been widely used as a graphical tool in computer science. The following book contains a rich collection of various L-systems and their applications.

H. Ehrig, M. Nagl, G. Rozenberg, and A. Rosenfeld (editors), "Graph Grammars and Their Application to Computer Science," Lecture Notes in Computer Science #291, Springer-Verlag, 1986.


In an L-system there is no difference between terminal symbols and nonterminal symbols. Every string that can be derived by applying the rules belongs to the language.

Definition: An L-system G is a triple G = ( Σ, h, ω ), where Σ is a finite alphabet, h is a set of rewriting rules, and ω, called the axiom, is the start string. Conventionally the rules are expressed as a function h. For example, h(α) = β and h(α) = {β, γ}, where α, β, γ ∈ Σ*, correspond, respectively, to the rules α → β and α → β | γ in the formal grammar notation. Let h^i(ω) denote a string derivable by applying the rules in h i times, i.e., h^0(ω) = ω, h^1(ω) = h(ω), h^2(ω) = h(h^1(ω)), . . ., h^i(ω) = h(h^(i-1)(ω)). The language of an L-system G = ( Σ, h, ω ) is defined as follows.

L(G) = { h^i(ω) | i ≥ 0 }


Below are two simple L-systems. Notice that in both examples there is only one symbol on the left side of every rule; that is, the rules are context-free. Such L-systems are called 0L (zero-sided Lindenmayer) systems. We can also define L-systems whose rules have a left side of length greater than 1.

Example 1. G1 = ( {a}, h, a^2 ), h(a) = {a, a^2}.  L(G1) = { a^n | n ≥ 2 }.

Example 2. G2 = ( {a, b}, h, ab ), h(a) = aa, h(b) = ab.  L(G2) = { a^m b | m = 2^n - 1, n ≥ 1 }.
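Because G2 is deterministic (one right side per symbol), its parallel rewriting is easy to simulate. The sketch below is mine, not from the book; `step` is a hypothetical helper that rewrites every symbol of the current string simultaneously, and the loop checks that the i-th string is a^(2^(i+1) - 1) b, as claimed for L(G2).

```python
# Sketch (not from the book): iterate the 0L-system G2 of Example 2.
def step(s, h):
    """Apply the rules h to every symbol of s simultaneously (in parallel)."""
    return "".join(h[c] for c in s)

h = {"a": "aa", "b": "ab"}
s = "ab"                                   # the axiom
for i in range(5):
    assert s == "a" * (2 ** (i + 1) - 1) + "b"
    s = step(s, h)

print(len(s) - 1)   # number of a's after 5 steps: 2^6 - 1 = 63
```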

Break Time
When one door of happiness closes, another opens. But often we look so long at the closed door that we don't see the one which has been opened for us. It's true that we don't know what we've got until we lose it, but it's also true that we don't know what we've been missing until it arrives. - Anonymous -


The following figures show how an L-system can be used to draw a plant. The line segments correspond to the branches of a tree, and a pair of matching brackets indicates the start and end points of a branch. Nested matching brackets indicate branching off the "main branch" corresponding to the outer brackets. When we draw a tree in two-dimensional space from such an expression, we let the branches grow in alternating directions, to the right and then to the left, at a certain angle, say 30°, from the current direction.

[Figures (a), (b), (c): bracketed expressions such as [ ] and [ [ ] [ ] ], and the corresponding branching shapes they describe.]

To draw a tree, we first let the L-system generate a string of symbols (each corresponding to a line segment) with matching brackets, and then translate the string into a tree using the idea shown above. The next example shows this, translating every digit into a line segment of the same length.

Drawing a branching plant using an L-system

L-system rules (1 is the axiom): each digit is rewritten in parallel, e.g., 1 → 23, 3 → 24, 4 → 25, . . ., 8 → 9[3], and the derivation proceeds

1 ⇒ 23 ⇒ 224 ⇒ . . . ⇒ 2229[229[24]9[3]8765]9[229[3]8765]9[228765]9[228765]9[2265]9[225]9[24]9[3]8765

The final string is then translated into a branching plant by drawing every digit as a line segment and every bracketed part as a branch.

[Figure: the plant drawn from the string above.]


3.2 Syntax Flow Graphs

A syntax flow graph (also called a syntax diagram) is a graphical representation of context-free grammar rules. (This model applies only to context-free grammar rules.) It helps us understand how the grammar rules put symbols together to construct a string. For each nonterminal symbol of the grammar (S, A and B in the example below), we draw a directed graph, under the title of that nonterminal, with one entry edge and one exit edge. For each right side of a rule of that nonterminal, we draw a path from the entry to the exit through a sequence of nodes labeled with the symbols in the order in which they appear on that right side. Conventionally, terminal symbols are put in circles and nonterminal symbols in rectangles. The null string is represented by a plain line.

S → aS | A        A → bAc | B        B → d | f | ε

[Figure: the syntax flow graphs for S, A and B of the grammar above.]


Example 1. Here is a syntax flow graph (and the corresponding rewriting rules) for the identifiers defined in the C programming language: a letter, followed by any sequence of letters and digits.

<identifier> → <letter> | <identifier><letter> | <identifier><digit>
<letter> → a | b | . . . | z
<digit> → 0 | 1 | 2 | . . . | 9

[Figure: the syntax flow graph for <identifier>, a letter node followed by a loop over letter and digit nodes.]


Example 2. The following syntax flow graph (and rewriting rules) represents the unsigned integers and unsigned doubles (i.e., real numbers) defined in C.

<unsigned-integer> → <digit> | <unsigned-integer><digit>
<unsigned-double> → <unsigned-integer>.<unsigned-integer> | <unsigned-integer>.<unsigned-integer><exponent> | <unsigned-integer><exponent>
<exponent> → E<sign><unsigned-integer>
<sign> → + | - | ε

[Figure: the syntax flow graph for unsigned numbers, with a digit loop, an optional fractional part after ".", and an optional exponent part E with sign + or -.]


3.3 Regular Expressions

Searching a file or browsing the Web, we enter a query in terms of strings (or a pattern) representing the information that we want to find. We have learned how to represent a set of strings (i.e., a language) in terms of a set property or a grammar. However, neither the set property nor the grammar model is practical for interactive online search. We need a convenient way to succinctly express, and efficiently find, what we are looking for. This section introduces regular expressions, an algebraic model that denotes the regular languages. Since a regular expression can be transformed into a machine that recognizes the language it expresses (we will learn how in Chapter 7), regular expressions are widely used in Web browsing, word processing, compilers, etc. (Unfortunately, no such practical model is available for denoting the other types of formal languages.)

Break Time: Interesting Warning Labels
On a knife: Warning. Keep out of children.
On a string of Christmas lights: For indoor or outdoor only.
On a food processor: Not to be used for the other use.
On a chainsaw: Do not attempt to stop chain with your hands.
On a child's superman costume: Wearing of this garment does not enable you fly.
- Anonymous -


Definition: Regular expressions

Let Σ be a finite alphabet. The regular expressions over Σ and the sets of strings (i.e., languages) they denote are defined as follows.

(1) Ø is a regular expression, which denotes the empty set.
(2) ε is a regular expression, which denotes the set {ε}.
(3) For each symbol a ∈ Σ, a is a regular expression, which denotes the set {a}.
(4) If r and s are regular expressions denoting the sets of strings R and S, respectively, then r + s, rs and r* are regular expressions that denote, respectively, the sets R ∪ S, RS and R*.

In the above definition, boldface symbols are used to help the reader distinguish regular expressions from strings. From now on, whenever it is clear from the context, we shall use normal fonts in regular expressions.


In the definition of regular expressions, the operators +, concatenation and *, respectively, correspond to set union, set product and set closure. The precedence of these operators in a regular expression is, from highest to lowest, *, concatenation, +. However, we are free to use parentheses in a regular expression, as we do in arithmetic expressions, wherever there is ambiguity. For a regular expression r, by L(r) we denote the language expressed by r. Here are some examples of regular languages and the regular expressions that denote them.

    Language L(r)                          Regular expression r
    {a, b}*                                (a + b)*
    { a^i | i ≥ 0 }                        a*
    { a^i b^j | i, j ≥ 0 }                 a*b*
    { xaaybbz | x, y, z ∈ {a, b}* }        (a+b)*aa(a+b)*bb(a+b)*
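The last row of the table can be tried out in a practical regex engine. In Python's `re` module the union operator is written `|` rather than `+`, and `[ab]` abbreviates (a+b), but the language denoted is the same; the pattern below is my translation, not from the book.

```python
# Sketch: (a+b)*aa(a+b)*bb(a+b)* written in Python's re syntax.
import re

pattern = re.compile(r"\A[ab]*aa[ab]*bb[ab]*\Z")   # anchored to the whole string

print(bool(pattern.match("aabb")))       # True  (x = y = z = the null string)
print(bool(pattern.match("babaabbab")))  # True
print(bool(pattern.match("bbaa")))       # False (aa must occur before bb)
```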


Algebraic Properties of Regular Expressions

For two regular expressions E and F, we say the identity E = F holds if and only if their languages are identical, i.e., L(E) = L(F). Regular expressions with the operators + and concatenation satisfy identity laws similar to those of arithmetic algebra with addition and multiplication, as shown below. Notice that in regular expressions the commutative law does not hold for concatenation. Let r, s and t be arbitrary regular expressions.

• Commutative law: r + s = s + r. (But for concatenation, rs ≠ sr in general.)
• Associative law: r + (s + t) = (r + s) + t, (rs)t = r(st).
• Distributive law: r(s + t) = rs + rt.
• For the regular expressions Ø and ε, we have the following identities:
  Ø + r = r + Ø = r,   Ør = rØ = Ø,   εr = rε = r.

Notice that Ø and ε correspond, respectively, to 0 and 1 in arithmetic algebra. We leave the proofs for the reader.


Rumination (1): Syntax flow graphs
• Occasionally we hear that programming languages belong to the class of context-free languages. That is true except for some special cases, such as input/output statements. However, in a textbook we hardly ever come across the whole grammar of a programming language. Usually the grammar is informally described, focusing on how to use the statements. However, to develop a system component, such as a compiler or a debugger, for a programming language, we need a formal definition of the language. Appendix A shows the complete syntax flow graphs for Pascal. (Pascal is a predecessor of C.)

Rumination (2): Regular expressions
• There are quite a few variations in the form of regular expressions. Here are some examples together with the equivalent standard regular expressions:
  (a | b)* = (a + b)*
  (a | b)+ = (a + b)(a + b)*
  abc? = ab + abc
  (abc)? = abc + ε
  [ab] = a + b
  [a-z] = a + b + c + . . . + z

With these variations, the integers and reals defined in C can be expressed as follows (in lex-style notation):
  digits  [0-9]
  int     {digits}+
  real    {int}"."{int}([Ee][+-]?{int})?
• For a given language, there are infinitely many regular expressions that denote the same language. For example, r, r + r, r + r + r, etc. are all equivalent. As shown in the preceding section on algebraic properties of regular expressions, for such simple regular expressions it is possible to prove their equivalence. However, there is no practical algorithm available for proving the equivalence of two arbitrary regular expressions. For example, it is not trivial to prove the following equivalences. In Chapter 7, we will present an idea for how to prove them.
  (a) (r*)* = r*    (b) r*(r + s)* = (r + s)*    (c) (r + s)* = (r*s*)*
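The lex-style token definitions above translate almost directly into Python `re` syntax; the sketch below substitutes the named sub-patterns by hand. It covers only these two token shapes, not the full C lexical grammar (no sign, no suffixes, no leading dot), and the function names are ours.

```python
import re

# Hand-expanded versions of: digits [0-9], int {digits}+,
# real {int}"."{int}([Ee][+-]?{int})?
digits = r"[0-9]"
int_re = digits + r"+"                                  # {digits}+
real_re = rf"{int_re}\.{int_re}([Ee][+-]?{int_re})?"    # {int}"."{int}(exp)?

def is_c_int(s):
    return re.fullmatch(int_re, s) is not None

def is_c_real(s):
    return re.fullmatch(real_re, s) is not None
```

For example, `is_c_real("1.5E+10")` holds because the exponent group is optional but allowed, while `is_c_real("15")` fails for lack of a decimal point.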


Exercises

3.1 What is the language of the L-system G = ({a, b, c}, h, acb), where the rules are defined as follows?
  h(a) = aa   h(b) = cb   h(c) = a
3.2 Draw a syntax flow graph for the following context-free grammar.
  S → aSbB | A
  A → bSa | ba | ab
  B → bB | ε

3.3 The following syntax flow graph is a simplified version of the syntax flow graph for <expression> of the Pascal programming language. (a) Write a grammar which generates the same language defined by this syntax flow graph. (b) Show how the Pascal expression a1 + (a + b)*a - a/b can be derived with the rewriting rules of your grammar.
[Figure: syntax flow graphs for <expression>, term and factor, with the operators +, -, *, /, parenthesized expressions, variables built from the letters a, b and the digits 0, 1]


3.4 Briefly describe the language expressed by each of the following regular expressions.

(a) (0 + 1)*00

(b) (1 + 0)*111(0 + 1)*

(c) (1 + 0)*(111 + 000)(0 + 1)*

Actual Signs

Break Time

Outside a muffler shop: "No appointment necessary. We'll hear you coming."
In a veterinarian's waiting room: "Be back in 5 minutes. Sit! Stay!"
At the electric company: "We would be delighted if you send in your bill. However, if you don't, you will be (de-lighted)."
In a beauty shop: "Dye now!"
In a cafeteria: "Shoes are required to eat in the cafeteria. Socks can eat any place else they want."
- Anonymous -


Models of Language Recognition: Automata


4. Automata
In the preceding two chapters we learned several models for language representation: formal grammars, L-systems, syntax flow graphs and regular expressions. The natural languages are communication media, and so are the formal languages. A person who hears a spoken sentence should be able to understand the meaning (i.e., the semantics) of the sentence. In this chapter we will learn how to design machine models (i.e., automata) that recognize formal languages.
We are not going to build a model which understands a formal language, but one that recognizes it. Here we use the word "recognize" in a much narrower sense than the word "understand," as follows: for a given language L, we say a machine M recognizes L if M, given a string x, can output "yes" or give a positive indication if and only if x ∈ L.
What does it take for a machine M to recognize a language L? To answer this question it is better to think about a conceptual model instead of the complex computer systems currently available. Since the 1930s computer scientists have investigated many machine models, called automata, for language recognition. We will study four popular models (Turing machines, linear-bounded automata, pushdown automata and finite automata), respectively recognizing the languages of types 0, 1, 2 and 3 that we introduced in Chapter 2. In Chapters 7, 12 and 14 we will study the so-called Chomsky hierarchy (after Noam Chomsky), which shows the hierarchical relationship among these languages and automata.
Automata can be classified into two types of models, deterministic and nondeterministic. In this chapter we will first study the deterministic models, which naturally follow our intuition, and then the nondeterministic models in the following chapter.


Automata
4.1 Deterministic Turing Machines (DTM)
    Design example; state transition graph, state transition table, state transition function; formal definition
4.2 Deterministic Linear Bounded Automata (DLBA)
    Definition and design example
4.3 Deterministic Pushdown Automata (DPDA)
    Definition; design examples
4.4 Deterministic Finite Automata (DFA)
    Definition; design examples
Rumination
Exercises

LOVE

Break Time

Love is when you take away the feeling, the passion, the romance and you find out you still love the person. Love comes to those who still hope even though they've been disappointed, to those who still believe, even though they've been betrayed, to those who still love even though they've been deeply wounded before. - Anonymous -


4.1 Deterministic Turing Machines
The Turing machine (TM), introduced by Alan M. Turing in 1936, is a model that can recognize the type 0 (phrase-structured) languages. In this section we will study the deterministic version of the Turing machine. As illustrated below, a TM has a finite state control with a two-way read/write tape head and a one-dimensional tape of infinite length. The finite state control corresponds to the CPU of a modern computer, where the "state" represents the whole of the information stored in the main memory and registers. The tape is divided into cells, each capable of holding one symbol. It is assumed that initially an input string over a given alphabet is written on the tape, and the tape head is reading the leftmost symbol. We assume that all the blank cells contain the blank symbol B.
[Figure: a finite state control with a two-way read/write head on a one-dimensional read/write tape holding an input string, e.g., a b a b b]

In one move (or step) a deterministic Turing machine (DTM) does the following:
(1) Read the content of the cell on which the tape head is located.
(2) Depending on the symbol read and the current state of the finite state control, write a symbol in the cell, erasing the original symbol.
(3) Changing the state (or possibly leaving it unchanged), move the tape head left or right, or let it stay on the same cell.

[Figure: three successive configurations of a DTM, showing the state and head position changing as tape symbols are rewritten]
We say the DTM accepts the input string if it enters a designated state (called an accepting state) after some finite number of moves.

It is unwieldy to draw the configuration (i.e., the tape and the finite state control) for every step of a DTM to illustrate how it works. Instead we will mainly use a directed graph, called the state transition graph, or occasionally the transition function δ, to represent each step, as shown in figures (a) and (b), respectively.
(a) An edge from the current state to the next state, labeled (symbol read, symbol written, head direction).
(b) δ(current state, symbol read) = (next state, symbol written, head direction)

Figures (a) and (b) below, respectively, show the graph and the functional representation of two steps of a DTM. In the figures, R and L, respectively, denote the tape head moving right and left. We shall use N to denote the head staying on the same cell.
[Figure: (a) a state transition graph with edges p -(a, c, R)-> q and q -(b, b, L)-> p; (b) the same two steps written functionally as δ(p, a) = (q, c, R) and δ(q, b) = (p, b, L)]

Example 1. Construct a DTM which, given the word "mother", converts it to "father", and vice versa. After completing the conversion, the DTM must halt in an accepting state.
Figure (a) below shows the initial and final configurations of a DTM which does the conversion. (Intermediate configurations are omitted.) Figure (b) shows the state transition graph. In the state transition graph of a DTM, state labels are just for reference and hence can be omitted if not needed, as shown in figure (b).
[Figure: (a) initial tape m o t h e r and final tape f a t h e r; (b) a state transition graph with start state s, branching on (m, f, R) or (f, m, R), followed by (o, a, R) or (a, o, R), then the shared transitions (t, t, R), (h, h, R), (e, e, R), (r, r, R), and finally (B, B, N) into the accepting state f]

Notice that depending on the first character read (either 'm' or 'f'), the machine takes a transition to a different state. After converting the first two characters, it does the same work in either branch (i.e., rewriting the "ther" part) by taking the four shared transitions, and finally, reading the blank symbol B, it enters the accepting state (the heavy circle labeled f). Also notice that the machine cannot enter an accepting state right after changing the first two characters; it must make sure that the remaining part of the input is correct.
[Figure: the state transition graph of Example 1, repeated]

Example 2. Design a DTM which recognizes the language L = {0^n 1^n | n ≥ 1}.
Suppose that the string 000111 is initially written on the tape as shown below. How can a DTM M recognize that this string belongs to L? We know that if a string x is in L, there must be the same number of 0's and 1's in x, and all the 0's must come first. We will let M use this property. One idea is to let M move back and forth, checking that for each symbol 0 erased (i.e., rewritten) with a symbol, say X, there is a 1 that can be erased with Y. We let M enter an accepting state if it sees that every pair of 0 and 1 has been erased. Before we give a formal definition of M, let's figure out how the machine should move along the tape to implement this idea.
[Figure: the initial tape 0 0 0 1 1 1]

Let q0 denote the start state of M. We will use the following state names, in which the machine does the following:
q1: The state in which the machine, after erasing the leftmost 0 with X, moves right in search of the leftmost 1.
q2: The state in which the machine, after erasing the leftmost 1 with Y, moves left in search of X.
q3: The state in which, after reading X, the machine backs up right to read the leftmost remaining 0, if any.
We will introduce additional states, if needed, while designing M's work further in detail.
[Figure: the initial configuration, with M in state q0 reading the leftmost 0 of 0 0 0 1 1 1]

[Figure: snapshots (a)-(d) of M's tape and head while processing 000111]
(a) In the start state q0, M reads 0 and, erasing it with X, moves right in state q1 in search of the leftmost 1.
(b) After erasing that 1 with Y, M moves left in state q2 in search of X.
(c) Reading X in state q2, M backs up right in state q3.
(d) M reads the next leftmost 0, if any, erases it with X, and moves right in state q1 in search of the next 1 to be erased. M repeats steps (b), (c) and (d).

[Figure: snapshots (e)-(f) of the final phase]
(e) This is the final phase. If in state q3 M reads Y, not 0, it implies that all 0's have been erased with X. M must now make sure that no 1's remain on the tape. For this, M needs an additional state q4 and an accepting state q5.
(f) Reading Y in state q3, M enters state q4 and moves right until it reads the blank symbol, whereupon it enters the accepting state q5.

The following state transition graph is a formal representation of how the DTM recognizes the language L = {0^n 1^n | n ≥ 1}.
[Figure: state transition graph with states q0 through q5. Transitions: q0 -(0,X,R)-> q1; q1 loops on (0,0,R) and (Y,Y,R); q1 -(1,Y,L)-> q2; q2 loops on (0,0,L) and (Y,Y,L); q2 -(X,X,R)-> q3; q3 -(0,X,R)-> q1; q3 -(Y,Y,R)-> q4; q4 loops on (Y,Y,R); q4 -(B,B,N)-> q5 (accepting). Annotations: in q1, after erasing the leftmost 0 with X, keep moving right in search of the leftmost 1; in q2, after erasing the leftmost 1 with Y, keep moving left in search of X; reading X, back up right in state q3; in q4, check that all 1's have been erased with Y]

Only the transitions along a path defined in the graph are legal. Reading a symbol for which no transition is defined (e.g., reading 1 in state q0) implies that M rejects the input. Notice that the sequence of moves along the cyclic transitions through q1, q2 and q3 erases any number (≥ 1) of matching pairs of 0 and 1. We can prove by induction that the DTM M enters the accepting state q5 if and only if the input string is in the language L = {0^n 1^n | n ≥ 1}.
[Figure: the state transition graph, repeated]

The functional representation of the DTM M that we have designed is given below. In this book we will mainly use the state transition graph to present a DTM and the other automata, because it makes it easier to understand how they work.

State transition function δ of M:
δ(q0, 0) = (q1, X, R)    δ(q1, 0) = (q1, 0, R)    δ(q1, 1) = (q2, Y, L)
δ(q2, 0) = (q2, 0, L)    δ(q2, X) = (q3, X, R)    δ(q3, 0) = (q1, X, R)
δ(q1, Y) = (q1, Y, R)    δ(q2, Y) = (q2, Y, L)    δ(q3, Y) = (q4, Y, R)
δ(q4, Y) = (q4, Y, R)    δ(q4, B) = (q5, B, N)
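The transition function above can be executed directly. The sketch below (the names `DELTA` and `dtm_accepts` are ours, for illustration only) stores δ in a dictionary keyed by (state, symbol read); a missing entry plays the role of the dead state that rejects the input.

```python
# Sketch of a simulator for the DTM recognizing {0^n 1^n | n >= 1}.
# Entries mirror delta: (state, symbol) -> (next state, symbol written, move).
DELTA = {
    ("q0", "0"): ("q1", "X", "R"),
    ("q1", "0"): ("q1", "0", "R"), ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),
    ("q2", "0"): ("q2", "0", "L"), ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q3", "X", "R"),
    ("q3", "0"): ("q1", "X", "R"), ("q3", "Y"): ("q4", "Y", "R"),
    ("q4", "Y"): ("q4", "Y", "R"), ("q4", "B"): ("q5", "B", "N"),
}

def dtm_accepts(x, max_steps=10_000):
    """Run the DTM on x; an undefined (state, symbol) pair rejects."""
    tape = dict(enumerate(x))            # sparse tape; absent cells are blank
    state, head = "q0", 0
    for _ in range(max_steps):
        if state == "q5":                # the accepting state
            return True
        key = (state, tape.get(head, "B"))
        if key not in DELTA:
            return False                 # dead state: reject the input
        state, tape[head], move = DELTA[key]
        head += {"L": -1, "R": 1, "N": 0}[move]
    return False
```

Tracing `dtm_accepts("000111")` reproduces the right and left sweeps described in snapshots (a) through (f).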


A DTM can also be represented with a table, called the state transition table. The table below defines the DTM that we have designed. The states are listed in the first column and the tape symbols in the first row. Table entries contain the triple (next state, symbol written, direction). Blank entries are the cases in which the input is rejected.

State transition table of M:
       0            1            X            Y            B
q0     (q1, X, R)
q1     (q1, 0, R)   (q2, Y, L)                (q1, Y, R)
q2     (q2, 0, L)                (q3, X, R)   (q2, Y, L)
q3     (q1, X, R)                             (q4, Y, R)
q4                                            (q4, Y, R)   (q5, B, N)
q5

Formally, a DTM M is defined by a sextuple M = (Q, Σ, Γ, δ, q0, F), where
Q: the finite set of states.
q0: the start state (q0 ∈ Q).
Σ: the finite input alphabet.
Γ: the finite tape alphabet, i.e., the set of symbols that are allowed to be written on the tape, including the blank symbol B. Notice that Σ ⊆ Γ.
F: the set of accepting states (F ⊆ Q).
δ: the state transition function (δ: Q × Γ → Q × Γ × {L, N, R}), where R, L and N, respectively, indicate that the head moves right, moves left, or stays.
Given an input string x, if M enters an accepting state in a finite number of moves, we say M accepts x. The language recognized by M, denoted by L(M), is the set of all strings x in Σ* that are accepted by M, i.e., L(M) = { x | x ∈ Σ*, M accepts x }.

Example: The DTM M which recognizes L = {0^n 1^n | n ≥ 1} can be formally defined as follows. M = (Q, Σ, Γ, δ, q0, F), where Q = {q0, q1, q2, q3, q4, q5}, Σ = {0, 1}, Γ = {0, 1, X, Y, B}, F = {q5}, and the transition function δ is given by the state transition graph designed above.
[Figure: the state transition graph, repeated]

The state transition graph of M shows only the transitions that can lead to an accepting state. For example, the graph defines only one transition from the start state q0, for the case when the input symbol is 0. What happens when the DTM reads 1 in state q0? We assume that in such undefined cases the machine enters a "dead state" and rejects the input. For simplicity, we do not show such rejecting transitions.
[Figure: the state transition graph, repeated]

Unless stated otherwise, the reader may assume the following.
• Only lower case letters are used for the input alphabet Σ.
• Initially, the input tape is written with a string x over the input alphabet, i.e., x ∈ Σ*, and the tape head is reading the leftmost symbol, if the input is not ε.
Notice that these assumptions do not affect the computational capability of the machines. We could instead assume that the machine's read/write head is initially positioned at an arbitrary location on the input string. Then we could simply modify the machine, as the following figure illustrates, so that it first moves onto the leftmost symbol of the input string and then starts the computation.
[Figure: a preliminary state looping on (X, X, L) for any input symbol X, with a transition (B, B, R) into the start state q0 of the original machine]

4.2 Deterministic Linear Bounded Automata (DLBA)
A DLBA is a DTM whose read/write tape range is bounded by the length of the input string. To indicate this range, the input string is delimited by a pair of boundary markers (the matching brackets) as shown below. During the computation, if the DLBA reads one of the markers, it must either halt or move back in the direction it came from. Note that the boundary markers '[' and ']' are in Γ. Formally, a DLBA is defined by an octuple that includes the boundary markers, as shown below. Except for the boundary markers, the other elements of the octuple and the language recognized by the automaton are specified as in the formal definition of a DTM.

M = (Q, Σ, Γ, δ, q0, [, ], F)

[Figure: a DLBA in start state q0 with its tape holding [ a b a a b ]]

With a simple modification of the DTM recognizing {0^n 1^n | n ≥ 1}, shown below, we obtain the state transition graph of a DLBA which recognizes the same language. Notice that in state q4, on reading the closing boundary marker ']' (instead of the blank symbol B), the machine enters the accepting state. All the other moves are identical to those of the DTM.
[Figure: the DTM's state transition graph with the final transition changed to q4 -( ], ], N )-> q5]

4.3 Deterministic Pushdown Automata (DPDA)
A deterministic pushdown automaton (DPDA) is a restricted version of the DTM. The two-way read/write tape head is restricted to a one-way read-only head, and instead the machine is equipped with a pushdown stack, as shown below. In each move the machine does the following:
(1) Read the stack-top symbol.
(2) Depending on the stack-top symbol and the current state, either read the input or do not read it.
(3) Depending on the input symbol read (if it was allowed to read one) and the stack-top symbol, pop the stack-top symbol or push some symbols, possibly changing the stack top.
[Figure: a finite state control with a one-way read-only input head and a pushdown stack whose bottom holds the symbol Z0]

Notice that depending on the state and the stack-top symbol, the input head either is or is not allowed to read. When it reads the input tape, the head moves right onto the next cell; otherwise it does not move. The stack head is always positioned at the stack top. We assume that a designated bottom-of-stack symbol (Z0) is initially written at the bottom of the stack.
[Figure: the DPDA configuration, repeated]

As we did for the DTM and DLBA, we shall mainly describe a DPDA with the state transition graph or the transition function δ, as shown below. To denote a move that does not read the input tape, we shall use the null character ε for the parameter "input symbol read"; we call such a move an ε-move. If the parameter "stack-top content changed" is ε, it is a popping move, i.e., the stack-top symbol is erased.
(a) An edge from the current state to the next state, labeled (input symbol read, stack-top symbol / stack-top content changed).
(b) δ(current state, input symbol read, stack-top symbol) = (next state, stack-top content changed)

The following figures show a sequence of moves of a DPDA and their representations in terms of the transition function δ and the state transition graph. Notice that the second move is an ε-move:
δ(p, a, Z) = (q, YX)    δ(q, ε, Y) = (r, V)    δ(r, b, V) = (q, ε)
[Figure: four successive configurations on input abc, with the corresponding graph edges p -(a, Z/YX)-> q -(ε, Y/V)-> r -(b, V/ε)-> q]

Formally, a DPDA M is defined by the septuple M = (Q, Σ, Γ, δ, q0, Z0, F), where Q is the set of states, Σ is the input alphabet, Γ is the stack alphabet, δ is the transition function, q0 is the start state, Z0 is the bottom-of-stack symbol, and F is the set of accepting states. The function δ, given the current state, the stack-top symbol, and the input symbol read or ε (for an ε-move), outputs the next state and a string of stack symbols with which the stack-top symbol is replaced, i.e.,

δ: Q × (Σ ∪ {ε}) × Γ → Q × Γ*

According to the above definition, M can rewrite the stack-top symbol with a string of stack symbols of any finite length (including the null string). For convenience we can simplify the stack operation so that the machine either pops the stack top or pushes exactly one symbol onto the stack, without affecting the language recognized by the machine. At the end of this chapter we will show how.

Starting from the initial configuration, with the input head on the leftmost symbol and the stack head at the bottom of the stack, a DPDA M accepts the input if and only if it enters a configuration satisfying the following two conditions. (Notice that the conditions do not involve the stack contents.)
(1) The input head has read the input string up to the last symbol.
(2) The machine is in an accepting state (qf in figure (b)).
[Figure: (a) the initial configuration in state q0 with the stack holding only Z0; (b) an accepting configuration in state qf with the whole input read]
The language L(M) is the set of input strings accepted by M, i.e., L(M) = { x | x ∈ Σ*, M accepts x }.

Designing a DPDA
Example 1. Design a DPDA M which recognizes L = {a^i b^i | i > 0}.
Here is a simple idea. Reading a's, M pushes them onto the stack; then, for each b read from the input tape, the machine erases one a by popping it from the stack top, until it sees Z0 (i.e., the stack is empty). Seeing that the stack is empty, M enters an accepting state without reading the input. The state transition graph of M is shown below. The best way to understand how this DPDA works is to trace its moves along the transition graph. See how M accepts the input string aaabbb and fails to accept aaabb.
[Figure: state transition graph with (a, Z0/aZ0) and loop (a, a/aa) in the start state q0, (b, a/ε) into a popping state with loop (b, a/ε), and an ε-move (ε, Z0/Z0) into the accepting state]


Example 2. Design a DPDA which recognizes {a^i b^j | j > i > 0}.
We will use the same idea of pushing a's and then, for each b read, popping one a from the stack top. Notice that by the time the stack is empty (i.e., Z0 appears at the stack top), the machine has read the same number of a's and b's. It must then make sure that there are more b's on the tape by reading them up to the last one. To do this the machine reads one more symbol and, if it is b, enters an accepting state. In the accepting state the machine reads off all the remaining b's, if any.
[Figure: state transition graph with (a, Z0/aZ0) and loop (a, a/aa) in the start state, (b, a/ε) with loop (b, a/ε) in the matching state, (b, Z0/Z0) into the accepting state, which loops on (b, Z0/Z0)]


Example 3. Design a DPDA that recognizes the language {a^i b^j | i > j > 0}.
Notice that every string in this language has more a's than b's. The figure below shows a DPDA that recognizes this language. Notice that the machine reads the first a and discards it. Then the machine pushes all the remaining a's onto the stack and, reading the first b, enters an accepting state. In the accepting state it reads off all the remaining b's, popping one a from the stack top for each b read. Because all the a's in the input except the first one have been pushed onto the stack, during the popping phase it is guaranteed that the total number of a's read from the input tape is strictly greater than the number of b's. See how this DPDA works by tracing its moves on the input strings aaabb and aabbb.
[Figure: state transition graph with (a, Z0/Z0) discarding the first a, then (a, Z0/aZ0) and loop (a, a/aa) pushing the rest, and (b, a/ε) with loop (b, a/ε) in the accepting state]

For comparison, here are the three state transition graphs that we have designed in the preceding examples.
[Figure: the graphs for {a^i b^i | i > 0}, {a^i b^j | i > j > 0} and {a^i b^j | j > i > 0}, side by side]

Example 4. Design a DPDA which recognizes the following language:
{a^i b^k c^i | k, i > 0} ∪ {a^i b^k d^k | k, i > 0}
Notice that to recognize the first half of the language, the machine needs to push the a's onto the stack to match them against the c's. However, we cannot let the machine read off the b's and discard them, because the input string may belong to the second half of the language. The machine needs all the a's and b's (in this order) pushed onto the stack until it reads c or d from the input tape. On reading c, we let it pop all the b's off the stack before it begins matching a's against c's.
[Figure: state transition graph with transitions including (a, Z0/aZ0), (a, a/aa), (b, a/ba), (b, b/bb), (c, b/ε), (ε, b/ε), (ε, a/a), (c, a/ε), (d, b/ε) and (ε, Z0/Z0)]

4.4 Deterministic Finite Automata (DFA)
Deterministic finite automata (DFA) are the simplest computational model; they can recognize only the regular languages. A DFA is a DPDA with the pushdown stack removed. Thus a DFA M is defined by the quintuple M = (Q, Σ, δ, q0, F), where Q, Σ, q0 and F are defined as for a DPDA. The transition function δ is a mapping δ: Q × Σ → Q. In a move, the machine reads the input tape and, depending on the current state and the input symbol read, enters a state in Q and moves the head right. For example, δ(p, a) = q implies the move shown in figure (a) below, which is represented by the state transition graph shown in figure (b).
[Figure: (a) a one-way read-only head reading a in state p and entering state q; (b) the corresponding graph edge p -a-> q]

A DFA accepts the input string if and only if it satisfies the following two conditions, which are the same as for DPDA's.
(1) The input head has read the input string up to the last symbol.
(2) The machine is in an accepting state.
The language recognized by a DFA M, denoted by L(M), is L(M) = { x | x ∈ Σ*, M accepts x }.
Example 1. Construct a DFA which recognizes the following language.
L = {x | x ∈ {0, 1}+, the number of 1's in x is odd}
Here is the transition graph of a DFA which recognizes the language L.
[Figure: two states that each loop on 0 and swap on 1; the start state is the even-parity state, and the odd-parity state is accepting]
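The two-state parity machine can be sketched directly in code. The function name and the state labels "even"/"odd" are our own; the book's graph leaves the states unlabeled.

```python
# Sketch of the parity DFA: two states (even/odd number of 1's read so far),
# with the odd-parity state accepting; any symbol outside {0, 1} rejects.
def dfa_odd_ones(x):
    """True iff x is a nonempty string over {0, 1} with an odd number of 1's."""
    if not x or any(c not in "01" for c in x):
        return False
    state = "even"                       # start state: no 1's read yet
    for c in x:
        if c == "1":                     # reading a 1 swaps the parity state
            state = "odd" if state == "even" else "even"
    return state == "odd"                # the odd-parity state is accepting
```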

Designing a DFA
Example 2. Design a DFA which recognizes the following language.
L = {x | x is a binary number, i.e., x ∈ {0, 1}+, and x mod 3 = 0}
Let M be a DFA which recognizes L. In addition to the start state, M has three states, each labeled with one of the three possible remainders 0, 1 and 2 (see the graph below). Reading the input binary string, M computes the current remainder on the fly and enters the state labeled with that remainder, as follows.
[Figure: the start state branches on the first bit to state 0 or state 1; thereafter, state i moves to state (2i + j) mod 3 on bit j; state 0 is accepting]
For a binary string w, let dec(w) be the decimal value of w. If the first symbol read is 0, the current remainder is 0; if it is 1, the remainder is 1. So, depending on the first input symbol read, M enters either state 0 or state 1. Suppose that after reading a prefix v of the input string, M has entered state i ∈ {0, 1, 2}, implying that i = dec(v) mod 3. Reading the next bit j ∈ {0, 1}, M must enter the state labeled dec(vj) mod 3. Since dec(vj) mod 3 = (2·dec(v) + j) mod 3 = (2i + j) mod 3, M can easily identify the next state to enter, as the transition graph shows.
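The remainder update derived above, next state = (2i + j) mod 3, runs directly as code. The sketch below (an illustrative function of our own) starts the loop at remainder 0; since dec of the empty prefix is 0, this reproduces the start state's branch on the first bit.

```python
# Sketch of the mod-3 DFA: from remainder state i, reading bit j leads to
# state (2*i + j) % 3; state 0 is the accepting state.
def dfa_mod3_accepts(x):
    """True iff x is a nonempty binary string whose value is divisible by 3."""
    if not x or any(c not in "01" for c in x):
        return False
    state = 0                              # dec("") = 0, so remainder 0
    for c in x:
        state = (2 * state + int(c)) % 3   # the transition of the graph
    return state == 0
```

For example, 1001 (decimal 9) drives the machine through states 1, 2, 1, 0 and is accepted, while 10 (decimal 2) ends in state 2 and is rejected.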

Rumination (1): States
Every automaton has some states. We have introduced the word "state" without elaborating the deep conceptual issues involved. Actually, a state represents the whole of the information stored in the finite state control that the machine needs for its work. The DTM that we designed in Example 2 of Section 4.1 has six states, and in each state qi the machine does a unique piece of work. Erasing the leftmost 0 with X, the machine keeps moving right in state q1 until it reads the leftmost 1. Rewriting this 1 with Y, the machine keeps moving left in state q2 until it reads X (see figures (a) through (d) at the top below). Would it be possible to do this part of the work in one state? The answer is no, and the figures at the bottom show why. Suppose that the machine, after erasing the leftmost 1 with Y, moved left in the same state q1. Then the machine would encounter the same symbol 0 that it came across while moving right. Reading this 0 the machine must move right, because a DTM must take the same move with the same input 0 in the same state q1, i.e., δ(q1, 0) = (q1, 0, R). Reading the same 0 a second time in state q1, the machine has no information in its finite control indicating that it should move left this time. The finite control must hold some information different from what it holds in state q1, and being in state q2, a state different from q1, provides exactly this.
[Figure: top, snapshots (a)-(d) of the correct right/left sweep using states q1 and q2; bottom, the contradictory moves that would arise if the sweep used only state q1]

The six states of the DTM in Example 2 (repeated below) can be implemented with a three-bit register, for example by letting its values 000, 001, 010, 011, 100 and 101 represent the states q0, q1, q2, q3, q4 and q5, respectively. The number of states is directly related to the hardware complexity of the machine. Unfortunately, given an arbitrary automaton, minimizing the number of states of the machine is an unsolvable problem. However, for DFA it is possible to minimize them. We will learn how in Chapter 8.
[Figure: the state transition graph of Example 2, with the finite control shown holding the register value 010 for state q2]

Rumination (2): Accepting conditions
Recall that, by definition, both a DTM and a DLBA accept the input string by entering an accepting state. Hence no transition is needed out of an accepting state; we may think of both the DTM and the DLBA as halting in an accepting state. Also recall that both a DTM and a DLBA can have more than one accepting state (i.e., F ⊆ Q). However, we can show that one accepting state is enough: if a machine has more than one accepting state, we simply merge them into one, as the figure below illustrates. Thus we can design a user-friendly DTM or DLBA which, on entering the accepting state, kindly announces that the input has been accepted. What happens when the input string is not accepted? Would it be possible to design a DTM which can inform us that it does not accept the input? What about the other machine models? We will deal with these intriguing questions and related ones in Chapters 7, 12 and 15, under the title of Hierarchy of Languages and Automata.
In contrast, for both a DPDA and a DFA the following two conditions must be satisfied simultaneously to accept the input string:
(1) The input head has read the string up to the last symbol.
(2) The finite state control is in an accepting state.
Notice that with no unique marker appended to the input string, it is impossible for a DPDA or a DFA to identify whether the symbol just read is the last one. Figures (a) and (b) below, respectively, show a DPDA and a DFA recognizing the language L = {a^i | i > 0}. Suppose that the string aaaa is given as input. Reading the first a, each machine enters its accepting state. However, it is too early to say that the input string is accepted, because the machine has not read the last a yet (i.e., the first condition is not yet satisfied). Since the machines keep reading a's in the accepting state, the two conditions will eventually be satisfied.
[Figure: (a) a DPDA looping on (a, Z0/Z0) in its accepting state; (b) a DFA looping on a in its accepting state]

Automata

Rumination (2): Accepting conditions

If the input string had a symbol different from a, the machine would fail to accept it, because the transitions are defined only for symbols in the input alphabet. Since the blank symbol does not belong to the input alphabet, the machine cannot read the blank next to the last input symbol to check whether the symbol read in the previous move was the last one. (Such an illegal move is quite tempting for a student designing a DPDA.) From this analysis we find that neither the DPDA nor the DFA is an autonomous computational model that can kindly tell us whether the input is accepted. In other words, we cannot make them say "Yes, I accept it!" or "Sorry, I cannot." Rather we, the users, must decide, for a given input string, whether the machine satisfies the two conditions after some finite number of moves.

Concerning these accepting conditions, we may raise the following question: why should we put the burden of the decision on the user? How about appending an end-of-string marker to every input string? Consider the language L' = {a^i ; | i > 0}, with the marker ';' appended to every string in L. For such a language we do not need condition (1) above: the input string is accepted if and only if the machine enters an accepting state (see the figures below).

[Figure: a DPDA and a DFA for L'; each loops on the a's (the DPDA with (a, Z0/Z0)) and takes the marker ';' (the DPDA with (;, Z0/Z0)) into an accepting state.]
Appending an end-of-string marker to every string appears to relieve the user of checking whether a DPDA or a DFA accepts it. However, this approach introduces a new burden for the user in real applications. For example, the programming language C uses such an approach (each statement must end with a semicolon), and we know why it is a burden, while FORTRAN requires no such marker. In this book we mainly study automata whose languages have no end-of-string markers. Recall that for both DTM and DLBA, the only accepting condition is entering an accepting state. The user does not have to check whether the last input symbol has been read: these machines can identify the last symbol by reading its neighboring symbol, i.e., the blank symbol for a DTM and the right boundary marker ']' for a DLBA. (Notice that the blank symbol belongs to the tape alphabet, which a DTM can read.) Since the tape head of these models is bidirectional, they can accept the input string with the head located at any position.


Rumination (3): Stack operation of a DPDA

By definition, in every move a DPDA rewrites the stack-top symbol with a string of finite length, including the null string (rewriting the stack-top with the null string is a popping operation). This stack operation can be simplified so that in each move the stack-top symbol is either popped off or exactly one symbol is pushed on top of the stack, without affecting the language recognized by the machine. (Notice that in this simplified version there is no stack operation rewriting the stack-top symbol with another.) Suppose that, as illustrated in figure (a) below, a DPDA has a stack operation rewriting the stack-top with a string of length 1 (i.e., simply replacing the stack-top symbol with another) or of length 2 or greater (i.e., pushing one or more symbols on top of the stack). Figure (b) shows a simplified version of figure (a). Notice that the transition (b, A/B) is replaced by two transitions, a pushing operation (b, A/XB) followed by a popping operation (ε, X/ε), where X is a newly introduced stack symbol. It is clear from the transformation that this simplification does not affect the language recognized by the DPDA.

(, X/)

(b, A/XB) (b, A/B)

(a, Z0/AaBZ0)

(a, Z0/BZ0)

(, a/Aa) (, B/aB)

(a)

(b)
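The decomposition can be mechanized. The sketch below is ours, not the book's construction: the representation and helper names are invented for illustration, and unlike the book's marker trick it splits a rewrite into one pop followed by single-symbol pushes through fresh intermediate states (the stack-top check of the intermediate pushes is left schematic).

```python
import itertools

_fresh = itertools.count()  # supplies fresh intermediate state names t0, t1, ...

def normalize(p, a, A, w, q):
    """Split the transition delta(p, a, A) = (q, w), which rewrites the
    stack-top A with the symbol list w, into moves that each either pop
    the stack-top or push exactly one symbol.

    A move is a tuple (state, input, popped, pushed, next_state);
    input None means an epsilon-move, pushed None means a pop."""
    if not w:                                   # already a pure pop
        return [(p, a, A, None, q)]
    nxt = f"t{next(_fresh)}"
    moves = [(p, a, A, None, nxt)]              # pop A while reading a
    cur = nxt
    for i, sym in enumerate(reversed(w)):       # push w bottom-up, one per move
        nxt = q if i == len(w) - 1 else f"t{next(_fresh)}"
        moves.append((cur, None, None, sym, nxt))
        cur = nxt
    return moves

# (b, A/B) becomes a pop of A followed by an epsilon-move pushing B:
print(normalize("p", "b", "A", ["B"], "q"))
```

Either way, each resulting move pops or pushes exactly one symbol, which is the normal form described above.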


Rumination (4): Reading operation of a DPDA

Finally, we would like to ruminate on the reading operation of a DPDA and clarify some tricky concepts involved. Recall that in a move a DPDA does the following. (1) In the current state, read the stack-top symbol. (2) Depending on the stack-top symbol and the current state, either read the input or do not read it. (3) Depending on the input symbol read (if reading was allowed in (2)) and the stack-top symbol, pop the stack-top symbol or push some symbols, possibly changing the stack-top.

For a state p, a stack symbol A and an input symbol a, if the transition δ(p, a, A) is defined, then the above definition implies that δ(p, ε, A) should not be defined, and vice versa. Suppose to the contrary that both δ(p, a, A) = (q, BA) and δ(p, ε, A) = (r, CA) are defined (see the figure below). The transition δ(p, a, A) = (q, BA) implies the following move: (1) in state p, the DPDA reads the stack-top symbol (this part is done in every state); (2) if the stack-top symbol is A, the machine reads the input, and (3) if the input symbol read is a, it pushes B on top of the stack and enters state q. In contrast, the transition δ(p, ε, A) = (r, CA) implies the following move: (1) in state p, the DPDA reads the stack-top symbol; (2) if the stack-top is A, the machine does not read the input, and (3) it pushes C on top of the stack and enters state r.

[Figure: state p with two outgoing transitions, (a, A/BA) to state q and (ε, A/CA) to state r.]

Notice that in the two moves the machine takes two contradictory actions in the same state p with the same stack-top symbol A: in the former move it reads the input symbol, while in the latter it does not. This violates the determinism of the DPDA, which requires a unique response to the same cause (i.e., the same state p and stack-top symbol A). Deterministic automata do not allow such moves. In the following chapter we will study nondeterministic automata, where such contradictory moves are allowed.
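This determinism requirement can be checked mechanically. Below is a small sketch (Python; the transition-table encoding is ours) that scans a PDA's transitions for exactly the conflict described above: some δ(p, a, A) defined together with δ(p, ε, A).

```python
def find_conflicts(delta):
    """delta maps (state, input_symbol_or_None, stack_top) to a list of
    (next_state, push_string) pairs; None as the input symbol encodes a
    non-reading (epsilon) transition.  Return the (state, stack_top) pairs
    for which both a reading and a non-reading transition are defined."""
    return {(p, A)
            for (p, a, A) in delta
            if a is not None and (p, None, A) in delta}

delta = {
    ("p", "a", "A"): [("q", "BA")],   # reads the input symbol a
    ("p", None, "A"): [("r", "CA")],  # does not read the input
}
print(find_conflicts(delta))  # {('p', 'A')}
```

A PDA whose table yields an empty conflict set (and whose entries are single-valued) is deterministic in the sense defined above.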


Exercises

4.1 Design a DTM which, given a binary number x ∈ {0, 1}+ on the input tape, increments x by one and halts in an accepting state.

4.2 What does the following DTM do? The input is a binary string. Describe its work in detail, referring to each state. For your answer, it is a good strategy to trace the moves of the machine on a couple of short input strings.

[Figure: the state transition graph of the DTM for Exercise 4.2, with start state 0 and states 0–6; its transition labels include (0,0,L), (1,1,R), (0,0,R), (1,0,L), (0,B,L), (B,0,N), (B,B,L), (0,1,L), (0,1,R), (B,1,R), (1,B,L), (B,1,N), and (1,1,L).]


4.3 Design a DTM which, given a string x {a, b}*, does each of the following works and then halts in an accepting state. (a) Shift the string x one cell to the right.

(b) Shift the string x two cells to the right.

4.4 Design a DTM which, given a string x {a, b}*, collects all a‟s in x to the left and all b‟s to the right. For example, given the string baabbab as an input, your machine should rearrange the symbols in it such that aaabbbb is the only string in the tape. 4.5 Design a DTM which, given a string x  {a, b, c}+, moves all c‟s toward the left end of the string and halts. For example, given the string baacbbccabc, the machine should leave only the string ccccbaabbab on the tape and halt. With the transition graph of your machine and an input string, briefly explain how it works. 4.6 Design a DPDA to recognize each of the following languages. (Notice that L1, L2, and L3 have an end-of-string marker „;‟) (a) L1 = {aibj; | i > j > 0 }

(b) L2 = {aibj; | i  j > 0 }

(c) L3 = {aibj; | i  j  0 }

(d) L4 = {aibj | i > j > 0 }

(e) L5 = {aibj | i  j > 0 }

(f) L6 = {aibj | i  j  0 }

(g) L7 = { aibkci | i, k > 0 }

(h) L8 = { aibkci | i, k  0 }

4.7 Design a DPDA to recognize the language generated by each of the following context-free grammars. (a) G1: S  aSbb | abb (b) G2: S  aSbb |  (c ) G3: S  aSbb | ccc 4.8 Design a DPDA to recognize each of the following languages. L1 = { x | x  {a, b, c}+, in x the number of a‟s and the number of b‟s are same. } L2 = { x | x  {a, b, c}+, in x the number of a‟s is strictly greater than the number of c‟s. } 4.9 Design a DFA to recognize each of the following languages.

(a) L1 = {babo}
(b) L2 = {xbaboyobba | x, y ∈ {a, b, o}*}
(c) L3 = {x | x ∈ {a, b, c}+, x begins and ends with the same letter}


The following problem is about modeling a system by a DFA. The original problem is given as an exercise (page 53) in the book "Introduction to Automata Theory, Languages, and Computation" by Hopcroft, Motwani and Ullman (Addison-Wesley, 2001). Here we present it slightly modified.

4.10 Consider the network of ducts shown below. The network has two entries labeled A and B, and two exits C and D. There are two levers: x1 at the branching section and x2 at the cross-section. If we drop a marble in at A or B, the marble rolls down along the ducts and hits one of the levers, which causes it to fall either to the left or to the right. Whenever a lever is hit by a marble, the lever changes its orientation by 90 degrees, so that the next marble to encounter it will be directed in the opposite direction. The initial lever positions are set as shown in the figure, indicating that both levers direct the marble to the right. Model this network by a DFA whose input is a binary string, where 0 denotes a marble dropped in at A and 1 denotes a marble dropped in at B. The automaton accepts the input if the last marble comes out of exit D.

[Figure: the duct network, with entries A and B at the top, lever x1 at the branching section, lever x2 at the cross-section, and exits C and D at the bottom.]


Models of Language Recognition: Automata


5. Nondeterministic Automata

The computational machine models that we studied in Chapter 4 are deterministic in the sense that the next move (i.e., the effect) of the automaton is uniquely determined by the current state and the input (i.e., the cause), including the stack-top in the case of a DPDA. Thus the state transition graph of a DFA, for example, has at most one outgoing edge per input symbol from each state. For a state p and an input symbol a of a DFA, the transition function δ(p, a) gives no more than one next state, for example, δ(p, a) = q. Deterministic machines follow our intuition that for a given cause the effect is uniquely determined.

In this chapter we will study computational models whose transitions do not go along with this deterministic intuition: they are allowed to take more than one action in a move. More specifically, for the same state and input symbol (or stack-top symbol, in the case of pushdown automata), we allow the automaton to take more than one transition simultaneously. For example, a nondeterministic finite state automaton (NFA) can have a nondeterministic transition such as δ(p, a) = {q, r}, i.e., reading the symbol a in state p, the machine enters the two different states q and r simultaneously. Likewise, a nondeterministic pushdown automaton (NPDA) can have δ(p, a, A) = {(q, BA), (r, ε)}, i.e., it can push and pop simultaneously, possibly entering different states. Nondeterministic linear bounded automata (NLBA) and nondeterministic Turing machines (NTM) can have nondeterministic transitions such as δ(p, a) = {(q, b, R), (q, b, L)}.

Nondeterministic automata are a conceptual model that cannot be implemented with current technology. However, we will see that once we understand the underlying concept of these nondeterministic automata, we can use them as a powerful tool for the design and analysis of deterministic machines. We will study these nondeterministic models beginning with NFA, and then NPDA, NLBA, and NTM, in this order.


5.1 NFA (Nondeterministic Finite Automata) 126
Definition and examples
5.2 NPDA (Nondeterministic Pushdown Automata) 132
Definition and examples
5.3 NTM (Nondeterministic Turing Machines) and NLBA (Nondeterministic Linear Bounded Automata) 139
5.4 Nondeterministic Computing 140
Nondeterministic computing, Example
5.5 NFA and ε-transitions 145
ε-transitions in an FA, Eliminating ε-transitions
Rumination 154
Exercises 156

Today's Quote

Break Time

I pay very little regard to what any young person says on the subject of marriage. If they profess a disinclination for it, I only set it down that they have not seen the right person. - Jane Austen –

Love is the difficult realization that something other than oneself is real. - Iris Murdoch -


5.1 Nondeterministic Finite Automata (NFA)

Figure (a) below shows the state transition graph of an NFA that has nondeterministic transitions in states q0 and q2, i.e., δ(q0, b) = {q0, q1} and δ(q2, a) = {q0, q1}. For the input string bba, this NFA takes multiple sequences of transitions simultaneously, as shown in figure (b), which we shall call the state transition profile for the given input. Notice that a root-to-leaf path corresponds to a full sequence of transitions for the input. (In the figure, X denotes the dead state.)

[Figure: (a) an NFA M with states q0, q1, and q2; (b) the state transition profile of M for the input bba, a tree rooted at q0 that branches on each nondeterministic move.]


The accepting condition of an NFA is defined as follows: an NFA accepts the input if and only if the transition profile has an accepting leaf node (i.e., state); all the other, non-accepting leaf nodes are ignored. In other words, the input string is accepted by an NFA if and only if there is a sequence of transitions that finally enters an accepting state. According to this definition, the input string bba is accepted by M.

[Figure: (a) the NFA M; (b) the transition profile of M for bba, with a root-to-leaf path ending in the accepting state q2.]


Figure (b) below shows the transition profile for the input string bbb. This profile has no root-to-leaf path ending in an accepting state. Hence, by definition the input is rejected by M.

[Figure: (a) the NFA M; (b) the transition profile of M for bbb, in which every root-to-leaf path ends in a non-accepting state (q0, q1, or the dead state X).]


Formally, an NFA M is defined by the quintuple

M = (Q, Σ, δ, q0, F),

where all the elements are the same as for DFA's except the state transition function δ, which is defined as follows:

δ: Q × Σ → 2^Q

That is, reading an input symbol in a state, the automaton simultaneously takes transitions to every state in some subset of Q. (Recall that 2^Q denotes the set of all subsets of Q.) For a string x, let δ(p, x) denote the set of states reachable from state p after reading the input string x. Then the language L(M) is defined as follows:

L(M) = {x | x ∈ Σ*, δ(q0, x) = S, and S ∩ F ≠ ∅}

Notice that S ∩ F ≠ ∅ implies that the transition profile for the string x has an accepting leaf node, i.e., there is a sequence of transitions ending in an accepting state.


Designing an NFA

Example 1. Construct an NFA which recognizes L = {xaa | x ∈ {a, b}*}.

This language is the set of strings in {a, b}* ending with aa. The figure below shows an NFA which recognizes L. Reading b, this NFA stays in state 1, while reading a, it takes nondeterministic transitions to states 1 and 2. Because of this nondeterministic transition, it is guaranteed that if the input string contains aa, the machine enters the accepting state 3 by reading the second a. Hence, if that second a is the last symbol of the input string, then by the definition the input is accepted. Notice that together with this accepting sequence of transitions there are other, non-accepting sequences ending in a non-accepting state, which we ignore by the definition.

[Figure: an NFA with states 1, 2, and 3; state 1 loops on a and b and also moves to state 2 on a, and state 2 moves to the accepting state 3 on a.]


Example 2. Construct an NFA to recognize the following language.

L = {xaaybbz | x, y, z ∈ {a, b}*}

This language is the set of strings in {a, b}* having aa followed, not necessarily consecutively, by bb. We use the same idea as in Example 1: we connect two transition graphs, one recognizing xaa followed by another recognizing ybb, and then let the automaton read off the part z in an accepting state.

[Figure: an NFA built by chaining the Example 1 construction for aa into a similar construction for bb, ending in an accepting state that loops on a and b.]

Nondeterministic transitions do not give any additional power for recognizing a language, i.e., NFA's and DFA's recognize the same class of languages. We will learn why in Chapter 8, where we will study how to transform an NFA into a DFA recognizing the same language.


5.2 Nondeterministic Pushdown Automata (NPDA)

Since PDA's have a stack, they can do more work in a move than NFA's, and therefore can take more diverse nondeterministic moves at a time. For example, in a move they can enter different states simultaneously as well as rewrite the stack-top with different contents, as the following examples show.

δ(p, a, Z) = {(q, aZ), (r, aZ)}
δ(s, b, Z) = {(t, ε), (t, bZ), (t, cZ)}

The power of nondeterministic transitions of an NPDA is best shown by the following simple example.


Example 1. Construct an NPDA to recognize the following language.

L = {a^i b^j c^k | i, j, k > 0, and i = j or j = k}

This language is the union of the following two languages.

L1 = {a^i b^j c^k | i, j, k > 0, i = j}
L2 = {a^i b^j c^k | i, j, k > 0, j = k}

We first construct DPDA's M1 and M2, recognizing L1 and L2, respectively.

[Figure: (a) the DPDA M1, which pushes the a's with (a, Z0/aZ0) and (a, a/aa), matches each b against an a with (b, a/ε), and then reads the c's with (c, Z0/Z0); (b) the DPDA M2, which reads the a's with (a, Z0/Z0), pushes the b's with (b, Z0/bZ0) and (b, b/bb), and matches each c against a b with (c, b/ε).]

It is proven that no DPDA can recognize L. However, as the following figure shows, we can easily construct an NPDA recognizing L by combining the two DPDA's M1 and M2 with a nondeterministic transition.

[Figure: the NPDA obtained by merging the start states of M1 and M2; on reading the first a with stack-top Z0, it nondeterministically enters the a-pushing branch of M1 (with (a, Z0/aZ0)) and the a-reading branch of M2 (with (a, Z0/Z0)).]


A simple idea for identifying whether an automaton is deterministic or not is to check whether its transition function δ is single-valued, i.e., gives at most one value for each argument. However, this test is not sufficient for PDA's. As we mentioned in Chapter 4 while studying DPDA's, if a PDA has two transitions from a state with the same stack-top symbol, one reading the input and the other not reading, then the PDA is an NPDA. For example, if the transition function δ of a PDA has the two transitions shown below, then it is an NPDA: in state p, with B on the stack-top, the machine takes two contradictory actions, reading and not reading the input.

δ(p, a, B) = (q, AB)
δ(p, ε, B) = (q, AB)

Let's see how we can utilize such nondeterministic transitions to design an NPDA recognizing a language that no DPDA can recognize.


Example 2. Construct an NPDA which recognizes the following language.

L = {xx^R | x ∈ {a, b}+}

This is a typical context-free language that no DPDA can recognize (see Appendix D for a proof). If there were a marker, say #, at the center of every string, as in the language L' below, it would be trivial to construct a DPDA: reading the part x (until it reads #), the DPDA pushes all the symbols onto the stack, and then pops the stack-top for every matching symbol read from the second half (i.e., x^R) until the stack is empty.

L' = {x#x^R | x ∈ {a, b}+}

[Figure: a DPDA for L' with pushing transitions (X, Z0/XZ0) and (Y, X/YX), the switching transition (#, X/X) at the marker, matching transitions (X, X/ε), and (ε, Z0/Z0) into the accepting state. Here X, Y ∈ {a, b}; (Y, X/YX) is shorthand for (a, a/aa), (a, b/ab), (b, a/ba), (b, b/bb).]


Now we will design an NPDA which uses the power of nondeterministic transitions to recognize L (see the figure below). Reading the input symbols and pushing them onto the stack in state q1, the NPDA guesses that the part x^R begins with the next input symbol, and takes a nondeterministic transition into a popping state (q2 in the figure) while also remaining in state q1. In the popping state the machine keeps popping the stack-top for each matching input symbol. Notice that in the same state q1 with the same stack-top X, the NPDA takes a reading transition and a not-reading transition simultaneously, the one staying in q1 and the other entering q2.

L = {xx^R | x ∈ {a, b}+}

[Figure: the NPDA with states q0, q1, q2, and q3; pushing transitions (X, Z0/XZ0) and (Y, X/YX) into and on q1, the guessing transition (ε, X/X) from q1 to q2, matching transitions (X, X/ε) on q2, and (ε, Z0/Z0) into the accepting state q3. Here X, Y ∈ {a, b}; (Y, X/YX) is shorthand for (a, a/aa), (a, b/ab), (b, a/ba), (b, b/bb), and (X, X/ε) for (a, a/ε), (b, b/ε).]


A palindrome is a word or phrase that reads the same forwards and backwards (case-insensitive, with spaces ignored). For example, "Madam" and "Sums are not set as a test on Erasmus" are palindromes. The language L = {xx^R | x ∈ {a, b}+} is a language of palindromes.

Every language of palindromes (case-sensitive, with spaces excluded) is a context-free language. In Chapter 15 we will learn that every context-free language can be recognized by an NPDA. However, there are context-free languages that cannot be recognized by any DPDA; the language L above is an example. It is a challenging problem to prove whether or not a given context-free language can be recognized by a DPDA. Appendix D shows a proof that there is no DPDA that can recognize L.

Break Time

Actual Signs

In a non-smoking area: "If we see you smoking, we will assume you are on fire and take appropriate action." In the front yard of a funeral home: "Drive carefully. We'll wait." On a maternity room door: "Push, Push, Push." On an optometrist's office: "If you don't see what you're looking for, you've come to the right place." On a butcher's window: "Let me meat your needs." - Anonymous -


5.3 Nondeterministic Turing Machines (NTM) and Nondeterministic Linear Bounded Automata (NLBA)

The figure below shows part of the state transition graph of an NTM (or an NLBA). Notice the various nondeterministic transitions in states 1, 2 and 4: in a move, the machine may enter more than one state (from state 1 in the figure), write different symbols (in state 2), and/or move in different directions (in state 4). Though it would be tedious, we can show that given an NTM M, we can build a DTM M' which simulates M by systematically keeping track of every nondeterministic transition of M. NTM's and DTM's are computationally equivalent, i.e., they accept the same class of type 0 languages.

[Figure: part of an NTM's transition graph with states 1–5; state 1 takes transitions to more than one state on (a, b, R), state 2 has both (a, b, R) and (a, c, R) (writing different symbols), and state 4 has both (a, b, R) and (a, b, L) (moving in different directions).]


5.4 Nondeterministic Computing

Sometimes it helps to think that whenever a nondeterministic move occurs, the original machine makes a copy of itself for each instance of the move. For example, suppose that an NFA M, in state p, reads symbol a and takes a nondeterministic move to q and r, i.e., δ(p, a) = {q, r}. Then it is conceptually clear to think that two copies of the machine are created, one in state q and the other in state r, which proceed independently. The same concept applies to the other nondeterministic automata and algorithms.

[Figure: the machine M in state p reading a forks into two copies, M1 in state q and M2 in state r.]


For a finite set S, define a function nondet_choose(S) which returns every element of S simultaneously. The figure below illustrates conceptually how a program works with such a nondeterministic function returning multiple values. Executing nondet_choose, a copy of the process (i.e., the current instance of the executing program) is generated (i.e., forked) for each of the function values returned. Then all the process copies proceed independently, each with its own function value assigned.

[Figure: executing x = nondet_choose(q, r), the process forks into one copy computing with x = q and another computing with x = r.]


Using the function nondet_choose we can quickly search the solution space of a problem by nondeterministic forking (i.e., making copies of the program and letting them proceed independently). If any sequence of computation gives the solution, we say the program solves the problem, ignoring the unsuccessful computations; otherwise we say the program fails to find it. There are many so-called NP-hard problems for which no polynomial-time algorithm (i.e., one that gives an answer in O(n^k) time, for some constant k) is available. The sum-of-subset problem, defined as follows, is one of them.

Given a set S of N integers and an integer M, is there a subset S' ⊆ S such that the sum of the elements in S' is equal to M?

For example, let S = {8, 21, 13, 20, 7, 11, 5} and M = 34. Since 8 + 21 + 5 = 34, the answer is "yes."


Here is a nondeterministic algorithm which solves the sum-of-subset problem in time proportional to the problem size (i.e., in time linear in the number of given integers), assuming that the nondeterministic function takes constant time.

    // S is an integer array of size n. This algorithm tests whether there is a
    // subset of the numbers in S whose sum is equal to M.
    1. Nondet-Sum-of-Subset (int S[ ], int n, int M) {
    2.   int i, sum = 0;
    3.   boolean selected;
    4.   for (i = 0; i < n; i++) {
    5.     selected = nondet_choose(true, false);
    6.     if (selected) sum += S[i];
         }
    7.   if (sum == M) return "yes";
    8.   else return "no";
       }
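On a deterministic machine, the two forks of nondet_choose can be explored one after the other by recursion. The following sketch (Python; ours) tries both values of selected at every level and answers "yes" exactly when some leaf process of the nondeterministic algorithm would; the price of determinism is that the search takes O(2^n) time instead of linear time.

```python
def sum_of_subset(S, M):
    """Deterministic simulation of Nondet-Sum-of-Subset: explore both
    branches of every nondet_choose(true, false) in turn."""
    def fork(i, total):
        if i == len(S):
            return total == M                    # the test at a leaf process
        # branch with selected = true, then with selected = false
        return fork(i + 1, total + S[i]) or fork(i + 1, total)
    return "yes" if fork(0, 0) else "no"

print(sum_of_subset([8, 21, 13, 20, 7, 11, 5], 34))  # yes  (8 + 21 + 5)
print(sum_of_subset([34, 7, 12], 19))                # yes  (7 + 12)
```

Each recursive call corresponds to one node of the decision tree discussed next, and each base case to one leaf process.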


Notice that whenever the statement selected = nondet_choose(true, false) is executed, two processes are created, one with selected = true and the other with selected = false. Let 1 and 0 denote true and false, respectively, and suppose |S| = 3. Then the initial process is forked down three levels, summing up the selected (= true) elements by the statement in line 6, and the statement in line 7 makes the final decision. Notice that there are 8 leaf nodes (processes) in the decision tree, each corresponding to one of the 8 possible subsets of a set of size 3. If some subset sums to the given number M, one of the leaf processes will output "yes." The figure shows, for the set S = {34, 7, 12} and M = 19, the successful selection sequence 011 and the leaf node which will output "yes."

[Figure: the depth-3 decision tree of selections (1: true, 0: false) with leaves 000 through 111; for S = {34, 7, 12} and M = 19, the leaf 011 (selecting 7 and 12, with 7 + 12 = 19) outputs "yes."]


5.5 ε-transitions in an FA

Consider the part of a state transition graph of an FA shown in figure (a) below. In state q1, if the input is a, the automaton remains in state q1, and if the input is b, it enters state q2. Now we assume that while the automaton does not read the input, it remains in the current state. To denote such an idling transition, let's add a self-loop labeled ε to every state, as shown in figure (b). Clearly, this modification does not affect the computational capability of the FA.

[Figure: (a) state q1 with a self-loop on a and an edge to q2 on b; (b) the same graph with an ε-labeled self-loop added to each state.]


Now we define a "spontaneous" transition between two different states of an FA. For example, δ(p, ε) = q denotes a spontaneous transition from state p to q without reading the input. We shall use the label ε on the state transition graph to denote such spontaneous transitions (also called ε-transitions), as shown in figure (c) below. In this figure, if we take into account the ε-labeled self-loop that we introduced for the case of no input, we see that the machine has nondeterministic transitions on ε; that is, δ(q1, ε) = {q1, q2}. So if an FA has an ε-transition between two different states, it is an NFA. (As usual, we shall not explicitly show the ε-labeled self-loops; only ε-transitions between two different states will be shown.)

[Figure: (a) the original graph fragment; (b) the same fragment with ε self-loops on q1 and q2; (c) the fragment with an ε-transition added from q1 to q2.]


Eliminating -transitions from an FA An NFA with ε-transitions are very convenient model for designing a DFA, which can be implemented in the real application, or solving other problems concerning regular languages. However, the ε-transitions in an NFA need to be eliminated before converting the NFA to a DFA. It is possible to remove all the ε-transitions, if any, from an NFA without changing the language of the automaton. We will shortly learn how. Before we introduce a method for eliminating -transitions from an NFA, we need to extend the transition function  as follows. For a subset A of the set of states Q of an NFA (i.e., A  Q) and an input string x, let (A, x) denote the set of states reachable from each state in A reading the string x. For example, in the following FA, we get ({1, 3}, ab) = {1, 2, 3}. 1 start

a

2

b

a 3

4

a

a, b b

147


Example 1. Eliminate the -transition from the NFA shown below without affecting the language recognized by the automaton. This example would be enough to understand the basic idea for eliminating transitions from an NFA. For the method we need the following property of the regular expressions involved with the null character , where t is an input symbol.  =  = *, t = t = t = t = *t*. We will also use the property that for a subset A  Q and two input strings x and y, (A, xy) = ((A, x), y). Let M be the given NFA and M ' be the FA that we get by eliminating the -transitions from M. b a start

q0



q2

For each state qi and for each input symbol t  {a, b}, we compute the set of states S that can be reachable by reading t, i.e., (qi, *t* ) = S. (Notice that instead of t, we use *t*, because M has -transitions.) We let this S be the states reachable from qi in M' by reading t, i.e., '(qi, t) = (qi, *t* ) = S. 148


The figure below shows in detail how we build the transition function δ' of M' from δ of M, eliminating the ε-transitions from state q0. (Recall that each state carries a self-loop labeled ε.)

[Figure: (a) the NFA M; (b) building M', where q0 gains the transitions δ'(q0, a) = {q0, q1} and δ'(q0, b) = {q1}.]

δ'(q0, a) = δ(q0, ε*aε*) = δ(δ(q0, ε*), aε*) = δ({q0, q1}, aε*) = δ(δ({q0, q1}, a), ε*) = δ({q0}, ε*) = {q0, q1}
δ'(q0, b) = δ(q0, ε*bε*) = δ(δ(q0, ε*), bε*) = δ({q0, q1}, bε*) = δ(δ({q0, q1}, b), ε*) = δ({q1}, ε*) = {q1}


This figure shows the part for eliminating ε-transitions from state q1 in the construction of δ'.

[Figure: (a) the NFA M; (b) building M'; (c) the final M', in which both q0 and q1 are accepting, q0 loops on a and moves to q1 on a and b, and q1 loops on b.]

δ'(q1, a) = δ(q1, ε*aε*) = δ(δ(q1, ε*), aε*) = δ({q1}, aε*) = δ(δ({q1}, a), ε*) = δ(∅, ε*) = ∅
δ'(q1, b) = δ(q1, ε*bε*) = δ(δ(q1, ε*), bε*) = δ({q1}, bε*) = δ(δ({q1}, b), ε*) = δ({q1}, ε*) = {q1}

Finally, if an accepting state is reachable from the start state by a sequence of ε-transitions (i.e., ε is in the language of M), we make the start state q0 an accepting state, as shown in figure (c) above.


Example 2. Eliminate the -transitions from the following NFA. a  start

c

b 

q1

q0

A N I

q2

(a) M

Computing ' for the state q0 and input symbol a. '(q0, a) = (q0, *a* ) = (( q0, * ), a* ) = ({ q0, q1, q2 }, a* ) = ( ( q0, q1, q2}, a), * ) = ( q0, * ) = { q0, q1, q2 }. a

start

q0

a

q1

q2

a

(b) M 151


Computing δ' for the state q0 and input symbols b and c:

δ'(q0, b) = δ(q0, ε*bε*) = {q1, q2}
δ'(q0, c) = δ(q0, ε*cε*) = {q2}

[Figure: the partially built M' after adding these transitions.]

Eliminating -transitions

Nondeterministic Automata

Computing ' for the states q1 and q2 , and input symbols a, b, and c. '(q1, a) = Ø ,

'(q1, b) = { q1, q2 },

'(q1, c) = { q2 },

'(q2, a) = Ø ,

'(q2, b) = Ø ,

'(q2, c) = { q2 }.

Finally, since the accepting state q2 is reachable from the start state by ε-transitions, we change the start state into an accepting state, obtaining M' as shown in figure (b).

[Figure: (a) the NFA M; (b) the final M', in which q0 and q2 are accepting: q0 loops on a, moves to q1 on a, b and to q2 on a, b, c; q1 loops on b and moves to q2 on b, c; q2 loops on c.]


Rumination (1): Nondeterministic automata

• The only difference between DFA's and NFA's is in the definition of δ, as shown below.

DFA: δ: Q × Σ → Q
NFA: δ: Q × Σ → 2^Q

In other words, an NFA takes a transition to one or more states (i.e., to a subset of Q), while a DFA takes a transition to at most one state. Hence we can view a DFA as a special NFA, every transition of which goes to at most one state.

• It is not easy to identify the language accepted by a DFA directly when its state transition graph is not simple. It is even harder for an NFA, because for a given input string there are multiple transition paths, some leading to an accepting state and others not. In Chapter 7 we will learn a systematic way to transform the state transition graph of an FA (either DFA or NFA) into a regular expression for the language the FA recognizes.

• Since NFA's are allowed to take multiple transitions simultaneously, they seem more powerful than DFA's. In particular, we may ask: is there an NFA language that cannot be recognized by any DFA? In Chapter 8 we will learn a method for transforming an NFA into a DFA that recognizes the same language. This implies that for FA's, nondeterminism gives no additional power for language recognition. We may raise the same question for PDA's, TM's and LBA's. For PDA's, nondeterminism does give additional power (see Appendix D for the proof); we will see a couple of examples. For TM's it does not, and for LBA's it is an open problem.

• We will see that, once we get used to nondeterminism, it is easier to design an NFA than a DFA. We will also see that the NFA is a powerful tool for solving many problems concerning regular languages and related models.


• With nondeterministic automata, we can search the solution space of a problem by tracing diverse routes simultaneously, as in a parallel computer. But they are different from parallel computers. While a parallel computer generates, out of the parallel search, the final result for the user, a nondeterministic automaton does not "announce" a final result. The user must make the final decision out of the many possible results. Nondeterministic automata cannot say "Yes, the input is accepted" or "No, it is rejected." Such final decisions are up to the user.

• We learned that the stack operation mode can be restricted to either popping or pushing exactly one symbol in a move (see the rumination section at the end of Chapter 4). We can apply the same restriction to NPDA's without affecting their capacity for language recognition.

• Nondeterministic automata are also different from probabilistic machines, where a move takes place according to a given probabilistic rule. For example, in an NFA, δ(p, a) = {q, r} implies that in state p, reading a symbol a, the machine certainly enters states q and r simultaneously. There is no coin-tossing.

• From now on, by TM, LBA, PDA and FA we denote those automata either deterministic or nondeterministic.

Ask God

Break Time

A person asked God, “What surprises you most about mankind?” And God answered: “That they lose their health to make money and then lose their money to restore their health. That by thinking anxiously about the future, they forget the present, such that they live neither for the present nor the future. That they live as if they will never die, and they die as if they had never lived ….” - Marty -


Exercises

5.1 For each of the strings given below, (i) show the state transition profile (the tree) of the FA shown below, and (ii) based on the profile, check whether the string is accepted by the automaton or not.

(a) aaabaa  (b) aabaa

[Figure: state transition graph of an NFA, with transitions labeled a and b]

5.2 What is the language recognized by the NFA above? (i) Describe the language informally. Your answer should be clear. (ii) Express the language with a regular expression.

5.3 (a) Construct the state transition graph of an NFA which recognizes {aaxaa | x ∈ {a, b, c}+}. Your graph should have no more than 6 transition edges. (Do not count the implicit edges going to the dead state.)
(b) Construct a DFA which recognizes the language given in part (a).

5.4 For each of the following languages, construct a PDA (either NPDA or DPDA, whichever possible) which recognizes the language, and briefly explain how your PDA recognizes the language. For your PDA, it is enough to show the state transition graph.

(a) L1 = {xyz | x, y, z ∈ {a, b}+, and y = zR}

(b) L2 = {xyz | x, y, z ∈ {a, b}+, and x = yR}


5.5 Given a directed graph G and two nodes s and t in G, a Hamiltonian path from s to t is a path that starts at node s and ends at node t, visiting all the other nodes exactly once. For example, for the graph below and the two nodes s = 4 and t = 7, a Hamiltonian path is 4 → 2 → 1 → 8 → 9 → 10 → 6 → 5 → 3 → 7. Given an arbitrary directed graph G and two node ids s and t, write a nondeterministic algorithm that checks whether G has a Hamiltonian path from s to t. Your algorithm should take no more than polynomial time, i.e., time O(n^k) for some constant k. (This is a challenging problem. Review the nondeterministic algorithm for the sum-of-subset problem that we studied in Section 5.4.)

[Figure: a directed graph with nodes 1 through 10, where node 4 is marked s and node 7 is marked t]

5.6 Using the technique for eliminating ε-transitions from an NFA presented in Section 5.6, construct an NFA with no ε-transitions which recognizes the language of the following NFA. You should also show the procedure that you took to get your answer.

[Figure: state transition graph of an NFA with states 1 through 5, transitions labeled a, b, and ε]
