Appendix B Converting a 2-way DFA to a 1-way DFA
a
....
b
q0 (a) 2-way FA M
a
....
b
r0 (a) 1-way FA M’
1
2-way FA ⇒1-way FA Here we will show that every language recognizable by a 2-way DFA can also be recognizable by a 1-way FA. Actually, we will show an algorithm which given an arbitrary 2-way DFA M, constructs an 1-way DFA M’ recognizing the same language. (This algorithm was first presented by J. C. Sheperdson in IBM J. Res. 3: 2, 198-200.) Recall that both 1-way FA and 2-way FA have the same starting and accepting configuration, i.e., initially they read the leftmost input symbol and accepts the input string by falling off to the right of the rightmost input symbol in an accepting state. (Recall that the blank symbol does not belong to the input alphabet, and hence, they are not allowed to read it during the computation.)
a
....
b
q0 (a) Starting configuration of M
a
....
b qf
(b) Accepting configuration of M 2
2-way FA ⇒1-way FA
The Algorithm
Before we present the conversion algorithm, lets examine how M works with an example shown in figure (a) below. Given the input string aba, this automaton M will move as shown in figure (b). We are interested in the sequences of states and directions while the machine moves back and forth across the cell boundaries. In the figure each arrow shows the direction together with the machine’s state. (b, L)
(a, R), (b, R)
2 (a, L)
1 (a, L)
(b, R)
3
(b, L) (a, R)
a
2 3
5
(a) 2-way DFA M
b
1 (b, R)
4
a
(a, R)
3 1
5
4 5
4 2
Accept!
3
3
(b) The computing profile of M on input aba 3
2-way FA ⇒1-way FA
Algorithm
Let Q = {1, 2, . . . , n} be the set of states of M. If the machine crosses a cell boundary to the left in state i and crosses back the same boundary to the right in a state qi ∈ Q, then we shall call the state pair (i, qi) a crossing state pair. If there is no such state qi (i.e., it never crosses back), we shall denote it by 0. Note that on each cell boundary, there are n crossing state pairs. We shall call this list of pairs the crossing information (shortly, CI). .... ....
.... i
....
1
q1
2
qi
(a) Crossing state pair (i, qi)
. . . n
q2
qn
(b) Crossing information CI 4
Algorithm
2-way FA ⇒1-way FA
The key idea of the algorithm is that given the CI on a boundary ci and the input symbol in its right adjacent cell, it is possible to compute the CI on the next cell boundary ci+1 . Figure (b) shows how it is possible with M. (Recall that in a CI, 0 implies that M never crosses back the boundary.) ci+1 ci ....
2 (a, L)
1 (a, L)
(a, R)
(b, L)
(a, R), (b, R)
(b, R)
3 (b, R)
4
5 (b, L) (a, R)
(a) 2-way DFA M
a
1 q1
2
2 q2
0
3 q3 4 q4
3 0
... 2 0
A N I
3 0
5 q5
5 5 (b) Computing CI on ci+1
5
2-way FA ⇒1-way FA
Algorithm
Now, with M in Figure (a) below we shall show how it is possible to accept the input string aba ∈ L(M), going 1-way to the right. Since it is illegal for M to cross the leftmost cell boundary c0 to the left, for every state i, its crossing state pair is (i, 0) because the machine never crosses the boundary back to the right. c0 c1 c2 c3 a (a, R), (b, R) 1 (a, L)
(a, R)
(b, L) 2 (a, L)
(b, R)
3 (b, R)
4
5 (b, L) (a, R)
(a) 2-way DFA M
1 q1
0
2 q2
0
3 q3
0
4 q4
0
5 q5
0
b
a
(b) Starting CI 6
2-way FA ⇒1-way FA
Algorithm
Let q0 denote the state in which M crosses a cell boundary first time to the right. With the CI on c0 we first find q0 in which M crosses c1 to the right and then compute the CI on c1 with the CI on c0. Since by convention M starts reading the leftmost input symbol in the start state 1, q0 on c1 is 2. With the CI on c1 computed, we find q0 (state 4) on c2, c0 c1 c2 c3 and compute the CI on c2 and so on. b
a
1 (a, L)
(a, R)
(b, L)
(a, R), (b, R)
2 (a, L)
(b, R)
3 (b, R)
4
5 (b, L) (a, R)
(a) 2-way DFA M
q0
1
2
1 q1
0
2
2 q2
0
0
3 q3
0
3
4 q4
0
0
5 q5
a 4
A N I
(b) Computing CI 0
5 7
2-way FA ⇒1-way FA
Algorithm
The following figure shows how to compute CI on c2 with CI on c1, and then find q0 on c3 (i.e., state 3). This figure shows that M, starting in state 1, crosses the boundaries to the right in states 2, 4 and 3, in this order, and accepts the input string. c0 c1 c2 c3 b
a (a, R), (b, R) 1 (a, L)
(a, R)
(b, L) 2 (a, L)
3
(b, R)
(b, R)
4
5 (b, L) (a, R)
(a) 2-way DFA M
q0
1
a 4
2
1 q1
2
2 q2
0
3 q3
3
4 q4
0
5 q5
5
2
3 Accept!
4 4 3
A N I
3 8
2-way FA ⇒1-way FA
Algorithm 2 (b, R)
(a, L)
1 (a, L)
(a, R)
(b, L)
(a, R), (b, R)
3 (b, R)
4
5 (b, L)
a
b
1
(a, R)
5
4 5
q0
1
1 q1
0
2
2 q2
0
0
4
3 q3
0
2
4 q4
0
5 q5
0
a
3 1
b 2
a
2 3
The figure below shows q0 when the machine crosses each boundary first time to the right. The sequence 2, 4, 3 is identical to the one made my M.
3
3
Accept!
3 0
5
a 4 2
3 Accept!
4 4 3 3 9
Algorithm
2-way FA ⇒1-way FA
Computing next CIM be a 2-way DFA. Below is the algorithm which given an input symbol a on a Let tape cell and CI c on the left cell boundary, computes CI c’ on the right cell boundary. If M crosses the left boundary to the left in state q, the algorithm calls function computeNextq0 (c, n, q, a) (see next page) and traces c to find the state in which the machine crosses the right boundary c’ to the right first time. Algorithm computeNextCI (c, c’, n, a) { // input: CI in an integer array c[n] with values c[i] ∈ {1, 2, . . ,n} which represents // CI = <(1, c[1]), (2, c[2]), . . ., (n, c[n])>, and the transition function δ of a 2-DFA. // output: next CI in array c’[n]. c c' for ( i = 1; i <= n; i++ ) { // Note that q is a state variable. a let (q, d) = δ ( i, a ); // q ∈ {1, 2, . . , n}, d ∈ {+1, -1} if (d == +1 ) c’[i] = q; // M moves right in state q. else let c’[i] = computeNextq0(c, n, q, a) ; // M moves left (i.e., d = -1) in state q } // end for-loop } // end algorithm 10
Algorithm
2-way FA ⇒1-way FA
// M, reading a in state q, moves to the left. // Array c[n] contains the CI on the left boundary of the cell where a is written. // This algorithm traces c[n] and finds the state of M when it crosses the right // boundary to the right first time. The set L is used to check if M keeps crossing // the left boundary back and forth (i.e., enters a loop). Algorithm computeNextq0 (c, n, q, a) // q ∈ {1, 2, . . , n}, d ∈ {+1, -1} c c' { let L = { } //empty set do { if (c[q] == 0) then return 0; // M is in dead state (never comes back). if (q ∈ L ) then return 0; // M is in a looping state. else put q in set L; let (q, d) = δ (c[q], a); } while ( d == -1); // keep tracing CI c while d = -1 return q; // M is crossing the right boundary in state q; }
a
11
2-way FA ⇒1-way FA
Algorithm
Computing q0 Here is the algorithm which using computeNextCI and ComputeNextq0, finds the sequence of states (denoted by q0 ) in which M first crosses the cell boundaries from left to right first time. We assume that the transition function δ and an input string are given as global data. let integer array c[n] = {0}; define an integer array c’[n]; let q0 = 1;
// initialize leftmost CI // 1 is the start state of 2-DFA M
do { read next input into a; print (q0); // Output current value of q0. computeNextCI (c, c’, n, a ); c = c’ ; let (p, d) = δ (q0, a);
c'
c a
if (d == +1) q0 = p; else q0 = ComputeNextq0(c, n, p, a ); } forever; 12
2-way FA ⇒1-way FA
Algorithm
The algorithm traces the sequence of states q0, while computing the CI on each cell boundary moving right. Since the number of states n is finite, there are at most n2 different CIs, which is finite. It follows that the sequence of states q0 can be computed by a 1-way FA with M and the algorithm A stored in its finite state control and the same input written on the input tape as illustrated in figure (b) below.
....
M (a) 2-DFA M
....
M A (b) 1-DFA accepting L(M)
13
2-way FA ⇒1-way FA
Algorithm
Let
denote the pair of CI and q0 on a cell boundary. Given a pair of information on a cell boundary, M’ can compute on the next cell boundary using the transition function of M and the algorithm A. Since M and the algorithm A are fixed, the number of states of M’ is determined by the number of that is finite. Let n be the number of sates of M. For each state i, there are no more than n different crossing pairs , implying that there are at most (n)n different CI’s. Since there can be at most n different q0, there are no more than (n+1)n+1 pairs of . This number is finite. Let those CIs be named as C0, C1, C2, . . . , and the first CI be as shown in Figure (a) for two input symbols a and b, then the state transition graph of M’ will appear as shown in Figure (b). ( Notice that 1 is the start state and C0 is the CI on a the leftmost cell boundary, where no q0.) i
a
b
1
C0
p
Ci
1
C0
q
Cj
(a) CIs for the first input a and b
C0 b
(b) State transitions of 1-DFA M’ 14
2-way FA ⇒1-way FA
Algorithm
Since M is deterministic, and given a pair and an input symbol, the algorithm computes a unique succeeding pair, M’ is deterministic. We can extend the algorithm to nondeterministic 2-way FA. Suppose that for an input symbol a, M has some nondeterministic transitions as shown in figure (a) below. We let the 1-way NFA M’ nondeterministically choose one of the transitions and apply the algorithm for constructing the next CI as the following figures illustrate. a
(a, R) r
s
a
r
t
(a, L)
s
a
a r
(a) 2-way NFA
Cj
Ci
<s, Cj>
q
(c) Transitions of 1-way NFA
t
Ci
Ck
(b) Computing CI 15
2-way FA ⇒1-way FA
Algorithm
Previous example is somewhat misleading, because the state transition graphs of the two automata appear isomorphic. This is not true in general. Consider the 2-way NFA in figure (a) below which has a self-looping transition on state t. The 1-way NFA constructed by the algorithm may not necessarily have a self-loop on state because the pair can be different from . a (a, R)
s
a
r
a
t (a, L)
<s, Cj>
(a, R)
(a) 2-FA transitions
(b) 1-FA transitions 16
2-way FA ⇒1-way FA
Algorithm
The 1-way FA model has many attracting properties, such as on-line (computes while receiving the input string), real-time (takes exactly |x| steps for input string x), easier to implement in hardware than the other model, etc. On the contrary, the 2-way FA model is off-line and the computing time may take longer than the input length. Because of the 2-way property, it is not easy to analyze or manipulate them. For example, consider the 2-way FA shown below. Though the automaton appears simple, it is not easy to figure out the language L(M) recognized by the machine. (a, R) (b, R) 1
(b, R) 2
(a, L)
(a, R) 3
L(M) = (a+b)ba*
(b, L)
17