Automata and Language Theory First Semester SY 2008-2009 Pabroa, Benjie
Consider the game “snake and lather” ▪ Look at all possible position of the piece on the board and call them states. ▪ The game changes from one state to another in a fashion determined by the input of a certain number ▪ For each possible number, there is one and only one resulting state
▪ We should allow for the possibility that after a number is entered, the game is still in the same state as it was before. ▪ After a certain number of rolls, the board arrives at a state that means a victory for one of the players and the game is over. We call this a final state. ▪ There might be many possible final states that result in victory for this player.
▪ In computer theory, final state are also called halting states, terminal states, or accepting states. ▪ Beginning with the initial state (which we presume to be unique), some input sequences of numbers lead to victory for the a player and some do not.
How can we differentiate “snake and
lather” with computer? ▪ You have a simple computer (input device, processing unit, memory, output device) and you wish to calculate the sum of 3 plus 4. ▪ You write a program, which is a sequence of instructions that are fed into the machine one at a time. Each instruction is executed as soon as it is read, and then the next instruction is read. If all goes well, the machine outputs the number 7 and terminates execution. ▪ We can consider this process to be similar to the game
▪ The computer is also deterministic ▪ On reading one particular input instruction, the machine converts itself from the state it was in to some particular other state where the resultant state is determined by the prior state and the input instruction. ▪ Some sequence of input instructions may lead to success and some may not. Success is entirely determined by the sequence of inputs. ▪ In calculating 4+3, what is important is the printing of 7, the rest in the memory does not matter.
▪ Connecting with language…. ▪ In the game, we can consider the set of all possible dice rolls (or set of all computer instructions) to be the letter of an alphabet. ▪ We can then define a certain language as the set of strings of those letters that lead to success / final state. ▪ We can then define the language to be the set of all words over this alphabet that lead to success. ▪ This is the language whose words are all programs that print a 7.
▪ The most general model (of the game and computer program) is called a Finite Automaton (FA) ▪ “finite” because the number of possible states and number of letters in the alphabet are both finite ▪ “automaton” because the change of states is totally governed by the input. The determination of what state is next is automatic (involuntary and mechanical), not willful. Automaton comes to us from the Greek, so its correct plural is automata
A finite automaton is a collection of three things: A finite set of states, one of which is designated as the initial state, called the start state, and some (maybe none) of which are designated as final state. An alphabet ∑ of possible input letters. A finite set of transitions that tell for each state and for each letter of the input alphabet which state to go to next.
▪ It works by being presented with an input string of letters that it reads letters by letter starting at the leftmost letter. ▪ Start state -> seqseqseq -> end state
FA is also called as: ▪ Finite acceptor / language acceptor ▪ Language recognizer
▪ The set of all strings that do leave us in a final state is called the language defined by the finite automaton. ▪ We may say the a word is not accepted by this finite automaton because it does not lead to final state, or ▪ We may also say that the word is rejected by this FA
▪ The set of all strings accepted is the language associated with the FA.
▪ The set of all strings accepted is the language associated with the FA. The we can say that: ▪ “this FA accepts the language L” or ▪ “L is the language accepted by this FA”, or ▪ “L is the language of the FA”
▪ Question: ▪ If language L1 is contained in L2 and a certain FA accepts L2. Is this FA accept all the word in language L1.
Suppose that the input alphabet are a and
b. Let us also assume that there are only three states, x,y, and z Rule 1 From state x and input a, go to state y Rule 2 From state x and input b, go to state z Rule 3 From state y and input a, go to state x Rule 4 From state y and input b, go to state z Rule 5 From state z and any input, go to state z ▪ Let us designate state x as the starting state and state z as the only final state ▪ We now have a perfectly defined FA because it fulfills all the requirements: states, alphabet, transitions
Try to input the following to our FA
and determine if it is in the language defined in our FA ▪ aaa ▪ abba
Describe the language associated
with this FA Create the regular expression associated with this FA
It is much simpler to summarize them in a
table format. ▪ The entries inside the table are the new states that the FA moves into – the transition states. ▪ This is called transition table
Start x y Final z
a y x z
b z z z
▪ Even though it is no more than a table of symbols, we consider an FA to be a machine, that is, we understand that this FA has dynamic capabilities. It moves. it process inputs.
▪ We may make the definition of FAs even more mathematically abstract by replacing the transition table with a total function whose input is a pair of state and alphabet letter and whose output is a single state. ▪ This function is called the transition function, usually denoted δ (lowercase Greek delta)
The abstract definition of an FA is then 1.A finite set of states Q = {q0 q1 q2…} of which q0 is the start state. 2.A subset of Q called the final state. 3.An alphabet ∑={x0 x1 x2…). 4.A transition function δ associating each pair of state and letter with a state: δ(q1 , xj ) = xk: Note: we never refer to this transition function again….
▪ Transition Diagram - a pictorial representation of an FA that gives us more of a feel for the motion ▪ We begin by representing each state by a small circle. ▪ From each state, we draw arrows showing to which other states the different letters of the input alphabet will lead us. ▪ We label these arrows with the corresponding alphabet letters. ▪ We make a loop indicating that a certain letter makes a state go back to itself
▪ We indicate the start state: by labeling it with the word “start” by a minus sign (-) an arrow ▪ We indicate the final state: by word “final” by plus sign (+) drawing a box or another circle around its circle
▪ Create the diagram of the FA we have just defined earlier. ▪ Trace the direction/path of which the following inputs will ▪ aaaabba ▪ bbaabbbb
▪ We borrow from graph theory the name direct edge, or simply edge, for the arrow between states. ▪ An edge comes from one state and leads to another (or the same, if it is a lood) ▪ Every state has as many outgoing edges as there are letters in the alphabet. ▪ It is possible for a state to have no incoming edges or to have many.
▪ There are machines for which it is not necessary to give the states specific names. ▪ we can still determine whether a particular input string will be accepted by our machine or not a -
a b
b +
a
b
One of many FAs that accepts all words
is
a, b ±
Similarly, there are FAs that accept no
language. a -
b
a, b
In this case, we can say that the graph is
disconnected a, b
-
a, b a, b
+
▪ Given the diagram below a, b
a -
+
b
▪ Describe the language associated with this machine ▪ Construct the regular expression of the FA associated with this machine
▪ It is possible to look at the world of FAs in two ways: ▪ We could start with the machine and try to analyze it to see what language it accepts, or ▪ We could start with a desired language in our mind and try to construct an FA that would act as a language-recognizer or language definer
▪ We must practice studying FA from two different angles: ▪ Given a language can we build a machine for it. ▪ Given a machine, we can deduce its language.
Build
an FA that accepts all words containing a double letter aa or bb, and only those words
▪ Let us build a machine that accepts the language of all words over the alphabet {a b} with an even number of letters ▪ Suppose we want to build a finite automaton that accepts all the words in the language a(a+b)* ▪ Build your FA using 4 and 5 states.
▪ Build an FA that accepts all words containing a tipple letter aaa or bbb, and only those words. Hits: -
a
a
a
+
State 2 We have just read an a that was not preceded by an a and we are looking for a second a as the next input
a
State 1 Start here but do not get too comfortable; you are going to leave immediately
State 4
a
2
a,b 1-
b
4+
a
b
3
b
State 3 We have just read a b that was not preceded by an b and we are looking for a second b as the next input
We have already discovered the existence of a double letter in the input string and we are going to wait out the rest of the input sequence and then announce acceptance when it is all over
Create
an FA that accepts all words with b as the third letter and rejects all other words.
4
a a, b 1-
2
a, b
a, b
3
b
a, b 5+
= (aab + abb + bab + bbb) (a+b)* = (a+b)(a+b)b(a+b)*
This
FA accepts only the word baa b
a
-
a
a b
b
a, b
a, b
+
Create
an FA that accepts exactly two strings baa and ab
+
a
a
-
b
a
a
b
a
a, b
a, b
a, b
+
▪ What is the language accepted by this machine? a
1-
b
2
a
3+ a
b
b
b
4 a
▪ This FA accepts the language in the previous FA and null(^) a
a
b
±
b
b
a
How can you define the language
expressed by the FA below?
±
a,b
a,b
Create
the FA that figures out the language that accepts all words that ends with the letter a
a b
a +
-
b = b*ab*(ab*ab*)*
Create the FA of the language
associated with the regular expressions: (5 pts each) ▪ (a+b)*a ▪ ^ + ((a+b)*a) ▪ (a+b)* aa (a+b)*
Create an FA that accepts all words that
have different first and last letters. It is associated with the Regular expression a(a+b)*b + b(a+b)*a (15 pts)
▪ Build an FA that accepts only the language of all words with b as the second letter ▪ Build an Fa that accepts only the words baa, ab and abb and no other strings longer or shorter ▪ Build an FA that accept only those words that have more than four letters
Build
an FA that accepts only those words that do not end with ba. Build an FA that accepts only those words that begin or end with a double letter. Build an FA that accepts only those words that have an even number of substrings ab.
Consider the picture below b 1±
a
a
3
2
b
a
b b
a
4