Thomson Const Algo

  • June 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Thomson Const Algo as PDF for free.

More details

  • Words: 627
  • Pages: 4
Construction of an NFA from a Regular Expression Computer Science 332 Compiler Construction

A constructive algorithm (proof) : cases of algorithm follow cases in definition Base case: NFA recognizing a in Σ , or ∈



3.7 From a Regular Expression to an NFA





Inductive cases: Disjunction Concatenation Kleene closure

Each case introduces no more than 2 states, so size of NFA is linear in size of reg. exp.

Thompson's Construction

Thompson's Construction

First parse r.e. into constituent subexpressions; e.g., (a | b)*abb parses to

Next build NFA from depth-first traversal of parse, according to following constructive rules:

r5 ( r1 a

r4 r3 |

) r2 b

r7 *

r9 r6 a

r11 r8

1. For ∈ construct the NFA r10

start



b

b

2. For a in Σ construct the NFA start

a

Thompson's Construction 3. Suppose N(s) and N(t) are NFA's for regular expressions s and t a) For the reg. exp. (s | t) construct the following composite NFA N(s | t): ∈

N(s)



start ∈

3. Suppose N(s) and N(t) are NFA's for regular expressions s and t b) For the reg. exp. st construct the following composite NFA N(st): start

N(s)

N(t)

∈ N(t)

I.e., merge final state of N(s) and initial state of N(t)

Thompson's Construction 3. Suppose N(s) and N(t) are NFA's for regular expressions s and t c) For the reg. exp. s* construct the following NFA N(s*): ∈ ∈

start

Thompson's Construction

∈ N(s)



Thompson's Construction 3. Suppose N(s) and N(t) are NFA's for regular expressions s and t c) For the reg. exp. (s) use the existing NFA N(s)

Thompson's Construction New NFA has at most twice as many states as all of its component NFA's. New NFA has exactly one start state and one accepting state. Each state of new NFA has either one outgoing transition on a symbol in Σ or at most two outgoing ∈ transitions.

Thompson's Construction Let's build an NFA for r.e. (a | b)*abb with parse tree

r5 ( r1 a

Simulating an NFA

|

)

r6

r11 r8

r10 b

b

a

r2 b



Goal: Given reg. exp. r and input string x, determine whether x is in L(r) Method #1: Build NFA N from r using Thompson's construction, then run previous algorithm 

S := ∈-closure({s0}) a := nextchar while a ≠ eos do begin S := ∈-closure(move(S, a)) a := nextchar end if S ∩ F ≠ ∅ then return ''yes'' else return ''no''

r3

*

r9

Time-Space Tradeoffs



Can use the following deterministic algorithm to simulate an NFA without building its DFA:

r4

r7

Can construct NFA in O(|r|) time. N has at most twice as many states as |r|, and at most two transitions from each state, so transition table is O(|r|) space. Previous algorithm accepts or rejects x in O(|r|×|x|) time.

Time-Space Tradeoffs





Goal: Given reg. exp. r and input string x, determine whether x is in L(r) Method #2: Build NFA N from r using Thompson's construction, then DFA D from N using subset construction; then use DFA algorithm from last time for accepting/rejecting x D can have up to 2k states, where k = # states in N. ''Worstcase'' string (a | b)* a (a | b)(a | b)...(a | b) : why? DFA acceptance algorithm accepts or rejects x in O(|x|)

Tradeoffs Summary: Automaton

Build

Run

NFA

O(|r|)

O(|r|×|x|)

DFA

O(2|r|)

O(|x|)

So use first method (NFA) for quick search over short text strings (e.g., emacs r.e. search) Use second method (DFA) for longer searches over long text strings (e.g., Unix grep on multiple files) ''Lazy'' DFA method builds DFA transition table on the fly, caching transitions as state/input pairs are encountered.

Related Documents

Thomson Const Algo
June 2020 3
Const
November 2019 38
Algo
June 2020 24
Algo+
June 2020 27
Algo
December 2019 45
Algo
June 2020 26