11-711 Algorithms for NLP
Introduction to Analysis of Algorithms
Reading: Cormen, Leiserson, and Rivest, Introduction to Algorithms, Chapters 1, 2, 3.1, and 3.2.
Analysis of Algorithms

Analyzing an algorithm: predicting the computational resources that the algorithm requires.
– Most often we are interested in the running time of the algorithm.
– Also memory, communication bandwidth, etc.

Computational model assumed: RAM.
– Instructions are executed sequentially.
– Each instruction takes constant time (different instructions may have different constants).

The behavior of an algorithm may be different for different inputs. We want to summarize the behavior in general, easy-to-understand formulas, such as worst-case and average-case running time.
Analysis of Algorithms
Generally, we are interested in computation time as a function of input size.
– Different measures of input size, depending on the problem.
– Examples:

  Problem          Input                      Size
  Sorting          List of numbers to sort    # of numbers to sort
  Multiplication   Numbers to multiply        # of bits
  Graph            Set of nodes and edges     # of nodes + # of edges
“Running time” of an algorithm: the number of primitive “steps” executed.
– Think of each “step” as a constant number of instructions, each of which takes constant time.

Example: Insertion-Sort (CLR, p. 8)
Insertion-Sort (A)                                            cost   times
 1. for j ← 2 to length[A]                                    c1     n
 2.     do key ← A[j]                                         c2     n − 1
 3.        ▷ Insert A[j] into the sorted sequence A[1..j−1]   0      n − 1
 4.        i ← j − 1                                          c4     n − 1
 5.        while i > 0 and A[i] > key                         c5     Σ_{j=2..n} t_j
 6.            do A[i+1] ← A[i]                               c6     Σ_{j=2..n} (t_j − 1)
 7.               i ← i − 1                                   c7     Σ_{j=2..n} (t_j − 1)
 8.        A[i+1] ← key                                       c8     n − 1
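For readers who prefer running code, here is a minimal Python sketch of the same procedure (0-based indexing instead of CLR's 1-based arrays; the function name is illustrative, not from CLR):

    def insertion_sort(a):
        """Sort the list a in place, mirroring the CLR pseudocode (0-based);
        the list is also returned for convenience."""
        for j in range(1, len(a)):        # pseudocode line 1: j = 2 .. length[A]
            key = a[j]                    # line 2
            i = j - 1                     # line 4
            while i >= 0 and a[i] > key:  # line 5
                a[i + 1] = a[i]           # line 6
                i -= 1                    # line 7
            a[i + 1] = key                # line 8
        return a

    print(insertion_sort([5, 2, 4, 6, 1, 3]))   # -> [1, 2, 3, 4, 5, 6]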
Insertion-Sort

Summing cost × times over all lines, the total running time is

T(n) = c1·n + c2·(n−1) + c4·(n−1) + c5·Σ_{j=2..n} t_j + c6·Σ_{j=2..n} (t_j − 1) + c7·Σ_{j=2..n} (t_j − 1) + c8·(n−1)

where t_j is the number of times the while loop in line 5 is executed for that value of j.

In the best case, when the array is already sorted, every t_j = 1 and the running time is a linear function of n: T(n) = a·n + b for constants a and b.

In the worst case, when the array is in reverse sorted order, every t_j = j and the running time is a quadratic function of n: T(n) = a·n² + b·n + c for constants a, b, and c.
Analysis of Algorithms

We are mostly interested in worst-case performance because:
1. It is an upper bound on the running time for any input.
2. For some algorithms, the worst case occurs fairly often.
3. The average case is often roughly as bad as the worst case.

We are interested in the rate of growth of the runtime as the input size increases. Thus, we only consider the leading term.
– Example: Insertion-Sort has worst-case running time Θ(n²).

We will normally look for algorithms with the lowest order of growth.

But not always: for instance, a less efficient algorithm may be preferable if it's an anytime algorithm.
Analyzing Average-Case or Expected Runtime

Non-experimental method:
1. Assume all inputs of length n are equally likely.
2. Analyze the runtime for each particular input and calculate the average over all the inputs.

Experimental method (a minimal sketch follows below):
1. Select a sample of test cases (large enough!).
2. Measure the runtime for each test case and calculate the average and standard deviation over all test cases.
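A minimal Python sketch of the experimental method; the algorithm under test (Python's built-in sorted), the random-input generator, and the parameter values are illustrative assumptions, not part of the lecture:

    import random
    import statistics
    import time

    def time_algorithm(algorithm, n=10000, trials=30):
        """Run `algorithm` on `trials` random inputs of length n and
        return the sample mean and standard deviation of the runtimes."""
        runtimes = []
        for _ in range(trials):
            data = [random.random() for _ in range(n)]
            start = time.perf_counter()
            algorithm(data)
            runtimes.append(time.perf_counter() - start)
        return statistics.mean(runtimes), statistics.stdev(runtimes)

    mean, stdev = time_algorithm(sorted)
    print(f"mean {mean:.6f}s, stdev {stdev:.6f}s")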
Recursive Algorithms (Divide and Conquer)

General structure of recursive algorithms (a toy instance is sketched below):
1. Divide the problem into smaller subproblems.
2. Solve (conquer) each subproblem recursively.
3. Merge the solutions of the subproblems into a solution for the full problem.

Analyzing a recursive algorithm usually involves solving a recurrence equation.

Example: Merge-Sort (CLR, p. 13)
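To make the three-step pattern concrete, here is a minimal, runnable sketch on a toy problem (finding the maximum of a list); the function name and the choice of problem are illustrative, not from CLR:

    def dc_max(a):
        """Return the maximum of a non-empty list by divide and conquer."""
        if len(a) == 1:            # base case: a single element is its own maximum
            return a[0]
        mid = len(a) // 2
        left = dc_max(a[:mid])     # 1. Divide into halves, 2. conquer each half
        right = dc_max(a[mid:])
        return max(left, right)    # 3. Merge (combine) the two sub-solutions

    print(dc_max([3, 1, 4, 1, 5, 9, 2, 6]))   # -> 9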
Recursive Algorithms

Assume we divide a problem of size n into a subproblems, each 1/b the size of the original problem.

D(n): time to divide a problem of size n
C(n): time to combine the subproblem solutions into a solution of size n

The resulting recurrence equation is:

T(n) = Θ(1)                         if n is small (n ≤ c)
T(n) = a·T(n/b) + D(n) + C(n)       otherwise
Merge (A, p, q, r): Initialization                            times
 1. n1 ← q − p + 1                                            1
 2. n2 ← r − q                                                1
 3. create arrays L[1..n1+1] and R[1..n2+1]                   1
 4. for i ← 1 to n1                                           n1 + 1
 5.     do L[i] ← A[p + i − 1]                                n1
 6. for j ← 1 to n2                                           n2 + 1
 7.     do R[j] ← A[q + j]                                    n2
 8. L[n1 + 1] ← ∞                                             1
 9. R[n2 + 1] ← ∞                                             1
10. i ← 1                                                     1
11. j ← 1                                                     1

Merge (A, p, q, r) (cont.): Actual Merge                      times
12. for k ← p to r                                            n + 1
13.     do if L[i] ≤ R[j]                                     n
14.         then A[k] ← L[i]
15.              i ← i + 1
16.         else A[k] ← R[j]
17.              j ← j + 1
Merge-Sort (A, p, r)
1. if p < r
2.    then q ← ⌊(p + r)/2⌋
3.         Merge-Sort(A, p, q)
4.         Merge-Sort(A, q + 1, r)
5.         Merge(A, p, q, r)
A Note on Implementation

Note that in the pseudocode of the Merge operation, the for-loop in lines 12–17 is always executed n times, no matter how many items are in each of the two subarrays (the ∞ sentinels guarantee this).

This can be avoided in a practical implementation, as sketched below.

Take-home message: don't just blindly follow an algorithm.
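As an illustration, here is a Python sketch (my own, not CLR's in-place pseudocode) of a merge that stops comparing as soon as one subarray is exhausted, so no sentinels are needed, together with a top-down merge sort that uses it:

    def merge(left, right):
        """Merge two already-sorted lists into one sorted list.
        The comparison loop stops as soon as one list is exhausted."""
        result = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i])
                i += 1
            else:
                result.append(right[j])
                j += 1
        result.extend(left[i:])    # at most one of these two copies anything
        result.extend(right[j:])
        return result

    def merge_sort(a):
        """Return a sorted copy of a (divide, conquer, combine)."""
        if len(a) <= 1:            # base case: already sorted
            return a
        mid = len(a) // 2
        return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

    print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # -> [1, 2, 2, 3, 4, 5, 6, 7]

Unlike the CLR pseudocode, this version builds new lists at each level rather than merging in place; the point is only that the comparison loop need not always run n times.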
Merge-Sort

According to the Merge-Sort algorithm, we divide the problem into 2 subproblems of size n/2.

The resulting recurrence equation is:

T(n) = Θ(1)                 if n = 1
T(n) = 2·T(n/2) + Θ(n)      if n > 1

By solving the recurrence equation we get T(n) = Θ(n·lg n), much better than the Θ(n²) we got for Insertion-Sort.

There are several ways of solving this recurrence equation. The simplest one uses a recurrence tree, where each node in the tree indicates the cost of solving the corresponding subproblem; inspecting the tree then gives us the solution.
In this case, we get about lg n levels in the recurrence tree, because we divide the problem in half in each recursive step.
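As a worked sketch of the recurrence-tree argument (writing the Θ(n) term of the recurrence as cn for some constant c, purely for this illustration), every level of the tree contributes the same total cost:

    level 0:      1 subproblem of size n          1 · cn      = cn
    level 1:      2 subproblems of size n/2       2 · c(n/2)  = cn
    level 2:      4 subproblems of size n/4       4 · c(n/4)  = cn
    ...
    level lg n:   n subproblems of size 1         n · Θ(1)
    --------------------------------------------------------------
    total over lg n + 1 levels:  ≈ cn·lg n + cn   = Θ(n·lg n)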
Growth of Functions

We need to define a precise notation and meaning for the order of growth of functions.

We usually don't look at exact running times because they are too complex to calculate and not interesting enough.

Instead, we look at inputs of large enough size that only the order of growth is relevant: the asymptotic behavior.

Generally, algorithms that behave better asymptotically will be the best choice (except possibly for very small inputs).
Θ-notation (“Theta” notation)

Asymptotically tight bound.

Idea: f(n) = Θ(g(n)) if both functions grow “equally” fast.

Formally: Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 }.

Θ(1) means a constant function or a constant.
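A quick worked example for concreteness (the standard one from CLR's discussion of Θ-notation): to verify that n²/2 − 3n = Θ(n²) we need positive constants c1, c2, and n0 with

    c1·n²  ≤  n²/2 − 3n  ≤  c2·n²    for all n ≥ n0.

Dividing through by n² gives c1 ≤ 1/2 − 3/n ≤ c2, which holds, for example, with c1 = 1/14, c2 = 1/2, and n0 = 7.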
O-notation (“Big-O” notation)

Asymptotic upper bound.

Idea: f(n) = O(g(n)) if g(n) grows faster than f(n), possibly much faster.

Formally: O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }.

Note: f(n) = Θ(g(n)) implies f(n) = O(g(n)).

“Big-O” notation often allows us to describe runtime by just inspecting the general structure of the algorithm.
– Example: The double loop in Insertion-Sort leads us to conclude that the runtime of the algorithm is O(n²).

We use “Big-O” analysis for worst-case complexity, since that guarantees an upper bound on any case.
Ω-notation (“Big-Omega” notation)

Asymptotic lower bound.

Idea: f(n) = Ω(g(n)) if g(n) grows slower than f(n), possibly much slower.

Formally: Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 }.

Theorem: For any two functions f(n) and g(n), f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)).
o-notation (“Little-o” notation)

Upper bound that is not asymptotically tight.

Idea: f(n) = o(g(n)) if g(n) grows much faster than f(n).

Formally: o(g(n)) = { f(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0 }.

We will normally use “Big-O” notation, since we won't want to make hard claims about tightness.
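Two standard examples, added for concreteness: 2n = o(n²), since for any constant c > 0 we have 2n < c·n² as soon as n > 2/c; but 2n² ≠ o(n²), since for c = 1 (for instance) there is no n0 beyond which 2n² < n².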
ω-notation (“Little-omega” notation)

Lower bound that is not asymptotically tight.

Idea: f(n) = ω(g(n)) if g(n) grows much slower than f(n).

Formally: ω(g(n)) = { f(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ c·g(n) < f(n) for all n ≥ n0 }.

Note: f(n) = ω(g(n)) if and only if g(n) = o(f(n)).
Comparison of Functions

The Θ, O, Ω, o, and ω relations are transitive.
– Example: f(n) = O(g(n)) and g(n) = O(h(n)) imply f(n) = O(h(n)).

The Θ, O, and Ω relations are reflexive.
– Example: f(n) = Θ(f(n)).

Note the analogy to the comparison of two real numbers:
e.g. f(n) = O(g(n)) is analogous to a ≤ b
e.g. f(n) = Θ(g(n)) is analogous to a = b
Comparison of Growth of Functions

Transitivity (holds for all five relations):
f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n)), and similarly for O, Ω, o, and ω.

Reflexivity:
f(n) = Θ(f(n)), f(n) = O(f(n)), f(n) = Ω(f(n)).

Symmetry:
f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).

Transpose symmetry:
f(n) = O(g(n)) iff g(n) = Ω(f(n))
f(n) = o(g(n)) iff g(n) = ω(f(n))

Lack of trichotomy: some pairs of functions are comparable by neither relation, i.e., neither f(n) = O(g(n)) nor f(n) = Ω(g(n)) holds.