Game Playing Chapter 8
Outline
• Overview
• Minimax search
• Adding alpha-beta cutoffs
• Additional refinements
• Iterative deepening
• Specific games
Overview
Old beliefs:
• Games provided a structured task in which it was easy to measure success or failure.
• Games did not obviously require large amounts of knowledge; they were thought to be solvable by straightforward search.
Overview
Chess: the average branching factor is around 35, and in an average game each player might make 50 moves. A complete search would have to examine 35^100 positions.
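The sheer size of that number can be checked directly; a quick sketch (Python used only for illustration):

```python
import math

# Search-space estimate from the slide: branching factor b = 35,
# game length d = 100 plies (50 moves per player).
b, d = 35, 100
positions = b ** d
print(len(str(positions)))            # 155 digits, i.e. about 10^154
print(f"{d * math.log10(b):.1f}")     # log10(35^100) = 154.4
```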
Overview
• Improve the generate procedure so that only good moves are generated.
• Improve the test procedure so that the best moves will be recognized and explored first.
Overview
• Improve the generate procedure so that only good moves are generated.
  plausible moves vs. legal moves
• Improve the test procedure so that the best moves will be recognized and explored first.
  fewer moves to be evaluated
Overview
• It is not usually possible to search until a goal state is found.
• The program has to evaluate individual board positions by estimating how likely they are to lead to a win: a static evaluation function.
• Credit assignment problem (Minsky, 1963).
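A static evaluation function can be as simple as a material count. A minimal sketch (the piece values and the string board encoding are illustrative assumptions, not from the chapter):

```python
# Hypothetical material-count evaluation: uppercase letters are the
# player's pieces, lowercase the opponent's; kings are ignored.
PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def static_eval(board):
    """Return material balance from the player's point of view."""
    score = 0
    for piece in board:
        value = PIECE_VALUES.get(piece.upper())
        if value is not None:
            score += value if piece.isupper() else -value
    return score

print(static_eval("QRp"))   # 9 + 5 - 1 = 13
```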
Overview
• Good plausible-move generator.
• Good static evaluation function.
Minimax Search
• Depth-first and depth-limited search.
• At the player's choice, maximize the static evaluation of the next position.
• At the opponent's choice, minimize the static evaluation of the next position.
Minimax Search
[Figure: a two-ply search. The maximizing root A (player) takes the value -2. Its children on the minimizing ply (opponent) are B (-6), with leaves E (9), F (-6), G (0); C (-2), with leaves H (0), I (-2); and D (-4), with leaves J (-4), K (-3).]
Minimax Search
Player(Position, Depth):
  for each S ∈ SUCCESSORS(Position) do
    RESULT = Opponent(S, Depth + 1)
    NEW-VALUE = VALUE(RESULT)
    if NEW-VALUE > MAX-SCORE, then
      MAX-SCORE = NEW-VALUE
      BEST-PATH = PATH(RESULT) + S
  return
    VALUE = MAX-SCORE
    PATH = BEST-PATH
Minimax Search
Opponent(Position, Depth):
  for each S ∈ SUCCESSORS(Position) do
    RESULT = Player(S, Depth + 1)
    NEW-VALUE = VALUE(RESULT)
    if NEW-VALUE < MIN-SCORE, then
      MIN-SCORE = NEW-VALUE
      BEST-PATH = PATH(RESULT) + S
  return
    VALUE = MIN-SCORE
    PATH = BEST-PATH
Minimax Search
Any-Player(Position, Depth):
  for each S ∈ SUCCESSORS(Position) do
    RESULT = Any-Player(S, Depth + 1)
    NEW-VALUE = − VALUE(RESULT)
    if NEW-VALUE > BEST-SCORE, then
      BEST-SCORE = NEW-VALUE
      BEST-PATH = PATH(RESULT) + S
  return
    VALUE = BEST-SCORE
    PATH = BEST-PATH
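The Any-Player (negamax) routine above can be sketched in runnable form. The game interface, with successors, static, and deep_enough passed in as functions, is an illustrative assumption, not part of the original pseudocode:

```python
# A minimal negamax sketch: one routine serves both players because each
# level negates the value returned from the level below.
def negamax(position, depth, successors, static, deep_enough):
    """Return (value, path) from the point of view of the player to move."""
    if deep_enough(position, depth):
        return static(position), []
    children = successors(position)
    if not children:                     # no moves: treat as a leaf
        return static(position), []
    best_score, best_path = float('-inf'), []
    for s in children:
        value, path = negamax(s, depth + 1, successors, static, deep_enough)
        new_value = -value               # the opponent's best is our worst
        if new_value > best_score:
            best_score, best_path = new_value, [s] + path
    return best_score, best_path
```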
Minimax Search
MINIMAX(Position, Depth, Player) uses three auxiliary procedures:
• MOVE-GEN(Position, Player): the plausible-move generator.
• STATIC(Position, Player): the static evaluation function.
• DEEP-ENOUGH(Position, Depth): decides when to stop the recursion.
Minimax Search
1. if DEEP-ENOUGH(Position, Depth), then return:
     VALUE = STATIC(Position, Player)
     PATH = nil
2. SUCCESSORS = MOVE-GEN(Position, Player)
3. if SUCCESSORS is empty, then do as in Step 1
Minimax Search
4. if SUCCESSORS is not empty:
     for each SUCC ∈ SUCCESSORS do
       RESULT-SUCC = MINIMAX(SUCC, Depth + 1, Opp(Player))
       NEW-VALUE = − VALUE(RESULT-SUCC)
       if NEW-VALUE > BEST-SCORE, then:
         BEST-SCORE = NEW-VALUE
         BEST-PATH = PATH(RESULT-SUCC) + SUCC
5. Return:
     VALUE = BEST-SCORE
     PATH = BEST-PATH
Adding Alpha-Beta Cutoffs
• Depth-first and depth-limited search, combined with branch-and-bound.
• At the player's choice, maximize the static evaluation of the next position: only values > the α threshold matter.
• At the opponent's choice, minimize the static evaluation of the next position: only values < the β threshold matter.
Adding Alpha-Beta Cutoffs
[Figure: an alpha cutoff on a two-ply tree. The maximizing root A (player) has minimizing children B and C (opponent). B's leaves D (3) and E (5) give B the value 3, so A is guaranteed > 3. Exploring C, its first leaf F evaluates to -5; since C can then be at most -5, which is below the α threshold of 3, the remaining leaf G is cut off.]
Adding Alpha-Beta Cutoffs
[Figure: alpha and beta cutoffs on a four-ply tree with maximizing root A and nodes B–N alternating between minimizing (opponent) and maximizing (player) plies. Leaf values shown include D (3), E (5), J (5), K (0), and M (7); the annotations ">3" and "<5" mark the α and β bounds that trigger cutoffs beneath A and C.]
Adding Alpha-Beta Cutoffs
[Diagram: alternating Player and Opponent levels. The player levels maintain the lower bound α, the opponent levels the upper bound β; whenever α ≥ β at a node, the remaining successors are cut off.]
Player(Position, Depth, α, β):
  for each S ∈ SUCCESSORS(Position) do
    RESULT = Opponent(S, Depth + 1, α, β)
    NEW-VALUE = VALUE(RESULT)
    if NEW-VALUE > α, then
      α = NEW-VALUE
      BEST-PATH = PATH(RESULT) + S
    if α ≥ β then return
      VALUE = α
      PATH = BEST-PATH
  return
    VALUE = α
    PATH = BEST-PATH
Opponent(Position, Depth, α, β):
  for each S ∈ SUCCESSORS(Position) do
    RESULT = Player(S, Depth + 1, α, β)
    NEW-VALUE = VALUE(RESULT)
    if NEW-VALUE < β, then
      β = NEW-VALUE
      BEST-PATH = PATH(RESULT) + S
    if β ≤ α then return
      VALUE = β
      PATH = BEST-PATH
  return
    VALUE = β
    PATH = BEST-PATH
Any-Player(Position, Depth, α, β):
  for each S ∈ SUCCESSORS(Position) do
    RESULT = Any-Player(S, Depth + 1, −β, −α)
    NEW-VALUE = − VALUE(RESULT)
    if NEW-VALUE > α, then
      α = NEW-VALUE
      BEST-PATH = PATH(RESULT) + S
    if α ≥ β then return
      VALUE = α
      PATH = BEST-PATH
  return
    VALUE = α
    PATH = BEST-PATH
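The Any-Player routine with alpha-beta cutoffs can be sketched in runnable form. As before, the game interface (successors, static, deep_enough passed as functions) is an illustrative assumption:

```python
# Negamax with alpha-beta pruning: the window (alpha, beta) is negated
# and swapped at each level, matching the (−β, −α) call in the pseudocode.
def alphabeta(position, depth, alpha, beta, successors, static, deep_enough):
    """Return (value, path) from the point of view of the player to move."""
    if deep_enough(position, depth):
        return static(position), []
    children = successors(position)
    if not children:                     # no moves: treat as a leaf
        return static(position), []
    best_path = []
    for s in children:
        value, path = alphabeta(s, depth + 1, -beta, -alpha,
                                successors, static, deep_enough)
        new_value = -value
        if new_value > alpha:
            alpha, best_path = new_value, [s] + path
        if alpha >= beta:                # cutoff: this line is good enough
            break                        # that the opponent will avoid it
    return alpha, best_path
```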
Adding Alpha-Beta Cutoffs
MINIMAX-A-B(Position, Depth, Player, UseTd, PassTd):
• UseTd: the threshold checked for cutoffs.
• PassTd: the current best value, passed down to the successors.
Adding Alpha-Beta Cutoffs
1. if DEEP-ENOUGH(Position, Depth), then return:
     VALUE = STATIC(Position, Player)
     PATH = nil
2. SUCCESSORS = MOVE-GEN(Position, Player)
3. if SUCCESSORS is empty, then do as in Step 1
Adding Alpha-Beta Cutoffs
4. if SUCCESSORS is not empty:
     for each SUCC ∈ SUCCESSORS do
       RESULT-SUCC = MINIMAX-A-B(SUCC, Depth + 1, Opp(Player), − PassTd, − UseTd)
       NEW-VALUE = − VALUE(RESULT-SUCC)
       if NEW-VALUE > PassTd, then:
         PassTd = NEW-VALUE
         BEST-PATH = PATH(RESULT-SUCC) + SUCC
       if PassTd ≥ UseTd, then return:
         VALUE = PassTd
         PATH = BEST-PATH
5. Return:
     VALUE = PassTd
     PATH = BEST-PATH
Additional Refinements
• Futility cutoffs
• Waiting for quiescence
• Secondary search
• Using book moves
• Not assuming the opponent's optimal move
Iterative Deepening
[Figure: successive iterations of iterative deepening. Iteration 1 searches to depth 1, iteration 2 to depth 2, iteration 3 to depth 3, each re-expanding the shallower levels before going one level deeper.]
Iterative Deepening
• Search can be aborted at any time, and the best move of the previous iteration is chosen.
• Previous iterations can provide invaluable move-ordering constraints.
Iterative Deepening
• Can be adapted for single-agent search.
• Can be used to combine the best aspects of depth-first search and breadth-first search.
Iterative Deepening
Depth-First Iterative Deepening (DFID)
1. Set SEARCH-DEPTH = 1.
2. Conduct a depth-first search to a depth of SEARCH-DEPTH. If a solution path is found, then return it.
3. Increment SEARCH-DEPTH by 1 and go to Step 2.
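The three DFID steps can be sketched as runnable Python; the successors and goal functions are an assumed interface supplied by the caller:

```python
# Depth-limited DFS helper: returns a solution path or None.
def depth_limited(state, limit, successors, goal, path):
    if goal(state):
        return path
    if limit == 0:
        return None
    for s in successors(state):
        found = depth_limited(s, limit - 1, successors, goal, path + [s])
        if found is not None:
            return found
    return None

def dfid(start, successors, goal, max_depth=50):
    """Steps 1-3: try depth limits 1, 2, 3, ... until a solution is found."""
    for depth in range(1, max_depth + 1):
        result = depth_limited(start, depth, successors, goal, [start])
        if result is not None:
            return result
    return None
```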
Iterative Deepening
Iterative-Deepening-A* (IDA*)
1. Set THRESHOLD = the heuristic evaluation of the start state.
2. Conduct a depth-first search, pruning any branch when its total cost exceeds THRESHOLD. If a solution path is found, then return it.
3. Increment THRESHOLD by the minimum amount it was exceeded and go to Step 2.
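The IDA* steps can likewise be sketched in Python, assuming successors yields (child, step_cost) pairs and h is the heuristic (an illustrative interface, not from the chapter):

```python
# IDA*: depth-first search bounded by a threshold on f = g + h, with the
# threshold raised each round by the minimum amount it was exceeded.
def ida_star(start, successors, h, goal):
    def search(state, g, threshold, path):
        f = g + h(state)
        if f > threshold:
            return f, None                  # report how far we overshot
        if goal(state):
            return f, path
        minimum = float('inf')
        for child, cost in successors(state):
            t, found = search(child, g + cost, threshold, path + [child])
            if found is not None:
                return t, found
            minimum = min(minimum, t)
        return minimum, None

    threshold = h(start)                    # Step 1
    while True:
        t, found = search(start, 0, threshold, [start])
        if found is not None:
            return found
        if t == float('inf'):
            return None                     # search space exhausted
        threshold = t                       # Step 3: minimum excess
```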
Iterative Deepening • Is the process wasteful?
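Surprisingly little, for trees with a reasonable branching factor: node counts grow geometrically with depth, so the work repeated in iterations 1 through d-1 is only a small fraction of the final iteration. A quick check:

```python
# With branching factor b, a depth-d search expands about b**d nodes,
# so the total over all iterations 1..d is roughly b/(b-1) times the
# cost of the final iteration alone.
b, d = 35, 6
final = b ** d
total = sum(b ** i for i in range(1, d + 1))
print(total / final)    # close to b/(b-1) = 35/34, about a 3% overhead
```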
Homework
Presentations
– Specific games: Chess – State of the Art
Exercises
– 1-7, 9 (Chapter 12)