Module 10 Evolutionary Algorithms
Optimization Tasks Adaptive Walk in a “fitness landscape” Start Xn
End
Fitness
Global optimum
Local optimum
X2 X1
X1
X2
Fitness
Types of fitness landscapes
The Dream
The Nightmare
Local Neighborhood search
Start
B q
A
C
Neighborhood N x
Local optimum
• fixed N • variable (“adaptive”) N
Local Neighborbood Strategies Local (neighborhood) search
Any-ascent/Stochastic hill-climbing
Simulated annealing
Steepest/First-ascent hill-climbing
Threshold-accepting
Population-based Population-based
Genetic Genetic algorithms algorithms
Tabu search
Evolution Evolution strategies strategies
Evolutionary Evolutionary programming programming
Elements of artificial adaptive systems (Koza) •
Structures that undergo adaptation
•
Initial structures (starting solutions)
•
Fitness measure that evaluates the structures
•
Operations to modify the structures
•
State (memory) of the system at each stage
•
Method for designating a result
•
Method for terminating the process
•
Parameters that control the proces
Learning from Nature ...
• Genetic algorithms • Evolution strategies • Particle Swarm Optimization (PSO) • Ant Colony Optimization (ACO)
Which search strategy should be applied? Global & local search techniques guided by • Random parameter variation • Stepwise systematic parameter variation • Gradients in the fitness landscape ... There is no one best method. Evolutionary algorithms are • easily implemented • robust • able to cope with high-dimensional optimization tasks
Searching in real and artificial fitness landscapes
Quality (experimental)
“Fitness” (calculated)
Idealized path
Neural network model of peptide antibody recognition
n xi2 x f ( x) = ∑ − ∏ cos i + 1 i =1 4000 i =1 i
n
n
(
f ( x) = 10 ⋅ n + ∑ xi2 − 10 ⋅ cos(2 ⋅ π ⋅ xi )
)
i =1
a) Griewangk Funktion
b) Rastrigin Funktion
c) De Jong‘s Sphere Funktion
n
f ( x) = ∑ xi2 i =1
globales Minimum
globales Minimum
d) Schaffer F6 Funktion
e) Rosenbrock Funktion
globales Minimum
f ( x) = 0.5 +
sin 2 x 2 + y 2 − 0.5
(1 + 0.001⋅ ( x
2
+ y2 )
n −1
)
2
(
f ( x) = ∑100 ⋅ xi +1 − x i =1
2
) + (1 − x )
2 2 i
i
Operators of Evolutionary Optimization
Variation
• Mutation
Adds new information to a population
• Crossover
Exploits information within a population
• Selection
Genetic Algorithms: Chromosome Representation
O
S
O NH H N
Descriptors
Real Binary Gray
clogP cMW O+N H-donors Surfpolar [Å2]
5.19 404 5 2 59.9
O
100011100001010 110010110001011 genes chromosome
100 011 100 001 010
110 010 110 001 011
Principle of Evolutionary Searching
Step 1: Random search
Step 2 and following: “Local hill climbing”
Optimum
Best value
Generate a diverse set of parameter values and hope for a hit
Generate localized distributions of parameter values and improve steadily Adaptive neighborhood
The (µ,λ λ) Evolution Strategy
(Rechenberg, 1973)
Algorithm of the (1,λ λ) Evolution Strategy Stepsize Variables
Quality (“Fitness”)
Initialize parent (ξ ξP,σ σP,QP); For each generation: σV,QV) around the parent: Generate λ variants (ξ ξV,σ σ V = σP
•
G;
ξV = ξP + σV
•
G;
calculate fitness QV; Select best variant by QV ; Set (ξ ξP,σ σP,QP) ≡ (ξ ξV,σ σV,QV)best ;
Box-Muller:
G(i , j ) = − 2 ln(i ) ⋅ sin(2π j ) i,j: pseudo-random numbers in ]0,1]
Evolutionary Optimization: Protein backbone folding
Variables: Coordinates of “Beads on a string” Fitness function: Empirical potential „Folding“ of the ribonucleasese A backbone
X-ray structure (1rtb)
Multiobjective (MO) Optimization (MO) One possibility: weighted Fitness Function
f(property) = w1 * property1 + w2 * property2 + … Disadvantages: • The setting of the weights is non-intuitive • Regions of the search space may be excluded due to the chosen weight setting • Objectives may be correlated • Only a single solution is found
MOGA – Multiobjective Genetic Algorithm • Most optimization algorithms can only cope with a single objective • Evolutionary Algorithms use a population of solutions MOGA (Fonseca and Fleming 1993): • Maps out the hypersurface in the search space where all solutions are tradeoffs between the different objectives • Searches for a set of non-dominated solutions • The ranking of solutions is based on dominance instead of a fitness function (Pareto ranking)
Multiobjective Optimization (MO)
• Many practical optimization applications are characterized by more than one objective • Optimal performance in one objective often correlates with low performance in another one • Strong need for a compromise • Search for solutions that offer acceptable performance in all objectives
MOGA – Multiobjective Genetic Algorithm Potential solutions in a two-objective (f1 and f2) minimization problem:
f2
0 2 4 0
number of times the solution is dominated
0 1 0
dominated solution 0
non-dominated solution (Pareto solution)
f1
Particle Swarm Optimization (PSO) • Biological motivated optimization paradigm • Individuals move through a fitness landscape • Particles can change their movement direction and velocities • “Social” interactions between individuals of a population • Each particle knows about its own best position and all particles know about the overall best position
Particle Swarm Optimization (PSO) – cont.
T. Krink, University of Aarhus
PSO in a virtual fitness landscape y = f(x1,x2)
Trajectory of a particle
min. 4
2
Fitness
0
-2
-4
-4
y = f ( x) = x 2 − 10 ⋅ cos(2 ⋅ π ⋅ x) Rastrigin function
-2
0
2
4
PSO Algorithmus
begin Positionen und Geschwindigkeiten der Partikel initialisieren Fitness evaluieren Gedächtnisse initialisieren while (Abbruchbedingung ≠ true) neue Geschwindigkeiten berechnen Positionen neu berechnen Fitness evaluieren Gedächtnisse auffrischen end end
PSO: Update Regel 1 (Standard Typ) Individuelles Gedächtnis
Soziales Gedächtnis
v i (t + 1) = v i (t ) + 2 ⋅ r1 ⋅ ( p i − xi (t )) + 2 ⋅ r2 ⋅ ( p b − x i (t )) v: Geschwindigkeitsvektor des Partikels x: Ortsvektor des Partikels i: Dimension t: Epochenzahl r1 und r2: Zufallszahlen zwischen 0 und 1 pi und pb: die besten Postionen aus dem individuellen und sozialen Gedächtnis
PSO: Update Regel 2 (Inertia Weight Typ) Inertia Weight
Individuelles Gedächtnis
Soziales Gedächtnis
v i (t + 1) = w ⋅ v i (t ) + n1 ⋅ r1 ⋅ ( p i − x i (t )) + n 2 ⋅ r2 ⋅ ( p b − x i (t )) v: Geschwindigkeitsvektor des Partikels x: Ortsvektor des Partikels w − wend w = wstart − start ⋅ Epochs w: Trägheitsgewicht (inertia weight) MaxEpochs i: Dimension t: Epochenzahl n1 und n2: individuelle und soziale Konstante r1 und r2: Zufallszahlen zwischen 0 und 1 pi und pb: die besten Postionen aus dem individuellen und sozialen Gedächtnis
PSO: Individuelles und soziales Gedächtnis Konvergenzverhalten von PSO bei der Minimierung der Griewangk-Funktion nach 100 Epochen für unterschiedliche Werte für n1 und n2
globales Minimum
n1 = 1 n2 = 2 Höhere Gewichtung des sozialen Gedächtnisses
n1 = 2 n2 = 1
PSO: Update Regel 3 (Constriction Typ) Constriction Constant
Individuelles Gedächtnis
Soziales Gedächtnis
v i (t + 1) = K ⋅ ( v i (t ) + n1 ⋅ r2 ⋅ ( pi − xi (t )) + n2 ⋅ r2 ⋅ ( pb − xi (t ))) v: Geschwindigkeitsvektor des Partikels 2 K= x: Ortsvektor des Partikels | 2 − ϕ − ϕ 2 − 4ϕ | K: Constriction Constant i: Dimension ϕ = n1 + n 2 , ϕ > 4 t: Epochenzahl n1 und n2: individuelle und soziale Konstante r1 und r2: Zufallszahlen zwischen 0 und 1 pi und pb: die besten Postionen aus dem individuellen und sozialen Gedächtnis
Der Constriction Factor reguliert in zweierlei Hinsicht das Schwarmverhalten (Kennedy & Eberhart, 2001): • Er führt nach einer bestimmten Zeit zu einer Konvergenz des Schwarms. • Wird ein neues lokales Optimum gefunden, wenn der Schwarm bereits fortgeschritten konvergiert ist, so hat er trotzdem die Fähigkeit, wieder größere Bewegungen auszuführen und sich somit anzupassen.
globales Minimum
globales Minimum Rosenbrock Funktion
Constriction-Typ Schwarm
Standard-Schwarm
(jeweils nach 100 Epochen)
Visualisierung des Pfades der besten gefundene Positionen
• Der Partikelschwarm verschiedene lokale Minima besucht, bevor das globale Minimum gefunden wurde links: Seitenansicht der Funktion rechts: Unterseite der Funktion
Das war‘s für‘s Erste! … doch es geht weiter.