OPTIMIZATION SOFTWARE as a Tool for Solving Differential Equations Using NEURAL NETWORKS

Fotiadis, D. I., Karras, D. A., Lagaris, I. E., Likas, A., Papageorgiou, D. G.

DIFFERENTIAL EQUATIONS HANDLED

• ODEs
• Systems of ODEs
• PDEs (Boundary and Initial Value Problems)
• Eigenvalue PDE Problems
• IDEs

ARTIFICIAL NEURAL NETWORKS

• Closed Analytic Form
• Universal Approximators
• Linear and Non-Linear Parameters
• Highly Parallel Systems
• Specialized Hardware for ANN

OPTIMIZATION ENVIRONMENT: MERLIN / MCL 3.0 SOFTWARE

Features include:
• A Host of Optimization Algorithms
• Special Merit for Sums of Squares
• Variable Bounds and Variable Fixing
• Command-Driven User Interface
• Numerical Estimation of Derivatives
• Dynamic Programming of Strategies

ARTIFICIAL NEURAL NETWORKS

[Figure: a feed-forward network with an input layer (x1, x2, bias +1), two hidden layers, and an output layer, connected by the weights u, w, v.]

• Inspired by biological neural networks
• Input-output mapping via the weights u, w, v and the activation functions σ

Analytically this is given by the formula:

$$N(x, u, w^{(1)}, w^{(2)}, v) = \sum_{i=1}^{n_3} v_i\,\sigma\!\left(\sum_{j=1}^{n_2} w^{(2)}_{ij}\,\sigma\!\left(\sum_{k=1}^{n_1} w^{(1)}_{jk}\,x_k + u_j\right)\right)$$
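As an illustration only, not part of the original software, here is a minimal NumPy sketch of the two-hidden-layer network N(x) defined above; the function names and array shapes are assumptions made for this example.

```python
import numpy as np

def sigma(z):
    """Sigmoidal activation, bounded in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def network(x, u, w1, w2, v):
    """Two-hidden-layer perceptron N(x), as in the formula above.

    x : inputs, shape (n1,)     w1: shape (n2, n1), biases u: shape (n2,)
    w2: shape (n3, n2)          v : output weights, shape (n3,)
    """
    h1 = sigma(w1 @ x + u)   # first hidden layer: sigma(sum_k w1[j,k] x_k + u_j)
    h2 = sigma(w2 @ h1)      # second hidden layer
    return v @ h2            # linear output: sum_i v_i h2_i
```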

Activation Functions

Many different functions can be used. Our current choice: the sigmoidal, a smooth, infinitely differentiable function, bounded in (0, 1):

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

[Plot: σ(x) over x ∈ [−10, 10], rising from 0 to 1.]

The Sigmoidal Properties

$$\frac{d\sigma(x)}{dx} = \sigma(x)\,[1 - \sigma(x)]$$

$$\frac{d^2\sigma(x)}{dx^2} = \sigma(x)\,[1 - \sigma(x)]\,[1 - 2\sigma(x)]$$
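These closed-form derivatives are what allow LΨ_M to be evaluated analytically. A direct NumPy transcription of the two identities above (a sketch, not the authors' code):

```python
import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigma(z):
    """sigma'(z) = sigma(z) * (1 - sigma(z))"""
    s = sigma(z)
    return s * (1.0 - s)

def d2sigma(z):
    """sigma''(z) = sigma(z) * (1 - sigma(z)) * (1 - 2*sigma(z))"""
    s = sigma(z)
    return s * (1.0 - s) * (1.0 - 2.0 * s)
```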

FACTS

Kolmogorov, Cybenko, and Hornik proved theorems concerning the approximation capabilities of ANNs. In fact, it has been shown that ANNs are UNIVERSAL APPROXIMATORS.

DESCRIPTION OF THE METHOD

Solve the equation

$$L\Psi(x) = f(x)$$

subject to Dirichlet B.C., where L is an integrodifferential operator, linear or non-linear.

Model:

$$\Psi_M(x) = B(x) + Z(x)\,N(x)$$

where:
• B(x) satisfies the BC
• Z(x) vanishes on the boundary
• N(x) is an Artificial Neural Net

MODEL PROPERTIES

The model

$$\Psi_M(x) = B(x) + Z(x)\,N(x)$$

satisfies the B.C. by construction. Thanks to the network, the model is "trainable": the network parameters can be adjusted so that

$$L\Psi_M(x) - f(x) = 0, \quad \forall\, x \in [0,1]^N$$

Pick a set of representative points x_1, x_2, ..., x_n in the unit hypercube. The residual "Error" is

$$E = \sum_{i=1}^{n} \left[ L\Psi_M(x_i) - f(x_i) \right]^2$$
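In code, the training objective is simply a sum of squared residuals over the collocation points. A minimal sketch, assuming hypothetical callables L_psi (which evaluates LΨ_M at a point for the current parameters) and f:

```python
def residual_error(params, points, L_psi, f):
    """Sum-of-squares residual E = sum_i [L Psi_M(x_i) - f(x_i)]^2."""
    return sum((L_psi(params, x) - f(x)) ** 2 for x in points)
```

Any local minimizer (e.g. the BFGS method discussed later) can then drive this error toward zero.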

ILLUSTRATION

Simple 1-d example:

$$\frac{d^2\Psi(x)}{dx^2} = f\!\left(x, \Psi, \frac{d\Psi}{dx}\right), \quad \Psi(0) = \Psi_0, \ \Psi(1) = \Psi_1$$

Model:

$$\Psi_M(x) = \Psi_0 (1 - x) + \Psi_1 x + x(1 - x)\,N(x)$$
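A self-contained sketch of this 1-d setup, assuming a small single-hidden-layer net and using central finite differences for the derivatives (the method itself differentiates the network analytically; the example right-hand side and all names below are assumptions, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize

NODES = 5

def net(p, x):
    """Single-hidden-layer sigmoidal net; p packs (v, w, u)."""
    v, w, u = p[:NODES], p[NODES:2*NODES], p[2*NODES:]
    return np.sum(v / (1.0 + np.exp(-(w * x + u))))

def psi_m(p, x, psi0=0.0, psi1=0.0):
    """Trial solution: satisfies Psi(0)=psi0 and Psi(1)=psi1 by construction."""
    return psi0 * (1 - x) + psi1 * x + x * (1 - x) * net(p, x)

def error(p, xs, f, h=1e-4):
    """Collocation residual for Psi'' = f(x, Psi, Psi')."""
    e = 0.0
    for x in xs:
        d2 = (psi_m(p, x + h) - 2 * psi_m(p, x) + psi_m(p, x - h)) / h**2
        d1 = (psi_m(p, x + h) - psi_m(p, x - h)) / (2 * h)
        e += (d2 - f(x, psi_m(p, x), d1)) ** 2
    return e

# Example: Psi'' = -pi^2 sin(pi x), Psi(0) = Psi(1) = 0  =>  Psi = sin(pi x)
f = lambda x, psi, dpsi: -np.pi**2 * np.sin(np.pi * x)
xs = np.linspace(0.05, 0.95, 10)
res = minimize(error, 0.1 * np.random.randn(3 * NODES), args=(xs, f), method='BFGS')
```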

ILLUSTRATION

For a second-order, two-dimensional PDE:

$$\Psi_M(x, y) = B(x, y) + x(1 - x)\,y(1 - y)\,N(x, y)$$

where

$$B(x, y) = (1 - x)\Psi(0, y) + x\Psi(1, y) + (1 - y)\left\{\Psi(x, 0) - \left[(1 - x)\Psi(0, 0) + x\Psi(1, 0)\right]\right\} + y\left\{\Psi(x, 1) - \left[(1 - x)\Psi(0, 1) + x\Psi(1, 1)\right]\right\}$$

EXAMPLES

Problem: solve the 2-d PDE

$$\nabla^2 \Psi(x, y) = e^{-x}\left(x - 2 + y^3 + 6y\right)$$

in the domain x, y ∈ [0, 1], subject to the BC:

$$\Psi(0, y) = y^3, \quad \Psi(1, y) = (1 + y^3)\,e^{-1}, \quad \Psi(x, 0) = x e^{-x}, \quad \Psi(x, 1) = (1 + x)\,e^{-x}$$

A single-hidden-layer perceptron was used:

$$\Psi_M(x, y) = B(x, y) + x(1 - x)\,y(1 - y)\,N(x, y)$$

$$B(x, y) = (1 - x)y^3 + x(1 + y^3)e^{-1} + (1 - y)\,x\left(e^{-x} - e^{-1}\right) + y\left[(1 + x)e^{-x} - \left(1 - x - 2x e^{-1}\right)\right]$$
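As a quick sanity check (my own sketch, not from the slides), B(x, y) can be coded directly and verified to reproduce the BC for any network term, since the factor x(1−x)y(1−y) vanishes on the boundary:

```python
import numpy as np

def B(x, y):
    """Boundary-matching term for the example PDE, as given above."""
    return ((1 - x) * y**3
            + x * (1 + y**3) / np.e
            + (1 - y) * x * (np.exp(-x) - 1 / np.e)
            + y * ((1 + x) * np.exp(-x) - (1 - x - 2 * x / np.e)))

def psi_m(x, y, N=lambda x, y: 0.0):
    """Trial solution; the prefactor kills N(x, y) on the boundary."""
    return B(x, y) + x * (1 - x) * y * (1 - y) * N(x, y)

# Check the edge y = 0: Psi_M(x, 0) must equal x e^{-x} for any N
x = np.linspace(0.0, 1.0, 11)
assert np.allclose(psi_m(x, 0.0), x * np.exp(-x))
```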

GRAPHICAL REPRESENTATION

The analytic (exact) solution is:

$$\Psi(x, y) = e^{-x}\left(x + y^3\right)$$

GRAPHS & COMPARISON

[Plot: neural solution accuracy, Ψ(x, y) − Ψ_M(x, y), at the training points.]

[Plot: neural solution accuracy, Ψ(x, y) − Ψ_M(x, y), at the test points.]

[Plot: finite element solution accuracy, Ψ(x, y) − Ψ_FE(x, y), at the training points.]

[Plot: finite element solution accuracy, Ψ(x, y) − Ψ_FE(x, y), at the test points.]

PERFORMANCE

• Highly accurate solution (even with few training points)
• Uniform "error" distribution
• Superior interpolation properties

The model solution is very flexible and can easily be enhanced to offer even higher accuracy.

EIGENVALUE PROBLEMS

Problem:

$$L\Psi(x) = \lambda\,\Psi(x)$$

with appropriate Dirichlet BC. The model is the same as before; however, the "Error" is defined as:

$$E = \frac{\displaystyle\sum_{i=1}^{n} \left[ L\Psi_M(x_i) - \lambda\,\Psi_M(x_i) \right]^2}{\displaystyle\sum_{i=1}^{n} \left[ \Psi_M(x_i) \right]^2}$$

EIGENVALUE PROBLEMS

where

$$\lambda = \frac{\displaystyle\sum_{i=1}^{n} \Psi_M(x_i)\, L\Psi_M(x_i)}{\displaystyle\sum_{i=1}^{n} \left[ \Psi_M(x_i) \right]^2}$$

i.e. the value for which the "Error" is minimum. Problems of this kind are often encountered in Quantum Mechanics (Schrödinger's equation).
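In code, this is a generalized Rayleigh-quotient estimate computed from model values at the training points (a sketch with assumed array inputs):

```python
import numpy as np

def lambda_estimate(psi_vals, L_psi_vals):
    """lambda = sum_i Psi_M(x_i) * L Psi_M(x_i) / sum_i Psi_M(x_i)^2"""
    return np.dot(psi_vals, L_psi_vals) / np.dot(psi_vals, psi_vals)
```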

EXAMPLES

The non-local Schrödinger equation

$$-\frac{\hbar^2}{2\mu}\frac{d^2\psi(r)}{dr^2} + V(r)\,\psi(r) + \int_0^{\infty} K_0(r, r')\,\psi(r')\,dr' = \varepsilon\,\psi(r)$$

$$\psi(0) = 0, \quad \psi(r) \sim e^{-kr}, \ k > 0$$

describes the bound "n+α" system in the framework of the Resonating Group Method.

Model:

$$\psi_M(r) = r\,e^{-br}\,N(r), \quad b > 0$$

where

$$N(r) = \sum_{j=1}^{\text{nodes}} v_j\,\sigma(w_j r + u_j)$$

is a single-hidden-layer, sigmoidal perceptron.

OBTAINING EIGENVALUES

Example: the Hénon-Heiles potential

$$-\frac{1}{2}\left(\frac{\partial^2 \Psi}{\partial x^2} + \frac{\partial^2 \Psi}{\partial y^2}\right) + \left[\frac{1}{2}\left(x^2 + y^2\right) + \frac{1}{4\sqrt{5}}\left(x y^2 - \frac{1}{3}x^3\right)\right]\Psi = \varepsilon\,\Psi$$

Asymptotic behavior:

$$\Psi(x, y) \sim e^{-k(x^2 + y^2)}$$

Model used:

$$\Psi_M(x, y) = e^{-b(x^2 + y^2)}\,N(x, y)$$

Use the above model to obtain an eigensolution Φ. Obtain a different eigensolution by deflation, i.e.:

$$\tilde{\Psi}_M(x, y) = \Psi_M(x, y) - \Phi(x, y) \iint \Phi(x', y')\,\Psi_M(x', y')\,dx'\,dy'$$

This model is orthogonal to Φ(x, y) by construction. The procedure can be applied repeatedly.
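A sketch of the deflation step, assuming Φ is normalized and the double integral is approximated by a quadrature rule with nodes (xs, ys) and weights ws (all names here are hypothetical):

```python
import numpy as np

def deflate(psi_m, phi, xs, ys, ws):
    """Return a model orthogonal to phi: subtract phi times <phi, psi_m>."""
    overlap = np.sum(ws * phi(xs, ys) * psi_m(xs, ys))  # quadrature for the integral
    return lambda x, y: psi_m(x, y) - phi(x, y) * overlap
```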

ARBITRARILY SHAPED DOMAINS

For domains other than hypercubes, the BC cannot be embedded in the model.

Let R_i, i = 1, 2, ..., m be the set of points defining the arbitrarily shaped boundary. The BC are then: Ψ(R_i) = b_i, ∀ i = 1, 2, ..., m. Let r_i, i = 1, 2, ..., n be the set of training points inside the domain.

We describe two ways to proceed in solving the LΨ(x) = f(x) problem.

OPTIMIZATION WITH CONSTRAINTS

Model:

$$\Psi_M(x) = N(x)$$

"Error" to be minimized (domain terms + boundary terms):

$$E = \sum_{i=1}^{n} \left[ L\Psi_M(r_i) - f(r_i) \right]^2 + \beta \sum_{i=1}^{m} \left[ \Psi_M(R_i) - b_i \right]^2$$

with β a penalty parameter that controls the degree of satisfaction of the BC.
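A sketch of the penalty objective, with hypothetical callables psi_m and L_psi and caller-supplied point sets:

```python
def penalty_error(params, psi_m, L_psi, f, interior, boundary, b, beta=100.0):
    """Domain residual plus beta-weighted boundary mismatch (penalty method)."""
    domain = sum((L_psi(params, r) - f(r)) ** 2 for r in interior)
    bc = sum((psi_m(params, R) - bi) ** 2 for R, bi in zip(boundary, b))
    return domain + beta * bc
```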

PERCEPTRON-RBF SYNERGY

Model:

$$\Psi_M(x) = N(x) + \sum_{i=1}^{m} a_i\, e^{-\lambda \|x - R_i\|^2}$$

where the a's are determined so that the model satisfies the BC exactly, i.e.:

$$\sum_{k=1}^{m} a_k\, e^{-\lambda \|R_i - R_k\|^2} = b_i - N(R_i)$$

The free parameter λ is chosen once, initially, so that the system above is easily solved.

"Error":

$$E = \sum_{i=1}^{n} \left[ L\Psi_M(r_i) - f(r_i) \right]^2$$
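The a_k follow from an m×m linear system; a NumPy sketch (the array shapes are my assumptions):

```python
import numpy as np

def rbf_coefficients(R, b, N_at_R, lam):
    """Solve sum_k a_k exp(-lam ||R_i - R_k||^2) = b_i - N(R_i) for the a_k.

    R: (m, d) boundary points; b: (m,) BC values; N_at_R: (m,) net values at R.
    """
    d2 = np.sum((R[:, None, :] - R[None, :, :]) ** 2, axis=-1)  # pairwise squared distances
    return np.linalg.solve(np.exp(-lam * d2), b - N_at_R)
```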

Pros & Cons

The RBF synergy is:
• Computationally costly: a linear system is solved each time the model is evaluated.
• Exact in satisfying the BC.

The penalty method is:
• Approximate in satisfying the BC.
• Computationally efficient.

IN PRACTICE . . .

• Initially proceed via the penalty method, until an approximate solution is found.
• Refine the solution using the RBF-synergy method, to satisfy the BC exactly.

Conclusions: experiments on several model problems show performance similar to that reported earlier.

GENERAL OBSERVATIONS

Enhanced generalization performance is achieved when the exponential weights of the neural networks are kept small; hence box-constrained optimization methods should be applied.

Bigger networks (a greater number of nodes) can achieve higher accuracy. This favors the use of:
• Existing Specialized Hardware
• Sophisticated Optimization Software

MERLIN 3.0

What is it? A software package offering many optimization algorithms and a friendly user interface.

What problems does it solve? It finds a local minimum of the function:

$$f(x), \quad x \in R^N, \quad x = (x_1, x_2, ..., x_N)$$

under the conditions:

$$x_i \in [l_i, u_i], \quad \forall\, i = 1, 2, ..., N$$
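Merlin itself is a Fortran package, but the task it performs maps directly onto modern box-constrained minimizers; as an illustrative stand-in (not Merlin's API), the same problem in SciPy:

```python
from scipy.optimize import minimize

# Stand-in objective: the Rosenbrock function, minimized inside a box
f = lambda x: (1 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2
res = minimize(f, x0=[0.5, 0.5], method='L-BFGS-B',
               bounds=[(0.0, 2.0), (0.0, 2.0)])
print(res.x)  # local minimizer subject to x_i in [l_i, u_i]
```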

ALGORITHMS

Direct Methods:
• SIMPLEX
• ROLL

Gradient Methods:
• Conjugate Gradient: Polak-Ribiere, Fletcher-Reeves, Generalized P&R
• Quasi-Newton: BFGS (3 versions), DFP
• Levenberg-Marquardt: for sums of squares

THE USER'S PART

What does the user have to do?
• Program the objective function
• Use Merlin to find an optimum

What may the user want to do?
• Program the gradient
• Program the Hessian
• Program the Jacobian

MERLIN FEATURES & TOOLS

• Intuitive free-format I/O
• Menu-assisted input
• On-line HELP
• Several gradient modes
• Confidence parameter intervals
• Box constraints
• PostScript graphs
• Programmability
• "Open" to user enhancements

MCL: Merlin Control Language

What is it? A high-level programming language that drives Merlin intelligently.

What are the benefits?
• Abolishes user intervention
• Optimization strategies
• Handy utilities
• Global-optimum-seeking methods

MCL REPERTOIRE

MCL command types:
• Merlin commands
• Conditionals (IF-THEN-ELSE-ENDIF)
• Loops (DO type of loops)
• Branching (GO TO type)
• I/O (READ/WRITE)

MCL intrinsic variables: all important Merlin variables, e.g. parameters, value, gradient, bounds, ...

SAMPLE MCL PROGRAM

program
var i; sml; bfgs_calls; nfix; max_calls
sml = 1.e-4            % Gradient threshold.
bfgs_calls = 1000      % Number of BFGS calls.
max_calls = 10000      % Max. calls to spend.

again:
loosall
nfix = 0
loop i from 1 to dim
   if abs[grad[i]] <= sml then
      fix (x.i)
      nfix = nfix + 1
   end if
end loop
if nfix == dim then
   display 'Gradient below threshold...'
   loosall
   finish
end if
bfgs (noc=bfgs_calls)
when pcount < max_calls just move to again
display 'We probably failed...'
end

MERLIN-MCL Availability

The Merlin-MCL package is written in ANSI Fortran 77 and can be downloaded from the following URL:

http://nrt.cs.uoi.gr/merlin/

It is maintained, supported, and FREELY available to the scientific community.

FUTURE DEVELOPMENTS

• Optimal Training Point Sets
• Optimal Network Architecture
• Expansion & Pruning Techniques
• Hardware Implementation on NEUROPROCESSORS
