A DATA DRIVEN APPROACH FOR GENERATING REDUCED-ORDER STOCHASTIC MODELS OF RANDOM HETEROGENEOUS MEDIA
Nicholas Zabaras and Baskar Ganapathysubramanian
Materials Process Design and Control Laboratory
Sibley School of Mechanical and Aerospace Engineering
Cornell University, Ithaca, NY 14853-3801
[email protected] http://mpdc.mae.cornell.edu/
CORNELL UNIVERSITY
Materials Process Design and Control Laboratory

MOTIVATION
PROVIDE LOW-DIMENSIONALITY REPRESENTATION OF THE MICROSTRUCTURE, PROPERTY AND PROCESS SPACES
Applications:
(i) Identify microstructures that have extremal properties.
(ii) Identify processing sequences that lead to desired microstructures and properties.
[Figure: microstructure representations with three panels: process-structure space, property-structure space, property-process space. Axes: Taylor factor along RD, Taylor factor along TD. Process paths: 1. Drawing, 2. Rolling, 3. Rolling followed by drawing.]

STRATEGY
Given limited experimental information (microstructural features):
- Represent several plausible microstructures
- Encode this information into a low-dimensional parameterization of all such possible microstructures
WHY?
- Can incorporate effects of limited information on macro behavior
- Low-dimensional embedding significantly aids searching and contouring of the high-dimensional microstructural space

OUTLINE
- Linear reduced-order modeling framework for microstructures
- Non-linear reduced-order modeling framework for microstructures
- Applications of classification and reduced models of structure-property-process maps for tailored materials

LINEAR REDUCED-ORDER MODELING FRAMEWORK

DEVELOPING LINEAR TOPOLOGICAL MODELS
- Data-driven techniques for encoding the variability in properties into a viable, finite-dimensional stochastic model.
- Advances in using Bayesian modeling and random domain decomposition.
- Aim is to create a seamless technique that utilizes the tools of the mature field of property/microstructure reconstruction.
- First investigations into constructing data-driven reduced-order representations of topological/material/property distributions utilized a Principal Component Analysis (PCA/POD/KLE) based approach.
- Generate 3D samples from the microstructure space and apply PCA to them.
[Figure: a microstructure image written as a1 x (eigenimage 1) + a2 x (eigenimage 2) + .. + an x (eigenimage n)]
- Convert variability of property/microstructure to variability of coefficients.
- Not all combinations are allowed. Developed a subspace-reducing methodology [1] to find the space of allowable coefficients that reconstruct plausible microstructures.
1. B. Ganapathysubramanian, N. Zabaras, Modelling diffusion in random heterogeneous media: Data-driven models, stochastic collocation and the variational multi-scale method, J Comp Physics 226 (2007) 326-353.

OVERVIEW OF METHODOLOGY
1. Property extraction: Extract properties P1, P2, .., Pn that the structure satisfies. These properties are usually statistical: volume fraction, 2-point correlation, auto correlation.
2. Microstructure reconstruction: Reconstruct realizations of the structure satisfying the properties. Monte Carlo, Gaussian random fields, stochastic optimization, etc.
3. Reduced model: Construct a reduced-order stochastic model from the data. This model must be able to approximate the class of structures. KL expansions, FFT and other transforms, autoregressive models, ARMA models.
4. Applications: Extract structure-property-process relations. Link with microstructure classification and statistical learning algorithms.

DATA TO CONSTRUCT INPUT MODELS
- Only have characterization of property variation in a finite number of regions or finite realizations.
- Consider the property variation and/or microstructure to be a stochastic process.
- Identify this stochastic process using the experimental information available.
- Process data for statistical invariance of the structure: volume fraction, 2-point correlation, 3-point correlations, ...
- All realizations of the stochastic process satisfy the experimental statistical relations.
- These microstructures belong to a very large (possibly infinite-dimensional) space.
- Convert this representation into a computationally useful form: a finite-dimensional representation.
[Figure: 2D microstructure characterization and tomographic characterization]

FINITE DIMENSIONAL REPRESENTATION
- The data extraction/reconstruction procedure gives a set of 3D microstructures. These are samples from the microstructural space.
- Need a qualitative, functional representation of the topological variation.
- Must be finite dimensional for this description to be useful.
- The necessity of model reduction arises.
$I = I_{avg} + I_1 a_1 + I_2 a_2 + I_3 a_3 + \dots + I_n a_n$
- Represent any microstructure as a linear combination of the microstructures or some eigenimages.
- Move randomness from image to coefficients.
[Figure: a microstructure image written as a1 x (eigenimage 1) + a2 x (eigenimage 2) + .. + an x (eigenimage n)]

REDUCED MODEL OF TOPOLOGICAL VARIATION
- Construct descriptor from sample images. Use POD.
- Microstructure images (n x n x n pixels) are represented as vectors $I_i$, $i = 1, .., M$.
- The eigenvectors of the covariance matrix are computed.
- The first N eigenimages are chosen to represent the microstructures.
- Represent any microstructure as a linear combination of the microstructures or some eigenimages.
- Move randomness from image to coefficients.
[Figure: a microstructure image written as a1 x (eigenimage 1) + a2 x (eigenimage 2) + .. + an x (eigenimage n)]

PROPER ORTHOGONAL DECOMPOSITION
Suppose we had a collection of data (from experiments or simulations) for some variable/process/parameter:
$S = \{A_i(x)\}_{i=1}^{N}$ or $S = \{A_i(x,t)\}_{i=1}^{N}$
Is it possible to identify a basis such that this data can be represented in the smallest possible space? That is, find
$\{\phi_i(x)\}_{i=1}^{M}, \quad M \le N$
such that it is optimal for the data to be represented as
$A(x) = \sum_{i=1}^{M} a_i \phi_i(x), \qquad A(x,t) = \sum_{i=1}^{M} a_i(t)\, \phi_i(x)$
- Proper Orthogonal Decomposition (POD), Principal Component Analysis (PCA), Karhunen-Loeve Decomposition (KLE): Sirovich, Lumley, Ravindran, Ito.
- PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance by any projection of the data comes to lie on the first coordinate (called the first principal component).
- PCA is theoretically the optimal linear transform for given data in the least-squares sense.

PROPER ORTHOGONAL DECOMPOSITION
- Data is usually collated in terms of a matrix $X^T$. X is shifted to a mean-zero value.
- The covariance matrix of this data is computed:
$C = \frac{1}{N} X X^T$
- Compute the eigenvalues and eigenvectors of the covariance matrix:
$C = V \Sigma V^T$
- The reduced description is given by
$Y^T = V \Sigma$
- This requires the computation of the covariance matrix of the data and its subsequent eigendecomposition. It can become computationally demanding as N increases.
- A computationally simpler POD technique is the method of snapshots.
Method of snapshots
- Solve the optimization problem [equation]
- Eigenvalue problem [equation], where
$C_{ij} = \frac{1}{N} (X^T X)_{ij}$

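The method of snapshots can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pod_snapshots`, the row-per-sample data layout and the synthetic test data are all assumptions made for the example.

```python
import numpy as np

def pod_snapshots(images, n_modes):
    """POD/PCA of M flattened images via the method of snapshots.

    images: array of shape (M, P), one flattened image per row (assumed layout).
    Returns the mean image, n_modes orthonormal eigenimages (one per row),
    and the coefficients a_i of every sample.
    """
    mean = images.mean(axis=0)
    X = images - mean                      # shift the data to zero mean
    M = X.shape[0]
    C = (X @ X.T) / M                      # small M x M snapshot matrix
    evals, evecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_modes]
    evals, evecs = evals[order], evecs[:, order]
    # Lift the snapshot eigenvectors to pixel space and normalize them.
    modes = (X.T @ evecs) / np.sqrt(M * evals)
    coeffs = X @ modes                     # coefficients a_i for every sample
    return mean, modes.T, coeffs

rng = np.random.default_rng(0)
# Hypothetical data: 20 "images" of 8x8x8 pixels lying in a 3D affine space
basis = rng.normal(size=(3, 512))
samples = 0.5 + rng.normal(size=(20, 3)) @ basis
mean, eigenimages, a = pod_snapshots(samples, n_modes=3)
reconstructed = mean + a @ eigenimages     # I = I_avg + sum_i a_i I_i
```

The eigenproblem is solved on the M x M snapshot matrix rather than on the pixel-by-pixel covariance matrix, which is the saving the method of snapshots offers when there are far fewer samples than pixels.
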
REDUCED MODEL: CONSTRAINTS
- Let I be an arbitrary microstructure satisfying the experimental statistical correlations.
- The PCA method provides a unique representation of the image. That is, PCA provides a function that maps each image to its n-tuple of coefficients (a1, a2, .., an).
- This function is injective but not surjective: every image has a unique mapping, but every point in the coefficient space need not define an image.
- Construct the subspace of allowable n-tuples.

CONSTRUCTING THE REDUCED SUBSPACE H
Does an image I belong to the class of structures? It must satisfy certain conditions:
a) Its volume fraction must equal the specified volume fraction
b) The volume fraction at every pixel must be between 0 and 1
c) It should satisfy the given two-point correlation
Thus the n-tuple (a1, a2, .., an) must further satisfy some constraints. Enforce these constraints sequentially.
1. Pixel based constraints
Microstructures are represented as discrete images, and pixels have bounds. This results in $2n^3$ inequality constraints.

CONSTRUCTING THE REDUCED SUBSPACE H
2. First order constraints
The microstructure must satisfy the experimental volume fraction. This results in one linear equality constraint on the n-tuple.
3. Second order constraints
The microstructure must satisfy the experimental two-point correlation. This results in a set of quadratic equality constraints. This can be written as [equation]

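The pixel-based and first order constraints are easy to test for a given n-tuple. The sketch below is illustrative only: `is_allowable`, its tolerance and the toy eight-pixel "image" are hypothetical, and the second order (two-point correlation) constraint is not checked.

```python
import numpy as np

def is_allowable(a, mean_image, eigenimages, volume_fraction, tol=1e-8):
    """Check the pixel bounds and the volume-fraction equality for coefficients a.

    mean_image: flattened mean image, shape (P,)
    eigenimages: shape (n, P), one flattened eigenimage per row
    """
    image = mean_image + a @ eigenimages           # I = I_avg + sum_i a_i I_i
    # Pixel based constraints: two inequalities (>= 0 and <= 1) per pixel.
    pixel_ok = np.all(image >= -tol) and np.all(image <= 1.0 + tol)
    # First order constraint: one equality that is linear in a.
    vf_ok = abs(image.mean() - volume_fraction) <= tol
    return bool(pixel_ok and vf_ok)

# Hypothetical toy data: 8 pixels, one zero-mean eigenimage
mean_image = np.full(8, 0.2)
eigenimages = np.array([[1, -1, 1, -1, 1, -1, 1, -1]]) / np.sqrt(8)
ok = is_allowable(np.array([0.1]), mean_image, eigenimages, 0.2)
too_far = is_allowable(np.array([1.0]), mean_image, eigenimages, 0.2)
```

Here the small coefficient keeps every pixel inside [0, 1], while the large one pushes some pixels below zero, so only the first n-tuple is allowable.
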
SEQUENTIAL CONSTRUCTION OF THE SUBSPACE
Computational complexity
- Pixel based constraints + first order constraints result in a simple convex hull problem
- Enforcing second order constraints becomes a problem in quadratic programming
Sequential construction of the subspace
- First enforce first order statistics
- On this reduced subspace, enforce second order statistics
[Figure: example for a three-dimensional space (3 eigenimages)]

THE REDUCED MODEL
- The sequential construction procedure yields a subspace H such that all n-tuples from this space result in acceptable microstructures.
- H represents the space of coefficients that map to allowable microstructures. Since H is a plane in N-dimensional space, we call this the 'material plane'.
- Since each of the microstructures in the 'material plane' satisfies all required statistical properties, they are equally probable. This observation provides a way to construct the stochastic model for the allowable microstructures: define [equation] such that [equation].
- This is our reduced stochastic model of the random topology of the microstructure class.

PROPERTY EXTRACTION
- Reconstruction of a well-characterized material: tungsten-silver composite [1]
- Produced by infiltrating a porous tungsten solid with molten silver
[Figure: micrograph, 640x640 pixels = 198 μm x 198 μm]
1. S. Umekawa, R. Kotfila, O. D. Sherby, Elastic properties of a tungsten-silver composite above and below the melting point of silver, J. Mech. Phys. Solids 13 (1965) 229-230.

PROPERTY EXTRACTION
- Digitized two-phase microstructure image: black phase is Ag, white phase is W
- Simple matrix operations to extract image statistics
- First order statistics: volume fraction: 0.2
- Second order statistics: 2-point correlation
[Figure: two-point correlation g(r) versus r (μm), for r from 0 to 20 μm]

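One way to extract these statistics from a digitized two-phase image is an FFT-based autocorrelation. The sketch below is an assumption-laden illustration: it treats the image as periodic, averages radially over rounded pixel distances, and normalizes g(r) so that g(0) = 1, which may differ from the normalization used for the plotted curve. The function name and the random test image are hypothetical.

```python
import numpy as np

def two_point_correlation(phase):
    """Radially averaged two-point probability S2(r) of a binary 2D image.

    phase: 2D array of 0/1 values (1 marks the phase of interest).
    Assumes periodic boundaries; r is measured in pixels.
    """
    phase = np.asarray(phase, dtype=float)
    F = np.fft.fft2(phase)
    # Periodic autocorrelation: probability that two points separated by a
    # given offset both fall in the phase of interest.
    s2_map = np.fft.ifft2(F * np.conj(F)).real / phase.size
    ny, nx = phase.shape
    dy = np.minimum(np.arange(ny), ny - np.arange(ny))
    dx = np.minimum(np.arange(nx), nx - np.arange(nx))
    r = np.rint(np.hypot(dy[:, None], dx[None, :])).astype(int)
    sums = np.bincount(r.ravel(), weights=s2_map.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)

rng = np.random.default_rng(1)
phase = (rng.random((32, 32)) < 0.2).astype(float)   # hypothetical image
vf = phase.mean()                                    # first order statistic
s2 = two_point_correlation(phase)                    # second order statistic
g = (s2 - vf**2) / (vf - vf**2)                      # normalized so g(0) = 1
```

At zero separation S2 reduces to the volume fraction, which gives a quick consistency check on the extraction.
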
MICROSTRUCTURE RECONSTRUCTION
- Statistical information available: first and second order statistics
- Reconstruct three-dimensional microstructures that satisfy these experimental statistical relations
GAUSSIAN RANDOM FIELDS
- GRF: model interfaces as level cuts of a function
- Build a function y(r). The model microstructure is given by level cuts of this function.
- y(r) has a field-field correlation given by g(r)
- If this function is known, y(r) can be constructed as [equation], where the random parameters are, in turn, uniformly distributed over the unit sphere, uniformly distributed over [0, 2π), and distributed according to [equation]

MICROSTRUCTURE RECONSTRUCTION
Relate experimental properties to [equation]
1. Two-phase microstructure: impose level cuts on y(r). Phase 1 if [equation]
2. Relate [equation] to statistics:
- first order statistics: [equation], where [equation]
- second order statistics: [equation]
Set [equation], [equation] and [equation] for the Gaussian random field to match the experimental statistics.

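A level-cut random field of this kind can be sketched as follows, assuming a commonly used random plane-wave construction, y(r) = sqrt(2/N) Σ cos(k_i n_i · r + φ_i), with directions n_i uniform on the unit sphere and phases φ_i uniform on [0, 2π). The gamma distribution used for the wavenumbers k_i and the empirical-quantile threshold are placeholders chosen for illustration; they are not the fitted spectrum or level-cut relation of the talk. The function name and parameters are hypothetical.

```python
import numpy as np

def level_cut_grf(shape, volume_fraction, n_waves=200, mean_wavenumber=0.5, seed=0):
    """Two-phase 3D microstructure from a level cut of a random-wave field."""
    rng = np.random.default_rng(seed)
    # Wave directions: uniformly distributed over the unit sphere.
    n = rng.normal(size=(n_waves, 3))
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    # Phases: uniformly distributed over [0, 2*pi).
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n_waves)
    # Wavenumbers: ASSUMED gamma distribution (stand-in for the spectral density).
    k = rng.gamma(shape=4.0, scale=mean_wavenumber / 4.0, size=n_waves)
    axes = [np.arange(m) for m in shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)   # pixel coordinates
    y = np.zeros(shape)
    for ki, ni, pi_ in zip(k, n, phi):
        y += np.cos(ki * (grid @ ni) + pi_)
    y *= np.sqrt(2.0 / n_waves)
    # Level cut: phase 1 where y lies below the threshold that yields the
    # requested volume fraction (chosen empirically here).
    threshold = np.quantile(y, volume_fraction)
    return (y <= threshold).astype(np.uint8)

micro = level_cut_grf((16, 16, 16), volume_fraction=0.2)
```

Choosing the threshold from the empirical quantile guarantees the first order statistic on the sampled grid; matching the two-point correlation would additionally require the wavenumber distribution to be fitted to g(r).
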
FITTING THE GRF PARAMETERS
- Assume a simplified form for the far field correlation function [equation]
- Three parameters: β is the correlation length, d is the domain length and rc is the cutoff length
- Use least-squares minimization to find the optimal fit
[Figure: g(r) versus r (μm), for r from 0 to 20 μm]

3D MICROSTRUCTURE RECONSTRUCTION
[Figure: reconstructed 3D microstructures, 20 μm x 20 μm x 20 μm (64x64x64 pixel) and 40 μm x 40 μm x 40 μm (128x128x128 pixel), alongside the 200 μm x 200 μm image]

MODEL REDUCTION
Principal component analysis
[Figure: normalized eigenvalue versus eigen number, for eigen numbers up to 30]
- First 9 eigenvalues from the spectrum chosen
Constructing the reduced subspace and the stochastic model
- Enforcing the pixel based bounds and the linear equality constraint (of volume fraction) was formulated as a convex hull problem. A primal-dual polytope method was employed to construct the set of vertices.
- Enforcing the second order constraints was performed through the quadratic programming tools in the optimization toolbox in Matlab.
- Two separate cases are considered in this example. In the first case, only the first-order constraints (volume fraction) are used to construct the subspace H. In the second case, both first-order and second-order constraints (volume fraction and two-point correlation) are used to construct the subspace H.

PHYSICAL PROBLEM
- Investigate effects of limited topological information on diffusion in heterogeneous random media
- Structure size: 40 x 40 x 40 μm, tungsten-silver matrix
- Left wall maintained at T = -0.5, right wall maintained at T = +0.5, all other surfaces insulated
- Heterogeneous property is the thermal diffusivity

Material   ρ (kg/m3)   k (W/mK)   c (J/kgK)
Tungsten   19250       174        130
Silver     10490       430        235

- Diffusivity ratio αAg/αW = 2.5

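The quoted diffusivity ratio follows from the standard definition α = k / (ρ c) applied to the tabulated values. The short check below uses only the numbers on the slide; the function name is a hypothetical helper.

```python
def thermal_diffusivity(k, rho, c):
    """Thermal diffusivity alpha = k / (rho * c), in m^2/s for SI inputs."""
    return k / (rho * c)

alpha_w = thermal_diffusivity(k=174.0, rho=19250.0, c=130.0)    # tungsten
alpha_ag = thermal_diffusivity(k=430.0, rho=10490.0, c=235.0)   # silver
ratio = alpha_ag / alpha_w                                      # about 2.5
```
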
COMPUTATIONAL DETAILS
- Construction of the stochastic solution: through sparse grid collocation, level 5 interpolation scheme used
- Number of deterministic problems solved: 15713
- Computational domain of each deterministic problem: 128x128x128 pixels
- Each deterministic problem was solved on an 8 x 8 x 8 coarse element grid (uniform hexahedral elements), with each coarse element having 16 x 16 x 16 fine-scale elements
- The solution of each deterministic VMS problem took about 34 minutes. In comparison, a fully resolved fine-scale FEM solution took nearly 40 hours.
- Computational platform: 40 nodes on local Linux cluster
- Total time: 56 hours
[Figure: error versus number of collocation points, log-log axes]

FIRST ORDER STATISTICS: MEAN TEMPERATURE
[Figure: mean temperature, panels (a) to (g)]

FIRST ORDER STATISTICS: HIGHER MOMENTS
[Figure: panels (a) to (f), including probability distribution functions of temperature]

SECOND ORDER STATISTICS: MEAN TEMPERATURE
[Figure: mean temperature, panels (a) to (g)]

SECOND ORDER STATISTICS: HIGHER MOMENTS
[Figure: panels (a) to (f), including probability distribution functions of temperature]

INCORPORATING MORE INFORMATION
- As more information is incorporated into the analysis, the subspace of allowable microstructures shrinks
- This corresponds to tighter probability distributions
[Figure: comparison of temperature PDFs at a point due to the application of first and second order constraints]
- A general methodology was presented for constructing a reduced-order microstructure model
- Using more sophisticated model reduction techniques to build the reduced-order microstructure model
- Extending the methodology to arbitrary types of microstructures

NON-LINEAR REDUCED ORDER MODELING FRAMEWORK

INPUT STOCHASTIC MODELS: LINEAR APPROACH
- PCA based approaches find the smallest coordinate representation of the data, but assume that the data lies in a linear vector space
- Only guaranteed to discover the true structure of data lying on a linear subspace of the high-dimensional input space
- What is the result when the data lies in a nonlinear space? As the number of input samples increases, PCA based approaches tend to overestimate the dimensionality of the reduced representation. This becomes computationally challenging.
[Figure: # of eigenvectors versus # of samples]
Further related issues:
- How to generalize it to other properties/structures? Can PCA be applied to other classes of microstructures, say, polycrystals?
- How does convergence change as the amount of information increases? Computationally?
NONLINEAR APPROACHES TO MODEL REDUCTION: IDEAS FROM IMAGE PROCESSING, PSYCHOLOGY

NONLINEAR REDUCTION: THE KEY IDEA
- Set of images. Each image = 64x64 = 4096 pixels
- Each image is a point in 4096-dimensional space
- But each and every image is related (they are pictures of the same object): same object but different poses
- That is, all these images lie on a unique curve (manifold) in $\mathbb{R}^{4096}$
- Can we get a parametric representation of this curve?
- Problem: can the parameters that define this manifold be extracted, given ONLY these images (points in $\mathbb{R}^{4096}$)?
- Solution: each image can be uniquely represented as a point in 2D space (UD, LR)
- Strategy: based on the 'manifold learning' problem
[Figure: different images of the same object: changes in up-down (UD) and left-right (LR) poses]

NONLINEAR REDUCTION: EXTENSION TO INPUT MODELS
- Given some experimental correlation that the microstructure/property variation satisfies
- Construct several plausible 'images' of the microstructure/property
- Each of these 'images' consists of, say, n pixels
- Each image is a point in n-dimensional space
- But each and every 'image' is related
- That is, all these images lie on a unique curve (manifold) in $\mathbb{R}^n$
- Can a low-dimensional parameterization of this curve be computed?
- Strategy: based on a variant of the 'manifold learning' problem
[Figure: different microstructure realizations satisfying some experimental correlations]

A FORMAL DEFINITION OF THE PROBLEM
- State the problem as a parameterization problem (also called the manifold learning problem)
- Given a set of N unordered points belonging to a manifold $\mathcal{M}$ embedded in a high-dimensional space $\mathbb{R}^n$, find a low-dimensional region $\mathcal{A} \subset \mathbb{R}^d$ that parameterizes $\mathcal{M}$, where d << n
- Classical methods in manifold learning include Principal Component Analysis (PCA) and multidimensional scaling (MDS). These methods have been shown to extract optimal mappings when the manifold is embedded linearly or almost linearly in the input space.
- In most cases of interest, the manifold is nonlinearly embedded in the input space, making the classical methods of dimension reduction highly approximate.
- Two approaches have been developed that can extract non-linear structures while maintaining the computational advantage offered by PCA [1,2].
1. J. B. Tenenbaum, V. De Silva, J. C. Langford, A global geometric framework for nonlinear dimension reduction, Science 290 (2000) 2319-2323.
2. S. Roweis, L. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000) 2323-2326.

AN INTUITIVE PICTURE OF THE STRATEGY
- Attempt to reduce dimensionality while preserving the geometry at all scales.
- Ensure that nearby points on the manifold map to nearby points in the low-dimensional space and faraway points map to faraway points in the low-dimensional space.
[Figure: 3D data reduced by the linear approach (PCA) and by the non-linear approach (unraveling the curve)]

KEY CONCEPT
1) Geometry can be preserved if the distances between the points are preserved: isometric mapping.
2) The geometry of the manifold is reflected in the geodesic distance between points.
3) The first step towards a reduced representation is to construct the geodesic distances between all the sample points.
[Figure: Euclidean distance versus geodesic distance between Pt A and Pt B]

THE NONLINEAR MODEL REDUCTION ALGORITHM
1) Given the N unordered sample points (microstructures, property maps, ...)
2) Compute the geodesic distance between each pair of samples (i, j).
3) Given the pairwise distance matrix between N objects, compute the location of N points {ξi} in $\mathbb{R}^d$ such that the distance between these points is arbitrarily close to the given distance matrix. This is the basic premise of a group of statistical methods called Multi Dimensional Scaling (MDS) [1].
[Figure: given N unordered samples → compute pairwise geodesic distance → perform MDS on this distance matrix → N points in a low-dimensional space]
1. T. F. Cox, M. A. A. Cox, Multidimensional scaling, Chapman and Hall, 1994.

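The three steps above can be sketched with NumPy alone. This is an illustrative Isomap-style pipeline under stated assumptions: the geodesic distance is approximated on a k-nearest-neighbor graph with Floyd-Warshall shortest paths, classical MDS is used for the embedding, and the input is a plain Euclidean distance matrix of points on an arc rather than the statistical distance between microstructures. The function name and parameters are hypothetical.

```python
import numpy as np

def isomap(D, n_neighbors, d):
    """Embed N samples in R^d from their (N, N) pairwise distance matrix D."""
    N = D.shape[0]
    # Step 2a: keep only short hops (each sample and its nearest neighbors).
    G = np.full((N, N), np.inf)
    idx = np.argsort(D, axis=1)[:, : n_neighbors + 1]    # includes the sample itself
    rows = np.repeat(np.arange(N), n_neighbors + 1)
    G[rows, idx.ravel()] = D[rows, idx.ravel()]
    G = np.minimum(G, G.T)                               # symmetrize the graph
    # Step 2b: geodesic distance as the shortest sum of hops (Floyd-Warshall).
    # The graph is assumed to be connected; otherwise entries stay infinite.
    for k in range(N):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # Step 3: classical MDS on the geodesic distance matrix.
    J = np.eye(N) - np.ones((N, N)) / N                  # centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    evals, evecs = np.linalg.eigh(B)
    top = np.argsort(evals)[::-1][:d]
    xi = evecs[:, top] * np.sqrt(np.maximum(evals[top], 0.0))
    return xi, G

# Hypothetical samples: 60 points along three quarters of a circle
t = np.linspace(0.0, 1.5 * np.pi, 60)
points = np.column_stack([np.cos(t), np.sin(t)])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
xi, G = isomap(D, n_neighbors=2, d=1)
```

For this arc the straight-line distance between the end points is about 1.41, while the graph distance is close to the arc length of about 4.71, so the one-dimensional coordinate `xi` unravels the curve instead of chording across it.
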
MATHEMATICAL DETAILS
1. Properties of the manifold $\mathcal{M}$ in $\mathbb{R}^n$
- How to compute geodesic distance? Sum over short hops. Need the notion of distance between samples.
- Flexibility in defining the distance measure. The distance measure defines the properties of the manifold that the samples lie on.
- The distance measure, D, is based on how much the microstructures vary. It is defined as the difference in statistical correlation between two microstructures:
$D(i,j) = |S(i) - S(j)|$
- The key to a reasonable dimension reduction is a good choice of the distance measure.
- Any choice of function is allowable as long as it satisfies the metric properties [1].
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", Journal of Computational Physics, in press.

MATHEMATICAL DETAILS
1. Properties of the manifold $\mathcal{M}$ in $\mathbb{R}^n$
a) $(\mathcal{M}, D)$ is a metric space. Ensure that $D(i,j) = |S(i) - S(j)|$ satisfies the properties of non-negativity, symmetry and the triangle inequality. Equivalence between microstructures: two microstructures are equivalent if they share the same higher order statistical correlation.
b) $(\mathcal{M}, D)$ is bounded.
c) $(\mathcal{M}, D)$ is dense.
d) $(\mathcal{M}, D)$ is complete.
e) $(\mathcal{M}, D)$ is a compact metric space [1,2].
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. J. R. Munkres, Topology, Second edition, Prentice-Hall, 2000.

MATHEMATICAL DETAILS
2. Mapping a compact manifold to a low-dimensional set
- Have no notion of the geometry of the manifold to start with. Hence cannot construct true geodesic distances!
$D_{\mathcal{M}}(i,j) = \inf_{\gamma} \{\mathrm{length}(\gamma)\}$
- Approximate the geodesic distance using the concept of graph distance $D_G(i,j)$: the distance between faraway points is computed as a sequence of small hops.
- This approximation, $D_G$, asymptotically matches the actual geodesic distance $D_{\mathcal{M}}$ in the limit of a large number of samples [1,2] (Theorem 4.5 in [1]):
$(1 - \lambda_1)\, D_{\mathcal{M}}(i,j) \le D_G(i,j) \le (1 + \lambda_2)\, D_{\mathcal{M}}(i,j)$
- Based on results on graph approximations to geodesics [2].
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. M. Bernstein, V. de Silva, J. C. Langford, J. B. Tenenbaum, Graph approximations to geodesics on embedded manifolds, Dec 2000.

MATHEMATICAL DETAILS
3. MDS and choosing the dimensionality of the reduced space
- Perform MDS on the geodesic distance matrix, i.e. perform an eigenvalue decomposition of the (double-centered) squared geodesic distance matrix. The eigenvectors of the largest d eigenvalues, scaled by the square roots of those eigenvalues, give the coordinates of the N points.
- The manifold has an intrinsic dimensionality. How to choose the correct value of d? (Related to issues of accuracy and computational effort.)
- Estimate the dimensionality of the manifold based on a novel geometrical probability approach (developed by A. Hero et al.), based on ideas from graph theory.
- The rate of convergence of the length functional, L, of the minimal spanning tree of the geodesic distance matrix is related to the dimensionality, d [1,2]:
$\log(L) = a \log(N) + \epsilon$, with $a = \frac{d-1}{d}$
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. J. A. Costa, A. O. Hero, Geodesic Entropic Graphs for Dimension and Entropy Estimation in Manifold Learning, IEEE Trans. on Signal Processing 52 (2004) 2210-2221.

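The slope relation can be turned into a rough dimension estimate: compute the minimal spanning tree (MST) length for several sample sizes, fit the slope a of log(L) against log(N), and invert a = (d-1)/d to get d = 1/(1-a). The sketch below is an assumption-heavy illustration: it uses Prim's algorithm on a dense distance matrix, random subsampling, and plain Euclidean distances of points on a flat 2D sheet in place of the geodesic distance matrix. All function names are hypothetical.

```python
import numpy as np

def mst_length(D):
    """Total edge length of the minimal spanning tree (Prim's algorithm)."""
    N = D.shape[0]
    in_tree = np.zeros(N, dtype=bool)
    in_tree[0] = True
    best = D[0].copy()                 # cheapest link from each node to the tree
    total = 0.0
    for _ in range(N - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        total += candidates[j]
        in_tree[j] = True
        best = np.minimum(best, D[j])
    return total

def dimension_from_slope(a):
    """Invert a = (d - 1) / d."""
    return 1.0 / (1.0 - a)

def estimate_dimension(D, sizes, seed=0):
    """Fit log(L) = a log(N) + eps over random subsamples and return d."""
    rng = np.random.default_rng(seed)
    lengths = []
    for n in sizes:
        idx = rng.choice(D.shape[0], size=n, replace=False)
        lengths.append(mst_length(D[np.ix_(idx, idx)]))
    a, _ = np.polyfit(np.log(sizes), np.log(lengths), 1)
    return dimension_from_slope(a)

rng = np.random.default_rng(0)
pts = rng.random((400, 2))             # hypothetical samples on a 2D sheet
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
d_est = estimate_dimension(D, sizes=[50, 100, 200, 400])
```

The estimate is statistical, so in practice it would be rounded to the nearest integer and checked for stability over several subsample draws.
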
THE REDUCED ORDER TOPOLOGICAL MODEL
- The procedure takes N unordered samples in $\mathcal{M} \subset \mathbb{R}^n$ and results in N points in a low-dimensional space, $\mathbb{R}^d$.
- The geodesic distance + MDS step (Isomap algorithm [1]) results in a low-dimensional convex, connected space [2], $\mathcal{A} \subset \mathbb{R}^d$.
- Using the N samples, the reduced space is given as $\mathcal{A} = \text{convex hull}(\{\xi_i\})$.
- $\mathcal{A}$ serves as the surrogate space for $\mathcal{M}$. Access variability in $\mathcal{M}$ by sampling over $\mathcal{A}$.
- BUT have only come up with the $\mathcal{M} \to \mathcal{A}$ map. Need the $\mathcal{A} \to \mathcal{M}$ map too.
1. J. B. Tenenbaum, V. De Silva, J. C. Langford, A global geometric framework for nonlinear dimension reduction, Science 290 (2000) 2319-2323.
2. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.

THE REDUCED ORDER TOPOLOGICAL MODEL
- Only have N pairs to construct the $\mathcal{A} \to \mathcal{M}$ map.
- Various possibilities based on the specific problem at hand. But have to be conscious about computational effort and efficiency.
- Illustrate 3 such possibilities below. Error bounds can be computed [1].
1. Nearest neighbor map
2. Local linear interpolation
3. Local linear interpolation with projection
[Figure: each of the three options shown as a map from the region in $\mathbb{R}^d$ to the manifold in $\mathbb{R}^n$]
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.

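The first two options can be sketched with a single function. The inverse-distance weights used here are an assumption made for illustration, since the interpolation weights are not given; with k = 1 the function reduces to the nearest neighbor map. The projection step of the third option is not included. Names and data are hypothetical.

```python
import numpy as np

def inverse_map(xi, xi_samples, x_samples, k=3):
    """Map a point xi of the reduced space back to the high-dimensional space.

    xi_samples: (N, d) low-dimensional coordinates of the N samples
    x_samples:  (N, n) the corresponding high-dimensional samples
    Uses an inverse-distance weighted average of the k nearest samples
    (ASSUMED weighting); k=1 gives the nearest neighbor map.
    """
    dist = np.linalg.norm(xi_samples - xi, axis=1)
    nearest = np.argsort(dist)[:k]
    if dist[nearest[0]] == 0.0:
        return x_samples[nearest[0]].copy()    # xi coincides with a sample
    w = 1.0 / dist[nearest]
    w /= w.sum()                               # weights sum to one
    return w @ x_samples[nearest]

# Hypothetical pairs (xi_i, x_i): 3 samples, d = 1, n = 2
xi_samples = np.array([[0.0], [1.0], [2.0]])
x_samples = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 20.0]])
x_mid = inverse_map(np.array([0.5]), xi_samples, x_samples, k=2)
x_nn = inverse_map(np.array([0.4]), xi_samples, x_samples, k=1)
```

Because the weights are non-negative and sum to one, the interpolated point lies in the convex hull of the neighboring samples, which keeps pixel values within the range spanned by those samples.
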
THE REDUCED ORDER TOPOLOGICAL MODEL
The algorithm consists of two parts.
1) Compute the low-dimensional representation of a set of N unordered sample points belonging to a high-dimensional space: given N unordered samples → compute pairwise geodesic distance → perform MDS on this distance matrix → N points in a low-dimensional space.
2) For using this model in a stochastic collocation framework, must sample points in $\mathcal{A}$. For an arbitrary point $\xi \in \mathcal{A}$, must find the corresponding point $x \in \mathcal{M}$. Compute the mapping $\mathcal{A} \to \mathcal{M}$ (from $\mathbb{R}^d$ to $\mathbb{R}^n$).

NON LINEAR DIMENSION REDUCTION
- The developments detailed before are applied to find a low-dimensional representation of these 1000 microstructure samples.
- The optimal representation of these points was a 9-dimensional region.
- Able to theoretically show that these points in 9D space form a convex region in $\mathbb{R}^9$.
- This convex region now represents the low-dimensional stochastic input space.
- Use sparse grid collocation strategies to sample this space.
[Figure: Log(length of MST) versus Log(Samples)]

COMPUTATIONAL DETAILS
- Construction of the stochastic solution: through sparse grid collocation, level 5 interpolation scheme used
- Number of deterministic problems solved: 26017
- Computational domain of each deterministic problem: 65x65x65 pixels
- Total number of dof's: $65^3 \times 26017 \approx 7 \times 10^9$
- Computational platform: 50 nodes on local Linux cluster (x2 3.2 GHz)
- Total time: 210 minutes
[Figure: error versus # of collocation points, log-log axes]

MEAN TEMPERATURE PROFILE
[Figure: (a) Temp contour, (b-d) Temp isocontours, (e-g) Temp slices]

HIGHER ORDER STATISTICS
[Figure: (a) Temp contour, (b) Temp isocontours, (c) PDF of temp (probability distribution function versus temperature), (d-f) Temp slices]

MODELS OF POLYCRYSTALLINE MATERIALS
- Microstructural variations affect macro-scale properties
- Polycrystals are a lot more difficult to analyze than two-phase or multi-phase materials
- Multiple layers of representation: (a) grain distribution, (b) orientation distribution
- Continuum distribution of orientation. Solvable problem
- Limit analysis to grain distribution
- This is a tricky problem: have to faithfully encode grain distribution features and should quickly reconstruct approximate grains

MICROSTRUCTURAL FEATURE: GRAIN SIZE
- 2D microstructures: grain size is obtained by using a series of equidistant, parallel lines on a given microstructure at different angles.
- 3D microstructures: the size of a grain is chosen as the number of voxels (proportional to volume) inside a particular grain. Grain size is computed from the volumes of individual grains.

EXPERIMENTAL DATA Polarized light micrograph of aluminium alloy AA3302 (source Wittridge NJ et al. Mat.Sci.Eng. A, 1999)
Extract grain size distribution from image
[Figure: probability versus grain size (µm); annotations: =10.97, =124.90]
RECONSTRUCTING PLAUSIBLE DATA SET
Reconstruct N = 200 microstructures that satisfy the experimental grain size distribution.
Utilize stochastic optimization to construct the microstructures.
MODEL REDUCTION OF POLYCRYSTALS
The key feature to encode is the grain boundary.
But grain boundaries are sparsely distributed in the domain.
Need a strategy to compress all the grain boundary information and remove all the interior information.
Look at different types of transforms:
1) The transform and its inverse should be computationally efficient.
2) Data size should be limited.
3) Should be able to process grain boundaries: monochromatic lines on a monochromatic background.
4) Translation and rotation invariant.
Radon and Fourier transform.
IMAGE TRANSFORMATION
Radon transform: The Radon transform of an image represented by the function f(x,y) can be defined as a series of line integrals through f(x,y) at different offsets ρ from the origin and different angles θ.

$$R(\rho,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,\delta(\rho - x\cos\theta - y\sin\theta)\,dx\,dy$$

Why the Radon transform?
It collects line integral information. In some sense it is similar to Heyn's intercept method.
Extensively utilized in CAT scanning and medical imaging: mature applications.
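The line-integral definition above can be illustrated with a crude discrete version. The sketch below is not the implementation used in the talk: it assumes unit pixel spacing, measures coordinates from the image centre, and assigns each pixel to the nearest integer offset ρ = x cos θ + y sin θ. The function name `radon_nearest_bin` is hypothetical.

```python
import numpy as np

def radon_nearest_bin(image, thetas):
    """Crude discrete Radon transform (sinogram) of a 2D array.

    For each angle theta, every pixel value is added to the offset bin nearest
    to rho = x*cos(theta) + y*sin(theta). Rows of the result index rho,
    columns index theta.
    """
    n_rows, n_cols = image.shape
    # Pixel-centre coordinates measured from the image centre.
    y, x = np.mgrid[0:n_rows, 0:n_cols].astype(float)
    x -= (n_cols - 1) / 2.0
    y -= (n_rows - 1) / 2.0
    # Enough bins to cover the largest possible |rho| (half the diagonal).
    half_diag = int(np.ceil(np.hypot(n_rows, n_cols) / 2.0))
    n_bins = 2 * half_diag + 1
    sinogram = np.zeros((n_bins, len(thetas)))
    for k, theta in enumerate(thetas):
        rho = x * np.cos(theta) + y * np.sin(theta)
        idx = np.rint(rho).astype(int) + half_diag
        sinogram[:, k] = np.bincount(idx.ravel(), weights=image.ravel(),
                                     minlength=n_bins)
    return sinogram

# Example: two bright pixels on a 9 x 9 background.
img = np.zeros((9, 9))
img[4, 4] = 1.0
img[2, 6] = 2.0
angles = np.linspace(0.0, np.pi, 8, endpoint=False)
sino = radon_nearest_bin(img, angles)
```

Each column of the sinogram sums to the total image intensity, which is a quick sanity check that every pixel is counted exactly once per angle.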
IMAGE TRANSFORMATION
Forward Radon transform
Inverse Radon transform (filtered back-projection): two steps, filtering and then projection.

$$f(x,y) = \int_{0}^{\pi} f_i(x\cos\theta + y\sin\theta,\ \theta)\,d\theta$$

Without filtering, the reconstructed image is heavily blurred, so a high-pass filter is applied to the sinogram data in the frequency domain: apply a 1-D DFT to the sinogram data for each angle, multiply by the filter, and then use the inverse DFT to reconstruct the data.
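The filtering step described above (1-D DFT per angle, multiply by a high-pass filter, inverse DFT) can be sketched as follows. The slides do not name the filter; the ramp filter |ω| used here is an assumption, chosen because it is the standard choice in filtered back-projection. The function name `ramp_filter_sinogram` is hypothetical, and the back-projection step itself is not shown.

```python
import numpy as np

def ramp_filter_sinogram(sinogram):
    """High-pass filter each projection (column) of a sinogram.

    Rows index the offset rho, columns index the angle theta. The ramp filter
    |omega| is an assumed choice of high-pass filter.
    """
    n_offsets = sinogram.shape[0]
    ramp = np.abs(np.fft.fftfreq(n_offsets))       # |omega| per DFT frequency
    spectrum = np.fft.fft(sinogram, axis=0)        # 1-D DFT for each angle
    filtered = np.fft.ifft(spectrum * ramp[:, None], axis=0)
    return filtered.real                           # input is real-valued

# A constant projection contains only the zero frequency, which the ramp removes.
flat = ramp_filter_sinogram(np.ones((16, 4)))
```

Because the ramp is zero at zero frequency, every filtered projection sums to zero; back-projecting these filtered projections is what removes the blur of the plain back-projection.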
IMAGE TRANSFORMATION
REDUCED ORDER MODEL
The distance measure, D, is based on how much the microstructures vary. Since the Radon transform encodes (in a sense) the grain volume, define the distance as the difference between the Radon-transformed images:

$$D(i,j) = \| R_i(\rho,\theta) - R_j(\rho,\theta) \|$$

Use the filtered or the unfiltered Radon transform?
Given N unordered samples:
Compute the pairwise geodesic distances.
Perform MDS on this distance matrix.
Result: N points in a low-dimensional space.
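The three steps above (pairwise distances between Radon transforms, geodesic distances, MDS) can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not the authors' code: the geodesic distance is approximated by shortest paths on a k-nearest-neighbour graph (the slides do not say how the graph is built), classical MDS is used, the samples are assumed distinct, and the graph is assumed connected. All function names are hypothetical.

```python
import numpy as np

def pairwise_distances(samples):
    """D(i, j) = || R_i - R_j || for samples stacked along axis 0."""
    flat = samples.reshape(len(samples), -1)
    diff = flat[:, None, :] - flat[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def geodesic_distances(dist, n_neighbors):
    """Shortest-path distances on a symmetric k-nearest-neighbour graph."""
    n = dist.shape[0]
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    # Column 0 of the argsort is the point itself (distance zero).
    nearest = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        graph[i, nearest[i]] = dist[i, nearest[i]]
        graph[nearest[i], i] = dist[i, nearest[i]]
    for k in range(n):  # Floyd-Warshall relaxation through node k
        graph = np.minimum(graph, graph[:, k:k + 1] + graph[k:k + 1, :])
    return graph

def classical_mds(dist, n_components):
    """Embed points so that Euclidean distances approximate `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

# Example: six samples lying on a straight line are recovered in one dimension.
line = np.arange(6.0).reshape(6, 1)
D = pairwise_distances(line)
G = geodesic_distances(D, n_neighbors=2)
Y = classical_mds(G, n_components=1)
```

In the setting of the slides, `samples` would hold the N Radon-transformed microstructures and `Y` would be the N points in the low-dimensional space.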
DIMENSIONALITY OF THE REDUCED MODEL
The dimensionality of the parametric space is computed by applying the BHH theorem, which connects the dimensionality of the surface to the length functional of a graph.
[Figure: log(length functional) versus log(number of samples); annotation: d = 31]
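One common way to use the BHH (Beardwood-Halton-Hammersley) theorem for dimension estimation is sketched below. It relies on the scaling L(n) ∝ n^((d-1)/d) of the length functional with the number of samples n, so the slope a of log L against log n gives d = 1/(1 - a). The slides do not say which graph is used; the Euclidean minimal spanning tree here is an assumption, as are the function names and the nested-subsample scheme.

```python
import numpy as np

def mst_length(points):
    """Total edge length of the Euclidean minimal spanning tree (Prim's method)."""
    n = len(points)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = np.linalg.norm(points - points[0], axis=1)  # distance to the tree
    total = 0.0
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))                  # closest outside point
        total += candidates[j]
        in_tree[j] = True
        best = np.minimum(best, np.linalg.norm(points - points[j], axis=1))
    return total

def estimate_dimension(points, sample_sizes, rng):
    """Fit log L(n) = a log n + b on nested subsamples and return d = 1/(1 - a)."""
    order = rng.permutation(len(points))
    lengths = [mst_length(points[order[:n]]) for n in sample_sizes]
    slope, _ = np.polyfit(np.log(sample_sizes), np.log(lengths), 1)
    return 1.0 / (1.0 - slope)

# Example: points on a line segment should give a dimension close to 1.
segment = np.linspace(0.0, 1.0, 200)[:, None]
d_est = estimate_dimension(segment, [25, 50, 100, 200], np.random.default_rng(0))
```

The estimate is noisy for small sample counts, so in practice the fit would be averaged over several random subsamples.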
SAMPLING AND RECONSTRUCTION
Consider random points in the hypercube.
Reconstruct the polycrystal corresponding to each point based on the data: direct interpolation of the Radon transform followed by inversion.
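The interpolation step can be sketched as a weighted average of the Radon transforms of the nearest data points in the low-dimensional space. The slides say only "direct interpolation"; the inverse-distance weights, the neighbour count, and the function name `interpolate_sinogram` below are assumptions made for illustration. The inversion of the interpolated sinogram is not shown.

```python
import numpy as np

def interpolate_sinogram(query, coords, sinograms, n_neighbors=3, eps=1e-12):
    """Inverse-distance-weighted average of the sinograms nearest to `query`.

    coords:    (N, d) low-dimensional coordinates of the N data points.
    sinograms: (N, n_rho, n_theta) Radon transforms of the same data points.
    """
    dist = np.linalg.norm(coords - query, axis=1)
    nearest = np.argsort(dist)[:n_neighbors]
    weights = 1.0 / (dist[nearest] + eps)   # eps avoids division by zero
    weights /= weights.sum()
    return np.tensordot(weights, sinograms[nearest], axes=1)

# Example with four data points in a 2D reduced space and constant sinograms.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sinograms = np.stack([np.full((5, 4), float(v)) for v in range(4)])
at_sample = interpolate_sinogram(coords[2], coords, sinograms)
midpoint = interpolate_sinogram(np.array([0.5, 0.0]), coords, sinograms,
                                n_neighbors=2)
```

Because the result is an average, sharp grain boundaries that do not coincide across neighbours are smeared, which grows worse as more neighbours are included.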
SOME CHALLENGES
Reconstruction improves as the number of data points increases, i.e., as A becomes densely populated.
Experimental information, gappy data.
As the number of neighbors used in reconstruction increases, the reconstruction degrades. The reason is averaging.
No filtering or heuristics have been used so far, but they seem like a good idea and could result in better microstructures (line merging, prefiltering).
Investigate other translation- and rotation-invariant transforms: Hough transform, steerable pyramid.
Extension to 3D is straightforward: use the 3D Radon transform.
Applications of classification and reduced models of structure-property-process maps for tailored materials
PROCESSING PATH DESIGN
Tailor microstructures so that desired properties can be achieved: controlled deformation or thermal treatment.
'Processing path design' to realize microstructures with optimal properties.
Non-uniqueness in the processing path solution.
This problem cannot be addressed solely using conventional optimization schemes.
Data mining strategies come naturally to such problems.
Development of a database that can accommodate unknown microstructures into newly formed classes without user intervention.
MODEL REDUCTION AND STATISTICAL LEARNING
Model reduction results in low-complexity and low-dimensional models of the microstructural space. Potentially results in a huge reduction in dimensions.
Interrogate sample microstructures to construct the corresponding process and property spaces.
Utilize classification and statistical learning frameworks to construct relationships between low-dimensional models of the microstructure, process and property spaces.
Unsupervised learning strategies for automated design: X-means classifier.
[Figure: process plane]
[Figure: process-property plane; axes: Taylor factor along RD, Taylor factor along TD; labelled points C and R]
CLASSIFICATION HIERARCHY: MICROSTRUCTURES Classify/tessellate reduced space based on features
CLASSIFICATION HIERARCHY: TEXTURE Classify/tessellate reduced space based on fiber orientations
K-MEANS CLUSTERING
Find the cluster centers {C1, C2, …, Ck} such that the sum of the squared 2-norm distances between each feature xi, i = 1, …, n, and its nearest cluster center Ch is minimized.
Cost function:

$$J(C_1, C_2, \ldots, C_k) = \sum_{i=1}^{n} \min_{h=1,\ldots,k} \left( \frac{1}{2} \| x_i - C_h \|_2^2 \right)$$

Each class is affiliated with multiple processes.
[Figure: database of ODFs, feature space, identified clusters]
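The cost function above can be evaluated directly, and one standard way to decrease it is Lloyd's iteration (assign each feature to its nearest center, then move each center to the mean of its members). The slides state only the cost; the use of Lloyd's iteration and the function names below are assumptions made for illustration.

```python
import numpy as np

def kmeans_cost(features, centers):
    """J = sum_i min_h 0.5 * ||x_i - C_h||_2^2."""
    sq_dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return 0.5 * sq_dist.min(axis=1).sum()

def lloyd_step(features, centers):
    """One assignment/update sweep; returns (new centers, labels)."""
    sq_dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    labels = sq_dist.argmin(axis=1)
    new_centers = centers.copy()
    for h in range(len(centers)):
        members = features[labels == h]
        if len(members) > 0:          # leave empty clusters where they are
            new_centers[h] = members.mean(axis=0)
    return new_centers, labels

# Example: two well-separated pairs of features and two initial centers.
features = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[0.0, 0.0], [10.0, 0.0]])
cost_before = kmeans_cost(features, centers)
centers, labels = lloyd_step(features, centers)
cost_after = kmeans_cost(features, centers)
```

Each sweep cannot increase J, so repeating `lloyd_step` until the labels stop changing gives a local minimum of the cost.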
ADAPTIVE REDUCED MODELS: ACCELERATED DESIGN
[Figure: reduced-order expansion, (image) = a1 (image) + a2 (image) + .. + an (image)]
Linking data-driven reduced order models for microstructure and texture evolution is potentially very significant.
The classification technique is database-driven, and the availability of existing information can be further utilized to accelerate the texture evolution models.
Adaptivity to account for the sensitivity of different features.
This provides additional information of significance: which processes affect which features, and which features affect different properties.
Tangible input for further experimentation.
THE DESIGN FRAMEWORK
DESIGN FOR DESIRED ODF: A MULTI STAGE PROBLEM
Optimal reduced order control
Initial guess: α1 = 0.65, α2 = -0.1
Stage 1: Plane strain compression (α1 = 0.9472)
Stage 2: Compression (α2 = -0.2847)
Gradients are obtained from reduced order sensitivity analysis.
20x faster than full optimization.
[Figure: desired ODF; normalized objective function versus iteration index]
DESIGN FOR DESIRED ODF: A MULTI STAGE PROBLEM
Crystal <100> direction: easy direction of magnetization, zero power loss.
External magnetization direction: h
Stage 1: Shear-1 (α1 = 0.9745)
Stage 2: Tension (α2 = 0.4821)
[Figure: normalized objective function versus iteration index]
[Figure: magnetic hysteresis loss (W/kg) versus angle from the rolling direction; curves: desired property distribution, optimal (reduced), initial]
CONCLUSIONS
Data-driven non-linear reduced order models of microstructures have been developed.
Very significant when performing computationally demanding operations (searching, contouring) in intrinsically high-dimensional property-process-structure spaces.
Naturally coupled with statistical learning and unsupervised classification strategies to effectively estimate optimal processing routes for tailored materials.
1) B. Ganapathysubramanian and N. Zabaras, "Modelling diffusion in random heterogeneous media: Data-driven models, stochastic collocation and the variational multi-scale method", Journal of Computational Physics, Vol. 226, pp. 326-353, 2007.
2) B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", Journal of Computational Physics, submitted.
3) V. Sundararaghavan and N. Zabaras, "A statistical learning approach for the design of polycrystalline materials", Statistical Analysis and Data Mining, submitted.
4) V. Sundararaghavan and N. Zabaras, "Linear analysis of texture-property relationships using process-based representations of Rodrigues space", Acta Materialia, Vol. 55, Issue 5, pp. 1573-1587, 2007.