Using Genetic Programming for an Advanced Performance Assessment of Industrially Relevant Heterogeneous Catalysts
L.A. Baumes1*, A. Blansché2, P. Serna1, A. Tchougang2, N. Lachiche2, P. Collet2, A. Corma1
1 Instituto de Tecnología Química, UPV, Av. Naranjos s/n, E-46022 Valencia, Spain
2 Université Louis Pasteur, LSIIT, FDBT, Pôle API, F-67400 Illkirch, France
* Corresponding author(s)
Abstract
Besides the ease and speed brought by automated synthesis stations and reactor technologies in materials science, adapted informatics tools must be further developed in order to handle the increase in throughput and data volume without slowing down the whole process. This paper reports the use of genetic programming (GP) in heterogeneous catalysis. Although GP has received little attention in this domain, it is shown how such an approach can be turned into a singular and powerful tool for solid optimization, discovery, and monitoring. Jointly with neural networks, the GP paradigm is employed to accurately and automatically estimate the whole “conversion versus time” curve in the epoxidation of large olefins using the titanosilicates Ti-MCM-41 and Ti-ITQ-2 as catalysts. In contrast to previous studies in combinatorial materials science and high-throughput screening, it was possible to estimate the entire evolution of the catalytic reaction for unsynthesized catalysts. Consequently, the evaluation of the performance of virtual solids is not reduced to a single point (e.g. the conversion level at only one given reaction time, or the initial reaction rate). The methodology is thoroughly detailed, with emphasis on the comparison between a newly proposed CAX crossover operator and the traditional one.
Keywords: High-throughput, Data Mining, Genetic Programming, Materials Science, Heterogeneous Catalysis
1. Introduction
The availability of long-chain linear olefins from Fischer-Tropsch units opens new possibilities to obtain long-chain aliphatic epoxides that can be functionalised for application in the production of lubricants, plasticizers, chemicals and fine chemicals. Among the different catalytic systems used to carry out the epoxidation of double bonds, micro- and mesoporous titanosilicates1,2,3 have been shown to be more efficient catalysts than other metal-based materials.4,5,6 Considering this, and the fact that extra-large pores or high external surface areas are required to avoid diffusional restrictions when reacting large olefins, the structured mesoporous material7,8,9 MCM-41 and the delaminated zeolite10,11,12,13,14 ITQ-2 were selected in this paper as silica supports for grafting active Ti species (see Figure 1).
Figure 1. Synthesis of the catalysts. Right - First, one of the two supports is selected. Then a given amount of titanium is grafted onto the surface. Finally, a given amount of one of the four selected silylating agents is grafted onto the solid. Left - Example of a catalyst with ITQ-2 as support and SiMe3 as silylating agent.
On the other hand, the catalytic activity of such materials can be improved by properly controlling their surface properties, taking into account that the inherently hydrophilic nature of these silica supports can contribute to the deactivation of the Ti sites through water adsorption and the formation of by-products such as diols. Therefore, the design of an efficient epoxidation catalyst requires not only the synthesis of highly active sites, but also a way to prevent their poisoning during the reaction. Tailoring the hydrophobicity allows an optimum adsorption of the reactants, while reducing the adsorption of water and the ring opening of the desired product (epoxide) to form diols (see Figure 2), which would lead to the deactivation of the catalyst.7
Figure 2. Reaction scheme. The starting reactant is on the left, the target product is in the middle, and the molecule to avoid is on the right hand side.
In the present work, this control has been achieved by anchoring alkyl-silylated agents onto the catalyst surface (see Figure 1), whose apolar character modifies the final hydrophilicity of the material. During the silylation process, the amount of grafted molecules and the nature of the alkyl ligands are key parameters. Four different silylating agents have been selected to test their ability to protect the Ti active sites from the presence of water. Such a procedure introduces numerous variables to be optimised and requires a considerable experimental effort, which has been reduced by using high-throughput synthesis and testing apparatus,15 see Figure 3. In our previous work,16 the amount of grafted Ti, the level of silylation, and the nature of the silylating agent on the two different supports (MCM-41 and ITQ-2) were studied for the epoxidation of a C10 n-olefin, taking the initial reaction rate as the performance criterion. In contrast to most prior studies16,17 in combinatorial materials science and high-throughput screening applied to heterogeneous catalysis, which restrict the data analysis to a single standpoint (e.g. conversion value at one given reaction time, or initial reaction rate), we want to extract more information from the previously collected data in order to automatically compare the behaviour of the materials according to different catalytic criteria.
Figure 3. High-throughput equipment. Left - Automated solid and liquid handling station for catalyst synthesis. Right - Parallel batch reactors in which catalysts and reactants are mixed and analyzed.
In the absence of a complete kinetic study of the different synthesized catalysts, which could not be tackled in practice due to the relatively large number of experiments, a new approach needs to be proposed. To this end, a genetic programming18 (GP) technique is employed in order to discover one analytical function f behind the general shape common to all the previously tested catalysts c_i, i=1..C. The GP objective can be formulated as the minimization of the error e taking into account all the conversion measurements x_{i,t}, t=t_j..t_T, for the whole dataset. Therefore, for a given candidate function, its parameters \tilde{b}_{i,k}, k=1..N, are fitted for each solid using the Levenberg-Marquardt method.

\[ \min \sum_{i}^{C} e_{c_i}, \quad \text{with} \quad e_{c_i} = \sum_{t}^{T} \left( x_{i,t} - \tilde{x}_{i,t} \right)^{2} \quad \text{and} \quad \tilde{x}_{i,t} = f\left( \tilde{b}_{i,k},\, x_{i,t} \right) \qquad \text{(Equation 1)} \]
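As an illustration of the objective in Equation 1, the following minimal Python sketch sums the squared residuals over all catalysts for one candidate function. It assumes that the parameters of each catalyst have already been fitted (the Levenberg-Marquardt step is not reproduced here) and that the candidate function is evaluated at the sampling times; the names `catalyst_error`, `total_error` and `saturation_curve` are hypothetical and do not come from the original implementation.

```python
def catalyst_error(f, params, times, conversions):
    """e_ci: sum of squared residuals for one catalyst."""
    return sum((x - f(params, t)) ** 2 for t, x in zip(times, conversions))


def total_error(f, fitted_params, times, dataset):
    """GP objective: sum of e_ci over all catalysts.

    fitted_params[i] holds the (already fitted) parameters of catalyst i,
    dataset[i] its measured conversions at the given sampling times.
    """
    return sum(
        catalyst_error(f, params, times, conversions)
        for params, conversions in zip(fitted_params, dataset)
    )


def saturation_curve(params, t):
    """Illustrative candidate function k*t^n / (1 + k*t^n)."""
    k, n = params
    u = k * t ** n
    return u / (1.0 + u)
```

In a full GP run, such a total error would be computed for every candidate expression after refitting its parameters for each solid, which is why the parameter fitting dominates the evaluation cost.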
Once the best function is found, its parameters can be used as the outputs of a neural network, while the synthesis variables of the catalysts are the inputs. This makes it possible to obtain the parameter values for unsynthesized solids, and thus the entire conversion curve. Besides the ease and speed brought by automated synthesis stations and reactor technologies in materials science, adapted informatics tools must be further developed in order to handle the increase in throughput and data volume without slowing down the whole process.19 In Refs. 20, 21 and 22, the authors present a new genetic programming crossover operator called Context Aware Crossover (CAX) that yielded great results on several usual benchmarks. Therefore, it was decided to try it out on the real problem of catalyst performance modelling, which is a form of multi-objective symbolic regression. We report the use of genetic programming (GP) in heterogeneous catalysis. Although GP has received only little attention in this domain,23 this paper shows how such an approach can be turned into a singular and powerful tool for solid optimization and discovery. The GP paradigm is employed to accurately and automatically estimate the whole “conversion versus time” curve in the epoxidation of large olefins using the titanosilicates Ti-MCM-41 and Ti-ITQ-2 as catalysts. As a result, the evaluation of the performance of the virtual solids is not reduced to a single conversion value or to the initial reaction rate, while the knowledge gained about the response of the catalysts, expressed through a few parameters capturing the evolution of the reaction over time, can be applied to predict the behaviour of new (unsynthesized) materials. The methodology is thoroughly detailed, and the analysis of the GP crossover is emphasized by comparing the newly proposed CAX operator with the traditional one. This paper starts with a quick description of the real dataset. Then, the scheme of the employed methodology is outlined and the paper focuses on the CAX crossover. The results obtained on the catalyst optimisation problem and on different benchmarks allow the CAX to be compared with the standard GP crossover on the basis of consumed CPU time. Finally, a conclusion ends the paper.
2. Description of the input data and experimental setup
2.1.- Datasets
• Benchmarks
Standard benchmarks have been implemented in order to assess the efficiency of the CAX from the new point of view of CPU time, namely the quartic polynomial symbolic regression, the 11-bit multiplexer and the artificial ant on the Santa-Fe trail (with no ADF, as in Koza's implementation).
• Real application
The dataset obtained from the first step of the study16 is composed of 128 different synthesized and tested catalysts, i.e. 36 catalysts with SiMe3 as silylating agent and 6 for each of the three other silylating agents, each time on both supports, plus a selection of 10 new diverse catalysts per support for verifying the modelling (36×2+6×3×2+10×2=128). Catalyst activity was monitored for 16 hours, giving a series of seven conversion measurements, i.e. the quantity of initial reactant which is transformed over time, see Equation 2. Since reactions were performed in a closed reactor (so-called batch mode), the reactant concentration decreases over time, always providing “conversion versus time” curves characterized by a positive first derivative and a negative second derivative.
\[ \%\,\text{Conversion}(t_1) = x_{t_1} = \frac{x(t_0) - x(t_1)}{x(t_0)} \times 100 \qquad \text{(Equation 2)} \]
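The conversion measurement can be sketched in a few lines of Python. This is only an illustration, assuming that conversion is normalized by the initial amount of reactant x(t0), in line with the definition "quantity of initial reactant which is transformed"; the function name `percent_conversion` is hypothetical.

```python
def percent_conversion(x0, xt):
    """Percentage of the initial reactant amount x0 consumed when xt remains.

    Assumption: normalization by the initial amount x0.
    """
    if x0 <= 0:
        raise ValueError("initial reactant amount must be positive")
    return (x0 - xt) / x0 * 100.0


def conversion_series(x0, remaining):
    """Conversion at each sampling time, given the remaining reactant amounts."""
    return [percent_conversion(x0, xt) for xt in remaining]
```

Because the reactant amount can only decrease in batch mode, a series computed this way is non-decreasing, which matches the positive first derivative mentioned for the conversion curves.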
2.2.- Experimental setup
The CAX aims at improving the efficiency of the standard GP crossover by improving the second part of the operation, i.e. choosing where to graft into parent 1 (P1) a subtree chosen in parent 2 (P2). Usually, a “modern” GP crossover operator creates one new child from two selected parents (P1 and P2) by i) randomly selecting a subtree S2 in P2, with a 90% chance of selecting a node, ii) randomly selecting a subtree S1 in P1, pointing to a node if S2 is a node, and iii) creating a child which is a clone of P1 with subtree S2 in place of S1. With the CAX operator, after selecting S2 in P2, one tries to find the best place where it could be grafted in P1. All nodes of P1 can potentially receive the graft, excluding the root of P1 and the nodes at the bottom of P1 that would violate the depth constraint. All possibilities are deterministically explored by evaluating all possible children resulting from the graft of S2 wherever P1 can receive it (gray nodes in Figure 4), and the child with the best fitness is returned. Even though the exhaustive exploration of all potential crossover points in P1 is clearly expensive, Majeed and Ryan claimed exceptional results, which convinced us to try this new operator.
Figure 4. Context Aware Crossover (CAX): the shaded nodes in P1 are possible crossover points where the selected subtree S2 from P2 can go in.
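The exhaustive graft search described above can be sketched in Python as follows. This is a simplified illustration and not the implementation used in this work: trees are assumed to be nested tuples `(operator, child, child)` with strings or numbers as leaves, fitness is assumed to be an error to minimize, and all function names are hypothetical.

```python
def depth(tree):
    """Depth of a tree; a leaf has depth 1."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + max(depth(child) for child in tree[1:])


def node_paths(tree, path=()):
    """Yield the path (tuple of child indices) of every node, root first."""
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from node_paths(child, path + (i,))


def graft(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (graft(tree[i], path[1:], subtree),) + tree[i + 1:]


def context_aware_crossover(p1, s2, error, max_depth):
    """Try S2 at every admissible node of P1 and keep the best child."""
    best_child, best_error = None, None
    for path in node_paths(p1):
        if not path:                           # the root of P1 is excluded
            continue
        if len(path) + depth(s2) > max_depth:  # depth constraint
            continue
        child = graft(p1, path, s2)
        child_error = error(child)             # one evaluation per candidate
        if best_error is None or child_error < best_error:
            best_child, best_error = child, child_error
    return best_child, best_error


def evaluate(tree, x):
    """Evaluate a tree built from '+', '*', the variable 'x' and constants."""
    if tree == "x":
        return x
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "+" else a * b


def error_vs_target(tree):
    """Squared error against the toy target x*x + x on four sample points."""
    return sum((evaluate(tree, x) - (x * x + x)) ** 2 for x in (0.0, 1.0, 2.0, 3.0))


P1 = ("+", ("*", "x", "x"), 1.0)
S2 = "x"
best, err = context_aware_crossover(P1, S2, error_vs_target, max_depth=4)
```

The sketch makes the cost visible: one child evaluation is needed per admissible node of P1, whereas the standard crossover evaluates a single child.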
In their different papers, Majeed and Ryan suggest first using the standard GP crossover and starting the CAX only after some time, so curves were plotted for CAX_10 (CAX started after 10% of the run), CAX_40, CAX_70 and no CAX (cf. Figure 5). In Ref. 20, the population is made of 4,000 individuals for standard GP, whereas the algorithm using CAX only needed 200. Figure 5-left shows that if the same population size is used for standard GP and CAX, the generation count practically freezes when the CAX starts, due to the huge number of child evaluations that this operator needs. Therefore, it appears that using 200 individuals for CAX is an advantage for CAX rather than for GP. Thus, it was decided to reduce the population by 95% when CAX starts, so as to keep a generation count roughly equivalent to that of standard GP, as shown in Figure 5-centre. Note that in Ref. 20, fitness curves are given with reference to the number of generations. However, to produce one child the CAX needs many more evaluations than a standard crossover. In Ref. 21, performance is given with respect to the number of evaluations. One could argue, however, that not all individuals take the same time to be evaluated. For these reasons, the results are expressed here against computing time, all four plots being computed in parallel on a quad-processor machine exclusively devoted to the runs.
Figure 5. Top - Catalyst optimisation problem. Left - Number of generations for constant population; Centre - reduced population for CAX; Right - Results averaged on 4 runs for a reduced population size when CAX starts. Each run takes around 13 hours on a 3 GHz PC. Bottom - The implemented population reduction scheme is fair for the CAX, both evaluation- and generation-wise.
All the experiments were repeated over 50 runs, each lasting the number of seconds that allows standard GP to perform the same number of evaluations as in Koza's book. The experiments implement the simple solution of turning on the CAX after completion of a certain percentage of a run. In order to precisely evaluate the effects of CAX, the standard GP population size (4,000) is used at the beginning of CAX runs until the CAX operator is started, after which the population is reduced to 200 individuals. As a consequence, in this paper, the runs using CAX are identical to the standard GP run until the CAX operator is started.
3. Results
3.1.- Benchmarks
Koza's quartic polynomial symbolic regression problem (x^4 + x^3 + x^2 + x) is implemented. To obtain the CAX_10 curve, which takes 1200 seconds (see Figure 6), the algorithm begins with a population of 4,000 individuals for 120 seconds (10% of 1200), after which the CAX is started. At this moment, the population is reduced to 200 individuals using the following process: the best individual is kept (elitism), and the other 199 individuals are selected with a tournament of size 40 (1% of the original population size). Lower arities were also tested, with elitist tournament-7 and random selection, but tournament-40 yielded the best results. In Figure 6-Top-Left, it can be observed that all methods perform the same, even when the CAX is started and the population is reduced from 4,000 to 200. However, the average population fitness curve, Figure 6-Top-Right, clearly shows that when the CAX starts, the average population fitness is boosted to values not far from that of the best individual; apparently, this does not lead to premature convergence, which is an interesting feature. Unfortunately, the great improvement announced in Ref. 21 was not seen.
On the 11-bit multiplexer problem, the effects of CAX look much the same: in Figure 6-Centre-Left, starting the CAX does not seem to have much effect at all (although CAX_10 seems to have had a small negative impact on the best individual performance). On the right, one can clearly see the effect of CAX on the population average fitness whenever CAX is started. Before CAX starts, the curve is of course identical to that of standard GP. What is remarkable, though, is that for CAX_10 the population does not seem to have prematurely converged, even though the average fitness is very close to the best fitness. In the end, the best individual value for CAX_10 is the same as for standard GP.
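The population reduction applied when the CAX starts (elitism plus tournament-40) can be sketched as below. This is a minimal illustration under stated assumptions: fitness is taken as "higher is better", each tournament draws its contestants without replacement from the full population, and the same individual may win several tournaments. The function name `reduce_population` and its parameters are hypothetical.

```python
import random


def reduce_population(population, fitness, new_size=200, tournament_size=40, rng=None):
    """Shrink a population: keep the best individual, fill the rest by tournaments."""
    rng = rng or random.Random()
    survivors = [max(population, key=fitness)]   # elitism: best individual kept
    while len(survivors) < new_size:
        contestants = rng.sample(population, tournament_size)
        survivors.append(max(contestants, key=fitness))
    return survivors
```

With a tournament size equal to 1% of the original population, the selection pressure is high, which is consistent with the jump in average population fitness reported when the CAX starts.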
The last benchmark in Ref.21 was the Lawnmower problem.18 However, this problem uses ADFs that were not implemented in this work, since the original catalysis problem did not need
them. So, in order to take a comparable benchmark, the Artificial Ant on the Santa-Fe trail problem was chosen. On this benchmark, still no improvement on the best fitness can be seen, cf. Figure 6-Bottom-Left, although this time, CAX_10 does not seem to recover and catch up with Standard GP. Here again, a spectacular boost on the population average fitness is observed whenever the CAX starts.
Figure 6. Top - Quartic polynomial symbolic regression. Left: Best individual performance. Right: Average performance of the population. Middle - 11 bit multiplexer problem. Left - Best performance. Right- mean performance. Bottom - Artificial Ant on the Santa-Fe Trail. Left - Number of hits of the best individual. Right - Number of hits of the average population.
3.2.- Real application
This difficult problem was first tackled with a tailored GP algorithm that did not use the CAX operator. The adjusted fitness (in the Koza sense) of the best individual measured on the evaluation set is 0.93, which corresponds to a mean R2 of 0.93 considering all the catalysts and all measurements. The data had previously been divided into a learning set, a test set, and an evaluation set in order to detect overfitting. For the real application, it appears that the exhaustive search performed by the CAX to find the best grafting position does not yield much better results than when the same amount of CPU time is used by an ordinary standard crossover, see Figure 5. Finally, different functions can be extracted from the best Pareto front, see Figure 7. For example, the two-parameter function X = h(t) = k·t^n/(1 + k·t^n), found using the standard GP operator, is selected because it shows the best balance between fitting accuracy and number of parameters. On the other hand, the three-parameter function X = f(t) = a − b·c^t is also selected, since it minimizes the number of operators while showing approximately the same fitting quality.
[Figure 7 trees, in prefix notation. Left: -(a, ×(b, ^(c, t))). Right: /(×(k, ^(t, n)), +(1, ×(k, ^(t, n)))).]
Figure 7. Genetic programming trees: a − b·c^t on the left-hand side, and k·t^n/(1 + k·t^n) on the right.
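The two trees of Figure 7 can be written down and evaluated with a short Python sketch. The nested-tuple encoding, the `OPS` table and the helper names are illustrative assumptions, not the representation used by the GP system of this work.

```python
import operator

# Binary operators appearing in the two selected expressions.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}

# f(t) = a - b * c^t   (three parameters)
F_TREE = ("-", "a", ("*", "b", ("^", "c", "t")))
# h(t) = k*t^n / (1 + k*t^n)   (two parameters)
H_TREE = ("/", ("*", "k", ("^", "t", "n")),
               ("+", 1.0, ("*", "k", ("^", "t", "n"))))


def evaluate(tree, env):
    """Evaluate a tree; strings are looked up in env, numbers are constants."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]
    return tree


def count_operators(tree):
    """Number of operator nodes, one of the complexity measures discussed."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + sum(count_operators(child) for child in tree[1:])
```

Counting nodes this way gives 3 operators for the left tree and 6 for the right one, which illustrates the trade-off between number of operators and number of parameters.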
4- Using genetic programming results
GP algorithms using the CAX operator, as well as the ordinary standard crossover, were evaluated on a real set of data consisting of kinetic measurements for 128 different catalysts in the epoxidation of 4-decene. The application of GP to extract an analytical expression reproducing the relationship between the conversion level and the reaction time introduces new opportunities during the evaluation of the results, since all the information obtained during the experimental assays is retained. As a consequence, the loss of information is avoided through an expression capturing the evolution of conversion with reaction time for each catalyst, while data storage is also simplified by transforming the collection of discrete conversion vs. time values into the few parameters of the proposed equation. The automatic discovery and fitting of analytical expressions that reproduce kinetic experiments is a key issue for speeding up the data treatment stage, especially when a large amount of information has been generated using high-throughput technologies. Even when the behaviour of the tested catalysts is to be evaluated from one single standpoint (initial reaction rate, or the conversion level at a specific reaction time), it is necessary to normalize the experimental results to perform fair comparisons, since aliquots for each reaction are hardly ever taken at the same reaction times. In this scenario, having a simple equation to rapidly estimate initial rates or the conversion at any reaction time (interpolation) becomes crucial. By simply calculating the derivative of a given analytical function, reaction rates can be evaluated at any reaction time, including the initial reaction rate r0. For example, considering

\[ h(t) = \frac{k\,t^{n}}{1 + k\,t^{n}}, \qquad v(t) = h'(t) = \frac{\delta h}{\delta t} = -\frac{k^{2} n\, t^{2n-1}}{\left(1 + k\,t^{n}\right)^{2}} + \frac{k n\, t^{n-1}}{1 + k\,t^{n}}, \]

R2=0.98 is found between the estimated r0 values and those previously reported in Ref. 16.
On the other hand, analytical equations allow most of the information from the kinetic measurements to be retained, the importance of which is illustrated graphically in Figure 8. In this Figure, three representative “conversion vs. reaction time” curves are depicted, showing that the ranking of catalysts (A, B, C) depends on the selected criterion, i.e. C > B > A at t = 1, B > C > A at t = 4, and A > B > C at t = 10, with t in hours. This result is a direct consequence of the fact that the final catalytic response is actually defined by a set of chemical-physical phenomena, such as the type and magnitude of the interactions between the reactants and the active sites, or the occurrence of deactivation processes.
[Figure 8: schematic plot of conversion (%) versus time (h) for three catalysts A, B, and C, with the ranking of the curves marked at t = 1, 4, and 10 h.]
Figure 8. On the importance of the comparison criterion
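The rate expression obtained by differentiating h(t) = k·t^n/(1 + k·t^n) can be checked numerically with the sketch below, which compares the analytical derivative with a central finite difference. The function names and the parameter values used in the check are illustrative assumptions.

```python
def h(t, k, n):
    """Conversion model h(t) = k*t^n / (1 + k*t^n)."""
    u = k * t ** n
    return u / (1.0 + u)


def rate(t, k, n):
    """Analytical derivative of h, written term by term as in the text."""
    u = k * t ** n
    return (-k ** 2 * n * t ** (2 * n - 1) / (1.0 + u) ** 2
            + k * n * t ** (n - 1) / (1.0 + u))


def rate_simplified(t, k, n):
    """Same derivative after combining both terms over (1 + k*t^n)^2."""
    return k * n * t ** (n - 1) / (1.0 + k * t ** n) ** 2


def rate_numeric(t, k, n, dt=1e-6):
    """Central finite-difference estimate of the derivative of h."""
    return (h(t + dt, k, n) - h(t - dt, k, n)) / (2.0 * dt)

# Note: for n < 1 the factor t^(n-1) grows without bound as t approaches 0,
# so in that case a rate can only be evaluated at a small positive time.
```

The two analytical forms agree, and both match the finite-difference estimate at any positive reaction time.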
Therefore, the use of GP in the field of catalysis helps to acquire a global understanding of the systems under study, enhancing the quality of the results and the final knowledge gain. For instance, in the present work we have applied the GP algorithm to infer various analytical expressions able to reproduce with very low errors (global R2 ≈ 0.93) the “conversion vs. reaction time” curves for the epoxidation of 4-decene using Ti-MCM-41 and Ti-ITQ-2 catalysts. As a consequence, we are now able to evaluate the behaviour of the catalysts from different standpoints, as shown in Figure 9. In this Figure, experimental results (conversion levels) are represented at two (left) or three (right) reaction times, using filters to identify some characteristics of the related catalysts (type of material, i.e. MCM-41 or ITQ-2, top; type of silylating agent, i.e. SiMe3, SiMe2Bu, SiMe2Ph, or SiMePh2, bottom). With this approach, new conclusions can be drawn about the mode of action of the catalysts, complementing those previously reported.16 In this regard, it is shown that ITQ-2 samples providing the same conversion level as MCM-41 at the initial stages of the reaction (t = 0.2 h) are generally more active materials at longer reaction times (t = 1 h), as can be inferred from Figure 9 (top, left). A similar analysis can be carried out in three dimensions by considering the behaviour of the catalysts at 0.2, 1, and 6 h (Figure 9, top right). Moreover, when the same results are filtered by the type of silylating agent, clusters can be observed, indicating, for instance, that SiMe2Bu is highly active at short reaction times but is overtaken by SiMe3 at 6 h.
Figure 9. Left - 2D plot of percent of conversion at t=1, and t=0.2. Right - 3D plot of percent conversion at t=0.2, t=1, and t=6; t in hours. Influence of support is shown in the top charts with ITQ samples as red squares and MCM with filled blue circles, while silylating agent influence is shown at the bottom with blue, red, gray, green, and white circles respectively for SiMe2Ph, SiMe2Bu, SiMe3, SiMePh2, and without silylating agent.
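The dependence of the catalyst ranking on the chosen reaction time can be reproduced with fitted curves alone. The sketch below ranks two hypothetical catalysts by their modelled conversion at a given time; the (k, n) values are invented for illustration and are not measured data from this study.

```python
def conversion(t, k, n):
    """Modelled conversion (fraction) from h(t) = k*t^n / (1 + k*t^n)."""
    u = k * t ** n
    return u / (1.0 + u)


def rank_catalysts(params, t):
    """Catalyst names sorted from most to least converted at time t."""
    return sorted(params, key=lambda name: conversion(t, *params[name]), reverse=True)


# Hypothetical (k, n) pairs: B starts fast, A catches up at long times.
PARAMS = {"A": (0.5, 1.5), "B": (1.0, 0.5)}

early = rank_catalysts(PARAMS, 0.2)   # ranking at t = 0.2 h
late = rank_catalysts(PARAMS, 6.0)    # ranking at t = 6 h
```

With these illustrative values the order is reversed between 0.2 h and 6 h, which is the kind of crossover that a single-point criterion cannot reveal.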
On the other hand, the power of GP to provide an analytical expression for the experimental data lies not only in the ratio of knowledge gain to time savings, but also in the possibility of introducing diverse mathematical criteria during the search process. For instance, among the large number of possible equations to fit our kinetic measurements, we have limited the complexity of the solution (number of operators and number of parameters), leading to simple empirical expressions. For instance, the equation f(t) = a − b·c^t has been found by minimizing the number of operators while obtaining a satisfactory correlation (R2=0.92). Thanks to this, a new criterion can easily be calculated to rank the catalysts with regard to the whole “conversion vs. reaction time” data, using the area below the kinetic curves (integral of the analytical expression between 0 and T=10 hours), as shown in Equation 3. Figure 10 shows that this new criterion gives a new point of view on catalyst ranking, complementary to the previously established one.16

\[ \int_{t=0}^{t=T} \left( a - b\,c^{t} \right) dt = F_T - F_0, \quad \text{with} \quad F_T = a\,T - \frac{b\,c^{T}}{\ln(c)} \qquad \text{(Equation 3)} \]
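The area criterion of Equation 3 can be computed directly from the fitted parameters. The following sketch evaluates the closed-form primitive and, as a sanity check, a trapezoidal approximation of the same integral; the function names and the parameter values in the check are illustrative assumptions, and c is assumed positive and different from 1.

```python
import math


def area_closed_form(a, b, c, T):
    """Integral of a - b*c^t between 0 and T, using F(t) = a*t - b*c^t / ln(c)."""
    def primitive(t):
        return a * t - b * c ** t / math.log(c)
    return primitive(T) - primitive(0.0)


def area_trapezoid(a, b, c, T, steps=10000):
    """Trapezoidal approximation of the same integral, for comparison."""
    dt = T / steps
    ys = [a - b * c ** (i * dt) for i in range(steps + 1)]
    return dt * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
```

Since the primitive is available in closed form, ranking a whole library of catalysts by area only requires one evaluation per solid.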
[Figure 10 plots: series "r0" and "Integral" for MCM41 (top) and ITQ2 (bottom); y-axis from 0 to 0.07, x-axis from 1 to 61.]
Figure 10. In black is represented the initial reaction rate while the area below the reaction curve between 0 and 10 hours appears in grey. Area has been divided by 100 in order to keep only one y-axis. Results are given separately for MCM41 and ITQ2, resp. Top and Bottom.
On the other hand, the equation

\[ h(t) = \frac{k\,t^{n}}{1 + k\,t^{n}} \]

has been obtained by minimizing the total number of parameters involved. Although the resulting expression is clearly more complex, which makes its analytical treatment difficult (in particular the definition of the primitive for integral calculation), it is more convenient for trying to correlate the responses (conversion vs. time curves) of the catalysts with their chemical characteristics using advanced modelling algorithms. In this sense, the regression between parameter values and synthesis variables has been handled with a neural network. The synthesis variables are the following: 2 supports for the titanium grafting process {MCM-41, ITQ-2}; the range [0.1-5] Ti wt% for the titanium grafting; 4 silylating agents to analyse the effect of the alkyl group size on the catalytic properties {SiMe3, SiMe2Ph, SiMePh2, SiMe2Bu}; and the ranges [0.0-1] and [0.0-0.5] for the silylation degree, i.e. SiR3/(SiO2+TiO2), for MCM-41 and ITQ-2 samples respectively. The epoxidation of trans-4-decene is selected as the test reaction to evaluate the catalytic performance of the synthesized materials. The low number of parameters allows overfitting of the neural network to be easily handled, and thus the resulting architecture shows a very low level of complexity in both the number of hidden layers and the total number of neurons (Multi-layer Perceptron 4:4-8-2:2), with the four synthesis variables as inputs and k and n as outputs.

Table 1. Neural network statistics
                 Training                 Selection                Test
                 k           n            k           n            k           n
Data mean        0.178283    0.539448     0.223869    0.574403     0.196315    0.561092
Data S.D.        0.143507    0.162605     0.164714    0.145667     0.159821    0.111543
Error Mean      -0.000971   -0.001282    -0.032269   -0.036759    -0.010183   -0.019647
Error S.D.       0.054646    0.075635     0.080973    0.080328     0.084756    0.075151
Abs. E. Mean     0.041913    0.062049     0.060644    0.069006     0.067030    0.063672
S.D. Ratio       0.380789    0.465148     0.491600    0.551453     0.530319    0.673738
Correlation      0.924835    0.886070     0.872050    0.854034     0.854259    0.804007
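The quantities listed in Table 1 can be computed from observed and predicted parameter values as sketched below. This is an illustration under assumptions: the error is taken as predicted minus observed, standard deviations are sample standard deviations, and the correlation is Pearson's coefficient; the function names are hypothetical.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def regression_stats(observed, predicted):
    """Summary statistics in the layout of Table 1 for one output (k or n)."""
    errors = [p - o for o, p in zip(observed, predicted)]  # assumed sign convention
    data_sd = statistics.stdev(observed)
    error_sd = statistics.stdev(errors)
    return {
        "data_mean": statistics.mean(observed),
        "data_sd": data_sd,
        "error_mean": statistics.mean(errors),
        "error_sd": error_sd,
        "abs_error_mean": statistics.mean(abs(e) for e in errors),
        "sd_ratio": error_sd / data_sd,
        "correlation": pearson(observed, predicted),
    }
```

The values reported in Table 1 are consistent with the S.D. ratio being the error S.D. divided by the data S.D.; for instance, for k on the training set, 0.054646/0.143507 ≈ 0.3808.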
Figure 11 shows the estimation and nominal errors of k and n using the synthesis variables as input (i.e. % Ti, % silylation, silylating agent, and support). Before training the neural network, the dataset was divided (½ for training, ¼ for selection, and ¼ for testing, i.e. unseen materials) in order to prevent overfitting; see Table 1 for statistics.
[Figure 11 plots: series k, kpred, n, npred, and the differences diff k and diff n.]
Figure 11. Top - Neural network (Multi-layer Perceptron 4:4-6-2:2) estimation. Bottom – Observed versus predicted values of k and n (respectively left and right) for separated datasets, i.e. T for training, S for selection, and X for Test.
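The division of the dataset into training, selection, and test subsets (½, ¼, ¼) can be sketched as follows. The random shuffling and the fixed seed are assumptions made for illustration; the text does not state how the catalysts were assigned to each subset, and the function name `split_dataset` is hypothetical.

```python
import random


def split_dataset(items, seed=0):
    """Split items into training (1/2), selection (1/4) and test (remainder)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = len(shuffled) // 2
    n_selection = len(shuffled) // 4
    training = shuffled[:n_train]
    selection = shuffled[n_train:n_train + n_selection]
    test = shuffled[n_train + n_selection:]
    return training, selection, test
```

For the 128 catalysts of this study, such a split gives 64, 32, and 32 samples; the test subset plays the role of unseen materials.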
5.- Conclusion
The conclusion is not exactly the one that was originally planned. When starting this work, the aim was to improve the best individual result on the heterogeneous catalyst optimisation problem using the Context Aware Crossover. Unfortunately, things did not turn out as expected, as it was impossible to obtain better results with the CAX than with an ordinary crossover operator on this real-world problem. A careful implementation of the benchmarks seems to show that CAX is not capable of improving the best fitness value, although CAX seems to be a very good exploitation operator that boosts the whole population towards much better fitness values while maintaining a good level of diversity (the best individual fitness keeps rising after the CAX is started). This means that CAX remains a very interesting crossover method that would deserve another careful investigation regarding diversity preservation. From the point of view of the chemistry, the application of GP allows the relationship between the conversion level and the reaction time to be reproduced, retains all the information, and simplifies data storage. Moreover, the use of GP permits a more global understanding, enhancing the quality of the results and the final knowledge gain. The behaviour of the catalysts can be quickly evaluated from different points of view, allowing new conclusions to be drawn about their mode of action. For the first time in heterogeneous catalysis, genetic programming has been used for an application of industrial interest. With this study, it has been shown how such a tool can open new opportunities for data mining and knowledge extraction in materials science. As an example, the combination with a modelling tool such as a neural network makes the GP strategy very promising and relevant.
References
1 A. Corma, M.T. Navarro, J. Perez Pariente. J. Chem. Soc., Chem. Commun. 1994 147
2 A. Thangaraj, R. Kumar, P. Ratnasamy, J. Catal. 131 1991 294
3 W. Fan, P. Wu, S. Namba, T. Tatsumi, Angew. Chem., Int. Ed. 43 2003 236
4 P. Barret, F. Pautet, M. Dauton, J.F. Sabot, Pharm. Acta Helv., 62 1987 348
5 N. Fdil, A. Romane, S. Allaoud, A. Karim, Y. Castanet, A. Morteaux., J. Mol. Catal., 108 1996 15
6 M. Lajunen, A.M.P. Koskinen, Tet. Lett., 35 1994 4461
7 A. Corma, M. Domine, J.A. Gaona, J.L. Jorda, M.T. Navarro, F. Rey, J. Perez-Pariente, J. Tsuji, B. McCullock, L.T. Nemeth, Chem. Comm., 2211 1998
8 W. Zhang, M. Froeba, J. Wang, P.T. Tanev, J. Wong, T.J. Pinnavaia, JACS 1996, 118(38), 9164-9171.
9 K.A. Koyano, T. Tatsumi, Microporous Materials 1997, 10(4-6), 259-271.
10 A. Corma, V. Fornes, S.B. Pergher, Th.L.M. Maesen, J.G. Buglass, Nature (London) 1998, 396(6709), 353-356.
11 A. Corma, U. Diaz, V. Fornes, J.L Jorda, M.E. Domine, F. Rey, Chem. Comm. (Cambridge) 1999, (9), 779-780.
12 A. Corma, U. Diaz, M.E. Domine, V. Fornes, Angewandte Chemie, Int. Ed. 2000, 39(8), 1499-1501.
13 A. Corma, U. Diaz, M.E. Domine, V. Fornes, JACS 2000, 122(12), 2804-2809.
14 P. Wu, D. Nuntasri, J. Ruan, Y. Liu, M. He, W. Fan, O. Terasaki, T. Tatsumi, J. of Physical Chemistry B 2004, 108(50), 19126-19131.
15 (a) Jandeleit, B.; Schaefer, D.J.; Powers, T.S.; Turner, H.W.; Weinberg, W.H., Angew. Chem. Int. Ed. 1999, 38, (17), 2494-2532. (b) Senkan, S.M., Angew. Chem. Int. Ed. 2001, 40, (2), 312-329. (c) Reetz, M.T., Angew. Chem. Int. Ed. 2001, 40, (2), 284-310. (d) Newsam, J.M.; Schuth, F., Biotechnol. Bioeng. 1999, 61, (4), 203-216. (e) Gennari, F.; Seneci, P.; Miertus, S., Catal. Rev.-Sci. Eng. 2000, 42, (3), 385-402.
16 P. Serna, L.A. Baumes, M. Moliner, A. Corma, Journal of Catalysis, 258, 35-34, 2008
17 (a) M. Holena, M. Baerns, Catal. Today, 2003, 81, 485-494. (b) L.A. Baumes, M. Moliner, A. Corma., QSAR comb. Sci. Vol. 26, Issue 2, 255-272, 2007 (c) D. Nicolaides, QSAR Comb. Sci. 2005, 24, 15-21. (d) L.A. Baumes, J.M. Serra, P. Serna, A. Corma. J. Comb. Chem. 2006, 8, 583-596 (e) M. M. Gardner, J. N. Cawse, In Experimental Design for Combinatorial and High Throughput Materials Development, Ed. J.M. Cawse. J. Wiley & Sons, Inc. 2003, 129-145. (f) F. Schüth, L.A. Baumes, F. Clerc, D. Demuth, D. Farrusseng, J. Llamas-Galilea, C. Klanner, J. Klein, A. Martinez-Joaristi, J. Procelewska, M. Saupe, S. Schunk, M. Schwickardi, W. Strehlau, T. Zech. Catal. Today. Vol. 117, 2006. 284-290 (g) A. Corma, J. M. Serra, E. Argente, S. Valero, V. Botti, Chem. Phys. Chem., 2002, 3, 939-945. (h) L.A. Baumes, D. Farruseng, M. Lengliz, C. Mirodatos. QSAR & Comb. Sci. Nov. 2004, vol. 29, Issue 9, 767-778.
18 (a) J.R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Massachusetts, 1992. (b) J.R. Koza. Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press, Massachusetts, 1994.
19 (a) Baumes, L.A. Combinatorial Stochastic Iterative Algorithms and High-Throughput Approach: from Discovery to Optimisation of Heterogeneous Catalysts (in English). Univ. Claude Bernard Lyon 1, Lyon, France, 2004. (b) Farrusseng, D.; Baumes, L.A.; Mirodatos, C., Data management for combinatorial heterogeneous catalysis: methodology and development of advanced tool. In High-Throughput Analysis: A Tool For Combinatorial Materials Science, Potyrailo, R. A.; Amis, E. J., Eds. Kluwer Academic/Plenum Publishers: 2003; pp 551-579. (c) http://catalyse.univlyon1.fr/gre3b4.htm website accessed the 20th july 2006 (d) http://www.fist.fr/article259.html website accessed the 20th july 2006. (e) Adams, N.; Schubert, U.S., Macromol. Rapid. Commun. 2005, 25, 4858. (f) Adams, N.; Schubert, U.S., QSAR & Comb. Sci. 2005, 24, 58-65. (g) Ohrenberg, A.; von Torne, C.; Schuppet, A.; Knab, B., QSAR & Comb. Sci. 2005, 24, 29-37. (h) Saupe, M.; Fodisch, R.; Sunderrmann, A.; Schunk, S.A.; Finger, K.E., QSAR & Comb. Sci. 2005, 24, 66-77. (i) Gilardoni, F.; Curcin, V.; Karunanayake, K.; Norgaard, J.; Guo, Y., QSAR & Comb. Sci. 2005, 24, 120-130.
20 H. Majeed and C. Ryan. A less destructive, context-aware crossover operator for GP. In P. Collet et al., editor, Proc of the 9th European Conf. on Genetic Programming, vol. 3905 of Lecture Notes in Computer Science, 36-48, Budapest, 2006. Springer.
21 H. Majeed and C. Ryan. Using context-aware crossover to improve the performance of GP. In Maarten Keijzer et al., editor, GECCO 2006: Proc. of the 8th annual conf. on Genetic and evolutionary computation, vol.1, 847-854, Seattle, Washington, USA, 8-12 July 2006. ACM Press.
22 (a) H. Majeed and C. Ryan. Context-aware mutation: a modular, context aware mutation operator for genetic programming. In Dirk Thierens et al., editor, GECCO '07: Proc. of the 9th annual conf. on Genetic and evolutionary computation, vol.2, 1651-1658, London, 7-11 July 2007. ACM Press. (b) H. Majeed and C. Ryan. On the constructiveness of context-aware crossover. In Dirk Thierens et al., editor, GECCO '07: Proc. of the 9th annual conf. on Genetic and evolutionary computation, vol.2, 1659-1666, London, 7-11 July 2007. ACM Press.
23 L.A. Baumes, P. Collet. Computational Materials Science. 2008. In Press.