Evol. Intel. (2008) 1:133–144 DOI 10.1007/s12065-008-0010-z
RESEARCH PAPER
Evolutionary algorithms to simulate the phylogenesis of a binary artificial immune system
Grazziela P. Figueredo · Luis A. V. de Carvalho · Helio J. C. Barbosa · Nelson F. F. Ebecken
Received: 12 November 2007 / Revised: 12 March 2008 / Accepted: 13 March 2008 / Published online: 29 April 2008 © Springer-Verlag 2008
Abstract Four binary-encoded models describing some aspects of the phylogenetic evolution of an artificial immune system have been proposed and analyzed. The first model focuses on the evolution of a population of paratopes, considering a fixed group of epitopes, to simulate a hypermutation mechanism and observe how the system self-adjusts to cover the epitopes. In the second model, the evolution involves a group of antibodies adapting to a given population of antigenic molecules. The third model simulates the coevolution between antibody-generating gene libraries and antigens. The objective was to simulate somatic recombination mechanisms to obtain final libraries apt to produce antibodies covering any possible antigen that could appear in the pathogen population. In the fourth model, the coevolution involves a new population of self-molecules whose function is to establish restrictions on the evolution of the library population. For all the models implemented, evolutionary algorithms (EA) were used to form adaptive niching inspired by the coevolutionary shared niching strategy, whose ideas are taken from a monopolistic competition economic model where "businessmen" locate themselves among geographically distributed "clients" so as to maximize their profit. Numerical experiments and conclusions are shown. These considerations present many similarities to biological immune systems and also offer some inspiration for solving real-world problems, such as pattern recognition and knowledge discovery in databases.

Keywords Artificial immune systems · Evolutionary computation · Artificial immune systems models

G. P. Figueredo (✉) · L. A. V. de Carvalho · N. F. F. Ebecken, Federal University of Rio de Janeiro - COPPE, Rio de Janeiro, Brazil, e-mail: [email protected]
L. A. V. de Carvalho, e-mail: [email protected]
N. F. F. Ebecken, e-mail: [email protected]
H. J. C. Barbosa, LNCC, MCT, Petrópolis, Brazil, e-mail: [email protected]
1 Introduction
The immune system (IS) is able to protect us from a number of pathogens. It also monitors the organism, searching for and destroying anomalous cells. To perform such tasks, the IS must recognize a great variety of different compounds and distinguish, among them, those which can remain in the organism from those that are to be eliminated. It is believed that the IS identifies about 10^16 foreign molecules [18], which means that it can identify any molecule [10].

The IS pattern recognition task is performed through surface receptor molecules of T and B cells. The two types of lymphocyte identify antigens differently. B cells recognize antigens through immunoglobulins on their cell surface. T cells recognize only antigens presented by an antigen presenting cell (APC). The creation of these receptors and their capability to cover all antigens have their origin in a very sophisticated genetic mechanism. During the receptor's formation process, variation is caused by the combinatorial associations among the receptor-encoding genes and by the hypermutation mechanism. The hypermutations occur in the germinal centers of the lymph nodes. Thus, when an APC enters the lymph node and shows an antigen to a T cell, or when a B cell finds a pathogen and identifies it, the combination has succeeded. After the recognition pattern is established, the lymphocyte becomes activated and clones itself. Clones with a high capacity for recognizing a certain antigen tend to proliferate. On the other hand, clones with low recognition capacity disappear and are replaced by others with higher efficiency [19, 27]. The analogy between clonal selection and Darwinian natural selection [5] is clearly seen here.

After an antigen is recognized by a B cell receptor, and following a sequence of other events, clones of plasma cells are formed that secrete the same receptor in its soluble form. This secreted receptor is the antibody, and its function is to catch and bind the antigen. This binding occurs between the paratope of the antibody and the corresponding epitope of the antigen. A paratope that binds strongly to the epitope has a greater capacity for neutralizing the antigen [6, 7].

An antibody consists of two identical heavy chains and two light chains, as can be seen in Fig. 1. The shape of the molecule is similar to a Y. The base of the Y contains parts of the heavy chains, and the arms are formed by both chains. The antibody's recognition site is located at the end of each arm and is known as the V region. The antibody is antigen-specific, and to provide the organism with antibodies able to recognize all the antigens, the antibody-encoding genes undergo somatic recombination (Fig. 2). The variable portion of the light chain requires two distinct DNA encoding segments. The V segment encodes most of the variable region. The remainder is encoded by a J segment. In a non-activated B cell, the DNA sequences encoding the V and C regions are spatially apart from each other. When B lymphocytes mature, somatic recombination joins genes from segments V, J and C. The C segment encodes the constant portion of the antibody's light chain. This mechanism is illustrated in Fig. 2.
Fig. 1 The antibody molecule. Adapted from [19]
There are many possible combinations of the available gene segments, which gives the IS the capability of producing an enormous number of distinct antibodies. These concepts are the building blocks of the models presented in this work, which simulate the dynamics between paratopes and epitopes, and between antigens and antibodies, inside the organism along the evolution of a species.

This work was inspired by the ideas found in [3, 4, 8, 10, 14–16, 21–25, 28–30]. In those works, the objective was "to develop models directed at understanding the pattern recognition process of two aspects of the IS, clonal selection and long term evolution of genes." Most of the models cited above work with binary strings evolved by genetic algorithms (GAs) [10, 14, 16, 23, 28]. Other approaches involving artificial models, and citations of many other artificial immune system (AIS) models, can be found in [2, 9, 20, 26, 36]. The models of the present article have not been compared with the ones previously cited, because the focus of each work was different.

Evolutionary computation was chosen to implement the models studied [33]. A coevolutionary genetic algorithm (CGA) was used to form adaptive niching based on the ideas of [8]. In that work, in contrast with fixed sharing schemes, a niche formation strategy named coevolutionary shared niching (CSN) was proposed to allow for the adaptation of the location and the radius of each niche. CSN was inspired by Tullock's [35] economic model of monopolistic competition, where "businessmen" locate themselves among geographically distributed "clients" so as to maximize their profit.

Four models are described in Sects. 2–5, respectively, each with its numerical experiments. The paper ends with Sect. 6, which discusses the results of our work.

It is important to make clear that this work does not focus on theoretical immunology. The main objective of the four models is to simulate the evolution of an AIS in order to understand some aspects of biological IS development and thereby arrive at methods and algorithms to solve real-world engineering problems. Given the large number of experiments performed with all the models, with similar results, it was decided to show only the most important graphs of the evolution of each model.

Fig. 2 Generation of the antibody's chains given by somatic recombinations (adapted from [19])
2 The first model
In the first model proposed, which is a simplification of what happens in biological ISs, there is a group of paratopes that have to adapt through the generations so that they optimize the coverage of a fixed, given group of epitopes. The aim is to analyze the capability of the system to adapt to an environment full of aggressive elements, as well as its behavior in identifying patterns within the structure of the epitopes. That is why the simulations are initially performed with a number of paratopes smaller than the number of epitopes. Antibodies are known to be antigen-specific [19], meaning that just one paratope is assumed to be able to bind to a particular epitope of the antigen molecule. In this model, however, epitopes with slight structural differences can be inactivated by the same paratope.

2.1 The algorithm for the first model
This sub-section gives the details of the first model. The corresponding pseudo-code is shown in Algorithm 1. After the algorithm, each part of it is explained in detail.

2.1.1 Encoding
In real biological systems, the constituent regions of epitopes and paratopes are known to be formed by complex chains of organic compounds. Nevertheless, in this artificial model, epitopes and paratopes are represented by binary chains, following [8, 10]. Therefore, in the GA context [11, 17], phenotypes and genotypes are the same.

2.1.2 Initialization
The initialization of paratopes can be made entirely at random or, according to [10], by inserting some pre-defined binary blocks in the chromosome.

2.1.3 Niche distribution
The number of niches is always the same as the paratope population size. In terms of the CSN, paratopes and epitopes play the roles of the businessmen and clients, respectively. The distribution of epitopes among the niches is determined by the smallest distance between them and the paratopes. Each epitope is compared to every paratope in order to establish which paratope is the closest one and, consequently, which niche the epitope belongs to. The individuals in the jth niche are the epitopes that the jth paratope is more apt to neutralize than any other paratope in the current population.

The capability of a paratope to neutralize an epitope is measured by means of a distance computation. Here, distance is understood as a function which compares the epitope and paratope chromosomes, also known as a matching function. There are various types of matching functions [24]; in this model, the one believed to be most faithful to biological systems was chosen. The chromosomes are compared bitwise, and the matching value is the length of the longest complementary chain between them, as can be seen in the following example.

Epitope:  0001111010101000011110
Paratope: 1111011101010111111110
MatchingValue = 10

The complementary chains represent the molecular bond between a paratope and an epitope. The objective is to reduce the distance between paratopes and epitopes along the evolution. The distance is given by Eq. 1:

$$\mathrm{Distance} = \mathrm{EpitopeChromosomeSize} - \mathrm{MatchingValue} \qquad (1)$$

The distance in Eq. 1 is to be minimized along the process of evolution.
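The matching function, the distance of Eq. 1 and the niche assignment can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, chromosomes are assumed to be equal-length strings of `0` and `1`, and ties between equally close paratopes are broken in favour of the lowest index, which the text does not specify.

```python
def matching_value(epitope: str, paratope: str) -> int:
    """Length of the longest run of positions where the two bit strings are complementary."""
    if len(epitope) != len(paratope):
        raise ValueError("chromosomes must have the same length")
    best = run = 0
    for e, p in zip(epitope, paratope):
        run = run + 1 if e != p else 0  # complementary bits extend the current run
        best = max(best, run)
    return best


def distance(epitope: str, paratope: str) -> int:
    """Eq. 1: epitope chromosome size minus matching value."""
    return len(epitope) - matching_value(epitope, paratope)


def assign_niches(epitopes, paratopes):
    """Map each paratope index to the epitopes for which it is the closest paratope.

    Ties go to the lowest paratope index (an assumption of this sketch).
    """
    niches = {j: [] for j in range(len(paratopes))}
    for epitope in epitopes:
        closest = min(range(len(paratopes)),
                      key=lambda j: distance(epitope, paratopes[j]))
        niches[closest].append(epitope)
    return niches


epitope = "0001111010101000011110"
paratope = "1111011101010111111110"
print(matching_value(epitope, paratope))  # 10, as in the example above
print(distance(epitope, paratope))        # 22 - 10 = 12
```

For the pair of chromosomes used in the text, the longest complementary run has length 10, so the distance is 12.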
2.1.4 Mutation
The genetic operator used was the classical mutation for binary GAs, in which one bit is drawn at random and its value is inverted. A mutation is retained only when it improves the fitness of the individual. This procedure makes the search similar to hill-climbing algorithms. It was chosen because the system has no explicit objective function from which gradients could be obtained to drive the search. Thus, it is the evaluation of each paratope mutation that guarantees a bias towards increasing performance through the generations and allows the system to organize itself in the best way to defend the organism.

2.1.5 System's general state evaluation
An important question for this model is how to determine the efficacy of the generated system after a number of generations or, in other words, whether it is capable of combating the given epitopes. During the evolution of the paratope population, every epitope will have at least one minimal binding site. Nevertheless, weak bonds are not able to produce efficient neutralization, since they could break when in contact with other molecules or under slight environmental variation. To address this problem, a performance measure was established to determine the efficiency in fighting aggressors. This parameter was named Inefficiency Limit, and its value corresponds to the minimum percentage of an epitope that must be recognized by a paratope for the epitope to be considered inactivated.

2.1.6 Evaluation
The evaluation is made by observing each paratope individually; the model is a self-organized system in which the individual action of each paratope is expected to lead to an efficient global defense system.

2.2 Experiments

2.2.1 The first example
This first example shows the adaptation of the paratopes to a fixed epitope population. It considers an epitope population larger than the paratope population and explores the capacity of the model to recognize patterns and group the epitopes into niches. Experiments have shown that mutation probabilities ranging from 60 to 85% do not interfere with the evolution of the paratopes. The results presented correspond to a typical run with a mutation rate of 85%. The inefficiency limit was set to 10%. As can be seen in the graph of Fig. 3, at the start of the evolution of the paratopes the system performs poorly in recognizing and neutralizing the epitopes. That is why the curve starts at a small number of recognized epitopes (vertical axis). However, along the course of the evolution (horizontal axis), the artificial hypermutation mechanism enables the paratopes to cover the epitope population. Another graph, presented in Fig. 4, shows the improvement of the fitness of the paratopes through the generations. In this experiment, the number of epochs was set to 200, the chromosome size to 65, the paratope population size to 10, and the epitope population size to 300, which is the maximum value to be reached by the paratope population as shown in the graph of Fig. 4. The results observed showed a great similarity to real ISs: those that were able to adapt to new pathogens survived and multiplied. In this model, one can simulate a system that is inefficient in recognizing the epitopes by increasing the inefficiency limit parameter.

Fig. 3 The evolution of the AIS considering populations of epitopes and paratopes in different sizes

Fig. 4 The evolution of the sum of the paratopes fitness. The populations of epitopes and paratopes have different sizes

2.2.2 The second example
In this second example, shown in Fig. 5, the paratope and epitope populations have the same size. Now the model is closer to real biological systems. What is examined here is the capacity of the system to neutralize the given epitopes. The Inefficiency Limit was increased to 50%. Both the epitope and paratope populations have a size of 100, with chromosomes 25 bits long. Two hundred epochs were performed. Another graph, presented in Fig. 6, shows the improvement of the paratopes' fitness along the evolution.

Fig. 5 The evolution of the AIS considering populations of the same size

Fig. 6 The evolution of the sum of the paratopes fitness. The populations of epitopes and paratopes have equal sizes

3 The second model
The first model was the simplest prototype elaborated in this work, and it has some limitations. The main limitation is that it relaxes the antigen-specific restriction of each antibody. During the implementation and analysis of this model, an upgraded second version was devised whose behavior would be more faithful to real ISs. It is known that molecules have to be large, rigid and chemically complex to be considered antigenic [34]. Pathogenic organisms, such as bacteria, anomalous cells or erythrocytes, can trigger an immune response because their structure is a complex compound of various molecules, each of which is taken as an antigen on its own [1, 4]. As a result, a bacterium, for example, could be seen as an antigenic region with multiple binding sites for antibodies. Each site stands for a different antigen.

After the study of these notions, it was possible to establish new parameters to improve the model. The new version does not deal with epitopes and paratopes, but with antibodies and antigenic regions. This idea was adopted to make possible the implementation of binding sites on pathogenic molecules where antibodies could match. Now the antigenic regions are represented by longer bit chains, and antibodies by shorter bit sequences that have to bind to antigen sub-chains. These sub-chains represent the antigenic determinants of the molecule. Figure 7 shows an example of the new model. Only one antigenic molecule is considered, and a possible configuration of antibodies to neutralize it is shown at the bottom of the figure. In Fig. 7, four different antibodies were generated. They were all able to recognize and neutralize antigens within the molecule. The matching rate for antigenic determinant identification was set to 100%. The following sub-sections explain in detail how this new model was implemented.

Fig. 7 An antigenic molecule and the correspondent binding antibodies

3.1 The algorithm for the second model
The algorithm corresponding to the first model underwent some changes in order to accommodate the additional requirements, as shown in Algorithm 2.
3.1.1 Niches distribution
In this new strategy, a role reversal between antibodies and antigenic molecules occurs. The niche owners, or "businessmen" in the monopolistic competition model, are now the antigenic molecules, and the antibodies are the "clients". The decision to alter the original configuration emerged because the new system is able to determine, for each antigenic molecule, a new group of binding antibodies. In the model, every antigenic determinant represents an antigen and occupies a fixed part of the chromosome. All parts have the same size, which is also the size of the antibody's chromosome. An antibody is assigned to a specific niche when it reaches a certain rate in the matching function when paired with some antigen in the molecule.

A peculiar situation arises from the model's evolution: an antibody may take part in more than one niche. This means that some niches intersect. In immunological theory, this is named cross reaction. Some of the initial difficulties in obtaining the expected behavior from the model derived from this particular feature. In some examples that were run in an intermediate model between the first model and Algorithm 2, large populations of antigenic molecules were used. This created various identical sub-chain patterns within the genotype. Consequently, the antibody population became biased towards these more frequent sub-chains. At first this seemed natural, for it is believed that the greater the number of antigens, the more attention they draw from defense mechanisms. However, forming various identical antibodies within the population was not the objective of this model. In the second model, each antibody in the population represents the whole group of antibodies secreted by a plasma cell clone. To solve the problem of identical antibodies, a strategy similar to the one used in CSN [12] was adopted. The algorithm accepts only mutations that generate individuals different from the ones that are already part of the population. This difference is given by the Hamming distance, which must be greater than zero.

3.1.2 Evaluation
Each time the antibody reaches the required matching rate for a given antigen, the antibody's fitness is increased by one point. If the matching rate is below the minimum value required to indicate a binding location, a score lower than one is added to the antibody's fitness. The score is found by dividing the matching rate by the smallest chromosome size. This method is used to avoid losing combinations that could potentially excel in future generations.

3.1.3 System's general state evaluation
In this second model, the antibodies are also evaluated individually. This leads the system to self-adjust in order to increase its coverage of the antigen group. What matters to the system, nevertheless, is not only the improvement of the antibodies' fitness, but also the system's capability to maximize the neutralization of any given antigen. This characteristic follows from the improvement of the system.

3.2 Experiments

3.2.1 The first example
This second model explores pattern recognition within the antigenic molecules. The following graphs show the evolution of two instances containing different numbers of antibodies. The first example represents a model with a small number of antibodies, whose mission is to find common building blocks in the antigens and maximize the neutralization of the whole antigen population. The graph in Fig. 8 shows the evolution of the system producing antibodies able to adapt to the given antigenic molecules. The vertical axis shows the number of antigenic molecules recognized by the system. The horizontal axis shows the epochs. In the first example, the paratope population size was set to 50 and the epitope population size to 300. The paratope chromosome size was 8 and the epitope chromosome size was 64. All these values (chromosome sizes and numbers of epitopes and paratopes) were determined empirically, after many experiments using other values. In the results of Fig. 8, not all the epitopes were recognized because of binary limitations.
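The scoring rule of Sect. 3.1.2 and the uniqueness test of Sect. 3.1.1 can be sketched as follows. This is one possible reading rather than the authors' Algorithm 2. It assumes that each antigenic molecule is cut into consecutive, non-overlapping determinants of the antibody's length, that the "matching rate" is the matching value divided by that length, and that `min_rate` (a hypothetical parameter name) is the required rate, 100% by default as in Fig. 7.

```python
def matching_value(a: str, b: str) -> int:
    """Longest run of complementary bits between two equal-length bit strings."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x != y else 0
        best = max(best, run)
    return best


def determinants(molecule: str, size: int):
    """Split an antigenic molecule into fixed, equally sized sub-chains."""
    return [molecule[i:i + size] for i in range(0, len(molecule) - size + 1, size)]


def antibody_fitness(antibody: str, molecules, min_rate: float = 1.0) -> float:
    """One point per determinant bound; a fractional score (< 1) otherwise."""
    size = len(antibody)  # the antibody is the smallest chromosome
    fitness = 0.0
    for molecule in molecules:
        for determinant in determinants(molecule, size):
            rate = matching_value(determinant, antibody) / size
            fitness += 1.0 if rate >= min_rate else rate
    return fitness


def is_new(candidate: str, population) -> bool:
    """Accept a mutant only if its Hamming distance to every member is > 0."""
    return all(sum(x != y for x, y in zip(candidate, other)) > 0
               for other in population)


# "01011111" holds two 4-bit determinants: "0101" binds "1010" fully (1 point),
# "1111" matches only one bit in a row (0.25 points).
print(antibody_fitness("1010", ["01011111"]))  # 1.25
print(is_new("1010", ["1010", "0000"]))        # False
```

The fractional score keeps partially matching antibodies in the competition, which is the stated purpose of the rule.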
Fig. 8 The evolution of the AIS in the first example

3.2.2 The second example
The second example, shown in Fig. 9, introduces a greater number of antibodies and shows how this can improve the performance of the AIS. The paratope population size was set to 100 and the epitope population size to 200.

4 The third model
The task of the antibody's V region is to recognize antigens. This region is encoded by libraries of gene segments. The light chain requires three DNA segments, of the types called V, J and C. The heavy chain is encoded by four segments: V, D, J and C. The somatic recombination of these segments is one of the main generators of lymphocyte (and therefore antibody) diversity. The third model simulates the coevolution between the library of lymphocyte-encoding genes of an artificial species and a population of antigens. This simulation was implemented using CGAs. The objective was to obtain a gene library which produces antibodies that would recognize any possible mutation in the antigens' genes. It was also a goal to observe how the coevolution would proceed over the generations in terms of speed, expansion of the gene libraries and robustness.

Fig. 10 An example of a library individual after some generations

4.1 The libraries population's GA
To implement the system, each individual in the population of this first GA represents a simplified library which contains only three binary-encoded segment groups: V, D and J. Initially, the libraries have only one segment in each group, and they are initialized entirely at random. One example of an individual is shown in Fig. 10. The junction of one segment from each group forms the genetic code for producing an antibody. Decoding an individual here means producing its whole potential antibody repertoire. This is done by recombining the individual's library segments of kinds V, D and J, in this order, as shown in Fig. 11. During the evaluation, this repertoire is compared against a population of antigens, which are binary chains. The recombination operator used for the libraries was a crossover in which one of the segment groups V, D or J is randomly chosen and exchanged between the parents, as shown in Fig. 12.
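Decoding a library and the group-exchange crossover can be illustrated with a short Python sketch. The representation is an assumption made for illustration: a library is a dictionary with keys `"V"`, `"D"` and `"J"`, each holding a list of bit-string segments, and the function names are hypothetical.

```python
import itertools
import random


def decode(library):
    """Potential antibody repertoire: every V + D + J concatenation, in that order."""
    return ["".join(parts)
            for parts in itertools.product(library["V"], library["D"], library["J"])]


def crossover(parent_a, parent_b, rng=random):
    """Exchange one randomly chosen segment group (V, D or J) between two libraries."""
    group = rng.choice(["V", "D", "J"])
    child_a = {g: list(segments) for g, segments in parent_a.items()}
    child_b = {g: list(segments) for g, segments in parent_b.items()}
    child_a[group], child_b[group] = child_b[group], child_a[group]
    return child_a, child_b


# Two V segments, one D and one J give a repertoire of 2 * 1 * 1 antibodies.
library = {"V": ["0000", "1111"], "D": ["0101"], "J": ["0011"]}
print(decode(library))  # ['000001010011', '111101010011']
```

The repertoire size is the product of the group sizes, so an additive mutation in any group multiplies the number of antibodies the library can produce.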
Fig. 11 Antibodies generation. The first antibody was created by the junction of the first gene segment of V, followed by segments D and J. The second one was created by the junction of the V’s second gene segment with D and J
Fig. 9 The evolution of the AIS in the second example
Fig. 12 Crossover operator
Fig. 13 Mutation operators applied to the V group of genes
There are three kinds of mutation in the libraries' GA. After a segment group is randomly selected for mutation, the next step is to choose, probabilistically, the mutation mechanism. The additive mutation introduces a new segment into the selected group. The subtractive mutation removes a randomly chosen segment, and the inversive mutation randomly selects a segment and inverts one of its bits. These mechanisms are illustrated in Fig. 13.

The fitness of each library is given by its capacity to produce a potential antibody repertoire capable of maximizing the neutralization of the antigen population. To neutralize an antigen, the antibody's paratope needs to bind to an antigenic determinant in the pathogen's molecule. In the model, the antibody consists only of the paratope. The antigen can be larger than the antibody. Thus, there might be more than one region in the antigenic molecule where a set of antibodies could bind. An example is shown in Fig. 7. The pseudo-code for the libraries' GA is shown in Algorithm 3.
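The three library mutations can be sketched as a single Python function. It is an illustration under stated assumptions, not Algorithm 3. A library is assumed to be a dictionary of segment lists keyed by `"V"`, `"D"` and `"J"`, new segments are assumed to be drawn uniformly at random, and a subtractive mutation is assumed to leave a group untouched when only one segment remains, since the text does not say whether a group may become empty. The default segment length (4 bits) and probabilities (20, 10 and 70%) are the ones reported for the experiments of Sect. 4.4.

```python
import random


def mutate_library(library, seg_len=4, probs=(0.20, 0.10, 0.70), rng=random):
    """Return a mutated copy; probs = (additive, subtractive, inversive)."""
    child = {g: list(segments) for g, segments in library.items()}
    group = rng.choice(["V", "D", "J"])  # groups are equally likely
    kind = rng.choices(["add", "sub", "inv"], weights=probs)[0]
    segments = child[group]
    if kind == "add":
        # additive: append a new random segment to the group
        segments.append("".join(rng.choice("01") for _ in range(seg_len)))
    elif kind == "sub":
        # subtractive: drop a random segment (assumption: never empty a group)
        if len(segments) > 1:
            segments.pop(rng.randrange(len(segments)))
    else:
        # inversive: flip one random bit of one random segment
        i = rng.randrange(len(segments))
        j = rng.randrange(len(segments[i]))
        seg = segments[i]
        segments[i] = seg[:j] + ("1" if seg[j] == "0" else "0") + seg[j + 1:]
    return child


library = {"V": ["0000"], "D": ["1111"], "J": ["0101"]}
print(mutate_library(library, rng=random.Random(0)))
```

Returning a copy rather than mutating in place makes it easy to compare the fitness of parent and child before deciding which to keep.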
4.2 The antigens population's GA
An individual in the antigen population is represented by a bit string. The GA operates on the population only by mutating the individual's chromosome. If the mutation increases the antigen's fitness, the change in the genetic material is kept. Otherwise, it is reversed. The antigen's fitness is given by its capacity for aggression inside the organism. If the pathogen is not identified, it will certainly not be destroyed and will keep being harmful. Following these ideas, it was defined that the antigen individual would undergo mutations and that its fitness would increase with its capacity to remain unrecognized. A mutation in the antigen's chromosome is made by randomly selecting and inverting a bit. If this mutation produces a better individual, the former chromosome is replaced by the new one. Otherwise, the operation is ignored. The fitness calculation is similar to the one done for libraries, which means that the fitness value is given by the matching function, as in Eq. 1. The difference is that the antibodies have to maximize matching in the niche, while antigens need to minimize it. Algorithm 4 shows the algorithm for the evolution of the antigens in detail.

4.3 The main algorithm
This section shows schematically how the whole system works. The main pseudo-code is presented in Algorithm 5. For each population's GA, the components described in Sects. 4.1 and 4.2 were implemented. The number of epochs of the main algorithm and the number of generations for each population are parameters to be defined by the user.

4.4 Experiments
4.4.1 The first example
In this first experiment, the parameters used for the libraries population's GA were 120 generations per epoch, ten individuals in the population, elitism of one individual, 4 bits per gene segment and 85% probability of crossover and mutation. The probabilities assigned to the additive, subtractive and inversive mutations were, respectively, 20, 10 and 70%. The segment groups V, D and J had the same chance of being selected for mutation. The values for the genetic operators were adopted based on what is found in nature. The antigens' GA used 400 generations, 200 individuals, 64 bits per chromosome and 85% mutation probability. The number of recognized antigens along the generations is shown in the graph of Fig. 14. As can be seen in the evolution curve, at the beginning of the evolution the gene libraries are not yet robust enough to recognize the new antigens produced by the mutation mechanisms. That is why neutralization first increases and then, when the antigens start to evolve, decreases. As the evolution of the artificial species proceeds, the library genes rapidly self-adjust in order to have a minimum repertoire able to produce receptors that identify and eliminate any given antigen.

4.4.2 The second example
In this second experiment, the number of generations for the libraries' GA was reduced to 100. The other parameters had the same values as in the previous example. The antigens' GA used 200 generations, 150 individuals and 60% mutation probability. The results are shown in Fig. 15. In this example, as there is a smaller number of antigens, the libraries reach stability in a shorter period of time.

Fig. 15 The evolution of the AIS in the second experiment (plot titled "The Evolution of the Artificial Immune System"; horizontal axis: Generations; vertical axis: Recognized Antigens)
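The antigen GA of Sect. 4.2 and the alternating schedule of Sect. 4.3, as parameterised in the two experiments above, can be sketched in Python. The sketch is hedged in several ways. The library GA is abstracted as a caller-supplied `library_step` function, `decode` is likewise supplied by the caller, antibodies are assumed to slide along longer antigens, and running the library generations before the antigen generations within each epoch is an assumption. None of these names comes from Algorithms 4 or 5.

```python
import random


def matching_value(a: str, b: str) -> int:
    """Longest run of complementary bits between two equal-length bit strings."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x != y else 0
        best = max(best, run)
    return best


def best_match(antigen: str, antibodies) -> int:
    """Best matching value of any antibody at any position of the antigen
    (sliding the antibody along the antigen is an assumption of this sketch)."""
    best = 0
    for ab in antibodies:
        for i in range(len(antigen) - len(ab) + 1):
            best = max(best, matching_value(antigen[i:i + len(ab)], ab))
    return best


def evolve_antigens(antigens, antibodies, generations, p_mut, rng):
    """Antigen GA: flip one random bit and keep the flip only if the antigen
    becomes harder to match (lower best_match)."""
    antigens = list(antigens)
    for _ in range(generations):
        for k, antigen in enumerate(antigens):
            if rng.random() >= p_mut:
                continue  # this antigen is not mutated in this generation
            i = rng.randrange(len(antigen))
            flipped = "1" if antigen[i] == "0" else "0"
            mutant = antigen[:i] + flipped + antigen[i + 1:]
            if best_match(mutant, antibodies) < best_match(antigen, antibodies):
                antigens[k] = mutant
    return antigens


def coevolve(libraries, antigens, library_step, decode,
             epochs, lib_gens, ag_gens, p_mut, rng):
    """Main loop: in each epoch, run the library GA, then the antigen GA."""
    for _ in range(epochs):
        for _ in range(lib_gens):
            libraries = library_step(libraries, antigens)
        repertoire = [ab for lib in libraries for ab in decode(lib)]
        antigens = evolve_antigens(antigens, repertoire, ag_gens, p_mut, rng)
    return libraries, antigens


# Tiny demonstration: one fixed antibody, one antigen that learns to evade it.
rng = random.Random(0)
evolved = evolve_antigens(["0000"], ["1111"], generations=5, p_mut=1.0, rng=rng)
print(best_match("0000", ["1111"]), best_match(evolved[0], ["1111"]))
```

In the demonstration the antigen starts fully complementary to the antibody, so any accepted bit flip lowers its best match, which mirrors the drop in recognized antigens once the antigens start to evolve.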
5 The fourth model
The next step in this work was to consider the tolerance characteristic of the biological IS. To be self-tolerant, an IS must distinguish between foreign molecules and those that belong to the organism, so that its cells and molecules do not attack the organism itself [31]. This fourth model simulates tolerance by adding a new population representing self. Now, the library population has to evolve so as to maximize the coverage of the antigen population while minimizing attacks on self-molecules. To implement this new requirement, a penalty is introduced for those individuals that produce self-reactive antibodies. The penalty is computed by dividing the sum of the antibodies' fitness by the number of self-molecules attacked. The pseudo-code for the libraries' GA of this fourth model is shown in Algorithm 6.

The pseudo-code for the antigens' algorithm and for the main algorithm is the same as in the third model (Algorithms 4 and 5).
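The self-reactivity penalty can be sketched as follows, reading the rule literally. The function names and the `min_rate` threshold are hypothetical, and leaving the fitness unchanged when no self-molecule is attacked is an assumption, since the text only defines the penalty for self-reactive individuals.

```python
def matching_value(a: str, b: str) -> int:
    """Longest run of complementary bits between two equal-length bit strings."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x != y else 0
        best = max(best, run)
    return best


def count_self_attacked(antibodies, self_molecules, min_rate: float = 1.0) -> int:
    """Number of self-molecules bound by at least one antibody of the repertoire."""
    attacked = 0
    for molecule in self_molecules:
        for ab in antibodies:
            windows = range(len(molecule) - len(ab) + 1)
            if any(matching_value(molecule[i:i + len(ab)], ab) / len(ab) >= min_rate
                   for i in windows):
                attacked += 1
                break  # count each self-molecule once
    return attacked


def penalized_fitness(antibody_scores, self_attacked: int) -> float:
    """Sum of the antibodies' fitness divided by the number of self-molecules attacked."""
    total = float(sum(antibody_scores))
    if self_attacked == 0:
        return total  # assumption: no penalty without self-reactivity
    return total / self_attacked


attacked = count_self_attacked(["1111"], ["0000", "0101", "1111"])
print(attacked)                             # 1: only "0000" is fully complementary
print(penalized_fitness([3.0, 2.0, 1.0], 3))  # 2.0
```

Read literally, a library that attacks exactly one self-molecule has its fitness divided by one and is therefore not penalized; Algorithm 6 may treat this case differently.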
Fig. 14 The evolution of the AIS in the first experiment (plot titled "The Evolution of the Artificial Immune System"; horizontal axis: Generations; vertical axis: Recognized Antigens)

5.1 Experiments

5.1.1 The first example
In the first experiment, the libraries’ GA used 600 generations per epoch, ten individuals in the population and one
123
142
Evol. Intel. (2008) 1:133–144
individual as part of elitism. The other parameters assumed the same values used in the previous model: 4 bits per gene segment and an 85% probability of crossover and mutation. The antigens’ GA used 1,500 generations per epoch and 100 individuals with chromosomes of 120 bits and a mutation probability of 85%. The self-population had size 50, and its molecules were represented by chromosomes of 12 bits. The chromosome sizes in the antigen and self-populations were set to different values because of the limitations of binary strings. With an equal number of bits, individuals from the self and antigen populations could contain very similar sequences of zeros and ones, so antibodies would automatically match self when matching antigens. The results obtained with the first set of parameters can be seen in Fig. 16. Here, the results differed from those of the other model. Initially, the libraries evolved to cover the antigens and, although this evolution proceeded more slowly, most pathogens were matched without self-harm. Nevertheless, once the antigens mutated, the libraries could no longer identify them, even over the remainder of the evolution.
Fig. 16 The evolution of the AIS in the first experiment (plot title: The Evolution of the IS Considering Self Molecules; horizontal axis: Generations; vertical axis: Recognized Antigenic Molecules without self harm)
5.1.2 The second example
Some parameter changes were made in this second example in order to confirm the system’s behavior shown in the previous case. For the libraries’ GA, 1,000 generations per epoch were used. The antigens’ GA had 200 individuals, also with chromosomes of 120 bits. The self-population had the same parameters used in the previous example. The results are shown in Fig. 17. Once more, the libraries could not evolve towards neutralizing the antigens. Specifically, in this case, no antigen was recognized at the end of the evolution. The explanation for this phenomenon is that the antigens evolved towards having strings in their genotype similar to self. In other words, the antigens mimicked self. This fact exposed the main limitation of the model: when antigens become similar to self-molecules, the antibodies no longer provide an effective defense mechanism. This leads to the conclusion that the same could happen in real biological ISs. Thus, there must be other means of protection to prevent self-similar antigens from invading an organism. In biological ISs this task is performed by T-cells, whose function, among others, is to detect anomalous cells, such as those that have suffered pathogenic invasions, or tumoral cells. The experiment showed the importance of T-cells inside the organism as another source of protection.
Fig. 17 The evolution of the AIS in the second experiment (plot title: The Evolution of the IS Considering Self Molecules; horizontal axis: Generations; vertical axis: Recognized Antigenic Molecules without self harm)
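The mimicry effect reported for the second example can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments. The matching rule (an antibody matches a molecule when it occurs in it as a contiguous block), the toy string lengths, the unit penalty, and the names `matches`, `is_self_reactive`, `safe_coverage` and `penalized_fitness` are all assumptions introduced here for illustration only.

```python
from itertools import product
from typing import List


def matches(antibody: str, molecule: str) -> bool:
    # Assumed rule: the antibody occurs in the molecule as a contiguous block.
    return antibody in molecule


def is_self_reactive(antibody: str, self_molecules: List[str]) -> bool:
    # An antibody is self-reactive if it matches any self molecule.
    return any(matches(antibody, s) for s in self_molecules)


def safe_coverage(antibodies: List[str], antigens: List[str],
                  self_molecules: List[str]) -> int:
    # Count antigens matched by at least one antibody that matches no self.
    safe = [ab for ab in antibodies if not is_self_reactive(ab, self_molecules)]
    return sum(1 for ag in antigens if any(matches(ab, ag) for ab in safe))


def penalized_fitness(antibodies: List[str], antigens: List[str],
                      self_molecules: List[str], penalty: float = 1.0) -> float:
    # Hypothetical penalty scheme: reward safe coverage and subtract a
    # fixed penalty for every self-reactive antibody.
    self_hits = sum(1 for ab in antibodies
                    if is_self_reactive(ab, self_molecules))
    return safe_coverage(antibodies, antigens, self_molecules) - penalty * self_hits


# Toy data (assumed sizes, much shorter than the 120-bit antigens of the paper).
SELF = ["000011110000"]
FOREIGN = "0101010101010101"   # shares no 4-bit block with SELF
MIMIC = "0000111100001111"     # every 4-bit block also occurs in SELF

# Complete repertoire of 4-bit antibodies.
REPERTOIRE = ["".join(bits) for bits in product("01", repeat=4)]

foreign_covered = safe_coverage(REPERTOIRE, [FOREIGN], SELF)
mimic_covered = safe_coverage(REPERTOIRE, [MIMIC], SELF)
print(foreign_covered, mimic_covered)
```

Under these assumptions, even a complete repertoire of 4-bit antibodies cannot recognize the mimicking antigen without also matching self, whereas the foreign antigen is covered safely. This mirrors the behavior observed in Fig. 17.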
6 Conclusions
Understanding how ISs in mammals have evolved to their present configuration is challenging, but it may also be the key to figuring out more details of their mechanisms. This paper has proposed four binary-encoded models describing some aspects of the evolution in an artificial IS with some characteristics similar to those of real biological systems.
The first model focused on the evolution of a population of paratopes considering a fixed group of epitopes. The objective of this first experiment was to simulate a hypermutation mechanism and observe how the system would self-adjust to cover the epitopes. This covering capacity is the measure of how well the system could protect an artificial species along its evolution. The results of this first experiment showed that, at the beginning of the evolution, the available paratopes were not well adapted to the epitopes. However, as the evolution proceeded, the paratopes became much better adapted to the environment presented and were able to recognize almost all of the given epitopes.
The improvement of the first model produced a second model with characteristics more similar to real ISs. Instead of paratopes and epitopes, the evolution involved a group of antibodies adapting to a given population of antigen molecules. The results of this second experiment also showed an improvement of the adaptation curve, as in the first model. It also yielded a pattern-recognition apparatus able to find equal blocks in a population of bit strings. This algorithm could be adapted for application to other complex recognition systems [13].
The third model focused on the coevolution between a population of antibody-producing gene libraries and a set of antigens. The objective of this experiment was to simulate the somatic recombination mechanisms and observe how the system would self-adjust to provide the ability to identify any foreign molecule. This covering capacity is the measure of how well the system could protect an artificial species along its evolution. The results of the experiments showed that, at the beginning of the evolution, the available libraries were not well adapted to the exposed antigens. However, as the evolution proceeded, they became much better adapted to the environment presented and were able to recognize any new mutated antigen that would appear in the population. With the third model, the coverage stability of real ISs was reached. Therefore, an artificial system able to adapt and recognize any given binary segment has been created. The need for, and the ability of, a fast-changing genetic mechanism to provide robustness to the genotype of a biological species’ antibodies has also been shown.
In addition to libraries and antigens, in the fourth model the evolution involved a new group of bit strings. They represented molecules belonging to the organism, against which the IS must not activate defense mechanisms. With this new restriction, the libraries’ evolution had to proceed so as to maximize matching against antigens and minimize it against self.
To implement this idea, a penalty scheme was adopted to reduce the fitness of self-reacting libraries. The results obtained showed that the evolution of the antigens’ population proceeded towards imitating the bit sequences of self-molecules. As a result, antigens became invisible to the antibodies’ defense mechanisms, pointing to the necessity of other means of protection. In real ISs, these other means are constituted by T-cells.
References
1. Abbas AK, Lichtman AH, Pober JS (1998) Molecular and cellular immunology (in Portuguese), 2nd edn. Revinter, Rio de Janeiro
2. Aguilar E (2003) Um estudo sistêmico de um modelo de sistema imune com evolução da especificidade. Master’s thesis, COPPE/UFRJ
3. Cayzer S, Smith J, Marshall JAR, Kovacs T (2005) What have gene libraries done for artificial immune systems? In: International conference on artificial immune systems
4. Cormack DH (1991) HAM histology (in Portuguese), 9th edn. Guanabara Koogan
5. Darwin C (1872) On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life, 6th edn. John Murray, London
6. de Castro LN (2001) Immune engineering: development of computational tools inspired by the artificial immune systems (in Portuguese). Ph.D. thesis, DCA FEEC/UNICAMP, Campinas/SP, Brazil
7. de Castro LN, Timmis J (2002) Artificial immune systems: a new computational intelligence approach, vol 1, 1st edn. Springer, New York
8. Farmer JD, Packard NH, Perelson AS (1986) The immune system, adaptation, and machine learning. Physica 22D:187–204
9. Flores LE, Aguilar EJ, Barbosa VC, de Carvalho LAV (2004) A graph model for the evolution of specificity in humoral immunity. J Theoret Biol 229(3):311–325
10. Forrest S, Smith RE, Javornik B, Perelson AS (1993) Using genetic algorithms to explore pattern recognition in the immune system. Evol Comput 1(3):191–211
11. Goldberg D (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading
12. Goldberg DE, Wang L (1998) Adaptive niching via coevolutionary sharing. Genetic Algorithms and Evolution Strategy in Engineering and Computer Science, pp 21–38
13. Golub ES (1992) Is the function of the immune system only to protect? In: Theoretical and experimental insights into immunology, vol 66 of H: cell biology. NATO ASI Series, pp 15–26
14. Hightower R, Forrest S, Perelson AS (1995) The evolution of emergent organization in immune system gene libraries. In: Eshelman L (ed) Proceedings of the 6th international conference on genetic algorithms. Morgan Kaufmann, San Francisco, pp 344–350
15. Hillis W (1990) Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D 42:228–234
16. Hofmeyr SA, Forrest S (2000) Architecture for an artificial immune system. Evol Comput 8(4):443–473
17. Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
18. Inman J (1978) The antibody combining region: speculations on the hypothesis of general multispecificity. Theoretical Immunology, pp 243–278
19. Janeway CA, Travers P, Walport M, Shlomchik M (2001) Immunobiology: the immune system in health and disease, 5th (Brazilian) edn. Artes Médicas, Porto Alegre
20. Jerne NK (1974) Towards a network theory of the immune system. Ann Immunol (Inst. Pasteur) 125C(3):73–89
21. Kepler TB, Perelson AS (1993) Cyclic reentry of germinal center B cells and the efficiency of affinity maturation. Immunol Today 14:412–415
22. Kepler TB, Perelson AS (1993) Somatic hypermutation in B cells: an optimal control treatment. J Theoret Biol 164:37–64
23. Oprea M, Forrest S (1999) How the immune system generates diversity: pathogen space coverage with random and evolved antibody libraries. In: Banzhaf W, Daida J, Eiben AE, Garzon MH, Honavar V, Jakiela M, Smith RE (eds) Proceedings of the genetic and evolutionary computation conference, vol 2. Morgan Kaufmann, Orlando, pp 1651–1656
24. Oprea M, Kepler TB (1999) Genetic plasticity of V genes under somatic hypermutation: statistical analyses using a new resampling-based methodology. Genome Res 9(12):1294–1304
25. Oprea M, Perelson A (1997) Somatic mutation leads to efficient affinity maturation when centrocytes recycle back to centroblasts. J Immunol 158:5155–5162
26. Perelson A (1989) Immune network theory. Immunol Rev 110:5–36
27. Perelson AS, Weisbuch G (1997) Immunology for physicists. Rev Mod Phys 69:1219
28. Ron J. The evolution of secondary organization in immune system gene libraries
29. Rosin C, Belew R (1995) Methods for competitive co-evolution: finding opponents worth beating. In: Eshelman L (ed) Proc of the 6th international conference on genetic algorithms and their applications. Pittsburgh, PA
30. Rosin C, Belew R (1997) New methods for competitive coevolution. Evol Comput 5(1):1–29
31. Rumjanek VM (2001) Próprio e estranho: é essa a questão? Ciência Hoje 29:40
32. Shimura J (1996) Somatic mutations in immunoglobulin V gene determine the structure and function of the protein—an evidence from homology modeling. In: Hunter L, Klein T (eds) Biocomputing: proceedings of the 1996 Pacific symposium. World Scientific Publishing, Singapore
33. Stewart J (1992) The immune system in an evolutionary perspective. In: Theoretical and experimental insights into immunology, vol 66 of H: cell biology. NATO ASI Series, pp 27–48
34. Tizard I (1985) Introduction to veterinary immunology (in Portuguese), 2nd edn
35. Tullock G (1967) Towards a mathematics of politics. The University of Michigan Press, Ann Arbor
36. Twycross J, Aickelin U (2007) Biological inspiration for artificial immune systems. In: Artificial immune systems