An Introduction to Genetic Algorithms - Boente
Chapter 2: Genetic Algorithms in Problem Solving

this simplified potential-energy function for the amino acid sequence of Crambin.

In Schulze-Kremer's GA, crossover was either two-point (i.e., performed at two points along the chromosome rather than at one point) or uniform (i.e., rather than taking contiguous segments from each parent to form the offspring, each "gene" is chosen from one or the other parent, with a 50% probability for each parent). Here a "gene" consisted of a group of 10 torsion angles; crossover points were chosen only at amino acid boundaries. Two mutation operators designed to work on real numbers rather than on bits were used: the first replaced a randomly chosen torsion angle with a new value randomly chosen from the 10 most frequently occurring angle values for that particular bond, and the second incremented or decremented a randomly chosen torsion angle by a small amount.
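The operators just described can be sketched as follows. This is a minimal illustration, not Schulze-Kremer's actual code: the chromosome is assumed to be a flat list of torsion angles grouped ten per amino acid, and `frequent_angles` (a per-position list of the most common angle values for each bond) is a hypothetical input.

```python
import random

GENE_SIZE = 10  # torsion angles per "gene" (one amino acid)

def two_point_crossover(p1, p2):
    """Exchange the segment between two cut points, chosen only at gene
    (amino acid) boundaries."""
    n_genes = len(p1) // GENE_SIZE
    a, b = sorted(random.sample(range(n_genes + 1), 2))
    i, j = a * GENE_SIZE, b * GENE_SIZE
    return p1[:i] + p2[i:j] + p1[j:]

def uniform_crossover(p1, p2):
    """Take each gene from one or the other parent with probability 0.5."""
    child = []
    for g in range(len(p1) // GENE_SIZE):
        src = p1 if random.random() < 0.5 else p2
        child.extend(src[g * GENE_SIZE:(g + 1) * GENE_SIZE])
    return child

def mutate_replace(chrom, frequent_angles):
    """First mutation operator: replace one randomly chosen angle with one
    of the most frequently occurring values for that bond."""
    c = chrom[:]
    i = random.randrange(len(c))
    c[i] = random.choice(frequent_angles[i])
    return c

def mutate_increment(chrom, step=1.0):
    """Second mutation operator: nudge one randomly chosen angle up or down
    by a small amount."""
    c = chrom[:]
    i = random.randrange(len(c))
    c[i] += random.choice([-step, step])
    return c
```

Because cut points fall only at multiples of `GENE_SIZE`, every ten-angle group in an offspring comes intact from a single parent, which is what keeps crossover aligned with amino acid boundaries.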
The GA started on a randomly generated initial population of ten structures and ran for 1000 generations. At each generation the fitness was calculated (here, high fitness means low potential energy), the population was sorted by fitness, and a number of the highest-fitness individuals were selected to be parents for the next generation (this is, again, a form of rank selection). Offspring were created via crossover and mutation. A scheme was used in which the probabilities of the different mutation and crossover operators increased or decreased over the course of the run. In designing this scheme, Schulze-Kremer relied on his intuitions about which operators were likely to be most useful at which stages of the run.
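The overall generational loop might look like the following sketch (again, not Schulze-Kremer's actual code): `energy_fn` and `random_structure` stand in for the simplified potential-energy function and a random-conformation generator, and the decaying mutation probability is an invented example of an operator-probability schedule that changes over the run.

```python
import random

def evolve(energy_fn, random_structure, n_gens=1000, pop_size=10, n_parents=4):
    """Rank/truncation selection: sort by energy (low energy == high fitness)
    and breed the next generation from the best few individuals.

    energy_fn and random_structure are problem-specific callables (assumptions
    here); the p_mut schedule stands in for hand-tuned, time-varying operator
    probabilities.
    """
    pop = [random_structure() for _ in range(pop_size)]
    for gen in range(n_gens):
        pop.sort(key=energy_fn)                   # best (lowest-energy) first
        parents = pop[:n_parents]
        p_mut = 0.8 * (1 - gen / n_gens) + 0.1    # example: mutate less later on
        children = []
        while len(children) < pop_size - n_parents:
            p1, p2 = random.sample(parents, 2)
            cut = random.randrange(1, len(p1))
            child = p1[:cut] + p2[cut:]           # one-point crossover for brevity
            if random.random() < p_mut:
                i = random.randrange(len(child))
                child = child[:i] + [child[i] + random.uniform(-5, 5)] + child[i + 1:]
            children.append(child)
        pop = parents + children                  # parents survive (elitism)
    pop.sort(key=energy_fn)
    return pop[0]
```

Keeping the parents in the population means the best energy found never worsens from one generation to the next, which is one common way to make such a rank-selection scheme behave sensibly.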
The GA's search produced a number of structures with quite low potential energy—in fact, much lower than that of the actual structure for Crambin! Unfortunately, however, none of the generated individuals was structurally similar to Crambin. The snag was that it was too easy for the GA to find low-energy structures under the simplified potential-energy function; that is, the fitness function was not sufficiently constrained to force the GA to find the actual target structure. The fact that Schulze-Kremer's initial experiments were not very successful demonstrates how important it is to get the fitness function right—here, by getting the potential-energy model right (a difficult biophysical problem), or at least getting a good enough approximation to lead the GA in the right direction.
Schulze-Kremer's experiments are a first step in the process of "getting it right." I predict that fairly soon GAs and other machine learning methods will help biologists make real breakthroughs in protein folding and in other areas of molecular biology. I'll even venture to predict that this type of application will be much more profitable (both scientifically and financially) than using GAs to predict financial markets.
2.3 EVOLVING NEURAL NETWORKS<br />
Neural networks are biologically motivated approaches to machine learning, inspired by ideas from neuroscience. Recently some efforts have been made to use genetic algorithms to evolve aspects of neural networks.
In its simplest "feedforward" form (figure 2.16), a neural network is a collection of connected activatable units ("neurons") in which the connections are weighted, usually with real-valued weights. The network is presented with an activation pattern on its input units, such as a set of numbers representing features of an image to be classified (e.g., the pixels in an image of a handwritten letter of the alphabet). Activation spreads in a forward direction from the input units through one or more layers of middle ("hidden") units to the output units over the weighted connections. Typically, the activation coming into a unit from other units is multiplied by the weights on the links over which it spreads, and then is added together with other incoming activation. The result is typically thresholded (i.e., the unit "turns on" if the resulting activation is above that unit's threshold). This process is meant to roughly mimic the way activation spreads through networks of neurons in
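The weighted-sum-and-threshold computation can be sketched as follows. The weights in the example are illustrative only; real networks usually also include a bias input and often use a smooth activation function instead of a hard threshold.

```python
def step(x, threshold=0.0):
    """Hard threshold: the unit 'turns on' (outputs 1.0) if its net
    incoming activation exceeds the threshold."""
    return 1.0 if x > threshold else 0.0

def layer_forward(inputs, weights, activation=step):
    """One layer: each unit sums its weighted inputs, then thresholds.

    weights[j][i] is the weight on the link from input unit i to unit j.
    """
    return [activation(sum(w * x for w, x in zip(ws, inputs))) for ws in weights]

def feedforward(inputs, layers):
    """Propagate activation forward through successive layers
    (hidden layers first, then the output layer)."""
    act = inputs
    for weights in layers:
        act = layer_forward(act, weights)
    return act

# Example: two inputs -> two hidden units -> one output unit,
# with made-up weights.
out = feedforward([1.0, 1.0], [[[0.6, 0.6], [0.8, -0.8]],  # hidden layer
                               [[1.0, 1.0]]])              # output layer
```

With the step activation, each unit outputs 1.0 exactly when its weighted input sum exceeds its threshold (0 by default), which is the "turns on" behavior described above.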