
Chapter 2: Genetic Algorithms in Problem Solving

space the GA searched was thus 2^128—far too large for any kind of exhaustive search.

In our main set of experiments, we set N = 149 (chosen to be reasonably large but not computationally intractable). The GA began with a population of 100 randomly generated chromosomes (generated with some initial biases—see Mitchell, Crutchfield, and Hraber 1994a, for details). The fitness of a rule in the population was calculated by (i) randomly choosing 100 ICs (initial configurations) that are uniformly distributed over ρ ∈ [0.0, 1.0], with exactly half with ρ < ρ_c and half with ρ > ρ_c, (ii) running the rule on each IC either until it arrives at a fixed point or for a maximum of approximately 2N time steps, and (iii) determining whether the final pattern is correct—i.e., N zeros for ρ_0 < ρ_c and N ones for ρ_0 > ρ_c. The initial density, ρ_0, was never exactly 1/2, since N was chosen to be odd. The rule's fitness, f_100, was the fraction of the 100 ICs on which the rule produced the correct final pattern. No partial credit was given for partially correct final configurations.
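To make the evaluation concrete, here is a minimal Python sketch of this fitness calculation. It assumes a binary CA with a radius-3 (7-cell) neighborhood on a periodic lattice, consistent with the 2^128 rule space described above; the names (f100, run_to_fixed_point, and so on) are illustrative, not taken from the original code.

```python
import random

N = 149              # lattice size; odd, so the density is never exactly 1/2
RADIUS = 3           # 7-cell neighborhood: 2**7 = 128 rule-table entries
MAX_STEPS = 2 * N    # "approximately 2N" time steps

def step(config, rule_table):
    """One synchronous CA update, assuming periodic boundary conditions."""
    n = len(config)
    out = [0] * n
    for i in range(n):
        # Read the 7-cell neighborhood as a 7-bit index into the rule table.
        idx = 0
        for j in range(i - RADIUS, i + RADIUS + 1):
            idx = (idx << 1) | config[j % n]
        out[i] = rule_table[idx]
    return out

def run_to_fixed_point(config, rule_table):
    """Iterate the rule until a fixed point or until MAX_STEPS elapse."""
    for _ in range(MAX_STEPS):
        nxt = step(config, rule_table)
        if nxt == config:
            return nxt
        config = nxt
    return config

def f100(rule_table, n_ics=100):
    """Fraction of ICs classified correctly (all 0s for rho_0 < 1/2,
    all 1s for rho_0 > 1/2).  No partial credit is given."""
    correct = 0
    for trial in range(n_ics):
        # Exactly half the ICs have rho_0 < rho_c and half rho_0 > rho_c,
        # with densities spread evenly on each side of 1/2.
        if trial < n_ics // 2:
            k = random.randint(0, N // 2)        # number of ones: rho_0 < 1/2
        else:
            k = random.randint(N // 2 + 1, N)    # rho_0 > 1/2
        ic = [1] * k + [0] * (N - k)
        random.shuffle(ic)
        target = 1 if k > N // 2 else 0
        if run_to_fixed_point(ic, rule_table) == [target] * N:
            correct += 1
    return correct / n_ics
```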

A few comments about the fitness function are in order. First, as was the case in Hillis's sorting-networks project, the number of possible input cases (2^149 for N = 149) was far too large to test exhaustively. Instead, the GA sampled a different set of 100 ICs at each generation. In addition, the ICs were not sampled from an unbiased distribution (i.e., equal probability of a one or a zero at each site in the IC), but rather from a flat distribution across ρ ∈ [0, 1] (i.e., ICs of each density from ρ = 0 to ρ = 1 were approximately equally represented). This flat distribution was used because the unbiased distribution is binomially distributed and thus very strongly peaked at ρ = 1/2. The ICs selected from such a distribution will likely all have ρ ≈ 1/2, the hardest cases to classify. Using an unbiased sample made it too difficult for the GA to ever find any high-fitness CAs. (As will be discussed below, this biased distribution turns out to impede the GA in later generations: as increasingly fit rules are evolved, the IC sample becomes less and less challenging for the GA.)
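The difference between the two sampling schemes is easy to see numerically. The sketch below (illustrative Python, assuming the same N = 149 lattice) draws ICs both ways and compares the spread of their densities: the unbiased scheme clusters tightly around ρ = 1/2, while the flat scheme covers the whole interval.

```python
import random

N = 149

def unbiased_ic():
    # Each site is 1 with probability 1/2 independently, so the density
    # is Binomial(N, 1/2) / N -- sharply peaked near rho = 1/2.
    return [random.randint(0, 1) for _ in range(N)]

def flat_density_ic():
    # Choose the density first (number of ones uniform over 0..N),
    # then scatter the ones, so every density is equally represented.
    k = random.randint(0, N)
    ic = [1] * k + [0] * (N - k)
    random.shuffle(ic)
    return ic

samples = 10_000
rho_unbiased = sorted(sum(unbiased_ic()) / N for _ in range(samples))
rho_flat = sorted(sum(flat_density_ic()) / N for _ in range(samples))

# Middle 90% of each sample: roughly [0.43, 0.57] for the unbiased
# scheme versus roughly [0.05, 0.95] for the flat scheme.
print(rho_unbiased[samples // 20], rho_unbiased[-(samples // 20)])
print(rho_flat[samples // 20], rho_flat[-(samples // 20)])
```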

Our version of the GA worked as follows. In each generation, (i) a new set of 100 ICs was generated, (ii) f_100 was calculated for each rule in the population, (iii) the population was ranked in order of fitness, (iv) the 20 highest-fitness ("elite") rules were copied to the next generation without modification, and (v) the remaining 80 rules for the next generation were formed by single-point crossovers between randomly chosen pairs of elite rules. The parent rules were chosen from the elite with replacement—that is, an elite rule was permitted to be chosen any number of times. The offspring from each crossover were each mutated twice. This process was repeated for 100 generations for a single run of the GA. (More details of the implementation are given in Mitchell, Crutchfield, and Hraber 1994a.)
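A generation step of this procedure might be sketched as follows (again illustrative Python; the parameter names are ours, not from the original implementation). One simplification: the f100 sketch above draws its own ICs on every call, whereas the original generates one set of 100 ICs per generation and evaluates every rule, including the retested elite, on that shared set.

```python
import random

POP_SIZE = 100        # rules per generation
N_ELITE = 20          # top 20% copied unchanged
CHROM_LEN = 128       # one rule-table output bit per neighborhood
N_MUTATIONS = 2       # each offspring is mutated twice

def mutate(chrom):
    """Flip N_MUTATIONS randomly chosen bits of a chromosome."""
    chrom = list(chrom)
    for _ in range(N_MUTATIONS):
        pos = random.randrange(CHROM_LEN)
        chrom[pos] ^= 1
    return chrom

def next_generation(population, fitness):
    """One GA generation: rank on a fresh IC sample, copy the elite
    unmodified, and fill the remaining 80 slots with mutated single-point
    crossovers between elite parents chosen with replacement."""
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[:N_ELITE]
    new_pop = [list(rule) for rule in elite]
    while len(new_pop) < POP_SIZE:
        mom, dad = random.choice(elite), random.choice(elite)
        cut = random.randrange(1, CHROM_LEN)       # single crossover point
        for child in (mom[:cut] + dad[cut:], dad[:cut] + mom[cut:]):
            if len(new_pop) < POP_SIZE:
                new_pop.append(mutate(child))
    return new_pop

# One run: 100 generations, with fitness recomputed on new ICs each time.
# population = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
#               for _ in range(POP_SIZE)]
# for gen in range(100):
#     population = next_generation(population, f100)
```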

Note that this version of the GA differs from the simple GA in several ways. First, rather than selecting parents with probability proportional to fitness, the rules are ranked and selection is done at random from the top 20% of the population. Moreover, all of the top 20% are copied without modification to the next generation, and only the bottom 80% are replaced. This is similar to the selection method—called "(μ + λ)"—used in some evolution strategies; see Bäck, Hoffmeister, and Schwefel 1991.

This version of the GA was the one used by Packard (1988), so we used it in our experiments attempting to replicate his work (Mitchell, Hraber, and Crutchfield 1993) and in our subsequent experiments. Selecting parents by rank rather than by absolute fitness prevents initially stronger individuals from quickly dominating the population and driving the genetic diversity down too early. Also, since testing a rule on 100 ICs provides only an approximate gauge of the true fitness, saving the top 20% of the rules was a good way of making a "first cut" and allowing rules that survive to be tested over more ICs. Since a new set of ICs was produced every generation, rules that were copied without modification were always retested on this new set. If a rule performed well and thus survived over a large number of generations, then it was likely to be a genuinely better rule than those that were not selected, since it was tested with a large set of ICs. An alternative method would be to test every rule in each generation on a much larger set of ICs, but this would waste computation time. Too much effort, for example, would go into testing very weak rules, which can safely be weeded out early using our method. As in most applications, evaluating the fitness function (here, iterating each CA) takes

