16.01.2013 Views

An Introduction to Genetic Algorithms - Boente

$$L(N - n, n) = q\,(\mu_1 - \mu_2)(N - n) + (1 - q)\,(\mu_1 - \mu_2)\,n \tag{4.1}$$

The goal is to find $n = n^*$ that minimizes $L(N - n, n)$. This can be done by taking the derivative of $L(N - n, n)$ with respect to $n$, setting it to zero, and solving for $n$:

$$\frac{dL}{dn} = (\mu_1 - \mu_2)\left(1 - 2q + (N - 2n)\frac{dq}{dn}\right) = 0 \tag{4.2}$$

To solve this equation, we need to express $q$ in terms of $n$ so that we can find $dq/dn$. Recall that $q$ is the probability that $A_l(N - n, n)$, the arm observed to be worse, is in fact $A_1$. Suppose $A_l(N - n, n)$ indeed is $A_1$; then $A_1$ was given $n$ trials. Let $S_1^n$ be the sum of the payoffs of the $n$ trials given to $A_1$, and let $S_2^{N-n}$ be the sum of the payoffs of the $N - n$ trials given to $A_2$. Then

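$$q = \Pr\left(\frac{S_2^{N-n}}{N - n} > \frac{S_1^n}{n}\right) \tag{4.3}$$

that is, the probability that the observed average payoff of $A_2$ is higher than that of $A_1$. Equivalently,

$$q = \Pr\left(\frac{S_1^n}{n} - \frac{S_2^{N-n}}{N - n} < 0\right) \tag{4.4}$$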
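As a rough illustration of (4.4), q can be estimated by simulation. The sketch below assumes Gaussian payoffs; the distribution, the parameter values, and the name `estimate_q` are illustrative choices, not taken from the text. It counts how often the observed average of A1 over n trials falls below the observed average of A2 over N − n trials.

```python
import random


def estimate_q(n, N, mu1, mu2, sigma1, sigma2, runs=5000, seed=0):
    """Monte Carlo estimate of q = Pr(S1/n - S2/(N - n) < 0).

    Assumption: payoffs are Gaussian; the text does not fix a distribution.
    """
    rng = random.Random(seed)
    below_zero = 0
    for _ in range(runs):
        # Sum of payoffs of the n trials given to A1.
        s1 = sum(rng.gauss(mu1, sigma1) for _ in range(n))
        # Sum of payoffs of the N - n trials given to A2.
        s2 = sum(rng.gauss(mu2, sigma2) for _ in range(N - n))
        if s1 / n - s2 / (N - n) < 0:
            below_zero += 1
    return below_zero / runs


# Hypothetical setting: A1 is truly better (mu1 > mu2) but receives few trials.
q_few = estimate_q(n=2, N=50, mu1=1.0, mu2=0.5, sigma1=1.0, sigma2=1.0)
q_more = estimate_q(n=10, N=50, mu1=1.0, mu2=0.5, sigma1=1.0, sigma2=1.0)
print(q_few, q_more)
```

In this hypothetical setting the estimate shrinks as n grows: the more trials the truly better arm receives, the less likely its observed average is to look worse.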

Chapter 4: Theoretical Foundations of Genetic Algorithms

(For subtle reasons, this is actually only an approximation; see Holland 1975, chapter 5.) Since $S_1^n$ and $S_2^{N-n}$ are random variables, their difference is also a random variable with a well-defined distribution. $\Pr((S_1^n/n - S_2^{N-n}/(N - n)) < 0)$ is simply the area under the part of the distribution that is less than zero. The problem now is to compute this area, which is a tricky task. Holland originally approximated it by using the central limit theorem to assume a normal distribution. Dan Frantz (as described in chapter 10 of the second edition of Holland 1975) corrected the original approximation using the theory of large deviations rather than the central limit theorem.
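To make the minimization concrete, the sketch below follows the central-limit-theorem route described above: it treats the difference of average payoffs as normal, takes q as the area below zero, and scans n for the smallest expected loss. It assumes the expected loss has the form $L = (\mu_1 - \mu_2)[q(N - n) + (1 - q)n]$ with $\mu_1 > \mu_2$ the mean payoffs; the payoff parameters, the scan range, and the function names are hypothetical. It is not Frantz's large-deviations correction.

```python
import math
from statistics import NormalDist


def q_normal(n, N, mu1, mu2, sigma1, sigma2):
    """Normal approximation to q = Pr(S1/n - S2/(N - n) < 0)."""
    mean = mu1 - mu2
    std = math.sqrt(sigma1 ** 2 / n + sigma2 ** 2 / (N - n))
    return NormalDist(mean, std).cdf(0.0)


def expected_loss(n, N, mu1, mu2, sigma1, sigma2):
    """Assumed loss form: (mu1 - mu2) * (q * (N - n) + (1 - q) * n)."""
    q = q_normal(n, N, mu1, mu2, sigma1, sigma2)
    return (mu1 - mu2) * (q * (N - n) + (1 - q) * n)


def best_n(N, mu1, mu2, sigma1, sigma2):
    """Brute-force scan for the n that minimizes the assumed loss.

    Assumption: the observed-worse arm gets at most half of the N trials.
    """
    candidates = range(1, N // 2 + 1)
    return min(candidates,
               key=lambda n: expected_loss(n, N, mu1, mu2, sigma1, sigma2))


# Hypothetical parameters, chosen only for illustration.
N, MU1, MU2, SIGMA1, SIGMA2 = 1000, 1.0, 0.5, 1.0, 1.0
n_opt = best_n(N, MU1, MU2, SIGMA1, SIGMA2)
print(n_opt, expected_loss(n_opt, N, MU1, MU2, SIGMA1, SIGMA2))
```

With these illustrative parameters the minimizing n is a small fraction of N, so nearly all trials go to the arm that looks better.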

Here the mathematics get complicated (as is often the case for easy-to-state problems such as the Two-Armed Bandit problem). According to Frantz, the optimal allocation of trials $n^*$ to the observed second best of the two random variables corresponding to the Two-Armed Bandit problem is approximated by

$$n^* \approx c_1 \ln\left(\frac{c_2 N^2}{\ln(c_3 N^2)}\right)$$

where $c_1$, $c_2$, and $c_3$ are positive constants defined by Frantz. (Here ln denotes the natural logarithm.) The details of this solution are of less concern to us than its form. This can be seen by rearranging the terms and performing some algebra to get an expression for $N - n^*$, the optimal allocation of trials to the observed better arm:

$$N - n^* \approx e^{n^*/2c_1} \sqrt{\frac{\ln(c_3 N^2)}{c_2}} - n^*$$

As $n^*$ increases, $e^{n^*/2c_1}$ dominates everything else, so we can further approximate (letting $c = 1/(2c_1)$):

$$N - n^* \approx e^{c n^*}$$
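The rearrangement can be checked numerically. The sketch below assumes Frantz's approximation has the form $n^* \approx c_1 \ln(c_2 N^2 / \ln(c_3 N^2))$ and uses placeholder constants $c_1 = c_2 = c_3 = 1$, which are hypothetical values and not Frantz's. It evaluates the rearranged expression for $N - n^*$ and prints $\ln(N - n^*)/n^*$ for comparison with $c = 1/(2c_1)$.

```python
import math


def n_star(N, c1=1.0, c2=1.0, c3=1.0):
    """Assumed form of Frantz's approximation; the constants are placeholders."""
    return c1 * math.log(c2 * N ** 2 / math.log(c3 * N ** 2))


def better_arm_trials(N, c1=1.0, c2=1.0, c3=1.0):
    """Rearranged expression: e^(n*/2c1) * sqrt(ln(c3 N^2) / c2) - n*."""
    ns = n_star(N, c1, c2, c3)
    growth = math.exp(ns / (2 * c1))
    return growth * math.sqrt(math.log(c3 * N ** 2) / c2) - ns


for N in (10 ** 2, 10 ** 4, 10 ** 6):
    ns = n_star(N)
    # Third column: empirical exponent, to compare with c = 1 / (2 * c1) = 0.5.
    print(N, round(ns, 2), round(math.log(N - ns) / ns, 3))
```

Under these placeholder constants the rearranged expression reproduces $N - n^*$ up to floating-point error, while the printed exponent moves toward c only slowly, because the final approximation drops the square-root factor.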
