
Neurogammon: a neural-network backgammon program - Neural ...

both machine learning and computer games research. For machine learning researchers (and for neural networks researchers in particular), it provides a convincing practical demonstration that network learning algorithms can be useful in tackling hard computational tasks. It should be emphasized that tournament performance is a severe test of the networks' ability to extract good generalizations from the training data, rather than merely to memorize specific individual positions. Apart from the first couple of moves at the start of each game, it is highly unlikely that any of the positions encountered by Neurogammon at the Computer Olympiad were contained in the training set.

The significance to computer games research is that this is apparently the first time a learning program has ever won an official tournament in an established game of skill. Machine learning procedures have been studied in computer games environments for many years, but learning programs typically have not achieved the levels of performance necessary to win in tournament competition. Neurogammon's achievement suggests that learning procedures may be more widely useful in other complex games such as chess and Go.

While Neurogammon can only be described as a strong program, it clearly has to improve in a number of ways before it can be claimed to perform at human expert level. One obvious improvement would be to train on a larger and more varied expert data set containing data from several experts exhibiting a variety of styles of play.
The move-selection networks in particular are dangerously inbred, having been trained only on data from one individual. Further significant improvement might be obtained by adding a small amount of look-ahead, perhaps by going to a 3-ply search.

Another way in which learning could be improved is by incorporating special knowledge about the symmetry or topology of the problem, either into the data representation scheme or into the network architecture. For example, one knows that the evaluation function should have an exact symmetry: if the configuration of the Black and White pieces is swapped, and the player to move is reversed, then the evaluation should exactly invert, i.e., the score S → -S. Also, one knows that backgammon has a one-dimensional spatial structure with approximate translation invariance. Explicitly building these symmetry principles into the learning system could enable it to extract better generalizations from a limited amount of training data.

Another direction which seems very promising is to abandon supervised learning altogether, and instead to learn by playing out a sequence of moves against an opponent and observing the outcome. Such a learning system could learn on its own, without the aid of an intelligent teacher, and in principle could continue to improve until it surpassed even the best human experts. Connectionist algorithms such as Sutton's temporal-difference algorithm [8] could be used in such an approach. It is not known whether such algorithms are efficient enough to tackle a problem of the scale and complexity of backgammon. However, it seems most likely that they could be used to learn a highly accurate end-game evaluation function. Since the late (but

Authorized licensed use limited to: University of Illinois. Downloaded on August 15, 2009 at 19:24 from IEEE Xplore. Restrictions apply.
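The colour-swap symmetry described above can be sketched as a data-augmentation step. The board encoding used here (a length-24 array of signed checker counts, positive for White and negative for Black) is an illustrative assumption rather than Neurogammon's actual input representation, and the helpers `swap_colours` and `augment` are hypothetical names:

```python
import numpy as np

def swap_colours(board):
    """Swap Black and White: mirror the point order and negate the counts."""
    return -board[::-1]

def augment(positions, scores):
    """Double a training set by pairing each position with its colour-swapped
    mirror and the inverted score (S -> -S)."""
    swapped = [swap_colours(b) for b in positions]
    inverted = [-s for s in scores]
    return positions + swapped, scores + inverted

board = np.zeros(24, dtype=int)
board[5] = 5                       # five White checkers on point 6
board[23] = -2                     # two Black checkers on point 24
mirror = swap_colours(board)       # two White on point 1, five Black on point 19
assert np.array_equal(swap_colours(mirror), board)  # swapping twice is the identity
```

The same trick would let a learner see every training position twice, once from each side, which is one concrete way of "explicitly building these symmetry principles into the learning system."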
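The self-play direction sketched in the last paragraph rests on Sutton's temporal-difference update [8]. A minimal sketch, using a linear value function for brevity rather than the multilayer networks Neurogammon uses (the class and parameter names `LinearTD` and `alpha` are invented for illustration):

```python
import numpy as np

class LinearTD:
    """Linear TD(0) evaluator: V(x) = w . x, updated from observed play."""
    def __init__(self, n_features, alpha=0.1):
        self.w = np.zeros(n_features)
        self.alpha = alpha

    def value(self, x):
        return float(self.w @ x)

    def update(self, x, reward, x_next=None):
        """TD(0): nudge V(x) toward reward (+ V(x_next) if non-terminal)."""
        target = reward + (self.value(x_next) if x_next is not None else 0.0)
        self.w += self.alpha * (target - self.value(x)) * x

# Learning the outcome of a one-step "game": state x with terminal reward +1.
td = LinearTD(n_features=4)
x = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(100):
    td.update(x, reward=1.0)
# td.value(x) now approaches 1.0
```

In full self-play training, `x` and `x_next` would be successive game positions, with a nonzero reward only at the end of the game, so value estimates propagate backwards from the outcome without any expert-labelled data.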
