methods on some particular difficulties, and fostering the development of new algorithms. The results indicated that causal discovery from observational data is not an impossible task, but a very hard one, and pointed to the need for further research and benchmarks (Guyon et al., 2008). The Causal Explorer package (Aliferis et al., 2003), which we had made available to the participants and is downloadable as shareware, proved to be competitive and is a good starting point for researchers new to the field. It is a Matlab toolkit supporting “local” causal discovery algorithms, which efficiently discover the causal structure around a target variable, even for a large number of variables. The algorithms are based on structure learning from tests of conditional independence, as were all the top-ranking methods in this first challenge.

The first challenge (Guyon et al., 2008) explored an important problem in causal modeling, but is only one of many possible problem statements. The second challenge (Guyon et al., 2010), called “competition pot-luck”, aimed at enlarging the scope of causal discovery algorithm evaluation by inviting members of the community to submit their own problems and/or solve problems proposed by others. The challenge started September 15, 2008 and ended November 20, 2008; see http://www.causality.inf.ethz.ch/pot-luck.php. One task proposed by a participant drew a lot of attention: the cause-effect pair task. The problem was to determine, in pairs of variables (of known causal relationships), which one was the cause of the other.
This problem is hard for many algorithms, which rely on the results of conditional independence tests involving three or more variables. Yet the winners of the challenge succeeded in unraveling the correct causal direction in 8 out of 8 pairs (Zhang and Hyvärinen, 2009).

Our planned challenge ExpDeCo (Experimental Design in Causal Discovery) will benchmark methods of experimental design in application to causal modeling. The goal will be to identify effective methods to unravel causal models, requiring a minimum of experimentation, using the Virtual Lab. A budget of virtual cash will be allocated to participants to “buy” the right to observe or manipulate certain variables, manipulations being more expensive than observations. The participants will have to spend their budget optimally to make the best possible predictions on test data. This setup lends itself to incorporating problems of relevance to development projects, in particular in medicine and epidemiology, where experimentation is difficult, while developing new methodology.

We are planning another challenge called CoMSICo, for “Causal Models for System Identification and Control”, which is more ambitious in nature because it will perform a continuous evaluation of causal models rather than separating training and test phases.
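The ExpDeCo budget mechanism described above can be pictured as a simple ledger. The following is a minimal sketch under stated assumptions, not the Virtual Lab interface: the class `Ledger`, the exception `InsufficientFunds`, the two prices, and the variable names are all hypothetical. The text specifies only that manipulations cost more than observations.

```python
from dataclasses import dataclass, field

# Hypothetical prices: the text only says manipulations cost more than observations.
OBSERVE_PRICE = 1
MANIPULATE_PRICE = 5


class InsufficientFunds(Exception):
    """Raised when a request would overdraw the participant's budget."""


@dataclass
class Ledger:
    """Tracks one participant's virtual cash (illustrative, not the Virtual Lab API)."""
    balance: int
    history: list = field(default_factory=list)

    def _charge(self, kind, variable, price):
        # Refuse the purchase outright if it cannot be paid for.
        if price > self.balance:
            raise InsufficientFunds(
                f"{kind} {variable} costs {price}, balance is {self.balance}"
            )
        self.balance -= price
        self.history.append((kind, variable, price))

    def observe(self, variable):
        """Buy the right to observe a variable."""
        self._charge("observe", variable, OBSERVE_PRICE)

    def manipulate(self, variable):
        """Buy the right to set a variable by intervention."""
        self._charge("manipulate", variable, MANIPULATE_PRICE)


# Hypothetical variable names, chosen only for illustration.
ledger = Ledger(balance=10)
ledger.observe("smoking")
ledger.manipulate("smoking")
ledger.observe("lung_cancer")
print(ledger.balance)  # 10 - 1 - 5 - 1 = 3
```

With the remaining balance of 3, a further manipulation would be refused, which is the constraint that forces participants to choose between cheap observations and costly experiments.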
In contrast with ExpDeCo, in which the organizers will provide test data with prescribed manipulations to test the ability of the participants to predict the consequences of actions, in CoMSICo the participants will be in charge of making their own plan of action (policy) to optimize an overall objective (e.g., improve the life expectancy of a population or improve the GNP), and they will be judged directly on this objective, on an ongoing basis, with no distinction between “training” and “test” data. This challenge will also be run via the Virtual Lab. The participants will be given an initial amount of virtual cash and, as previously, both actions and observations will
have a price. New in CoMSICo, virtual cash rewards will be given for achieving good intermediate performance, which the participants will be allowed to re-invest to conduct additional experiments and improve their plan of action (policy). The winner will be the participant ending up with the largest amount of virtual cash.

6. Conclusion

Our program of data exchange and benchmarks proposes to challenge the research community with a wide variety of problems from many domains and focuses on realistic settings. Causal discovery is a problem of fundamental and practical interest in many areas of science and technology, and there is a need for assisting policy making in all these areas while reducing the costs of data collection and experimentation. Hence, the identification of efficient techniques to solve causal problems will have a widespread impact. By choosing applications from a variety of domains and making connections between disciplines as varied as machine learning, causal discovery, experimental design, decision making, optimization, system identification, and control, we anticipate that there will be a lot of cross-fertilization between different domains.

Acknowledgments

This project is an activity of the Causality Workbench supported by the Pascal network of excellence funded by the European Commission and by the U.S. National Science Foundation under Grant No. ECCS-0725746.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. We are very grateful to all the members of the Causality Workbench team for their contributions, and in particular to our co-founders Constantin Aliferis, Greg Cooper, André Elisseeff, Jean-Philippe Pellet, Peter Spirtes, and Alexander Statnikov.

References

C. F. Aliferis, I. Tsamardinos, A. Statnikov, and L. E. Brown. Causal explorer: A probabilistic network learning toolkit for biomedical discovery. In 2003 International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences (METMBS), Las Vegas, Nevada, USA, June 23-26 2003. CSREA Press.

Constantin Aliferis. A Temporal Representation and Reasoning Model for Medical Decision-Support Systems. PhD thesis, University of Pittsburgh, 1998.

C. Glymour and G. F. Cooper, editors. Computation, Causation, and Discovery. AAAI Press/The MIT Press, Menlo Park, California, Cambridge, Massachusetts, London, England, 1999.

I. Guyon, C. Aliferis, G. Cooper, A. Elisseeff, J.-P. Pellet, P. Spirtes, and A. Statnikov. Design and analysis of the causation and prediction challenge. In JMLR W&CP,