The conditional entropy must be minimized in order to improve model performance. Acceptable performance is reached when the dynamics of the oil flow are taken into account by including the sample $OF_{i-30}$ in the input vector. The problem then becomes auto-regressive, and the new input vector is the following:

$[\, DOL_{i-33} \;\; SOL_{i-41} \;\; BSW_{i-30} \;\; RGO_{i-45} \;\; OF_{i-30} \,]$

With this input vector the joint conditional entropy is 0.1377. Note that, besides the inclusion of the sample $OF_{i-30}$, the earlier BSW sample was replaced by $BSW_{i-30}$ in order to predict the oil flow 30 days ahead. Moreover, Figure 7 shows that at lag 30 the XEF has the acceptable value R = 0.69. This last model predicts the oil flow with an average MSE of 1.07e-5 and a standard deviation of 2.14e-6. These results are based on 100 experiments with a validation data set. The engineer can take the XEF suggestion as a reference; however, other variables may be included, or their lags changed, provided that the joint conditional entropy improves.

Conclusion

Despite the existence of a non-linear dynamic relationship between variables, the XEF allows the correct selection of variables and their lags. This tool is the main contribution of this work. The authors also note that tools from information theory, usually applied in telecommunications, are also suitable for use in neural models.

Considering the dependence among the input variables, the joint conditional entropy complements the XEF in the choice of the regression vector, because it allows a final test. Research is under way to develop an extension of the XEF approach that deals with significant dependence among the input variables. In this new approach, a Genetic Algorithm is applied to estimate the lags of the input variables. The joint conditional entropy is used as the fitness function, and the chromosomes are composed of input variables with different lags.
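The joint conditional entropy used above, both as the final test of a regression vector and as the proposed fitness function, can be illustrated with a short sketch. It assumes a simple plug-in estimate H(Y | X) = H(X, Y) - H(X) over equal-width histogram bins; the text does not specify the estimator actually used, so this choice, the function names (`lagged_matrix`, `discretize`, `entropy`, `conditional_entropy`), the placeholder series names, and the synthetic data are all assumptions made for illustration. Only the lags (33, 41, 30, 45, 30) are taken from the input vector above.

```python
import numpy as np

def lagged_matrix(series, lags, target):
    """Row i holds series[name][i - lag] for each (name, lag); y holds series[target][i]."""
    max_lag = max(lag for _, lag in lags)
    n = len(series[target])
    cols = [np.asarray(series[name])[max_lag - lag : n - lag] for name, lag in lags]
    return np.column_stack(cols), np.asarray(series[target])[max_lag:]

def discretize(a, bins):
    """Map each column to integer bin indices 0..bins-1 (equal-width bins)."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    out = np.empty(a.shape, dtype=int)
    for j in range(a.shape[1]):
        edges = np.linspace(a[:, j].min(), a[:, j].max(), bins + 1)
        out[:, j] = np.digitize(a[:, j], edges[1:-1])
    return out

def entropy(states):
    """Shannon entropy in bits of the empirical distribution of the rows."""
    _, counts = np.unique(states, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def conditional_entropy(x, y, bins=8):
    """Plug-in estimate of H(Y | X) = H(X, Y) - H(X)."""
    xd = discretize(x, bins)
    yd = discretize(y, bins)
    return entropy(np.hstack([xd, yd])) - entropy(xd)

# Synthetic stand-in data (NOT the oil-field measurements from the paper).
rng = np.random.default_rng(0)
n = 500
series = {name: rng.random(n) for name in ("v1", "v2", "v3", "v4", "oil_flow")}

# Lags taken from the input vector in the text; series names are placeholders.
lags = [("v1", 33), ("v2", 41), ("v3", 30), ("v4", 45), ("oil_flow", 30)]
X, y = lagged_matrix(series, lags, "oil_flow")
print(X.shape, conditional_entropy(X, y, bins=4))
```

A lower value means the lagged inputs leave less uncertainty about the target, which is why the quantity can serve as a fitness to be minimized. The plug-in estimate is biased toward zero when the joint histogram is sparse (many inputs, many bins, few samples), so values from this sketch should not be compared with the figures reported above.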
The input vector resulting from the XEF analysis is seeded into the initial population in order to accelerate the convergence of the algorithm.

Acknowledgment

This work was supported by Petrobras, the Institute of Systems and Robotics (ISR-UC), and Fundação para a Ciência e Tecnologia under grants POSC/EEA-SRI/58016/2004 and PTDC/EEAACR/72226/2006.