July 24, 2009

A Rebuttal to Christopher Hillar and Friedrich Sommer’s Comment on Distilling Laws from Data

Michael Schmidt 1, Hod Lipson 2,3

Introduction

Hillar and Sommer recently published a comment [3] on the validity of the fitness criterion used to search for free-form natural laws in experimental data [1]. The comment presents experimental and theoretical arguments that this fitness criterion evaluates to zero, and asserts that the algorithm described in the paper cannot accomplish its stated goals. They further propose an alternative metric, which they claim to be better suited.

Rebuttal

We have carefully reviewed the arguments by Hillar and Sommer and have made the following observations:

1. We identified a significant mistake made by Hillar et al. in their calculation of our fitness function [ref 2, Equation S8]. Hillar et al. assumed that all variables are dependent on all other variables. This assumption led to a degenerate fitness of zero in all cases, which in turn led to their incorrect conclusion.

2. In Section S2 of our supplementary materials [ref 2, Page 4], we explicitly state that one “… should not assume that every variable is interdependent on all others.” We further provide an example for the three-dimensional case [ref 2, Equations S6 and S7] where all but one pair of variables is independent.

3. We have taken the liberty of correcting their accompanying Matlab code posted online [4] so that it performs the correct calculation (see Figure 1) and produces the correct result (Figure 2).
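Why the all-dependent assumption forces a zero fitness can be checked in a few lines from the expressions in Figure 1. This is a sketch under one assumption: the `pvd` helper is not shown in the source, so P = pvd(x,v,t) and Q = pvd(v,x,t) are taken here to be reciprocal estimates, PQ = 1. The symbols f_x, f_v, P, and Q are shorthand introduced for this check only.

```latex
\begin{align*}
f_x &= 2cx + bv^2, \qquad f_v = 2av + 2bvx, \\
\frac{D_{fy}}{D_{fx}}
  &= \frac{f_v + f_x P}{f_x + f_v Q}
   = \frac{P\,(f_v Q + f_x)}{f_x + f_v Q}
   = P \qquad (\text{using } PQ = 1,\; f_x + f_v Q \neq 0), \\
\mathrm{fffit}
  &= -\sum \log\bigl(1 + \lvert P - D_{fy}/D_{fx} \rvert\bigr)
   = -\sum \log 1 = 0 .
\end{align*}
```

Under that assumption the residual vanishes for every choice of a, b, and c, so the score carries no information about the candidate. With partial derivatives only, the ratio reduces to f_v / f_x, which does depend on the candidate.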

Further, in the Technical Comment [3], Hillar et al. propose a Hamiltonian fitness metric, HFit [ref 3, Equation 3], as a comparable alternative to our fitness metric [ref 2, Equation S8]. We must point out two important difficulties with the HFit metric:

1. As Hillar et al. note, their Hamiltonian fitness metric “would require additional measures to bias the search away from functions that are nearly constant” [ref 3, Page 2]. As we discuss in detail in our manuscript [1], avoiding such trivial solutions is the key difficulty in searching for invariant equations in the first place, and it is one that our method addresses. Therefore, Hillar’s metric would likely be inadequate in a computational search.

2. The HFit proposed by Hillar et al. handles a special case and could not be used to identify invariants of arbitrary form. The search for free-form invariants is the second key challenge we address with our method [1]. While Hillar’s metric may have uses in certain circumstances, it is less general and would likely be inadequate for an open-ended computational search seeking new invariants that do not necessarily follow a Hamiltonian form.

1 Computational Biology, Cornell University, Ithaca, NY 14853, USA

2 School of Mechanical & Aerospace Engineering, Cornell University, Ithaca, NY 14853, USA

3 Computing & Information Science, Cornell University, Ithaca, NY 14853, USA


Conclusions

In conclusion, an incorrect assumption about variable dependence by Hillar et al. led to a degenerate fitness calculation and to their incorrect conclusion. Our supplemental materials explicitly warn against this mistake. By editing three lines of the code posted online by Hillar et al., we were able to make it perform the correct calculation and yield the correct result. Further, the alternative function proposed by Hillar et al. is inadequate both because of its lack of generality and because of its inability to avoid trivial invariants, the two key challenges addressed in our original paper.

Hillar et al. code [4]:

function fffit = fffitness(a,b,c,x,v,t)
Dfx = 2*c*x + b*v.*v + (2*a*v+2*b*v.*x).*pvd(v,x,t);
Dfy = (2*a*v+2*b*v.*x) + (2*c*x + b*v.*v).*pvd(x,v,t);
fffit = -sum(log(1+abs(pvd(x,v,t)-Dfy./Dfx)));

Corrected code:

Dfx = 2*c*x + b*v.*v;
Dfy = 2*a*v + 2*b*v.*x;
fffit = -mean(log(1+abs(pvd(x,v,t)-Dfy./Dfx)));

Figure 1. The section of Hillar’s code, published online with the Technical Comment, where the mistake occurs. Hillar et al. assume all variables are dependent on each other; however, for a 2D system, both variables must be independent. With these three lines corrected, the code produces the correct fitness (see Figure 2).
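The difference between the two variants in Figure 1 can be reproduced on synthetic data. The following is a minimal Python sketch, not the authors' or Hillar et al.'s implementation. Several pieces are assumptions: the `pvd` helper is not shown in the source, so the version here is a hypothetical stand-in that returns -(dp/dt)/(dq/dt) from central differences, a sign convention chosen so that the residual in Figure 1 vanishes on a true invariant; the candidate form f = a*v^2 + b*x*v^2 + c*x^2 is inferred from the derivative expressions in Figure 1; the `assume_all_dependent` flag and the sampled trajectory are illustrative; and both variants use the mean here so their scores are on the same scale.

```python
import math


def central_diff(y, t):
    """Central-difference estimate of dy/dt at the interior sample points."""
    return [(y[i + 1] - y[i - 1]) / (t[i + 1] - t[i - 1])
            for i in range(1, len(y) - 1)]


def pvd(p, q, t):
    """Hypothetical stand-in for the pvd helper (not shown in the source).

    Assumed convention: returns -(dp/dt)/(dq/dt), which equals
    (df/dq)/(df/dp) for data lying on a level set of f.
    """
    dp, dq = central_diff(p, t), central_diff(q, t)
    return [-u / w for u, w in zip(dp, dq)]


def fitness(a, b, c, x, v, t, assume_all_dependent=False):
    """Score f = a*v^2 + b*x*v^2 + c*x^2 against the data (0 is best)."""
    pxv, pvx = pvd(x, v, t), pvd(v, x, t)
    xs, vs = x[1:-1], v[1:-1]  # align samples with the interior estimates
    total = 0.0
    for xi, vi, p, q in zip(xs, vs, pxv, pvx):
        fx = 2 * c * xi + b * vi * vi       # partial derivative df/dx
        fv = 2 * a * vi + 2 * b * vi * xi   # partial derivative df/dv
        if assume_all_dependent:
            # Total derivatives, as in the uncorrected lines of Figure 1
            dfx, dfy = fx + fv * q, fv + fx * p
        else:
            # Partial derivatives only, as in the corrected lines
            dfx, dfy = fx, fv
        total += math.log(1 + abs(p - dfy / dfx))
    return -total / len(xs)


# Harmonic oscillator x = cos(t), v = dx/dt, sampled over part of a period
# so that neither dx/dt nor dv/dt passes through zero.
t = [0.2 + 0.01 * k for k in range(100)]
x = [math.cos(s) for s in t]
v = [-math.sin(s) for s in t]

for a, b, c in [(1.0, 0.0, 1.0), (2.0, 0.0, 2.0), (1.0, 0.5, 3.0)]:
    print((a, b, c),
          fitness(a, b, c, x, v, t),
          fitness(a, b, c, x, v, t, assume_all_dependent=True))
```

In this sketch the partial-derivative variant scores the Hamiltonian and its multiples at essentially zero and penalizes the third candidate, while the all-dependent variant returns essentially zero for all three and so cannot tell them apart.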

Figure 2. The fitness landscapes of the harmonic oscillator reproduced from [3, 4]. The blue surface is our correctly implemented fitness function [2] and the green surface is the HFit [3] by Hillar et al. Our fitness function identifies all invariants of the harmonic oscillator in this space, which can be seen as a diagonal line of optima. These optima correspond to the Hamiltonian equation at different multiples or energies (e.g., if H is constant, so is cH for any real c). Also, note that the two surfaces are scaled differently, but both have favorable gradients toward the optima.
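The diagonal line of optima is consistent with the ratio form of the corrected calculation in Figure 1. A short check follows, under two assumptions the source does not state: that the plotted plane is a versus c with b = 0, and that the candidate is f = a v^2 + b x v^2 + c x^2 (inferred from the derivative expressions in Figure 1). The symbols f_x and f_v denote partial derivatives and k is an arbitrary nonzero constant.

```latex
\begin{align*}
f &= a v^2 + c x^2, \qquad
\frac{f_v}{f_x} = \frac{2 a v}{2 c x} = \frac{a}{c}\,\frac{v}{x}, \\
a &= c \;\Longrightarrow\; \frac{f_v}{f_x} = \frac{v}{x}
  \quad \text{for every common value of } a = c, \\
\frac{(k f)_v}{(k f)_x} &= \frac{k\, f_v}{k\, f_x} = \frac{f_v}{f_x}
  \quad \text{for any } k \neq 0 .
\end{align*}
```

Every candidate on the line a = c, that is, every multiple of H = v^2 + x^2, therefore yields the same ratio and the same fitness, which is why the optima form a line and not a single point.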


References

[1] Schmidt M., Lipson H. (2009) "Distilling Free-Form Natural Laws from Experimental Data," Science, Vol. 324, no. 5923, pp. 81-85.

[2] Schmidt M., Lipson H. (2009) "Distilling Free-Form Natural Laws from Experimental Data," Supplementary online materials.

[3] Hillar C., Sommer F. "On the article 'Distilling free-form natural laws from experimental data'," (accessed July 24, 2009) http://www.msri.org/people/members/chillar/files/hs09b.pdf

[4] Hillar C., Sommer F. "On the article 'Distilling free-form natural laws from experimental data'," Matlab files (accessed July 24, 2009) http://www.msri.org/people/members/chillar/articles.html
