A Rebuttal to Christopher Hillar and Friedrich Sommer's Comment ...

July 24, 2009 

A Rebuttal to Christopher Hillar and Friedrich Sommer’s Comment on 

Distilling Laws from Data 

Michael Schmidt 1, Hod Lipson 2,3

Introduction 

Hillar and Sommer recently published a comment [3] on the validity of the fitness criterion used to 

search for free-form natural laws in experimental data [1]. The comment presents experimental and 

theoretical arguments that the particular fitness criterion evaluates to zero and asserts that the algorithm 

described in the paper cannot accomplish the stated goals. They further propose an alternative metric 

which they claim to be better suited. 

Rebuttal 

We have carefully reviewed the arguments by Hillar and Sommer and have made the following 

observations: 

1. We identified a significant mistake made by Hillar et al. in their calculation of our fitness 

function [ref 2, Equation S8]. Hillar et al. have assumed that all variables are dependent on all 

other variables. This assumption led to a degenerate fitness of zero in all cases, which then led 

to their incorrect conclusion. 

2. In Section S2 of our supplementary materials [ref 2, Page 4], we explicitly state that one “… 

should not assume that every variable is interdependent on all others.” We further provide an 

example for the three-dimensional case [ref 2, Equations S6 and S7] where all but one pair of 

variables are independent. 

3. We have taken the liberty of correcting their accompanying Matlab code posted online [4] so that 

it performs the correct calculation (see Figure 1) and produces the correct result (Figure 2). 
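
The degenerate result in observation 1 can be traced with a short worked equation. This is a sketch under stated assumptions, not a reproduction of the derivation in [3]: f_x and f_v denote the partial derivatives of a candidate function f(x, v), r denotes the ratio estimated from the data by pvd(x,v,t) in the code of [4], and pvd(v,x,t) is assumed to return its reciprocal 1/r. If every variable is treated as dependent on every other, both derivatives become total derivatives and their ratio collapses to r for any f:

```latex
\begin{aligned}
Df_x &= f_x + f_v \cdot \frac{1}{r}, \qquad Df_y = f_v + f_x \cdot r, \\
\frac{Df_y}{Df_x}
  &= \frac{f_v + f_x r}{f_x + f_v / r}
   = \frac{r\,(f_v + f_x r)}{f_x r + f_v}
   = r \qquad (f_x r + f_v \neq 0), \\
\left| r - \frac{Df_y}{Df_x} \right| &= 0
  \;\Longrightarrow\;
  -\sum \log\!\left(1 + \left| r - \frac{Df_y}{Df_x} \right|\right) = 0 .
\end{aligned}
```

Under these assumptions the residual vanishes independently of the coefficients of f, which is consistent with a fitness of zero in all cases.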

Further, in the Technical Comment [3], Hillar et al. propose a Hamiltonian fitness metric, HFit [ref 3, 

Equation 3] as a comparable fitness metric to our fitness metric [ref 2, Equation S8]. We must point out 

two important difficulties with the HFit metric: 

1. As Hillar et al. note, their Hamiltonian fitness metric “would require additional measures to bias 

the search away from functions that are nearly constant” [ref 3, Page 2]. As we discuss in detail 

in our original paper [1], avoiding such trivial solutions is the key difficulty in searching for invariant 

equations in the first place, and our method addresses it directly. Therefore, Hillar’s metric 

would likely be inadequate in a computational search. 

2. The HFit proposed by Hillar et al. handles a special case and could not be used to identify 

invariants of arbitrary form. The search for free-form invariants is the second key challenge we 

address with our method [1]. While Hillar’s metric may have uses in certain circumstances, it is 

less general, and would likely be inadequate for an open-ended computational search seeking to 

find new invariants that do not necessarily follow a Hamiltonian form. 

1 Computational Biology, Cornell University, Ithaca, NY 14853, USA 

2 School of Mechanical & Aerospace Engineering, Cornell University, Ithaca, NY 14853, USA 

3 Computing & Information Science, Cornell University, Ithaca, NY 14853, USA

Conclusions 

In conclusion, an incorrect assumption about variable dependence by Hillar et al. led to a 

degenerate fitness calculation and their incorrect conclusion. Further, avoiding this mistake is addressed 

explicitly in our supplemental materials. We were able to modify the code posted online by Hillar et al. 

to perform the correct calculation and yield the correct result by editing three lines of their code. 

Further, the alternative function proposed by Hillar et al. is inadequate both because of its lack of 

generality and because of its inability to avoid trivial invariants – the two key challenges addressed in 

our original paper. 

Hillar et al. code [4]:

    function fffit = fffitness(a,b,c,x,v,t)
    Dfx = 2*c*x + b*v.*v + (2*a*v+2*b*v.*x).*pvd(v,x,t);
    Dfy = (2*a*v+2*b*v.*x) + (2*c*x + b*v.*v).*pvd(x,v,t);
    fffit = -sum(log(1+abs(pvd(x,v,t)-Dfy./Dfx)));

Corrected code:

    Dfx = 2*c*x + b*v.*v;
    Dfy = 2*a*v + 2*b*v.*x;
    fffit = -mean(log(1+abs(pvd(x,v,t)-Dfy./Dfx)));

Figure 1. The section of Hillar’s code published online with the Technical Comment where the mistake occurs. 

Hillar et al. assume all variables are dependent on each other; however, for a 2D system, both variables must be 

independent. Correcting these three lines, the code produces the correct fitness (see Figure 2). 
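
The difference between the two versions in Figure 1 can be checked numerically. The following is a minimal Python transcription of those lines, not the Matlab code from [4]. Several pieces are assumptions made for illustration: `pair_ratio` is a hypothetical stand-in for `pvd(x,v,t)` whose sign convention is chosen to match `Dfy./Dfx` at a = c, b = 0 (the actual `pvd` in [4] may differ), `pvd(v,x,t)` is assumed to be its reciprocal, and the data are samples of a unit-frequency harmonic oscillator.

```python
import math


def pair_ratio(x, v):
    # Hypothetical stand-in for pvd(x,v,t); the sign convention is an assumption.
    return v / x


def partials(a, b, c, x, v):
    # Partial derivatives of f = a*v^2 + b*v^2*x + c*x^2, as in Figure 1.
    fx = 2 * c * x + b * v * v
    fv = 2 * a * v + 2 * b * v * x
    return fx, fv


def fitness_all_dependent(a, b, c, xs, vs):
    # Mirrors the lines from [4]: every variable is treated as dependent on
    # the other, so extra chain-rule terms are added to both derivatives.
    total = 0.0
    for x, v in zip(xs, vs):
        r = pair_ratio(x, v)
        fx, fv = partials(a, b, c, x, v)
        dfx = fx + fv * (1.0 / r)  # assumes pvd(v,x,t) == 1 / pvd(x,v,t)
        dfy = fv + fx * r
        total += math.log(1 + abs(r - dfy / dfx))
    return -total


def fitness_corrected(a, b, c, xs, vs):
    # Mirrors the corrected lines: plain partial derivatives, mean of residuals.
    total = 0.0
    for x, v in zip(xs, vs):
        r = pair_ratio(x, v)
        fx, fv = partials(a, b, c, x, v)
        total += math.log(1 + abs(r - fv / fx))
    return -total / len(xs)


# Samples of x = cos(t), v = -sin(t), chosen so that x and v are never zero.
ts = [0.2 + 0.1 * k for k in range(10)]
xs = [math.cos(t) for t in ts]
vs = [-math.sin(t) for t in ts]

for coeffs in [(1.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.5, 1.0)]:
    print(coeffs,
          fitness_all_dependent(*coeffs, xs, vs),
          fitness_corrected(*coeffs, xs, vs))
```

In this sketch the all-dependent version is zero up to round-off for every coefficient choice, so it cannot rank candidates, while the corrected version reaches its maximum of zero only where the candidate matches the data.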

Figure 2. The fitness landscapes of the harmonic oscillator reproduced from [3, 4]. The blue surface is our correctly 

implemented fitness function [2] and the green surface is the HFit [3] by Hillar et al. Our fitness function identifies 

all invariants of the harmonic oscillator in this space, which can be seen as a diagonal line of optima. These optima 

correspond to the Hamiltonian equation at different multiples or energies (e.g., if H is constant, so is cH for any real 

c). Also, note that the two surfaces are scaled differently, but both have favorable gradients toward the optima.

References 

[1] Schmidt M., Lipson H. (2009) "Distilling Free-Form Natural Laws from Experimental Data," 

Science, Vol. 324, No. 5923, pp. 81-85. 

[2] Schmidt M., Lipson H. (2009) "Distilling Free-Form Natural Laws from Experimental Data," 

Supplementary online materials. 

[3] Hillar C., Sommer F. "On the article 'Distilling free-form natural laws from experimental data'," 

(accessed July 24, 2009) http://www.msri.org/people/members/chillar/files/hs09b.pdf 

[4] Hillar C., Sommer F. "On the article 'Distilling free-form natural laws from experimental data'," 

Matlab files, (accessed July 24, 2009) 

http://www.msri.org/people/members/chillar/articles.html
