06.06.2013 Views

Theory of Statistics - George Mason University

800 0 Statistical Mathematics

For a function $f : S \subseteq \mathbb{R}^n \to \mathbb{R}^m$, we define $\partial f^{\mathrm{T}}/\partial x$ to be the $n \times m$ matrix, which is the natural extension of $\partial/\partial x$ applied to a scalar function, and $\partial f/\partial x^{\mathrm{T}}$ to be its transpose, the $m \times n$ matrix. Although the notation $\partial f^{\mathrm{T}}/\partial x$ is more precise because it indicates that the elements of $f$ correspond to the columns of the result, we often drop the transpose in the notation. We have

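\[
\begin{aligned}
\frac{\partial f}{\partial x}
  &= \frac{\partial f^{\mathrm{T}}}{\partial x} \quad \text{by convention} \\
  &= \left[ \frac{\partial f_1}{\partial x} \;\cdots\; \frac{\partial f_m}{\partial x} \right] \\
  &= \begin{bmatrix}
       \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_2}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_1} \\
       \dfrac{\partial f_1}{\partial x_2} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_2} \\
       \vdots & \vdots & & \vdots \\
       \dfrac{\partial f_1}{\partial x_n} & \dfrac{\partial f_2}{\partial x_n} & \cdots & \dfrac{\partial f_m}{\partial x_n}
     \end{bmatrix}
\end{aligned}
\tag{0.3.48}
\]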

if those derivatives exist. This derivative is called the matrix gradient and is denoted by $G_f$ or $\nabla f$ for the vector-valued function $f$. (Note that the $\nabla$ symbol can denote either a vector or a matrix, depending on whether the function being differentiated is scalar-valued or vector-valued.)
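As a rough numerical illustration of the layout in (0.3.48), the following sketch approximates the matrix gradient by central differences. The helper `matrix_gradient`, the step size `h`, and the example map `f` from $\mathbb{R}^2$ to $\mathbb{R}^3$ are all assumptions made for illustration; none of them comes from the text.

```python
def matrix_gradient(f, x, h=1e-6):
    """Approximate the n x m matrix gradient of f at x by central differences.

    Entry [i][j] approximates d f_j / d x_i, so each column corresponds to
    one element of f, as in the layout of (0.3.48).
    """
    rows = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        rows.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    return rows


def f(x):
    # Hypothetical example: f(x) = (x1*x2, x1^2, 3*x2), so n = 2 and m = 3.
    return [x[0] * x[1], x[0] ** 2, 3.0 * x[1]]


G = matrix_gradient(f, [2.0, 5.0])
for row in G:
    print([round(v, 6) for v in row])
```

For this choice of `f`, the analytic matrix gradient at $x = (2, 5)$ has rows $(5, 4, 0)$ and $(2, 0, 3)$: two rows for the two elements of $x$ and three columns for the three elements of $f$.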

The $m \times n$ matrix $\partial f/\partial x^{\mathrm{T}} = (\nabla f)^{\mathrm{T}}$ is called the Jacobian of $f$ and is denoted by $J_f$:

\[
J_f = G_f^{\mathrm{T}} = (\nabla f)^{\mathrm{T}}. \tag{0.3.49}
\]

The absolute value of the determinant of the Jacobian appears in integrals involving a change of variables. (Occasionally, the term "Jacobian" is used to refer to the absolute value of the determinant rather than to the matrix itself.)
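A familiar instance of this is the change from polar to Cartesian coordinates, where the absolute determinant of the Jacobian equals $r$. The sketch below checks that numerically; the functions `jacobian`, `polar_to_cartesian`, and `abs_det_2x2` are hypothetical helpers written for this example, and the evaluation point is arbitrary.

```python
import math


def jacobian(f, x, h=1e-6):
    """Approximate the m x n Jacobian of f at x: entry [j][i] is d f_j / d x_i."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for j in range(m):
            J[j][i] = (f_plus[j] - f_minus[j]) / (2 * h)
    return J


def polar_to_cartesian(p):
    # Maps (r, theta) to (r cos(theta), r sin(theta)).
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]


def abs_det_2x2(A):
    # Absolute value of the determinant of a 2 x 2 matrix.
    return abs(A[0][0] * A[1][1] - A[0][1] * A[1][0])


r, theta = 2.0, 0.7
J = jacobian(polar_to_cartesian, [r, theta])
scale = abs_det_2x2(J)  # analytically, |det J| = r
print(scale)
```

The value `scale` is the factor that would multiply $dr\,d\theta$ when an integral over Cartesian coordinates is rewritten in polar coordinates.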

To emphasize that the quantities are functions of $x$, we sometimes write $\partial f(x)/\partial x$, $J_f(x)$, $G_f(x)$, or $\nabla f(x)$.

0.3.3.14 Derivatives of Matrices with Respect to Vectors

The derivative of a matrix with respect to a vector is a three-dimensional object that results from applying equation (0.3.45) to each of the elements of the matrix. For this reason, it is simpler to consider only the partial derivatives of the matrix $Y$ with respect to the individual elements of the vector $x$; that is, $\partial Y/\partial x_i$. The expressions involving the partial derivatives can be thought of as defining one two-dimensional layer of a three-dimensional object.
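To make the layered picture concrete, the sketch below builds the list of layers $\partial Y/\partial x_i$ for a small matrix-valued function. The helper `matrix_layers` and the example function `Y` are assumptions chosen for illustration, and the derivatives are approximated by central differences rather than computed symbolically.

```python
def matrix_layers(Y, x, h=1e-6):
    """Return [dY/dx_0, ..., dY/dx_{n-1}], one two-dimensional layer per element of x."""
    layers = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        Y_plus, Y_minus = Y(x_plus), Y(x_minus)
        layers.append([
            [(a - b) / (2 * h) for a, b in zip(row_plus, row_minus)]
            for row_plus, row_minus in zip(Y_plus, Y_minus)
        ])
    return layers


def Y(x):
    # Hypothetical 2 x 2 matrix whose entries depend on x = (x0, x1).
    return [[x[0], x[0] * x[1]],
            [x[1] ** 2, 1.0]]


layers = matrix_layers(Y, [2.0, 3.0])
# layers[i][j][k] approximates d Y[j][k] / d x_i: a 2 x 2 x 2 object.
```

Here `layers[0]` approximates $\partial Y/\partial x_0$ and `layers[1]` approximates $\partial Y/\partial x_1$, so the outer index runs over the elements of $x$ and each inner matrix has the shape of $Y$.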

Using the rules for differentiation of powers that result directly from the definitions, we can write the partial derivatives of the inverse of the matrix $Y$ as

\[
\frac{\partial}{\partial x} Y^{-1} = -Y^{-1} \left( \frac{\partial}{\partial x} Y \right) Y^{-1}. \tag{0.3.50}
\]

Theory of Statistics ©2000–2013 James E. Gentle
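Equation (0.3.50) can be spot-checked numerically for a matrix that depends on a single scalar. In the sketch below, the parameter `t`, the matrix function `Y`, its elementwise derivative `dY`, and the helpers `inv2` and `matmul` are all hypothetical choices for this example; the left-hand side is approximated by a central difference of the inverse.

```python
def inv2(A):
    # Inverse of a 2 x 2 matrix (assumes a nonzero determinant).
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def matmul(A, B):
    # Plain matrix product of nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def Y(t):
    return [[2.0 + t, 1.0], [t * t, 3.0]]


def dY(t):
    # Elementwise derivative of Y with respect to t.
    return [[1.0, 0.0], [2.0 * t, 0.0]]


t, h = 0.5, 1e-6
Y_inv = inv2(Y(t))

# Right-hand side of (0.3.50): -Y^{-1} (dY/dt) Y^{-1}.
rhs = [[-v for v in row] for row in matmul(matmul(Y_inv, dY(t)), Y_inv)]

# Left-hand side: central-difference approximation of d(Y^{-1})/dt.
inv_plus, inv_minus = inv2(Y(t + h)), inv2(Y(t - h))
lhs = [[(inv_plus[i][j] - inv_minus[i][j]) / (2 * h) for j in range(2)]
       for i in range(2)]

max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The two sides should agree up to finite-difference and rounding error, so `max_gap` is expected to be small compared with the entries themselves.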
