08.10.2016 Views

Foundations of Data Science


Example: Let $f(x) = x^k$ for $k$ an even positive integer. Then $f''(x) = k(k-1)x^{k-2}$, which, since $k-2$ is even, is nonnegative for all $x$, implying that $f$ is convex. Convexity gives $(E(x))^k \le E(x^k)$, and thus

$$E(x) \le \sqrt[k]{E(x^k)},$$

since $t^{1/k}$ is a monotone function of $t$, $t > 0$. It is easy to see that this inequality does not necessarily hold when $k$ is odd; indeed, for odd $k$, $x^k$ is not a convex function.
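The example above can be checked numerically. The following is a minimal sketch, assuming a random variable that takes each value in a small list with equal probability; the sample values and helper names are illustrative and not from the text. For odd $k$ the sketch compares $(E(x))^k$ with $E(x^k)$ directly, which is equivalent because $t \mapsto t^k$ is monotone for odd $k$, and it avoids taking a real root of a negative number.

```python
def mean(values):
    """Expectation of a variable that takes each listed value with equal probability."""
    return sum(values) / len(values)


def kth_root_of_kth_moment(values, k):
    """Return (E[x^k])^(1/k); intended for even k, where E[x^k] is nonnegative."""
    return mean([v ** k for v in values]) ** (1.0 / k)


# Illustrative sample (an assumption, not data from the text).
sample = [-3.0, -1.0, 0.5, 2.0, 4.0]

# For even k, E(x) <= (E(x^k))^(1/k) should hold.
even_k_holds = all(
    mean(sample) <= kth_root_of_kth_moment(sample, k) for k in (2, 4, 6)
)

# For odd k the inequality can fail: x is -2 or 1 with equal probability, k = 3.
odd_sample = [-2.0, 1.0]
mean_cubed = mean(odd_sample) ** 3                  # (-0.5)^3 = -0.125
third_moment = mean([v ** 3 for v in odd_sample])   # (-8 + 1) / 2 = -3.5
odd_k_fails = mean_cubed > third_moment             # so E(x) > (E(x^3))^(1/3)

print(even_k_holds, odd_k_fails)
```

Here the odd case fails because the negative value dominates the third moment, which is exactly where $x^3$ is concave rather than convex.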

Tails of Gaussian

For bounding the tails of Gaussian densities, the following inequality is useful. The proof uses a technique useful in many contexts. For $t > 0$,

$$\int_{x=t}^{\infty} e^{-x^2}\,dx \le \frac{e^{-t^2}}{2t}.$$

In proof, first write $\int_{x=t}^{\infty} e^{-x^2}\,dx \le \int_{x=t}^{\infty} \frac{x}{t}\, e^{-x^2}\,dx$, using the fact that $x \ge t$ in the range of integration. The latter expression is integrable in closed form since $d(e^{-x^2}) = (-2x)e^{-x^2}\,dx$, yielding the claimed bound.
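As a quick sanity check of the tail bound, the sketch below compares the exact integral with $e^{-t^2}/(2t)$ for a few values of $t$. It relies on the standard identity $\int_t^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erfc}(t)$ and on `math.erfc` from the Python standard library; the function names and the chosen values of $t$ are illustrative assumptions.

```python
import math


def gaussian_tail(t):
    """Integral of exp(-x^2) from t to infinity, via (sqrt(pi) / 2) * erfc(t)."""
    return 0.5 * math.sqrt(math.pi) * math.erfc(t)


def tail_bound(t):
    """Upper bound exp(-t^2) / (2t), valid for t > 0."""
    return math.exp(-t * t) / (2.0 * t)


# Illustrative values of t (assumptions chosen for the check).
for t in (0.1, 0.5, 1.0, 2.0, 3.0):
    exact = gaussian_tail(t)
    bound = tail_bound(t)
    assert exact <= bound
    print(f"t={t}: integral={exact:.6e}, bound={bound:.6e}")
```

The bound is loose for small $t$, where the factor $x/t$ overestimates badly, and becomes tight as $t$ grows.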

A similar technique yields an upper bound on

$$\int_{x=\beta}^{1} (1 - x^2)^{\alpha}\,dx,$$

for $\beta \in (0, 1]$ and $\alpha > 0$. Just use $(1 - x^2)^{\alpha} \le \frac{x}{\beta}(1 - x^2)^{\alpha}$ over the range and integrate the last expression in closed form.

$$\int_{x=\beta}^{1} (1 - x^2)^{\alpha}\,dx \le \int_{x=\beta}^{1} \frac{x}{\beta}(1 - x^2)^{\alpha}\,dx = \left.\frac{-1}{2\beta(\alpha + 1)}(1 - x^2)^{\alpha+1}\right|_{x=\beta}^{1} = \frac{(1 - \beta^2)^{\alpha+1}}{2\beta(\alpha + 1)}$$
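The closed-form bound can be compared against a numerical estimate of the integral. This is a minimal sketch using a plain midpoint rule; the helper names, the step count, and the $(\alpha, \beta)$ pairs are illustrative assumptions, and $\beta$ is taken strictly positive so that the bound is finite.

```python
def midpoint_integral(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def closed_form_bound(alpha, beta):
    """Upper bound (1 - beta^2)^(alpha + 1) / (2 * beta * (alpha + 1)), for beta > 0."""
    return (1 - beta ** 2) ** (alpha + 1) / (2 * beta * (alpha + 1))


# Illustrative (alpha, beta) pairs (assumptions chosen for the check).
for alpha, beta in [(0.5, 0.2), (1.0, 0.5), (3.0, 0.8)]:
    numeric = midpoint_integral(lambda x: (1 - x * x) ** alpha, beta, 1.0)
    bound = closed_form_bound(alpha, beta)
    assert numeric <= bound
    print(f"alpha={alpha}, beta={beta}: integral={numeric:.6f}, bound={bound:.6f}")
```

As with the Gaussian tail, the bound tightens as $\beta$ approaches 1, since $x/\beta$ is then close to 1 over the whole range.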

12.4 Probability

Consider an experiment such as flipping a coin whose outcome is determined by chance. To talk about the outcome of a particular experiment, we introduce the notion of a random variable whose value is the outcome of the experiment. The set of possible outcomes is called the sample space. If the sample space is finite, we can assign a probability of occurrence to each outcome. In some situations where the sample space is infinite, we can assign a probability of occurrence. The probability $p(i) = \frac{6}{\pi^2}\frac{1}{i^2}$ for $i$ an integer greater than or equal to one is such an example. The function assigning the probabilities is called
