
can serve as a reasonable deficiency of randomness. (We will also use the test t = 2^d.) If we substitute m for µ in d_µ(x), we get 0. This substitution is not justified, of course. The fact that m is not a probability measure can be remedied, at least over N, using compactification as above and extending the notion of randomness tests. But the extended test can replace d_µ only for computable µ, while m is not computable. Anyway, this is the sense in which all outcomes might be considered random with respect to m, and the heuristic sense in which m may still be considered "neutral".

♦

Remark 5.7. Solomonoff proposed the use of a universal lower semicomputable semimeasure (actually, a closely related structure) for inductive inference in [18]. He proved in [19] that sequences emitted by any computable probability distribution can be predicted well by his scheme. It may be interesting to see whether the same prediction scheme has stronger properties when used with the truly neutral measure M of the present paper.

♦

6. RELATIVE ENTROPY<br />

Some properties of description complexity make it a good expression of the idea of individual information content.

6.1. Entropy. The entropy of a discrete probability distribution µ is defined as

H(\mu) = -\sum_x \mu(x) \log \mu(x).
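For concreteness, here is a minimal Python sketch of this definition, assuming a discrete distribution given as a dictionary of probabilities and logarithms taken base 2 (so entropy is measured in bits); the function name `entropy` is illustrative, not notation from the paper.

```python
import math

def entropy(mu, base=2.0):
    """Shannon entropy H(mu) = -sum_x mu(x) log mu(x) of a discrete distribution.

    `mu` maps outcomes to probabilities; terms with mu(x) = 0 contribute
    nothing, by the usual convention 0 log 0 = 0.
    """
    return -sum(p * math.log(p, base) for p in mu.values() if p > 0)

print(entropy({"heads": 0.5, "tails": 0.5}))  # 1.0 bit for a fair coin
print(entropy({"heads": 0.9, "tails": 0.1}))  # ~0.469 bits for a biased coin
```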

To generalize entropy to continuous distributions, the relative entropy is defined as follows. Let µ, ν be two measures, where µ is taken (typically, but not always) to be a probability measure, and ν another measure, which can also be a probability measure but most frequently is not. We define the relative entropy H_ν(µ) as follows. If µ is not absolutely continuous with respect to ν then H_ν(µ) = −∞. Otherwise, writing

\frac{d\mu}{d\nu} = \frac{\mu(dx)}{\nu(dx)} =: f(x)

for the (Radon–Nikodym) derivative (density) of µ with respect to ν, we define

H_\nu(\mu) = -\int \log \frac{d\mu}{d\nu} \, d\mu = -\mu^x \log \frac{\mu(dx)}{\nu(dx)} = -\nu^x f(x) \log f(x).

Thus, H(µ) = H_#(µ) is a special case, where # denotes the counting measure.
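This special case can be checked directly on finite examples. Below is a minimal Python sketch, assuming discrete measures represented as dictionaries of nonnegative weights; `relative_entropy` is an illustrative helper, not notation from the paper.

```python
import math

def relative_entropy(mu, nu, base=2.0):
    """H_nu(mu) = -sum_x mu(x) log(mu(x)/nu(x)) for discrete measures.

    `mu` is a probability measure, `nu` an arbitrary measure, both given as
    dicts of nonnegative weights.  Returns -inf when mu is not absolutely
    continuous with respect to nu (some x has mu(x) > 0 but nu(x) = 0).
    """
    h = 0.0
    for x, p in mu.items():
        if p == 0:
            continue
        q = nu.get(x, 0.0)
        if q == 0:
            return float("-inf")
        h -= p * math.log(p / q, base)
    return h

mu = {"a": 0.5, "b": 0.25, "c": 0.25}
counting = {x: 1.0 for x in mu}          # the counting measure on the support
print(relative_entropy(mu, counting))    # 1.5 = H(mu), the ordinary entropy
```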

Example 6.1. Let f(x) be a probability density function for the distribution µ over the real line, and let λ be the Lebesgue measure there. Then

H_\lambda(\mu) = -\int f(x) \log f(x) \, dx.
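As a numeric sanity check of this formula, the following Python sketch approximates H_λ(µ) for a standard normal density by a midpoint Riemann sum and compares it with the known closed form ½ log₂(2πeσ²) for a Gaussian; the function names, integration bounds, and step count are illustrative assumptions.

```python
import math

def normal_pdf(x, sigma=1.0):
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def differential_entropy(pdf, lo=-10.0, hi=10.0, n=100_000):
    """Approximate H_lambda(mu) = -integral f(x) log2 f(x) dx by a midpoint Riemann sum."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        fx = pdf(lo + (i + 0.5) * dx)
        if fx > 0:
            total -= fx * math.log(fx, 2) * dx
    return total

# For a Gaussian with variance sigma^2 the exact value is (1/2) log2(2 pi e sigma^2).
print(differential_entropy(normal_pdf))         # ~2.047
print(0.5 * math.log(2 * math.pi * math.e, 2))  # ~2.047
```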

In information theory and statistics, when both µ and ν are probability measures, −H_ν(µ) is also denoted D(µ ‖ ν), and called (after Kullback) the information divergence of the two measures. It is frequently used in the role of a distance between µ and ν. It is not symmetric and does not obey the triangle inequality, but it can be shown to be nonnegative. Let us prove this nonnegativity: in our terms, it says that the relative entropy H_ν(µ) is nonpositive when both µ and ν are probability measures.
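Before the proof, a small numeric illustration of nonnegativity (and a reminder of asymmetry) as a Python sketch over randomly drawn finite distributions; `kl_divergence` and `random_distribution` are illustrative helpers, not part of the paper.

```python
import math
import random

def kl_divergence(mu, nu):
    """D(mu || nu) = sum_x mu(x) log2(mu(x)/nu(x)) = -H_nu(mu), in bits."""
    return sum(p * math.log(p / nu[x], 2) for x, p in mu.items() if p > 0)

def random_distribution(support):
    weights = [random.random() for _ in support]
    total = sum(weights)
    return {x: w / total for x, w in zip(support, weights)}

random.seed(0)
for _ in range(1000):
    mu = random_distribution("abcd")
    nu = random_distribution("abcd")
    # Nonnegativity holds (up to floating-point rounding); note that in
    # general D(mu || nu) != D(nu || mu), so D is not a metric.
    assert kl_divergence(mu, nu) >= -1e-12
```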
