uniform test of algorithmic randomness over a general ... - CiteSeerX
PETER GÁCS
can serve as a reasonable deficiency of randomness. (We will also use the test t = 2^d.) If we substitute m for µ in d_µ(x), we get 0. This substitution is not justified, of course. The fact that m is not a probability measure can be helped, at least over N, using compactification as above, and extending the notion of randomness tests. But the extended test d̄_µ can replace d_µ only for computable µ, while m is not computable. Anyway, this is the sense in which all outcomes might be considered random with respect to m, and the heuristic sense in which m may still be considered “neutral”.
♦<br />
Remark 5.7. Solomonoff proposed the use of a universal lower semicomputable semimeasure (actually, a closely related structure) for inductive inference in [18]. He proved in [19] that sequences emitted by any computable probability distribution can be predicted well by his scheme. It may be interesting to see whether the same prediction scheme has stronger properties when used with the truly neutral measure M of the present paper.
♦<br />
6. RELATIVE ENTROPY<br />
Some properties of description complexity make it a good expression of the idea of individual information content.

6.1. Entropy. The entropy of a discrete probability distribution µ is defined as

    H(µ) = − ∑_x µ(x) log µ(x).
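As a small numerical illustration (ours, not the paper's) of this formula, the following Python sketch computes H(µ) for a finite distribution given as a list of probabilities. We take logarithms to base 2, since the text leaves the base unspecified.

```python
import math

def entropy(p):
    """Shannon entropy H(mu) = -sum_x mu(x) log2 mu(x), in bits.

    Zero-probability terms contribute 0, by the convention 0 log 0 = 0.
    """
    return -sum(px * math.log2(px) for px in p if px > 0)

print(entropy([0.5, 0.5]))   # 1.0 (a fair coin carries one bit)
print(entropy([0.25] * 4))   # 2.0 (uniform over four outcomes)
```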
To generalize entropy to continuous distributions, the relative entropy is defined as follows. Let µ, ν be two measures, where µ is taken (typically, but not always) to be a probability measure, and ν another measure that can also be a probability measure but is most frequently not. We define the relative entropy H_ν(µ) as follows. If µ is not absolutely continuous with respect to ν then H_ν(µ) = −∞. Otherwise, writing

    dµ/dν = µ(dx)/ν(dx) =: f(x)

for the (Radon–Nikodym) derivative (density) of µ with respect to ν, we define

    H_ν(µ) = −∫ log(dµ/dν) dµ = −µ^x log(µ(dx)/ν(dx)) = −ν^x f(x) log f(x).
Thus, H(µ) = H_#(µ), with # the counting measure, is a special case.
Example 6.1. Let f(x) be a probability density function for the distribution µ over the real line, and let λ be the Lebesgue measure there. Then

    H_λ(µ) = −∫ f(x) log f(x) dx.
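This integral can be checked numerically in a simple case. The following sketch (an illustration of ours; the helper names are not from the paper) approximates H_λ(µ) for the standard Gaussian density by a midpoint Riemann sum and compares it with the known closed-form value ½ log(2πe), using natural logarithms.

```python
import math

def gaussian_density(x, sigma=1.0):
    """Density of the centered normal distribution with standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def differential_entropy(f, lo, hi, n=100000):
    """Approximate H_lambda(mu) = -integral f(x) log f(x) dx by a midpoint sum (nats)."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        fx = f(lo + (i + 0.5) * dx)
        if fx > 0:  # convention 0 log 0 = 0
            total += fx * math.log(fx) * dx
    return -total

approx = differential_entropy(gaussian_density, -10.0, 10.0)
exact = 0.5 * math.log(2 * math.pi * math.e)  # ≈ 1.4189 nats for sigma = 1
print(approx, exact)
```

Truncating the integral to [−10, 10] is harmless here, since the Gaussian tail beyond that range is negligible.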
In information theory and statistics, when both µ and ν are probability measures, −H_ν(µ) is also denoted D(µ ‖ ν), and called (after Kullback) the information divergence of the two measures. It is frequently used in the role of a distance between µ and ν, even though it is not symmetric and does not obey the triangle inequality; it can be shown, however, to be nonnegative. Let us prove the latter property: in our terms, it says that the relative entropy is nonpositive when both µ and ν are probability measures.
♦
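To make the divergence concrete (a numerical sketch of ours, not from the paper): for finite distributions, D(µ ‖ ν) = ∑_x µ(x) log(µ(x)/ν(x)) = −H_ν(µ), and a small computation exhibits both its nonnegativity and its asymmetry. Natural logarithms are used.

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_x p(x) log(p(x)/q(x)) in nats, i.e. -H_q(p) in the text's notation.

    If p is not absolutely continuous w.r.t. q (some q(x) = 0 < p(x)), D is +infinity.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return math.inf
            total += px * math.log(px / qx)
    return total

p, q = [0.5, 0.5], [0.9, 0.1]
print(kl_divergence(p, q))  # ≈ 0.5108, nonnegative
print(kl_divergence(q, p))  # ≈ 0.3681, different: D is not symmetric
print(kl_divergence(p, p))  # 0.0
```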