
From now on, when referring to randomness tests, we will always assume that our space $\mathbf{X}$ has recognizable Boolean inclusions and hence has a universal test. We fix a universal test $t_\mu(x)$, and call the function
$$d_\mu(x) = \log t_\mu(x)$$
the deficiency of randomness of $x$ with respect to $\mu$. We call an element $x \in X$ random with respect to $\mu$ if $d_\mu(x) < \infty$.
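Since the defining property of a test is $\mu^x t_\mu(x) \le 1$, Markov's inequality bounds the measure of the set of elements with large deficiency:
$$\mu\{\, x : d_\mu(x) \ge m \,\} = \mu\{\, x : t_\mu(x) \ge 2^m \,\} \le 2^{-m},$$
so the set of non-random elements, those with $d_\mu(x) = \infty$, has $\mu$-measure $0$.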

Remark 3.2. Tests can be generalized to include an arbitrary parameter $y$: we can talk about the universal test
$$t_\mu(x \mid y),$$
where $y$ comes from some constructive topological space $\mathbf{Y}$. This is a maximal (within a multiplicative constant) lower semicomputable function $(x, y, \mu) \mapsto f(x, y, \mu)$ with the property $\mu^x f(x, y, \mu) \le 1$. ♦
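Spelled out, the defining inequality holds for each value of the parameter separately:
$$\mu^x\, t_\mu(x \mid y) \le 1 \quad \text{for every fixed } y \in Y,$$
so for a computable point $y$ the function $x \mapsto t_\mu(x \mid y)$ is itself a randomness test (though not necessarily the universal one).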

3.2. Conservation of randomness. For $i = 1, 0$, let $\mathbf{X}_i = (X_i, d_i, D_i, \alpha_i)$ be computable metric spaces, and let $\mathbf{M}_i = (\mathcal{M}(X_i), \sigma_i, \nu_i)$ be the effective topological space of probability measures over $X_i$. Let $\Lambda$ be a computable probability kernel from $\mathbf{X}_1$ to $\mathbf{X}_0$ as defined in Subsection 2.2. In the following theorem, the same notation $d_\mu(x)$ will refer to the deficiency of randomness with respect to two different spaces, $\mathbf{X}_1$ and $\mathbf{X}_0$, but this should not cause confusion. Let us first spell out the conservation theorem before interpreting it.

Theorem 2. For a computable probability kernel $\Lambda$ from $\mathbf{X}_1$ to $\mathbf{X}_0$, we have
$$\lambda_x^y\, t_{\Lambda^*\mu}(y) \stackrel{*}{<} t_\mu(x). \tag{3.3}$$

Proof. Let $t_\nu(x)$ be the universal test over $\mathbf{X}_0$. The left-hand side of (3.3) can be written as
$$u_\mu = \Lambda t_{\Lambda^*\mu}.$$
According to (B.4), we have $\mu u_\mu = (\Lambda^*\mu) t_{\Lambda^*\mu}$, which is $\le 1$ since $t$ is a test. If we show that $(\mu, x) \mapsto u_\mu(x)$ is lower semicomputable then the universality of $t_\mu$ will imply $u_\mu \stackrel{*}{<} t_\mu$. According to Proposition C.7, as a lower semicomputable function, $t_\nu(y)$ can be written as $\sup_n g_n(\nu, y)$, where $(g_n(\nu, y))$ is a computable sequence of computable functions. We pointed out in Subsection 2.2 that the function $\mu \mapsto \Lambda^*\mu$ is computable. Therefore the function $(n, \mu, x) \mapsto g_n(\Lambda^*\mu, f(x))$ is also computable. So, $u_\mu(x)$ is the supremum of a computable sequence of computable functions and as such, lower semicomputable. □
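To unpack the operator notation: applying $\Lambda$ to a function $g$ over $X_0$ yields the function $(\Lambda g)(x) = \lambda_x^y\, g(y)$ over $X_1$, so
$$u_\mu(x) = (\Lambda t_{\Lambda^*\mu})(x) = \lambda_x^y\, t_{\Lambda^*\mu}(y),$$
which is exactly the left-hand side of (3.3); and the identity $\mu(\Lambda g) = (\Lambda^*\mu) g$ invoked from (B.4) says that integrating $\Lambda g$ by $\mu$ is the same as integrating $g$ by the pushforward measure $\Lambda^*\mu$.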

It is easier to interpret the theorem first in the special case when $\Lambda = \Lambda_h$ for a computable function $h : X_1 \to X_0$, as in Example 2.7. Then the theorem simplifies to the following.

Corollary 3.3. For a computable function $h : X_1 \to X_0$, we have $d_{h^*\mu}(h(x)) \stackrel{+}{<} d_\mu(x)$.
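Indeed, for $\Lambda = \Lambda_h$ the kernel is deterministic, $\lambda_x = \delta_{h(x)}$, so the integral on the left-hand side of (3.3) collapses to a single value:
$$\lambda_x^y\, t_{\Lambda^*\mu}(y) = t_{h^*\mu}(h(x)) \stackrel{*}{<} t_\mu(x),$$
and taking logarithms turns the multiplicative inequality $\stackrel{*}{<}$ into the additive inequality $\stackrel{+}{<}$ of the corollary.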

Informally, this says that if $x$ is random with respect to $\mu$ in $\mathbf{X}_1$ then $h(x)$ is essentially at least as random with respect to the output distribution $h^*\mu$ in $\mathbf{X}_0$. Decrease in randomness can only be caused by complexity in the definition of the function $h$. It is even easier to interpret the theorem when $\mu$ is defined over a product space $X_1 \times X_2$, and $h(x_1, x_2) = x_1$ is the projection. The theorem then says, informally, that if the pair $(x_1, x_2)$ is random with respect to $\mu$ then $x_1$ is random with respect to the marginal $\mu_1 = h^*\mu$ of $\mu$. This is a very natural requirement: why would throwing away the information about $x_2$ affect the plausibility of the hypothesis that the outcome $x_1$ arose from the distribution $\mu_1$?
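In this projection case the corollary reads
$$d_{\mu_1}(x_1) \stackrel{+}{<} d_\mu(x_1, x_2),$$
so the deficiency of randomness of the marginal outcome $x_1$ cannot significantly exceed that of the joint outcome $(x_1, x_2)$; in particular, if $(x_1, x_2)$ is random with respect to $\mu$ then $x_1$ is random with respect to $\mu_1$.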
