uniform test of algorithmic randomness over a general ... - CiteSeerX
PETER GÁCS
From now on, when referring to randomness tests, we will always assume that our space $\mathbf{X}$ has recognizable Boolean inclusions and hence has a universal test. We fix a universal test $t_\mu(x)$, and call the function
$$d_\mu(x) = \log t_\mu(x)$$
the deficiency of randomness of $x$ with respect to $\mu$. We call an element $x \in X$ random with respect to $\mu$ if $d_\mu(x) < \infty$.
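To make the definition concrete, here is a toy finite analogue (the measure, the test function, and all names are illustrative choices, not the universal test of the text): under the uniform measure on bit strings of length $n$, the function $t(x) = 2^{z(x)}/(n+1)$, where $z(x)$ counts leading zeros, has $\mu$-expectation $(n/2+1)/(n+1) \le 1$ and so qualifies as a test; its logarithm plays the role of the deficiency.

```python
# Toy finite analogue of a randomness test (illustrative, NOT the
# universal test of the text): mu is uniform on {0,1}^n, and
# t(x) = 2^(leading zeros of x) / (n + 1) is a mu-test, since its
# mu-expectation is (n/2 + 1)/(n + 1) <= 1.
from itertools import product
from math import log2

n = 8

def leading_zeros(x):
    z = 0
    for b in x:
        if b != '0':
            break
        z += 1
    return z

def t(x):
    # a valid mu-test for the uniform measure on {0,1}^n
    return 2 ** leading_zeros(x) / (n + 1)

def d(x):
    # toy "deficiency of randomness": d(x) = log t(x)
    return log2(t(x))

strings = [''.join(bits) for bits in product('01', repeat=n)]
mu = 1 / len(strings)                      # uniform measure

expectation = sum(mu * t(x) for x in strings)
assert expectation <= 1                    # the defining property of a test

print(d('00000000'))   # high deficiency: 8 - log2(9), about 4.83
print(d('01101001'))   # low deficiency: 1 - log2(9), about -2.17
```

Only the conspicuously regular strings get a large deficiency; a typical string gets a negative one.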
Remark 3.2. Tests can be generalized to include an arbitrary parameter $y$: we can talk about the universal test
$$t_\mu(x \mid y),$$
where $y$ comes from some constructive topological space $\mathbf{Y}$. This is a maximal (within a multiplicative constant) lower semicomputable function $(x, y, \mu) \mapsto f(x, y, \mu)$ with the property $\mu^x f(x, y, \mu) \le 1$. ♦
3.2. Conservation of randomness. For $i = 1, 0$, let $\mathbf{X}_i = (X_i, d_i, D_i, \alpha_i)$ be computable metric spaces, and let $\mathbf{M}_i = (\mathcal{M}(X_i), \sigma_i, \nu_i)$ be the effective topological space of probability measures over $X_i$. Let $\Lambda$ be a computable probability kernel from $X_1$ to $X_0$ as defined in Subsection 2.2. In the following theorem, the same notation $d_\mu(x)$ will refer to the deficiency of randomness with respect to two different spaces, $X_1$ and $X_0$, but this should not cause confusion. Let us first spell out the conservation theorem before interpreting it.

Theorem 2. For a computable probability kernel $\Lambda$ from $X_1$ to $X_0$, we have
$$\lambda_x^y\, t_{\Lambda^*\mu}(y) \stackrel{*}{<} t_\mu(x). \tag{3.3}$$
Proof. Let $t_\nu(x)$ be the universal test over $X_0$. The left-hand side of (3.3) can be written as
$$u_\mu = \Lambda t_{\Lambda^*\mu}.$$
According to (B.4), we have $\mu u_\mu = (\Lambda^*\mu) t_{\Lambda^*\mu}$, which is $\le 1$ since $t$ is a test. If we show that $(\mu, x) \mapsto u_\mu(x)$ is lower semicomputable then the universality of $t_\mu$ will imply $u_\mu \stackrel{*}{<} t_\mu$. According to Proposition C.7, as a lower semicomputable function, $t_\nu(y)$ can be written as $\sup_n g_n(\nu, y)$, where $(g_n(\nu, y))$ is a computable sequence of computable functions. We pointed out in Subsection 2.2 that the function $\mu \mapsto \Lambda^*\mu$ is computable. Therefore the function $(n, \mu, x) \mapsto \lambda_x^y g_n(\Lambda^*\mu, y)$ is also computable. So $u_\mu(x)$ is the supremum of a computable sequence of computable functions and as such, lower semicomputable. □
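The accounting step of the proof, identity (B.4), can be checked in a small finite example, where a probability kernel is just a row-stochastic matrix (all numbers and names below are illustrative; finite spaces have no universal test, so this sketches only the bookkeeping, not the paper's construction):

```python
# Finite sketch of the proof's bookkeeping (illustrative numbers): X1 has
# 3 points, X0 has 2, the kernel Lambda is a row-stochastic matrix, and
# Lambda* mu is the pushforward of mu along the kernel.
mu = [0.5, 0.3, 0.2]            # a probability measure on X1
Lam = [[0.9, 0.1],              # row x is lambda_x, a measure on X0
       [0.4, 0.6],
       [0.2, 0.8]]

# Pushforward: (Lambda* mu)(y) = sum_x mu(x) * lambda_x(y)
push = [sum(mu[x] * Lam[x][y] for x in range(3)) for y in range(2)]

t = [1.2, 0.5]                  # a test for Lambda* mu: (Lambda* mu) t <= 1
assert sum(push[y] * t[y] for y in range(2)) <= 1

# u_mu(x) = lambda_x^y t(y), the left-hand side of (3.3)
u = [sum(Lam[x][y] * t[y] for y in range(2)) for x in range(3)]

# Identity (B.4): mu u_mu = (Lambda* mu) t, so u_mu is itself a mu-test,
# which is why the universal test t_mu dominates it up to a constant.
lhs = sum(mu[x] * u[x] for x in range(3))
rhs = sum(push[y] * t[y] for y in range(2))
assert abs(lhs - rhs) < 1e-12 and lhs <= 1
```

Integrating $u_\mu$ by $\mu$ and integrating $t$ by $\Lambda^*\mu$ give the same number, so a test on the output space pulls back to a test on the input space.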
It is easier to interpret the theorem first in the special case when $\Lambda = \Lambda_h$ for a computable function $h : X_1 \to X_0$, as in Example 2.7. Then the theorem simplifies to the following.

Corollary 3.3. For a computable function $h : X_1 \to X_0$, we have $d_{h^*\mu}(h(x)) \stackrel{+}{<} d_\mu(x)$.
Informally, this says that if $x$ is random with respect to $\mu$ in $X_1$ then $h(x)$ is essentially at least as random with respect to the output distribution $h^*\mu$ in $X_0$. A decrease in randomness can only be caused by complexity in the definition of the function $h$. The theorem is even easier to interpret when $\mu$ is defined over a product space $X_1 \times X_2$, and $h(x_1, x_2) = x_1$ is the projection. The theorem then says, informally, that if the pair $(x_1, x_2)$ is random with respect to $\mu$ then $x_1$ is random with respect to the marginal $\mu_1 = h^*\mu$ of $\mu$. This is a very natural requirement: why would throwing away the information about $x_2$ affect the plausibility of the hypothesis that the outcome $x_1$ arose from the distribution $\mu_1$?
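This projection case can be sketched in a finite toy model (the joint measure and test values below are hypothetical numbers; the point is only that a test of the marginal lifts to a test of the joint measure):

```python
# Finite illustration (hypothetical numbers) of the projection case:
# h(x1, x2) = x1, mu a joint measure on X1 x X2, mu1 = h* mu its marginal.
mu = {('a', 0): 0.10, ('a', 1): 0.20, ('a', 2): 0.10,
      ('b', 0): 0.25, ('b', 1): 0.15, ('b', 2): 0.20}

def h(x):
    # computable map: projection onto the first coordinate
    return x[0]

mu1 = {}                        # marginal h* mu on X1
for x, p in mu.items():
    mu1[h(x)] = mu1.get(h(x), 0.0) + p

t = {'a': 2.0, 'b': 0.3}        # a mu1-test: its mu1-expectation is <= 1
assert sum(mu1[y] * t[y] for y in mu1) <= 1

# The lift x -> t(h(x)) integrates to the same value under the joint mu,
# hence is automatically a mu-test: throwing away x2 cannot make x1 look
# less random than the pair (x1, x2) was.
lift_expectation = sum(p * t[h(x)] for x, p in mu.items())
assert abs(lift_expectation - sum(mu1[y] * t[y] for y in mu1)) < 1e-12
```

Any attempt to reject the hypothesis "$x_1$ came from $\mu_1$" is thus also an attempt to reject "$(x_1, x_2)$ came from $\mu$", at no extra cost.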