MA359 Lecture notes for Measure Theory - Of the Clux
MA359
Lecture notes for Measure Theory
Contents
1 Riemann Integral
1.1 Properties of the Riemann integral
1.2 Some weak points of Riemann's integral
1.2.1 The fundamental theorem of calculus
1.2.2 Convergence of functions
1.2.3 Fourier coefficients
2 Lebesgue's list of properties for the integral
3 Outer Measure
3.1 The problem of measure of sets
3.2 General measure spaces
4 Measurable functions
4.1 Approximation by step or simple functions
4.2 Littlewood three principles
5 Integration
5.1 Simple functions
5.2 Bounded functions supported on sets of finite measure
5.3 Non-negative functions
5.3.1 Fatou's Lemma
5.4 Integrable functions
5.4.1 The Lebesgue space $L^1$
6 Lebesgue measure in $\mathbb{R}^d$
6.1 Complex valued integrals
7 Lebesgue $L^p$ spaces
7.1 The space $L^2$
7.2 Hilbert Spaces
7.3 The Lebesgue spaces $L^p$
8 Abstract measures
8.1 Outer measures and Caratheodory measurable sets
8.2 Measurability
8.3 Radon-Nikodym theorem
8.4 Signed measures
1 Riemann Integral
Since school we have learned many integration methods. However, these methods focus on how to integrate a given function rather than on the nature of the integral itself.
In this section, we deal with a definition of the integral due to Riemann. Let $P = \{x_0, x_1, \dots, x_n\}$ be a partition of $[a, b]$, that is, $a = x_0 < x_1 < \dots < x_n = b$. A set of sampling points $\{t_i\}$ for $P$ is a sequence of points in $[a, b]$ such that $t_i \in [x_{i-1}, x_i]$. We define $\mathrm{mesh}(P)$ to be the length of the longest subinterval of $P$. Given a function $f : [a, b] \to \mathbb{R}$, define
$$S(f, P, \{t_i\}) = \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}).$$
Definition. A function $f$ is called Riemann integrable if there exists a number $A$ such that for every $\epsilon > 0$ there exists $\delta > 0$ such that if $P$ is a partition of $[a, b]$ with $\mathrm{mesh}(P) < \delta$, then for every sampling $\{t_i\}$ we have
$$|S(f, P, \{t_i\}) - A| < \epsilon.$$
In this case, we call $A = \int_a^b f(x)\,dx$ the Riemann integral of $f$.
The value $A$ is unique.
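The definition can be explored numerically. The following is a minimal sketch, not part of the original notes: the helper names `riemann_sum`, `uniform_partition` and `random_tags` are hypothetical, and the test function $\sin$ on $[0, \pi]$ (whose integral is 2) is chosen only for illustration. It evaluates $S(f, P, \{t_i\})$ for arbitrary sampling points and shows the sums approaching $A$ as the mesh shrinks.

```python
import math
import random

def riemann_sum(f, partition, tags):
    """S(f, P, {t_i}) = sum of f(t_i) * (x_i - x_{i-1})."""
    return sum(f(t) * (x1 - x0)
               for x0, x1, t in zip(partition, partition[1:], tags))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal subintervals; mesh(P) = (b - a) / n."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def random_tags(partition, rng):
    """One arbitrary sampling point t_i in each [x_{i-1}, x_i]."""
    return [rng.uniform(x0, x1) for x0, x1 in zip(partition, partition[1:])]

rng = random.Random(0)  # fixed seed, so the run is reproducible
errors = []
for n in (10, 100, 1000, 10000):
    P = uniform_partition(0.0, math.pi, n)
    S = riemann_sum(math.sin, P, random_tags(P, rng))
    errors.append(abs(S - 2.0))  # the integral of sin over [0, pi] is 2
    print(f"n={n:5d}  mesh={math.pi / n:.5f}  |S - A|={errors[-1]:.2e}")
```

Because the sampling points are chosen at random inside each subinterval, the decreasing error reflects the "for every sampling" clause of the definition rather than one convenient choice of tags.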
1.1 Properties of the Riemann integral
Assume $f$ and $g$ are Riemann integrable functions defined on $[a, b]$.
• Linearity.
$$\int_a^b (\alpha f + \beta g)(x)\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx$$
• Positivity. If $f(x) \ge 0$ then
$$\int_a^b f(x)\,dx \ge 0.$$
In particular, if $f(x) \ge g(x)$ for all $x \in [a, b]$ then
$$\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx.$$
• $\int_0^1 1\,dx = 1$.
Proposition 1. If $f : [a, b] \to \mathbb{R}$ is continuous, then $f$ is Riemann integrable.
A Riemann integrable function may nevertheless be discontinuous. However, the set of discontinuities cannot be "too large", as we can see in the following example.
Example 2 (Dirichlet's function). Let $f : [0, 1] \to \mathbb{R}$ be defined as
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap [0, 1] \\ 0 & \text{if } x \in [0, 1] \setminus \mathbb{Q}. \end{cases}$$
The function $f$ is not Riemann integrable, since every open interval $I \subset [0, 1]$ contains both rational and irrational numbers. Hence, given any partition $P = \{x_0, \dots, x_n\}$, we can construct samplings $\{r_i\}$ and $\{q_i\}$ with $r_i \in [x_{i-1}, x_i] \cap \mathbb{Q}$ and $q_i \in [x_{i-1}, x_i] \setminus \mathbb{Q}$. Then the sampling $\{r_i\}$ gives $S(f, P, \{r_i\}) = 1$, whereas the sampling $\{q_i\}$ gives $S(f, P, \{q_i\}) = 0$, so no number $A$ can satisfy the definition when $\epsilon < 1/2$.
Riemann's definition of the integral also imposes restrictions on the function $f$, as we see in the following:
Theorem 3. If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, then $f$ is bounded.
Proof. Consider $\delta > 0$ such that
$$\left| S(f, P, \{t_i\}) - \int_a^b f(x)\,dx \right| < \frac{1}{2}$$
whenever $\mathrm{mesh}(P) < \delta$. Fix a partition $P$ with $\mathrm{mesh}(P) < \delta$ and a set of sampling points $\{t_i\}$ for $P$, and let
$$M = \max\{|f(t_1)|, |f(t_2)|, \dots, |f(t_n)|\}$$
and
$$\Delta = \min\{(x_1 - x_0), (x_2 - x_1), \dots, (x_n - x_{n-1})\} > 0.$$
Let $x \in [a, b]$ and let $j$ be the smallest index such that $x \in [x_{j-1}, x_j]$. Let $T$ be the sampling $\{t_1, \dots, t_{j-1}, x, t_{j+1}, \dots, t_n\}$. Note that
$$|f(x)(x_j - x_{j-1}) - f(t_j)(x_j - x_{j-1})| = |S(f, P, T) - S(f, P, \{t_i\})|.$$
Moreover,
$$|S(f, P, T) - S(f, P, \{t_i\})| = \left| S(f, P, T) - \int_a^b f(x)\,dx + \int_a^b f(x)\,dx - S(f, P, \{t_i\}) \right|$$
$$\le \left| S(f, P, T) - \int_a^b f(x)\,dx \right| + \left| \int_a^b f(x)\,dx - S(f, P, \{t_i\}) \right| < 1.$$
Then
$$|f(x)|(x_j - x_{j-1}) < |f(t_j)|(x_j - x_{j-1}) + 1 \le M(x_j - x_{j-1}) + 1,$$
which implies that
$$|f(x)| < M + \frac{1}{x_j - x_{j-1}} \le M + \frac{1}{\Delta}.$$
Thus, $f$ is bounded.
Now we discuss another definition of the integral, due to Darboux. In fact, this definition is equivalent to Riemann's definition. Let $P = \{x_0, \dots, x_n\}$ be a partition of $[a, b]$ and define
$$M_i = \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}$$
and
$$m_i = \inf\{f(x) \mid x \in [x_{i-1}, x_i]\}.$$
So, we define the upper and lower sums
$$U(f, P) = \sum_{i=1}^{n} M_i (x_i - x_{i-1})$$
and
$$L(f, P) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}).$$
Put
$$\overline{\int_a^b} f = \inf\{U(f, P) \mid P \text{ is a partition of } [a, b]\}$$
and
$$\underline{\int_a^b} f = \sup\{L(f, P) \mid P \text{ is a partition of } [a, b]\}.$$
Whenever a function $f : [a, b] \to \mathbb{R}$ is bounded, the following inequality holds:
$$\underline{\int_a^b} f \le \overline{\int_a^b} f.$$
Proposition 4. A bounded function $f : [a, b] \to \mathbb{R}$ is Riemann integrable if, and only if,
$$\underline{\int_a^b} f = \overline{\int_a^b} f.$$
In this case, the common value is equal to $\int_a^b f(x)\,dx$.
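As an illustration of the Darboux sums, here is a small sketch under a simplifying assumption that is not in the notes: $f$ is increasing, so that $m_i = f(x_{i-1})$ and $M_i = f(x_i)$ can be read off the endpoints. The function name `darboux_sums_increasing` is hypothetical. For $f(x) = x^2$ on $[0, 1]$ with a uniform partition, the gap $U(f, P) - L(f, P)$ equals $(f(1) - f(0))/n$, so the lower and upper integrals agree.

```python
def darboux_sums_increasing(f, a, b, n):
    """Return (L(f, P), U(f, P)) for the uniform partition of [a, b] into n pieces.

    Assumes f is increasing on [a, b], so the infimum on each subinterval is
    attained at the left endpoint and the supremum at the right endpoint.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    return lower, upper

for n in (10, 100, 1000):
    lower, upper = darboux_sums_increasing(lambda x: x * x, 0.0, 1.0, n)
    print(f"n={n:4d}  L={lower:.6f}  U={upper:.6f}  U-L={upper - lower:.6f}")
```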
Finally, let us conclude with another property of Riemann's integral.
Proposition 5. If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, then $|f|$ is Riemann integrable and
$$\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx.$$
1.2 Some weak points of Riemann's integral
Here we discuss some of the problems that arise with Riemann's definition of the integral.
1.2.1 The fundamental theorem of calculus
The fundamental theorem of calculus gives a link between differential calculus and integral calculus. It consists of two parts:
Theorem 6 (The Fundamental Theorem of Calculus). Let $f : [a, b] \to \mathbb{R}$ be a Riemann integrable function.
i) Let $F(x)$ be an antiderivative of $f$; then
$$\int_a^b f(t)\,dt = F(b) - F(a).$$
ii) Define $F(x) = \int_a^x f(t)\,dt$. Then $F$ is continuous on $[a, b]$. Moreover, if $f$ is continuous at $\zeta \in [a, b]$, then $F$ is differentiable at $\zeta$ and $F'(\zeta) = f(\zeta)$.
The existence of continuous functions $F$ that are nowhere differentiable, or for which $F'(x)$ exists at every $x$ but $F'$ is not Riemann integrable, leads to the problem of finding a general class of functions for which the theorem is valid.
The following example shows a function $f$ which is differentiable at every point in $[0, 1]$ but whose derivative is not Riemann integrable.
Example 7. Let $f : [0, 1] \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} x^2 \cos\left(\dfrac{\pi}{x^2}\right) & \text{if } 0 < x \le 1 \\ 0 & \text{if } x = 0. \end{cases}$$
Then $f$ is differentiable on $[0, 1]$ with derivative
$$f'(x) = \begin{cases} 2x \cos\left(\dfrac{\pi}{x^2}\right) + \dfrac{2\pi}{x} \sin\left(\dfrac{\pi}{x^2}\right) & \text{if } 0 < x \le 1 \\ 0 & \text{if } x = 0. \end{cases}$$
Note that $f'$ is not bounded and hence $f'$ is not Riemann integrable on $[0, 1]$.
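The unboundedness of $f'$ in Example 7 can be seen along an explicit sequence. The sketch below is an illustration added here, with the hypothetical name `f_prime` and a chosen sequence $x_k = 1/\sqrt{2k + 1/2}$: at these points $\pi / x_k^2 = 2k\pi + \pi/2$, so the cosine term vanishes, the sine equals 1, and $f'(x_k) = 2\pi\sqrt{2k + 1/2}$ grows without bound as $x_k \to 0$.

```python
import math

def f_prime(x):
    """Derivative of f(x) = x^2 cos(pi / x^2) for 0 < x <= 1."""
    u = math.pi / x ** 2
    return 2 * x * math.cos(u) + (2 * math.pi / x) * math.sin(u)

# Sample points x_k = 1 / sqrt(2k + 1/2), which tend to 0 as k grows.
ks = (1, 10, 100, 1000)
points = [1 / math.sqrt(2 * k + 0.5) for k in ks]
values = [f_prime(x) for x in points]

for k, x, v in zip(ks, points, values):
    print(f"k={k:4d}  x_k={x:.5f}  f'(x_k)={v:.3f}")
```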
1.2.2 Convergence of functions.
Theorem 8. Let $f, f_k : [a, b] \to \mathbb{R}$ be functions for $k \in \mathbb{N}$. Suppose that each $f_k$ is Riemann integrable and that the sequence $\{f_k\}$ converges to $f$ uniformly on $[a, b]$. Then $f$ is Riemann integrable over $[a, b]$ and
$$\lim_{k \to \infty} \int_a^b f_k(x)\,dx = \int_a^b f(x)\,dx.$$
Uniform convergence is quite a strong condition, but pointwise convergence is not enough in general, as the following example shows:
Example 9. Let $\{r_i\}$ be an enumeration of the rational numbers in $[0, 1]$, and define
$$\delta_{r_i}(x) = \begin{cases} 1 & \text{if } x = r_i \\ 0 & \text{otherwise.} \end{cases}$$
For every $k$, let $f_k(x) = \sum_{i=1}^{k} \delta_{r_i}(x)$. Each $f_k$ vanishes except at finitely many points, so it is Riemann integrable with integral 0. However, $f_k$ converges monotonically to $f$, where $f$ is Dirichlet's function defined in Example 2, which is not Riemann integrable.
Thus, it would be desirable to have an integral that satisfies weaker conditions for convergence. However, it should be noticed that pointwise convergence alone is not sufficient to guarantee the equality of the limits; we have to assume extra conditions, as we shall see later.
1.2.3 Fourier coefficients.
Let $f : [-\pi, \pi] \to \mathbb{R}$ be Riemann integrable; then we can associate to $f$ the sequence $\{a_n\}$ of Fourier coefficients defined by
$$a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx.$$
Then $f \sim \sum a_n e^{inx}$ and we have Parseval's identity:
$$\sum |a_n|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx.$$
Consider the space $l^2$ of all sequences $\{a_n\}$ such that $\sum |a_n|^2 < \infty$. The space $l^2$ is a complete metric space. On the other hand, the space of Riemann integrable functions is not complete. Moreover, for every sequence in $l^2$ we can define the corresponding function. We would like to improve the definition of integrability so that it includes these functions. We would also like to describe the space of functions arising from this construction.
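Parseval's identity can be checked numerically for a concrete function. The following sketch is an illustration only: the choice $f(x) = x$, the helper name `fourier_coefficient`, the midpoint-rule quadrature and the truncation at $|n| \le 50$ are all assumptions made here. For this $f$ the right-hand side is $\frac{1}{2\pi}\int_{-\pi}^{\pi} x^2\,dx = \pi^2/3$, and the partial sums of $\sum |a_n|^2$ increase towards it.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4000):
    """Midpoint-rule approximation of a_n = (1/2pi) * int_{-pi}^{pi} f(x) e^{-inx} dx."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

f = lambda x: x
N = 50  # truncation level for the sum over n = -N, ..., N
lhs_partial = sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-N, N + 1))
rhs = math.pi ** 2 / 3  # (1/2pi) * integral of x^2 over [-pi, pi]

print(f"partial sum of |a_n|^2 = {lhs_partial:.4f}")
print(f"(1/2pi) int |f|^2      = {rhs:.4f}")
```

The partial sum stays below the right-hand side because the omitted terms $|a_n|^2$ with $|n| > 50$ are all nonnegative.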
2 Lebesgue's list of properties for the integral
For any set $E \subset \mathbb{R}$, let $\chi_E(x)$ be the characteristic function of $E$ defined by
$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{otherwise.} \end{cases}$$
We are looking for a definition of integrability that satisfies the following properties. Given integrable functions $f$ and $g$:
1. $\int_a^b f(x)\,dx = \int_{a+h}^{b+h} f(x - h)\,dx$
2. $\int_a^b f(x)\,dx + \int_b^c f(x)\,dx + \int_c^a f(x)\,dx = 0$
3. $\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$
4. If $f \ge 0$ and $b > a$, then $\int_a^b f(x)\,dx \ge 0$
5. $\int_0^1 1\,dx = 1$
6. If $\{f_k\}_{k=1}^{\infty}$ is a sequence such that $f_k \nearrow f$ monotonically, then
$$\int_a^b f_k(x)\,dx \to \int_a^b f(x)\,dx.$$
In other words, to define the integral we start with a list of properties we want it to satisfy and then try to deduce a definition from this list.
Theorem 10. Properties (1)-(5) imply:
(i) If $f \ge g$ and $a < b$, then $\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx$.
(ii) For every $[a, b]$, $\int_a^b 1\,dx = b - a$.
(iii) For every $\alpha, \beta \in \mathbb{R}$,
$$\int_a^b (\alpha f(x) + \beta g(x))\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx.$$
(iv) If $f$ is Riemann integrable, the value $\int_a^b f(x)\,dx$ coincides with the Riemann integral of $f$ on $[a, b]$.
Proof. Note that by setting $g = -f$ in (3) we have $\int_a^b -f(x)\,dx = -\int_a^b f(x)\,dx$.
(Proof of (i)) This is a consequence of (3) and (4), since $f \ge g$ implies $f - g \ge 0$ and
$$\int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b (f(x) - g(x))\,dx \ge 0.$$
(Proof of (ii)) First note that by (1), $\int_a^b 1\,dx = \int_c^d 1\,dx$ whenever $b - a = d - c$. Put $r = b - a$, so we want to check that $\int_0^r 1\,dx = r$ for every $r > 0$. By induction and using (2) it is easy to see that $\int_0^n 1\,dx = n$ for every natural number $n$. The same argument yields $\int_0^{1/n} 1\,dx = 1/n$ and, iterating (2) again, we have $\int_0^q 1\,dx = q$ for every positive rational number $q$. For any real $r > 0$, let $p, q$ be rational numbers with $0 < p < r < q$. Since $\chi_{[0,p]} \le \chi_{[0,r]} \le \chi_{[0,q]}$, we have by (4) that
$$0 \le \int_0^q 1\,dx - \int_0^r 1\,dx \le \int_0^q 1\,dx - \int_0^p 1\,dx = q - p.$$
Letting $p$ and $q$ approach $r$, we have that for all $r$,
$$\int_0^r 1\,dx = r.$$
(Proof of (iii)) By (3) and the observation at the beginning of the proof, it is enough to prove that for every $\alpha \ge 0$ we have $\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx$. Putting $f = g$ in (3) and iterating, we see that $\int_a^b n f(x)\,dx = n \int_a^b f(x)\,dx$ for every natural number $n$. To prove it for all reals, we repeat arguments similar to those used in the proof of (ii), which we omit.
(Proof of (iv)) Let $P = \{x_0, \dots, x_n\}$ be any partition of the interval $[a, b]$. Recall the definitions
$$M_i = \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}$$
and
$$m_i = \inf\{f(x) \mid x \in [x_{i-1}, x_i]\}.$$
Then, since $m_i \le f(x) \le M_i$ on each $[x_{i-1}, x_i]$, we have
$$\sum_{i=1}^{n} m_i (x_i - x_{i-1}) \le \int_a^b f(x)\,dx \le \sum_{i=1}^{n} M_i (x_i - x_{i-1}),$$
which implies
$$\underline{\int_a^b} f \le \int_a^b f(x)\,dx \le \overline{\int_a^b} f,$$
where the outer terms are the lower and upper Darboux integrals of $f$. Thus, if $f$ is Riemann integrable then $\underline{\int_a^b} f = \overline{\int_a^b} f$ and the middle integral must be equal to the Riemann integral of $f$.
By the theorem above, any integral satisfying properties (1)-(5) must coincide with the Riemann integral whenever $f$ is Riemann integrable. Now, let us investigate what comes out of property (6).
Given $f : \mathbb{R} \to \mathbb{R}$, $\epsilon > 0$ and $j \in \mathbb{Z}$, define
$$E_{j,\epsilon} = \{x \in \mathbb{R} \mid j\epsilon \le f(x) < (j + 1)\epsilon\}.$$
For $[a, b]$, let $A_{j,\epsilon} = E_{j,\epsilon} \cap [a, b]$. For every $n$ let $f_n(x) = \sum_{j=-n}^{n} (j/n)\,\chi_{A_{j,\epsilon}}(x)$; then, by construction, we can check that $f_n \nearrow f$ as $n \to \infty$ in $[a, b]$. Then, by property (6), we have $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$. Hence, in order to compute the integral of $f$ it suffices to know how to compute the integral of the characteristic functions of the sets $A_{j,\epsilon}$. For every interval $I = (c, d) \subset [a, b]$, we know that $\int_a^b \chi_I(x)\,dx = d - c$, which is just the length of $I$. However, for general $f$, the sets $A_{j,\epsilon}$ above are not necessarily intervals. This leads to the problem of measure of sets, which is discussed in the following section.
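The idea of slicing the range of $f$ rather than its domain can be sketched numerically. In the illustration below, which is not part of the notes, the simple function takes the value $j\epsilon$ on $A_{j,\epsilon}$, and the "length" of each $A_{j,\epsilon}$ is estimated by counting points of a fine grid; this grid count is only a numerical stand-in for the measure that the next section constructs. The name `lebesgue_style_sum` and the parameter `grid` are hypothetical.

```python
import math
from collections import Counter

def lebesgue_style_sum(f, a, b, eps, grid=100_000):
    """Approximate sum_j (j * eps) * m(A_j), where
    A_j = {x in [a, b] : j * eps <= f(x) < (j + 1) * eps}.

    m(A_j) is estimated as (number of grid midpoints in A_j) * (grid spacing).
    """
    h = (b - a) / grid
    counts = Counter()
    for i in range(grid):
        x = a + (i + 0.5) * h
        counts[math.floor(f(x) / eps)] += 1  # index j of the band containing f(x)
    return sum(j * eps * c * h for j, c in counts.items())

approx = lebesgue_style_sum(lambda x: x * x, 0.0, 1.0, eps=0.01)
print(f"level-set sum = {approx:.5f}   (integral of x^2 on [0, 1] is 1/3)")
```

Since the simple function lies below $f$ by at most $\epsilon$, the computed value is within $\epsilon (b - a)$ of the integral, up to the grid error.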
3 Outer Measure
3.1 The problem of measure of sets.
As we saw in the last section, the problem of integration becomes the problem of assigning to every bounded set $E \subset \mathbb{R}$ a nonnegative number $m(E)$ that satisfies the following properties:
1. (Translation invariance) For every $h \in \mathbb{R}$ define
$$E + h = \{x + h \mid x \in E\};$$
then $m(E + h) = m(E)$.
2. (Additivity) If $\{E_i\}_{i \in \sigma}$ is a finite or countable family of pairwise disjoint sets, then
$$m\Big(\bigcup_{i \in \sigma} E_i\Big) = \sum_{i \in \sigma} m(E_i).$$
3. $m([0, 1]) = 1$
For intervals we already have a function satisfying these properties: if $I = [a, b]$, the length of $I$, given by $l(I) = b - a$, satisfies properties (1)-(3) when restricted to intervals. Note that, since the length of a point is zero, it is irrelevant whether or not we include the endpoints in $I$. Our goal now is to extend the concept of length to sets other than intervals.
In what follows, $\sigma$ will always represent a countable set, that is, either a finite or a countably infinite set.
Definition. Let $E \subset \mathbb{R}$. The outer measure of $E$ is defined by
$$m^*(E) = \inf\Big\{\sum_{j \in \sigma} l(I_j)\Big\},$$
where the infimum is taken over all countable coverings of $E$ by open intervals $\{I_j\}_{j \in \sigma}$.
For every open interval $I$ we have $l(I) > 0$, so $m^*(E) \ge 0$; also, for every $\epsilon > 0$, $\emptyset \subset (0, \epsilon)$, so $0 \le m^*(\emptyset) \le \epsilon$. Hence, we have $m^*(\emptyset) = 0$.
Theorem 11. The outer measure $m^*$ satisfies the following properties:
1. $m^*$ is monotone. That is, whenever $F \subset E$, then $m^*(F) \le m^*(E)$.
2. For any $h \in \mathbb{R}$, $m^*(E + h) = m^*(E)$.
3. For any interval $I$, $m^*(I) = l(I)$.
4. $m^*$ is countably subadditive; that is, if $\{E_i\}_{i=0}^{\infty}$ is a countable family of sets in $\mathbb{R}$, then
$$m^*\Big(\bigcup_{i=0}^{\infty} E_i\Big) \le \sum_{i=0}^{\infty} m^*(E_i).$$
Proof. For Part 1, it suffices to note that every cover of $E$ covers $F$.
For Part 2, the length of $I = (a, b)$ is equal to the length of $I + h = (a + h, b + h)$. Every covering of $E$ by intervals $\{I_j\}$ induces the covering $\{I_j + h\}$ of $E + h$ with $\sum l(I_j) = \sum l(I_j + h)$, which implies $m^*(E + h) \le m^*(E)$. Conversely, if $\{I_j\}$ is a covering of $E + h$, then $\{I_j - h\}$ is a covering of $E$ with $\sum l(I_j) = \sum l(I_j - h)$, so $m^*(E) \le m^*(E + h)$. Hence, $m^*(E) = m^*(E + h)$.
Let us now prove Part 3. Since we can take $I$ with or without $a$ and $b$, let us fix $I = [a, b]$ for convenience. Then $I \subset (a - \epsilon/2, b + \epsilon/2)$, which implies
$$m^*(I) \le l((a - \epsilon/2, b + \epsilon/2)) = l(I) + \epsilon.$$
The inequality holds for every $\epsilon > 0$, so $m^*(I) \le l(I)$.
Conversely, let $\{I_k\}$ be any covering of $I$ by open intervals. Since $I$ is compact, there is a finite subcovering $\{I_1, \dots, I_n\}$ of $I$. We claim that $\sum_{j=1}^{n} l(I_j) > l(I)$. Since $I$ is bounded, we can assume every $I_j$ to be bounded. Then there exists $i_1$ with $I_{i_1} = (a_1, b_1)$ such that $a_1 < a < b_1$. If $b < b_1$, then $I \subset I_{i_1}$ and $l(I_{i_1}) > l(I)$; otherwise there exists $i_2$ such that $I_{i_2} = (a_2, b_2)$ and $a_2 < b_1 < b_2$. The family $\{I_j\}$ is finite, so at some point there is an $i_k$, with $I_{i_k} = (a_k, b_k)$, satisfying $a_k < b_{k-1} < b_k$ and $b < b_k$. In this case, $I \subset \bigcup_{j=1}^{k} I_{i_j}$ and $\sum_{j=1}^{k} l(I_{i_j}) > l(I)$, as we claimed. Hence $m^*(I) \ge l(I)$, and therefore $m^*(I) = l(I)$.
For the proof of Part 4, write $E = \bigcup_j E_j$, indexing the sets from $j = 1$ for convenience. We may assume that $m^*(E_j) < \infty$ for every $j$, because otherwise the inequality holds trivially. For every $\epsilon > 0$ and every $j$, there exists a covering $\{I_{j,k}\}_k$ of $E_j$ by open intervals such that
$$\sum_{k} l(I_{j,k}) \le m^*(E_j) + \frac{\epsilon}{2^j}.$$
Then $E \subset \bigcup_{j,k} I_{j,k}$ and
$$m^*(E) \le \sum_{j,k} l(I_{j,k}) = \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} l(I_{j,k}) \le \sum_{j=1}^{\infty} \Big( m^*(E_j) + \frac{\epsilon}{2^j} \Big) \le \sum_{j=1}^{\infty} m^*(E_j) + \epsilon,$$
which implies $m^*(E) \le \sum_{j=1}^{\infty} m^*(E_j)$.
As a consequence of Part 4, if $E$ is countable, then $m^*(E) = 0$, because each point has outer measure zero and $E$ is a countable union of points. Hence, we have:
Corollary 12. Every interval $I$ of positive length is uncountable.
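The fact that a countable set has outer measure zero can be made concrete with the usual $\epsilon / 2^k$ covering. The sketch below is an illustration with hypothetical helper names (`rationals_in_unit_interval`, `cover`) and one particular enumeration of $\mathbb{Q} \cap [0, 1]$: the $k$-th rational is covered by an open interval of length $\epsilon / 2^k$, so the total length of the covering stays below $\epsilon$ no matter how many points are covered. Exact rational arithmetic is used so that the inequality is not blurred by rounding.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` terms of one enumeration of Q in [0, 1], by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Open interval of length eps / 2^k centred at the k-th point, k = 1, 2, ..."""
    eps = Fraction(eps)
    return [(r - eps / 2 ** (k + 1), r + eps / 2 ** (k + 1))
            for k, r in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(200)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)

print(f"covered {len(points)} rationals, total length = {float(total_length):.6f} < {float(eps)}")
```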
The outer measure is additive in very special cases:
Lemma 13. Let $\{C_1, \dots, C_N\}$ be a finite sequence of disjoint compact sets. If $C = \bigcup_{i=1}^{N} C_i$, then
$$m^*(C) = \sum_{i=1}^{N} m^*(C_i).$$
Proof. By subadditivity,
$$m^*(C) \le m^*(C_1) + \dots + m^*(C_N).$$
To prove the other inequality, let $\delta = \min\{d(C_i, C_j) \mid i \ne j\}$, which is positive because the sets are compact and disjoint. Given $\epsilon > 0$, choose a covering $\{I_j\}$ of $C$ by open intervals with
$$\sum l(I_j) < m^*(C) + \epsilon.$$
After subdividing, we may assume that $\mathrm{diam}(I_j) < \delta$ for all $j$, which implies that each $I_j$ intersects at most one of the $C_i$. For every $k$, let
$$J_k = \{j \mid I_j \cap C_k \ne \emptyset\}.$$
Then $C_k \subset \bigcup_{j \in J_k} I_j$ and $J_k \cap J_{k'} = \emptyset$ whenever $k \ne k'$. Then
$$\sum_{k=1}^{N} m^*(C_k) \le \sum_{k=1}^{N} \sum_{j \in J_k} l(I_j) \le \sum_{j} l(I_j) \le m^*(C) + \epsilon.$$
Since $\epsilon > 0$ is arbitrary, the lemma follows.
Lemma 14. Let $\{I_j\}_{j \in \sigma}$ be a sequence of disjoint open and bounded intervals, and let $E = \bigcup_{j \in \sigma} I_j$. Then
$$m^*(E) = \sum_{j \in \sigma} m^*(I_j).$$
Proof. Given $\epsilon > 0$, let $\tilde{I}_j \subset I_j$ be a closed interval such that $l(I_j) \le l(\tilde{I}_j) + \epsilon/2^j$. The sets $\tilde{I}_j$ are compact and disjoint. For a fixed $N$ we have $\bigcup_{j=1}^{N} \tilde{I}_j \subset E$ and then, by Lemma 13 and Part 3 of Theorem 11,
$$m^*(E) \ge m^*\Big(\bigcup_{j=1}^{N} \tilde{I}_j\Big) = \sum_{j=1}^{N} m^*(\tilde{I}_j) = \sum_{j=1}^{N} l(\tilde{I}_j) \ge \sum_{j=1}^{N} \big(l(I_j) - \epsilon/2^j\big).$$
By taking the limit as $N$ goes to $\infty$, we have
$$m^*(E) \ge \sum_{j=1}^{\infty} l(I_j) - \epsilon$$
for every $\epsilon > 0$; hence $m^*(E) \ge \sum_{j=1}^{\infty} l(I_j)$. The other inequality holds by the subadditivity of $m^*$.
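For finitely many bounded intervals, both subadditivity and the equality of Lemma 14 can be checked directly by merging overlaps. The following is a small sketch with a hypothetical helper `union_length`; it is only an illustration for finite families, not a computation of outer measure in general.

```python
def union_length(intervals):
    """Total length of a finite union of bounded intervals (a, b), merging overlaps."""
    total, cur_a, cur_b = 0, None, None
    for a, b in sorted(intervals):
        if cur_b is None or a > cur_b:
            # Starts a new connected piece: close the previous one, if any.
            if cur_b is not None:
                total += cur_b - cur_a
            cur_a, cur_b = a, b
        else:
            # Overlaps (or touches) the current piece: extend it.
            cur_b = max(cur_b, b)
    if cur_b is not None:
        total += cur_b - cur_a
    return total

disjoint = [(0, 1), (2, 3), (5, 7)]
overlapping = [(0, 2), (1, 3)]

# Disjoint intervals: the length of the union equals the sum of the lengths.
print(union_length(disjoint), sum(b - a for a, b in disjoint))
# Overlapping intervals: only the inequality of subadditivity holds.
print(union_length(overlapping), sum(b - a for a, b in overlapping))
```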
The following example shows that, in general, the outer measure $m^*$ is not additive.
Example 15. Write $x \sim y$ if $x - y \in \mathbb{Q}$. This is an equivalence relation, since $x \sim x$, $x \sim y$ implies $y \sim x$, and whenever $x \sim y$ and $y \sim z$ then $x \sim z$. Let $[x]$ denote the equivalence class of $x$ under this relation. Then $[x] \cap [y] \ne \emptyset$ implies $[x] = [y]$. Every equivalence class meets $[0, 1]$, so by the axiom of choice we can construct a set $N \subset [0, 1]$ consisting of exactly one element of each equivalence class of the given relation.
Let $\{r_k\}$ be an enumeration of $\mathbb{Q} \cap [-1, 1]$ and $N_k = N + r_k$. Thus we have
$$[0, 1] \subset \bigcup_{k=1}^{\infty} N_k \subset [-1, 2],$$
which, by the monotonicity of $m^*$, implies $1 \le m^*(\bigcup_{k=1}^{\infty} N_k) \le 3$.
Now $N_r \cap N_s = \emptyset$ whenever $r \ne s$, and by translation invariance of $m^*$ we have $m^*(N_k) = m^*(N)$. So $\sum_{k=1}^{\infty} m^*(N_k) = \sum_{k=1}^{\infty} m^*(N)$, which is either 0 or $\infty$, depending on whether $m^*(N)$ is 0 or positive, respectively. In either case, the equality $m^*(\bigcup_{k=1}^{\infty} N_k) = \sum_{k=1}^{\infty} m^*(N_k)$ would contradict $1 \le m^*(\bigcup_{k=1}^{\infty} N_k) \le 3$.
Hence, $m^*$ is not countably additive. The existence of sets like the one constructed in Example 15 implies that a function satisfying the conditions of the problem of measure of sets cannot be defined on all sets. We come to the definition of Lebesgue measurable sets. There are several equivalent definitions; we choose one that is intuitively convenient.
Definition. A set $E \subset \mathbb{R}$ is called Lebesgue measurable, or simply measurable, if for any $\epsilon > 0$ there exists an open set $U$ containing $E$ such that
$$m^*(U \setminus E) < \epsilon.$$
In this case, we define the Lebesgue measure of $E$, or measure of $E$, by
$$m(E) = m^*(E).$$
Lebesgue measure inherits all properties of outer measure. Now we describe the family of measurable sets. By definition, it is immediate that every open set in $\mathbb{R}$ is Lebesgue measurable. If $m^*(E) = 0$, then $E$ is measurable, since for every $\epsilon > 0$ there is a covering of $E$ by open intervals $\{I_j\}$ with
$$\sum l(I_j) < m^*(E) + \epsilon = \epsilon.$$
Take $U = \bigcup I_j$; then $U \setminus E \subset U$ and $m^*(U \setminus E) \le m^*(U) \le \sum l(I_j) < \epsilon$. The family of measurable sets is also closed under countable unions.
Theorem 16. Let $\{E_i\}_{i \in \sigma}$ be a sequence of measurable sets. Then
$$E = \bigcup_{i \in \sigma} E_i$$
is also measurable.
Proof. Given $\epsilon > 0$, enumerate $\sigma$ as $i = 1, 2, \dots$; for every $i$ there is an open set $U_i$ containing $E_i$ with
$$m^*(U_i \setminus E_i) < \frac{\epsilon}{2^i}.$$
Let $U = \bigcup_{i \in \sigma} U_i$, which is open and contains $E$. Then
$$U \setminus E \subset \bigcup_{i \in \sigma} (U_i \setminus E_i);$$
hence,
$$m^*(U \setminus E) \le \sum_{i \in \sigma} m^*(U_i \setminus E_i) < \epsilon.$$
Theorem 17. Closed sets are measurable.
Proof. Let $F \subset \mathbb{R}$ be a closed set. For every $k \in \mathbb{N}$, let $B_k = [-k, k]$; then $F = \bigcup_{k=1}^{\infty} (F \cap B_k)$. Thus, every closed set in $\mathbb{R}$ is a countable union of compact sets. By Theorem 16, it is enough to prove the theorem for compact sets.
Let $K$ be a compact set in $\mathbb{R}$ and let $\epsilon > 0$. Then there exists a bounded open interval $I$ containing $K$; put $l(I) = M$. The set $I \setminus K$ is open, so $I \setminus K$ is a countable union of disjoint open intervals $\{I_j\}_{j \in \sigma}$; moreover, by Lemma 14, we have
$$m(I \setminus K) = \sum_{j \in \sigma} l(I_j) \le M.$$
Since the sum converges, there exists a finite set of intervals $\{I_{j_1}, \dots, I_{j_N}\}$ such that
$$\sum_{k=1}^{N} l(I_{j_k}) > m(I \setminus K) - \epsilon/2.$$
For every $k \in \{1, \dots, N\}$, let $Q_k$ be a closed interval with $Q_k \subset I_{j_k}$ such that
$$m(I_{j_k} \setminus Q_k) < \frac{\epsilon}{2N}.$$
Notice that the set $I_{j_k} \setminus Q_k$ consists of two disjoint open intervals, and the set $Q = \bigcup_{k=1}^{N} Q_k$ is compact.
So $W = I \setminus Q$ is an open set containing $K$ and, from Lemma 14 again, it follows that
$$m(W \setminus K) = m\Big(\bigcup_{j \in \sigma} I_j \setminus Q\Big) < \sum_{k=1}^{N} m(I_{j_k} \setminus Q_k) + \epsilon/2 < \epsilon.$$
As a consequence of Lemma 13, we have the following:
Corollary 18. Let $\{C_1, \dots, C_N\}$ be a finite sequence of disjoint compact sets. If $C = \bigcup_{i=1}^{N} C_i$, then
$$m(C) = \sum_{i=1}^{N} m(C_i).$$
Proof. Every $C_i$ and $C$ is closed, hence measurable by Theorem 17, so $m(C_i) = m^*(C_i)$ and $m(C) = m^*(C)$.
Theorem 19. If $E \subset \mathbb{R}$ is measurable, then the complement $E^c$ in $\mathbb{R}$ is also measurable.
Proof. For all $n$, there exists an open set $U_n$ containing $E$ such that $m^*(U_n \setminus E) < 1/n$. Now, $U_n^c$ is closed and hence measurable by Theorem 17. By Theorem 16, the set $S = \bigcup_{n=1}^{\infty} U_n^c$ is also measurable. Since for every $n$ we have $U_n^c \subset E^c$, then $\bigcup_{n=1}^{\infty} U_n^c \subset E^c$. Hence, for every $n$,
$$\Big(E^c \setminus \bigcup_{n=1}^{\infty} U_n^c\Big) \subset (U_n \setminus E),$$
which implies for every $n$
$$m^*\Big(E^c \setminus \bigcup_{n=1}^{\infty} U_n^c\Big) \le m^*(U_n \setminus E) < 1/n.$$
Thus $E^c \setminus \bigcup_{n=1}^{\infty} U_n^c$ has measure zero, and so is measurable. Now
$$E^c = \Big(E^c \setminus \bigcup_{n=1}^{\infty} U_n^c\Big) \cup \bigcup_{n=1}^{\infty} U_n^c,$$
so $E^c$ is measurable because it is a union of measurable sets.
Theorem 20. Let $\{E_i\}_{i \in \sigma}$ be a countable family of disjoint measurable sets. If $E = \bigcup_{i \in \sigma} E_i$, then
$$m(E) = \sum_{i \in \sigma} m(E_i).$$
Proof. First, assume that each $E_j$ is bounded. For every $j$, $E_j^c$ is measurable, so for every $\epsilon > 0$ there exists a closed set $F_j \subset E_j$ with $m(E_j \setminus F_j) \le \epsilon/2^j$ (take the complement of an open set $U \supset E_j^c$ with $m^*(U \setminus E_j^c) \le \epsilon/2^j$). Because $E_j$ is bounded, $F_j$ is compact. Fix $N$; the sets $F_1, \dots, F_N$ are compact and disjoint, so by Corollary 18 we have
$$m\Big(\bigcup_{j=1}^{N} F_j\Big) = \sum_{j=1}^{N} m(F_j);$$
also
$$\bigcup_{j=1}^{N} F_j \subset E.$$
Then
$$m(E) \ge \sum_{j=1}^{N} m(F_j) \ge \sum_{j=1}^{N} m(E_j) - \epsilon.$$
Taking the limit as $N$ goes to $\infty$, we have
$$m(E) \ge \sum_{j=1}^{\infty} m(E_j) - \epsilon,$$
which holds for every $\epsilon > 0$, so $m(E) \ge \sum_{j=1}^{\infty} m(E_j)$. The reverse inequality is the subadditivity of $m^*$.
In the general case, let $I_k = [-k, k]$; then $I_k \subset I_{k+1}$ and $\mathbb{R} = \bigcup_{k=1}^{\infty} I_k$. Let $S_1 = I_1$ and for every $k \ge 2$ let $S_k = I_k \setminus I_{k-1}$. For every $j$ let
$$E_{j,k} = E_j \cap S_k;$$
then
$$E = \bigcup_{j,k} E_{j,k}.$$
Since each $E_{j,k}$ is bounded, we have
$$m(E) = \sum_{j,k} m(E_{j,k}) = \sum_{j} \sum_{k} m(E_{j,k}) = \sum_{j} m(E_j).$$
Example 21 (A non-measurable set). Let $N$ be the set constructed in Example 15. We showed that for all rational numbers $r, s \in [-1, 1]$ with $r \ne s$, we have $(N + r) \cap (N + s) = \emptyset$; moreover $m^*(N) = m^*(N + r)$ and
$$[0, 1] \subset \bigcup_{r \in \mathbb{Q} \cap [-1, 1]} (N + r) \subset [-1, 2].$$
By Theorem 20, $N$ cannot be a Lebesgue measurable set: otherwise the sum $\sum_r m(N + r) = \sum_r m(N)$, which is either 0 or $\infty$, would have to lie between 1 and 3.
Let $\{E_1, E_2, \dots\}$ be a sequence of measurable sets. We say that $\{E_j\}$ increases<br />
to $E$ if $E_k \subset E_{k+1}$ for all $k$ and $E = \bigcup_{k=1}^{\infty} E_k$; we then write $E_j \nearrow E$. Similarly,<br />
we say that $\{E_j\}$ decreases to $E$ if $E_k \supset E_{k+1}$ for all $k$ and $E = \bigcap_{k=1}^{\infty} E_k$; in this case<br />
we write $E_j \searrow E$. We have<br />
Corollary 22. Suppose $\{E_1, E_2, \dots\}$ is a sequence of measurable sets in $\mathbb{R}$.<br />
(i) If $E_j \nearrow E$, then $m(E) = \lim_{N\to\infty} m(E_N)$.<br />
(ii) If $E_j \searrow E$ and $m(E_k) < \infty$ for some $k$, then $m(E) = \lim_{N\to\infty} m(E_N)$.<br />
Proof. Part (i). Let $G_1 = E_1$, $G_2 = E_2 \setminus E_1$ and, recursively, $G_k = E_k \setminus E_{k-1}$.<br />
Then the sets $G_i$ are disjoint and measurable, and $E = \bigcup_{i=1}^{\infty} G_i$, so<br />
\[ m(E) = \sum_{i=1}^{\infty} m(G_i) = \lim_{N\to\infty} \sum_{i=1}^{N} m(G_i) = \lim_{N\to\infty} m\Big(\bigcup_{i=1}^{N} G_i\Big) = \lim_{N\to\infty} m(E_N). \]<br />
Proof of part (ii). Without loss of generality we can assume $m(E_1) < \infty$. For<br />
every $k$ define $G_k = E_k \setminus E_{k+1}$; then<br />
\[ E_1 = E \cup \bigcup_{k=1}^{\infty} G_k \]<br />
and $E \cap \bigcup_{k=1}^{\infty} G_k = \emptyset$, with the sets $G_k$ pairwise disjoint, which implies<br />
\[ m(E_1) = m(E) + \lim_{N\to\infty} \sum_{k=1}^{N-1} \big(m(E_k) - m(E_{k+1})\big) = m(E) - \lim_{N\to\infty} m(E_N) + m(E_1). \]<br />
Since $m(E_1) < \infty$, we have<br />
\[ m(E) = \lim_{N\to\infty} m(E_N). \]<br />
We should remark here that without the condition $m(E_k) < \infty$ for some $k$,<br />
the second statement is false. For instance, consider the sequence $E_k = [k, \infty)$: every $E_k$ has infinite measure, but $\bigcap_k E_k = \emptyset$.<br />
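To see the role of the finiteness hypothesis numerically, the following minimal Python sketch compares two decreasing sequences of sets. The helper names `measure_shrinking` and `measure_tail` are illustrative assumptions; they simply encode the known lengths of $[0, 1/k]$ and $[k, \infty)$.<br />

```python
import math

def measure_shrinking(k: int) -> float:
    """Lebesgue measure of the interval [0, 1/k]."""
    return 1.0 / k

def measure_tail(k: int) -> float:
    """Lebesgue measure of the half-line [k, infinity)."""
    return math.inf

ks = (1, 10, 100, 1000)

# Finite measure: [0, 1/k] decreases to {0}, and m([0, 1/k]) -> 0 = m({0}).
finite_case = [measure_shrinking(k) for k in ks]

# Infinite measure: [k, inf) decreases to the empty set, yet every term
# has measure +inf, so the limit is +inf and not m(empty set) = 0.
infinite_case = [measure_tail(k) for k in ks]

print(finite_case)
print(infinite_case)
```

In the first list the measures shrink to the measure of the intersection; in the second they never do, which is exactly the failure described in the remark.<br />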
What follows provides analytic and geometric insight into the nature of<br />
measurable sets.<br />
Theorem 23. Suppose that $E$ is measurable in $\mathbb{R}$. Then for all $\epsilon > 0$:<br />
(i) There exists an open set $U$ with $E \subset U$ and $m(U \setminus E) < \epsilon$.<br />
(ii) There exists a closed set $F$ with $F \subset E$ and $m(E \setminus F) < \epsilon$.<br />
(iii) If $m(E) < \infty$, there exists a compact set $K$ with $K \subset E$ and $m(E \setminus K) < \epsilon$.<br />
(iv) If $m(E) < \infty$, there exists $F = \bigcup_{j=1}^{N} I_j$, a finite union of intervals $I_j$, such<br />
that $m(E \triangle F) < \epsilon$.<br />
Recall that $E \triangle F$ is called the symmetric difference of $E$ and $F$ and<br />
is defined by $E \triangle F = (E \setminus F) \cup (F \setminus E)$.<br />
Proof. Part (i). This is just the definition.<br />
Part (ii). Since $E$ is measurable, $E^c$ is measurable, so there exists<br />
an open set $U$ with $E^c \subset U$ and $m(U \setminus E^c) < \epsilon$. Let $F = U^c$; then $F \subset E$ and<br />
$F$ is closed. Moreover $E \setminus F = U \setminus E^c$, so $m(E \setminus F) < \epsilon$.<br />
Part (iii). Pick a closed set $F$ such that $F \subset E$ and $m(E \setminus F) < \epsilon/2$. For<br />
each $n$, let $D_n = [-n, n]$ and $K_n = F \cap D_n$, which is compact. Now<br />
\[ E \setminus K_n \searrow E \setminus F, \]<br />
and since $m(E) < \infty$, Corollary 22 gives $m(E \setminus K_n) \to m(E \setminus F) < \epsilon/2$. Hence $m(E \setminus K_n) < \epsilon$ for $n$ large enough.<br />
Part (iv). Choose a sequence $\{Q_j\}_{j=1}^{\infty}$ of closed intervals with $E \subset \bigcup_{j=1}^{\infty} Q_j$<br />
and<br />
\[ \sum_{j=1}^{\infty} l(Q_j) \le m(E) + \epsilon/2. \]<br />
Since $m(E) < \infty$, the series converges and there exists $N > 0$ such that<br />
\[ \sum_{j=N+1}^{\infty} l(Q_j) < \epsilon/2. \]<br />
Then, taking $F = \bigcup_{j=1}^{N} Q_j$, we have<br />
\[ m(E \triangle F) = m(E \setminus F) + m(F \setminus E) \le m\Big(\bigcup_{j=N+1}^{\infty} Q_j\Big) + m\Big(\bigcup_{j=1}^{\infty} Q_j \setminus E\Big) \]<br />
\[ \le \sum_{j=N+1}^{\infty} l(Q_j) + \Big(\sum_{j=1}^{\infty} l(Q_j) - m(E)\Big) < \epsilon/2 + \epsilon/2 = \epsilon. \]<br />
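As a concrete instance of part (iv), consider the set $E = \bigcup_{n \ge 1} [n, n + 2^{-n}]$, which has measure 1, and let $F_N$ be the union of its first $N$ intervals. The Python sketch below is only an illustration under that assumed choice of $E$; the function names are hypothetical. Since $F_N \subset E$ and the intervals are disjoint, $m(E \triangle F_N)$ is the geometric tail $2^{-N}$.<br />

```python
from fractions import Fraction

def tail_measure(N: int) -> Fraction:
    """m(E minus F_N) for E = union of [n, n + 2**-n], n >= 1,
    where F_N is the union of the first N intervals.

    The intervals are disjoint, so this is the sum of 2**-n over n > N,
    which equals 2**-N.
    """
    return Fraction(1, 2 ** N)

def intervals_needed(eps: Fraction) -> int:
    """Smallest N such that m(E symmetric-difference F_N) < eps."""
    N = 0
    while tail_measure(N) >= eps:
        N += 1
    return N

print(intervals_needed(Fraction(1, 100)))  # 7, because 1/128 < 1/100 <= 1/64
```

Exact rational arithmetic is used so that the comparison with $\epsilon$ is not affected by rounding.<br />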
We say that a set $G$ is a $G_\delta$ set if it is a countable intersection of open sets<br />
$\{U_i\}_{i\in\sigma}$. Similarly, we say that a set $F$ is an $F_\sigma$ set if it is a countable union of closed<br />
sets $\{F_i\}_{i\in\sigma}$.<br />
We end our discussion of measurable sets with the following proposition:<br />
Proposition 24. Let $E$ be a subset of $\mathbb{R}$. Then the following are equivalent:<br />
(i) $E$ is measurable.<br />
(ii) There exists a $G_\delta$ set $G$ such that $m(E \triangle G) = 0$.<br />
(iii) There exists an $F_\sigma$ set $F$ such that $m(E \triangle F) = 0$.<br />
3.2 General measure spaces<br />
Definition. Let $X$ be any space. A family $A$ of subsets of $X$ is called a<br />
σ-algebra if it satisfies the following properties:<br />
(i) $\emptyset, X \in A$.<br />
(ii) If $E \in A$, then $E^c \in A$.<br />
(iii) If $\{E_j\}_{j\in\sigma} \subset A$, then $\bigcup_{j\in\sigma} E_j \in A$.<br />
Remark: By De Morgan’s laws for sets,<br />
\[ \Big(\bigcup_{i\in\sigma} E_i\Big)^c = \bigcap_{i\in\sigma} E_i^c \quad\text{and}\quad \Big(\bigcap_{i\in\sigma} E_i\Big)^c = \bigcup_{i\in\sigma} E_i^c, \]<br />
so every σ-algebra is also closed under countable intersections.<br />
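On a finite space the axioms can be checked mechanically. The following Python sketch is an illustration only: the function `generate_sigma_algebra` and the example generators are assumptions made for this sketch. It closes a family of subsets under complements and unions (finite unions suffice when $X$ is finite) and then confirms closure under intersections, as the remark on De Morgan's laws predicts.<br />

```python
from itertools import combinations

def generate_sigma_algebra(X, generators):
    """Smallest family containing `generators` that is closed under
    complement and union. X is finite, so finite unions are enough."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complements.
        for A in current:
            if X - A not in family:
                family.add(X - A)
                changed = True
        # Close under pairwise unions.
        for A, B in combinations(current, 2):
            if A | B not in family:
                family.add(A | B)
                changed = True
    return family

sigma = generate_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
print(sorted(sorted(s) for s in sigma))

# Closure under intersections follows from the two closure rules above.
print(all((A & B) in sigma for A in sigma for B in sigma))
```

Here the generated family has the three atoms $\{1\}$, $\{2\}$, $\{3, 4\}$, so it contains $2^3 = 8$ sets.<br />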
Definition. A measure defined on a σ-algebra $A$ is a function $\mu : A \to [0, \infty]$<br />
satisfying:<br />
• $\mu(\emptyset) = 0$.<br />
• (Additivity). If $\{E_i\}_{i\in\sigma}$ is a sequence of pairwise disjoint sets in $A$, then<br />
\[ \mu\Big(\bigcup_{i\in\sigma} E_i\Big) = \sum_{i\in\sigma} \mu(E_i). \]<br />
If, in addition, $\mu(X) = 1$, then $\mu$ is called a probability measure<br />
on $X$. Altogether, a measure space is a triple $(X, A, \mu)$ consisting of the<br />
underlying space $X$, a σ-algebra $A$ of sets in $X$ and a measure $\mu$ defined on $A$.<br />
As a consequence of Theorem 19 and Theorem 20 we have the following:<br />
Proposition 25. The family of Lebesgue measurable sets in $\mathbb{R}$ is a σ-algebra<br />
and the Lebesgue measure is a measure.<br />
Another σ-algebra in R that plays a vital role in analysis is the Borel σ-algebra<br />
B, which, by definition, is the “smallest” σ-algebra containing all open<br />
sets in R. Elements of B are called Borel sets. The term smallest means that<br />
if A is another σ-algebra containing all open sets in R, then B ⊂ A. In fact,<br />
we could define B as the intersection of all such σ-algebras. This intersection is<br />
well defined, since the family of such σ-algebras is not empty: it contains the σ-algebra of Lebesgue measurable sets.<br />
A natural question arises: is every Lebesgue measurable set a Borel set?<br />
The answer is no; there are sets of Lebesgue measure 0, and hence measurable,<br />
that are not Borel.<br />
From the point of view of Borel sets, Lebesgue measurable sets arise as the completion<br />
of the Borel sets, that is, by adjoining to B all subsets of Borel sets of measure<br />
zero. Later on, when we discuss product measures, we will see that the definitions<br />
of Lebesgue measurable sets and Borel sets generalize to $\mathbb{R}^n$.<br />
4 Measurable functions<br />
From now on, we will be considering extended functions $f : \mathbb{R} \to [-\infty, \infty]$, which<br />
means that we allow $f$ to take on the infinite values $-\infty$ and $\infty$. In practice, $f$<br />
will take on infinite values on at most a set of measure zero. We say that $f$ is<br />
finite valued if $-\infty < f(x) < \infty$ for all $x \in \mathbb{R}$.<br />
Definition. Let $E$ be a measurable set. A function $f : E \to [-\infty, \infty]$ is called<br />
measurable if the set<br />
\[ f^{-1}([-\infty, a)) = \{x \in E : f(x) < a\} \]<br />
is measurable for every $a \in \mathbb{R}$.<br />
To simplify notation, let us put<br />
\[ \{x \in E : f(x) < a\} = \{f < a\}. \]<br />
Using equations of the form<br />
\[ \{f \le a\} = \bigcap_{k=1}^{\infty} \{f < a + 1/k\}, \qquad \{f < a\} = \bigcup_{k=1}^{\infty} \{f \le a - 1/k\} \]<br />
and<br />
\[ \{f \ge a\} = \{f < a\}^c, \]<br />
one can check that, in the definition of measurable functions, we could equivalently<br />
use either $\{f < a\}$, $\{f \le a\}$, $\{f > a\}$ or $\{f \ge a\}$. It also follows from these<br />
equations that if $f$ is measurable, then $-f$ is also measurable.<br />
When $f$ is finite valued, $f$ is measurable if, and only if, the set $\{a \le f \le b\}$<br />
is measurable for every $a, b \in \mathbb{R}$. From these observations it follows:<br />
Proposition 26. Let $f$ be a finite valued function. Then the following are<br />
equivalent:<br />
(i) $f$ is measurable.<br />
(ii) For every open set $U \subset \mathbb{R}$, the set $f^{-1}(U)$ is measurable.<br />
(iii) For every closed set $F \subset \mathbb{R}$, the set $f^{-1}(F)$ is measurable.<br />
As a corollary, we have<br />
Corollary 27. If $f$ is continuous, then $f$ is Lebesgue measurable. Moreover, if<br />
$f$ is measurable and finite valued, and $g$ is a continuous function, then $g \circ f$ is<br />
also measurable.<br />
Notice that, in general, if $g$ is continuous and $f$ is measurable, then the<br />
composition $f \circ g$ is not necessarily measurable.<br />
Proposition 28. Suppose $\{f_n\}_{n\in\sigma}$ is a sequence of measurable functions. Then<br />
the following functions are all measurable:<br />
\[ \sup_n f_n(x), \quad \inf_n f_n(x), \quad \limsup_{n\to\infty} f_n(x), \quad \liminf_{n\to\infty} f_n(x). \]<br />
Proof. The proof is very similar in each case, so let us prove that $\sup_n f_n$ is<br />
measurable. To do this, just note that $\{\sup_n f_n > a\} = \bigcup_n \{f_n > a\}$: indeed,<br />
$x \in \{\sup_n f_n > a\}$ means $\sup_n f_n(x) > a$ and, by the definition of sup, this<br />
happens if, and only if, there exists $n$ such that $f_n(x) > a$, that is, $x \in \{f_n > a\}$.<br />
Since a countable union of measurable sets is measurable, if every $f_n$ is measurable,<br />
then $\{\sup_n f_n > a\}$ is measurable. The remaining functions can be proven to<br />
be measurable by using the equations<br />
\[ \inf_n f_n(x) = -\sup_n \{-f_n(x)\} \]<br />
and<br />
\[ \limsup_{n\to\infty} f_n(x) = \inf_k \Big\{\sup_{n\ge k} f_n(x)\Big\}, \qquad \liminf_{n\to\infty} f_n(x) = \sup_k \Big\{\inf_{n\ge k} f_n(x)\Big\}. \]<br />
Corollary 29. If $\{f_n\}_{n=1}^{\infty}$ is a collection of measurable functions such that<br />
$\lim_{n\to\infty} f_n = f$, then $f$ is measurable.<br />
Proof. Whenever the limit $\lim f_n$ exists, it is equal to both $\limsup f_n$ and $\liminf f_n$.<br />
Proposition 30. If $f$ and $g$ are measurable, then<br />
(i) The integer powers $f^k$ are measurable for $k \ge 1$.<br />
(ii) If $f$ and $g$ are both finite valued, then $f + g$ and $fg$ are measurable.<br />
Proof. Part (i). If $k$ is odd, then $\{f^k > a\} = \{f > a^{1/k}\}$; if $k$ is even and $a \ge 0$, then<br />
$\{f^k > a\} = \{f > a^{1/k}\} \cup \{f < -a^{1/k}\}$, while for $a < 0$ the set $\{f^k > a\}$ is the whole domain.<br />
Part (ii). Notice that<br />
\[ \{f + g > a\} = \bigcup_{r\in\mathbb{Q}} \{f > a - r\} \cap \{g > r\} \]<br />
and<br />
\[ fg = \tfrac14\big[(f + g)^2 - (f - g)^2\big], \]<br />
which implies that $f + g$ and $fg$ are measurable.<br />
We say that $f = g$ almost everywhere if $\{x \mid f(x) \ne g(x)\}$ has measure zero.<br />
In general, we say that a property holds almost everywhere on a set $X$ if the<br />
property holds except on a set of measure zero in $X$.<br />
Proposition 31. Assume that $f = g$ almost everywhere and $f$ is measurable;<br />
then $g$ is measurable.<br />
4.1 Approximation by step or simple functions<br />
Let $E$ be any set in $\mathbb{R}$. It is easy to check that the characteristic function<br />
of $E$,<br />
\[ \chi_E(x) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{otherwise,} \end{cases} \]<br />
is measurable if, and only if, the set $E$ is measurable in $\mathbb{R}$. For the Riemann integral,<br />
we approximate a Riemann integrable function by step functions, which are<br />
functions of the form<br />
\[ f = \sum_{k=1}^{N} a_k \chi_{I_k}, \]<br />
where the $I_k$ are intervals and the $a_k$ are constants.<br />
In this section, we will see that every measurable function can be approximated<br />
by simple functions, that is, functions of the form<br />
\[ f = \sum_{k=1}^{N} a_k \chi_{E_k}, \]<br />
where the sets $E_k$ are measurable with finite measure. Since simple functions are<br />
linear combinations of measurable functions, simple functions are measurable.<br />
Moreover, by the Corollary to Proposition 28, every limit of simple functions is<br />
a measurable function.<br />
Theorem 32. Let $f$ be a non-negative measurable function on $\mathbb{R}$. Then there<br />
exists a sequence $\{\phi_k\}$ of simple functions such that, for every $k$ and $x \in \mathbb{R}$, $\phi_k(x) \le \phi_{k+1}(x)$ and<br />
\[ \lim_{k\to\infty} \phi_k(x) = f(x). \]<br />
Proof. Let $B_k = [-k, k]$ be the ball in $\mathbb{R}$ and consider the truncation of $f$:<br />
\[ F_k(x) = \begin{cases} f(x) & \text{if } x \in B_k \text{ and } f(x) \le k \\ k & \text{if } x \in B_k \text{ and } f(x) > k \\ 0 & \text{otherwise.} \end{cases} \]<br />
Then, clearly, $F_k(x)$ converges to $f(x)$ as $k$ tends to $\infty$. Now, for fixed $k$ and $j$, define<br />
\[ E_{i,j} = \Big\{ \frac{i}{j} < F_k(x) \le \frac{i+1}{j} \Big\} \]<br />
for $i$ such that $0 \le i < kj$, and let<br />
\[ \Phi_{k,j}(x) = \sum_{i=0}^{kj-1} \frac{i}{j} \chi_{E_{i,j}}(x). \]<br />
By definition, for each pair $\{k, j\}$ the function $\Phi_{k,j}$ is a simple function and<br />
satisfies<br />
\[ 0 \le F_k(x) - \Phi_{k,j}(x) \le \frac{1}{j}. \]<br />
Let $\phi_k(x) = \Phi_{2^k, 2^k}(x)$; then<br />
\[ 0 \le F_{2^k}(x) - \phi_k(x) \le \frac{1}{2^k}, \]<br />
and, because the grids of step $2^{-k}$ are nested and $F_{2^k} \le F_{2^{k+1}}$, we have $\phi_k \le \phi_{k+1}$. Hence the sequence $\{\phi_k\}$ satisfies the desired properties.<br />
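The construction can be tried numerically. The Python sketch below is a minimal illustration, not the exact construction of the proof: it assumes the dyadic choice of cutoff $2^k$ and grid step $2^{-k}$, and it rounds down with `floor`, which corresponds to level sets closed on the left rather than on the right. The function name `phi` is hypothetical.<br />

```python
import math

def phi(f, k: int, x: float) -> float:
    """k-th dyadic simple approximation of a non-negative function f at x.

    Truncate f to the interval [-2**k, 2**k] and to height 2**k, then
    round the truncated value down to the grid of step 2**-k.
    """
    N = 2 ** k
    if abs(x) > N:
        return 0.0
    truncated = min(f(x), N)
    return math.floor(truncated * N) / N

# Approximations of f(x) = x**2 at x = 1.7 for k = 1, ..., 5.
print([phi(lambda t: t * t, k, 1.7) for k in range(1, 6)])
```

Each refinement halves the grid step, so the values can only increase with $k$, and once $2^k$ exceeds both $|x|$ and $f(x)$ the error is below $2^{-k}$.<br />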
Note that the theorem allows $f$ to take the value $\infty$. Now we drop the<br />
non-negativity assumption and allow $f$ to take the value $-\infty$ as well:<br />
Theorem 33. Let $f$ be a measurable function. Then there exists a sequence $\{\phi_k\}$<br />
of simple functions such that, for every $k$ and $x \in \mathbb{R}$, $|\phi_k(x)| \le |\phi_{k+1}(x)|$ and<br />
\[ \lim_{k\to\infty} \phi_k(x) = f(x). \]<br />
Proof. Define<br />
\[ f^+(x) = \max(f(x), 0) \]<br />
and<br />
\[ f^-(x) = \max(-f(x), 0); \]<br />
then $f(x) = f^+(x) - f^-(x)$. The functions $f^+$ and $f^-$ are non-negative measurable<br />
functions. By Theorem 32, there are sequences $\{\phi_k^+\}$ and $\{\phi_k^-\}$ of simple<br />
functions, converging pointwise to $f^+$ and $f^-$ respectively, such that for every $k$ and every $x$<br />
\[ \phi_k^+(x) \le \phi_{k+1}^+(x) \quad\text{and}\quad \phi_k^-(x) \le \phi_{k+1}^-(x). \]<br />
Let $\phi_k(x) = \phi_k^+(x) - \phi_k^-(x)$; then it is easy to check that $\phi_k(x) \to f(x)$ and,<br />
since $|\phi_k(x)| = \phi_k^+(x) + \phi_k^-(x)$, we have that for every $k$ and $x \in \mathbb{R}$, $|\phi_k(x)| \le |\phi_{k+1}(x)|$.<br />
In fact, we can approximate every measurable function by step functions.<br />
However, the convergence is only pointwise almost everywhere:<br />
Theorem 34. Suppose $f$ is a measurable function on $\mathbb{R}$. Then there exists a<br />
sequence of step functions $\{\psi_k\}$ such that<br />
\[ \psi_k(x) \to f(x) \quad \text{for a.e. } x \in \mathbb{R}. \]<br />
Proof. By Theorem 33, it is enough to prove the theorem when $f$ is the characteristic<br />
function $\chi_E$ of a measurable set $E$ of finite measure. Since $E$ is measurable, for every<br />
$\epsilon > 0$ there exists a finite sequence of intervals $\{I_j\}_{j=1}^{N}$, which we may take to be disjoint, such that $m(E \triangle \bigcup I_j) < 2\epsilon$.<br />
Then<br />
\[ \chi_E(x) = \sum_{j=1}^{N} \chi_{I_j}(x) \]<br />
except on a set of measure less than $2\epsilon$. So, for every $k$ we can construct a step<br />
function $\psi_k$ such that<br />
\[ E_k = \{x \mid f(x) \ne \psi_k(x)\} \]<br />
satisfies $m(E_k) < 2^{-k}$. Let $F_k = \bigcup_{j=k+1}^{\infty} E_j$; then<br />
\[ m(F_k) \le \sum_{j=k+1}^{\infty} m(E_j) < \sum_{j=k+1}^{\infty} 2^{-j} = 2^{-k}. \]<br />
If $F = \bigcap_{k=1}^{\infty} F_k$, then $m(F) = 0$. Moreover, $\psi_k(x) \to f(x)$ as $k \to \infty$ for all<br />
$x \in F^c$.<br />
4.2 Littlewood three principles<br />
The following statements, due to Littlewood, are very useful as an early guide<br />
to the study of measure theory. The idea behind them is that measure theoretical<br />
objects are close to their topological counterparts.<br />
Littlewood three principles<br />
1. Every measurable set is nearly a finite union of intervals.<br />
2. Every measurable function is nearly continuous.<br />
3. Every convergent sequence of measurable functions is nearly uniformly convergent.<br />
We already saw a precise formulation of the first principle in part (iv) of<br />
Theorem 23, which we restate here for convenience:<br />
If $m(E) < \infty$, there exists $F = \bigcup_{j=1}^{N} I_j$, a finite union of intervals $I_j$, such<br />
that $m(E \triangle F) < \epsilon$.<br />
The formal statement of the third principle translates into the following<br />
theorem:<br />
Theorem 35 (Egorov’s Theorem). Let $\{f_k\}_{k=1}^{\infty}$ be a sequence of measurable<br />
functions defined on a measurable set $E$ with $m(E) < \infty$. Assume that $f_k \to f$<br />
almost everywhere on $E$. Then, for every $\epsilon > 0$ there exists a closed set $A_\epsilon \subset E$<br />
such that $m(E \setminus A_\epsilon) < \epsilon$ and $f_k \to f$ uniformly on $A_\epsilon$.<br />
Proof. By restricting to a smaller set, we can assume that $f_k(x) \to f(x)$ for all<br />
$x \in E$. For all natural numbers $n$ and $k$, let<br />
\[ E_k^n = \{x \in E \mid |f_j(x) - f(x)| < 1/n \text{ for all } j > k\}. \]<br />
Fix $n$ and note that $E_k^n \subset E_{k+1}^n$ and $E = \bigcup_{k=1}^{\infty} E_k^n$, so $E_k^n \nearrow E$ as $k$ tends to<br />
$\infty$. Since $m(E) < \infty$, by Corollary 22 there exists $k_n$ such that<br />
\[ m(E \setminus E_{k_n}^n) < \frac{1}{2^n} \]<br />
and<br />
\[ |f_j(x) - f(x)| < \frac{1}{n} \]<br />
for all $j > k_n$ and $x \in E_{k_n}^n$. Choose $N$ so that<br />
\[ \sum_{n=N}^{\infty} \frac{1}{2^n} < \epsilon/2, \]<br />
and let $\tilde{A}_\epsilon = \bigcap_{n \ge N} E_{k_n}^n$, so<br />
\[ E \setminus \tilde{A}_\epsilon = E \cap \tilde{A}_\epsilon^c = E \cap \Big(\bigcap_{n \ge N} E_{k_n}^n\Big)^c = E \cap \bigcup_{n \ge N} (E_{k_n}^n)^c \subset \bigcup_{n \ge N} \big(E \cap (E_{k_n}^n)^c\big) = \bigcup_{n \ge N} (E \setminus E_{k_n}^n); \]<br />
then<br />
\[ m(E \setminus \tilde{A}_\epsilon) \le m\Big(\bigcup_{n \ge N} E \setminus E_{k_n}^n\Big) \le \sum_{n=N}^{\infty} m(E \setminus E_{k_n}^n) < \epsilon/2. \]<br />
Let $\delta > 0$ and choose $n \ge N$ such that $1/n < \delta$. If $x \in \tilde{A}_\epsilon$, then $x \in E_{k_n}^n$<br />
and<br />
\[ |f_j(x) - f(x)| < 1/n < \delta \quad \text{for all } j > k_n. \]<br />
Hence $f_k \to f$ uniformly on $\tilde{A}_\epsilon$. To finish the proof, notice that there exists a<br />
closed set $A_\epsilon \subset \tilde{A}_\epsilon$ with $m(\tilde{A}_\epsilon \setminus A_\epsilon) < \epsilon/2$; then we can check that $f_k \to f$ uniformly<br />
on $A_\epsilon$ and $m(E \setminus A_\epsilon) < \epsilon$.<br />
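A standard example, chosen here as an assumption for illustration, is $f_k(x) = x^k$ on $E = [0, 1]$. It converges to $0$ for every $x \in [0, 1)$, hence almost everywhere, but not uniformly, since $\sup_{[0,1)} x^k = 1$ for every $k$. Removing the small set $(1 - \epsilon/2, 1]$ leaves the closed set $A_\epsilon = [0, 1 - \epsilon/2]$, on which the convergence is uniform. The Python sketch below tabulates the supremum on that set; the name `sup_error` is hypothetical.<br />

```python
def sup_error(k: int, right_endpoint: float) -> float:
    """Supremum over [0, right_endpoint] of |x**k - 0|.

    x**k is increasing on [0, 1], so the supremum is attained at the
    right endpoint.
    """
    return right_endpoint ** k

eps = 0.01
# A_eps = [0, 1 - eps/2] is closed and m(E minus A_eps) = eps/2 < eps.
errors = [sup_error(k, 1 - eps / 2) for k in (1, 10, 100, 1000)]
print(errors)
```

The printed suprema decrease to $0$, although slowly when $\epsilon$ is small: the price of a small exceptional set is a slower uniform rate.<br />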
Now we are in a position to state and prove a precise statement of Littlewood’s<br />
second principle:<br />
Theorem 36 (Lusin’s Theorem). Suppose that $f$ is measurable and finite valued<br />
on $E$, where $E$ is a measurable set of finite measure. Then, for every $\epsilon > 0$ there<br />
exists a closed set $F_\epsilon \subset E$ with $m(E \setminus F_\epsilon) < \epsilon$ and such that $f|_{F_\epsilon}$ is continuous.<br />
Remark: The theorem says that the function $f$ restricted to the set $F_\epsilon$ is<br />
continuous, but not the stronger statement that the function $f$ defined on $E$ is<br />
continuous at the points of $F_\epsilon$.<br />
Proof. Let $\{f_n\}$ be a sequence of step functions such that $f_n \to f$ almost everywhere.<br />
Then, for every $n$ there exists a set $E_n$ such that $m(E_n) < 1/2^n$ and $f_n$ is continuous<br />
outside $E_n$. By Egorov’s Theorem we may find a set $A_{\epsilon/3} \subset E$ on which $f_n \to f$<br />
uniformly and<br />
\[ m(E \setminus A_{\epsilon/3}) < \epsilon/3. \]<br />
Now let<br />
\[ F' = A_{\epsilon/3} \setminus \bigcup_{n \ge N} E_n, \]<br />
where $N$ is a number such that<br />
\[ \sum_{n \ge N} \frac{1}{2^n} < \epsilon/3. \]<br />
For every $n \ge N$, $f_n$ is continuous on $F'$; because $f_n$ converges uniformly to $f$ on $F'$,<br />
the function $f$ is continuous on $F'$. Now take a closed set $F \subset F'$ with<br />
\[ m(F' \setminus F) < \epsilon/3. \]<br />
Then $F$ is the set we are looking for: $f$ is continuous on $F$, and since<br />
\[ E \setminus F \subset (E \setminus A_{\epsilon/3}) \cup (F' \setminus F) \cup \Big(\bigcup_{n \ge N} E_n\Big), \]<br />
we have<br />
\[ m(E \setminus F) \le m(E \setminus A_{\epsilon/3}) + m(F' \setminus F) + m\Big(\bigcup_{n \ge N} E_n\Big) < \epsilon/3 + \epsilon/3 + \sum_{n \ge N} \frac{1}{2^n} < \epsilon. \]<br />
5 Integration<br />
In this section we will define the Lebesgue integral for measurable functions on R.<br />
In order to do so, we will proceed through several levels of generality, starting with<br />
simple functions, then bounded functions supported on a set of finite measure,<br />
non-negative functions, and finally, the general case of integrable functions.<br />
5.1 Simple functions<br />
We start by noting that a simple function $\phi = \sum a_i \chi_{E_i}$ has several representations;<br />
for instance<br />
\[ \chi_{[0,1]} = \chi_{[0,1/2)} + \chi_{[1/2,1]}. \]<br />
Nevertheless, we can speak of a canonical representation of any simple function:<br />
Definition. Let $\phi = \sum a_i \chi_{E_i}$ be a simple function. We say that $\phi$ is in its canonical<br />
representation if, for all $i, j$ with $i \ne j$, we have $a_i \ne 0$, $a_i \ne a_j$ and<br />
$E_i \cap E_j = \emptyset$. The canonical representation of $\phi$ is unique.<br />
Given any simple function $\phi$, finding the canonical representation of $\phi$ is<br />
straightforward: since the function $\phi$ takes on finitely many distinct non-zero values $c_1, c_2, \dots, c_n$, for<br />
every $k \in \{1, \dots, n\}$ let<br />
\[ F_k = \{x \in \mathbb{R} \mid \phi(x) = c_k\}; \]<br />
then $\phi = \sum_{k=1}^{n} c_k \chi_{F_k}$ is the canonical representation of $\phi$. Let $\phi = \sum_{k=1}^{n} a_k \chi_{E_k}$<br />
be a simple function, and define<br />
\[ \int_{\mathbb{R}} \phi(x)\,dx = \sum_{k=1}^{n} a_k m(E_k). \]<br />
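The definition is easy to evaluate when the sets $E_k$ are intervals. The Python sketch below is an illustration under that assumption; the function `integrate_simple` and the list-of-pairs encoding are hypothetical choices made for the sketch. It checks that the two representations of $\chi_{[0,1]}$ mentioned above give the same integral.<br />

```python
from fractions import Fraction

def integrate_simple(terms):
    """Integral of sum a_k * chi_{E_k}, where each E_k is an interval
    given by its endpoints (left, right), so that m(E_k) = right - left.
    Whether endpoints belong to the interval does not change its measure.
    """
    return sum(Fraction(a) * (Fraction(right) - Fraction(left))
               for a, (left, right) in terms)

half = Fraction(1, 2)
one_piece = [(1, (0, 1))]                       # chi_[0,1]
two_pieces = [(1, (0, half)), (1, (half, 1))]   # chi_[0,1/2) + chi_[1/2,1]

print(integrate_simple(one_piece), integrate_simple(two_pieces))  # 1 1
```

Both representations integrate to $1$, in line with the independence of the representation stated in Proposition 37.<br />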
From now on, whenever $E$ is a measurable set and $f$ a measurable function,<br />
define<br />
\[ \int_E f(x)\,dx = \int_{\mathbb{R}} f(x)\chi_E(x)\,dx. \]<br />
We will also adopt the following notations:<br />
\[ \int_{\mathbb{R}} f(x)\,dm(x) = \int f(x)\,dm = \int f, \]<br />
depending on whether we want to emphasize the variable $x$, the measure<br />
$m$, or just the integral when no confusion is possible.<br />
From the definition of simple functions, one can check the following proposition,<br />
which reflects Lebesgue’s list of properties for the integral of simple functions.<br />
Proposition 37. For simple functions, the integral satisfies the following properties:<br />
1) The integral is independent of the representation of $\phi$.<br />
2) (Linearity.) If $\phi$ and $\psi$ are simple functions and $a, b \in \mathbb{R}$, then<br />
\[ \int (a\phi + b\psi) = a \int \phi + b \int \psi. \]<br />
3) (Additivity.) If $E$ and $F$ are disjoint measurable sets with $m(E), m(F) < \infty$, then<br />
\[ \int_{E \cup F} \phi = \int_E \phi + \int_F \phi. \]<br />
4) (Monotonicity.) If $\phi$ and $\psi$ are simple functions such that $\phi \le \psi$, then<br />
\[ \int \phi \le \int \psi. \]<br />
5) (Triangle inequality.)<br />
\[ \Big| \int \phi \Big| \le \int |\phi|. \]<br />
5.2 Bounded functions supported on sets of finite measure.<br />
We start with a definition.<br />
Definition. Let $f$ be a function. The support of $f$ is defined by<br />
\[ \mathrm{supp}(f) = \{x \mid f(x) \ne 0\}. \]<br />
We say a measurable function $f$ is a bounded function supported on a set of finite measure<br />
if $f$ is bounded and there is a measurable set $E$ with $m(E) < \infty$ and $\mathrm{supp}(f) \subset E$.<br />
Let $f$ be a bounded function supported on a set of finite measure. As a consequence<br />
of Theorem 33, there exists a sequence of simple functions $\{\phi_n\}$ such<br />
that $\mathrm{supp}(\phi_n) \subset E$, where $m(E) < \infty$, and $\phi_n(x) \to f(x)$ for all $x$. More precisely,<br />
we have the following useful result:<br />
Lemma 38. Let $f$ be a bounded function supported on a set of finite measure<br />
$E$. Let $\{\phi_n\}$ be a sequence of simple functions supported on $E$ for which there exists $M > 0$ such<br />
that $|\phi_n| < M$ and $\phi_n(x) \to f(x)$ for a.e. $x \in E$. Then:<br />
1) The limit<br />
\[ \lim_{n\to\infty} \int \phi_n \]<br />
exists.<br />
2) If $f = 0$ a.e., then<br />
\[ \lim_{n\to\infty} \int \phi_n = 0. \]<br />
Proof. Proof of (1). If the convergence were uniform, the result would be rather<br />
clear. Nevertheless, we have Egorov’s theorem to fix the situation.<br />
Since $m(E) < \infty$, for every $\epsilon > 0$ there exists a closed set $A_\epsilon \subset E$ such that<br />
$m(E \setminus A_\epsilon) < \epsilon$ and $\phi_n \to f$ uniformly on $A_\epsilon$. Let<br />
\[ I_n = \int \phi_n. \]<br />
By the triangle inequality,<br />
\[ |I_n - I_m| \le \int_E |\phi_n(x) - \phi_m(x)| = \int_{A_\epsilon} |\phi_n(x) - \phi_m(x)| + \int_{E \setminus A_\epsilon} |\phi_n(x) - \phi_m(x)| \]<br />
\[ \le \int_{A_\epsilon} |\phi_n(x) - \phi_m(x)| + 2M\, m(E \setminus A_\epsilon) \le \int_{A_\epsilon} |\phi_n(x) - \phi_m(x)| + 2M\epsilon. \]<br />
Now, since the convergence on $A_\epsilon$ is uniform, there is $N > 0$ such that if $n, m > N$,<br />
then<br />
\[ |\phi_n(x) - \phi_m(x)| < \epsilon \quad \text{for all } x \in A_\epsilon, \]<br />
which implies<br />
\[ |I_n - I_m| \le m(A_\epsilon)\epsilon + 2M\epsilon = (m(A_\epsilon) + 2M)\epsilon. \]<br />
Since $A_\epsilon \subset E$, we have $m(A_\epsilon) \le m(E) < \infty$, which implies that $\{I_n\}$ is a Cauchy sequence<br />
in $\mathbb{R}$, and so $\{I_n\}$ converges.<br />
Proof of (2). If $f = 0$ a.e., we repeat the argument to see that for $n$ big enough<br />
\[ |I_n| \le m(E)\epsilon + M\epsilon, \]<br />
which implies that $\lim I_n = 0$.<br />
Let $f$ be a bounded function supported on a set of finite measure $E$. With<br />
Lemma 38 at hand we can define the integral of $f$ by<br />
\[ \int f(x)\,dx = \lim_{n\to\infty} \int \phi_n(x)\,dx, \]<br />
where $\{\phi_n\}$ is a sequence of simple functions as in Lemma 38, supported on $E$ and converging to<br />
$f$.<br />
The second part of Lemma 38 also guarantees that this integral is well defined:<br />
if $\{\psi_n\}$ is another sequence of simple functions supported on $E$ and<br />
converging to $f$, then $(\phi_n - \psi_n) \to 0$, hence $\int (\phi_n - \psi_n) \to 0$, which implies<br />
\[ \lim_{n\to\infty} \int \phi_n = \lim_{n\to\infty} \int \psi_n. \]<br />
Passing to the limit, from the properties of integrals of simple functions in<br />
Proposition 37, we obtain the following:<br />
Proposition 39. Let $f$ and $g$ be bounded functions supported on sets of finite measure.<br />
Then the following properties hold:<br />
(1) (Linearity) For every $a, b \in \mathbb{R}$,<br />
\[ \int (af + bg) = a \int f + b \int g. \]<br />
(2) (Additivity) For $E$ and $F$ disjoint measurable sets,<br />
\[ \int_{E \cup F} f = \int_E f + \int_F f. \]<br />
(3) (Monotonicity) If $f \le g$, then<br />
\[ \int f \le \int g. \]<br />
(4) (Triangle inequality) The function $|f|$ is also bounded and supported on a set<br />
of finite measure, and<br />
\[ \Big| \int f \Big| \le \int |f|. \]<br />
Now, let us prove a convergence theorem for bounded measurable functions:<br />
Theorem 40 (Bounded Convergence Theorem). Suppose that $\{f_n\}$ is a sequence<br />
of measurable functions such that, for all $n$, $|f_n| < M$ and $\mathrm{supp}(f_n) \subset E$<br />
with $m(E) < \infty$, and<br />
\[ f_n(x) \to f(x) \quad \text{for a.e. } x \in E. \]<br />
Then $f$ is measurable, bounded and supported on $E$ almost everywhere. Moreover,<br />
\[ \int |f_n - f| \to 0, \]<br />
which implies<br />
\[ \int f_n \to \int f. \]<br />
Proof. By passing to the limit, it follows from the assumptions that $f$ is measurable,<br />
$|f| \le M$ a.e., and $m(\mathrm{supp}(f) \cap E^c) = 0$. By the triangle inequality, it is<br />
enough to prove<br />
\[ \int |f_n - f| \to 0. \]<br />
By Egorov’s theorem, for every $\epsilon > 0$ there exists a closed set $A_\epsilon \subset E$ such that<br />
$m(E \setminus A_\epsilon) < \epsilon$ and $f_n \to f$ uniformly on $A_\epsilon$. Choose $N$ such that $|f_n(x) - f(x)| < \epsilon$ for all $x \in A_\epsilon$ and $n > N$. Then, for every $n > N$,<br />
\[ \int |f_n(x) - f(x)| \le \int_{A_\epsilon} |f_n(x) - f(x)| + \int_{E \setminus A_\epsilon} |f_n(x) - f(x)| \le \epsilon\, m(A_\epsilon) + 2M\, m(E \setminus A_\epsilon) < \epsilon\,(m(E) + 2M). \]<br />
This finishes the proof, since $\epsilon$ is arbitrary and $m(E) + 2M < \infty$.<br />
5.3 Non-negative functions<br />
Now, we will proceed to define the integral of non-negative measurable functions.<br />
Let us remark that, in this case, we allow functions to take the value $\infty$. Let<br />
$f$ be a non-negative measurable function, and define<br />
\[ \int f(x)\,dx = \sup_g \int g(x)\,dx, \]<br />
where the supremum is taken over all measurable functions $g$ such that $g$ is bounded<br />
and supported on a set of finite measure and $0 \le g \le f$. This supremum is either<br />
finite or infinite. When<br />
\[ \int f(x)\,dx < \infty \]<br />
we say that $f$ is Lebesgue integrable (or just integrable for short).<br />
Proposition 41. The integral of non-negative measurable functions satisfies:<br />
(1) Linearity.<br />
(2) Additivity.<br />
(3) If $g$ is integrable and $f$ is measurable with $0 \le f \le g$, then $f$ is<br />
integrable.<br />
(4) If $f$ is integrable, then $f(x) < \infty$ for a.e. $x$.<br />
(5) If $\int f = 0$, then $f(x) = 0$ for a.e. $x$.<br />
Proof. For part (1), let us check that $\int (f + g) = \int f + \int g$. Let $0 \le \phi \le f$ and $0 \le \psi \le g$<br />
be bounded measurable functions supported on sets of finite measure. Then<br />
\[ \phi + \psi \le f + g, \]<br />
so $\int \phi + \int \psi = \int (\phi + \psi) \le \int (f + g)$; taking the supremum over $\phi$ and $\psi$ gives<br />
\[ \int f + \int g \le \int (f + g). \]<br />
Now, let $0 \le \eta \le f + g$ be a bounded measurable function supported on a set of finite<br />
measure. Let $\eta_1(x) = \min\{f(x), \eta(x)\}$ and $\eta_2 = \eta - \eta_1$. By definition, $\eta = \eta_1 + \eta_2$,<br />
and one can easily check that $\eta_1 \le f$ and $\eta_2 \le g$. Hence<br />
\[ \int \eta = \int \eta_1 + \int \eta_2 \le \int f + \int g, \]<br />
which implies<br />
\[ \int (f + g) \le \int f + \int g. \]<br />
To prove part (4), define $E_k = \{x \mid f(x) \ge k\}$ and $E_\infty = \{x \mid f(x) = \infty\}$; then<br />
$E_k \searrow E_\infty$. Since $f$ is integrable, for every $k$,<br />
\[ \infty > \int f \ge \int \chi_{E_k} f \ge k\, m(E_k). \]<br />
Hence $m(E_k) \to 0$ as $k \to \infty$, and then $m(E_\infty) = 0$ by Corollary 22.<br />
5.3.1 Fatou’s Lemma<br />
As in the case of bounded functions, we are concerned with how well the<br />
definition of the integral behaves under limits. Suppose that we have a sequence<br />
$\{f_n\}$ of non-negative measurable functions, and that $f_n(x) \to f(x)$ for almost every<br />
$x$. Is it true that $\int f_n(x)\,dx \to \int f(x)\,dx$? The answer is no, as we can check in<br />
the following example:<br />
Example 42. Let $f_n(x) = n\chi_{[0,1/n]}(x)$. Then $f_n \to f = 0$ a.e. and $\int f_n = 1$ for every $n$, but $\int f = 0$.<br />
However, note that $\lim \int f_n > \int f$. An inequality of this kind, with $\ge$ and $\liminf$, holds in general, and is the content<br />
of Fatou’s Lemma.<br />
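The example can be checked with exact arithmetic. In the Python sketch below the helper names `f` and `integral` are illustrative; the integral of $f_n$ is computed from the simple-function formula (height $n$ times the length $1/n$ of the interval), and the pointwise values are sampled at the fixed point $x = 1/50$.<br />

```python
from fractions import Fraction

def f(n: int, x: Fraction) -> Fraction:
    """f_n = n * chi_[0, 1/n], evaluated at x."""
    return Fraction(n) if 0 <= x <= Fraction(1, n) else Fraction(0)

def integral(n: int) -> Fraction:
    """Integral of the simple function f_n: height n times m([0, 1/n])."""
    return n * Fraction(1, n)

x = Fraction(1, 50)
ns = (10, 50, 51, 1000)
pointwise = [f(n, x) for n in ns]   # equals 0 as soon as 1/n < x
integrals = [integral(n) for n in ns]

print(pointwise)
print(integrals)
```

At the sampled point the values are $10, 50, 0, 0$, so $f_n(x)$ is eventually $0$, while every integral equals $1$: the mass escapes into an ever thinner and taller spike.<br />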
Lemma 43 (Fatou’s Lemma). Suppose that $\{f_n\}$ is a sequence of non-negative<br />
measurable functions. If $\lim_{n\to\infty} f_n(x) = f(x)$ for a.e. $x$, then<br />
\[ \int f \le \liminf_{n\to\infty} \int f_n. \]<br />
Proof. Let g be a measurable function supported on a set E of finite measure,<br />
and assume g is bounded by M and 0 ≤ g ≤ f. For every x, set<br />
Then <strong>for</strong> every n,<br />
g n (x) = min{g(x), f n (x)}.<br />
|g n (x)| ≤ |g(x)| ≤ M<br />
and g n (x) → g(x). By <strong>the</strong> Bounded Convergence Theorem<br />
∫ ∫<br />
g n → g<br />
since g n < f n we have ∫ g n < ∫ f n which implies<br />
∫ ∫<br />
g ≤ liminf<br />
f n<br />
taking supreme over g:<br />
∫<br />
∫<br />
f ≤ liminf<br />
f n .<br />
Fatou’s lemma remains true even if the limit $f$ does not exist; in this case, we<br />
replace $f$ by $\liminf f_n$.<br />
Corollary 44. Suppose $f$ is a non-negative measurable function, and $\{f_n\}$ is<br />
a sequence of non-negative measurable functions with $f_n(x) \le f(x)$ and $f_n(x) \to f(x)$<br />
for a.e. $x$. Then<br />
$$\lim_{n\to\infty} \int f_n = \int f.$$<br />
Proof. Since $f_n(x) \le f(x)$ for a.e. $x$, we have $\int f_n \le \int f$ for all $n$. Hence, by<br />
Fatou’s Lemma, we have<br />
$$\limsup \int f_n \le \int f \le \liminf \int f_n \le \limsup \int f_n,$$<br />
which implies<br />
$$\lim_{n\to\infty} \int f_n = \int f.$$<br />
In analogy with the symbols $\nearrow$ and $\searrow$ used for sets, we write $f_n \nearrow f$ if<br />
$f_n(x) \le f_{n+1}(x)$ and $f_n(x) \to f(x)$ for a.e. $x$. Similarly, we write $f_n \searrow f$ if<br />
$f_n(x) \ge f_{n+1}(x)$ and $f_n(x) \to f(x)$ for a.e. $x$. As a special case of the previous<br />
corollary we have the following:<br />
Corollary 45 (Monotone Convergence Theorem). Suppose $\{f_n\}$ is a sequence<br />
of measurable functions such that $f_n \ge 0$ and $f_n \nearrow f$. Then<br />
$$\lim_{n\to\infty} \int f_n = \int f.$$<br />
The Monotone Convergence Theorem provides a useful criterion for convergence<br />
of series:<br />
Corollary 46. Consider a series $\sum_{k=1}^\infty a_k(x)$ where, for each $k$, $a_k$ is a non-negative<br />
measurable function. Then<br />
$$\int \sum_{k=1}^\infty a_k(x)\,dx = \sum_{k=1}^\infty \int a_k(x)\,dx.$$<br />
Moreover, if $\sum_{k=1}^\infty \int a_k(x)\,dx < \infty$, then $\sum_{k=1}^\infty a_k(x)$ converges for a.e. $x$.<br />
Proof. Let $f_n(x) = \sum_{k=1}^n a_k(x)$ and $f(x) = \sum_{k=1}^\infty a_k(x)$; then for each $n$, $f_n$ is<br />
measurable, non-negative, and $f_n \nearrow f$.<br />
By the Monotone Convergence Theorem, $\lim \int f_n = \int f$, but<br />
$$\int f_n = \sum_{k=1}^n \int a_k(x)\,dx.$$<br />
Then<br />
$$\sum_{k=1}^\infty \int a_k(x)\,dx = \int \sum_{k=1}^\infty a_k(x)\,dx.$$<br />
Now, if $\sum_{k=1}^\infty \int a_k(x)\,dx < \infty$, then $\sum_{k=1}^\infty a_k(x)$ is integrable, hence<br />
$$\sum_{k=1}^\infty a_k(x) < \infty \quad \text{for a.e. } x.$$<br />
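As an illustration of Corollary 46, take the hypothetical example $a_k(x) = x^{k-1}(1 - x)$ on $[0, 1]$ (and $0$ elsewhere), which is not from the notes. Each $a_k$ is non-negative, $\int a_k = 1/k - 1/(k+1)$, and the partial sums telescope to $1 - x^n$. The sketch below checks both sides with exact fractions.<br />

```python
from fractions import Fraction

def integral_a(k: int) -> Fraction:
    """Exact integral over [0, 1] of a_k(x) = x**(k-1) * (1 - x)."""
    return Fraction(1, k) - Fraction(1, k + 1)

def partial_sum_of_integrals(n: int) -> Fraction:
    """Sum of the first n integrals; telescopes to 1 - 1/(n+1)."""
    return sum((integral_a(k) for k in range(1, n + 1)), Fraction(0))

def partial_sum_pointwise(n: int, x: Fraction) -> Fraction:
    """f_n(x) = a_1(x) + ... + a_n(x); telescopes to 1 - x**n."""
    return sum((x ** (k - 1) * (1 - x) for k in range(1, n + 1)), Fraction(0))

for n in (1, 9, 99):
    # Both quantities increase to 1, the integral of the limit f = 1 on [0, 1).
    print(n, partial_sum_of_integrals(n), partial_sum_pointwise(n, Fraction(1, 2)))
```

Here the sum of the integrals converges to $1$, which agrees with the integral of the pointwise limit, as the corollary predicts.<br />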
Let $\{E_k\}$ be a sequence of measurable sets, and define<br />
$$\limsup E_k = \bigcap_{n=1}^\infty \bigcup_{k=n}^\infty E_k.$$<br />
Corollary 47 (Borel-Cantelli Lemma). Let $\{E_k\}$ be a sequence of measurable<br />
sets. If $\sum_{k=1}^\infty m(E_k) < \infty$, then $m(\limsup E_k) = 0$.<br />
Proof. Let $a_k(x) = \chi_{E_k}(x)$; then $\int a_k(x)\,dx = m(E_k)$. So<br />
$$\sum_{k=1}^\infty m(E_k) = \sum_{k=1}^\infty \int a_k(x)\,dx < \infty,$$<br />
and by the corollary above, $\sum_{k=1}^\infty a_k(x) < \infty$ a.e. Note that $\sum_{k=1}^\infty a_k(x) = \infty$<br />
if, and only if, $x \in \limsup E_k$; hence $m(\limsup E_k) = 0$.<br />
5.4 Integrable functions<br />
Finally, we consider the definition of the integral in the general case. Let<br />
$f : \mathbb{R} \to [-\infty, \infty]$ be measurable. We say that $f$ is Lebesgue integrable (or just integrable) if $|f|$ is<br />
Lebesgue integrable (in the sense of non-negative functions). Now, let $f$ be an<br />
integrable function. Define<br />
$$f^+(x) = \max(f(x), 0), \qquad f^-(x) = \max(-f(x), 0).$$<br />
Then $f^+, f^- \ge 0$ and $f = f^+ - f^-$. Also, note that $f^\pm \le |f| = f^+ + f^-$. So $f$ is integrable<br />
if, and only if, $f^-$ and $f^+$ are integrable. We define<br />
$$\int f = \int f^+ - \int f^-.$$<br />
Note that the integral of $f$ does not depend on the decomposition of $f$ into<br />
non-negative functions. Indeed, if $f = g_1 - g_2 = f_1 - f_2$, where $f_1, f_2, g_1, g_2 \ge 0$<br />
are integrable functions, then<br />
$$g_1 + f_2 = f_1 + g_2,$$<br />
which implies<br />
$$\int g_1 + \int f_2 = \int f_1 + \int g_2,$$<br />
so we have<br />
$$\int g_1 - \int g_2 = \int f_1 - \int f_2.$$<br />
Proposition 48. The integral of Lebesgue integrable functions is (1) linear, (2)<br />
additive, (3) monotone, and satisfies <strong>the</strong> triangle inequality.<br />
Theorem 49. Suppose $f$ is integrable on $\mathbb{R}$. Then for every $\epsilon > 0$:<br />
(i) There exists a set $B$ of finite measure such that<br />
$$\int_{B^c} |f| < \epsilon.$$<br />
(ii) (Absolute continuity) There exists $\delta > 0$ such that if $m(E) < \delta$, then<br />
$$\int_E |f| < \epsilon.$$<br />
Proof. By replacing $f$ by $|f|$, we can assume $f \ge 0$. For every $N \in \mathbb{N}$, let<br />
$E_N = \{x : |x| \le N,\ f(x) \le N\}$ and $f_N(x) = f(x)\chi_{E_N}(x)$. Each $E_N$ has finite measure, since $E_N \subset [-N, N]$.<br />
Proof of (i). By definition $f_N$ is non-negative and measurable. Also $f_N(x) \le f_{N+1}(x)$<br />
and $\lim_{N\to\infty} f_N(x) = f(x)$ for a.e. $x$, because $f$ is finite a.e. By the Monotone Convergence Theorem,<br />
$$\lim_{N\to\infty} \int f_N = \int f.$$<br />
Thus, given $\epsilon > 0$ there exists $M > 0$ such that<br />
$$\int f - \int f\chi_{E_M} < \epsilon.$$<br />
But $1 - \chi_{E_M} = \chi_{E_M^c}$, hence<br />
$$\int f\,(1 - \chi_{E_M}) = \int_{E_M^c} f < \epsilon,$$<br />
and we can take $B = E_M$.<br />
Proof of (ii). Again, given $\epsilon > 0$ there exists $M > 0$ such that<br />
$$\int (f - f_M) < \epsilon/2.$$<br />
Pick $\delta > 0$ such that $M\delta < \epsilon/2$. If $m(E) < \delta$, then, since $f_M(x) \le M$ for every $x$, we have<br />
$$\int_E f = \int_E (f - f_M) + \int_E f_M \le \epsilon/2 + M\,m(E) < \epsilon/2 + M\delta < \epsilon.$$<br />
Theorem 50 (Dominated Convergence Theorem). Suppose $\{f_n\}$ is a sequence<br />
of measurable functions such that $f_n(x) \to f(x)$ almost everywhere, and $|f_n(x)| \le g(x)$ for all $n$,<br />
where $g$ is an integrable function. Then<br />
$$\int |f_n - f| \to 0 \quad \text{as } n \to \infty,$$<br />
which implies<br />
$$\int f_n \to \int f \quad \text{as } n \to \infty.$$<br />
Proof. For each $N \ge 0$, let $E_N = \{x : |x| \le N,\ g(x) \le N\}$. Given $\epsilon > 0$ there<br />
exists $M > 0$ such that<br />
$$\int_{E_M^c} g < \epsilon/3.$$<br />
Then $f_n\chi_{E_M}$ is measurable, bounded by $M$, and supported on a set of finite measure.<br />
By the Bounded Convergence Theorem, for all sufficiently large $n$,<br />
$$\int_{E_M} |f_n - f| < \epsilon/3.$$<br />
Thus, for such $n$, we have<br />
$$\int |f_n - f| = \int_{E_M} |f_n - f| + \int_{E_M^c} |f_n - f| \le \int_{E_M} |f_n - f| + 2\int_{E_M^c} g \le \epsilon/3 + 2\epsilon/3 = \epsilon.$$<br />
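The domination hypothesis cannot simply be dropped. For the sequence $f_n = n\chi_{[0,1/n]}$ of Example 42, any dominating function $g$ must satisfy $g \ge \max_{n \le N} f_n$ for every $N$. The sketch below (the function names are illustrative, not from the notes) computes the integral of this envelope exactly and shows that it equals the harmonic number $H_N$, which is unbounded; so no integrable $g$ can dominate the sequence.<br />

```python
from fractions import Fraction

def envelope_integral(N: int) -> Fraction:
    """Exact integral over [0, 1] of g_N(x) = max over 1 <= n <= N of n * chi_[0, 1/n](x).

    On (1/(k+1), 1/k] the maximum equals k (for k < N); on [0, 1/N] it equals N.
    """
    total = Fraction(N) * Fraction(1, N)            # piece [0, 1/N], height N
    for k in range(1, N):
        width = Fraction(1, k) - Fraction(1, k + 1)
        total += k * width                          # piece (1/(k+1), 1/k], height k
    return total

def harmonic(N: int) -> Fraction:
    """H_N = 1 + 1/2 + ... + 1/N."""
    return sum((Fraction(1, k) for k in range(1, N + 1)), Fraction(0))

for N in (1, 4, 100):
    print(N, float(envelope_integral(N)), envelope_integral(N) == harmonic(N))
```

This is consistent with the theorem: the conclusion $\int f_n \to \int f$ fails for this sequence, and so does the hypothesis.<br />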
5.4.1 The Lebesgue space L 1 .<br />
Let $L^1 = L^1(\mathbb{R})$ be the space of integrable functions on $\mathbb{R}$, under almost everywhere<br />
equivalence. That is, two elements $f, g \in L^1(\mathbb{R})$ are equivalent if<br />
$f(x) = g(x)$ for almost every $x$. The properties of the integral imply that $L^1$ is<br />
a vector space. For an integrable function $f$, the $L^1$-norm of $f$ is given by<br />
$$\|f\|_1 = \int_{\mathbb{R}} |f|\,dx.$$<br />
Proposition 51. The $L^1$-norm $\|\cdot\|_1$ satisfies, for all $f, g$ in $L^1$:<br />
(i) For all $a \in \mathbb{R}$, $\|af\|_1 = |a|\|f\|_1$,<br />
(ii) $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$,<br />
(iii) $\|f\|_1 = 0$ if, and only if, $f = 0$ a.e.,<br />
(iv) $d(f, g) = \|f - g\|_1$ defines a metric on $L^1(\mathbb{R})$.<br />
Due to almost everywhere equivalence, the $L^1$-norm is indeed a norm on $L^1$.<br />
On the other hand, the space $L^1$ is not quite a function space: since points<br />
have measure zero, we cannot evaluate elements of $L^1$ at a point. Nevertheless,<br />
we keep the imprecise terminology that an element $f \in L^1(\mathbb{R})$ is an integrable<br />
function, and every time we define an element of $L^1$ we keep in mind that the<br />
definition holds almost everywhere.<br />
Recall that a metric space $(X, d)$ is complete if every Cauchy sequence $\{x_i\}$<br />
converges in $X$. In fact, for a Cauchy sequence to converge it is enough to have<br />
a convergent subsequence:<br />
Lemma 52. Let $\{x_n\}$ be a Cauchy sequence in a metric space $X$. If there exists<br />
a subsequence $\{x_{n_k}\}$ such that<br />
$$x_{n_k} \to L \quad \text{as } k \to \infty,$$<br />
then $\{x_n\}$ converges to $L$.<br />
Proof. Let $\epsilon > 0$, and let $N > 0$ be such that for every $n_k, n, m > N$ we have<br />
$d(x_{n_k}, L) < \epsilon/2$ and $d(x_n, x_m) < \epsilon/2$. Then for every $n > N$, choosing any $n_k > N$,<br />
$$d(x_n, L) \le d(x_n, x_{n_k}) + d(x_{n_k}, L) < \epsilon.$$<br />
So $\{x_n\}$ converges to $L$.<br />
Now we prove that the metric defined in part (iv) of Proposition 51 is complete<br />
in $L^1$:<br />
Theorem 53 (Riesz-Fischer). The Lebesgue space $L^1$ is a complete metric space.<br />
Proof. Let $\{f_n\}$ be a Cauchy sequence in $L^1$. By Lemma 52, to prove that $\{f_n\}$<br />
converges, it is enough to show that $\{f_n\}$ has a convergent subsequence. Since<br />
$\{f_n\}$ is Cauchy, we can find a subsequence $\{f_{n_k}\}$ satisfying<br />
$$\|f_{n_{k+1}} - f_{n_k}\|_1 \le \frac{1}{2^k} \quad \text{for all } k \ge 1.$$<br />
Define<br />
$$f(x) = f_{n_1}(x) + \sum_{k=1}^\infty \left(f_{n_{k+1}}(x) - f_{n_k}(x)\right)$$<br />
and<br />
$$g(x) = |f_{n_1}(x)| + \sum_{k=1}^\infty |f_{n_{k+1}}(x) - f_{n_k}(x)|.$$<br />
Consider the partial sums<br />
$$S_K(f)(x) = f_{n_1}(x) + \sum_{k=1}^K \left(f_{n_{k+1}}(x) - f_{n_k}(x)\right)$$<br />
and<br />
$$S_K(g)(x) = |f_{n_1}(x)| + \sum_{k=1}^K |f_{n_{k+1}}(x) - f_{n_k}(x)|.$$<br />
By the triangle inequality, for all $K$,<br />
$$\|S_K(g)\|_1 \le \|f_{n_1}\|_1 + \sum_{k=1}^K \|f_{n_{k+1}} - f_{n_k}\|_1 \le \|f_{n_1}\|_1 + 2.$$<br />
Thus the integrals of the partial sums $S_K(g)$ are bounded and $S_K(g) \nearrow g$, so by the Monotone<br />
Convergence Theorem $g$ is integrable. In particular, the series defining $f$ converges absolutely for a.e. $x$. Since $|f| \le g$, $f$ is also integrable. The<br />
partial sums telescope, $S_K(f) = f_{n_{K+1}}$, so $f_{n_k}(x) \to f(x)$ for almost<br />
all $x$. To prove that $f_{n_k} \to f$ in $L^1$, note that<br />
$$|f - S_K(f)| \le 2g \quad \text{for all } K.$$<br />
By the Dominated Convergence Theorem,<br />
$$\|f_{n_k} - f\|_1 \to 0.$$<br />
6 Lebesgue measure in R d .<br />
So far, we have discussed measurability in dimension 1. All definitions carry<br />
over to higher dimensions. Here we write down the relevant settings and differences.<br />
Instead of intervals, in $\mathbb{R}^d$ we will deal with rectangles of the form<br />
$$R = (a_1, b_1) \times \dots \times (a_d, b_d),$$<br />
where $b_j \ge a_j$ for all $j$. Note that all rectangles have sides parallel to the<br />
coordinate axes. Instead of the length of an interval, we will deal with the<br />
volume of a rectangle<br />
$$|R| = \operatorname{vol}(R) = (b_1 - a_1) \cdots (b_d - a_d).$$<br />
In dimension 1, every open set is a countable union of disjoint open intervals.<br />
In higher dimensions this is not the case. However, in $\mathbb{R}^d$ every open set is a<br />
countable union of almost disjoint rectangles, meaning that rectangles with disjoint interiors cover<br />
the open set except for points on the boundaries of the rectangles. Since the boundary of<br />
a rectangle has volume 0, this causes no difficulty.<br />
Here are the definitions of Lebesgue outer measure in $\mathbb{R}^d$ and Lebesgue<br />
measurability:<br />
Definition. Let $E \subset \mathbb{R}^d$. The outer measure of $E$ is defined by<br />
$$m^*(E) = \inf \sum_{j \in \sigma} |Q_j|,$$<br />
where the infimum is taken over all countable coverings $\{Q_j\}_{j \in \sigma}$ of $E$ by open rectangles.<br />
Definition. A set $E \subset \mathbb{R}^d$ is called Lebesgue measurable, or simply measurable,<br />
if for any $\epsilon > 0$ there exists an open set $U \subset \mathbb{R}^d$ containing $E$ such that<br />
$$m^*(U \setminus E) < \epsilon.$$<br />
In this case, we define the Lebesgue measure of $E$, or measure of $E$, by<br />
$$m(E) = m^*(E).$$<br />
The outer measure and the Lebesgue measurable sets in $\mathbb{R}^d$ have the same properties<br />
as in $\mathbb{R}$. In this section, we will describe properties of Lebesgue measurable<br />
sets in $\mathbb{R}^d$ with respect to its factors.<br />
Write $\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ with $d = d_1 + d_2$ and $d_1, d_2 \ge 1$. A point in $\mathbb{R}^d$ has<br />
the form $(x, y)$ where $x \in \mathbb{R}^{d_1}$ and $y \in \mathbb{R}^{d_2}$. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a function. The<br />
$y$-slice of $f$ corresponding to $y \in \mathbb{R}^{d_2}$ is the function<br />
$$f^y : \mathbb{R}^{d_1} \to \mathbb{R}$$<br />
given by<br />
$$f^y(x) = f(x, y).$$<br />
Similarly, the $x$-slice of $f$ corresponding to $x \in \mathbb{R}^{d_1}$ is the function<br />
$$f_x : \mathbb{R}^{d_2} \to \mathbb{R}$$<br />
defined by<br />
$$f_x(y) = f(x, y).$$<br />
For sets, let $E$ be a measurable set in $\mathbb{R}^d$. For $y \in \mathbb{R}^{d_2}$ and<br />
$x \in \mathbb{R}^{d_1}$ let<br />
$$E^y = \{x \in \mathbb{R}^{d_1} : (x, y) \in E\}$$<br />
and<br />
$$E_x = \{y \in \mathbb{R}^{d_2} : (x, y) \in E\}.$$<br />
If $f$ is measurable, it does not immediately follow that $f_x$ or $f^y$ is measurable<br />
for every $x \in \mathbb{R}^{d_1}$ and $y \in \mathbb{R}^{d_2}$. For $E$ measurable, it might happen that some<br />
slice $E^y$ is not measurable.<br />
Example 54. Let $y \in \mathbb{R}$ and $E = N \times \{y\}$, where $N \subset \mathbb{R}$ is a non-measurable set. Since $E$ is contained in a set of measure<br />
0 in $\mathbb{R}^2$ (the line $\mathbb{R} \times \{y\}$), $E$ is measurable in $\mathbb{R}^2$. But $E^y = N$ is not measurable in $\mathbb{R}$. Fubini’s<br />
theorem and its consequences deal with this kind of consideration.<br />
Theorem 55 (Fubini’s Theorem). Suppose $f(x, y)$ is integrable on $\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Then for almost every $y \in \mathbb{R}^{d_2}$:<br />
(1) The slice $f^y$ is integrable on $\mathbb{R}^{d_1}$.<br />
(2) The function given by<br />
$$y \mapsto \int_{\mathbb{R}^{d_1}} f^y(x)\,dx$$<br />
is integrable on $\mathbb{R}^{d_2}$. Moreover,<br />
(3) $\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f^y(x)\,dx \right) dy = \int_{\mathbb{R}^d} f$.<br />
The theorem is symmetric in $x$ and $y$. So we have that for almost every<br />
$x \in \mathbb{R}^{d_1}$, $f_x$ is integrable on $\mathbb{R}^{d_2}$, the function $x \mapsto \int_{\mathbb{R}^{d_2}} f_x(y)\,dy$ is integrable on<br />
$\mathbb{R}^{d_1}$, and<br />
$$\int_{\mathbb{R}^{d_1}} \int_{\mathbb{R}^{d_2}} f_x(y)\,dy\,dx = \int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} f^y(x)\,dx\,dy = \int_{\mathbb{R}^d} f.$$<br />
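The equality of the two iterated integrals can be illustrated numerically. The sketch below uses the hypothetical example $f(x, y) = x y^2$ on $[0, 1]^2$ (extended by $0$), whose integral is $1/6$, and approximates both iterated integrals with midpoint Riemann sums. This is only a numerical illustration under that assumption, not the Lebesgue construction itself.<br />

```python
def midpoints(n: int):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]

def f(x: float, y: float) -> float:
    return x * y * y

def iterated_dx_then_dy(n: int) -> float:
    """Integrate each y-slice in x first, then integrate the result in y."""
    h = 1.0 / n
    pts = midpoints(n)
    return sum(sum(f(x, y) for x in pts) * h for y in pts) * h

def iterated_dy_then_dx(n: int) -> float:
    """Integrate each x-slice in y first, then integrate the result in x."""
    h = 1.0 / n
    pts = midpoints(n)
    return sum(sum(f(x, y) for y in pts) * h for x in pts) * h

a = iterated_dx_then_dy(200)
b = iterated_dy_then_dx(200)
print(a, b, 1 / 6)   # both orders agree and are close to 1/6
```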
Proof. Let $\mathcal{F}$ denote the class of integrable functions on $\mathbb{R}^d$ which satisfy all three<br />
conclusions of the theorem. The goal is to prove<br />
$$L^1(\mathbb{R}^d) \subset \mathcal{F}.$$<br />
We will achieve this goal in several steps.<br />
Step 1. $\mathcal{F}$ is closed under finite linear combinations. Let $\{f_k\}_{k=1}^N \subset \mathcal{F}$. For each $k$<br />
there exists $A_k \subset \mathbb{R}^{d_2}$ such that $m(A_k) = 0$ and $f_k^y$ is integrable for all $y \in A_k^c$.<br />
Let $A = \bigcup_{k=1}^N A_k$; then $m(A) = 0$, and for all $k$ and every $y \in A^c$, $f_k^y$ is integrable, which<br />
implies that $\sum_{k=1}^N a_k f_k^y$ is integrable for every choice of coefficients $a_k$. Also,<br />
since each function<br />
$$y \mapsto \int_{\mathbb{R}^{d_1}} f_k^y(x)\,dx$$<br />
is integrable, so is every linear combination<br />
$$y \mapsto \int_{\mathbb{R}^{d_1}} \sum_{k=1}^N a_k f_k^y(x)\,dx,$$<br />
and, by the properties of the integral,<br />
$$\int_{\mathbb{R}^d} \sum_{k=1}^N a_k f_k = \sum_{k=1}^N a_k \int_{\mathbb{R}^d} f_k = \sum_{k=1}^N a_k \int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} f_k^y(x)\,dx\,dy = \int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} \sum_{k=1}^N a_k f_k^y(x)\,dx\,dy.$$<br />
So $\sum_{k=1}^N a_k f_k \in \mathcal{F}$.<br />
Step 2. $\mathcal{F}$ is closed under monotone limits.<br />
Let $\{f_k\}$ be a sequence of elements in $\mathcal{F}$ such that $f_k \nearrow f$ or $f_k \searrow f$, where $f$ is integrable on $\mathbb{R}^d$. By<br />
considering $-f_k$ if necessary we may assume $f_k \nearrow f$. Also, by replacing $f_k$ by<br />
$f_k - f_1$, we may assume that $f_k \ge 0$. By the Monotone Convergence Theorem,<br />
$$\lim_{k\to\infty} \int_{\mathbb{R}^d} f_k = \int_{\mathbb{R}^d} f.$$<br />
For all $k$, there exists $A_k \subset \mathbb{R}^{d_2}$ with $m(A_k) = 0$ so that $f_k^y$ is integrable<br />
on $\mathbb{R}^{d_1}$ whenever $y \in A_k^c$. Let $A = \bigcup_{k=1}^\infty A_k$; then $m(A) = 0$, which implies<br />
that for all $y \in A^c$ and all $k$ the function $f_k^y$ is integrable on $\mathbb{R}^{d_1}$. Since $f_k^y \nearrow f^y$, the Monotone Convergence Theorem gives<br />
$$g_k(y) = \int_{\mathbb{R}^{d_1}} f_k^y(x)\,dx \to g(y) = \int_{\mathbb{R}^{d_1}} f^y(x)\,dx.$$<br />
Moreover $g_k \nearrow g$, so<br />
$$\int_{\mathbb{R}^{d_2}} g_k(y)\,dy \to \int_{\mathbb{R}^{d_2}} g(y)\,dy.$$<br />
Since $f_k \in \mathcal{F}$, the left-hand side equals $\int_{\mathbb{R}^d} f_k$, and we have seen that<br />
$$\int_{\mathbb{R}^d} f_k \to \int_{\mathbb{R}^d} f < \infty.$$<br />
Hence $g$ is integrable on $\mathbb{R}^{d_2}$, $f^y$ is integrable for a.e. $y$, and $\int_{\mathbb{R}^{d_2}} g(y)\,dy = \int_{\mathbb{R}^d} f$,<br />
which implies that $f \in \mathcal{F}$.<br />
Step 3. For every $G_\delta$ set $E$ of finite measure and every set $E$ of measure zero,<br />
$\chi_E \in \mathcal{F}$. We separate this step into several cases. Let us consider first the case<br />
when $E$ is an open rectangle, so $E = R_1 \times R_2$. Clearly, for every $y \in \mathbb{R}^{d_2}$, $\chi_E^y$<br />
is integrable and<br />
$$g(y) = \int_{\mathbb{R}^{d_1}} \chi_E(x, y)\,dx = \begin{cases} |R_1| & \text{if } y \in R_2, \\ 0 & \text{otherwise,} \end{cases}$$<br />
hence $g = |R_1|\chi_{R_2}$, which implies<br />
$$\int_{\mathbb{R}^{d_2}} g(y)\,dy = |R_1||R_2|.$$<br />
But<br />
$$\int_{\mathbb{R}^d} \chi_E = |R_1||R_2|,$$<br />
so $\chi_E \in \mathcal{F}$.<br />
Suppose now that $E \subset \partial(R_1 \times R_2)$, so that $\int_{\mathbb{R}^d} \chi_E = 0$. It is clear that almost<br />
every $y$-slice of $\chi_E$ is integrable and that<br />
$$g(y) = \int_{\mathbb{R}^{d_1}} \chi_E(x, y)\,dx$$<br />
satisfies $g(y) = 0$ for almost all $y \in \mathbb{R}^{d_2}$. Then<br />
$$\int_{\mathbb{R}^{d_2}} g(y)\,dy = 0.$$<br />
So, again, $\chi_E \in \mathcal{F}$.<br />
To check that $\chi_E \in \mathcal{F}$ when $E = \bigcup_{k=1}^N Q_k$ is a finite union of disjoint<br />
rectangles, note that $\chi_E = \sum \chi_{Q_k} = \sum \chi_{\operatorname{int} Q_k} + \sum \chi_{\partial Q_k}$, where $\operatorname{int} Q_k$ denotes<br />
the interior of $Q_k$. By the first step of the proof and the previous cases, $\chi_E \in \mathcal{F}$.<br />
When $E$ is an open set of finite measure, $E = \bigcup_{k=1}^\infty Q_k$ is an almost<br />
disjoint union of rectangles $Q_k$. Let $f_N = \sum_{k=1}^N \chi_{Q_k}$; then $f_N \nearrow \chi_E$ a.e., and by the<br />
previous case $f_N \in \mathcal{F}$. The second step of the proof implies $\chi_E \in \mathcal{F}$.<br />
Finally, let us check that $\chi_E \in \mathcal{F}$ when $E$ is a $G_\delta$ set of finite measure. By<br />
definition there exists a countable sequence of open sets $\{\tilde U_k\}$ such that<br />
$$E = \bigcap_{k=1}^\infty \tilde U_k.$$<br />
Since $m(E) < \infty$, there exists an open set $U_0$ which contains $E$ and satisfies $m(U_0) < \infty$.<br />
Let<br />
$$U_k = U_0 \cap \bigcap_{j=1}^k \tilde U_j;$$<br />
then $U_k \searrow E$ and $f_k = \chi_{U_k}$ is a sequence such that $f_k \searrow \chi_E$. By the previous<br />
case $\chi_{U_k} \in \mathcal{F}$, and by the second step of the proof $\chi_E \in \mathcal{F}$.<br />
Assume that $E$ is a set of measure 0. Then there exists a $G_\delta$ set $G$ such that<br />
$E \subset G$ and $m(G) = 0$. Then $\chi_G \in \mathcal{F}$ and also $E^y \subset G^y$ for all $y$. Then<br />
$$\int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} \chi_G(x, y)\,dx\,dy = \int_{\mathbb{R}^d} \chi_G = 0.$$<br />
But this implies that<br />
$$\int_{\mathbb{R}^{d_1}} \chi_G(x, y)\,dx = 0$$<br />
for almost all $y$. Then $m(G^y) = 0$ for a.e. $y$, and since $E^y \subset G^y$ we have $m(E^y) = 0$<br />
for almost every $y$. Then<br />
$$\int_{\mathbb{R}^{d_1}} \chi_{E^y}(x)\,dx = 0$$<br />
for almost every $y$, and so<br />
$$\int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} \chi_E(x, y)\,dx\,dy = 0 = \int_{\mathbb{R}^d} \chi_E.$$<br />
Step 4. Conclusion. After all the cases of Step 3, let us check that $\chi_E \in \mathcal{F}$<br />
when $E$ is any measurable set of finite measure. In this case, there exists a $G_\delta$<br />
set $G$ with $E \subset G$ and $m(G \setminus E) = 0$. Thus $\chi_E = \chi_G - \chi_{G \setminus E}$, so $\chi_E$ is a<br />
linear combination of elements of $\mathcal{F}$. By the first step, $\chi_E \in \mathcal{F}$. When $f$ is<br />
any integrable function, we can decompose $f = f^+ - f^-$, where $f^+$ and $f^-$ are<br />
non-negative integrable functions. Each of $f^+$ and $f^-$ is a monotone limit of simple functions<br />
supported on sets of finite measure. By the previous steps, $f \in \mathcal{F}$. So $L^1(\mathbb{R}^d) \subset \mathcal{F}$, as we<br />
wanted to show.<br />
It is worth noting that a version of Fubini’s theorem is valid when $f$ is a non-negative<br />
measurable function:<br />
Theorem 56. Suppose $f(x, y)$ is a non-negative measurable function on $\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Then for almost<br />
every $y \in \mathbb{R}^{d_2}$:<br />
(1) The slice $f^y$ is measurable on $\mathbb{R}^{d_1}$.<br />
(2) The function given by<br />
$$y \mapsto \int_{\mathbb{R}^{d_1}} f^y(x)\,dx$$<br />
is measurable on $\mathbb{R}^{d_2}$. Moreover,<br />
(3) $\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f^y(x)\,dx \right) dy = \int_{\mathbb{R}^d} f$, where both sides may be infinite.<br />
Here is how we apply Theorem 56. Suppose we are given a measurable function<br />
$f$ and we want to compute $\int_{\mathbb{R}^d} f$. To justify the use of iterated integrals,<br />
we first apply Theorem 56 to $|f|$. If we find that $\int |f| < \infty$, then $f$ is<br />
integrable, and Fubini’s Theorem applies to $f$.<br />
Proof. Consider the truncations<br />
$$f_k(x, y) = \begin{cases} f(x, y) & \text{if } |(x, y)| \le k \text{ and } f(x, y) \le k, \\ 0 & \text{otherwise.} \end{cases}$$<br />
Then $f_k$ is integrable for every $k$, and $f_k \nearrow f$. By Fubini’s Theorem, for every<br />
$k$ there exists $E_k \subset \mathbb{R}^{d_2}$ such that $m(E_k) = 0$ and $f_k^y$ is integrable for every $y \in E_k^c$.<br />
Let $E = \bigcup E_k$, so $m(E) = 0$ and for every $y \in E^c$ we have $f_k^y \nearrow f^y$, where $f^y$<br />
is measurable. Moreover, by the Monotone Convergence Theorem,<br />
$$\int_{\mathbb{R}^{d_1}} f_k^y(x)\,dx \to \int_{\mathbb{R}^{d_1}} f^y(x)\,dx$$<br />
monotonically, hence<br />
$$\int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} f_k^y(x)\,dx\,dy \to \int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} f^y(x)\,dx\,dy$$<br />
and<br />
$$\int_{\mathbb{R}^d} f_k \to \int_{\mathbb{R}^d} f,$$<br />
which implies, by Fubini’s theorem applied to $f_k$, that<br />
$$\int_{\mathbb{R}^{d_2}} \int_{\mathbb{R}^{d_1}} f^y(x)\,dx\,dy = \int_{\mathbb{R}^d} f.$$<br />
Corollary 57. Let $E$ be a measurable set in $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Then<br />
$$E^y = \{x \in \mathbb{R}^{d_1} : (x, y) \in E\}$$<br />
is measurable for almost every $y \in \mathbb{R}^{d_2}$, the function $y \mapsto m(E^y)$ is measurable,<br />
and<br />
$$m(E) = \int_{\mathbb{R}^{d_2}} m(E^y)\,dy.$$<br />
Proof. The corollary is just Theorem 56 applied to $\chi_E$.<br />
The converse to this corollary is false, as the following example shows.<br />
Example 58. Consider $E = [0, 1] \times N$, where $N$ is the non-measurable set we<br />
constructed in Example 15. Then $E^y$ is $[0, 1]$ for $y \in N$ and $\emptyset$ for $y \in N^c$. Hence<br />
$E^y$ is measurable for every $y \in \mathbb{R}$, but $E$ is not measurable in $\mathbb{R}^2$.<br />
Theorem 59. If $E = E_1 \times E_2$ is measurable in $\mathbb{R}^d$ and $m(E_2) > 0$, then $E_1$ is<br />
measurable.<br />
Proof. By the corollary above, for almost all $y \in \mathbb{R}^{d_2}$ the function<br />
$$\chi^y_{E_1 \times E_2}(x) = \chi_{E_1}(x) \cdot \chi_{E_2}(y)$$<br />
is a measurable function of $x$. Now, we claim that there exists $y \in E_2$ such that<br />
$\chi^y_{E_1 \times E_2}$ is measurable. If this claim is true, we have that $\chi^y_{E_1 \times E_2} = \chi_{E_1}$ is<br />
measurable, which implies that $E_1$ is measurable. Let us prove the claim. Let<br />
$F \subset \mathbb{R}^{d_2}$ be the set of all $y$ such that $E^y$ is measurable; then by the corollary above<br />
$m(F^c) = 0$. Now<br />
$$E_2 = (E_2 \cap F) \cup (E_2 \cap F^c),$$<br />
and<br />
$$0 < m(E_2) = m(E_2 \cap F) + m(E_2 \cap F^c) = m(E_2 \cap F),$$<br />
since $m(E_2 \cap F^c) = 0$. Thus $E_2 \cap F \ne \emptyset$, which proves the claim.<br />
Proposition 60. Let $E_1 \subset \mathbb{R}^{d_1}$ and $E_2 \subset \mathbb{R}^{d_2}$ be any subsets. Then<br />
$$m^*(E_1 \times E_2) \le m^*(E_1)\,m^*(E_2).$$<br />
If either $E_1$ or $E_2$ has outer measure 0, then $m^*(E_1 \times E_2) = 0$.<br />
Theorem 61. Suppose $E_1 \subset \mathbb{R}^{d_1}$ and $E_2 \subset \mathbb{R}^{d_2}$ are measurable subsets. Then<br />
$E = E_1 \times E_2$ is measurable in $\mathbb{R}^d$ and<br />
$$m(E) = m(E_1) \cdot m(E_2).$$<br />
Proof. It is enough to prove that $E$ is measurable. Since $E_1$ and $E_2$ are measurable,<br />
there exist $G_\delta$ sets $G_1 \subset \mathbb{R}^{d_1}$ and $G_2 \subset \mathbb{R}^{d_2}$ such that $E_i \subset G_i$ and<br />
$m^*(G_i \setminus E_i) = 0$ for $i = 1, 2$. Then $G = G_1 \times G_2$ is a $G_\delta$ set, hence measurable,<br />
and<br />
$$(G_1 \times G_2) \setminus (E_1 \times E_2) \subset \big((G_1 \setminus E_1) \times G_2\big) \cup \big(G_1 \times (G_2 \setminus E_2)\big).$$<br />
By Proposition 60, $m^*(G \setminus E) = 0$, so $E$ is measurable.<br />
Corollary 62. Suppose $f$ is a measurable function on $\mathbb{R}^{d_1}$. Then $\tilde f(x, y) = f(x)$<br />
is measurable on $\mathbb{R}^d$.<br />
Proof. For all $\alpha \in \mathbb{R}$, the set $E_1 = \{x \in \mathbb{R}^{d_1} : f(x) < \alpha\}$ is measurable in $\mathbb{R}^{d_1}$.<br />
Then, by Theorem 61,<br />
$$\{(x, y) \in \mathbb{R}^d : \tilde f(x, y) < \alpha\} = E_1 \times \mathbb{R}^{d_2}$$<br />
is measurable. Thus $\tilde f$ is measurable.<br />
Corollary 63. Suppose $f(x)$ is a non-negative function on $\mathbb{R}^d$ and let<br />
$$A = \{(x, y) \in \mathbb{R}^d \times \mathbb{R} : 0 \le y \le f(x)\}.$$<br />
Then $f$ is measurable if and only if $A$ is measurable in $\mathbb{R}^{d+1}$. Moreover, if $f$ is<br />
measurable then<br />
$$\int_{\mathbb{R}^d} f(x)\,dx = m(A).$$<br />
Proof. If $f$ is measurable on $\mathbb{R}^d$, then the previous corollary guarantees that<br />
the function<br />
$$F(x, y) = y - f(x)$$<br />
is measurable, so the set $A = \{y \ge 0\} \cap \{F \le 0\}$ is measurable. For the<br />
converse, assume $A$ is measurable. Then for every $x \in \mathbb{R}^d$ the slice<br />
$$A_x = \{y \in \mathbb{R} : (x, y) \in A\} = [0, f(x)]$$<br />
is measurable, and by Corollary 57 (with the roles of $x$ and $y$ exchanged) the function $x \mapsto m(A_x) = f(x)$ is measurable. Also<br />
$$m(A) = \int_{\mathbb{R}^{d+1}} \chi_A(x, y)\,dy\,dx = \int_{\mathbb{R}^d} m(A_x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx.$$<br />
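Corollary 63 says that the integral of a non-negative function is the measure of the region under its graph. As a rough numerical illustration, the sketch below takes the hypothetical example $f(x) = x^2$ on $[0, 1]$ and compares a grid-counting estimate of $m(A)$ with a midpoint-rule estimate of $\int f$. Both are approximations chosen for illustration only.<br />

```python
def f(x: float) -> float:
    return x * x

def region_measure(n: int) -> float:
    """Estimate m(A), A = {(x, y): 0 <= x <= 1, 0 <= y <= f(x)},
    by counting cells of an n-by-n grid on [0, 1]^2 whose center lies in A."""
    cell_area = (1.0 / n) ** 2
    inside = 0
    for i in range(n):
        x = (i + 0.5) / n
        for j in range(n):
            y = (j + 0.5) / n
            if 0.0 <= y <= f(x):
                inside += 1
    return inside * cell_area

def integral_of_f(n: int) -> float:
    """Midpoint-rule estimate of the integral of f over [0, 1]."""
    return sum(f((i + 0.5) / n) for i in range(n)) / n

area = region_measure(200)
integral = integral_of_f(200)
print(area, integral)   # both are close to 1/3
```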
Proposition 64. If $f$ is measurable on $\mathbb{R}^d$, then $\tilde f(x, y) = f(x - y)$ is measurable<br />
on $\mathbb{R}^d \times \mathbb{R}^d$.<br />
This proposition is particularly useful for defining convolutions. Given $f, g$<br />
measurable on $\mathbb{R}^d$, the function $f(x - y)g(y)$ is measurable on $\mathbb{R}^{2d}$. If $f$ and $g$ are<br />
integrable, then $f(x - y)g(y)$ is also integrable on $\mathbb{R}^{2d}$. The convolution of two integrable<br />
functions $f, g$ is given by<br />
$$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y)g(y)\,dy,$$<br />
which is a well-defined function of $x$ for almost all $x$.<br />
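A standard one-dimensional example (not taken from the notes) is $f = g = \chi_{[0,1]}$, whose convolution is the tent function $\max(0, 1 - |x - 1|)$. The sketch below approximates the convolution integral by a midpoint rule over $[0, 1]$, the support of $g$, and compares it with the closed form; the helper names are illustrative.<br />

```python
def indicator(t: float) -> float:
    """chi_[0, 1](t)."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

def convolve_at(x: float, n: int = 1000) -> float:
    """Midpoint-rule approximation of (f * g)(x) for f = g = chi_[0, 1].
    g vanishes outside [0, 1], so only y in [0, 1] contributes."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * h
        total += indicator(x - y) * indicator(y)
    return total * h

def tent(x: float) -> float:
    """Closed form of chi_[0, 1] * chi_[0, 1]."""
    return max(0.0, 1.0 - abs(x - 1.0))

for x in (-0.5, 0.5, 1.0, 1.5, 2.5):
    print(x, convolve_at(x), tent(x))
```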
6.1 Complex valued integrals<br />
Let $f : \mathbb{R}^d \to \mathbb{C}$, and write $\operatorname{Re}(f) = u$ and $\operatorname{Im}(f) = v$, so that<br />
$$f(x) = u(x) + iv(x)$$<br />
where $u, v : \mathbb{R}^d \to \mathbb{R}$. We say that $f$ is Lebesgue integrable if and only if<br />
$u$ and $v$ are both integrable. Note that a measurable $f$ is integrable if and only if<br />
$|f(x)| = \sqrt{u(x)^2 + v(x)^2}$ is integrable. In case $f$ is integrable we put<br />
$$\int_{\mathbb{R}^d} f = \int_{\mathbb{R}^d} u + i\int_{\mathbb{R}^d} v.$$<br />
7 Lebesgue L p spaces<br />
7.1 The space L 2<br />
Let $L^2(\mathbb{R}^d) \equiv L^2(\mathbb{R}^d, \mathbb{C})$ be the set of complex-valued measurable functions $f$ that satisfy<br />
$$\int_{\mathbb{R}^d} |f(x)|^2\,dx < \infty.$$<br />
The requirement that $L^2(\mathbb{R}^d)$ consist of complex-valued functions is just to<br />
state the definitions in their greatest generality. However, all properties of<br />
$L^2(\mathbb{R}^d)$ that will be discussed here apply to the space $L^2(\mathbb{R}^d, \mathbb{R})$ of real-valued<br />
functions satisfying $\int_{\mathbb{R}^d} |f(x)|^2\,dx < \infty$.<br />
We define the $L^2$-norm of a function $f \in L^2(\mathbb{R}^d)$ by<br />
$$\|f\|_2 = \left(\int_{\mathbb{R}^d} |f(x)|^2\,dx\right)^{1/2}.$$<br />
Again, as in the case of $L^1$, we consider two functions $f, g \in L^2(\mathbb{R}^d)$ to be<br />
the same if $f = g$ almost everywhere. An important property of $L^2(\mathbb{R}^d)$ is that it<br />
is equipped with an inner product given by<br />
$$\langle f, g\rangle = \int_{\mathbb{R}^d} f(x)\overline{g(x)}\,dx \quad \text{for all } f, g \in L^2(\mathbb{R}^d).$$<br />
Here $\bar z$ denotes the complex conjugate of $z$. Note that<br />
$$\langle f, f\rangle^{1/2} = \|f\|_2.$$<br />
Theorem 65. The space $L^2(\mathbb{R}^d)$ has the following properties:<br />
(1) $L^2(\mathbb{R}^d)$ is a vector space over $\mathbb{C}$.<br />
(2) If $f, g \in L^2(\mathbb{R}^d)$ then $f\bar g \in L^1(\mathbb{R}^d)$, and the Cauchy-Schwarz inequality<br />
holds:<br />
$$|\langle f, g\rangle| \le \|f\|_2\|g\|_2.$$<br />
(3) If $g \in L^2(\mathbb{R}^d)$ is fixed, the map $f \mapsto \langle f, g\rangle$ is linear, and $\langle f, g\rangle = \overline{\langle g, f\rangle}$.<br />
(4) The triangle inequality holds:<br />
$$\|f + g\|_2 \le \|f\|_2 + \|g\|_2.$$<br />
Proof. Proof of part (1). If $f, g \in L^2(\mathbb{R}^d)$ then<br />
$$|f(x) + g(x)| \le 2\max\{|f(x)|, |g(x)|\},$$<br />
which implies<br />
$$|f(x) + g(x)|^2 \le 4\left(|f(x)|^2 + |g(x)|^2\right).$$<br />
Integrating both sides:<br />
$$\int_{\mathbb{R}^d} |f(x) + g(x)|^2\,dx \le 4\left(\int_{\mathbb{R}^d} |f(x)|^2\,dx + \int_{\mathbb{R}^d} |g(x)|^2\,dx\right) < \infty,$$<br />
so $f + g \in L^2(\mathbb{R}^d)$. It is clear that for all $\lambda \in \mathbb{C}$, $\lambda f \in L^2(\mathbb{R}^d)$ whenever<br />
$f \in L^2(\mathbb{R}^d)$.<br />
Proof of part (2). Let us check that $f\bar g$ is integrable. For that we use the fact<br />
that if $A, B \ge 0$ then $2AB \le A^2 + B^2$, so<br />
$$|f||\bar g| \le \frac{1}{2}\left(|f|^2 + |g|^2\right).$$<br />
Integrating,<br />
$$\int |f||\bar g| \le \frac{1}{2}\left(\int |f|^2 + \int |g|^2\right) < \infty,$$<br />
thus $f\bar g$ is integrable. Let us prove Cauchy-Schwarz. If either $\|f\|_2 = 0$ or $\|g\|_2 = 0$,<br />
then $f\bar g = 0$ almost everywhere, so $\langle f, g\rangle = 0$ and<br />
the inequality holds. Assume next that $\|f\|_2 = \|g\|_2 = 1$; then<br />
$$|\langle f, g\rangle| \le \int |f\bar g| \le \frac{1}{2}\left(\|f\|_2^2 + \|g\|_2^2\right) = 1 = \|f\|_2\|g\|_2.$$<br />
In the general case, when neither $f$ nor $g$ has norm zero, consider $\tilde f = \frac{f}{\|f\|_2}$ and<br />
$\tilde g = \frac{g}{\|g\|_2}$. Then we have $\|\tilde f\|_2 = \|\tilde g\|_2 = 1$, and by the previous case<br />
$$|\langle \tilde f, \tilde g\rangle| \le 1,$$<br />
that is,<br />
$$\left|\int \tilde f\,\overline{\tilde g}\right| = \frac{1}{\|f\|_2\|g\|_2}\left|\int f\bar g\right| \le 1,$$<br />
which implies<br />
$$\left|\int f\bar g\right| \le \|f\|_2\|g\|_2.$$<br />
Proof of part (3). This is a consequence of the linearity of the integral: for all<br />
$f, f_1, f_2 \in L^2(\mathbb{R}^d)$ and $\alpha \in \mathbb{C}$,<br />
$$\langle f_1 + f_2, g\rangle = \int (f_1 + f_2)\bar g = \int f_1\bar g + \int f_2\bar g = \langle f_1, g\rangle + \langle f_2, g\rangle,$$<br />
also<br />
$$\langle \alpha f, g\rangle = \int (\alpha f)\bar g = \alpha\int f\bar g = \alpha\langle f, g\rangle,$$<br />
and<br />
$$\overline{\langle f, g\rangle} = \overline{\int f\bar g} = \int \overline{f\bar g} = \int \bar f g = \langle g, f\rangle.$$<br />
Proof of part (4).<br />
$$\begin{aligned} \|f + g\|_2^2 &= \langle f + g, f + g\rangle \\ &= \|f\|_2^2 + \langle f, g\rangle + \langle g, f\rangle + \|g\|_2^2 \\ &\le \|f\|_2^2 + 2|\langle f, g\rangle| + \|g\|_2^2 \\ &\le \|f\|_2^2 + 2\|f\|_2\|g\|_2 + \|g\|_2^2 \end{aligned}$$<br />
by the Cauchy-Schwarz inequality, hence<br />
$$\|f + g\|_2^2 \le (\|f\|_2 + \|g\|_2)^2.$$<br />
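The inequalities of Theorem 65 can be sanity-checked on discretized functions. The sketch below samples two hypothetical complex-valued functions on $[0, 1]$ (chosen only for illustration) at midpoints and replaces the integrals by Riemann sums. The discrete sums form an inner product themselves, so the inequalities hold up to rounding.<br />

```python
import cmath
import math

n = 2000
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]

def inner(f, g) -> complex:
    """Riemann-sum approximation of <f, g> = integral of f * conj(g) over [0, 1]."""
    return sum(f(x) * g(x).conjugate() for x in pts) * h

def norm2(f) -> float:
    """Approximate L^2 norm, computed as <f, f>**(1/2)."""
    return math.sqrt(inner(f, f).real)

f = lambda x: cmath.exp(2j * math.pi * x)    # e^{2 pi i x}
g = lambda x: complex(x, 1.0 - x)            # x + i(1 - x)
s = lambda x: f(x) + g(x)                    # the sum f + g

print(abs(inner(f, g)), norm2(f) * norm2(g))   # Cauchy-Schwarz: left <= right
print(norm2(s), norm2(f) + norm2(g))           # triangle inequality: left <= right
```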
As in the case of $L^1$, the space $L^2(\mathbb{R}^d)$ is a metric space with metric given<br />
by $d(f, g) = \|f - g\|_2$, and the Riesz-Fischer Theorem holds:<br />
Theorem 66 (Riesz-Fischer). The space $L^2(\mathbb{R}^d)$ is complete in its metric.<br />
Proof. The proof is a modification of the argument in the proof of the Riesz-Fischer<br />
theorem for $L^1$. Let $\{f_n\}$ be a Cauchy sequence in $L^2$. By Lemma 52, to prove that<br />
$\{f_n\}$ converges, it is enough to show that $\{f_n\}$ has a convergent subsequence.<br />
Since $\{f_n\}$ is Cauchy, we can find a subsequence $\{f_{n_k}\}$ satisfying<br />
$$\|f_{n_{k+1}} - f_{n_k}\|_2 \le \frac{1}{2^k} \quad \text{for all } k \ge 1.$$<br />
Define<br />
$$f(x) = f_{n_1}(x) + \sum_{k=1}^\infty \left(f_{n_{k+1}}(x) - f_{n_k}(x)\right)$$<br />
and<br />
$$g(x) = |f_{n_1}(x)| + \sum_{k=1}^\infty |f_{n_{k+1}}(x) - f_{n_k}(x)|.$$<br />
Consider the partial sums<br />
$$S_K(f)(x) = f_{n_1}(x) + \sum_{k=1}^K \left(f_{n_{k+1}}(x) - f_{n_k}(x)\right)$$<br />
and<br />
$$S_K(g)(x) = |f_{n_1}(x)| + \sum_{k=1}^K |f_{n_{k+1}}(x) - f_{n_k}(x)|.$$<br />
By the triangle inequality, for all $K$,<br />
$$\|S_K(g)\|_2 \le \|f_{n_1}\|_2 + \sum_{k=1}^K \|f_{n_{k+1}} - f_{n_k}\|_2 \le \|f_{n_1}\|_2 + 2.$$<br />
Thus the norms $\|S_K(g)\|_2$ are bounded and $S_K(g) \nearrow g$. Applying the Monotone<br />
Convergence Theorem to $S_K(g)^2 \nearrow g^2$, we get $\int g^2 < \infty$, so $g$ belongs to $L^2(\mathbb{R}^d)$. In particular $g$ is finite a.e., so the series defining $f$ converges absolutely a.e. Since $|f| \le g$,<br />
$$|f|^2 \le |g|^2$$<br />
implies that the function $f$ belongs to $L^2(\mathbb{R}^d)$. The partial sums telescope,<br />
$S_K(f) = f_{n_{K+1}}$, so $f_{n_k}(x) \to f(x)$ for almost all $x$. To prove that<br />
$f_{n_k} \to f$ in $L^2(\mathbb{R}^d)$, note that<br />
$$|f - S_K(f)|^2 \le (2g)^2 \quad \text{for all } K;$$<br />
then by the Dominated Convergence Theorem<br />
$$\|f_{n_k} - f\|_2 \to 0.$$<br />
Definition. A metric vector space $X$ is called separable if it contains a countable<br />
set $L$ such that the set of all finite linear combinations of elements in $L$ is<br />
dense in $X$.<br />
Theorem 67. The space $L^2(\mathbb{R}^d)$ is separable.<br />
Sketch of the proof. Let<br />
$\mathbb{Q} \times i\mathbb{Q} = \{z \in \mathbb{C} \mid \mathrm{Re}(z) \in \mathbb{Q},\ \mathrm{Im}(z) \in \mathbb{Q}\}$<br />
and<br />
$\mathcal{Q} = \{\text{all rectangles in } \mathbb{R}^d \text{ with rational coordinates}\}.$<br />
Since $\mathbb{Q} \times i\mathbb{Q}$ and $\mathcal{Q}$ are countable, the set<br />
$A = \{r\chi_R : r \in \mathbb{Q} \times i\mathbb{Q} \text{ and } R \in \mathcal{Q}\}$<br />
is countable. To finish the proof, one has to check that every simple function on<br />
$\mathbb{R}^d$ can be approximated in the $L^2$ norm by finite linear combinations of elements in $A$.<br />
7.2 Hilbert Spaces<br />
Definition. A set $H$ is called a Hilbert space if it satisfies the following properties:<br />
(1) $H$ is a vector space over $\mathbb{C}$ (or $\mathbb{R}$).<br />
(2) $H$ is equipped with an inner product $\langle\cdot,\cdot\rangle$ satisfying<br />
(i) The map $f \mapsto \langle f, g\rangle$ is linear for every fixed $g$.<br />
(ii) $\langle f, g\rangle = \overline{\langle g, f\rangle}$.<br />
(iii) $\langle f, f\rangle \ge 0$ for all $f \in H$.<br />
(3) The inner product defines a norm $\|\cdot\|$ given by<br />
$\|f\| = \langle f, f\rangle^{1/2},$<br />
and $\|f\| = 0$ if and only if $f = 0$.<br />
(4) The Cauchy-Schwarz and triangle inequalities hold. Thus, for all $f, g \in H$<br />
$|\langle f, g\rangle| \le \|f\|\|g\|$ and $\|f + g\| \le \|f\| + \|g\|.$<br />
(5) $H$ is complete in the metric $d(f, g) = \|f - g\|$.<br />
(6) $H$ is separable.<br />
As we have seen, the space $L^2(\mathbb{R}^d)$ under almost everywhere equivalence is<br />
a Hilbert space. Now we give a list of useful Hilbert spaces.<br />
Example 68. If $E$ is a measurable set in $\mathbb{R}^d$ with $m(E) > 0$, the space<br />
$L^2(E) = \{f \text{ supported on } E \mid \int_E |f(x)|^2\,dx < \infty\}$<br />
is also a Hilbert space, with inner product given by<br />
$\langle f, g\rangle = \int_E f(x)\overline{g(x)}\,dx$<br />
and norm<br />
$\|f\|_2 = \left(\int_E |f(x)|^2\,dx\right)^{1/2}.$<br />
Formally, we have to consider $L^2(E)$ under the equivalence relation in which $f$ is equivalent<br />
to $g$ if and only if $f = g$ almost everywhere.<br />
Example 69. A simple example is the finite dimensional complex Euclidean<br />
space<br />
$\mathbb{C}^n = \{(a_1, \ldots, a_n) \mid a_k \in \mathbb{C}\};$<br />
the inner product is given by<br />
$\langle a, b\rangle = \sum_{k=1}^{n} a_k \bar{b}_k$<br />
and norm<br />
$\|a\| = \left(\sum_{k=1}^{n} |a_k|^2\right)^{1/2}.$<br />
Example 70. An infinite dimensional analog of $\mathbb{C}^n$ is<br />
$\ell^2(\mathbb{Z}) = \{(\ldots, a_{-2}, a_{-1}, a_0, a_1, a_2, \ldots) \mid a_k \in \mathbb{C},\ \sum_{k=-\infty}^{\infty} |a_k|^2 < \infty\};$<br />
the inner product is given by<br />
$\langle a, b\rangle = \sum_{k=-\infty}^{\infty} a_k \bar{b}_k$<br />
and norm<br />
$\|a\| = \left(\sum_{k=-\infty}^{\infty} |a_k|^2\right)^{1/2}.$<br />
A variant is<br />
$\ell^2(\mathbb{N}) = \{(a_1, a_2, \ldots) \mid a_k \in \mathbb{C},\ \sum_{k=1}^{\infty} |a_k|^2 < \infty\}.$<br />
It turns out that all infinite dimensional Hilbert spaces (separable, as in our definition) are $\ell^2$ in disguise.<br />
Orthogonality. Let $f, g \in H$ be elements of a Hilbert space. We say that $f$ and $g$ are<br />
orthogonal or perpendicular if $\langle f, g\rangle = 0$, and we write $f \perp g$. The first simple<br />
observation is Pythagoras theorem for Hilbert spaces:<br />
Theorem 71 (Pythagoras Theorem). If $f \perp g$ then<br />
$\|f + g\|^2 = \|f\|^2 + \|g\|^2.$<br />
Proof. Note that $\langle f, g\rangle = 0$ implies $\langle g, f\rangle = 0$ and<br />
$\|f + g\|^2 = \|f\|^2 + \langle f, g\rangle + \langle g, f\rangle + \|g\|^2 = \|f\|^2 + \|g\|^2.$<br />
Definition. A finite or countable subset $\{e_1, e_2, \ldots\}$ of $H$ is called orthonormal if<br />
$\langle e_i, e_j\rangle = 1$ if $i = j$, and $\langle e_i, e_j\rangle = 0$ if $i \ne j$.<br />
As a generalization of Pythagoras theorem we have:<br />
Proposition 72. If $\{e_k\}_{k=1}^{\infty}$ is an orthonormal set in $H$ and $f = \sum_{k=1}^{N} a_k e_k \in H$,<br />
where $N$ is finite, then<br />
$\|f\|^2 = \sum_{k=1}^{N} |a_k|^2.$<br />
Given an orthonormal set $\{e_k\}_{k=1}^{\infty}$, a natural problem is to determine if it spans all of $H$. That is,<br />
if the set of finite linear combinations of $\{e_k\}_{k=1}^{\infty}$ is dense in $H$. In this case we<br />
say that $\{e_k\}_{k=1}^{\infty}$ is an orthonormal basis.<br />
Example 73. Consider $L^2([-\pi, \pi])$ with normalized inner product<br />
$\langle f, g\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx.$<br />
The set $\{e^{inx}\}_{n=-\infty}^{\infty}$ is an orthonormal basis of $L^2([-\pi, \pi])$. Given $f \in L^2([-\pi, \pi])$<br />
and $a_n = \langle f, e^{inx}\rangle$, then $f \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}$.<br />
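The orthonormality claimed in Example 73 can be checked directly. The computation below is a sketch that assumes the inner product carries the complex conjugate on the second factor (as in the definition of a Hilbert space); it verifies orthonormality only, not that the family is a basis.<br />

```latex
% Orthonormality of e^{inx} for the normalized inner product on [-\pi, \pi]
\[
\langle e^{inx}, e^{imx}\rangle
  = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{inx}\,\overline{e^{imx}}\,dx
  = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(n-m)x}\,dx .
\]
% Case n = m: the integrand is 1, so the value is 1.
% Case n \neq m: integrate the exponential explicitly.
\[
\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(n-m)x}\,dx
  = \frac{e^{i(n-m)\pi} - e^{-i(n-m)\pi}}{2\pi i\,(n-m)}
  = \frac{\sin\big((n-m)\pi\big)}{\pi\,(n-m)}
  = 0 .
\]
```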
We say that $f_n \to f$ in norm if $\|f_n - f\| \to 0$.<br />
Theorem 74. The following properties of an orthonormal set $\{e_k\}$ are equivalent:<br />
(1) The set of finite linear combinations of elements in $\{e_k\}$ is dense in $H$.<br />
(2) If $f \in H$ and $\langle f, e_j\rangle = 0$ for all $j$, then $f = 0$.<br />
(3) If $f \in H$ and $S_N(f) = \sum_{k=1}^{N} a_k e_k$, where $a_k = \langle f, e_k\rangle$, then $S_N(f) \to f$ as<br />
$N \to \infty$ in norm.<br />
(4) If $a_k = \langle f, e_k\rangle$, then $\|f\|^2 = \sum_{k=1}^{\infty} |a_k|^2$.<br />
Proof. (1)⇒ (2). There exists a sequence $\{g_n\}$ of finite linear combinations<br />
of the $e_k$ such that $\|f - g_n\| \to 0$. Since $\langle f, e_j\rangle = 0$ for all $j$, we have<br />
$\langle f, g_n\rangle = 0$ for all $n$. So, by the Cauchy-Schwarz inequality,<br />
$\|f\|^2 = \langle f, f\rangle = \langle f, f - g_n\rangle \le \|f\|\|f - g_n\|$<br />
for all $n$. By taking the limit we get $\|f\| = 0$, which implies $f = 0$.<br />
(2)⇒ (3). For $f \in H$, let $S_N(f) = \sum_{k=1}^{N} a_k e_k$ where $a_k = \langle f, e_k\rangle$. Then<br />
$(f - S_N(f)) \perp S_N(f).$<br />
By Pythagoras theorem we have<br />
$\|f\|^2 = \|f - S_N(f)\|^2 + \|S_N(f)\|^2 = \|f - S_N(f)\|^2 + \sum_{k=1}^{N} |a_k|^2,$<br />
thus<br />
$\|f\|^2 \ge \sum_{k=1}^{N} |a_k|^2;$<br />
letting $N \to \infty$, we obtain Bessel’s inequality<br />
$\|f\|^2 \ge \sum_{k=1}^{\infty} |a_k|^2.$<br />
So,<br />
$\sum_{k=1}^{\infty} |a_k|^2$<br />
converges, which implies that $\{S_N(f)\}$ is a Cauchy sequence, since $\|S_N(f) - S_M(f)\|^2 = \sum_{k=M+1}^{N} |a_k|^2$ for $N > M$. Because $H$ is<br />
complete, $S_N(f) \to g$ for some $g \in H$. Fix $j$; for $N \ge j$<br />
$\langle f - S_N(f), e_j\rangle = a_j - a_j = 0.$<br />
Since $S_N(f) \to g$, then $\langle f - g, e_j\rangle = 0$ for all $j$, which by the assumption in (2)<br />
implies $f = g$ and<br />
$f = \sum_{k=1}^{\infty} a_k e_k.$<br />
(3)⇒ (4). Letting $N \to \infty$ in $\|f\|^2 = \|f - S_N(f)\|^2 + \sum_{k=1}^{N} |a_k|^2$, it is immediate that<br />
$\|f\|^2 = \sum_{k=1}^{\infty} |a_k|^2.$<br />
This equation is known as Parseval’s identity.<br />
(4)⇒ (1). By definition $S_N(f)$ is a finite linear combination of $\{e_k\}$ and, by the<br />
assumption in (4),<br />
$\|f - S_N(f)\|^2 = \|f\|^2 - \sum_{k=1}^{N} |a_k|^2 \to 0.$<br />
Let us remark that Bessel’s inequality holds for all orthonormal families; in<br />
contrast, Parseval’s identity holds for every $f \in H$ only when the family is a basis.<br />
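A classical illustration of Parseval’s identity, given here as a worked sketch: take $f(x) = x$ in $L^2([-\pi, \pi])$ with the normalized inner product and the basis $\{e^{inx}\}$ of Example 73 (indexed by $\mathbb{Z}$ instead of $\mathbb{N}$, which only changes the labelling). The choice of $f$ is ours and is not part of the notes.<br />

```latex
% Fourier coefficients of f(x) = x, a_n = <f, e^{inx}>, via integration by parts
\[
a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} x\,dx = 0,
\qquad
a_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} x\,e^{-inx}\,dx = \frac{(-1)^n\, i}{n}
\quad (n \neq 0).
\]
% Both sides of Parseval's identity
\[
\|f\|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} x^2\,dx = \frac{\pi^2}{3},
\qquad
\sum_{n \neq 0} |a_n|^2 = 2\sum_{n=1}^{\infty}\frac{1}{n^2}.
\]
% Equating them
\[
\frac{\pi^2}{3} = 2\sum_{n=1}^{\infty}\frac{1}{n^2}
\quad\Longrightarrow\quad
\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}.
\]
```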
Proposition 75. Any Hilbert space has an orthonormal basis.<br />
Riesz Representation Theorem.<br />
Definition. A bounded linear functional on $H$ is a linear function $\varphi : H \to \mathbb{C}$ (or $\mathbb{R}$, if $H$ is a real space)<br />
such that there exists $A > 0$ such that for all $f \in H$<br />
$|\varphi(f)| \le A\|f\|.$<br />
Example 76. If $g \in L^2(\mathbb{R}^d)$ is fixed then<br />
$\varphi_g(f) = \langle f, g\rangle = \int_{\mathbb{R}^d} f\bar{g}$<br />
is a bounded linear functional. Note that $\varphi_g$ is well defined, that is if $g' = g$ a.e.,<br />
then $\varphi_g = \varphi_{g'}$. The functional $\varphi_g$ is bounded by the Cauchy-Schwarz inequality:<br />
$|\varphi_g(f)| \le \int_{\mathbb{R}^d} |f\bar{g}| \le \|f\|_2\|g\|_2,$<br />
so we can take $A = \|g\|_2$.<br />
The following theorem states that all bounded linear functionals on a Hilbert space $H$ are<br />
of this form.<br />
Theorem 77 (Riesz Representation Theorem). Let $\varphi$ be a continuous (equivalently, bounded) linear<br />
functional on a Hilbert space $H$. Then there exists a unique $g \in H$ such that<br />
$\varphi(f) = \langle f, g\rangle$<br />
for all $f \in H$.<br />
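Proof. Consider the subspace of $H$ defined by<br />
$S = \{f \in H \mid \varphi(f) = 0\}.$<br />
Since $\varphi$ is continuous, $S$ is a closed subspace of $H$. The space $S$ is called the<br />
null-space of $\varphi$. If $S = H$, then $\varphi = 0$ and we can take $g = 0$. Otherwise, $S^{\perp} \ne \{0\}$,<br />
so there exists $h \in S^{\perp}$ with $\|h\| = 1$. Let $g = \overline{\varphi(h)}\,h$ (which is simply $\varphi(h)h$ when $\varphi$ is real valued) and for every $f \in H$ let<br />
$u = \varphi(f)h - \varphi(h)f$; then $\varphi(u) = 0$, which implies that $u \in S$ and $\langle u, h\rangle = 0$.<br />
Then we have<br />
$0 = \langle \varphi(f)h - \varphi(h)f, h\rangle = \varphi(f)\langle h, h\rangle - \langle f, \overline{\varphi(h)}\,h\rangle,$<br />
that is,<br />
$\varphi(f) = \langle f, g\rangle,$<br />
as we wanted to show. For uniqueness, if $\langle f, g\rangle = \langle f, g'\rangle$ for all $f \in H$, taking $f = g - g'$ gives $\|g - g'\|^2 = 0$, so $g = g'$.<br />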
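To see the theorem in the simplest setting, here is a worked sketch in $\mathbb{C}^n$ with the inner product of Example 69. The coefficients $c_k$ are hypothetical and chosen only for illustration; the point is that the representing vector carries complex conjugates because the inner product is conjugate linear in its second slot.<br />

```latex
% A linear functional on C^n with arbitrary (illustrative) coefficients c_1, ..., c_n
\[
\varphi(f) = \sum_{k=1}^{n} c_k f_k ,
\qquad f = (f_1, \ldots, f_n) \in \mathbb{C}^n .
\]
% Candidate representing vector: conjugate the coefficients
\[
g = (\bar{c}_1, \ldots, \bar{c}_n)
\quad\Longrightarrow\quad
\langle f, g\rangle = \sum_{k=1}^{n} f_k\,\overline{g_k}
  = \sum_{k=1}^{n} f_k\, c_k = \varphi(f).
\]
% Boundedness follows from Cauchy-Schwarz with A = ||g||
\[
|\varphi(f)| = |\langle f, g\rangle| \le \|f\|\,\|g\|
  = \|f\| \Big(\sum_{k=1}^{n} |c_k|^2\Big)^{1/2}.
\]
```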
7.3 The Lebesgue spaces L p .<br />
For $p \ge 1$ we say that $f \in L^p(\mathbb{R}^d)$ (resp. $L^p(E)$, for a measurable set $E$ in $\mathbb{R}^d$)<br />
if $|f|^p$ is integrable over $\mathbb{R}^d$ (resp. $E$). Again, we will consider two functions $f$<br />
and $g$ to be the same if $f = g$ a.e.<br />
Definition. For each $p$ with $1 \le p < \infty$ we define<br />
$L^p(E) = \{f \text{ measurable on } E \mid \int_E |f|^p < \infty\};$<br />
the $L^p$-norm of a function $f \in L^p(E)$ is given by<br />
$\|f\|_p = \left(\int_E |f|^p\right)^{1/p}.$<br />
(Case $p = \infty$) Let $f : E \to [-\infty, \infty]$ be a measurable function, then<br />
$\operatorname{ess\,sup} f = \inf\{c \mid f \le c \text{ a.e.}\}.$<br />
Definition. A measurable function $f : E \to [-\infty, \infty]$ is called essentially<br />
bounded if<br />
$\operatorname{ess\,sup} |f| < \infty.$<br />
The space of essentially bounded functions on $E$ is denoted by $L^{\infty}(E)$.<br />
It is easy to check that $L^{\infty}(E)$ is a vector space with norm<br />
$\|f\|_{\infty} = \operatorname{ess\,sup}_E |f|.$<br />
Here again, two functions $f$ and $g$ are considered the same if they are equal<br />
almost everywhere.<br />
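The difference between the supremum and the essential supremum shows up already for functions supported on a null set. The following is a small worked example; the function and the enumeration $\{q_n\}$ of the rationals are our own illustrative choices.<br />

```latex
% An unbounded function whose essential supremum is 0 (E = R, Lebesgue measure m)
\[
f(x) =
\begin{cases}
  n & \text{if } x = q_n \ \text{for the enumeration } \mathbb{Q} = \{q_1, q_2, \ldots\},\\
  0 & \text{if } x \notin \mathbb{Q}.
\end{cases}
\]
% The set where f is nonzero is contained in Q, which has measure zero
\[
\sup_{x \in \mathbb{R}} |f(x)| = \infty,
\qquad
m(\{x : |f(x)| > 0\}) \le m(\mathbb{Q}) = 0
\quad\Longrightarrow\quad
\|f\|_{\infty} = \operatorname{ess\,sup} |f| = 0 .
\]
% Hence f is in L^\infty(R) and represents the same element as the zero function.
```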
Let us check that for all $p \ge 1$, the Lebesgue space $L^p$ is a vector space.<br />
Recall that if $f$ and $g$ are measurable, then $cf$ and $f + g$ are measurable.<br />
Moreover, since $|cf(x)|^p = |c|^p|f(x)|^p$, we have<br />
$\|cf\|_p = \left(\int_E |cf(x)|^p\right)^{1/p} = |c|\left(\int_E |f(x)|^p\right)^{1/p} = |c|\|f\|_p.$<br />
Thus, for $c \ne 0$, $cf \in L^p(E)$ if, and only if, $f \in L^p(E)$. Now note that<br />
$|f(x) + g(x)|^p \le 2^p \max\{|f(x)|^p, |g(x)|^p\} \le 2^p\left(|f(x)|^p + |g(x)|^p\right),$<br />
which implies $\|f + g\|_p < \infty$ whenever $\|f\|_p < \infty$ and $\|g\|_p < \infty$.<br />
Clearly, $\|f\|_p = 0$ if, and only if, $|f(x)| = 0$ a.e., that is $f(x) = 0$ a.e.<br />
For $1 < p < \infty$ the triangle inequality in $L^p(E)$ is by no means obvious. However,<br />
the case $p = \infty$ is rather easy:<br />
$|f + g| \le |f| + |g|$<br />
implies<br />
$\|f + g\|_{\infty} \le \|f\|_{\infty} + \|g\|_{\infty}.$<br />
Besides, $\|f\|_{\infty} = 0$ if and only if $f = 0$ a.e.<br />
Lemma 78. For any non negative real numbers $x, y$ and all $\alpha, \beta \in (0, 1)$ such<br />
that $\alpha + \beta = 1$, we have the following inequality<br />
$x^{\alpha} y^{\beta} \le \alpha x + \beta y.$<br />
Proof. If $x = 0$, the inequality is obvious. Let $x > 0$ and let $f(t) = (1 - \beta) +$<br />
$\beta t - t^{\beta}$, for $t \ge 0$ and $\beta$ as given. Then<br />
$f'(t) = \beta - \beta t^{\beta - 1} = \beta(1 - t^{\beta - 1}),$<br />
so, since $\beta - 1 < 0$, we have<br />
$f'(t) < 0$ for $t \in (0, 1)$<br />
and<br />
$f'(t) > 0$ for $t \in (1, \infty),$<br />
which implies that $f(1) = 0$ is the minimum of $f$ on $[0, \infty)$, hence $f(t) \ge 0$ for<br />
all $t \ge 0$. Let $t = \frac{y}{x}$, then<br />
$(1 - \beta) + \beta\frac{y}{x} - \left(\frac{y}{x}\right)^{\beta} \ge 0,$<br />
$(1 - \beta) + \beta\frac{y}{x} \ge \left(\frac{y}{x}\right)^{\beta},$<br />
$(1 - \beta)x + \beta y \ge y^{\beta} x^{1 - \beta},$<br />
$\alpha x + \beta y \ge x^{\alpha} y^{\beta}.$<br />
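Two special cases of Lemma 78 are worth writing out, as a sketch of how the lemma is typically used. The substitution $x = a^p$, $y = b^q$ with $a, b \ge 0$ is our notation for illustration.<br />

```latex
% Case alpha = beta = 1/2: the arithmetic-geometric mean inequality
\[
\sqrt{xy} \le \frac{x + y}{2},
\qquad\text{equivalently}\qquad
\big(\sqrt{x} - \sqrt{y}\big)^2 \ge 0 .
\]
% Case alpha = 1/p, beta = 1/q with 1/p + 1/q = 1, applied to x = a^p, y = b^q
\[
ab = (a^p)^{1/p}\,(b^q)^{1/q} \le \frac{a^p}{p} + \frac{b^q}{q},
\qquad a, b \ge 0 .
\]
```

The second form is often called Young’s inequality.<br />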
Theorem 79 (Hölder’s Inequality). If $\frac{1}{p} + \frac{1}{q} = 1$ and $p > 1$, then for $f \in L^p(E)$<br />
and $g \in L^q(E)$ we have $fg \in L^1(E)$ and $\|fg\|_1 \le \|f\|_p\|g\|_q$, that is<br />
$\int_E |fg| \le \left(\int_E |f|^p\right)^{1/p}\left(\int_E |g|^q\right)^{1/q}.$<br />
Proof. If $\|f\|_p = 0$ or $\|g\|_q = 0$ then $|fg| = 0$ a.e., and the inequality follows,<br />
hence we can assume that $\|f\|_p \ne 0$ and $\|g\|_q \ne 0$. Let us consider first the case<br />
where $\|f\|_p = \|g\|_q = 1$, so we want to show<br />
$\int |fg| \le 1.$<br />
Take $\alpha = \frac{1}{p}$, $\beta = \frac{1}{q}$, $x = |f|^p$, $y = |g|^q$ in the previous Lemma, so we have<br />
$|fg| = x^{1/p} y^{1/q} \le \frac{1}{p}|f|^p + \frac{1}{q}|g|^q.$<br />
Integrating both sides<br />
$\int |fg| \le \frac{1}{p}\int |f|^p + \frac{1}{q}\int |g|^q = \frac{1}{p} + \frac{1}{q} = 1,$<br />
as we wanted to show. For the general case let<br />
$\tilde{f} = \frac{f}{\|f\|_p}$ and $\tilde{g} = \frac{g}{\|g\|_q};$<br />
then $\|\tilde{f}\|_p = \|\tilde{g}\|_q = 1$ and by the previous case<br />
$\|\tilde{f}\tilde{g}\|_1 \le \|\tilde{f}\|_p\|\tilde{g}\|_q = 1,$<br />
hence<br />
$\frac{\|fg\|_1}{\|f\|_p\|g\|_q} \le 1,$<br />
that is,<br />
$\|fg\|_1 \le \|f\|_p\|g\|_q.$<br />
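A concrete check of Hölder’s inequality, offered as a worked sketch with functions of our own choosing: take $E = [0, 1]$, $f(x) = x$ and $g(x) = x^2$. For $p = q = 2$ the inequality is strict, while for $p = 3$, $q = 3/2$ it becomes an equality because $|f|^p$ and $|g|^q$ coincide.<br />

```latex
% Left-hand side, independent of the exponents
\[
\|fg\|_1 = \int_0^1 x^3\,dx = \frac14 .
\]
% Exponents p = q = 2 (Cauchy-Schwarz): strict inequality
\[
\|f\|_2\,\|g\|_2
  = \Big(\int_0^1 x^2\,dx\Big)^{1/2}\Big(\int_0^1 x^4\,dx\Big)^{1/2}
  = \frac{1}{\sqrt{3}}\cdot\frac{1}{\sqrt{5}}
  = \frac{1}{\sqrt{15}} \approx 0.258 > \frac14 .
\]
% Exponents p = 3, q = 3/2: |f|^3 = |g|^{3/2} = x^3, and equality holds
\[
\|f\|_3\,\|g\|_{3/2}
  = \Big(\int_0^1 x^3\,dx\Big)^{1/3}\Big(\int_0^1 x^3\,dx\Big)^{2/3}
  = \Big(\frac14\Big)^{1/3 + 2/3}
  = \frac14 .
\]
```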
Now we prove the triangle inequality for $L^p$, which is called Minkowski’s<br />
inequality:<br />
Theorem 80 (Minkowski’s Inequality). Let $f, g \in L^p$ where $p \ge 1$, then<br />
$\|f + g\|_p \le \|f\|_p + \|g\|_p.$<br />
Proof. For $p = 1$ the inequality follows by integrating $|f + g| \le |f| + |g|$, so assume $p > 1$. First, note that<br />
$|f + g|^p = |f + g||f + g|^{p-1} \le |f||f + g|^{p-1} + |g||f + g|^{p-1}.$<br />
Let $q$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. By definition, $f + g \in L^p$ implies that $|f + g|^p \in L^1$;<br />
since $p = q(p - 1)$, we get $|f + g|^{p-1} \in L^q$, and by Hölder’s inequality<br />
$\int |f||f + g|^{p-1} \le \|f\|_p\left(\int |f + g|^{(p-1)q}\right)^{1/q} = \|f\|_p\|f + g\|_p^{p/q}.$ (1)<br />
Similarly,<br />
$\int |g||f + g|^{p-1} \le \|g\|_p\|f + g\|_p^{p/q}.$ (2)<br />
Then combining the previous inequalities, we obtain<br />
$\|f + g\|_p^p = \int |f + g|^p \le \int |f||f + g|^{p-1} + \int |g||f + g|^{p-1}$<br />
$\le (\|f\|_p + \|g\|_p)\|f + g\|_p^{p/q}.$<br />
If $\|f + g\|_p = 0$, Minkowski’s inequality is valid. Assume $\|f + g\|_p \ne 0$, so we<br />
can divide each side by $\|f + g\|_p^{p/q}$ to get<br />
$\|f + g\|_p^{p - p/q} = \|f + g\|_p \le \|f\|_p + \|g\|_p.$<br />
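The proof uses two identities between the conjugate exponents without comment. They are spelled out below as a short derivation, assuming only $\frac{1}{p} + \frac{1}{q} = 1$ with $1 < p < \infty$.<br />

```latex
% From 1/p + 1/q = 1 we get 1/q = (p-1)/p, i.e. q = p/(p-1)
\[
\frac1q = 1 - \frac1p = \frac{p-1}{p}
\quad\Longrightarrow\quad
(p-1)\,q = p .
\]
% Hence |f+g|^{p-1} lies in L^q exactly when |f+g|^p is integrable, and
\[
\Big(\int |f+g|^{(p-1)q}\Big)^{1/q}
  = \Big(\int |f+g|^{p}\Big)^{1/q}
  = \|f+g\|_p^{\,p/q} .
\]
% The exponent left after dividing by ||f+g||_p^{p/q}
\[
p - \frac{p}{q} = p\Big(1 - \frac1q\Big) = p\cdot\frac1p = 1 .
\]
```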
For all $p$, the $L^p$ norm $\|\cdot\|_p$ defines a metric, given by $d(f, g) = \|f - g\|_p$. As<br />
in the cases $p = 1$ and $p = 2$ we have:<br />
Theorem 81 (Riesz-Fischer). For all $p$ with $1 \le p \le \infty$, the Lebesgue space<br />
$L^p(E)$ is complete in its metric.<br />
Proof. The case $p < \infty$ uses the same argument as for $p = 1$ or $p = 2$, so we do<br />
not repeat it. The case $p = \infty$ is much simpler. Let $\{f_n\}$ be a Cauchy sequence<br />
in $L^{\infty}(E)$; we want to show that $\{f_n\}$ converges. For each $n$, there exists $A_n$<br />
with $m(A_n) = 0$ and such that<br />
$|f_n(x)| \le \|f_n\|_{\infty}$ for all $x \in E \setminus A_n$.<br />
Let $A = \bigcup A_n$, so $m(A) = 0$. Enlarging $A$ by the countably many null sets on which the corresponding bound for $f_n - f_m$ fails, we can assume that<br />
$|f_n(x)| \le \|f_n\|_{\infty}$ for all $x \in E \setminus A$ and all $n$<br />
and<br />
$|f_n(x) - f_m(x)| \le \|f_n - f_m\|_{\infty}$ for all $x \in E \setminus A$ and all $n, m$.<br />
Since $\{f_n\}$ is a Cauchy sequence in $L^{\infty}$, the sequence $\{f_n(x)\}$ is uniformly<br />
convergent on $E \setminus A$. Define<br />
$f(x) = \lim f_n(x)$ if $x \in E \setminus A$, and $f(x) = 0$ if $x \in A$.<br />
Then $f$ is measurable and $\|f_n - f\|_{\infty} \to 0$.<br />
8 Abstract measures<br />
8.1 Outer measures and Caratheodory measurable sets<br />
Let $X$ be any set. An exterior measure $\mu^*$ on $X$ is a function $\mu^* : \mathcal{P}(X) \to [0, \infty]$<br />
satisfying<br />
(1) $\mu^*(\emptyset) = 0$,<br />
(2) If $E_1 \subset E_2$, then $\mu^*(E_1) \le \mu^*(E_2)$,<br />
(3) If $\{E_n\}$ is a countable family of sets, then<br />
$\mu^*\left(\bigcup_{n=1}^{\infty} E_n\right) \le \sum_{n=1}^{\infty} \mu^*(E_n).$<br />
Example 82. Lebesgue outer measure $m^*$ on $\mathbb{R}^d$.<br />
8.2 Measurability<br />
A set $E \subset X$ is called Caratheodory measurable (or just measurable) if for all<br />
$A \subset X$ one has<br />
$\mu^*(A) = \mu^*(E \cap A) + \mu^*(A \cap E^c).$ (3)<br />
Because $\mu^*$ is subadditive, in order to prove equation (3) for a set $E$, we just<br />
need to check<br />
$\mu^*(A) \ge \mu^*(E \cap A) + \mu^*(A \cap E^c).$<br />
Proposition 83. Given an exterior measure $\mu^*$ on a set $X$, the collection $\mathcal{M}$<br />
of Caratheodory measurable sets forms a σ-algebra. Moreover, $\mu = \mu^*$ restricted<br />
to the sets in $\mathcal{M}$ is a measure on $\mathcal{M}$.<br />
Proof. For all $A \subset X$,<br />
$\mu^*(\emptyset \cap A) + \mu^*(\emptyset^c \cap A)$<br />
$= \mu^*(\emptyset) + \mu^*(X \cap A) = \mu^*(A),$<br />
hence $\emptyset, X \in \mathcal{M}$. By the symmetry of equation (3), it is clear that $E \in \mathcal{M}$ if,<br />
and only if, $E^c \in \mathcal{M}$. Now, let us check that $\mathcal{M}$ is closed under finite unions. If<br />
$E_1, E_2 \in \mathcal{M}$ and $A \subset X$, then<br />
$\mu^*(A) = \mu^*(E_2 \cap A) + \mu^*(E_2^c \cap A)$<br />
$= \mu^*(E_1 \cap E_2 \cap A) + \mu^*(E_1^c \cap E_2 \cap A)$<br />
$+ \mu^*(E_1 \cap E_2^c \cap A) + \mu^*(E_1^c \cap E_2^c \cap A)$<br />
$\ge \mu^*((E_1 \cup E_2) \cap A) + \mu^*((E_1 \cup E_2)^c \cap A),$<br />
where the second equality is equation (3) for $E_1$ applied to the sets $E_2 \cap A$ and $E_2^c \cap A$, and the last<br />
inequality uses<br />
$(E_1 \cup E_2) \cap A = (E_1 \cap E_2 \cap A) \cup (E_1^c \cap E_2 \cap A) \cup (E_1 \cap E_2^c \cap A),$<br />
subadditivity of $\mu^*$, and $E_1^c \cap E_2^c = (E_1 \cup E_2)^c$. Altogether this implies that $E_1 \cup E_2 \in \mathcal{M}$.<br />
If in addition we assume $E_1 \cap E_2 = \emptyset$, we have<br />
$\mu(E_1 \cup E_2) = \mu^*(E_1 \cup E_2) = \mu^*(E_1 \cap (E_1 \cup E_2)) + \mu^*(E_1^c \cap (E_1 \cup E_2))$<br />
$= \mu^*(E_1) + \mu^*(E_2) = \mu(E_1) + \mu(E_2).$<br />
Hence, $\mu$ is additive for finite unions of disjoint sets in $\mathcal{M}$. Now we have to<br />
check that $\mathcal{M}$ is closed under countable unions, and that $\mu$ is additive for countable unions of<br />
disjoint sets.<br />
Since $\mathcal{M}$ is closed under finite unions and complements, every set $E$ that is a<br />
countable union of sets in $\mathcal{M}$ is also a countable union of disjoint sets in $\mathcal{M}$. Hence<br />
it is enough to check that $\mathcal{M}$ is closed under countable unions of disjoint sets in<br />
$\mathcal{M}$. Let $\{E_k\}$ be a countable collection of disjoint sets in $\mathcal{M}$ and define<br />
$G_n = \bigcup_{k=1}^{n} E_k$, and $G = \bigcup_{k=1}^{\infty} E_k.$<br />
Since $E_n, G_n \in \mathcal{M}$, then, by induction on $n$,<br />
$\mu^*(G_n \cap A) = \mu^*(E_n \cap (G_n \cap A)) + \mu^*(E_n^c \cap (G_n \cap A))$<br />
$= \mu^*(E_n \cap A) + \mu^*(G_{n-1} \cap A) = \sum_{k=1}^{n} \mu^*(E_k \cap A),$<br />
but $G^c \subset G_n^c$, thus<br />
$\mu^*(A) = \mu^*(G_n \cap A) + \mu^*(G_n^c \cap A) \ge \sum_{k=1}^{n} \mu^*(E_k \cap A) + \mu^*(G^c \cap A);$<br />
letting $n \to \infty$ we obtain<br />
$\mu^*(A) \ge \sum_{k=1}^{\infty} \mu^*(E_k \cap A) + \mu^*(G^c \cap A)$<br />
$\ge \mu^*(G \cap A) + \mu^*(G^c \cap A) \ge \mu^*(A),$<br />
that is, $G \in \mathcal{M}$. To check that $\mu$ is additive take $A = G$, then<br />
$\mu^*(G) = \sum_{k=1}^{\infty} \mu^*(E_k \cap G) + \mu^*(G^c \cap G) = \sum_{k=1}^{\infty} \mu^*(E_k).$<br />
Let $E \subset X$ be any set such that $\mu^*(E) = 0$. Then for all $A \subset X$,<br />
$A \cap E \subset E$ and $\mu^*(A \cap E) = 0$, so<br />
$\mu^*(A \cap E) + \mu^*(A \cap E^c) \le \mu^*(A),$<br />
which implies that $E \in \mathcal{M}$. Since the exterior measure $\mu^*$ is monotone, whenever<br />
$F \subset E$ we get $\mu^*(F) = 0$ and then $F \in \mathcal{M}$. Thus, the measure $\mu$ restricted to<br />
Caratheodory measurable sets is complete. Not all measures $\mu$ on σ-algebras $\mathcal{A}$<br />
are complete, since it is possible that subsets of sets of $\mu$-measure 0 in $\mathcal{A}$ do<br />
not belong to $\mathcal{A}$:<br />
Example 84. The Borel σ-algebra $\mathcal{B}$ in $\mathbb{R}$ is the minimal σ-algebra containing<br />
all open sets in $\mathbb{R}$. The Lebesgue measure $m$ restricted to the Borel σ-algebra is<br />
not complete. There are sets of Lebesgue measure 0 that are not Borel.<br />
Now we are going to prove that the measurability notions of Caratheodory<br />
and Lebesgue are equivalent. For convenience, let us recall that a set $E \subset \mathbb{R}^d$<br />
is Lebesgue measurable if for every $\epsilon > 0$ there exists an open set $U$, with $E \subset U$<br />
and such that<br />
$m^*(U \setminus E) < \epsilon.$<br />
This is equivalent to the existence of a $G_\delta$ set $G$ with $E \subset G$ such that $m^*(G \setminus E) = 0$.<br />
Theorem 85. Let $E \subset \mathbb{R}^d$, then $E$ is Lebesgue measurable if and only if $E$ is<br />
Caratheodory measurable with respect to $m^*$.<br />
Proof. Let $E$ be Lebesgue measurable and let $A$ be any subset of $\mathbb{R}^d$. Then<br />
there exists a $G_\delta$ set $G$ such that $A \subset G$ and $m^*(A) = m^*(G)$, then<br />
$m^*(E \cap A) + m^*(E^c \cap A) \le m^*(E \cap G) + m^*(E^c \cap G)$<br />
$= m(E \cap G) + m(E^c \cap G) = m(G) = m^*(A).$<br />
Hence $E$ is Caratheodory measurable.<br />
Now, assume $E$ is a Caratheodory measurable set. Since $m^*$ is σ-finite on $\mathbb{R}^d$,<br />
we may assume $m^*(E) < \infty$. Take a $G_\delta$ set $G$ such that $E \subset G$, with $m^*(E) = m^*(G)$; then<br />
$m^*(G) = m^*(E \cap G) + m^*(E^c \cap G)$<br />
$= m^*(E) + m^*(E^c \cap G),$<br />
so $m^*(G \setminus E) = 0$, which implies that $E$ is Lebesgue measurable.<br />
Let $(X, \mathcal{A}, \mu)$ be a measure space, where $X$ is the underlying set and $\mathcal{A}$ a σ-algebra.<br />
Assume that the measure $\mu$ is σ-finite, that is, there exists a countable collection<br />
$\{E_k\} \subset \mathcal{A}$ with $\mu(E_k) < \infty$ for all $k$, and $X = \bigcup E_k$.<br />
A function $f : X \to [-\infty, \infty]$ is called $\mu$-measurable (or just measurable) if<br />
for all $a \in \mathbb{R}$<br />
$f^{-1}([-\infty, a)) = \{x \in X \mid f(x) < a\} \in \mathcal{A}.$<br />
Then we can define the integral<br />
$\int_X f(x)\,d\mu(x) \left(= \int_X f\,d\mu = \int f\,d\mu = \int f\right)$<br />
in the same way we defined the Lebesgue integral. For instance, if $\varphi = \sum a_k \chi_{E_k}$<br />
is a simple function, define<br />
$\int_X \varphi\,d\mu = \sum a_k \mu(E_k),$<br />
then generalize to bounded functions, then to non negative functions, and finally we say that a<br />
$\mu$-measurable function $f$ is integrable if<br />
$\int_X |f|\,d\mu < \infty.$<br />
Moreover, this definition of the integral satisfies the following properties:<br />
Proposition 86. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions.<br />
(1) Fatou’s Lemma. If for every $n$, $f_n$ is non negative, then<br />
$\int \liminf f_n\,d\mu \le \liminf \int f_n\,d\mu.$<br />
(2) Monotone Convergence Theorem. If all $f_n$ are non negative, and $f_n \nearrow f$,<br />
then<br />
$\lim_{n \to \infty} \int f_n\,d\mu = \int f\,d\mu.$<br />
(3) Dominated Convergence Theorem. If $f_n \to f$ a.e., and $|f_n| \le g$ where $g$<br />
is a $\mu$-integrable function, then<br />
$\int |f_n - f|\,d\mu \to 0,$<br />
which implies<br />
$\int f_n\,d\mu \to \int f\,d\mu.$<br />
For $(X, \mathcal{A}, \mu)$, we can define $L^p(X, \mathcal{A}, \mu) = \{f \ \mu\text{-measurable} \mid \int_X |f|^p\,d\mu < \infty\}$.<br />
The $L^2$ space is a Hilbert space, and the $L^p$ are vector spaces with a norm<br />
given by<br />
$\|f\|_p = \left(\int_X |f|^p\,d\mu\right)^{1/p}.$<br />
All theorems we proved for $L^p(\mathbb{R}^d)$ and $L^2(\mathbb{R}^d)$ carry over to the spaces $L^p(X, \mathcal{A}, \mu)$.<br />
8.3 Radon-Nikodym theorem<br />
Definition. Let $\mu, \nu$ be two σ-finite measures on $(X, \mathcal{A})$. We say that $\mu$ is absolutely<br />
continuous with respect to $\nu$, and write $\mu \ll \nu$, if $\nu(E) = 0$ implies<br />
$\mu(E) = 0$.<br />
Example 87. Let $f$ be a non negative $\nu$-measurable function, define<br />
$\mu(E) = \int_E f\,d\nu$<br />
for all $E \in \mathcal{A}$. Then $\mu \ll \nu$.<br />
Example 88. Let $m$ be the Lebesgue measure on $\mathbb{R}$, and $\mathbb{Q} = \{q_n\}$ some enumeration<br />
of the rational numbers. Define<br />
$\mu(E) = \sum_{n=1}^{\infty} \frac{1}{2^n}\chi_E(q_n),$<br />
then $\mu$ is not absolutely continuous with respect to $m$.<br />
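The failure of absolute continuity in Example 88 can be seen on a single set. The following short computation is a sketch that takes $E = \mathbb{Q}$ as the witness.<br />

```latex
% Every q_n lies in Q, so chi_Q(q_n) = 1 for all n
\[
\mu(\mathbb{Q}) = \sum_{n=1}^{\infty} \frac{1}{2^n}\,\chi_{\mathbb{Q}}(q_n)
  = \sum_{n=1}^{\infty} \frac{1}{2^n} = 1,
\qquad
m(\mathbb{Q}) = 0 .
\]
% So m(E) = 0 does not force mu(E) = 0. In the other direction,
\[
\mu(\mathbb{R} \setminus \mathbb{Q}) = 0
\quad\text{while}\quad
m(\mathbb{R} \setminus \mathbb{Q}) = \infty ,
\]
% so m is not absolutely continuous with respect to mu either.
```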
Theorem 89 (Radon-Nikodym). Let $\mu, \nu$ be two σ-finite measures on $(X, \mathcal{A})$.<br />
If $\mu \ll \nu$, then there exists a non negative measurable function $g : X \to \mathbb{R}$ such<br />
that for every $\mu$-integrable function $f$,<br />
$\int_X f\,d\mu = \int_X fg\,d\nu.$<br />
Moreover, $g$ is unique up to null sets.<br />
Definition. In this case we write $g\,d\nu = d\mu$ or $g = \frac{d\mu}{d\nu}$, the Radon-Nikodym<br />
derivative of $\mu$ with respect to $\nu$.<br />
Proof. Since $\mu$ and $\nu$ are σ-finite, it is enough to consider the case $\mu(X), \nu(X) < \infty$;<br />
once this case is settled, the general case follows by an argument over a countable<br />
partition of the space.<br />
Define $\lambda = \mu + \nu$ and let $\varphi : L^2(X, \mathcal{A}, \lambda) \to \mathbb{R}$ be the functional<br />
$\varphi(f) = \int_X f\,d\mu.$<br />
This $\varphi$ is well defined: if $f_1 = f_2$ $\lambda$-a.e., then<br />
$0 = \lambda(\{f_1 \ne f_2\}) = \mu(\{f_1 \ne f_2\}) + \nu(\{f_1 \ne f_2\})$<br />
$\ge \mu(\{f_1 \ne f_2\}),$<br />
so $f_1 = f_2$ $\mu$-a.e., which implies<br />
$\varphi(f_1) = \int f_1\,d\mu = \int f_2\,d\mu = \varphi(f_2).$<br />
Clearly $\varphi$ is linear, and by the Cauchy-Schwarz inequality in $L^2(X, \mathcal{A}, \lambda)$, $\varphi$ is also<br />
bounded:<br />
$|\varphi(f)| \le \int |f|\,d\mu \le \int |f|\,d\lambda$<br />
$\le \sqrt{\int 1^2\,d\lambda}\,\sqrt{\int |f|^2\,d\lambda} = \sqrt{\lambda(X)} \cdot \|f\|_{L^2(X, \lambda)}.$<br />
By the Riesz Representation Theorem there exists $h \in L^2(X, \mathcal{A}, \lambda)$ such that for<br />
all $f$<br />
$\varphi(f) = \int fh\,d\lambda,$<br />
then<br />
$\int f\,d\mu = \int fh\,d\lambda = \int fh\,d\mu + \int fh\,d\nu,$<br />
and<br />
$\int f(1 - h)\,d\mu = \int fh\,d\nu,$<br />
so we have $(1 - h)\,d\mu = h\,d\nu$, which suggests<br />
$d\mu = \frac{h}{1 - h}\,d\nu.$<br />
But we have to prove it rigorously:<br />
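First, we claim that $0 \le h \le 1$ $\lambda$-a.e. To prove the claim, note that for<br />
all $E \in \mathcal{A}$, $\chi_E \in L^2(X, \mathcal{A}, \lambda)$. We also have<br />
$\mu(E) = \varphi(\chi_E) = \int_E h\,d\lambda$<br />
and<br />
$\nu(E) = \lambda(E) - \mu(E) = \int_E (1 - h)\,d\lambda.$<br />
Now<br />
$0 \le \mu(\{h < 0\}) = \int_{\{h < 0\}} h\,d\lambda \le 0,$<br />
so $\lambda(\{h < 0\}) = 0$; then<br />
$0 \ge \int_{\{h > 1\}} (1 - h)\,d\lambda = \nu(\{h > 1\}) \ge 0,$<br />
then $\nu(\{h > 1\}) = 0$ and since $\mu \ll \nu$, we have $\mu(\{h > 1\}) = 0$, thus<br />
$\lambda(\{h > 1\}) = 0$, which proves our claim. The same argument applied to the set $\{h = 1\}$, where $\nu(\{h = 1\}) = \int_{\{h = 1\}} (1 - h)\,d\lambda = 0$, shows that $\lambda(\{h = 1\}) = 0$, so in fact $0 \le h < 1$ $\lambda$-a.e.<br />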
Note that<br />
$\int \chi_E \cdot (1 - h)\,d\mu = \int \chi_E h\,d\nu;$<br />
by linearity we also have<br />
$\int \phi \cdot (1 - h)\,d\mu = \int \phi h\,d\nu$<br />
for all simple functions $\phi$. If $f$ is non negative and measurable, choose a sequence of non negative<br />
simple functions $\phi_k \nearrow f$; then we also have $0 \le \phi_k(1 - h) \nearrow f(1 - h)$ and<br />
$0 \le \phi_k h \nearrow fh$. By the Monotone Convergence Theorem (applied twice)<br />
$\int f(1 - h)\,d\mu = \lim_{k \to \infty} \int \phi_k(1 - h)\,d\mu$<br />
$= \lim_{k \to \infty} \int \phi_k h\,d\nu = \int fh\,d\nu$<br />
for all functions $f$ non negative and measurable. But if $f \ge 0$ then $\frac{f}{1 - h} \ge 0$ by<br />
the claim before. Replacing $f$ by $\frac{f}{1 - h}$ in the equation above, we have<br />
$\int \left(\frac{f}{1 - h}\right)(1 - h)\,d\mu = \int \left(\frac{fh}{1 - h}\right)d\nu,$<br />
that is,<br />
$\int f\,d\mu = \int f\frac{h}{1 - h}\,d\nu.$<br />
So $g = \frac{h}{1 - h}$ is the function we were looking for.<br />
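As a consistency check of the construction, consider the situation of Example 87, where the answer is known in advance. The sketch below assumes $\nu$ is finite and writes the density as $\rho \ge 0$ with $\int \rho\,d\nu < \infty$ (the symbol $\rho$ is our notation, chosen to avoid a clash with the test function $f$ in the proof).<br />

```latex
% Setting: mu(E) = \int_E rho dnu, so lambda = mu + nu has density (1 + rho) w.r.t. nu
\[
d\mu = \rho\,d\nu,
\qquad
d\lambda = d\mu + d\nu = (1 + \rho)\,d\nu .
\]
% The function h from the Riesz Representation Theorem must satisfy
% \int u dmu = \int u h dlambda for all u in L^2(lambda); try h = rho/(1+rho):
\[
\int u\,h\,d\lambda
  = \int u\,\frac{\rho}{1+\rho}\,(1+\rho)\,d\nu
  = \int u\,\rho\,d\nu
  = \int u\,d\mu .
\]
% Then 0 <= h < 1, and the formula g = h/(1-h) recovers the density:
\[
1 - h = \frac{1}{1+\rho},
\qquad
g = \frac{h}{1-h} = \frac{\rho/(1+\rho)}{1/(1+\rho)} = \rho .
\]
```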
8.4 Signed measures<br />
Definition. Let $(X, \mathcal{A})$ be a measurable space, where $X$ is the underlying set<br />
and $\mathcal{A}$ is a σ-algebra. A signed measure $\mu$ is a set function defined on $\mathcal{A}$,<br />
$\mu : \mathcal{A} \to [-\infty, \infty]$, satisfying:<br />
(1) $\mu$ attains at most one of the values $\pm\infty$,<br />
(2) $\mu(\emptyset) = 0$,<br />
(3) (Additivity) If $E = \bigsqcup_{n=1}^{\infty} E_n$ is a disjoint union of sets $\{E_n\}$ in $\mathcal{A}$, then<br />
$\mu(E) = \sum_{n=1}^{\infty} \mu(E_n).$<br />
The triple $(X, \mathcal{A}, \mu)$ is called a signed measure space.<br />
We will need the following definition for some measurable sets in $\mathcal{A}$:<br />
Definition. Let $(X, \mathcal{A}, \mu)$ be a signed measure space.<br />
(1) A set $E \in \mathcal{A}$ is called null if for every $E' \subseteq E$ measurable in $\mathcal{A}$, $\mu(E') = 0$.<br />
(2) A set $E \in \mathcal{A}$ is called positive if for every $E' \subseteq E$ measurable in $\mathcal{A}$,<br />
$\mu(E') \ge 0$.<br />
(3) A set $E \in \mathcal{A}$ is called negative if for every $E' \subseteq E$ measurable in $\mathcal{A}$,<br />
$\mu(E') \le 0$.<br />
Lemma 90. Let $(X, \mathcal{A}, \mu)$ be a signed measure space. Let $E \in \mathcal{A}$ be such that<br />
$\mu(E) > 0$, then there exists a positive set $E' \subset E$ such that $\mu(E') > 0$.<br />
Proof. If $E$ is positive, take $E' = E$. If $E$ is not positive, there is a subset of $E$ with negative measure. Let $n_1$<br />
be the smallest positive integer such that there exists $E_1 \subset E$ with $\mu(E_1) < -\frac{1}{n_1}$.<br />
So<br />
$\mu(E \setminus E_1) = \mu(E) - \mu(E_1) > \mu(E) > 0.$<br />
Again, if $E \setminus E_1$ is positive then we are done, otherwise let $n_2$ be the smallest<br />
positive integer such that there exists $E_2 \subset E \setminus E_1$ with $\mu(E_2) < -\frac{1}{n_2}$.<br />
Repeating the argument we obtain a sequence of disjoint sets $\{E_k\}$, and a<br />
sequence of numbers $\{n_k\}$ such that $\mu(E_k) < -\frac{1}{n_k}$. Let $F = \bigcup_{k=1}^{\infty} E_k$, so<br />
$\mu(F) = \sum_{k=1}^{\infty} \mu(E_k) < -\sum_{k=1}^{\infty} \frac{1}{n_k} \le 0.$<br />
Since $\mu(F) > -\infty$, the series converges, hence $\frac{1}{n_k} \to 0$. Let $G \subset E \setminus F$ be a measurable set in $\mathcal{A}$. If $\mu(G) < 0$, then<br />
there exists $k$ such that $\mu(G) < -\frac{1}{n_k - 1}$, but $G \subset E \setminus F \subset E \setminus (E_1 \cup \ldots \cup E_{k-1})$ and<br />
$n_k - 1 < n_k$, contradicting the definition of $n_k$. Therefore $\mu(G) \ge 0$, so $E \setminus F$ is positive, and<br />
$\mu(E \setminus F) = \mu(E) - \mu(F) > 0.$<br />
Theorem 91 (Hahn’s Decomposition Theorem). If $\mu$ is a signed measure on<br />
$\mathcal{A}$, there are sets $P$ and $N$ with $X = P \cup N$ and $P \cap N = \emptyset$, where $P$ is positive<br />
and $N$ is negative.<br />
Proof. The class $\mathcal{P}$ of positive sets is non empty, since $\emptyset \in \mathcal{P}$. Then, we can define<br />
$\alpha = \sup\{\mu(A) \mid A \in \mathcal{P}\}.$<br />
Replacing $\mu$ by $-\mu$ if necessary, we can assume that $\mu$ does not attain the value<br />
$\infty$. By definition, there exists a sequence $\{A_n\}$ of positive sets<br />
such that $\mu(A_n) \to \alpha$. The union of two positive sets is positive, thus,<br />
replacing $A_n$ by $A_1 \cup \ldots \cup A_n$, we can choose $\{A_n\}$ to be increasing. Let $P = \bigcup A_n$. Then $A_n \nearrow P$ and<br />
$\mu(P) = \lim \mu(A_n) = \alpha$, so $\alpha < \infty$. Also $P$ is positive, since for every $E \in \mathcal{A}$<br />
$\mu(E \cap P) = \mu(E \cap \bigcup A_n) = \mu(\bigcup (A_n \cap E)) = \lim \mu(A_n \cap E) \ge 0.$<br />
Let $N = X \setminus P$. We claim that $N$ is negative. Otherwise there is a set $E \subset N$ with<br />
$\mu(E) > 0$; by the previous Lemma, there exists $S \subset E$ positive with $\mu(S) > 0$.<br />
Then $P \cup S$ is positive and<br />
$\mu(P \cup S) = \mu(P) + \mu(S) > \alpha,$<br />
contradicting the definition of $\alpha$.<br />
Lemma 92. If $X = P_1 \cup N_1 = P_2 \cup N_2$ are two Hahn decompositions for $\mu$, then<br />
$\mu(P_1 \triangle P_2) = 0$ and $\mu(N_1 \triangle N_2) = 0$.<br />
Proof. We only prove that $\mu(P_1 \triangle P_2) = 0$. Since $P_1 \setminus P_2$ is positive and $P_1 \setminus P_2 \subset N_2$<br />
is also negative, then $\mu(P_1 \setminus P_2) = 0$. Similarly $\mu(P_2 \setminus P_1) = 0$, and $P_1 \triangle P_2 = (P_1 \setminus P_2) \cup (P_2 \setminus P_1)$<br />
implies the equation.<br />
Definition. Let $\mu, \nu$ be two measures on $(X, \mathcal{A})$. We say that $\mu$ and $\nu$ are mutually<br />
singular, and write $\mu \perp \nu$, if there are disjoint sets $X_\mu$ and $X_\nu$ in $\mathcal{A}$ such<br />
that<br />
$X = X_\mu \cup X_\nu$, and $\mu(X_\nu) = \nu(X_\mu) = 0.$<br />
Theorem 93 (Jordan’s decomposition Theorem). Let $(X, \mathcal{A}, \mu)$ be a signed<br />
measure space. Then there exists a decomposition $\mu = \mu^+ - \mu^-$, where $\mu^+$ and $\mu^-$ are<br />
proper measures and $\mu^+ \perp \mu^-$.<br />
Proof. Let $X = P \cup N$ be a Hahn decomposition of $X$, and for every $E \in \mathcal{A}$<br />
let<br />
$\mu^+(E) = \mu(E \cap P)$ and $\mu^-(E) = -\mu(E \cap N).$<br />
The reader can check that $\mu^+$ and $\mu^-$ are proper measures, $\mu = \mu^+ - \mu^-$ and<br />
$\mu^+ \perp \mu^-$.<br />
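A concrete instance of the Hahn and Jordan decompositions, given as a worked sketch: the signed measure below, on $X = [-1, 1]$ with the Lebesgue measurable sets, is our own illustrative choice.<br />

```latex
% A signed measure with density x with respect to Lebesgue measure on [-1, 1]
\[
\mu(E) = \int_E x\,dx ,
\qquad
P = [0, 1], \quad N = [-1, 0) .
\]
% Jordan decomposition built from this Hahn decomposition
\[
\mu^+(E) = \mu(E \cap P) = \int_{E \cap [0,1]} x\,dx ,
\qquad
\mu^-(E) = -\mu(E \cap N) = \int_{E \cap [-1,0)} |x|\,dx .
\]
% Mutual singularity: mu^+ vanishes on N and mu^- vanishes on P
\[
\mu^+(N) = 0, \qquad \mu^-(P) = 0 .
\]
% Numerical check on E = [-1/2, 1]
\[
\mu^+(E) = \int_0^1 x\,dx = \frac12 ,
\quad
\mu^-(E) = \int_{-1/2}^{0} |x|\,dx = \frac18 ,
\quad
\mu(E) = \int_{-1/2}^{1} x\,dx = \frac38 = \frac12 - \frac18 .
\]
```

Moving the point $0$ from $P$ to $N$ gives another Hahn decomposition; the two differ by a null set, in agreement with Lemma 92.<br />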