a la physique de l'information - Lisa - Université d'Angers

Recommendations

Info

Author's personal copy 3970 F. Chapeau-Blondeau, D. Rousseau / Physica A 388 (2009) 3969–3984 process order estimation [9,10], data clustering [11,12], signal denoising [13,14], image segmentation [15,16], curve fitting [17,18], analysis of chaotic systems [19,20], genomic sequencing [21,22], neural networks [23,24]. Novel applications also are still emerging [6]. We believe that the MDL approach still holds many potentialities relevant to scientific investigation. A specifically interesting aspect is that the MDL principle offers a unifying thread for approaching many distinct tasks of signal and data processing that otherwise would stand as separate problems. Furthermore, the unified view which is provided is formulated as a information-theoretic framework, and this may be specially relevant to advance an information point of view in science [25–27]. Application of the MDL principle to probability density estimation by histograms was introduced in Ref. [28]. Part of the present paper consists in reviewing this approach of Ref. [28], and also in providing additional illustrative examples, through a presentation emphasizing intuitive and concrete arguments. Implementation of the MDL principle critically relies on definite specifications for measuring the description lengths. As another part of the present paper, we also consider alternative ways of measuring the description lengths, which differ from the choice made in Ref. [28], and which arguably can be found more suited in this context of probability density estimation by histograms. We also explicitly exhibit here the complete forms of the description lengths that arise from the various choices, through formulas involving the information entropy and redundancy of the data, and which are not given in other studies. And we analyze and compare these formulas for the description lengths. We also provide an application to measured data, in the line of a presentation emphasizing concrete and physical appreciation of the MDL approach. In this way, for a part the present paper has a pedagogical and illustrative intent as it proposes a detailed and illustrated review emphasizing concrete interpretations and intuition, on the MDL principle for probability density estimation by histograms. For another part, the paper provides additional results and insight with comparison of alternative choices and complementary analyses. Minimum description length is often associated with another comparable approach identified as minimum stochastic complexity. These are two distinct, although related, approaches. In particular, stochastic complexity is usually based on the introduction, for the parameters of the model, of a specific prior probability distribution, upon which the subsequent results depend. A uniform prior can be used as in Ref. [28], or the so-called Jeffreys prior as in Ref. [3]. Both description length and stochastic complexity are examined in Ref. [28] for probability density estimation by histograms. Ref. [29] concentrates on stochastic complexity with uniform prior for probability density estimation by histograms. These two notions of description length and stochastic complexity can be defined as distinct notions, as it emerges from Refs. [28,29,3]. However, some other studies imply the terminologies ‘‘description length’’ and ‘‘stochastic complexity’’ as synonyms to designate a same underlying notion. Ref. [30] uses the terminologies ‘‘description length’’ and ‘‘stochastic complexity’’ essentially as synonymous, although there is a single underlying notion which is description length as we understand it here, and not stochastic complexity as in Refs. [28,29,3]. Ref. [30] provides detailed mathematical proofs concerning asymptotic properties and a general theoretical bound, through the introduction of an index of resolvability, for the statistical accuracy and efficacy of probability density estimation by any type of estimators, not necessarily histograms. Further refinements and improvements on these theoretical properties are given in Ref. [31]. Two asymptotic theorems are also proved in Ref. [28], and two theorems concerning upper bounds are established in Ref. [29]. Ref. [32] confronts, for histogram estimation, several forms of penalized maximum-likelihood methods that include the MDL and stochastic complexity based approaches of Ref. [28]. Refs. [33,34] present another form of MDL for histogram density estimation, as they define stochastic complexity by means of the notion of normalized maximum likelihood to avoid a specific prior and in order to obtain a minimax optimality, and then complement this stochastic complexity by a measure of the description length of the parameters to form the criterion to be minimized. In our present paper, for probability density estimation by histograms, we concentrate on the minimum description length, as in Ref. [28] and Ref. [30], and not on the minimum stochastic complexity as considered in Refs. [28,29] with uniform prior, or in Ref. [3] with Jeffreys’ prior, or in Refs. [33,34] via normalized maximum likelihood. We see this minimum description length endowed with the advantage of a simple and concrete informational interpretation which is not shared by the minimum stochastic complexity. We review, illustrate and complement the MDL approach here. So far, MDL for probability density estimation by histograms has mainly been discussed in the literature connected to information theory and statistics. Formal proofs have been established for important mathematical properties of the approach. As a complement, we propose here to discuss the MDL methodology in a more physically-oriented presentation, leaning on concrete intuition and illustrative examples. Such a relation between information theory and statistical physics seems interesting to us to promote for the potentialities of mutual enrichment, as for instance illustrated by the recent studies of Refs. [35–38]. 2. A histogram model for probability density One disposes of N observed data points xn forming the data set x = {xn, n = 1, . . . N}. (1) These N data points xn are assumed to be N independent realizations of a random variable X distributed according to the probability density function f (x). The probability P(x) of observing a given data set x is therefore expressible as P(x) = dx N N f (xn), (2) n=1 where dx measures the infinitesimal domain of reference around xn. 151/197
Author's personal copy F. Chapeau-Blondeau, D. Rousseau / Physica A 388 (2009) 3969–3984 3971 One seeks to estimate the probability density f (x) from the N data points xn of Eq. (1). For this purpose, a histogram model is introduced for the unknown density f (x) under the common form of an approximation by a piecewise constant function. This histogram model is denoted M and is defined as follows. The density f (x) is modeled by K constant plateaus of value fk, for k = 1 to K , each of these plateaus being defined in the abscissa between xmin and xmax over a regular bin of width δx = xmax − xmin K = x , (3) K with xmin and xmax respectively the minimum and maximum values of the xn’s over the data set x of Eq. (1). Especially, consistency of the probability density model imposes K fkδx = 1. (4) k=1 The probability P(x) of Eq. (2), based on the histogram model M for the density f (x), is expressible as P(x) = dx N K k=1 f Nk k , (5) where Nk is the number of data points xn of the data set x that fall within bin number k, verifying K k=1 Nk = N. 3. Maximum-likelihood histogram estimation When the number of bins K is fixed, the density model M is specified by the K parameters fk for k = 1 to K . To determine these parameters from the data, a standard approach is the maximum-likelihood method [39] which consists in selecting those values of the parameters fk that maximize the probability P(x) in Eq. (5) of the observed data set x. Maximizing P(x) of Eq. (5) under the constraint of Eq. (4) is achieved by the well-known maximum-likelihood solution fk = Nk , k = 1, . . . K. (6) Nδx The maximum-likelihood solution of Eq. (6) completely specifies, for the probability density f (x), the histogram model with a fixed number K of regular bins. 4. Minimum description length Another point of view can be adopted to arrive at the solution of Eq. (6). Information theory stipulates that to code data xn appearing with probability P(xn), the optimal code assigns a codeword with length − log P(xn). To code the whole data set x of Eq. (1), the optimal code assigns a length − log P(x), which by the probability model of Eq. (5) is Ldata = − log P(x) = − log(dx N ) − K Nk log(fk). (7) k=1 The maximum-likelihood solution of Eq. (6) maximizes the likelihood P(x) of Eq. (5) and equivalently the loglikelihood log P(x). Therefore, the solution of Eq. (6) also minimizes the code length Ldata = − log P(x) of Eq. (7). The solution of Eq. (6) selects from the data, the K parameters fk of the probability density model M, so that the optimal code designed for the data from this density model, achieves the minimal code length. This is the rationale of the MDL principle: to select the parameters of the model that allow the shortest coding of the complete data. This guarantees that the selected model is the best (within its class) at capturing the structures and regularities in the data. We can add here, that the minimum of the description length (7) achieved by the solution of Eq. (6) can be expressed as x Lmin = NH({pk}) − N log(K) + N log , (8) dx where we have introduced the entropy H({pk}) = − K pk log(pk) (9) k=1 of the empirical probabilitiespk = fkδx = Nk/N deduced from Eq. (6). Here, when the number of bins K of the histogram model is fixed in an a priori way, the MDL solution coincides with the maximum-likelihood solution of Eq. (6). However, the MDL principle can be extended to also optimally select the number of bins K of the model from the data, along with the K parameter values fk for k = 1 to K . This extension proceeds in the 152/197
Page 1 and 2:
Université d’Angers Laboratoire
Page 3 and 4:
Ce mémoire est dédié à mon épo
Page 5 and 6:
4 Physique de l’information 39 4.
Page 7 and 8:
1.2 Organisation du document L’es
Page 9 and 10:
4/197
Page 11 and 12:
de Rennes 1. Licence et Maîtrise
Page 13 and 14:
2.5 Activités de recherche 2.5.1 B
Page 15 and 16:
2.5.2 Encadrement Thèses Thèse so
Page 17 and 18:
2.5.3 Responsabilités Management d
Page 19 and 20:
[A20] S. BLANCHARD, D. ROUSSEAU, D.
Page 21 and 22:
In 8th Euro-American Workshop on In
Page 23 and 24:
études visaient alors à analyser
Page 25 and 26:
Fig. 3.1 réside dans le fait qu’
Page 27 and 28:
également permis de réaliser qu
Page 29 and 30:
caracteristique effective g eff (u)
Page 31 and 32:
des architectures neuronales contr
Page 33 and 34:
Dans l’ Éq.(3.10), ∆t représe
Page 35 and 36:
quantifieur, le niveau de saturatio
Page 37 and 38:
exemples présentés dans [39], l
Page 39 and 40:
• le type de validation des résu
Page 41 and 42:
présenté de façon condensée dan
Page 43 and 44:
38/197
Page 45 and 46:
pour l’étude de la turbulence pa
Page 47 and 48:
A B C Figure 4.1 : Images de coupe
Page 49 and 50:
atteint la capacité informationnel
Page 51 and 52:
* Figure 4.5 : Échelle optimale d
Page 53 and 54:
• L’“intégrale de corrélati
Page 55 and 56:
4.2.3 Analyse multifractale en imag
Page 57 and 58:
fonction de partition Z 10 30 10 25
Page 59 and 60:
exposant τ(q) 15 10 5 0 −5 −10
Page 61 and 62:
complexité colorimétrique des ima
Page 63 and 64:
dimension fractale généralisée D
Page 65 and 66:
est la moyenne du signal calculée
Page 67 and 68:
A B C Figure 4.17 : Observations in
Page 69 and 70:
A B C Figure 4.20 : Nombre moyen de
Page 71 and 72:
[14] S. Blanchard, D. Rousseau et F
Page 73 and 74:
[43] J. Chauveau, D. Rousseau et F.
Page 75 and 76:
[76] A. Humeau, B. Buard, F. Chapea
Page 77 and 78:
[100] M. D. McDonnell, N. G. Stocks
Page 79 and 80:
[131] D. Rousseau, F. Duan, J. Roja
Page 81 and 82:
76/197
Page 83 and 84:
D. ROUSSEAU and F. CHAPEAU-BLONDEAU
Page 85 and 86:
ROUSSEAU AND CHAPEAU-BLONDEAU: BAYE
Page 87 and 88:
ROUSSEAU AND CHAPEAU-BLONDEAU: BAYE
Page 89 and 90:
D. ROUSSEAU, G. V. ANAND, and F. CH
Page 91 and 92:
quantizers. Classically, the design
Page 93 and 94:
Fig. 2. Nonlinear array used as pre
Page 95 and 96:
probability of error P er 10 0 10 -
Page 97 and 98:
probability of error (expected as s
Page 99 and 100:
[3] N.H. Lu, B.A. Einstein, Detecti
Page 101 and 102:
Abstract Neurocomputing 71 (2007) 3
Page 103 and 104:
3. Assessing nonlinear transmission
Page 105 and 106: output SNR 250 200 150 100 50 0 N=3
Page 107 and 108: precisely the scope of the present
Page 109 and 110: [4] F. Chapeau-Blondeau, G. Chauvet
Page 111 and 112: Nonlinear SNR amplification of harm
Page 113 and 114: P.R. BHAT and D. ROUSSEAU and G. V.
Page 115 and 116: It is assumed that the signal s is
Page 117 and 118: Let us de£ne the improvement in pe
Page 119 and 120: F. CHAPEAU-BLONDEAU and D. ROUSSEAU
Page 121 and 122: Contents Raising the noise to impro
Page 123 and 124: Raising the noise to improve perfor
Page 135 and 136: S. BLANCHARD, D. ROUSSEAU, D. GINDR
Page 137 and 138: 1984 OPTICS LETTERS / Vol. 32, No.
Page 139 and 140: F. CHAPEAU-BLONDEAU, D. ROUSSEAU, S
Page 141 and 142: 1288 J. Opt. Soc. Am. A/Vol. 25, No
Page 147 and 148: 36 IEEE SIGNAL PROCESSING LETTERS,
Page 149 and 150: 38 IEEE SIGNAL PROCESSING LETTERS,
Page 151 and 152: A. HISTACE and D. ROUSSEAU. Constru
Page 153 and 154: noises Z i in (3) are chosen Gaussi
Page 155: Author's personal copy Physica A 38
Page 159 and 160: Author's personal copy F. Chapeau-B
Page 161 and 162: edundancy Author's personal copy F.
Page 163 and 164: model description length Author's p
Page 165 and 166: total description length x 10 4 1.6
Page 171 and 172: J. CHAUVEAU and D. ROUSSEAU and F.
Page 173 and 174: 2 for natural images, and carry rel
Page 175 and 176: 4 Fig. 2 Random RGB color image I2(
Page 177 and 178: 6 log2[ number of boxes N(a) ] 24 2
Page 179 and 180: 8 Fig. 7 Color histogram in the RGB
Page 181 and 182: 10 log2[ number of boxes N(a) ] 8 7
Page 183 and 184: 12 17. Landgrebe, D.: Hyperspectral
Page 185 and 186: Numerical simulation of laser Doppl
Page 187 and 188: esults are meaningful to better app
Page 189 and 190: A. HUMEAU, B. BUARD, F. CHAPEAU-BLO
Page 191 and 192: 618 A Humeau et al 1. Introduction
Page 193 and 194: 620 A Humeau et al (a) (b) (c) Figu
Page 195 and 196: 622 A Humeau et al (a) (b) (c) Figu
Page 197 and 198: 624 A Humeau et al Figure 5. Averag
Page 199 and 200: 626 A Humeau et al Therefore, the m
Page 201 and 202: 628 A Humeau et al peripheral cardi
show all

a la physique de l'information - Lisa - Université d'Angers

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?