
Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981
You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.


Common ground

I have emphasized the difference in philosophy behind arithmetic coding and Lempel–Ziv coding. There is common ground between them, though: in principle, one can design adaptive probabilistic models, and thence arithmetic codes, that are ‘universal’, that is, models that will asymptotically compress any source in some class to within some factor (preferably 1) of its entropy. However, for practical purposes, I think such universal models can only be constructed if the class of sources is severely restricted. A general purpose compressor that can discover the probability distribution of any source would be a general purpose artificial intelligence! A general purpose artificial intelligence does not yet exist.

6.5 Demonstration

An interactive aid for exploring arithmetic coding, dasher.tcl, is available.[2]

A demonstration arithmetic-coding software package written by Radford Neal[3] consists of encoding and decoding modules to which the user adds a module defining the probabilistic model. It should be emphasized that there is no single general-purpose arithmetic-coding compressor; a new model has to be written for each type of source. Radford Neal's package includes a simple adaptive model similar to the Bayesian model demonstrated in section 6.2. The results using this Laplace model should be viewed as a basic benchmark since it is the simplest possible probabilistic model – it simply assumes the characters in the file come independently from a fixed ensemble. The counts {F_i} of the symbols {a_i} are rescaled and rounded as the file is read such that all the counts lie between 1 and 256.
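The adaptive count-based model can be sketched as follows. This is only an illustration of the idea, not Radford Neal's actual code; in particular, the halve-and-round rescaling rule used here is an assumption about how counts might be kept in the range 1 to 256.

```python
# Sketch of an adaptive Laplace-style symbol model with count rescaling.
# Not Radford Neal's actual package; the halving rule is an assumption.

class AdaptiveCounts:
    def __init__(self, alphabet, max_count=256):
        self.max_count = max_count
        # Laplace's rule: start every symbol with a count of 1.
        self.counts = {a: 1 for a in alphabet}

    def probability(self, symbol):
        # Predictive probability is the symbol's count over the total.
        total = sum(self.counts.values())
        return self.counts[symbol] / total

    def observe(self, symbol):
        self.counts[symbol] += 1
        if self.counts[symbol] > self.max_count:
            # Rescale: halve all counts, rounding up so each stays >= 1.
            for a in self.counts:
                self.counts[a] = (self.counts[a] + 1) // 2

model = AdaptiveCounts("ab")
for ch in "aaab":
    model.observe(ch)
# After 'aaab', 'a' has count 4 and 'b' count 2, so P(a) = 4/6.
```

An arithmetic coder would call `probability` to partition the current interval and `observe` after each symbol, so encoder and decoder stay synchronized without transmitting the model.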

A state-of-the-art compressor for documents containing text and images, DjVu, uses arithmetic coding.[4] It uses a carefully designed approximate arithmetic coder for binary alphabets called the Z-coder (Bottou et al., 1998), which is much faster than the arithmetic-coding software described above. One of the neat tricks the Z-coder uses is this: the adaptive model adapts only occasionally (to save on computer time), with the decision about when to adapt being pseudo-randomly controlled by whether the arithmetic encoder emitted a bit.

The JBIG image-compression standard for binary images uses arithmetic coding with a context-dependent model, which adapts using a rule similar to Laplace's rule. PPM (Teahan, 1995) is a leading method for text compression, and it uses arithmetic coding.

There are many Lempel–Ziv-based programs. gzip is based on a version of Lempel–Ziv called ‘LZ77’ (Ziv and Lempel, 1977). compress is based on ‘LZW’ (Welch, 1984). In my experience the best is gzip, with compress being inferior on most files.
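Python's standard `zlib` module exposes DEFLATE, the same LZ77-based algorithm gzip uses, so a quick experiment shows Lempel–Ziv exploiting repeated substrings:

```python
import zlib

# DEFLATE (as used by gzip) is built on LZ77: repeated substrings are
# replaced by back-references to earlier occurrences, so highly
# repetitive input compresses dramatically.
repetitive = b"the cat sat on the mat. " * 100

packed = zlib.compress(repetitive, level=9)
print(len(repetitive), "->", len(packed))  # far fewer bytes than the input
assert zlib.decompress(packed) == repetitive  # lossless round trip
```

Running the same experiment on a file with no long repeats (already-compressed data, for instance) shows little or no gain, consistent with the point above that no single compressor suits every source.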

bzip is a block-sorting file compressor, which makes use of a neat hack called the Burrows–Wheeler transform (Burrows and Wheeler, 1994). This method is not based on an explicit probabilistic model, and it only works well for files larger than several thousand characters; but in practice it is a very effective compressor for files in which the context of a character is a good predictor for that character.[5]
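The transform itself can be written in a few lines; the naive version below sorts all rotations of the input, which is far slower than the suffix-sorting used by real block-sorting compressors but produces the same output.

```python
# Naive Burrows-Wheeler transform: sort all rotations of the input
# (terminated by a unique sentinel '$') and read off the last column.
# Characters that occur in similar contexts end up adjacent in the
# output, which a simple adaptive model can then compress well.

def bwt(s):
    assert s.endswith("$"), "expects a unique '$' sentinel at the end"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

print(bwt("banana$"))  # -> annb$aa
```

Note how the transform groups the repeated 'n's and 'a's together: each character's preceding context determines where its rotation sorts, which is exactly why the transform helps when context predicts the next character.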

[2] http://www.inference.phy.cam.ac.uk/mackay/itprnn/softwareI.html
[3] ftp://ftp.cs.toronto.edu/pub/radford/www/ac.software.html
[4] http://www.djvuzone.org/
[5] There is a lot of information about the Burrows–Wheeler transform on the net. http://dogma.net/DataCompression/BWT.shtml
