Data Compression: The Complete Reference

4. Image Compression

...they constitute overhead. Generally, histogram compaction improves compression, but in rare cases the overhead may be bigger than the savings due to histogram compaction.

4.14 Vector Quantization

This is a generalization of the scalar quantization method (Section 1.6). It is used for both image and sound compression. In practice, vector quantization is commonly used to compress data that have been digitized from an analog source, such as sampled sound and scanned images (drawings or photographs). Such data is called digitally sampled analog data (DSAD). Vector quantization is based on two facts:

1. We know (from Section 3.1) that compression methods that compress strings, rather than individual symbols, can, in principle, produce better results.

2. Adjacent data items in an image (i.e., pixels) and in digitized sound (i.e., samples; see Section 7.1) are correlated. There is a good chance that the near neighbors of a pixel P will have the same values as P or very similar values. Also, consecutive sound samples rarely differ by much.

We start with a simple, intuitive vector quantization method for image compression. Given an image, we divide it into small blocks of pixels, typically 2×2 or 4×4. Each block is considered a vector. The encoder maintains a list (called a codebook) of vectors and compresses each block by writing on the compressed stream a pointer to the block in the codebook. The decoder has the easy task of reading pointers, following each pointer to a block in the codebook, and joining the block to the image-so-far. Vector quantization is thus an asymmetric compression method.

In the case of 2×2 blocks, each block (vector) consists of four pixels. If each pixel is one bit, then a block is four bits long and there are only 2^4 = 16 different blocks. It is easy to store such a small, permanent codebook in both encoder and decoder.
However, a pointer to a block in such a codebook is, of course, four bits long, so there is no compression gain by replacing blocks with pointers. If each pixel is k bits, then each block is 4k bits long and there are 2^(4k) different blocks. The codebook grows very fast with k (for k = 8, for example, it has 256^4 = 2^32 ≈ 4.3 billion entries), but the point is that we again replace a block of 4k bits with a 4k-bit pointer, resulting in no compression gain. This is true for blocks of any size.

Once it becomes clear that this simple method won't work, the next thing that comes to mind is that any given image may not contain every possible block. Given 8-bit pixels, the number of 2×2 blocks is 2^(2·2·8) = 2^32 ≈ 4.3 billion, but any particular image may contain only a few thousand different blocks. Thus, our next version of vector quantization starts with an empty codebook and scans the image block by block. The codebook is searched for each block. If the block is already in the codebook, the encoder outputs a pointer to the block in the (growing) codebook. If the block is not in the codebook, it is added to the codebook and a pointer is output.

The problem with this simple method is that each block added to the codebook has to be written on the compressed stream. This greatly reduces the effectiveness of the method and may lead to low compression and even to expansion. There is also the small added complication that the codebook grows during compression, so the pointers get longer, but this is not hard for the decoder to handle.
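The no-gain arithmetic above (a 4k-bit block replaced by a pointer into a full codebook of 2^(4k) entries, which is itself 4k bits wide) can be checked with a short computation; this is merely an illustration of the counting argument, not part of any codec:

```python
# For 2x2 blocks with k bits per pixel, a block holds 4k bits and the
# full codebook has 2**(4k) entries, so a pointer into it needs
# ceil(log2(2**(4k))) = 4k bits -- exactly as long as the block itself.
for k in (1, 2, 4, 8):
    block_bits = 4 * k
    codebook_entries = 2 ** block_bits
    pointer_bits = codebook_entries.bit_length() - 1  # log2 of a power of 2
    print(f"k={k}: block={block_bits} bits, "
          f"codebook={codebook_entries} entries, pointer={pointer_bits} bits")
```

For k = 8 this reports a codebook of 4,294,967,296 (≈ 4.3 billion) entries and a 32-bit pointer for a 32-bit block: no compression, for any k.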

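The growing-codebook method just described can be sketched in a few lines. This is a minimal illustration with hypothetical function names, not the book's implementation; it omits the bit-level output and the variable-length pointers mentioned above, and simply flags which blocks the decoder would have to read from the compressed stream:

```python
def vq_encode(blocks):
    """Encode a sequence of pixel blocks (hashable tuples) into pointers.

    Returns (pointers, codebook, new_flags): each pointer indexes the
    growing codebook; a True flag marks a block that was new when it was
    encoded, i.e. one whose pixels must also appear on the stream.
    """
    codebook = []   # grows as previously unseen blocks appear
    index = {}      # block -> its position in the codebook
    pointers, new_flags = [], []
    for block in blocks:
        if block not in index:               # new block: add to codebook
            index[block] = len(codebook)
            codebook.append(block)
            new_flags.append(True)
        else:                                # seen before: just a pointer
            new_flags.append(False)
        pointers.append(index[block])
    return pointers, codebook, new_flags

def vq_decode(pointers, codebook):
    """Rebuild the block sequence by following each pointer."""
    return [codebook[p] for p in pointers]
```

Note that this variant is lossless: the decoder recovers every block exactly, and the compression gain comes only from blocks that repeat, which is why an image with many distinct blocks can even expand.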