12.07.2015 Views

Data Compression: The Complete Reference



procedure RowCol(ind,R1,R2,C1,C2: integer);
  case ind of
    0: R2:=(R1+R2)÷2; C2:=(C1+C2)÷2;
    1: R2:=(R1+R2)÷2; C1:=((C1+C2)÷2) + 1;
    2: R1:=((R1+R2)÷2) + 1; C2:=(C1+C2)÷2;
    3: R1:=((R1+R2)÷2) + 1; C1:=((C1+C2)÷2) + 1;
  endcase;
  if ind≤n then RowCol(ind+1,R1,R2,C1,C2);
end RowCol;

main program
  integer ind, R1, R2, C1, C2;
  integer array id[10];
  bit array M[2^n, 2^n];
  ind:=0; R1:=0; R2:=2^n − 1; C1:=0; C2:=2^n − 1;
  RowCol(ind, R1, R2, C1, C2);
  M[R1,C1]:=1;
end;

Figure 8.29: Recursive Procedure RowCol.

8.6 Word-Based Text Compression

All the data compression methods mentioned in this book operate on small alphabets. A typical alphabet may consist of the two binary digits, the sixteen 4-bit pixels, the 7-bit ASCII codes, or the 8-bit bytes. In this section we consider the application of known methods to large alphabets that consist of words.

It is not clear how to define a word in cases where the input stream consists of the pixels of an image, so we limit our discussion to text streams. In such a stream a word is defined as a maximal string of either alphanumeric characters (letters and digits) or other characters (punctuations and spaces). We denote by A the alphabet of all the alphanumeric words and by P, that of all the other words. One consequence of this definition is that in any text stream (whether the source code of a computer program, a work of fiction, or a restaurant menu) words from A and P strictly alternate. A simple example is the C-language source line “␣␣for␣(␣short␣i=0;␣i␣
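The word definition given in this section can be illustrated with a minimal Python sketch. It is not taken from the book: the function name split_words, the 'A'/'P' labels attached to each token, and the restriction to ASCII letters and digits are all assumptions made for illustration. The sketch splits a text stream into maximal runs of alphanumeric characters (words of A) and maximal runs of all other characters (words of P).

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
# Each alternative matches one maximal run, so adjacent tokens
# necessarily come from different alphabets.
_WORD = re.compile(r"(?P<A>[A-Za-z0-9]+)|(?P<P>[^A-Za-z0-9]+)")

def split_words(text):
    """Hypothetical helper: return (kind, word) pairs, kind is 'A' or 'P'."""
    tokens = []
    for match in _WORD.finditer(text):
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
    return tokens

print(split_words("x1 = y2;"))
# [('A', 'x1'), ('P', ' = '), ('A', 'y2'), ('P', ';')]
```

Because every character belongs to exactly one of the two classes and each run is maximal, concatenating the words restores the input exactly, and the kinds alternate between A and P as the text describes.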
