
Program #8: The Future Pleads, 'Save Your ASCII Art!'

CSc 352 - Systems Programming and UNIX
Fall 2012 (McCann)
http://www.cs.arizona.edu/classes/cs352/fall12/

Program #8: The Future Pleads, ‘Save Your ASCII Art!’

Due Date: November 20th, 2012, at the beginning of class

Overview: Back in the good ol’ days of computing, when your screen had one window (creatively, the ‘screen’), it showed only ASCII (or EBCDIC) characters, and printers were large, loud, and consumed fan-fold paper by the box-full, users demonstrated their creativity by constructing and distributing “ASCII art”. Each picture used a subset of the printable ASCII characters to form an image by using one character for each pixel of the picture’s representation. Up close, the image often looked like a bunch of characters, but from a distance it looked like ... a bunch of characters that sort of resembled a picture.

It’s a good thing you know all of this, because recently your phone rang and you received an oddly distorted call ... from the future! “Please, you must help. We’ve lost all of the ASCII art created in the past, and have only enough power in our temporal iPad 26 to call back to 2012. Convert all of the ASCII art you can find to the one true image format, PCX, and put it on a USB stick stuck with gum under the bar in Dirtbag’s for us. Our historians are counting on you!”

The future is in luck; there are still a few archives of ASCII art on the ’net. Your task is to write a C program that accepts the filename of such an image’s text file and converts it to a (binary) PCX file. PCX was created by ZSoft for its PC Paintbrush program, which Microsoft distributed with its Microsoft Mouse in the 1980s. Its simplicity earned it a long life and widespread support in image viewing programs, although you hardly ever find an image stored in that format today. Clearly, the format is due for a resurgence in the future!

Even though the PCX format is simple by comparison to JPG or PNG formats, it’s not exactly trivial. Attached to this handout is a brief description of the PCX format. Expect to read that description several times before you have a good feel for what’s going on! Note that we will not be using the color palette section, just the header and the image data sections.

Assignment: Write a complete, well-documented (take that, whm! :-)), well-structured C program named prog08.c that reads an ASCII image from a text file whose name is given on the command line (e.g., myimage.txt) and creates a PCX version of that image named with the same name but a .pcx extension (e.g., myimage.pcx). Following the text file’s name on the command line may be a second argument; see below for details.

Data: While you’re keen to help the future, you have limited time. You will assume that the length of the first line of the supplied text file is the length of all of the file’s lines. If an image has some longer lines, truncate them. For shorter lines, pad them with spaces to reach the length of the first line.

Each image will have at least one pixel (the future doesn’t want void art). If the image filename is not supplied, if it cannot be opened for reading for any reason, or if any out-of-range characters are encountered, display an error message and terminate.
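The truncate-or-pad rule and the character check could be sketched in C roughly as follows. This is only one possible approach, not a required design. The function names normalize_line and row_in_range are made up for illustration, and the lower bound of 32 (space) is an assumption, chosen so that a range ending at tilde (126) holds 95 characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `line` into `out` as exactly `width` characters: longer lines are
 * truncated, shorter ones are padded with spaces. A trailing newline or
 * carriage return is dropped first. `out` must hold width + 1 bytes.
 * (Hypothetical helper, not part of the assignment.) */
void normalize_line(const char *line, size_t width, char *out)
{
    assert(line != NULL && out != NULL);
    size_t len = strcspn(line, "\r\n");   /* length without the line ending */
    if (len > width)
        len = width;                      /* truncate long lines */
    memcpy(out, line, len);
    memset(out + len, ' ', width - len);  /* pad short lines with spaces */
    out[width] = '\0';
}

/* Return 1 if every character of `row` lies in 32..126, else 0.
 * The lower bound of 32 is an assumption (see the text above). */
int row_in_range(const char *row)
{
    for (; *row != '\0'; row++)
        if (*row < 32 || *row > 126)
            return 0;
    return 1;
}
```

Here `width` would be the length of the file’s first line, measured once and then applied to every later line.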

Here’s an example of a small ASCII image file’s contents (from http://www.asciiworld.com):

xMhn. .nlMx
XMMf" xXMX XMXx "lMMX
"=x.. ......-MMMX "MMMMMmMMMMM" XMMM-....... ..x="

x ‘"!MMMMMMMMMMMMx’MM> "MMMMMMM"


tilde (ASCII 126). Any given ASCII image is unlikely to use them all, so your program will have to determine which characters the given image actually uses, and map them to grayscale colors to be used in the PCX version. See the Output section for details.

As mentioned above, the command line is expected to contain at least the file name and extension, if any, of the source file. The same file name (without any extension) is to be used as the name of the PCX file, just with the extension “.pcx” appended. Thus, if the input file name is apple.txt, the output file name would be apple.pcx. You may assume that the last period of the command line argument and any subsequent characters comprise the extension. That is, .ascii is the extension of this.is.my.image.ascii.
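As a rough illustration of the naming rule, here is a minimal C sketch. The function name make_pcx_name is hypothetical, and returning a freshly allocated string is just one reasonable design, not a requirement.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build the output name: everything before the last '.', plus ".pcx".
 * If there is no '.', ".pcx" is simply appended.
 * Returns a malloc'd string (caller frees) or NULL if allocation fails.
 * (Hypothetical helper, not part of the assignment.) */
char *make_pcx_name(const char *input)
{
    assert(input != NULL);
    const char *dot = strrchr(input, '.');
    size_t stem = dot ? (size_t)(dot - input) : strlen(input);
    char *out = malloc(stem + sizeof ".pcx");   /* sizeof counts the '\0' */
    if (out == NULL)
        return NULL;
    memcpy(out, input, stem);
    memcpy(out + stem, ".pcx", sizeof ".pcx");  /* copies the '\0' too */
    return out;
}
```

Because the rule looks only at the last period of the argument, a path such as ./image (a period in the directory part and none in the file name) would be mishandled. The handout’s stated assumption lets you ignore that case.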

Optionally, a second string may be supplied on the command line that is the dark-to-light sequence of the ASCII characters that the user wishes to use for this image. For example, the string @#A8%(, indicates that @ is the darkest shade of gray and , the lightest, for that image. If the string has a larger selection of characters than does the image, no problem; some shades of gray won’t be used. If the opposite is true, display an error message and terminate the program.

Output: In keeping with traditional UNIX utility practice, your program should output nothing to the screen if the conversion to PCX is successful, and it should create the .pcx file with a name constructed as discussed above. Display error messages to stderr as necessary.

As you have probably inferred from the optional dark-to-light command-line string, we aren’t specifying a specific grayscale sequence for the 95 ASCII characters; you are to define your own default sequence of those ASCII characters, to use when the user doesn’t provide one. (When the user does supply one, your program is to create a mapping on the fly.) PCX allows 256 intensities for each of red, green, and blue in the RGB system. To create shades of gray, simply set R = G = B. With 256 intensities for each of R, G, and B, you can form a grayscale for the 95 characters with room to spare.
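One way to build the character-to-gray mapping on the fly is sketched below. The handout does not say how shades should be spaced, so the even spacing from 0 (black) to 255 (white) is an assumption, as are the function name build_gray_table and the choice to reject repeated characters.

```c
#include <assert.h>
#include <string.h>

/* Fill gray[c] with an intensity 0..255 for each character c in `ramp`
 * (ordered darkest to lightest); every other entry becomes -1.
 * Returns 0 on success, or -1 if the ramp is empty, contains a character
 * outside 32..126, or repeats a character (the last rule is an assumption).
 * (Hypothetical helper, not part of the assignment.) */
int build_gray_table(const char *ramp, int gray[128])
{
    assert(ramp != NULL);
    size_t n = strlen(ramp);
    if (n == 0)
        return -1;
    for (int c = 0; c < 128; c++)
        gray[c] = -1;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)ramp[i];
        if (c < 32 || c > 126 || gray[c] != -1)
            return -1;
        /* Spread the shades evenly: first is 0 (black), last is 255 (white). */
        gray[c] = (n == 1) ? 0 : (int)(i * 255 / (n - 1));
    }
    return 0;
}
```

With the ramp @#A8%(, this gives @ the value 0 and , the value 255. While converting, a character whose entry is still -1 is one the ramp does not cover, which is the error case described above. For a pixel with gray value v, the red, green, and blue values are all v.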

To verify that your PCX file is correctly structured, use an image viewing program capable of displaying PCX files. If it can display it, and it looks like your source ASCII art image, your file’s format is probably OK. Some suggestions for image viewers are given below.

Turn In: Use turnin to submit your prog08.c file to the submission directory cs352p08. Of course, make sure that it compiles and executes correctly on lectura before you do.

General Requirements, Hints, and Miscellany:

• If you feel the following PCX file format information is inadequate, you can find descriptions on the ’net. For example: http://www.fileformat.info/format/pcx/egff.htm and http://www.techheap.com/compression/graphics/pcxfmt.html

• Creating a grayscale mapping of ASCII characters to intensities of gray is subjective. http://paulbourke.net/dataformats/asciiart/ offers an incomplete “character ramp” from dark to light that you might wish to use as a starting point (note that the space character is at the far end, but is of course not displayed):

$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\|()1{}[]?-_+~i!lI;:,"^`'.

• Need some ASCII images for testing? http://www.asciiworld.com/ and http://www.chris.com/ascii/ have a lot you can cut-n-paste. (Most images are G-rated, but these sites have ‘adult’ images, too. Shockingly, the pre-Internet era of computing was not terribly different from the modern Internet era. You’ve been warned!)

Don’t bother trying Google Images; it has examples, but formatted as .gif, .jpg, and/or .png files.

• Suggestions for image viewers that can handle PCX files:

– (Nearly) All OSes: Gimp (/usr/bin/gimp in the lab, or free from www.gimp.org)
– Windows: Irfanview (free from www.irfanview.com)
– Linux: Eye of Gnome (/usr/bin/eog in the lab)

You’ll probably have to turn off pixel blending (which goes by various names in menus) in some programs, and use extreme zooming, to get a clear view of the individual pixels in images.



A Description of the PCX Image File Format
(L. McCann)

Historical Background

When Microsoft began selling its own mouse (creatively named “The Microsoft Mouse”), it promoted its use by shipping it with a program called PC Paintbrush, written by a company named ZSoft. Both ZSoft and PC Paintbrush are long gone, but the image file format ZSoft created for it, PCX, lives on, even though today it is more than a bit outdated.

The PCX File Format

A PCX file has two, and possibly three, sections:

1. Header (exactly 128 bytes)
2. Image Data (unlimited)
3. Color Palette (exactly 768 bytes, used only for 256-color images)

Here are the details on these three sections.

1. Header: The header contains several fields, some of which are essentially constants while others change based on the image. The following table briefly explains each field:

Byte(s)  Field                 Type            Value  Remarks
0        Manufacturer          unsigned char   10     10 = ZSoft
1        Version               unsigned char   5      Version 5 is the most recent
2        Encoding              unsigned char   1      1 means that the image is encoded in RLE
3        Bits Per Pixel        unsigned char   8      2^8 = 256, num. of intensities per color
4-5      Xmin                  2-byte integer  -      (Xmin,Ymin) is upper left corner of image
6-7      Ymin                  2-byte integer  -      "
8-9      Xmax                  2-byte integer  -      (Xmax,Ymax) is lower right corner
10-11    Ymax                  2-byte integer  -      "
12-13    Horizontal DPI        2-byte integer  -      often ignored, or = 300 by default
14-15    Vertical DPI          2-byte integer  -      "
16-63    EGA Colormap          48 bytes        0’s    unused for modern graphics modes
64       (unused)              unsigned char   0      Reserved; set to 0
65       Color Planes          unsigned char   3      One each for Red, Green, and Blue
66-67    Bytes per Line        2-byte integer  -      Really bytes/line/color
68-69    Palette Type          2-byte integer  1      1 means a color or grayscale palette
70-71    Horiz. Screen Size    2-byte integer  -      Depends on size of window
72-73    Vertical Screen Size  2-byte integer  -      "
74-127   Filler                54 bytes        0’s    Room for future expansion
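To make the table concrete, here is a hedged C sketch that fills a 128-byte header for a 24-bit image. The function names fill_pcx_header and put16 are made up for illustration. Three things are assumptions rather than requirements: storing 2-byte integers low byte first (which is what the hex dump in the Bonus Example shows), using a DPI of 96 (also taken from that dump), and leaving the screen-size fields at 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value low byte first (little-endian). */
static void put16(uint8_t *p, unsigned v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
}

/* Fill the 128-byte header for a width x height, 24-bit, RLE-encoded image.
 * (Hypothetical helper, not part of the assignment.) */
void fill_pcx_header(uint8_t hdr[128], unsigned width, unsigned height)
{
    assert(width >= 1 && width <= 65535 && height >= 1 && height <= 65535);
    memset(hdr, 0, 128);          /* EGA colormap, reserved byte, filler */
    hdr[0] = 10;                  /* manufacturer: ZSoft */
    hdr[1] = 5;                   /* version */
    hdr[2] = 1;                   /* encoding: RLE */
    hdr[3] = 8;                   /* bits per pixel, per color plane */
    put16(hdr + 4, 0);            /* Xmin */
    put16(hdr + 6, 0);            /* Ymin */
    put16(hdr + 8, width - 1);    /* Xmax */
    put16(hdr + 10, height - 1);  /* Ymax */
    put16(hdr + 12, 96);          /* horizontal DPI (assumed value) */
    put16(hdr + 14, 96);          /* vertical DPI (assumed value) */
    hdr[65] = 3;                  /* color planes: R, G, B */
    put16(hdr + 66, width);       /* bytes per line, per color plane */
    put16(hdr + 68, 1);           /* palette type */
    /* Screen size (bytes 70-73) is left at 0 in this sketch. */
}
```

The filled array could then be written with fwrite(hdr, 1, 128, fp) to a file opened in "wb" mode. Filling a byte array field by field, rather than writing a struct directly, avoids any compiler padding. Some descriptions of the format ask for an even Bytes per Line value; this handout does not mention padding, so the sketch does not pad.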

2. Image Data: The image data (a.k.a. color planes) is usually compressed using a technique called RLE (Run-Length Encoding). How PCX handles both RLE and color planes is covered in this section.

• PCX’s Implementation of RLE.

In RLE, if you have a sequence of identical values (a ‘run’), you store one copy of that value and a repetition quantity instead of the sequence. For example, instead of storing “4 4 4 4 4 4”, RLE stores “6 4”.



But how do you know that 6 is a count, and not a data value? PCX’s version of RLE solves the problem this way:

– If a byte’s value is < 192, then it is the value of a single pixel.
– If it is >= 192, then it is a count equal to itself minus 192. The next byte is repeated that many times within the image.

Notes:

* The maximum byte value is 255, making the longest possible run 63 (255 - 192).
* How can we store a data value that is >= 192? We store it as a run of size 1. For example, 200 is stored as the pair 193 200.
* Instead of just reading to the end of the file, it is smarter to verify that, after decompression, the total number of pixels found is equal to (Xmax - Xmin + 1) * (Ymax - Ymin + 1).
* Runs are not allowed to extend beyond the end of a scan line (that is, beyond the end of a row of pixels).
* (256 colors only!) Immediately following the compressed image data is a byte with the value 12.

As an example, let’s decompress the sequence 195 16 127 193 192 200 6. “195 16” decompresses to “16 16 16” (195 - 192 = 3 copies). 127 is less than 192, and so is just 127. “193 192” becomes the value 192. Finally, “200 6” decompresses to “6 6 6 6 6 6 6 6”. The result: 16 16 16 127 192 6 6 6 6 6 6 6 6
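Although the assignment only needs an encoder, a small decoder is handy for checking your output. The sketch below applies the two rules above. The function name pcx_rle_decode and its buffer-based interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode PCX RLE bytes from `in` into `out`, stopping after `out_cap`
 * values or when the input runs out. Returns the number of values written.
 * (Hypothetical helper, not part of the assignment.) */
size_t pcx_rle_decode(const uint8_t *in, size_t in_len,
                      uint8_t *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    size_t i = 0, n = 0;
    while (i < in_len && n < out_cap) {
        uint8_t b = in[i++];
        if (b < 192) {
            out[n++] = b;                    /* plain data value */
        } else {
            if (i >= in_len)
                break;                       /* count byte with no data byte */
            size_t count = (size_t)b - 192;  /* run length, 0..63 */
            uint8_t value = in[i++];
            while (count-- > 0 && n < out_cap)
                out[n++] = value;
        }
    }
    return n;
}
```

Given the seven bytes 195 16 127 193 192 200 6, this produces the 13 values listed in the example above.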

• Color Planes.

If the image is capable of having more than 256 colors, PCX encodes the color data as part of the picture data, and the third section of a PCX file, the color palette, is not used. (If the color planes field of the header is set to 3, your file is capable of having more than 256 colors.)

Each scan line is stored as three RLE-compressed color lines, R (red), G (green), and B (blue), in that order. Here’s a representation of the color data order for the first two scan lines of an image:

                     / RLE-compressed Red data
Scan Line (Row) #0     RLE-compressed Green data
                     \ RLE-compressed Blue data

                     / RLE-compressed Red data
Scan Line (Row) #1     RLE-compressed Green data
                     \ RLE-compressed Blue data

For example, consider a 2-column by 3-row image of 24 bits per pixel (8 bits per color):

          ( R , G , B )  ( R , G , B )
         -----------------------------
Row 0:  | ( 32, 64,128)  (255,255,255) |
Row 1:  | (255,255,255)  (255,  0,128) |
Row 2:  | (255,255,255)  (255,255,255) |
         -----------------------------

In hexadecimal, here are the PCX RLE encodings of the three rows. Remember, we’re encoding each color of each row separately, and note that C0 (hex) = 192 (decimal):

Colors:                    ----RED----   ---GREEN---   ----BLUE----
Row 0:  Color Sequences:   [ 32 255 ]    [ 64 255 ]    [ 128 255 ]
        Hex encoding:      20 C1 FF      40 C1 FF      80 C1 FF
Row 1:  Color Sequences:   [ 255 255 ]   [ 255 0 ]     [ 255 128 ]
        Hex encoding:      C2 FF         C1 FF 00      C1 FF 80
Row 2:  Color Sequences:   [ 255 255 ]   [ 255 255 ]   [ 255 255 ]
        Hex encoding:      C2 FF         C2 FF         C2 FF
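The encodings above can be reproduced with a short routine that compresses one color sequence of one row at a time. This is a sketch under the rules stated in this handout, not the only valid encoder. The function name pcx_rle_encode_line is hypothetical, and the caller is assumed to supply an output buffer of at least 2 * len bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RLE-encode one color sequence of one scan line (`len` values) into `out`.
 * In the worst case every value needs a count byte, so `out` must have
 * room for 2 * len bytes. Returns the number of bytes written.
 * (Hypothetical helper, not part of the assignment.) */
size_t pcx_rle_encode_line(const uint8_t *line, size_t len, uint8_t *out)
{
    assert(line != NULL && out != NULL);
    size_t i = 0, n = 0;
    while (i < len) {
        uint8_t value = line[i];
        size_t run = 1;
        /* Extend the run, but never past 63 or the end of the line. */
        while (i + run < len && line[i + run] == value && run < 63)
            run++;
        if (run > 1 || value >= 192) {
            out[n++] = (uint8_t)(192 + run);  /* count byte */
            out[n++] = value;
        } else {
            out[n++] = value;                 /* single value below 192 */
        }
        i += run;
    }
    return n;
}
```

Calling it once per color per row keeps runs from crossing a scan line boundary, as the notes require. For the red sequence of Row 0, [32 255], it emits 20 C1 FF.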



3. Color Palette (for 256-color images only):

For 256-color (VGA) images, the palette is 768 bytes in length. Each color is an RGB intensity triple; R, then G, then B, with one byte per color. As an example, here are the first three palette entries:

        R  G  B    R  G  B    R  G  B
Byte    0  1  2    3  4  5    6  7  8
Color      #0         #1         #2

The intensities are in the range 0–255. If the image’s data intensities do not match this range, they must be scaled accordingly. For example, in VGA’s Mode 13h, which was once very popular for MS-DOS games, intensities are in the range 0–63. To save a Mode 13h image in PCX, the intensities must be multiplied by 4.

Because the palette is at the end of the file, it is easy to read the palette before reading the image data. This is desirable so that the image data can be decoded directly to the screen. C’s fseek() function makes this easy: fseek(fp, -768, SEEK_END);

Remember: For this assignment, we are not using the color palette section!

On Converting from ASCII Characters to Colored Pixels

The above example moves from RGB triples (colors of pixels) to an RLE encoding. You may be wondering how to move from ASCII characters to the RGB triples. This example may help.

First, you need a translation table that matches characters to RGB triples. That is, you’d want to say, for example, that a space is white (red = green = blue = FF hex, or (FF,FF,FF)), an ‘X’ is mostly red (C8 hex,0,0), and an ‘O’ is lightly green (0,64 hex,0). Of course, you’ll have to do that for the other 92 characters, too.

Second, you need to form the sequences of red, green, and blue components for each row before you can compress them (and remember that each color of each row is compressed separately).

Let’s say that you want to use the above colors for space, ‘X’ and ‘O’ to convert this tic-tac-toe board into a 3x3 image. The diagram shows the conversion from characters to RGB triples, and the arranging of color components into sequences. From there, the individual sequences can be run-length encoded.

XOX         (C8,00,00) (00,64,00) (C8,00,00)         R:[C8 00 C8] G:[00 64 00] B:[00 00 00]
OX    ==>   (00,64,00) (C8,00,00) (FF,FF,FF)   ==>   R:[00 C8 FF] G:[64 00 FF] B:[00 00 FF]
X O         (C8,00,00) (FF,FF,FF) (00,64,00)         R:[C8 FF 00] G:[00 FF 64] B:[00 FF 00]

ASCII       RGB (R,G,B) Triples                      Unencoded Color Sequences
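The two steps in the diagram (look up each character’s color, then separate the components) could be sketched like this. The three colors are the ones from the tic-tac-toe example. Mapping every other character to black is a placeholder assumption, and the names color_of and row_to_planes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rgb { uint8_t r, g, b; };

/* Example colors from the tic-tac-toe board. Any other character maps to
 * black here purely as a placeholder; a real table covers all characters. */
static struct rgb color_of(char c)
{
    switch (c) {
    case ' ': return (struct rgb){0xFF, 0xFF, 0xFF};  /* white */
    case 'X': return (struct rgb){0xC8, 0x00, 0x00};  /* mostly red */
    case 'O': return (struct rgb){0x00, 0x64, 0x00};  /* green */
    default:  return (struct rgb){0x00, 0x00, 0x00};
    }
}

/* Split one row of `width` characters into separate R, G, and B sequences.
 * Each output array must hold `width` bytes. */
void row_to_planes(const char *row, size_t width,
                   uint8_t *r, uint8_t *g, uint8_t *b)
{
    assert(row != NULL && r != NULL && g != NULL && b != NULL);
    for (size_t i = 0; i < width; i++) {
        struct rgb c = color_of(row[i]);
        r[i] = c.r;
        g[i] = c.g;
        b[i] = c.b;
    }
}
```

For the middle row of the board (O, X, space), this yields R:[00 C8 FF], G:[64 00 FF], and B:[00 00 FF], matching the diagram. Each of the three sequences would then be run-length encoded separately.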



Bonus Example

Consider this grayscale representation of a 4x2x24bpp PCX file:

The non-white pixel on the top row is black, and the pixel on the second row is a dirty green. (View the PDF of this handout to see the lovely green color.) Here is the hexadecimal dump of this PCX file:

$ od -Ax -x -v twopixel.pcx
000000 050a 0801 0000 0000 0003 0001 0060 0060
000010 0000 0000 0000 0000 0000 0000 0000 0000
000020 0000 0000 0000 0000 0000 0000 0000 0000
000030 0000 0000 0000 0000 0000 0000 0000 0000
000040 0300 0004 0001 0000 0000 0000 0000 0000
000050 0000 0000 0000 0000 0000 0000 0000 0000
000060 0000 0000 0000 0000 0000 0000 0000 0000
000070 0000 0000 0000 0000 0000 0000 0000 0000
000080 ffc1 c200 c1ff 00ff ffc2 ffc1 c200 80ff
000090 ffc3 c380 40ff ffc3

The image data section starts at 80 (hex). The byte pairs are inconveniently displayed by od in reverse order, so ffc1 is really c1ff in memory. Put another way, c1 is at address 80 (hex) and ff is at address 81 (hex). Here is a list of the data in sequence:

C1 FF 00 C2 FF C1 FF 00 C2 FF C1 FF 00 C2 FF 80 C3 FF 80 C3 FF 40 C3 FF

Organized into scan lines (rows) and decoded, the data looks like this:

          ---------RED--------  --------GREEN-------  --------BLUE--------
Line 1:   C1 FF 00 C2 FF        C1 FF 00 C2 FF        C1 FF 00 C2 FF
          [255   0 255 255]     [255   0 255 255]     [255   0 255 255]
Line 2:   80 C3 FF              80 C3 FF              40 C3 FF
          [128 255 255 255]     [128 255 255 255]     [ 64 255 255 255]

You can determine the amount of decoded data per color by the length of a line of the image (4 pixels, in this example).
