15.08.2013 Views

General Computer Science 320201 GenCS I & II Lecture ... - Kwarc


Now, checking whether a code is a prefix code can be a tedious undertaking: the naive algorithm for this needs to check all pairs of codewords. Therefore we will look at a couple of properties of character codes that will ensure a prefix code and thus decodability.
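The naive pairwise check mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name `is_prefix_code` and the representation of a code as a dictionary from characters to codeword strings are assumptions made here, not notation from the lecture.

```python
from itertools import permutations

def is_prefix_code(code):
    """Naive check over all ordered pairs of distinct characters.

    `code` maps characters to codeword strings (an assumed representation).
    The code is rejected as soon as one codeword is a prefix of another;
    equal codewords for distinct characters are rejected as well.
    """
    for a, b in permutations(code, 2):
        if code[b].startswith(code[a]):
            return False
    return True

good = {"a": "0", "b": "10", "c": "11"}
bad = {"a": "0", "b": "01", "c": "11"}   # "0" is a prefix of "01"

print(is_prefix_code(good))
print(is_prefix_code(bad))
```

For n codewords the loop inspects n·(n−1) ordered pairs, which is why sufficient conditions that avoid the pairwise comparison are attractive.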

Sufficient Conditions for Prefix Codes

Theorem 215 If c is a code with |c(a)| = k for all a ∈ A for some k ∈ ℕ, then c is a prefix code.

Proof: by contradiction.

P.1 If c is not a prefix code, then there are a, b ∈ A with c(a) ⊳ c(b).

P.2 Then clearly |c(a)| < |c(b)|, which contradicts our assumption that all codewords have the same length k.
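As a small sanity check of Theorem 215, the following Python sketch tests the fixed-length hypothesis in a single pass and confirms on one example that no codeword is a proper prefix of another. The helper names `has_fixed_length` and `is_proper_prefix` and the example code are hypothetical, chosen only for this illustration.

```python
def has_fixed_length(code):
    """True if all codewords of `code` have one common length k."""
    return len({len(word) for word in code.values()}) == 1

def is_proper_prefix(u, v):
    """True if u is a prefix of v and strictly shorter than v."""
    return len(u) < len(v) and v.startswith(u)

fixed = {"a": "00", "b": "01", "c": "10"}

# A proper prefix is strictly shorter, so it cannot occur among
# codewords that all have the same length.
violations = [(u, v) for u in fixed.values() for v in fixed.values()
              if is_proper_prefix(u, v)]

print(has_fixed_length(fixed))
print(violations)
```

Checking the hypothesis needs only one pass over the codewords, in contrast to the pairwise comparison of the naive algorithm.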

Theorem 216 Let c: A → B⁺ be a code and ∗ ∉ B be a character, then there is a prefix code c∗: A → (B ∪ {∗})⁺, such that c(a) ⊳ c∗(a), for all a ∈ A.

Proof: Let c∗(a) := c(a) + "∗" for all a ∈ A.

P.1 Obviously, c(a) ⊳ c∗(a).

P.2 If c∗ is not a prefix code, then there are a, b ∈ A with c∗(a) ⊳ c∗(b).

P.3 Since c∗(a) ends in ∗, c∗(b) contains the character ∗ not only at the end but also somewhere in the middle.

P.4 This contradicts our construction c∗(b) = c(b) + "∗", where c(b) ∈ B⁺.
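The construction in this proof can be tried out directly. The sketch below appends a stop character to every codeword and then decodes by cutting the encoded string at each stop character. The names `add_terminator` and `decode`, the choice of "*" as stop character, and the three-letter excerpt of Morse code are assumptions for illustration only.

```python
def add_terminator(code, stop="*"):
    """Build c* from c by appending `stop` to every codeword.

    `stop` must not occur in any original codeword; otherwise the
    argument of the proof no longer applies.
    """
    for word in code.values():
        if stop in word:
            raise ValueError("stop character occurs in a codeword")
    return {char: word + stop for char, word in code.items()}

def decode(code_star, text, stop="*"):
    """Decode a string of c* codewords by splitting at `stop`."""
    inverse = {word: char for char, word in code_star.items()}
    pieces = text.split(stop)[:-1]   # text ends in stop, so the last piece is empty
    return "".join(inverse[piece + stop] for piece in pieces)

morse_excerpt = {"a": ".-", "e": ".", "t": "-"}   # "." is a prefix of ".-"
star = add_terminator(morse_excerpt)
encoded = "".join(star[ch] for ch in "tea")

print(encoded)
print(decode(star, encoded))
```

Without the stop character, the string ".-" could be read as "a" or as "et"; with it, the codeword boundaries are explicit.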

©: Michael Kohlhase 124

2.4.3 Character Codes in the Real World<br />

We will now turn to a class of codes that are extremely important in information technology: character encodings. The idea here is that for IT systems we need to encode characters from our alphabets as bit strings (sequences of binary digits 0 and 1) for representation in computers. Indeed, the Morse code we have seen above is a very simple example of a character encoding, one that is geared towards the manual transmission of natural languages over telegraph lines. For the encoding of written texts we need more extensive codes that can, e.g., distinguish uppercase and lowercase letters.

The ASCII code we will introduce here is one of the first standardized and widely used character encodings for a complete alphabet. It is still widely used today. The code tries to strike a balance between being able to encode a large set of characters and the representational capabilities in the time of punch cards (cardboard cards that represented sequences of binary numbers by rectangular arrays of dots).⁶

The ASCII Character Code

Definition 217 The American Standard Code for Information Interchange (ASCII) code assigns characters to the numbers 0-127. In the table, the row label gives the first and the column label the second hexadecimal digit of the number:

Code  ···0  ···1  ···2  ···3  ···4  ···5  ···6  ···7  ···8  ···9  ···A  ···B  ···C  ···D  ···E  ···F

0···  NUL   SOH   STX   ETX   EOT   ENQ   ACK   BEL   BS    HT    LF    VT    FF    CR    SO    SI

1···  DLE   DC1   DC2   DC3   DC4   NAK   SYN   ETB   CAN   EM    SUB   ESC   FS    GS    RS    US

2···  SP    !     "     #     $     %     &     '     (     )     *     +     ,     -     .     /

3···  0     1     2     3     4     5     6     7     8     9     :     ;     <     =     >     ?

4···  @     A     B     C     D     E     F     G     H     I     J     K     L     M     N     O

5···  P     Q     R     S     T     U     V     W     X     Y     Z     [     \     ]     ^     _

6···  `     a     b     c     d     e     f     g     h     i     j     k     l     m     n     o

7···  p     q     r     s     t     u     v     w     x     y     z     {     |     }     ~     DEL
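To relate the table to actual numbers, the following Python sketch computes the row and column of a character from its code number and writes a text as 7-bit strings. It relies on Python's built-in `ord`, whose values coincide with ASCII for the characters 0-127; the function names `ascii_cell` and `to_bits` are made up for this illustration.

```python
def ascii_cell(ch):
    """Return (row, column) of ch in the table as hexadecimal digits."""
    n = ord(ch)
    if n > 127:
        raise ValueError("not an ASCII character")
    # The high three bits select the row, the low four bits the column.
    return format(n >> 4, "X"), format(n & 0xF, "X")

def to_bits(text):
    """Encode text as a list of 7-bit strings, one per character."""
    for ch in text:
        if ord(ch) > 127:
            raise ValueError("not an ASCII character")
    return [format(ord(ch), "07b") for ch in text]

print(ascii_cell("A"))
print(ascii_cell("z"))
print(to_bits("Hi"))
```

Since every character is mapped to exactly seven bits, this bit-string code has fixed length and is therefore a prefix code by Theorem 215.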

6 EdNote: is the 7-bit grouping really motivated by the cognitive limit?
