
Data Encryption Based On Protein Synthesis - Nguyen Dang Binh


Figure 3: Illustration of the first method - one byte of data is translated to one byte of code.

Figure 4: Illustration of the second method - one byte of data is translated to a string.

In Figures 5-a and 5-b, these two coding methods are compared in terms of their coding structure.

Figure 5-a: the table of unique codes produced by the first method; its elements are 00, 01, 10, and 11.

Figure 5-b: the table of unique codes produced by the second method; its elements are A, H, K, and M.

3.3 The transmission

Since data communication in digital systems takes the form of 0s and 1s, the first method complies well with these communication channels. The second method, however, produces alphabetic outputs rather than binary digits, so additional encoding is required to transform the alphabet into the binary system while improving security. This is carried out by Huffman coding.

4 Required background of Huffman Code

Huffman coding is a lossless method of compressing data and a form of entropy encoding [1, 6]. Lossless data compression is an algorithm in which the original data can be reconstructed exactly from the compressed data. Entropy encoding is lossless data compression that assigns codes to symbols, with the length of each codeword inversely related to the frequency of that symbol [4, 7]. Specifically, Huffman codes are variable-length codes: shorter codewords are assigned to the symbols with the highest frequencies. This gives them an advantage over fixed-length codes, as the total number of bits produced by encoding data with a variable-length code is fewer than by encoding the same data with a fixed-length code.
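As a minimal sketch of this advantage (not taken from the paper; the symbols and frequencies below are assumed for illustration), compare the average codeword length of a fixed-length code and a variable-length code over the four-symbol alphabet A, H, K, M:

```python
# Hypothetical symbol frequencies (probabilities summing to 1).
freqs = {"A": 0.5, "H": 0.25, "K": 0.125, "M": 0.125}

# A fixed-length code for 4 symbols needs 2 bits per symbol.
fixed_avg = 2.0

# A Huffman code for these frequencies, derived by hand:
# the most frequent symbol gets the shortest codeword.
huffman = {"A": "0", "H": "10", "K": "110", "M": "111"}

# Average bits per symbol = sum of frequency * codeword length.
var_avg = sum(freqs[s] * len(code) for s, code in huffman.items())

print(fixed_avg)  # 2.0
print(var_avg)    # 1.75
```

Under these assumed frequencies the variable-length code averages 1.75 bits per symbol against 2 bits for the fixed-length code; the gap grows as the frequency distribution becomes more skewed.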

4.1 Encoding using Huffman Tree

The technique works by building a binary tree of nodes. First, each symbol to be encoded is treated as a single tree consisting of one node, labelled with the frequency of its symbol. The two trees whose root nodes have the smallest frequencies are selected and become the subtrees of a new root, whose frequency is the sum of the frequencies of its subtrees; the two selected subtrees are then removed from the forest. This process repeats until only one tree - the Huffman encoding tree - remains. Each level is represented using a one-bit code, 0 or 1: classically, a value of 0 is associated with an edge to a left child and a value of 1 with an edge to a right child. Thus the codes of the most frequent symbols have fewer bits, as their leaves are nearer to the root [1].
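The construction above can be sketched as follows. This is an illustrative implementation, not the paper's code; the symbol frequencies are assumed, and a counter is used only to break ties deterministically when two trees have equal frequency:

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman code from a {symbol: frequency} map.

    Repeatedly merges the two lowest-frequency trees into a new
    root whose frequency is the sum of its subtrees, then labels
    left edges 0 and right edges 1 (the classical convention).
    """
    tiebreak = count()  # keeps heap comparisons off the tree nodes
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # least frequent tree
        f2, _, right = heapq.heappop(heap)  # second least frequent
        # New root: frequency is the sum of its two subtrees.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):        # internal node: recurse
            walk(node[0], prefix + "0")    # left edge labelled 0
            walk(node[1], prefix + "1")    # right edge labelled 1
        else:
            codes[node] = prefix or "0"    # leaf: record its codeword
    walk(heap[0][2], "")
    return codes

# Assumed frequencies for the four symbols from Figure 5-b.
codes = huffman_codes({"A": 8, "H": 4, "K": 2, "M": 2})
print(codes)  # {'A': '0', 'H': '10', 'K': '110', 'M': '111'}
```

Note that the most frequent symbol, A, ends up nearest the root and so receives the shortest codeword, exactly as the text describes.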

There are two variations for constructing Huffman codes: arbitrary and right-heavy Huffman coding. In the former, after combining the two least frequent symbols as subtrees of a new root, the decision of which subtree to assign as the right or left child of the root is arbitrary. In the latter, the subtree with the greater frequency is always assigned as the right child. "By concatenating the labels associated with the edges that make up the path from the root to a leaf, we get a binary string. Thus the mapping is defined [1]."
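The right-heavy merge step can be sketched in isolation. This helper is hypothetical (the paper names the variant but gives no code); it takes two (frequency, tree) pairs and always places the heavier subtree on the right:

```python
def merge_right_heavy(t1, t2):
    """Combine two (frequency, tree) pairs into a new root.

    In the right-heavy Huffman variant, the subtree with the
    greater frequency is always assigned as the right child;
    the combined root's frequency is the sum of both subtrees.
    """
    (f1, a), (f2, b) = t1, t2
    left, right = (a, b) if f1 <= f2 else (b, a)
    return (f1 + f2, (left, right))

# Merging a leaf K (frequency 2) with a subtree H (frequency 4):
total, tree = merge_right_heavy((2, "K"), (4, "H"))
print(total, tree)  # 6 ('K', 'H')
```

The result is the same whichever order the arguments arrive in, which is precisely what distinguishes this variant from the arbitrary one: the child placement is fixed by frequency, not by choice.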
