08.01.2013 Views

LNCS 2950 - Aspects of Molecular Computing (Frontmatter Pages)



Digital Information Encoding on DNA

by grouping (see below) DNA bases into two groups, the 0-group (say a, t) and the 1-group (say c, g), corresponding to the bits of T, and then rewriting the codewords in C so that their bits are interpreted as one (a or c) or the other (g or t). Codewords are generated in such a way that the distances are essentially preserved (in a proportion of 1 : 1), as shown in Table 1(b). The quality of the code set produced naturally depends on the self-distance of T and the error-correcting capability of C. If a minimum self-distance is assured, the method can be used to produce codewords with appropriate c/g content to have nearly uniform melting temperatures. In [2], an exhaustive search was conducted of templates of length l ≤ 32 and of templates with minimum self-distance of about l/3. However, the thermodynamical analysis and lab performance are still open problems. To produce a point of comparison, the template method was used, although with a different seed code (32-bit BCH codes [26]). The pairwise Gibbs energy profiles of the series of template codeword sets obtained using some templates listed in [2] are shown below in Fig. 2(a).
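The template step can be illustrated with a minimal sketch. It assumes the convention read off the text above: the template bit selects the group (0 for a/t, 1 for c/g) and the codeword bit selects the base within the group (0 for a or c, 1 for t or g). The template and codewords used here are hypothetical toy values, not the templates of [2] or the BCH codes of [26].

```python
# Assumed convention: (template bit, codeword bit) -> base.
BASE = {
    (0, 0): "a",  # 0-group (a/t), codeword bit 0
    (0, 1): "t",  # 0-group (a/t), codeword bit 1
    (1, 0): "c",  # 1-group (c/g), codeword bit 0
    (1, 1): "g",  # 1-group (c/g), codeword bit 1
}


def template_encode(template: str, codeword: str) -> str:
    """Combine a binary template and a binary codeword into a DNA word."""
    if len(template) != len(codeword):
        raise ValueError("template and codeword must have the same length")
    return "".join(BASE[int(t), int(c)] for t, c in zip(template, codeword))


def hamming(u: str, v: str) -> int:
    """Number of positions at which two equal-length words differ."""
    return sum(p != q for p, q in zip(u, v))


template = "110100"                          # hypothetical template T
codewords = ["000000", "111000", "000111"]   # hypothetical code C
strands = [template_encode(template, w) for w in codewords]
print(strands)  # ['ccacaa', 'ggtcaa', 'ccagtt']
```

In this sketch every strand has as many c/g positions as the template has 1s, and the Hamming distance between two strands equals the distance between their binary codewords, which is the 1 : 1 preservation described above.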

3.2 Tensor Products<br />

A new iterative technique is now introduced to systematically construct large code sets with high values for the minimum h-distance τ between codewords. The technique requires a base code of short biners (a good set of short oligos) to seed the process. It is based on a new operation between strands, called the tensor product between words from two given codes of s- and t-biners, which produces a code of (s + t)-biners. The codewords are produced as follows. Given two biners x = ab and y from two coding sets C, D of s- and t-biners, respectively, where a, b are the corresponding halves of x, new codewords of length s + t are given by x ⊘ y = a′y′b′, where the prime indicates a cyclic permutation of the word obtained by bringing one or a few of the last symbols of each word to the front. The tensor product of two sets C ⊘ D is the set of all such products between pairs of words x from C and y from D, where the cyclic permutation is performed once again on the previously used word so that no word from C or D is used more than once in the concatenation a′y′b′. The product code C ⊘ D therefore contains at least |C||D| codewords. In one application below, the construction is used without applying cyclic permutations to the factor codewords, with comparable, if lesser, results.
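A minimal sketch of the product x ⊘ y = a′y′b′ follows. Several details are assumptions made only for illustration: the halves of x are split at len(x) // 2, every factor is rotated by the same number k of symbols, and the repeated re-rotation of previously used words is not modeled. Setting k = 0 gives the variant without cyclic permutations. The function names are hypothetical.

```python
from itertools import product


def rotate(word: str, k: int = 1) -> str:
    """Cyclic permutation: bring the last k symbols of word to the front."""
    if not word:
        return word
    k %= len(word)
    return word[-k:] + word[:-k] if k else word


def tensor_word(x: str, y: str, k: int = 1) -> str:
    """Return a'y'b' for x = ab, with each factor rotated by k symbols."""
    half = len(x) // 2          # assumed split point for the halves of x
    a, b = x[:half], x[half:]
    return rotate(a, k) + rotate(y, k) + rotate(b, k)


def tensor_code(C, D, k: int = 1):
    """All products x (tensor) y for x in C and y in D, without duplicates."""
    return sorted({tensor_word(x, y, k) for x, y in product(C, D)})


C = ["10000", "10100", "11000"]      # 5-biners
D = ["100010", "100100", "110000"]   # 6-biners
print(tensor_word("10000", "100010"))       # 01010001000
print(tensor_word("10000", "100010", k=0))  # 10100010000
print(len(tensor_code(C, D)))               # 9
```

With a fixed rotation and words of equal length within each code, distinct pairs (x, y) give distinct products, so this simplified version yields exactly |C||D| words of length s + t.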

(a)
Size/τ   Best codes
5/1      10000, 10100, 11000
6/2      100010, 100100, 110000
7/2      1000010, 1001000, 1100000
8/2      11000100, 11010000, 11100000

(b)
BNA                    DNA base
000, 010 / 111, 101    a / t
011, 100 / 001, 110    c / g

Table 1. (a) Best seed codes, and (b) mapping to convert BNA to a DNA strand.
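The mapping in Table 1(b) can be sketched as a lookup table. Two points are assumptions rather than statements from the text: each row is read so that the triplets before the slash map to the first base and those after it to the second, and a biner is converted by reading it in consecutive, non-overlapping 3-bit groups. The function names are hypothetical.

```python
# Table 1(b), as read under the assumptions stated above.
TRIPLET_TO_BASE = {
    "000": "a", "010": "a",
    "111": "t", "101": "t",
    "011": "c", "100": "c",
    "001": "g", "110": "g",
}

WATSON_CRICK = {"a": "t", "t": "a", "c": "g", "g": "c"}


def bna_to_dna(biner: str) -> str:
    """Convert a biner to DNA, one base per non-overlapping bit triplet."""
    if len(biner) % 3:
        raise ValueError("biner length must be a multiple of 3")
    return "".join(TRIPLET_TO_BASE[biner[i:i + 3]]
                   for i in range(0, len(biner), 3))


def revcomp_bits(biner: str) -> str:
    """Reverse a biner and flip every bit."""
    return "".join("1" if bit == "0" else "0" for bit in reversed(biner))


def revcomp_dna(strand: str) -> str:
    """Watson-Crick reverse complement of a DNA strand."""
    return "".join(WATSON_CRICK[base] for base in reversed(strand))


print(bna_to_dna("100010"))                # ca
print(bna_to_dna(revcomp_bits("100010")))  # tg
print(revcomp_dna("ca"))                   # tg
```

Under this reading, reverse-complementing a biner corresponds to taking the Watson-Crick reverse complement of its DNA strand, which is a useful sanity check on the table as reconstructed.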
