
LNCS 2950 - Aspects of Molecular Computing (Frontmatter Pages)


Methods for Constructing Coded DNA Languages

Nataša Jonoska and Kalpana Mahalingam

Department of Mathematics
University of South Florida, Tampa, FL 33620, USA
jonoska@math.usf.edu, mahaling@helios.acomp.usf.edu

Abstract. The set of all sequences that are generated by a biomolecular protocol forms a language over the four-letter alphabet ∆ = {A, G, C, T}. This alphabet is associated with a natural involution mapping θ, A ↦ T and G ↦ C, which is an antimorphism of ∆*. In order to avoid undesirable Watson-Crick bonds between the words (undesirable hybridization), the language has to satisfy certain coding properties. In this paper we build upon an earlier initiated study and give general methods for obtaining sets of code words with the same properties. We show that some of these code words have enough entropy to encode {0, 1}* in a symbol-to-symbol mapping.
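As an illustration of the mapping θ described above, the following minimal Python sketch extends the letter map A ↦ T, G ↦ C (and, since θ is an involution, T ↦ A, C ↦ G) to words by complementing each letter and reversing the order. The names `COMPLEMENT` and `theta` are chosen here for illustration and do not come from the paper.

```python
# Letter-level Watson-Crick complement: A <-> T, G <-> C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def theta(word: str) -> str:
    """Complement every letter and reverse the word (antimorphic extension)."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


# Hypothetical example words, used only to check the two defining properties.
u, v = "AGC", "TTG"

# Antimorphism: theta(uv) = theta(v) theta(u).
assert theta(u + v) == theta(v) + theta(u)

# Involution: applying theta twice gives back the original word.
assert theta(theta(u)) == u
```

Reversal is what makes θ an antimorphism rather than a morphism: the image of a concatenation is the concatenation of the images in the opposite order, matching the antiparallel orientation of the two strands in a Watson-Crick duplex.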

1 Introduction

In bio-molecular computing, and in particular in DNA-based computations and DNA nanotechnology, one of the main problems is the design of oligonucleotides such that mismatched pairing due to Watson-Crick complementarity is minimized. In laboratory experiments, non-specific hybridizations pose potential problems for the results of the experiment. Many authors have addressed this problem and proposed various solutions. A common approach has been to use the Hamming distance as a measure for uniqueness [3,8,9,11,19].
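To make the Hamming distance criterion concrete, here is a minimal Python sketch that computes the distance between two equal-length words and the smallest distance over all pairs in a candidate set. The function names and the sample words are assumptions made for illustration; the cited works use their own, more elaborate, design procedures.

```python
from itertools import combinations


def hamming(u: str, v: str) -> int:
    """Number of positions at which two equal-length words differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs words of equal length")
    return sum(a != b for a, b in zip(u, v))


def min_pairwise_distance(words: list[str]) -> int:
    """Smallest Hamming distance over all unordered pairs of distinct entries."""
    return min(hamming(u, v) for u, v in combinations(words, 2))


# Hypothetical candidate code words, not taken from the paper.
candidates = ["AACC", "ACAG", "GGTT"]

# A set would be accepted only if this value meets a chosen threshold.
d_min = min_pairwise_distance(candidates)
```

A design procedure based on this measure would typically also compare each word with the Watson-Crick complements of the other words, since those are the pairings that hybridize.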

Deaton et al. [8,11] used genetic algorithms to generate a set of DNA sequences that satisfy a predetermined Hamming distance. Marathe et al. [20] also used the Hamming distance to analyze combinatorial properties of DNA sequences, and they used dynamic programming for the design of the strands used in [19]. Seeman’s program [23] generates sequences by testing overlapping subsequences to enforce uniqueness. This program is designed for producing sequences that are suitable for complex three-dimensional DNA structures, and the generation of suitable sequences is not as automatic as in the other proposed programs. Feldkamp et al. [10] also use the test for uniqueness of subsequences and rely on tree structures in generating new sequences. Ruben et al. [22] use a random generator for initial sequence design, and afterwards check for unique subsequences with predetermined properties based on the Hamming distance. One of the first theoretical observations about the number of DNA code words satisfying minimal Hamming distance properties was made by Baum [3]. Experimental separation of strands with “good” codes that avoid intermolecular cross hybridization was

N. Jonoska et al. (Eds.): Molecular Computing (Head Festschrift), LNCS 2950, pp. 241–253, 2004.

© Springer-Verlag Berlin Heidelberg 2004
