08.01.2013 Views

LNCS 2950 - Aspects of Molecular Computing (Frontmatter Pages)


Methods for Constructing Coded DNA Languages

same probability $1/p$. In this case the entropy essentially counts the average number of words of a given length that occur as subwords of the code words [17]. From the coding theorem, it follows that $\{0, 1\}^*$ can be encoded by $X^*$ with $\Sigma \mapsto \{0, 1\}$ if the entropy of $X^*$ is at least $\log 2$ ([1], see also Theorem 5.2.5 in [6]). The codes for $\theta$-free, strictly $\theta$-free, and $\theta$-$k$-codes designed in this section have entropy larger than $\log 2$ when the alphabet has $p = 4$ symbols. Hence, such DNA codes can be used for encoding bit-strings.

We start with the definition of entropy given in [6].

Definition 31 Let $X$ be a code. The entropy of $X^*$ is defined by

\[ \hbar(X) = \lim_{n \to \infty} \frac{1}{n} \log |\mathrm{Sub}_n(X^*)|. \]
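The limit in this definition can be approximated numerically for small cases. The following is a minimal sketch, assuming a finite code given as a set of Python strings (the codes constructed later in this section are infinite, so they would have to be truncated first). The function names `subwords_of_star` and `entropy_estimate` and the toy codes are illustrative choices, not taken from the paper, and natural logarithms are used.

```python
import math

def subwords_of_star(code, n):
    """Return Sub_n(code*): all length-n factors of concatenations of code words.

    Assumes `code` is a finite set of nonempty strings.
    """
    max_len = max(len(w) for w in code)
    # A length-n factor starts inside one code word and ends inside another,
    # so concatenations of total length <= n + 2 * max_len are sufficient.
    limit = n + 2 * max_len
    factors = set()
    frontier = [""]
    while frontier:
        word = frontier.pop()
        for i in range(len(word) - n + 1):
            factors.add(word[i:i + n])
        for w in code:
            if len(word) + len(w) <= limit:
                frontier.append(word + w)
    return factors

def entropy_estimate(code, n):
    """Finite-n approximation (1/n) * log |Sub_n(code*)| of the entropy."""
    return math.log(len(subwords_of_star(code, n))) / n

if __name__ == "__main__":
    toy = {"a", "ba"}  # hypothetical toy code, not from the paper
    for n in range(1, 9):
        print(n, len(subwords_of_star(toy, n)), entropy_estimate(toy, n))
```

The enumeration is exponential in `n`, so this is only useful as a sanity check on very small codes; the eigenvalue characterization is the practical route for the codes considered here.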

If $G$ is a deterministic automaton or an automaton with a delay that recognizes $X^*$ and $A_G$ is the adjacency matrix of $G$, then by Perron-Frobenius theory $A_G$ has a positive eigenvalue $\bar{\mu}$ of maximal modulus and the entropy of $X^*$ is $\log \bar{\mu}$ (see Chapter 4 of [6]). We will use this fact in the following computations of the entropies of the designed codes. In [13], Proposition 16, the authors designed a set of DNA code words that is strictly $\theta$-free. The following propositions show that in a similar way we can construct codes with additional “good” properties.
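As a small illustration of the eigenvalue computation mentioned above, consider the code $X = \{a, ba\}$. This is a hypothetical example chosen for illustration and is not one of the codes constructed in this paper. Assume $X^*$ is recognized by the deterministic automaton $G$ with two states and edges $1 \xrightarrow{a} 1$, $1 \xrightarrow{b} 2$, $2 \xrightarrow{a} 1$, where state 1 is both initial and terminal. Under these assumptions,

\[ A_G = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \det(\mu I - A_G) = \mu^2 - \mu - 1, \qquad \bar{\mu} = \frac{1 + \sqrt{5}}{2}, \]

so the entropy of $X^*$ would be $\log \frac{1 + \sqrt{5}}{2}$, which lies below the $\log 2$ threshold of the coding theorem.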

In what follows we assume that $\Sigma$ is a finite alphabet with $|\Sigma| \ge 3$ and $\theta : \Sigma \to \Sigma$ is an involution which is not the identity. We denote by $p$ the number of symbols in $\Sigma$. We also use the fact that $X$ is (strictly) $\theta$-free iff $X^*$ is (strictly) $\theta$-free (Proposition 4 in [14]).

Proposition 32 Let $a, b \in \Sigma$ be such that for all $c \in \Sigma \setminus \{a, b\}$, $\theta(c) \notin \{a, b\}$. Let

\[ X = \bigcup_{i=1}^{\infty} a^m (\Sigma \setminus \{a, b\})^i b^m \]

for a fixed integer $m \ge 1$. Then $X$ and $X^*$ are $\theta$-free. The entropy of $X^*$ is such that $\log(p-2) < \hbar(X^*)$.

Proof. Let $x_1, x_2, y \in X$ be such that $x_1 x_2 = s\theta(y)t$ for some $s, t \in \Sigma^+$, where $x_1 = a^m p b^m$, $x_2 = a^m q b^m$ and $y = a^m r b^m$ for $p, q, r \in (\Sigma \setminus \{a, b\})^+$. Since $\theta$ is an involution, if $\theta(a) \ne a, b$, then there is a $c \in \Sigma \setminus \{a, b\}$ such that $\theta(c) = a$, which is excluded by assumption. Hence, either $\theta(a) = a$ or $\theta(a) = b$. When $\theta$ is morphic, $\theta(y) = \theta(a^m)\theta(r)\theta(b^m)$, and when $\theta$ is antimorphic, $\theta(y) = \theta(b^m)\theta(r)\theta(a^m)$. So $\theta(y) = a^m\theta(r)b^m$ or $\theta(y) = b^m\theta(r)a^m$. Since $x_1 x_2 = a^m p b^m a^m q b^m = s b^m \theta(r) a^m t$ or $x_1 x_2 = a^m p b^m a^m q b^m = s a^m \theta(r) b^m t$, the only possibilities for $r$ are $\theta(r) = p$ or $\theta(r) = q$. In the first case $s = 1$ and in the second case $t = 1$, which contradicts the definition of $\theta$-free, since $s, t \in \Sigma^+$. Hence $X$ is $\theta$-free.
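The $\theta$-freeness argument above can be spot-checked by brute force on a finite truncation of $X$. The sketch below makes several assumptions that are not fixed by the text: the alphabet is the DNA alphabet {A, C, G, T}, $\theta$ is the antimorphic Watson-Crick involution (reverse, then complement), $a$ = A and $b$ = T (so that $\theta$ maps {C, G} to itself), and "$\theta$-free" is taken to mean that no $\theta(y)$ with $y \in X$ occurs in a product $x_1 x_2$ with a nonempty word on both sides, as used in the proof. The function names are illustrative.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def theta(word):
    """Antimorphic Watson-Crick involution: reverse the word, then complement it."""
    return "".join(COMPLEMENT[ch] for ch in reversed(word))

def truncated_code(m, max_i, a="A", b="T", alphabet="ACGT"):
    """Words a^m w b^m with w over alphabet minus {a, b} and 1 <= |w| <= max_i."""
    middle = [ch for ch in alphabet if ch not in (a, b)]
    words = []
    for i in range(1, max_i + 1):
        for mid in product(middle, repeat=i):
            words.append(a * m + "".join(mid) + b * m)
    return words

def is_theta_free(words):
    """Check that no theta(y) occurs strictly inside any product x1 x2."""
    images = {theta(y) for y in words}
    for x1, x2 in product(words, repeat=2):
        w = x1 + x2
        for img in images:
            # Earliest occurrence that leaves a nonempty prefix s.
            pos = w.find(img, 1)
            # It is a violation only if a nonempty suffix t remains as well.
            if pos != -1 and pos + len(img) < len(w):
                return False
    return True

if __name__ == "__main__":
    sample = truncated_code(m=2, max_i=3)
    print(len(sample), is_theta_free(sample))
```

A check of this kind only covers the truncated set and a single choice of $\theta$, $a$ and $b$; it illustrates the statement and does not replace the proof.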

Let $A = (V, E, \lambda)$ be the automaton that recognizes $X^*$, where $V = \{1, \ldots, 2m+1\}$ is the set of vertices, $E \subseteq V \times \Sigma \times V$, and $\lambda : E \to \Sigma$ (with $(i, s, j) \mapsto s$) is the labeling function.

An edge $(i, s, j)$ is in $E$ if and only if:

\[ s = \begin{cases} a, & \text{for } 1 \le i \le m,\ j = i+1 \\ b, & \text{for } m+2 \le i \le 2m,\ j = i+1, \text{ and } i = 2m+1,\ j = 1 \\ s, & \text{for } i = m+1, m+2,\ j = m+2,\ s \in \Sigma \setminus \{a, b\} \end{cases} \]
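The edge rule above determines the adjacency matrix of the automaton, so the entropy bound of Proposition 32 can be checked numerically for small parameters. The following is a sketch under stated assumptions: the matrix entry $(i, j)$ counts the edges from $i$ to $j$ (so the two families of edges labeled by $\Sigma \setminus \{a, b\}$ each contribute $p - 2$), the dominant eigenvalue is approximated by plain power iteration, which is a technique chosen here and not taken from the paper, and logarithms are natural. All function names are illustrative.

```python
import math

def adjacency_matrix(m, p):
    """Edge-count matrix of the automaton on states 1..2m+1 (0-based indices)."""
    n = 2 * m + 1
    A = [[0] * n for _ in range(n)]
    for i in range(1, m + 1):            # a-edges: i -> i+1 for 1 <= i <= m
        A[i - 1][i] += 1
    for i in range(m + 2, 2 * m + 1):    # b-edges: i -> i+1 for m+2 <= i <= 2m
        A[i - 1][i] += 1
    A[2 * m][0] += 1                     # b-edge: 2m+1 -> 1
    for i in (m + 1, m + 2):             # one edge to m+2 per symbol other than a, b
        A[i - 1][m + 1] += p - 2
    return A

def spectral_radius(A, iterations=500):
    """Approximate the dominant eigenvalue by power iteration.

    Assumes the matrix is primitive, which holds here because the graph is
    strongly connected and state m+2 carries a self-loop.
    """
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

def entropy(m, p):
    """Entropy estimate log(mu) for the automaton with parameters m and p."""
    return math.log(spectral_radius(adjacency_matrix(m, p)))

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, math.log(4 - 2), entropy(m, 4))
```

For the DNA case $p = 4$, the printed entropy can be compared directly against $\log(p - 2) = \log 2$ for each value of $m$.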
