12.07.2015 Views

Protein Engineering Protocols - Mycobacteriology research center

16 Kono et al.

such a bias toward specific amino acids. Pseudo-independent nucleotide probabilities at each position of a set of partially random genes can be determined computationally, such that the protein library encoded by the gene library best reproduces the desired amino acid profile. The calculated gene library can then be used in context of standard DNA synthesis.

Let $P_1(n_1)$, $P_2(n_2)$, $P_3(n_3)$ be the probabilities of each of the four possible nucleotides ($n_i$ = A, T, G, C) in the first, second, and third position of a codon, respectively. If these are treated as independent, the probability that amino acid $\alpha$ will appear as encoded by the codon $n_1 n_2 n_3$ is $P(\alpha|n_1,n_2,n_3) = P_1(n_1)\,P_2(n_2)\,P_3(n_3)\,\delta(\alpha|n_1,n_2,n_3)$, where $\delta(\alpha|n_1,n_2,n_3) = 1$ only if $n_1 n_2 n_3$ is a codon for amino acid $\alpha$, and is zero otherwise. If the codons of amino acid $\alpha$ are equally likely (no codon bias), the probability of an amino acid $\alpha$ is the sum of codon probabilities corresponding to this amino acid:

$$P_{\mathrm{calc}}(\alpha) = \sum_{n_1,n_2,n_3} P_1(n_1)\,P_2(n_2)\,P_3(n_3)\,\delta(\alpha|n_1,n_2,n_3)$$

Objective functions quantify the difference between a desired amino acid probability distribution and the amino acid probability distribution encoded from a given set of nucleotide probabilities (67,68). To find the nucleotide probabilities that not only best reproduce the desired amino acid frequencies but also prevent the occurrence of stop codons, a new objective function has been presented (69). The objective function comprises both a $\chi^2$ function, which quantifies the absolute difference between the desired and calculated amino acid probabilities, and a relative entropy term.
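The sum defining the calculated amino acid probabilities can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the standard genetic code, counts the stop signal (`*`) as its own category, and uses hypothetical names (`CODON_TABLE`, `amino_acid_probabilities`) chosen only for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position
# (third position varies fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_probabilities(p1, p2, p3):
    """Sum P1(n1) * P2(n2) * P3(n3) over all codons of each amino acid.

    p1, p2, p3 map each nucleotide to its probability at codon
    positions 1, 2, and 3, treated as independent.
    """
    p_calc = {aa: 0.0 for aa in set(AMINO)}
    for codon, aa in CODON_TABLE.items():
        n1, n2, n3 = codon
        # The dictionary lookup plays the role of delta(alpha | n1, n2, n3).
        p_calc[aa] += p1[n1] * p2[n2] * p3[n3]
    return p_calc


# Example: fully random codon (every nucleotide equally likely at every position).
uniform = {b: 0.25 for b in BASES}
p_uniform = amino_acid_probabilities(uniform, uniform, uniform)

# Example: third position restricted to G or T with equal probability.
g_or_t = {"T": 0.5, "C": 0.0, "A": 0.0, "G": 0.5}
p_restricted = amino_acid_probabilities(uniform, uniform, g_or_t)
```

With uniform nucleotide probabilities, each amino acid's probability is simply its number of codons divided by 64, so leucine (six codons) is encoded six times as often as tryptophan (one codon). Restricting the third position changes the profile and lowers the stop codon probability, which is the kind of effect the nucleotide design exploits.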
Such relative entropies are commonly used to quantify the "distance" between two probability distributions, and are strong indicators of cases in which information in one distribution is not contained in the other (50):

$$H = \sum_{\alpha=1}^{21} \left\{ P_{\mathrm{calc}}(\alpha)\,\ln\frac{P_{\mathrm{calc}}(\alpha)+\varepsilon}{P_{\mathrm{des}}(\alpha)+\varepsilon} + 0.5\,\bigl[P_{\mathrm{des}}(\alpha) - P_{\mathrm{calc}}(\alpha)\bigr]^2 \right\}$$

Here, $\varepsilon$ is introduced as an arbitrary small constant ($\varepsilon = 10^{-6}$), to avoid numerical instability if $P_{\mathrm{des}}(\alpha)$ vanishes. Stop codons are treated as an "effective amino acid." The objective function is optimized (minimized), subject to the usual constraints on the nucleotide probabilities: $0 \le P_i(n_i) \le 1$, and $\sum_{n_i} P_i(n_i) = 1$. This may be done using a Lagrange multiplier method or computational packages available for constrained minimization (69). Codons optimized for a particular organism or expression system may also be included in an objective function of this type (69).

Illustrated in Fig. 3 is a nucleotide design for a particular amino acid position in a protein, here site 54 of the SH3 domain. Shown are the desired frequencies of the amino acids (open bar in the upper panel of Fig. 3) and the amino acid frequencies as encoded (filled bar in the upper panel of Fig. 3) by
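As a rough illustration of the objective function and constraints described above, the following Python sketch evaluates H for two candidate nucleotide designs. It assumes the objective is the sum, over the 20 amino acids plus the stop signal, of a relative entropy term P_calc ln[(P_calc + ε)/(P_des + ε)] and a squared-difference term 0.5 [P_des − P_calc]². That form is a reading of the formula in the text, and all function names are hypothetical. The sketch only evaluates the objective; the minimization itself (Lagrange multipliers or a constrained optimizer) is not shown.

```python
import math
from itertools import product

# Standard genetic code; '*' (stop) is counted as a 21st "effective amino acid".
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def p_calc(p1, p2, p3):
    """Amino acid probabilities encoded by independent per-position nucleotide probabilities."""
    out = dict.fromkeys(AMINO, 0.0)
    for codon, aa in zip(CODONS, AMINO):
        out[aa] += p1[codon[0]] * p2[codon[1]] * p3[codon[2]]
    return out


def is_feasible(p, tol=1e-9):
    """Check the constraints 0 <= P_i(n) <= 1 and sum over n of P_i(n) = 1."""
    in_range = all(0.0 <= v <= 1.0 for v in p.values())
    return in_range and abs(sum(p.values()) - 1.0) < tol


def objective(calc, des, eps=1e-6):
    """Relative entropy term plus squared-difference term, summed over all 21 categories."""
    h = 0.0
    for aa in calc:
        pc, pd = calc[aa], des.get(aa, 0.0)
        h += pc * math.log((pc + eps) / (pd + eps)) + 0.5 * (pd - pc) ** 2
    return h


uniform = dict.fromkeys(BASES, 0.25)
g_or_t = {"T": 0.5, "C": 0.0, "A": 0.0, "G": 0.5}

# Illustrative target profile: the one encoded when position 3 is G or T only.
target = p_calc(uniform, uniform, g_or_t)

h_matching = objective(p_calc(uniform, uniform, g_or_t), target)   # design reproduces target
h_uniform = objective(p_calc(uniform, uniform, uniform), target)   # fully random codon
```

A design that reproduces the target profile exactly gives an objective of zero, while the fully random codon scores higher because it over-represents stop codons and isoleucine and under-represents tryptophan and methionine relative to this target. A constrained optimizer would search over the twelve nucleotide probabilities, keeping each position feasible in the sense of `is_feasible`, to drive the objective as low as possible.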
