New Approaches to in silico Design of Epitope-Based Vaccines
New Approaches to in silico Design of Epitope-Based Vaccines
New Approaches to in silico Design of Epitope-Based Vaccines
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
prediction <strong>of</strong> T-cell epi<strong>to</strong>pes is a rather challeng<strong>in</strong>g problem. However, the capability <strong>of</strong> a<br />
peptide <strong>to</strong> <strong>in</strong>duce a T cell-mediated immune response, i.e., its immunogenicity, has been<br />
shown <strong>to</strong> correlate well with MHC b<strong>in</strong>d<strong>in</strong>g aff<strong>in</strong>ity [9]: most high-aff<strong>in</strong>ity peptides seem <strong>to</strong><br />
be immunogenic. S<strong>in</strong>ce the processes govern<strong>in</strong>g pMHC b<strong>in</strong>d<strong>in</strong>g are well-unders<strong>to</strong>od, the<br />
epi<strong>to</strong>pe discovery problem is typically reduced <strong>to</strong> the problem <strong>of</strong> MHC b<strong>in</strong>d<strong>in</strong>g prediction.<br />
MHC B<strong>in</strong>d<strong>in</strong>g Prediction<br />
In 2004, S<strong>in</strong>gh-Jasuja et al. presented the Tüb<strong>in</strong>gen approach [10] <strong>to</strong> acquire an experimentally<br />
validated <strong>in</strong>itial list <strong>of</strong> epi<strong>to</strong>pes from tumor-associated antigens. In their work, the<br />
<strong>in</strong>corporation <strong>of</strong> computational methods for prediction <strong>of</strong> pMHC b<strong>in</strong>d<strong>in</strong>g <strong>in</strong> the process is<br />
proposed. S<strong>in</strong>ce such prediction methods help <strong>to</strong> reduce the number <strong>of</strong> experiments <strong>to</strong> be<br />
performed, they have become standard <strong>to</strong>ols <strong>in</strong> immunology. Most popular approaches are<br />
either matrix-based [11, 12] or use mach<strong>in</strong>e learn<strong>in</strong>g techniques [13, 14]. Typical matrixbased<br />
approaches assume that each side cha<strong>in</strong> <strong>of</strong> a peptide contributes <strong>in</strong>dependently <strong>to</strong><br />
the peptide’s b<strong>in</strong>d<strong>in</strong>g aff<strong>in</strong>ity. In contrast, most mach<strong>in</strong>e learn<strong>in</strong>g methods are capable <strong>of</strong><br />
captur<strong>in</strong>g nonl<strong>in</strong>ear correlations. Hence, the result<strong>in</strong>g models are generally more accurate.<br />
Their construction does, however, require considerable amounts <strong>of</strong> data.<br />
A major challenge <strong>in</strong> MHC b<strong>in</strong>d<strong>in</strong>g prediction is the highly polymorphic nature <strong>of</strong><br />
the MHC. As <strong>of</strong> <strong>to</strong>day, 4,946 MHC-I and 1,457 MHC-II alleles are known (IMGT/HLA<br />
database [15, 16], release 3.4.0). Different allelic variants <strong>of</strong> MHC molecules b<strong>in</strong>d and<br />
present different sets <strong>of</strong> peptides. Hence, the MHC ligandome varies from <strong>in</strong>dividual <strong>to</strong><br />
<strong>in</strong>dividual. This has important implications for vacc<strong>in</strong>e design: because T cells recognize<br />
immunogenic peptides only when bound <strong>to</strong> MHC molecules, a peptide capable <strong>of</strong> <strong>in</strong>duc<strong>in</strong>g<br />
a T cell-mediated immune response <strong>in</strong> one <strong>in</strong>dividual might not be capable <strong>of</strong> do<strong>in</strong>g so <strong>in</strong><br />
another. Moreover, with<strong>in</strong> a population, some MHC alleles are more common than others<br />
and the allele distribution differs between populations.<br />
Classical prediction methods develop allele-specific models and require a m<strong>in</strong>imum <strong>of</strong><br />
experimental b<strong>in</strong>d<strong>in</strong>g data for the respective allele. For the majority <strong>of</strong> all known MHC<br />
alleles the amount <strong>of</strong> available data is <strong>in</strong>sufficient. However, <strong>to</strong> address the challenges <strong>of</strong><br />
vacc<strong>in</strong>e design, the b<strong>in</strong>d<strong>in</strong>g specificities the gene products <strong>of</strong> all MHC alleles need <strong>to</strong> be<br />
known. It is thus crucial <strong>to</strong> f<strong>in</strong>d a way <strong>to</strong> overcome the problem caused by the lack <strong>of</strong> data.<br />
We propose two approaches <strong>to</strong> overcome this problem and thus <strong>to</strong> improve MHC-I<br />
b<strong>in</strong>d<strong>in</strong>g prediction. Both approaches employ support vec<strong>to</strong>r mach<strong>in</strong>es (SVMs), a powerful<br />
and very popular mach<strong>in</strong>e learn<strong>in</strong>g technique. Given a sufficiently large set <strong>of</strong> examples,<br />
e.g., MHC b<strong>in</strong>ders and MHC non-b<strong>in</strong>ders, an SVM detects patterns with<strong>in</strong> the examples<br />
and, based on these, learns <strong>to</strong> make <strong>in</strong>telligent decisions regard<strong>in</strong>g unseen data. In order<br />
<strong>to</strong> do so, SVMs require so-called kernel functions or kernels, which measure the similarity<br />
between examples.<br />
Our first approach <strong>to</strong> address the problem <strong>of</strong> scarce data focuses on improv<strong>in</strong>g the predictive<br />
power <strong>of</strong> SVMs for alleles with little experimental b<strong>in</strong>d<strong>in</strong>g data by more skillfully<br />
exploit<strong>in</strong>g the available <strong>in</strong>formation. MHC molecules b<strong>in</strong>d peptides <strong>in</strong> an extended conformation.<br />
With<strong>in</strong> the complex the peptide’s side cha<strong>in</strong>s <strong>in</strong>teract with surround<strong>in</strong>g side<br />
cha<strong>in</strong>s <strong>of</strong> the MHC and also with each other. Each <strong>of</strong> the peptide’s side cha<strong>in</strong>s contributes<br />
3