Bioinformatics Advance Access originally published online on April 14, 2008
Bioinformatics 2008 24(11):1397-1398; doi:10.1093/bioinformatics/btn128

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Accurate approximation method for prediction of class I MHC affinities for peptides of length 8, 10 and 11 using prediction tools trained on 9mers

Claus Lundegaard *, Ole Lund and Morten Nielsen

Center for Biological Sequence Analysis – CBS, Department of Systems Biology, The Technical University of Denmark – DTU, Kemitorvet Build. 208, 2800 Lyngby, Denmark

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 RESULTS
 REFERENCES
 

Summary: Several accurate prediction systems have been developed for class I major histocompatibility complex (MHC):peptide binding. Most of these are trained on binding affinity data for primarily 9mer peptides. Here, we show how prediction methods trained on 9mer data can be used for accurate binding affinity prediction of peptides of length 8, 10 and 11. The method makes it possible to predict binding of peptides of lengths other than nine for MHC alleles where no such peptides have been measured. As validation, the performance of this approach is compared to that of predictors trained on peptides of the length in question. In this validation, the approximation method has an accuracy comparable to or better than that of methods trained on peptides of the same length as the predicted peptides.

Availability: The algorithm has been implemented in the web-accessible servers NetMHC-3.0: http://www.cbs.dtu.dk/services/NetMHC-3.0, and NetMHCpan-1.1: http://www.cbs.dtu.dk/services/NetMHCpan-1.1

Contact: lunde{at}cbs.dtu.dk

Supplementary information: Supplementary data are available at Bioinformatics online.


    1 INTRODUCTION
Determining peptide binding to MHC class I is an important step in cytotoxic T lymphocyte (CTL) epitope discovery, and prediction methods for class I MHC peptide binding have become increasingly accurate (Lundegaard et al., 2007; Moutaftsi et al., 2006; Peters et al., 2006), reducing the effort required significantly. Most MHCs, however, prefer peptides of length 9, so binding data for 9mer peptides are far more abundant than data for other lengths such as 8, 10 and 11mers, which bind to the MHCs more rarely. Since the amount of available data is crucial for developing accurate predictors (Yu et al., 2002), the number of accurate predictors for these other lengths is limited. Ligand motifs have been elucidated for several MHCs and groups of MHCs (Lund et al., 2004; Rammensee et al., 1999; Sette and Sidney, 1998). According to these motifs, the most important (anchor) residues are in general at positions 2, 3 and the C-terminus, regardless of peptide length. However, for a limited number of non-human MHCs other peptide positions might be the primary anchors (see Supplementary Material). Using this knowledge, we exploited the possibility of generating pseudo 9mers from peptides of other lengths by fixing these positions and inserting or deleting residues at other positions. This resulted in a simple yet remarkably accurate way to overcome the length problem: the affinity of a peptide is predicted by applying 9mer predictors to such pseudo 9mers. This method will in principle work with any existing MHC 9mer binding prediction method.


    2 METHODS
2.1 9mer predictions
Here, we used predictions generated by NetMHC-3.0 (http://www.cbs.dtu.dk/services/NetMHC-3.0) (Buus et al., 2003; Nielsen et al., 2003). However, any 9mer:MHC binding prediction algorithm that accepts unknown amino acids (i.e. X) can be used.

2.2 Prediction of 8mer affinities
In an 8mer peptide (e.g. EIGHTMER), an X is inserted in turn at position 4, 5, 6, 7 or 8, resulting in five new pseudo-peptides of length 9: EIGXHTMER, EIGHXTMER, EIGHTXMER, EIGHTMXER and EIGHTMEXR (Fig. 1A). The final predicted affinity is calculated as the geometric mean of the five predicted affinities in nanomolar (nM) units.
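The 8mer conversion can be sketched in a few lines of Python. This is an illustrative sketch, not the NetMHC-3.0 implementation: the function names are hypothetical, and `predict_9mer` stands for any 9mer predictor that accepts X and returns an affinity in nM.

```python
from math import exp, log
from typing import Callable, List, Sequence


def pseudo_9mers_from_8mer(peptide: str) -> List[str]:
    """Insert X so that it becomes residue 4, 5, 6, 7 or 8 of the 9mer."""
    if len(peptide) != 8:
        raise ValueError("expected a peptide of length 8")
    # pos is the 1-based position the inserted X takes in the pseudo 9mer.
    return [peptide[:pos - 1] + "X" + peptide[pos - 1:] for pos in range(4, 9)]


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of positive values, computed in log space."""
    return exp(sum(log(v) for v in values) / len(values))


def predict_8mer_affinity(peptide: str,
                          predict_9mer: Callable[[str], float]) -> float:
    """Combine the five pseudo-9mer predictions (nM) into one 8mer value.

    predict_9mer is a hypothetical stand-in for an X-tolerant 9mer predictor.
    """
    return geometric_mean([predict_9mer(p) for p in pseudo_9mers_from_8mer(peptide)])


print(pseudo_9mers_from_8mer("EIGHTMER"))
```

The anchor positions 1 to 3 and the C-terminal residue are never shifted relative to their ends of the peptide, which is why the insertion is restricted to positions 4 to 8.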


Fig. 1. (A) Illustration of one 8mer to five 9mer conversion. (B) Illustration of one 10mer to six 9mer conversion. (C) Calculation of six pseudo 9mer predictions back to a 10mer prediction.

 
2.3 Prediction of 10 and 11mer affinities
Longer peptides (e.g. TENELEVENS) are converted into 9mers by deleting 1 (10mers) or 2 (11mers) residues at positions 4, 5, 6, 7, 8 or 9, resulting in six new pseudo-peptides: TENLEVENS, TENEEVENS, TENELVENS, TENELEENS, TENELEVNS and TENELEVES (Fig. 1B). The final predicted affinity is calculated as the geometric mean of the six predicted affinities in nanomolar (nM) units (Fig. 1C).
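A corresponding sketch for the deletion step is given below. The function names and the `predict_9mer` callable are hypothetical. For 10mers the code follows the description directly. For 11mers the text does not say which pairs of residues are removed, so the sketch assumes the two deleted residues are adjacent and start at positions 4 to 9; that is an assumption, not the authors' stated procedure.

```python
from statistics import geometric_mean
from typing import Callable, List


def pseudo_9mers_by_deletion(peptide: str) -> List[str]:
    """Shorten a 10mer or 11mer to six pseudo 9mers by deleting residues."""
    n_delete = len(peptide) - 9
    if n_delete not in (1, 2):
        raise ValueError("expected a peptide of length 10 or 11")
    # start is the 1-based position of the first deleted residue (4..9).
    # ASSUMPTION: for 11mers the two deleted residues are adjacent.
    return [peptide[:start - 1] + peptide[start - 1 + n_delete:]
            for start in range(4, 10)]


def predict_long_affinity(peptide: str,
                          predict_9mer: Callable[[str], float]) -> float:
    """Geometric mean (nM) of the six pseudo-9mer predictions.

    predict_9mer is a hypothetical stand-in for any 9mer predictor.
    """
    return geometric_mean([predict_9mer(p) for p in pseudo_9mers_by_deletion(peptide)])


print(pseudo_9mers_by_deletion("TENELEVENS"))
```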

2.4 Evaluation
The method was evaluated using peptide IC50 and Kd affinity data extracted from the web site of the Immune Epitope Database and Analysis Resource (IEDB) (Sette et al., 2005). This resulted in the 8mer, 10mer and 11mer evaluation data available at http://www.cbs.dtu.dk/services/NetMHC-3.0/evalset_8mers.xls, http://www.cbs.dtu.dk/services/NetMHC-3.0/evalset_10mers_all.xls and http://www.cbs.dtu.dk/services/NetMHC-3.0/evalset_11mers.xls, respectively. 8mer data: 1975 measurements distributed on 35 MHC alleles. 10mer data: 13 507 measurements distributed on 31 MHC alleles. 11mer data: 181 measurements distributed on 25 MHC alleles. We evaluated the accuracy by Pearson correlation coefficients (PCC) and by the area under the receiver operating characteristic (ROC) curve (AUC), using a binding cutoff of 500 nM.
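The two evaluation measures can be illustrated with a small standard-library sketch. The function names are hypothetical, and the text does not state whether the PCC was computed on raw or transformed affinities, so `pearson` simply correlates whatever values it is given. The AUC is computed in its rank form: the probability that a randomly chosen binder (measured affinity below 500 nM) receives a lower predicted nM value than a randomly chosen non-binder, with ties counted as one half.

```python
from math import sqrt
from typing import Sequence

BINDING_CUTOFF_NM = 500.0


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def auc(measured_nm: Sequence[float], predicted_nm: Sequence[float],
        cutoff: float = BINDING_CUTOFF_NM) -> float:
    """Area under the ROC curve with binders defined as measured < cutoff."""
    binders = [p for m, p in zip(measured_nm, predicted_nm) if m < cutoff]
    non_binders = [p for m, p in zip(measured_nm, predicted_nm) if m >= cutoff]
    if not binders or not non_binders:
        raise ValueError("AUC needs at least one binder and one non-binder")
    # A lower predicted nM value means stronger predicted binding.
    wins = sum(1.0 if b < n else 0.5 if b == n else 0.0
               for b in binders for n in non_binders)
    return wins / (len(binders) * len(non_binders))
```

The pairwise loop is quadratic in the number of peptides; for large datasets a sort-based rank computation gives the same value more efficiently.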


    3 RESULTS
We predicted affinities for 8mer, 10mer and 11mer data using the approximation method described in Methods. The overall PCCs for all predictions within each dataset were 0.69, 0.73 and 0.74, respectively. The overall AUCs were 0.86, 0.87 and 0.89, respectively. To calculate an AUC value, we needed both negative (IC50 or Kd ≥ 500 nM) and positive (IC50 or Kd < 500 nM) affinity data, so we removed alleles having only binders or only non-binders from the evaluation. This led to 27 compared alleles for 8mer peptides. The resulting PCC values had a mean of 0.72. Acceptable AUC values (above 0.7) were obtained for 25 of the 27 covered alleles (Table 1). To evaluate the 10mer approximation, we calculated PCC and AUC values for 27 alleles. Using a 500 nM threshold, 26 of 27 alleles had AUCs above 0.7 (Table 1).
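The allele filtering step described above can be sketched as follows. The record layout (pairs of allele name and measured affinity in nM) and the function name are assumptions made for illustration, and the example values are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def alleles_with_both_classes(records: Iterable[Tuple[str, float]],
                              cutoff_nm: float = 500.0) -> List[str]:
    """Return alleles that have at least one binder and one non-binder."""
    by_allele: Dict[str, List[float]] = defaultdict(list)
    for allele, measured_nm in records:
        by_allele[allele].append(measured_nm)
    return sorted(
        allele for allele, values in by_allele.items()
        if any(v < cutoff_nm for v in values)      # positive: < 500 nM
        and any(v >= cutoff_nm for v in values)    # negative: >= 500 nM
    )


# Invented example records: (allele label, measured affinity in nM).
example = [("allele_A", 12.0), ("allele_A", 8000.0),
           ("allele_B", 30.0), ("allele_B", 45.0),
           ("allele_C", 500.0), ("allele_C", 20000.0)]
print(alleles_with_both_classes(example))
```

Only `allele_A` is retained here: `allele_B` has binders only, and `allele_C` has non-binders only because a value of exactly 500 nM counts as negative.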


Table 1. PCC and AUC values of predicted 8mer and 10mer affinities using the approximation method

 
To compare the approximation method with specifically trained methods, we used artificial neural networks (ANNs) previously trained on 10mer data as described by Nielsen et al. (2003). Since the training of the 10mer-specific ANNs, 2037 new 10mer data points covering 16 alleles had become available; these are provided at http://www.cbs.dtu.dk/services/NetMHC-3.0/evalset_10mers.xls. AUC values were calculated for each allele using either the ANNs trained on 10mers or the approximation method described here (Supplementary Fig. 1). For 12 of the 16 alleles, the approximation method performed better than the 10mer-trained ANNs (P < 0.01).

For the currently small number of alleles for which the primary peptide anchor position(s) are in positions 4–8, the approximation method will not work well. Examples of such alleles, and how to identify them, are described in the Supplementary Material.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Burkhard Rost

Received on February 8, 2008; revised on April 4, 2008; accepted on April 4, 2008

    REFERENCES

    Buus S, et al. Sensitive quantitative predictions of peptide-MHC binding by a 'Query by Committee' artificial neural network approach. Tissue Antigens (2003) 62:378–384.

    Lund O, et al. Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics (2004) 55:797–810.

    Lundegaard C, et al. Modeling the adaptive immune system: predictions and simulations. Bioinformatics (2007) 23:3265–3275.

    Moutaftsi M, et al. A consensus epitope prediction approach identifies the breadth of murine T(CD8+)-cell responses to vaccinia virus. Nat. Biotechnol. (2006) 24:817–819.

    Nielsen M, et al. Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci. (2003) 12:1007–1017.

    Peters B, et al. A community resource benchmarking predictions of peptide binding to MHC-I molecules. PLoS Comput. Biol. (2006) 2:e65.

    Rammensee H, et al. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics (1999) 50:213–219.

    Sette A, et al. A roadmap for the immunomics of category A-C pathogens. Immunity (2005) 22:155–161.

    Sette A, Sidney J. HLA supertypes and supermotifs: a functional perspective on HLA polymorphism. Curr. Opin. Immunol. (1998) 10:478–482.

    Yu K, et al. Methods for prediction of peptide binding to MHC molecules: a comparative study. Mol. Med. (2002) 8:137–148.

