Computational Tools for the Identification and Interpretation of Sequence Motifs in Immunopeptidomes.
Alvarez, B., Barra, C., Nielsen, M. and Andreatta, M.
Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, Argentina.
Department of Bio and Health Informatics, Technical University of Denmark, Denmark.
Recent advances in proteomics and mass spectrometry have greatly expanded the detectable repertoire of peptides presented by major histocompatibility complex (MHC) molecules on the cell surface, collectively known as the immunopeptidome. Finely characterizing the immunopeptidome yields important basic insights into the mechanisms of antigen presentation, and can also reveal promising targets for vaccine development and cancer immunotherapy. In this report, we describe a number of practical and efficient approaches to analyzing immunopeptidomics data, discussing the identification of meaningful sequence motifs in various scenarios and considering current limitations. We address the issue of filtering false hits and contaminants, and the problem of motif deconvolution in cell lines expressing multiple MHC alleles, for both the MHC class I and class II systems. Finally, we demonstrate how machine learning can be readily employed by non-expert users to generate accurate prediction models directly from mass spectrometry eluted ligand data sets.
Proteomics 18(12): e1700252 (2018)