EXPANDING THE UNIVERSE OF ACTIONABLE NEOANTIGENS THROUGH DEEP LEARNING AND HUMAN LEUKOCYTE ANTIGEN PEPTIDOMICS
The success of cancer immunotherapy relies on the discovery of neoantigens that can be used to activate the patient’s immune response to specifically target their tumor cells. Currently, the most direct method for identifying antigens presented on the tumor cell surface involves mass spectrometry (MS)-based characterization of human leukocyte antigen (HLA)-bound peptides. However, standard analysis of MS data is not effective at discovering unexpected amino acid sequences, including those that would be produced by translation of canonical non-coding RNA and proteasome-mediated peptide splicing. In light of recent findings that a significant number of targetable tumor-specific antigens belong to this pool of poorly characterized antigens, a major improvement in MS data analysis is needed to address this limitation.
We developed SMSNet, a deep artificial neural network model, based on >25 million MS data points, that is able to identify amino acid sequences from MS data without any restriction. The performance of SMSNet was evaluated against state-of-the-art methods on diverse HLA peptidomes derived from monoallelic cell lines or patient samples. Origins of newly identified antigens were determined by searching against databases of known human proteins, noncoding RNAs, and in silico generated spliced peptides. Binding affinities between new antigens and their respective HLA molecules were evaluated using in silico predictions.
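The origin search described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequences, the 25-residue limit on the distance between spliced fragments, and the restriction to forward cis-splicing within a single protein are all assumptions made for illustration. Isoleucine is mapped to leucine before matching because the two residues have identical mass and cannot be told apart by MS alone.

```python
def normalize(seq: str) -> str:
    """Map I to L, since the two residues are isobaric in MS data."""
    return seq.upper().replace("I", "L")


def is_cis_spliced(peptide: str, protein: str, max_gap: int = 25) -> bool:
    """Return True if the peptide can be written as two fragments of the
    same protein, in forward order, separated by 1..max_gap residues.
    max_gap is an illustrative assumption, not a value from the source."""
    for split in range(1, len(peptide)):
        left, right = peptide[:split], peptide[split:]
        start = protein.find(left)
        while start != -1:
            left_end = start + split
            # The right fragment must start after a gap of at least one
            # residue and no more than max_gap residues.
            window_end = left_end + max_gap + len(right)
            if protein.find(right, left_end + 1, window_end) != -1:
                return True
            start = protein.find(left, start + 1)
    return False


def classify_origin(peptide, proteins, ncrna_translations, max_gap=25):
    """Assign a hypothetical origin label to a peptide.

    proteins and ncrna_translations map identifiers to amino acid
    sequences; the noncoding RNA entries are assumed to be translated
    already, which keeps the sketch short.
    """
    pep = normalize(peptide)
    prots = [normalize(s) for s in proteins.values()]
    ncrnas = [normalize(s) for s in ncrna_translations.values()]
    if any(pep in s for s in prots):
        return "known_protein"
    if any(pep in s for s in ncrnas):
        return "noncoding_rna"
    if any(is_cis_spliced(pep, s, max_gap) for s in prots):
        return "cis_spliced"
    return "unassigned"


# Toy sequences, invented purely for this example.
toy_proteins = {"P1": "MKTAYIAKQRQISFVKSHFSRQ"}
toy_ncrna = {"NC1": "GGSPLLWTRAV"}

for candidate in ["AYIAKQRQ", "SPLLWTRA", "KTAYSFVK", "WWWWWWWW"]:
    print(candidate, classify_origin(candidate, toy_proteins, toy_ncrna))
```

Checking contiguous matches before spliced ones reflects a conservative design choice: a peptide is attributed to splicing only when no simpler explanation is found among the supplied sequences.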
SMSNet uncovers >10,000 previously uncharacterized HLA antigens, including >6,000 antigens with new amino acid sequences that have not been studied before according to the Immune Epitope Database (IEDB) and >1,500 antigens that can be traced to proteasome-mediated peptide splicing and translation of non-canonical open reading frames. Newly identified antigens also exhibit high predicted binding affinities with HLA molecules and rank competitively against previously reported antigens. Overall, SMSNet expands the coverage of identified HLA antigens by almost 30% over standard approaches.
SMSNet identifies a large number of new HLA antigens and should be incorporated into future HLA peptidomics-based neoantigen discovery pipelines.