Phenotype Classification using Proteome Data in a Data-Independent Acquisition Tensor Format.
Journal
Journal of the American Society for Mass Spectrometry
ISSN: 1879-1123
Abbreviated title: J Am Soc Mass Spectrom
Country: United States
NLM ID: 9010412
Publication information
Publication date:
04 Nov 2020
History:
pubmed: 27 Oct 2020
medline: 6 Oct 2021
entrez: 26 Oct 2020
Status:
ppublish
Abstract
A novel approach for phenotype prediction is developed for data-independent acquisition (DIA) mass spectrometric (MS) data without the need for peptide precursor identification using existing DIA software tools. The first step converts the DIA-MS data file into a new file format called DIA tensor (DIAT), which can be used for the convenient visualization of all the ions from peptide precursors and fragments. DIAT files can be fed directly into a deep neural network to predict phenotypes such as appearances of cats, dogs, and microscopic images. As a proof of principle, we applied this approach to 102 hepatocellular carcinoma samples and achieved an accuracy of 96.8% in distinguishing malignant from benign samples. We further applied a refined model to classify thyroid nodules. Deep learning based on 492 training samples achieved an accuracy of 91.7% in an independent cohort of 216 test samples. This approach surpassed the deep-learning model based on peptide and protein matrices generated by OpenSWATH. In summary, we present a new strategy for DIA data analysis based on a novel data format called DIAT, which enables facile two-dimensional visualization of DIA proteomics data. DIAT files can be directly used for deep learning for biological and clinical phenotype classification. Future research will interpret the deep-learning models that emerged from DIAT analysis.
Identifiers
pubmed: 33104352
doi: 10.1021/jasms.0c00254
Chemical substances
Peptides
0
Proteome
0
Publication types
Journal Article
Languages
eng
Citation subsets
IM