Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration.
bioinformatics
deep learning
genetic variability
genomics
Journal
Life (Basel, Switzerland)
ISSN: 2075-1729
Abbreviated title: Life (Basel)
Country: Switzerland
NLM ID: 101580444
Publication information
Publication date:
21 Dec 2021
History:
received: 27 Nov 2021
revised: 14 Dec 2021
accepted: 17 Dec 2021
entrez: 21 Jan 2022
pubmed: 22 Jan 2022
medline: 22 Jan 2022
Status:
epublish
Abstract
Identifying the cell of origin of cancer is important to guide treatment decisions. Machine learning approaches have been proposed to classify the cell of origin based on somatic mutation profiles from solid biopsies. However, solid biopsies can cause complications, and certain tumors are not accessible. Liquid biopsies are a promising alternative, but their somatic mutation profiles are sparse, and current machine learning models perform poorly in this setting. We propose an improved method to deal with sparsity in liquid biopsy data. First, data augmentation is performed on sparse data to enhance model robustness. Second, we employ data integration to merge information from (i) SNV density, (ii) SNVs in driver genes, and (iii) trinucleotide motifs. Our adapted method achieves average accuracies of 0.88 and 0.65 on data where only 70% and 2% of SNVs are retained, respectively, compared with 0.83 and 0.41 for the original model. The method and results presented here open the way for application of machine learning in the detection of the cell of origin of cancer from liquid biopsy data.
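The abstract does not describe the implementation, so the following is only a minimal sketch of the two ideas it names: augmentation by randomly retaining a fraction of SNVs, and integration by concatenating three feature blocks. The record layout (`pos`, `gene`, `motif`), the single-chromosome binning, the function names, and the toy values are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter

def subsample_snvs(snvs, keep_fraction, rng):
    """Randomly retain roughly keep_fraction of the SNVs (at least one)."""
    n_keep = max(1, round(len(snvs) * keep_fraction))
    return rng.sample(snvs, n_keep)

def featurize(snvs, n_bins, bin_size, driver_genes, motifs):
    """Concatenate three blocks: binned SNV density, driver-gene hits, motif counts."""
    # (i) SNV density: count SNVs per fixed-size genomic bin (one chromosome assumed).
    density = [0] * n_bins
    for snv in snvs:
        density[min(snv["pos"] // bin_size, n_bins - 1)] += 1
    # (ii) Driver genes: 1 if the gene carries at least one SNV, else 0.
    hit_genes = {snv["gene"] for snv in snvs}
    driver = [1 if gene in hit_genes else 0 for gene in driver_genes]
    # (iii) Trinucleotide motifs: count of SNVs per motif label.
    motif_counts = Counter(snv["motif"] for snv in snvs)
    motif = [motif_counts[m] for m in motifs]
    return density + driver + motif

def augment(snvs, fractions, copies, seed, **feature_args):
    """Create `copies` sparse variants of one sample for each retention fraction."""
    rng = random.Random(seed)
    return [
        featurize(subsample_snvs(snvs, fraction, rng), **feature_args)
        for fraction in fractions
        for _ in range(copies)
    ]

# Hypothetical toy sample with 10 SNVs; values are illustrative only.
GENES = ["TP53", "KRAS", "BRAF", "GENE_X", "GENE_Y"]
MOTIFS = ["A[C>T]G", "T[C>A]A", "G[T>C]C"]
toy_snvs = [
    {"pos": 100 * i + 20, "gene": GENES[i % 5], "motif": MOTIFS[i % 3]}
    for i in range(10)
]

augmented = augment(
    toy_snvs,
    fractions=(0.7, 0.02),   # retention levels mentioned in the abstract
    copies=3,
    seed=0,
    n_bins=4,
    bin_size=250,
    driver_genes=["TP53", "KRAS", "BRAF"],
    motifs=MOTIFS,
)
```

Each resulting vector could then be fed to a classifier alongside the unmodified sample; which model and which retention levels are used in training is left open here.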
Identifiers
pubmed: 35054395
pii: life12010001
doi: 10.3390/life12010001
pmc: PMC8780455
Publication types
Journal Article
Languages
eng
Grants
Agency: Dutch Research Council
ID: 639.072.715
Country: Netherlands
Agency: Oncode Institute