TripletGO: Integrating Transcript Expression Profiles with Protein Homology Inferences for Gene Function Prediction.
Gene Ontology
Gene function annotation
Protein-level alignment
Transcript expression profile
Triplet network
Journal
Genomics, proteomics & bioinformatics
ISSN: 2210-3244
Abbreviated title: Genomics Proteomics Bioinformatics
Country: China
NLM ID: 101197608
Publication information
Publication date:
October 2022
History:
received: 2021-08-20
revised: 2022-03-02
accepted: 2022-04-16
pubmed: 2022-05-15
medline: 2023-03-14
entrez: 2022-05-14
Status:
ppublish
Abstract
Gene Ontology (GO) has been widely used to annotate functions of genes and gene products. Here, we propose a new method, TripletGO, to deduce GO terms of protein-coding and non-coding genes through the integration of four complementary pipelines built on transcript expression profile, genetic sequence alignment, protein sequence alignment, and naïve probability. TripletGO was tested on a large set of 5754 genes from 8 species (human, mouse, Arabidopsis, rat, fly, budding yeast, fission yeast, and nematode) and 2433 proteins with available expression data from the third Critical Assessment of Protein Function Annotation challenge (CAFA3). Experimental results show that TripletGO achieves function annotation accuracy significantly higher than that of current state-of-the-art approaches. Detailed analyses show that the major advantage of TripletGO lies in the coupling of a new triplet network-based profiling method with the feature space mapping technique, which can accurately recognize function patterns from transcript expression profiles. Meanwhile, the combination of multiple complementary models, especially those from transcript expression and protein-level alignments, improves the coverage and accuracy of the final GO annotation results. The standalone package and an online server of TripletGO are freely available at https://zhanggroup.org/TripletGO/.
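The abstract names two ideas that a small example can make concrete: a triplet network trained on expression profiles, and the combination of scores from several pipelines. The sketch below is illustrative only and is not taken from the TripletGO package. It assumes the standard triplet margin loss and a plain weighted average as the combination rule; the function names, the weights, and the example scores are all hypothetical, and the abstract does not state which loss or combination formula TripletGO actually uses.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the anchor is closer to the positive example
    than to the negative example by at least `margin`.
    (Assumption: generic formulation, not confirmed as TripletGO's loss.)
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def combine_scores(pipeline_scores, weights):
    """Weighted average of per-pipeline confidence scores for each GO term.

    pipeline_scores maps a pipeline name to {GO term: score in [0, 1]}.
    A term that a pipeline did not predict counts as 0 for that pipeline.
    (Assumption: hypothetical combination rule for illustration.)
    """
    total_weight = sum(weights[name] for name in pipeline_scores)
    terms = set()
    for scores in pipeline_scores.values():
        terms.update(scores)
    return {
        term: sum(
            weights[name] * scores.get(term, 0.0)
            for name, scores in pipeline_scores.items()
        ) / total_weight
        for term in sorted(terms)
    }


if __name__ == "__main__":
    # Toy embeddings: the positive lies near the anchor, the negative far away.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))

    # Made-up scores from two of the four pipelines named in the abstract.
    scores = {
        "expression": {"GO:0005515": 0.8},
        "protein_alignment": {"GO:0005515": 0.6, "GO:0003677": 0.4},
    }
    weights = {"expression": 1.0, "protein_alignment": 1.0}
    print(combine_scores(scores, weights))
```

In this toy setting a term supported by both pipelines ends up with a higher combined score than a term predicted by only one, which mirrors the abstract's point that complementary models can improve both coverage and accuracy.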
Identifiers
pubmed: 35568117
pii: S1672-0229(22)00041-9
doi: 10.1016/j.gpb.2022.03.001
pmc: PMC10025770
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
1013-1027
Grants
Agency: NIAID NIH HHS
ID: R01 AI134678
Country: United States
Agency: NIH HHS
ID: S10 OD026825
Country: United States
Agency: NIEHS NIH HHS
ID: P30 ES017885
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210967
Country: United States
Copyright information
Copyright © 2022 The Authors. Published by Elsevier B.V. All rights reserved.