Evaluation of Sirtuin-3 probe quality and co-expressed genes using literature cohesion.
BXD mice
GeneNetwork.org
Latent Semantic Indexing
Microarray
Sirt3
Text mining
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
14 Mar 2019
History:
entrez: 16 Mar 2019
pubmed: 16 Mar 2019
medline: 3 May 2019
Status:
epublish
Abstract
BACKGROUND
Gene co-expression studies can provide important insights into molecular and cellular signaling pathways. The GeneNetwork database is a unique resource for co-expression analysis using data from a variety of tissues across genetically distinct inbred mice. However, extraction of biologically meaningful co-expressed gene sets is challenging due to variability in microarray platforms, probe quality, normalization methods, and confounding biological factors. In this study, we tested whether literature-derived functional cohesion could be used as an objective metric in lieu of 'ground truth' to evaluate the quality of probes and microarray datasets.
RESULTS
We examined Sirtuin-3 (Sirt3) co-expressed gene sets extracted from either liver or brain tissues of BXD recombinant inbred mice in the GeneNetwork database. Depending on the microarray platform, there were as many as 26 probes that targeted different regions of the Sirt3 primary transcript. Co-expressed gene sets (ranging from 100 to 1000 genes) associated with each Sirt3 probe were evaluated using the previously developed literature-derived cohesion p-value (LPv) and benchmarked against 'gold standards' derived from proteomic studies or Gene Ontology classifications. We found that the maximal F-measure was obtained at an average window size of 535 genes. Using a set size of 500 genes, the Pearson correlations between LPv and F-measure as well as between LPv and mitochondrial gene enrichment p-values were 0.90 and 0.93, respectively. Importantly, we found that the LPv approach can distinguish high-quality Sirt3 probes. Analysis of the most functionally cohesive Sirt3 co-expressed gene set revealed core metabolic pathways that were shared between hippocampus and liver as well as distinct pathways that were unique to each tissue. These results are consistent with other studies that suggest Sirt3 is a key metabolic regulator and has distinct functions in energy-producing vs. energy-demanding tissues.
CONCLUSIONS
Our results provide proof-of-concept that literature cohesion analysis is useful for evaluating the quality of probes and microarray datasets, particularly when experimentally derived gold standards are unavailable. Our approach would enable researchers to rapidly identify biologically meaningful co-expressed gene sets and facilitate discovery from high-throughput genomic data.
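The benchmarking described in the abstract (an F-measure against a gold standard over different set sizes, followed by a Pearson correlation between two per-probe scores) can be illustrated with a minimal sketch. This is not the authors' code: the function names `f_measure`, `best_window`, and `pearson`, the gene identifiers, and the toy inputs are all hypothetical and chosen only to show the arithmetic.

```python
from math import sqrt


def f_measure(predicted, gold):
    """Harmonic mean of precision and recall for a predicted gene set."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def best_window(ranked_genes, gold, sizes):
    """Return (size, F) for the top-N cutoff with the highest F-measure.

    ranked_genes is assumed to be sorted by decreasing co-expression
    with the probe of interest.
    """
    scores = [(n, f_measure(ranked_genes[:n], gold)) for n in sizes]
    return max(scores, key=lambda pair: pair[1])


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy, hypothetical data: six ranked genes and a four-gene gold standard.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
gold = {"g1", "g2", "g4", "g9"}
size, score = best_window(ranked, gold, sizes=[2, 4, 6])
print(size, score)
```

In a real analysis, one list passed to `pearson` would hold a cohesion score per probe and the other the matching F-measure or enrichment score; how the p-values are transformed before correlating them is not stated in the abstract and would have to be taken from the full paper.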
Identifiers
pubmed: 30871457
doi: 10.1186/s12859-019-2621-z
pii: 10.1186/s12859-019-2621-z
pmc: PMC6419539
Chemical substances
Sirtuin 3
EC 3.5.1.-
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
104
Grants
Agency: NIA NIH HHS
ID: R21 AG047619
Country: United States