GenomicDistributions: fast analysis of genomic intervals with Bioconductor.

Keywords: Bioconductor; Data visualization; Genomic regions; R package; Region set summary

Journal

BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
12 Apr 2022
History:
received: 25 Oct 2021
accepted: 13 Mar 2022
entrez: 13 Apr 2022
pubmed: 14 Apr 2022
medline: 15 Apr 2022
Status: epublish

Abstract

Epigenome analysis relies on defined sets of genomic regions output by widely used assays such as ChIP-seq and ATAC-seq. Statistical analysis and visualization of genomic region sets are essential to answer biological questions in gene regulation. As the epigenomics community continues generating data, there will be an increasing need for software tools that can efficiently deal with more abundant and larger genomic region sets. Here, we introduce GenomicDistributions, an R package for fast and easy summarization and visualization of genomic region data. GenomicDistributions offers a broad selection of functions to calculate properties of genomic region sets, such as feature distances, genomic partition overlaps, and more. GenomicDistributions functions are meticulously optimized for best-in-class speed and generally outperform comparable functions in existing R packages. GenomicDistributions also offers plotting functions that produce editable ggplot objects. All GenomicDistributions functions follow a uniform naming scheme and can handle either single or multiple region set inputs. GenomicDistributions offers a fast and scalable tool for exploratory genomic region set analysis and visualization. GenomicDistributions excels in user-friendliness, flexibility of outputs, breadth of functions, and computational performance. GenomicDistributions is available from Bioconductor (https://bioconductor.org/packages/release/bioc/html/GenomicDistributions.html).
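
As an illustration of what a "feature distance" summary might compute, the following is a minimal Python sketch, not the package's R implementation. It assumes each region is a (chromosome, start, end) tuple and each feature (for example, a transcription start site) is a (chromosome, position) tuple; the function name `nearest_feature_distances` and this input layout are hypothetical and chosen only for the example.

```python
from bisect import bisect_left
from collections import defaultdict


def nearest_feature_distances(regions, features):
    """Signed distance from each region midpoint to its nearest feature.

    regions:  iterable of (chrom, start, end) tuples (hypothetical layout)
    features: iterable of (chrom, position) tuples (hypothetical layout)
    Returns one value per region: feature position minus region midpoint,
    or None when the region's chromosome has no features.
    """
    # Group feature positions by chromosome and sort them for binary search.
    by_chrom = defaultdict(list)
    for chrom, pos in features:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    distances = []
    for chrom, start, end in regions:
        positions = by_chrom.get(chrom)
        if not positions:
            distances.append(None)
            continue
        mid = (start + end) // 2
        # Only the features immediately left and right of the midpoint
        # can be the nearest one.
        i = bisect_left(positions, mid)
        candidates = positions[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda p: abs(p - mid))
        distances.append(nearest - mid)
    return distances


example_regions = [("chr1", 100, 200), ("chr1", 900, 1000), ("chr2", 0, 10)]
example_features = [("chr1", 120), ("chr1", 1000)]
example_distances = nearest_feature_distances(example_regions, example_features)
```

Sorting the features once and using binary search keeps each lookup logarithmic in the number of features, which is one common way to scale this kind of summary to large region sets; whether GenomicDistributions uses the same strategy is not stated in the abstract.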

Identifiers

pubmed: 35413804
doi: 10.1186/s12864-022-08467-y
pii: 10.1186/s12864-022-08467-y
pmc: PMC9003978

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

299

Grants

Agency: NIGMS NIH HHS
ID: R35 GM128636
Country: United States
Agency: NIGMS NIH HHS
ID: GM128636
Country: United States

Copyright information

© 2022. The Author(s).

Authors

Kristyna Kupkova (K)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.
Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, USA.

Jose Verdezoto Mosquera (JV)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.
Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, USA.

Jason P Smith (JP)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.
Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, USA.

Michał Stolarczyk (M)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.

Tessa L Danehy (TL)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.

John T Lawson (JT)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.
Department of Biomedical Engineering, University of Virginia, Charlottesville, USA.

Bingjie Xue (B)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.
Department of Biomedical Engineering, University of Virginia, Charlottesville, USA.

John T Stubbs (JT)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.
Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, USA.

Nathan LeRoy (N)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA.
Department of Biomedical Engineering, University of Virginia, Charlottesville, USA.

Nathan C Sheffield (NC)

Center for Public Health Genomics, University of Virginia, Charlottesville, USA. nsheffield@virginia.edu.
Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, USA. nsheffield@virginia.edu.
Department of Biomedical Engineering, University of Virginia, Charlottesville, USA. nsheffield@virginia.edu.
Department of Public Health Sciences, University of Virginia, Charlottesville, USA. nsheffield@virginia.edu.
