'Multi-SpaM': a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees.


Journal

NAR genomics and bioinformatics
ISSN: 2631-9268
Abbreviated title: NAR Genom Bioinform
Country: England
NLM ID: 101756213

Publication information

Publication date:
Mar 2020
History:
received: 30 04 2019
revised: 31 07 2019
accepted: 13 10 2019
entrez: 12 2 2021
pubmed: 30 10 2019
medline: 30 10 2019
Status: epublish

Abstract

Word-based or 'alignment-free' methods for phylogeny inference have become popular in recent years. These methods are much faster than traditional, alignment-based approaches, but they are generally less accurate. Most alignment-free methods calculate 'pairwise' distances between nucleic-acid or protein sequences; these distance values can then be used as input for tree-reconstruction programs such as neighbor-joining. In this paper, we propose the first word-based phylogeny approach that is based on 'multiple' sequence comparison and 'maximum likelihood'. Our algorithm first samples small, gap-free alignments involving four taxa each. For each of these alignments, it then calculates a quartet tree and, finally, the program 'Quartet MaxCut' is used to infer a super tree for the full set of input taxa from the calculated quartet trees. Experimental results show that trees produced with our approach are of high quality.
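
The quartet step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states that quartet trees are obtained by maximum likelihood, whereas this sketch swaps in a much simpler criterion (counting parsimony-informative columns) purely to show how a gap-free four-taxon block maps to one of the three possible quartet topologies. The function name `quartet_topology` and the input layout (a dict from taxon name to sequence) are assumptions made for illustration.

```python
def quartet_topology(block):
    """Pick a quartet topology for one gap-free four-taxon alignment block.

    block: dict mapping exactly four taxon names to equal-length sequences.
    Returns the best-supported split as ((a, b), (c, d)), or None if no
    split is strictly best. Uses informative-site counting, NOT maximum
    likelihood (a simplification assumed for this sketch).
    """
    names = sorted(block)
    if len(names) != 4:
        raise ValueError("a quartet block needs exactly four taxa")
    seqs = [block[n] for n in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences in a gap-free block must have equal length")

    a, b, c, d = names
    # The three possible unrooted topologies on four taxa.
    support = {
        ((a, b), (c, d)): 0,
        ((a, c), (b, d)): 0,
        ((a, d), (b, c)): 0,
    }
    # A column supports a split when it shows a two-versus-two pattern.
    for w, x, y, z in zip(*seqs):
        if w == x and y == z and w != y:
            support[((a, b), (c, d))] += 1
        elif w == y and x == z and w != x:
            support[((a, c), (b, d))] += 1
        elif w == z and x == y and w != x:
            support[((a, d), (b, c))] += 1

    ranked = sorted(support.items(), key=lambda kv: kv[1], reverse=True)
    if ranked[0][1] == ranked[1][1]:
        return None  # tie or no informative column: leave the block unresolved
    return ranked[0][0]


if __name__ == "__main__":
    example = {"A": "ACGTAC", "B": "ACGTAC", "C": "ATGTTC", "D": "ATGTTC"}
    print(quartet_topology(example))
```

In the approach the abstract describes, many such quartet trees, one per sampled block, would then be passed to a supertree program such as Quartet MaxCut; that step is not shown here.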

Identifiers

pubmed: 33575565
doi: 10.1093/nargab/lqz013
pii: lqz013
pmc: PMC7671388

Publication types

Journal Article

Languages

eng

Pagination

lqz013

Copyright information

© The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.

Authors

Thomas Dencker (T)

Department of Bioinformatics, Institute of Microbiology and Genetics, Universität Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany.

Chris-André Leimeister (CA)

Department of Bioinformatics, Institute of Microbiology and Genetics, Universität Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany.

Michael Gerth (M)

Institute for Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, L69 7ZB Liverpool, UK.

Christoph Bleidorn (C)

Department of Animal Evolution and Biodiversity, Universität Göttingen, Untere Karspüle 2, 37073 Göttingen, Germany.
Museo Nacional de Ciencias Naturales, Spanish National Research Council (CSIC), 28006 Madrid, Spain.

Sagi Snir (S)

Institute of Evolution, Department of Evolutionary and Environmental Biology, University of Haifa, 199 Aba Khoushy Ave. Mount Carmel, Haifa, Israel.

Burkhard Morgenstern (B)

Department of Bioinformatics, Institute of Microbiology and Genetics, Universität Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany.
Göttingen Center of Molecular Biosciences (GZMB), Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany.

MeSH classifications