Robust barcoding and identification of Mycobacterium tuberculosis lineages for epidemiological and clinical studies.


Journal

Genome medicine
ISSN: 1756-994X
Abbreviated title: Genome Med
Country: England
NLM ID: 101475844

Publication information

Publication date:
14 12 2020
History:
received: 19 07 2020
accepted: 03 12 2020
entrez: 15 12 2020
pubmed: 16 12 2020
medline: 6 11 2021
Status: epublish

Abstract

BACKGROUND
Tuberculosis, caused by bacteria in the Mycobacterium tuberculosis complex (MTBC), is a major global public health burden. Strain-specific genomic diversity in the known lineages of MTBC is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Fast and accurate tracking of MTBC strains is therefore crucial for infection control, and our previous work developed a 62-single nucleotide polymorphism (SNP) barcode to inform on the phylogenetic identity of 7 human lineages and 64 sub-lineages.
METHODS
To update this barcode, we analysed whole genome sequencing data from 35,298 MTBC isolates (~1 million SNPs) covering 9 main lineages and 3 similar animal-related species (M. tuberculosis var. bovis, M. tuberculosis var. caprae and M. tuberculosis var. orygis). The data were partitioned into training (N = 17,903, 50.7%) and test (N = 17,395, 49.3%) sets and analysed using an integrated phylogenetic tree and population differentiation (F_ST) statistical approach.
RESULTS
By constructing a phylogenetic tree on the training MTBC isolates, we characterised 90 lineages, sub-lineages or species, of which 30 are new, and identified 421 robust barcoding mutations, from which a minimal set of 90 was selected that included 20 markers from the 62-SNP barcode. The barcoding SNPs (90 and 421) perfectly discriminated the 86 MTBC isolate (sub-)lineages in the test set and could accurately reconstruct the clades across the combined 35k samples.
CONCLUSIONS
The validated 90 SNPs can be used for the rapid diagnosis and tracking of MTBC strains to assist public health surveillance and control. To facilitate this, the SNP markers have now been incorporated into the TB-Profiler informatics platform (https://github.com/jodyphelan/TBProfiler).
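
To illustrate how a SNP barcode of this kind can assign an isolate to a lineage, here is a minimal Python sketch. It is not the TB-Profiler implementation. The positions, alleles and lineage labels in `BARCODE` are invented for illustration and are not the published 90-SNP barcode. The functions `assign_lineages` and `most_specific` are hypothetical helpers, and the sketch assumes that dotted labels (for example `lineage2.2`) encode nesting within a parent lineage.

```python
# Hypothetical barcode: genome position -> (marker allele, lineage label).
# These positions and alleles are made up for illustration; they are NOT
# the published 90-SNP barcode.
BARCODE = {
    1000: ("A", "lineage1"),
    2000: ("T", "lineage2"),
    2500: ("G", "lineage2.2"),
    3000: ("C", "lineage4"),
}


def assign_lineages(calls, barcode=BARCODE):
    """Return sorted lineage labels whose marker allele is present in `calls`.

    `calls` maps genome position -> observed base for one isolate.
    Positions missing from `calls` are treated as uninformative.
    """
    hits = set()
    for pos, (allele, lineage) in barcode.items():
        if calls.get(pos) == allele:
            hits.add(lineage)
    return sorted(hits)


def most_specific(lineages):
    """Pick the deepest label, assuming dotted names encode nesting."""
    return max(lineages, key=lambda name: name.count("."), default=None)


# One isolate's base calls at the barcode positions (invented data).
sample = {1000: "G", 2000: "T", 2500: "G", 3000: "T"}
calls = assign_lineages(sample)
print(calls)                 # lineage labels supported by marker alleles
print(most_specific(calls))  # deepest supported label
```

In this toy example the isolate carries the marker alleles for both a parent lineage and one of its sub-lineages, so the sketch reports both and takes the sub-lineage as the most specific call. A real tool would also need to handle missing or mixed calls, which this sketch ignores.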

Identifiers

pubmed: 33317631
doi: 10.1186/s13073-020-00817-3
pii: 10.1186/s13073-020-00817-3
pmc: PMC7734807

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

114

Grants

Agency: Foundation for the National Institutes of Health
ID: D43TW009127
Agency: Medical Research Council (GB)
ID: MR/N010469/1
Agency: Medical Research Council
ID: MR/M01360X/1
Country: United Kingdom
Agency: Medical Research Council
ID: MR/R025576/1
Country: United Kingdom
Agency: Medical Research Council
ID: MR/N010469/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/R013063/1
Country: United Kingdom
Agency: Medical Research Council
ID: MR/R020973/1
Country: United Kingdom
Agency: FIC NIH HHS
ID: D43 TW009127
Country: United States

Authors

Gary Napier (G)

Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK.

Susana Campino (S)

Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK.

Yared Merid (Y)

Armauer Hansen Research Institute, Addis Ababa, Ethiopia.
Department of Microbiology, Immunology and Parasitology, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia.
Hawassa University College of Medicine and Health Sciences, Hawassa, Ethiopia.

Markos Abebe (M)

Armauer Hansen Research Institute, Addis Ababa, Ethiopia.

Yimtubezinash Woldeamanuel (Y)

Department of Microbiology, Immunology and Parasitology, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia.

Abraham Aseffa (A)

Armauer Hansen Research Institute, Addis Ababa, Ethiopia.

Martin L Hibberd (ML)

Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK.

Jody Phelan (J)

Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK.

Taane G Clark (TG)

Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK. taane.clark@lshtm.ac.uk.
Faculty of Epidemiology and Population Health, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK. taane.clark@lshtm.ac.uk.
