Sequencing platform shifts provide opportunities but pose challenges for combining genomic data sets.
HiSeq
NGS
NovaSeq
poly-G
reproducibility
reusability
Journal
Molecular Ecology Resources
ISSN: 1755-0998
Abbreviated title: Mol Ecol Resour
Country: England
NLM ID: 101465604
Publication information
Publication date: Apr 2021
History:
received: 2020-09-04
revised: 2020-11-23
accepted: 2020-12-07
pubmed: 2020-12-15
medline: 2021-08-18
entrez: 2020-12-14
Status:
ppublish
Abstract
Technological advances in DNA sequencing over the last decade now permit the production and curation of large genomic data sets in an increasing number of nonmodel species. Additionally, these new data provide the opportunity for combining data sets, resulting in larger studies with a broader taxonomic range. Whilst the development of new sequencing platforms has been beneficial, resulting in a higher throughput of data at a lower per-base cost, shifts in sequencing technology can also pose challenges for those wishing to combine new sequencing data with data sequenced on older platforms. Here, we outline the types of studies where the use of curated data might be beneficial, and highlight potential biases that might be introduced by combining data from different sequencing platforms. As an example of the challenges associated with combining data across sequencing platforms, we focus on the impact of the shift in Illumina's base calling technology from a four-channel system to a two-channel system. We caution that when data are combined from these two systems, erroneous guanine base calls that result from the two-channel chemistry can make their way through a bioinformatic pipeline, eventually leading to inaccurate and potentially misleading conclusions. We also suggest solutions for dealing with such potential artefacts, which make samples sequenced on different sequencing platforms appear more differentiated from one another than they really are. Finally, we stress the importance of archiving tissue samples and the associated sequences for the continued reproducibility and reusability of sequencing data in the face of ever-changing sequencing platform technology.
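The artefact described in the abstract can be illustrated with a minimal sketch. On two-channel instruments a cycle with no signal is called as guanine, so reads can end in long runs of G that are not real sequence. The function below is not the authors' method; it is a hypothetical helper (`trim_poly_g`, with an assumed threshold `min_run`) that removes a trailing G run from a read and its quality string before data from two-channel and four-channel platforms are combined.

```python
from typing import Tuple


def trim_poly_g(seq: str, qual: str, min_run: int = 10) -> Tuple[str, str]:
    """Strip a trailing run of G bases if it is at least `min_run` long.

    `seq` and `qual` are the sequence and quality strings of one FASTQ
    record and must have equal length. `min_run` is an assumed threshold,
    not a value taken from the source.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")

    # Count the uninterrupted run of G at the 3' end of the read.
    stripped = seq.rstrip("G")
    run_length = len(seq) - len(stripped)

    # Short G runs are plausibly genuine sequence, so leave them alone.
    if run_length < min_run:
        return seq, qual

    # Trim the quality string to match the shortened sequence.
    return stripped, qual[: len(stripped)]


# Example: a read ending in twelve G calls is trimmed, a normal read is not.
trimmed_seq, trimmed_qual = trim_poly_g("ACGT" + "G" * 12, "I" * 16)
kept_seq, kept_qual = trim_poly_g("ACGTGG", "IIIIII")
```

This simple version also removes any genuine G that immediately precedes the artefactual run, and it does not tolerate mismatches within the run; dedicated read preprocessors handle these cases more carefully.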
Identifiers
pubmed: 33314612
doi: 10.1111/1755-0998.13309
Publication types
News
Languages
eng
Citation subsets
IM
Pagination
653-660
Grants
Agency: SNSF
ID: 31003A_163446
Agency: European Regional Development Fund
Agency: Swiss Federal Office for the Environment and Eawag
Agency: Portuguese National Science Foundation
ID: SFRH/BD/145153/2019
Copyright information
© 2020 John Wiley & Sons Ltd.