Benchmarking Oxford Nanopore read alignment-based insertion and deletion detection in crop plant genomes.
Journal
The Plant Genome
ISSN: 1940-3372
Abbreviated title: Plant Genome
Country: United States
NLM ID: 101273919
Publication information
Publication date: June 2023
History:
received: 22 September 2022
accepted: 15 January 2023
medline: 20 June 2023
pubmed: 30 March 2023
entrez: 29 March 2023
Status: ppublish
Abstract
Structural variations (SVs) are larger polymorphisms (> 50 bp in length), which consist of insertions, deletions, inversions, duplications, and translocations. They can have a strong impact on agronomical traits and play an important role in environmental adaptation. The development of long-read sequencing technologies, including Oxford Nanopore, allows for comprehensive SV discovery and characterization even in complex polyploid crop genomes. However, many of the SV discovery pipeline benchmarks do not include complex plant genome datasets. In this study, we benchmarked insertion and deletion detection by popular long-read alignment-based SV detection tools for crop plant genomes. We used real and simulated Oxford Nanopore reads for two crops, allotetraploid Brassica napus (oilseed rape) and diploid Solanum lycopersicum (tomato), and evaluated several read aligners and SV callers across 5×, 10×, and 20× coverages typically used in re-sequencing studies. We further validated our findings using maize and soybean datasets. Our benchmarks provide a useful guide for designing Oxford Nanopore re-sequencing projects and SV discovery pipelines for crop plants.
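As an illustration of what benchmarking insertion and deletion calls can involve, the following is a minimal sketch that compares a set of called SVs against a truth set and reports precision, recall, and F1. It is not the authors' pipeline: the `SV` record, the `is_match` and `benchmark` functions, and the matching thresholds (`max_dist`, `min_size_ratio`) are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal record for an insertion or deletion."""
    chrom: str
    pos: int
    svtype: str  # "INS" or "DEL"
    length: int


def is_match(call, truth, max_dist=500, min_size_ratio=0.7):
    """Treat a call as matching a truth SV if both are on the same
    chromosome, have the same type, lie within max_dist bp of each other,
    and have similar lengths. Thresholds are illustrative assumptions."""
    if call.chrom != truth.chrom or call.svtype != truth.svtype:
        return False
    if abs(call.pos - truth.pos) > max_dist:
        return False
    small, large = sorted((call.length, truth.length))
    return large > 0 and small / large >= min_size_ratio


def benchmark(calls, truths, **match_kwargs):
    """Greedy one-to-one matching of calls to truth SVs, then
    precision, recall, and F1 from the resulting counts."""
    unmatched = list(truths)
    tp = 0
    for call in calls:
        for truth in unmatched:
            if is_match(call, truth, **match_kwargs):
                unmatched.remove(truth)  # each truth SV is matched at most once
                tp += 1
                break
    fp = len(calls) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy example with made-up coordinates
truths = [SV("chr1", 1000, "DEL", 100), SV("chr1", 5000, "INS", 300)]
calls = [SV("chr1", 1050, "DEL", 90), SV("chr2", 200, "INS", 60)]
result = benchmark(calls, truths)
```

In a setting like the one described in the abstract, such a comparison would be repeated for each aligner, SV caller, and coverage level (for example 5×, 10×, and 20×), so that the resulting scores can be compared across pipelines.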
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e20314
Copyright information
© 2023 The Authors. The Plant Genome published by Wiley Periodicals LLC on behalf of Crop Science Society of America.