      On the ability to extract MLVA profiles of Vibrio cholerae isolates from WGS data generated with Oxford Nanopore Technologies


          Abstract

          Objective

          Multiple-Locus Variable Number of Tandem Repeats (VNTR) Analysis (MLVA) is widely used to subtype pathogens causing foodborne and waterborne disease outbreaks. The MLVAType Shiny application was previously designed to extract MLVA profiles of Vibrio cholerae isolates from whole-genome sequencing (WGS) data and to provide backward compatibility with traditional MLVA typing methods. The previous development and validation work was conducted using short paired-end reads (300 and 150 nt) from Illumina MiSeq and HiSeq sequencing. In this study, the MLVAType application was validated using long reads generated by Oxford Nanopore Technologies (ONT) sequencing platforms. In silico MLVA profiles of V. cholerae isolates (n = 9) from the Democratic Republic of the Congo were generated using the MLVAType application on Nanopore WGS data. The WGS-derived in silico MLVA profiles were extracted from Canu (v.2.2) assemblies obtained through MinION and GridION sequencing by ONT. The results were compared to those obtained from SPAdes assemblies (v3.13.0; k-mer 175) generated from short-read (paired-end 300-bp) reference data obtained by Illumina MiSeq sequencing.
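
          The general idea of reading a VNTR copy number out of an assembled contig can be illustrated with a minimal Python sketch. This is not the MLVAType implementation, whose internals are not described in this record; the function names, the locus names, and the motifs below are hypothetical placeholders chosen for illustration. The sketch simply takes the longest uninterrupted run of a repeat motif on either strand, whereas a production tool would more likely anchor each locus on its flanking sequences.

```python
import re

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_run(seq: str, motif: str) -> int:
    """Copy number of the longest uninterrupted run of `motif` in `seq`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    lengths = [len(m.group(0)) for m in pattern.finditer(seq)]
    return max(lengths, default=0) // len(motif)


def count_tandem_repeats(contig: str, motif: str) -> int:
    """Check both strands, since a contig may be assembled in either orientation."""
    forward = contig.upper()
    return max(longest_run(forward, motif),
               longest_run(reverse_complement(forward), motif))


def mlva_profile(contig: str, motifs: dict) -> tuple:
    """Build a profile (one copy number per locus) in the order the loci are given."""
    return tuple(count_tandem_repeats(contig, motif) for motif in motifs.values())


# Hypothetical loci and motifs, for illustration only.
motifs = {"locus_A": "AACAGA", "locus_B": "TGCTGT"}
contig = "GGTT" + "AACAGA" * 3 + "CCTT" + "TGCTGT" * 5 + "AA"
profile = mlva_profile(contig, motifs)
```

          Because the count depends on an uninterrupted run, a single indel inside the repeat array would change the result, which is why assembly accuracy matters when profiles are derived from noisy long reads.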

          Results

          For each isolate, the in silico MLVA profiles were concordant across all three sequencing methods, demonstrating that the MLVAType application can accurately predict MLVA profiles from assembled genomes generated by long-read ONT sequencers.
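
          A concordance check of this kind can be sketched in a few lines of Python. The data structure, the method labels, and all profile values below are invented for illustration and are not taken from the study; the sketch only shows one way to flag isolates whose profiles differ between sequencing and assembly routes.

```python
def discordant_isolates(profiles: dict) -> list:
    """Return the isolates whose MLVA profile differs between methods.

    `profiles` maps isolate name -> {method label -> profile tuple}.
    """
    return sorted(
        isolate
        for isolate, by_method in profiles.items()
        if len(set(by_method.values())) > 1
    )


# Made-up example values (not results from the article).
example_profiles = {
    "isolate_1": {
        "MiSeq/SPAdes": (9, 4, 6, 15, 20),
        "MinION/Canu": (9, 4, 6, 15, 20),
        "GridION/Canu": (9, 4, 6, 15, 20),
    },
    "isolate_2": {
        "MiSeq/SPAdes": (8, 4, 6, 17, 21),
        "MinION/Canu": (8, 4, 6, 17, 21),
        "GridION/Canu": (8, 4, 6, 16, 21),  # deliberately discordant toy value
    },
}

flagged = discordant_isolates(example_profiles)
```

          An empty list returned by such a check would correspond to the full concordance reported above.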

          Most cited references (10)


          SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

          The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

            Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation

            Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes versus Celera Assembler 8.2. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences (PacBio) or Oxford Nanopore technologies and achieves a contig NG50 of >21 Mbp on both human and Drosophila melanogaster PacBio data sets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs in graphical fragment assembly (GFA) format for analysis or integration with complementary phasing and scaffolding techniques. The combination of such highly resolved assembly graphs with long-range scaffolding information promises the complete and automated assembly of complex genomes.

              Sequencing DNA with nanopores: Troubles and biases

              Oxford Nanopore Technologies’ (ONT) long read sequencers offer access to longer DNA fragments than previous sequencer generations, at the cost of a higher error rate. While many papers have studied read correction methods, few have addressed the detailed characterization of observed errors, a task complicated by frequent changes in chemistry and software in ONT technology. The MinION sequencer is now more stable and this paper proposes an up-to-date view of its error landscape, using the most mature flowcell and basecaller. We studied Nanopore sequencing error biases on both bacterial and human DNA reads. We found that, although Nanopore sequencing is expected not to suffer from GC bias, it is a crucial parameter with respect to errors. In particular, low-GC reads have fewer errors than high-GC reads (about 6% and 8% respectively). The error profile for homopolymeric regions or regions with short repeats, the source of about half of all sequencing errors, also depends on the GC rate and mainly shows deletions, although there are some reads with long insertions. Another interesting finding is that the quality measure, although over-estimated, offers valuable information to predict the error rate as well as the abundance of reads. We supplemented this study with an analysis of a rapeseed RNA read set and showed a higher level of errors with a higher level of deletion in these data. Finally, we have implemented an open source pipeline for long-term monitoring of the error profile, which enables users to easily compute the various analyses presented in this work, including for future developments of the sequencing device. Overall, we hope this work will provide a basis for the design of better error-correction methods.

                Author and article information

                Contributors
                jerome.ambroise@uclouvain.be
                Journal
                BMC Res Notes
                BMC Research Notes
                BioMed Central (London)
                1756-0500
                16 January 2025
                16 January 2025
                2025
                : 18
                : 18
                Affiliations
                Center for Applied Molecular Technologies (CTMA), Institute of Clinical and Experimental Research (IREC), Université catholique de Louvain (UCLouvain) (https://ror.org/02495e989), Brussels, Belgium
                Article
                7093
                10.1186/s13104-025-07093-7
                11740648
                39819345
                5dfb6bcb-d4b8-4099-a7e9-663b4ae2c6e6
                © The Author(s) 2025

                Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

                History
                : 13 September 2024
                : 8 January 2025
                Funding
                Funded by: Belgian Cooperation Agency of the ARES (Académie de Recherche et d’Enseignement Supérieur)
                Award ID: COOP-CONV-20-022
                Categories
                Research Note
                Custom metadata
                © BioMed Central Ltd., part of Springer Nature 2025

                Medicine
                in silico mlva profiles, sequencing, nanopore, long reads, wgs
