Authors
Chen Jiang, Yuanyan Xiong
Published in
Bioinformatics (Oxford, England). Jun 25, 2026. Epub Jun 25, 2026.
Abstract
Long-read sequencing (LRS) platforms offer extended read lengths but present computational challenges due to high error rates and frequent insertion-deletion (indel) artifacts. While sample multiplexing is essential for cost-efficiency, existing demultiplexing solutions face a dichotomy: vendor-provided tools (e.g., Dorado) often lack the structural flexibility required for highly non-canonical designs, while open-source tools (e.g., Cutadapt) often lack the speed or algorithmic robustness to handle custom, high-complexity barcode designs. Here, we present ReadChop, a high-performance demultiplexer implemented in Rust. ReadChop leverages Myers' bit-parallel algorithm to efficiently model indel-rich error profiles and employs a streaming architecture to ensure a low memory footprint. Benchmarking demonstrates that ReadChop achieves classification precision exceeding 99.99% on both simulated datasets, even under ultra-high multiplexing conditions (e.g., 13,824-plex), and empirical SARS-CoV-2 amplicons. Furthermore, it excels in filtering in silico chimeras (0.1% miss rate) and exhibits linear computational scalability on ultra-long templates (up to 100 kb). Crucially, it runs significantly faster than existing tools (>6 times faster than Dorado, >2 times faster than Nanoplexer, and >30 times faster than Cutadapt), with memory usage consistently below 200 MB. ReadChop provides a flexible, robust solution for processing massive LRS datasets with non-canonical experimental designs.
Source code and documentation are freely available under the MIT license at https://github.com/cherryamme/ReadChop.
Supplementary data are available at Bioinformatics online.
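To illustrate the kind of matching the abstract refers to, the following is a minimal sketch of Myers' bit-parallel approximate search applied to barcode assignment. It is not taken from ReadChop's source. The function names `myers_search` and `classify`, the 64-base limit on barcode length, the `max_dist` threshold, and the rule that a tie between barcodes leaves the read unassigned are all assumptions made for illustration.

```rust
/// Myers' bit-parallel approximate search (single 64-bit word).
/// Returns the smallest edit distance between `pattern` and any substring
/// of `text`, together with the 0-based index in `text` where the first
/// such best match ends. Returns None if the pattern is empty or longer
/// than 64 symbols, or if the text is empty.
fn myers_search(pattern: &[u8], text: &[u8]) -> Option<(usize, usize)> {
    let m = pattern.len();
    if m == 0 || m > 64 || text.is_empty() {
        return None;
    }
    // peq[c] has bit i set when pattern[i] == c.
    let mut peq = [0u64; 256];
    for (i, &c) in pattern.iter().enumerate() {
        peq[c as usize] |= 1u64 << i;
    }
    let top = 1u64 << (m - 1); // bit for the last pattern row
    let mut pv = !0u64; // rows where the vertical delta is +1
    let mut mv = 0u64; // rows where the vertical delta is -1
    let mut score = m; // distance in the last row of the current column
    let mut best = (usize::MAX, 0usize);
    for (j, &c) in text.iter().enumerate() {
        let eq = peq[c as usize];
        let xv = eq | mv;
        let xh = ((eq & pv).wrapping_add(pv) ^ pv) | eq;
        let mut ph = mv | !(xh | pv);
        let mut mh = pv & xh;
        if ph & top != 0 {
            score += 1;
        } else if mh & top != 0 {
            score -= 1;
        }
        // Search variant: no low bit is set after the shift, so a match
        // may start at any position in the text.
        ph <<= 1;
        mh <<= 1;
        pv = mh | !(xv | ph);
        mv = ph & xv;
        if score < best.0 {
            best = (score, j);
        }
    }
    Some(best)
}

/// Hypothetical classifier: assign a read to the single barcode with the
/// lowest edit distance not exceeding `max_dist`. Ties yield None.
fn classify<'a>(
    read: &[u8],
    barcodes: &[(&'a str, &'a str)],
    max_dist: usize,
) -> Option<&'a str> {
    let mut best: Option<(&'a str, usize)> = None;
    let mut tie = false;
    for &(name, seq) in barcodes {
        if let Some((d, _)) = myers_search(seq.as_bytes(), read) {
            if d > max_dist {
                continue;
            }
            match best {
                Some((_, bd)) if d > bd => {}
                Some((_, bd)) if d == bd => tie = true,
                _ => {
                    best = Some((name, d));
                    tie = false;
                }
            }
        }
    }
    if tie {
        None
    } else {
        best.map(|(name, _)| name)
    }
}
```

Each base of the read updates the whole barcode column with a fixed number of word operations, so insertions and deletions cost no more to model than substitutions. A streaming demultiplexer could call something like `classify` once per read as records arrive, rather than loading the full file into memory.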
PMID: 42348199
Bibliographic data and abstract were imported from PubMed on 25 Jun 2026.