Authors
Bing-Shiun Tsai, Jin-Yung Wong, Huai-Kuang Tsai
Published in
Briefings in bioinformatics. Volume 27. Issue 4. Jul 03, 2026.
Abstract
Object detection has revolutionized multiple domains by enabling models to jointly classify and localize targets within data. Yet, its potential in genomic sequence analysis remains largely unexplored. Here, we introduce DNA-DETR, an adaptation of the DETR architecture for one-dimensional genomic object detection. Surprisingly, the direct application of object detection to DNA sequences yielded poor performance, even for elements with simple definitions such as Non-B DNA. We found that the widely used one-hot encoding failed to capture key structural features of several Non-B DNA types. To address this limitation, we systematically investigated how different sequence representations, including one-hot encoding, dot matrix, and their combination, affect detection accuracy and model generalization. Our experiments demonstrate that the choice of representation profoundly affects both localization and classification. Notably, the combined representation consistently outperformed single representations, particularly for complex sequence elements. Our findings suggest that there is no universal 'one-representation-fits-all' solution in sequence feature learning. Despite the common perception that end-to-end learning diminishes the importance of representation, our results highlight that thoughtful selection of sequence representation remains critical for model design.
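To make the two representations named in the abstract concrete, the following is a minimal sketch. It assumes that "dot matrix" means a position-by-position self-comparison matrix of the sequence. The function names `one_hot` and `dot_matrix`, the base ordering `ACGT`, and the `complement` option are illustrative assumptions, not the DNA-DETR implementation, and the abstract does not state how the two representations are combined.

```python
# Illustrative sketch only: names and matrix conventions are assumptions,
# not taken from the DNA-DETR code.
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def one_hot(seq):
    """Return a len(seq) x 4 matrix; row i marks the base at position i.

    Bases outside ACGT (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def dot_matrix(seq, complement=False):
    """Return an L x L matrix comparing every position with every other.

    Entry (i, j) is 1 when the base at j equals the base at i, or, with
    complement=True, when the two bases can pair (A-T, C-G).
    """
    seq = seq.upper()
    rows = []
    for a in seq:
        target = COMPLEMENT.get(a) if complement else a
        rows.append([1 if b == target else 0 for b in seq])
    return rows


seq = "GGATCC"  # a short inverted repeat
encoded = one_hot(seq)
pairs = dot_matrix(seq, complement=True)

for row in pairs:
    print("".join(str(v) for v in row))
```

In the complement dot matrix, the inverted repeat shows up as a filled anti-diagonal, so the pairing between distant positions is explicit in the input. One-hot encoding describes each position independently, and any such long-range relationship has to be inferred by the model.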
PMID:
42398068
Bibliographic data and abstract were imported from PubMed on 04 Jul 2026.