Authors
Ikemura, T., Iwasaki, Y., Wada, K., Wada, Y., Abe, T.
Abstract
Unsupervised and explainable AI can uncover genomic features that extend beyond human expectations. Oligonucleotides such as penta- and hexanucleotides often function as core binding motifs for regulatory proteins, and their usage provides a powerful tool in functional genomics. We applied an unsupervised, explainable AI approach to odds-ratio (observed/expected) profiles of all 1-Mb euchromatic fragments in the human genome. This odds-ratio analysis identified oligonucleotide features independent of mononucleotide composition, thereby highlighting functional roles of oligonucleotide motifs. AI-based clustering of all 1-Mb euchromatic fragments, using either penta- or hexanucleotide odds ratios (1,024 or 4,096 variables), unexpectedly revealed nearly 2,000 distinct zones in both cases, despite the large difference in dimensionality. If these ~2,000 zones represent biologically meaningful segmentations, comparable structures would be expected to emerge when other oligonucleotide types are analyzed. Consistent with this expectation, CG-containing penta- and hexanucleotides (244 and 1,185 variables) produced comparable ~2,000 zones, indicating that the underlying segmental structures reflect fundamental functional divisions within the genome. Human chromosomes exhibit well-established Giemsa-banding patterns comprising 850 bands at prometaphase and 2,000 bands at prophase. Since single-nucleotide coordinates are available for the 850 bands, we identified a diagnostic oligonucleotide set that distinguishes Giemsa-negative from Giemsa-positive regions. Computational pseudo-band reconstruction based on this set generated genome segmentations that paralleled the AI-derived ~2,000 clusters more closely than they paralleled the 850 bands. These unexpected findings indicate that AI captures the characteristic features of chromosomal bands and can predict high-resolution banding from genome sequences alone, thereby bridging classical cytogenetics and modern AI-based genomics.
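To make the odds-ratio profile concrete, the following minimal Python sketch computes observed/expected ratios for every k-mer in one fragment and lists the CG-containing subset. It is an illustration under stated assumptions, not the authors' pipeline: the function names `odds_ratios` and `cg_containing` are hypothetical, and the expected frequency is assumed to be the product of the fragment's mononucleotide frequencies, which is one common way to remove the effect of mononucleotide composition. The preprint may use a different background model.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def odds_ratios(seq, k=5):
    """Return {k-mer: observed/expected} for one fragment (hypothetical helper).

    Assumption: 'expected' is the product of the fragment's own
    mononucleotide frequencies; the preprint may define it differently.
    """
    seq = seq.upper()

    # Mononucleotide frequencies, ignoring ambiguous bases such as N.
    base_counts = Counter(b for b in seq if b in BASES)
    total_bases = sum(base_counts.values())
    if total_bases == 0:
        raise ValueError("sequence contains no unambiguous bases")
    base_freq = {b: base_counts[b] / total_bases for b in BASES}

    # Overlapping k-mer counts, skipping windows with ambiguous bases.
    kmer_counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(c in BASES for c in kmer):
            kmer_counts[kmer] += 1
    total_kmers = sum(kmer_counts.values())

    ratios = {}
    for letters in product(BASES, repeat=k):
        kmer = "".join(letters)
        expected = 1.0
        for c in kmer:
            expected *= base_freq[c]
        observed = kmer_counts[kmer] / total_kmers if total_kmers else 0.0
        # A k-mer whose expected frequency is zero is reported as 0.0.
        ratios[kmer] = observed / expected if expected > 0 else 0.0
    return ratios


def cg_containing(k):
    """List all k-mers that contain the dinucleotide CG (hypothetical helper)."""
    kmers = ("".join(t) for t in product(BASES, repeat=k))
    return [kmer for kmer in kmers if "CG" in kmer]


if __name__ == "__main__":
    profile = odds_ratios("ACGT" * 50, k=5)
    print(len(profile), len(cg_containing(5)), len(cg_containing(6)))
```

The variable counts quoted in the abstract follow directly from this enumeration: there are 4^5 = 1,024 pentanucleotides and 4^6 = 4,096 hexanucleotides, of which 244 and 1,185 respectively contain CG. In a full analysis, one such profile per 1-Mb fragment would form a row of the matrix passed to the clustering step.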
Preprint server:
bioRxiv
The author list and abstract were imported from bioRxiv on 12 Mar 2026.