Authors
Pronk, B., Makrodimitris, S., Wilting, S., Reinders, M.
Abstract
Accurate discrimination between healthy individuals and patients with cancer using minimally invasive liquid biopsies could improve cancer diagnosis and monitoring. Circulating cell-free DNA (cfDNA) is a promising biomarker, since fragmentation patterns reflect chromatin organization and have been used to interrogate regulatory regions such as transcription start sites (TSSs). Classification approaches typically rely on hypothesis-driven selection of genomic regions based on literature or external tissue data. Therefore, they assume that tumor-derived cfDNA constitutes the dominant diagnostic signal, potentially overlooking a systemic, genome-wide shift in the cfDNA pool. We present a data-driven framework that identifies discriminative genomic loci directly from cfDNA whole-genome sequencing data. Using fragmentomic features captured at TSSs within a nested cross-validation framework, the model outperforms ichorCNA and hypothesis-driven baselines in distinguishing healthy from colorectal and breast cancer samples (AUROC 0.95 ± 0.039). Performance was maintained in a pan-cancer setting across seven malignancies (AUROC 0.946 ± 0.032) and generalized to previously unseen cancer types within the same cohorts (AUROC 0.934 ± 0.006). While validation in an independent external cohort showed a performance gap (AUROC 0.694), the data-driven model was consistently competitive with baseline methods. These results indicate that robust cancer detection is enabled by integrating distributed genome-wide fragmentation patterns rather than restricting analysis to predefined regions.
Preprint server:
bioRxiv
The author list and abstract were imported from bioRxiv on 27 Jun 2026.