Authors
Or Livne, Keren Yizhak
Published in
Bioinformatics (Oxford, England). Jun 24, 2026. Epub Jun 24, 2026.
Abstract
Formalin-fixed paraffin-embedded (FFPE) tissues are widely used in clinical and research settings, yet their use for detecting somatic mutations from RNA sequencing (RNA-seq) is hindered by artefactual mutations introduced by cytosine deamination and strand-specific damage. Existing FFPE noise-filtering tools are tailored to DNA-seq and rely on strand bias, rendering them unsuitable for RNA-seq. Here, we present FFixR, a machine learning-based framework that filters FFPE-induced artefacts from RNA-seq data without requiring matched-normal samples.
Trained on FFPE melanoma samples with matched DNA, FFixR leverages allele-specific read counts, variant features, and mutational signature probabilities. FFixR removed up to 98% of artefactual mutations while maintaining ∼92% recall of true variants. SHAP analysis revealed key feature interactions guiding model decisions. When applied to independent cohorts, FFixR restored the correlation between RNA- and DNA-derived tumor mutational burden (R² = 0.881) and recovered biologically meaningful mutational signatures. FFixR enables accurate somatic variant calling from FFPE RNA-seq data, expanding the utility of archival samples for research and clinical applications.
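The two headline figures above (the share of artefactual mutations removed and the recall of true variants) can be illustrated with a minimal sketch. This is not FFixR's code: the function name `filter_metrics`, the input layout, and the toy labels are all hypothetical, and the sketch simply assumes that each candidate call already carries a truth label (for example, one derived from matched DNA) and a keep/discard decision from some filter.

```python
from typing import Iterable, Tuple


def filter_metrics(calls: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Summarise a filter's behaviour on labelled candidate calls.

    Each item is (is_true_variant, kept_by_filter). Both fields are
    hypothetical inputs for illustration, not FFixR outputs.
    Returns (artefact_removal_rate, recall_of_true_variants).
    """
    true_total = true_kept = 0
    artefact_total = artefact_removed = 0
    for is_true, kept in calls:
        if is_true:
            true_total += 1
            true_kept += int(kept)
        else:
            artefact_total += 1
            artefact_removed += int(not kept)
    if true_total == 0 or artefact_total == 0:
        raise ValueError("need at least one true variant and one artefact")
    return artefact_removed / artefact_total, true_kept / true_total


# Toy data (made up): 10 true variants, 20 artefacts.
calls = (
    [(True, True)] * 9      # true variants the filter kept
    + [(True, False)] * 1   # true variant the filter discarded
    + [(False, False)] * 19 # artefacts the filter removed
    + [(False, True)] * 1   # artefact that slipped through
)
removal, recall = filter_metrics(calls)
print(f"artefacts removed: {removal:.0%}, recall of true variants: {recall:.0%}")
```

The two rates are computed over different denominators (artefacts versus true variants), which is why a filter can report a high removal rate and a high recall at the same time.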
The FFixR tool is freely available at https://github.com/yizhak-lab-ccg/FFixR and https://doi.org/10.6084/m9.figshare.31998315. The repository also includes a README file describing the inputs, the outputs, and the entire pipeline. The results presented here were produced using v1.0.0.
Supplementary data are available at Bioinformatics online.
PMID: 42340671
Bibliographic data and abstract were imported from PubMed on 24 Jun 2026.