From regression to machine learning: improving prediction of rectal cancer recurrence.

Created on 03 Jul 2026

Authors

Thanat Tantinam, Ekkarin Supatrakul, Pawit Sutharat, Suwan Sanmee, Kullawat Bhatanaprabhabhan, Boonchai Ngamsirimas, Nataphon Santrakul, Rangsima Thiengthiantham, Punnawat Chandrachamnong, Suradet Buakhrun, Sarawut Ramjan

Published in

Annals of coloproctology. Volume 42. Issue 3. Pages 293-302. Epub Jun 26, 2026.

Abstract

Recurrence after curative-intent treatment of rectal cancer remains a major clinical challenge. Conventional regression models may not adequately capture complex clinical interactions, whereas machine learning algorithms may improve predictive accuracy. This study compared regression-based and machine learning models for predicting recurrence using routinely collected clinical data.
A retrospective cohort of 581 patients with rectal adenocarcinoma treated between 2013 and 2022 at 2 university hospitals in Thailand was analyzed. Seventeen demographic, pathological, and treatment-related features were used to develop logistic regression (LR), random forest (RF), support vector machine, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) models. Data were split using stratified train-test sampling (80:20). Fivefold cross-validation, calibration, and decision curve analyses were performed. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), average precision (AP), F1 score, sensitivity, specificity, and feature importance.
All models demonstrated moderate discriminative performance. AUC values ranged from 0.626 (LightGBM) to 0.662 (XGBoost), while AP scores ranged from 0.483 (LightGBM) to 0.584 (RF). LR achieved perfect sensitivity (1.000) at the F1-optimized threshold, RF provided the strongest balance between precision and recall, and XGBoost achieved the highest AUC. Key predictors included carcinoembryonic antigen level, nodal status, surgical margin status, tumor grade, lymphovascular invasion, and the number of lymph nodes retrieved.
Machine learning models modestly outperformed regression-based approaches, although their clinical value lies primarily in tailoring model strengths to specific priorities. Selecting LR, RF, or XGBoost according to objectives such as maximizing sensitivity, achieving balanced precision-recall, or enabling risk stratification may enhance recurrence surveillance in rectal cancer.
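
The abstract reports AUC alongside sensitivity at an "F1-optimized threshold". The sketch below shows one common way such quantities can be computed. It is not the authors' code. The labels and scores are invented toy values, and the function names (`roc_auc`, `f1_optimal_threshold`, `sensitivity_specificity`) are hypothetical helpers written for this illustration. It assumes the threshold is chosen by scanning every observed score and keeping the one with the highest F1, which may differ from the procedure used in the study.

```python
from typing import List, Tuple


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion(y_true: List[int], scores: List[float], threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f1_optimal_threshold(y_true: List[int], scores: List[float]) -> Tuple[float, float]:
    """Scan every observed score as a candidate threshold; keep the one with the highest F1."""
    best_t, best_f1 = min(scores), -1.0
    for t in sorted(set(scores)):
        tp, fp, _tn, fn = confusion(y_true, scores, t)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def sensitivity_specificity(y_true: List[int], scores: List[float], threshold: float) -> Tuple[float, float]:
    """Sensitivity = tp / (tp + fn); specificity = tn / (tn + fp)."""
    tp, fp, tn, fn = confusion(y_true, scores, threshold)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Toy data (invented for illustration): 1 = recurrence, 0 = no recurrence.
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p = [0.90, 0.60, 0.50, 0.35, 0.70, 0.40, 0.30, 0.20, 0.10, 0.05]

auc = roc_auc(y, p)
threshold, f1 = f1_optimal_threshold(y, p)
sens, spec = sensitivity_specificity(y, p, threshold)
print(f"AUC={auc:.3f} threshold={threshold} F1={f1:.3f} sens={sens:.3f} spec={spec:.3f}")
```

On this toy data the F1-maximizing threshold is low enough to capture every positive case, so sensitivity reaches 1.0 while specificity falls to about two thirds. A model can therefore show perfect sensitivity at an F1-optimized threshold without having the highest AUC, because AUC summarizes ranking quality across all thresholds rather than at one operating point.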

PMID: 42392853
Bibliographic data and abstract were imported from PubMed on 03 Jul 2026.
