SOORENA: Self-lOOp containing or autoREgulatory Nodes in biological network Analysis

Authors

Arar, H., Aldahdooh, J., Nickchi, P., Jafari, M.

Abstract

Autoregulatory mechanisms, in which proteins modify their own activity or expression, are fundamental components of biological regulatory systems but remain challenging to identify systematically within the scientific literature. Manual curation cannot keep pace with publication growth, and self-regulation is often described only implicitly. To address the lack of automated tools for identifying protein autoregulatory mechanisms, we present SOORENA, a two-stage transformer-based model designed to predict and classify such mechanisms within PubMed abstracts. In Stage 1, the model determines whether a publication describes any form of protein autoregulation. In Stage 2, positive instances are further classified into one of seven mechanistic categories: autophosphorylation, autoubiquitination, autocatalytic activity, autoinhibition, autolysis, autoinducer production, and autoregulation. SOORENA was fine-tuned from PubMedBERT using a curated dataset of 1,332 experimentally validated abstracts sourced from UniProt-referenced publications. On a held-out test set, Stage 1 achieved an accuracy of 96.0% and a precision of 97.8%, which limits the number of false positives passed on to Stage 2. Stage 2 performed well across all classes, with an overall accuracy of 95.5% and a macro-F1 score of 96.2%, including perfect classification for the two least-represented categories. Error analysis revealed that most misclassifications occurred between mechanistically related categories, suggesting that the model's learned representations reflect underlying biological relationships. We deployed SOORENA as a Shiny app enabling interactive search, metadata-based filtering, and confidence-weighted prioritization of predictions alongside standardized ontology definitions to support scientific exploration. These results demonstrate that domain-specific language models can scale the discovery and curation of biologically critical self-regulatory mechanisms.
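
The two-stage cascade described in the abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the `Prediction` type, the `classify_abstract` function, the 0.5 decision threshold, and the keyword-based `toy_stage1` and `toy_stage2` stand-ins are all assumptions introduced here. In SOORENA itself, both stages are fine-tuned PubMedBERT classifiers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# The seven Stage 2 categories named in the abstract.
MECHANISMS = [
    "autophosphorylation",
    "autoubiquitination",
    "autocatalytic activity",
    "autoinhibition",
    "autolysis",
    "autoinducer production",
    "autoregulation",
]


@dataclass
class Prediction:
    """Hypothetical result record for one abstract."""
    has_autoregulation: bool
    stage1_confidence: float
    mechanism: Optional[str] = None
    stage2_confidence: Optional[float] = None


def classify_abstract(
    text: str,
    stage1: Callable[[str], float],
    stage2: Callable[[str], List[float]],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Prediction:
    """Run Stage 1, and run Stage 2 only on Stage 1 positives."""
    p_positive = stage1(text)
    if p_positive < threshold:
        # Negative abstracts never reach Stage 2.
        return Prediction(False, 1.0 - p_positive)
    probs = stage2(text)
    best = max(range(len(MECHANISMS)), key=probs.__getitem__)
    return Prediction(True, p_positive, MECHANISMS[best], probs[best])


def toy_stage1(text: str) -> float:
    """Stand-in for the binary model: a crude keyword check."""
    return 0.9 if "auto" in text.lower() else 0.1


def toy_stage2(text: str) -> List[float]:
    """Stand-in for the seven-class model: keyword-weighted scores."""
    lowered = text.lower()
    weights = [10.0 if name.split()[0] in lowered else 1.0 for name in MECHANISMS]
    total = sum(weights)
    return [w / total for w in weights]


positive = classify_abstract(
    "The kinase undergoes autophosphorylation on tyrosine residues.",
    toy_stage1,
    toy_stage2,
)
negative = classify_abstract("This protein binds DNA.", toy_stage1, toy_stage2)
print(positive.mechanism, negative.has_autoregulation)
```

Gating Stage 2 on the Stage 1 decision is why Stage 1 precision matters: any false positive from Stage 1 is necessarily assigned one of the seven mechanisms. The two confidence fields could also feed a confidence-weighted ranking of predictions like the one the abstract mentions, although the exact weighting used in the app is not given here.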

Preprint server: bioRxiv
The author list and abstract were imported from bioRxiv on 05 Nov 2025.


