Authors
Wu, C.-i., Banda, K., Swisher, E., Sailem, H.
Abstract
Whole slide images (WSIs) contain hierarchical information ranging from cellular detail to tissue architecture, but their gigapixel scale poses major memory and computational challenges. Existing multi-scale graph and transformer models capture complex WSI features effectively but struggle with efficiency. We propose an Adaptive Multi-Scale Graph Transformer (AMGT) for WSI classification that addresses this limitation through two key modules: a Self-Guided Token Aggregation (SGTA) mechanism that fuses multi-resolution features to reduce redundancy, and a Prototypical Transformer (PT) that groups similar tokens into phenotype-representative prototypes with linear complexity. This design preserves essential spatial and semantic information while substantially lowering memory cost and improving interpretability through prototypical learning. AMGT achieves superior performance and efficiency, outperforming state-of-the-art models by 1.8% and 5.3% AUC on the high-grade ovarian cancer and Camelyon16 datasets, respectively. These results demonstrate AMGT's capacity for scalable, interpretable multi-scale representation learning.
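The abstract does not specify how the Prototypical Transformer groups tokens, so the following is only a minimal illustrative sketch of one common way to pool N tokens into K prototypes at a cost linear in N (N × K similarity scores instead of N × N self-attention). It is not the authors' implementation: the function name `prototype_pool`, the softmax soft assignment, the weighted-mean update, and the choice to initialise prototypes from sampled tokens are all assumptions made for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_pool(tokens, prototypes):
    """Hypothetical helper: softly assign N tokens to K prototypes.

    tokens:     list of N feature vectors of length d
    prototypes: list of K feature vectors of length d
    Returns (updated_prototypes, assignment), where assignment[n][k] is the
    weight of token n on prototype k. Only N*K dot products are computed,
    so the cost grows linearly with the number of tokens.
    """
    d = len(tokens[0])
    k = len(prototypes)
    scale = 1.0 / math.sqrt(d)

    # Scaled dot-product similarity of every token to every prototype.
    assignment = [
        softmax([scale * sum(t * p for t, p in zip(tok, proto))
                 for proto in prototypes])
        for tok in tokens
    ]

    # Total assignment mass per prototype, used to normalise the update.
    mass = [max(sum(row[j] for row in assignment), 1e-12) for j in range(k)]

    # Each updated prototype is the assignment-weighted mean of the tokens.
    updated = []
    for j in range(k):
        vec = [0.0] * d
        for row, tok in zip(assignment, tokens):
            w = row[j] / mass[j]
            for i in range(d):
                vec[i] += w * tok[i]
        updated.append(vec)
    return updated, assignment


# Toy example: 200 random "patch tokens" of dimension 16, pooled into 4 prototypes.
random.seed(0)
tokens = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
prototypes = random.sample(tokens, 4)  # assumption: initialise from data
updated, assignment = prototype_pool(tokens, prototypes)
```

In a sketch like this, the assignment weights are what would make the prototypes inspectable: each row shows how strongly a given patch token contributes to each prototype.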
Preprint server:
bioRxiv
The author list and abstract were imported from bioRxiv on 01 Nov 2025.