Brain2VLM: Hierarchical Alignment Between Cortical Representations and Vision-Language Latent Spaces

Authors

Pritam, N. A. A., O, J. S., Jain, S.

Abstract

This work introduces Brain2VLM, a framework for analyzing how cortical representations align with latent spaces of pretrained diffusion-based vision-language models for brain-to-image reconstruction. While recent approaches achieve strong performance by mapping functional Magnetic Resonance Imaging (fMRI) signals to model latents, the structure of this mapping remains poorly understood. We hypothesize that brain-to-latent alignment is hierarchical, with early visual cortex exhibiting approximately linear correspondence to structural diffusion latents, and higher-order visual areas requiring nonlinear mappings to align with semantic embedding spaces. To test this, we decode diffusion latents and CLIP embeddings from fMRI signals using both linear ridge regression and a nonlinear residual MLP on the Natural Scenes Dataset. Our results reveal that nonlinear decoding provides only marginal improvements for diffusion latents ({Delta} {approx} 0.05 - 0.06 in correlation), but yields substantial gains for semantic embeddings ({Delta} {approx} 0.47), significantly improving distributional alignment (MMD: 0.042 vs 0.358). However, increased decoder expressivity can introduce shifts in latent distributions, highlighting a trade-off between prediction accuracy and generative compatibility. Despite using a simple reconstruction pipeline, Brain2VLM achieves strong performance (PixCorr: 0.33, CLIP: 85%), suggesting that improvements in brain-to-latent alignment play an important role in reconstruction quality alongside generative modeling. These findings provide empirical evidence for hierarchical alignment between cortical representations and model latent spaces, positioning the brain-to-latent interface as a primary bottleneck in brain decoding systems. Our code can be found at https://github.com/adarsh-crafts/Brain2VLM

Preprint server: bioRxiv
The authors list and abstract were imported from bioRxiv on 25 Apr 2026.

Sign up!

Did you like this preprint? Sign up with Life Science Network.
If you already have a Life Science Network account, sign in, or connect with LinkedIn, Google.

Stats

Community rating n/a 0 votes

1-terrible, 9-excellent. How would you rate this preprint? Sign in in to submit your rating.

Recommendations n/a n/a positive of 0 vote(s)
Views 15
Comments 0

Comments

There are no comments yet.