Semi-blind machine learning for fMRI-based predictions of intelligence

The figure shows the prediction accuracies of our semi-blind machine learning approach (SML-EL) compared with the standard blind method (BML-EL), using data from the Amsterdam Open MRI Collection (AOMIC-ID1000). Various training set sizes were investigated. The test set size was 80 throughout. Note that the prediction accuracy initially increases with the size of the training set, but reaches a plateau at a sample size of about 600, beyond which it remains relatively constant. Also note that the semi-blind approach (SML-EL) surpasses the prediction accuracy of "Education" for training sample sizes larger than 300, whereas the blind approach (BML-EL) does not.

Identifying neuromarkers for cognitive abilities using fMRI has been a major focus of research in the past few years. However, it has recently been reported that many thousands of participants are required to obtain reproducible results (Marek et al., 2022). This appears to be a major impediment to obtaining neuromarkers from fMRI because large sample sizes are typically not available in neuroimaging studies. Here we show that the out-of-sample prediction accuracy can be dramatically improved by supplementing fMRI with readily available non-imaging information, so that reliable predictive modeling becomes feasible even for small sample sizes. Specifically, we introduce a novel machine learning method that predicts intelligence from resting-state fMRI data, leveraging educational level as supplementary information. We refer to our approach as "semi-blind machine learning (SML)" because it operates under the assumption that supplementary information, such as educational level, is available for subjects in both the training and test sets. This setup closely mirrors real-world scenarios, especially in clinical contexts, where patient background information typically exists and can be used to boost prediction accuracy. However, guarding against bias is crucial: subjects should not be categorized as more intelligent simply because of their higher educational level. Therefore, our approach contains a component explicitly designed for bias control. We have applied our method to three different data collections and observed marked improvements in prediction accuracy across a wide range of sample sizes. We anticipate that semi-blind machine learning provides a promising approach to fMRI-based predictive modeling, with the potential for a wide range of future applications.
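
The difference between the blind and the semi-blind setting can be illustrated with a small sketch. This is not the SML-EL method itself, which the abstract does not specify in detail: the example below assumes plain ridge regression on synthetic data, appends a stand-in "education" variable as one extra feature for subjects in both the training and test sets, and omits the bias-control component entirely. All function names, effect sizes, and feature counts are hypothetical; only the test set size of 80 is taken from the figure caption.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def pearson(a, b):
    """Pearson correlation between predicted and observed scores."""
    return float(np.corrcoef(a, b)[0, 1])

def compare_blind_vs_semi_blind(n_train=300, n_test=80, n_features=50, seed=0):
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    # Synthetic stand-ins (assumptions, not real data):
    X = rng.standard_normal((n, n_features))   # "fMRI" features
    edu = rng.standard_normal(n)               # "educational level"
    w_true = rng.standard_normal(n_features) / np.sqrt(n_features)
    # Hypothetical target: depends on both the imaging features and education.
    y = 0.5 * X @ w_true + 0.7 * edu + 0.5 * rng.standard_normal(n)
    train, test = slice(0, n_train), slice(n_train, n)

    # Blind: the model sees imaging features only.
    w, b = fit_ridge(X[train], y[train])
    blind = pearson(X[test] @ w + b, y[test])

    # Semi-blind: education is known for training AND test subjects.
    Z = np.column_stack([X, edu])
    w, b = fit_ridge(Z[train], y[train])
    semi = pearson(Z[test] @ w + b, y[test])
    return blind, semi

if __name__ == "__main__":
    blind, semi = compare_blind_vs_semi_blind()
    print(f"blind accuracy (r):      {blind:.2f}")
    print(f"semi-blind accuracy (r): {semi:.2f}")
```

Because the synthetic target is constructed to depend on the education variable, the semi-blind model is expected to score higher here; the sketch only shows how the two settings differ in what the model is allowed to see, and says nothing about the effect sizes reported for real data.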

Lohmann, Heczko, Mahler, Wang, Steiglechner, Kumar, Roost, Jost, Scheffler

Improving the reliability of fMRI-based predictions of intelligence via semi-blind machine learning.

bioRxiv, 2023
