Comparative Analyses of Linear Regression &
Neural Network Models for fMRI Autism
Discrimination to Aid in Diagnosis
Nabil Salehiyan
Background
Purpose: Create a method to assist clinicians and researchers in determining ASD diagnosis
using data from fMRI scans and machine learning algorithms.
Autism Spectrum Disorder (ASD) is a neurobiological disorder.
Communication, behavioral, and social deficits
1 in every 160 children is affected (Mellema et al., 2022)
Occurs on a spectrum from mild, moderate, to severe
Many individuals require life-long support in several forms
Treatment costs could rise to $1 trillion by 2025 if the prevalence rate continues to rise as seen
in the last decade (Leigh and Du, 2015)
Early intervention and diagnosis are most important for improving prognostic outcomes and
reducing economic costs
Eslami et al., 2021
Problems with traditional diagnosis:
Relatively subjective
Unstructured clinical interviews with various informants
Subjective rating scales
Direct and indirect behavioral observation
Lack of uniformity
Potential to over-, under-, and/or misdiagnose the disorder
Limited experts in the field
Machine Learning (ML) Methods
as an accurate diagnostic tool:
Incorporates neuroimaging features from structural
and functional MRI (sMRI and fMRI) data to reveal
putative known and unknown biomarkers
Improve traditional diagnosis methods and
contribute to understanding of ASD etiology
Other Approaches
Aghdam et al. (2019)
Data: rs-fMRI of young
children, 2D slice for
each axis
Methods: Mixture of
experts with CNNs
Train accuracy:
70.45% with Adam
70% with Adamax
Reiter et al. (2021)
Data: rs-fMRI of 6-18
year olds
Methods: Random
forest
Validation accuracy:
Ranged from 62.5%
to 73.75%
Mellema et al. (2022)
Data: fMRI and sMRI
features, processed with
fconn1000 pipeline
Methods: compared 12
models
AUROC with DFNN:
Ranged from 79% to
80.4%
Our Approach
Our study used rs-fMRI data from 6-18 year olds. We trained several machine learning models
and compared them to determine which was the most accurate. Several of our models achieved
higher accuracy on our dataset than previous papers reported on theirs. The best two were a
tuned logistic regression model and a neural network with 5 hidden layers of 20 nodes each.
Methods: Data Set
283 subjects from the National
Database of Autism Research (123
ASD & 160 TD)
Dekhil et al. (2018) assigned every
subject a membership score for 4 brain
regions to discriminate ASD/TD
80/20 train/test split, following the
Pareto principle that roughly 80% of
consequences come from 20% of
causes
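A minimal sketch of an 80/20 split on data shaped like the set described above. The slides name scikit-learn called from Julia; this sketch assumes Python's scikit-learn instead. The feature values are random placeholders, and stratifying on the label is an assumption the slides do not state.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 283 subjects, 4 regional membership scores each.
# The real scores come from Dekhil et al. (2018); these values are random.
rng = np.random.default_rng(0)
X = rng.random((283, 4))
y = np.array([1] * 123 + [0] * 160)  # 1 = ASD, 0 = TD

# 80% train / 20% test. stratify=y keeps the ASD/TD ratio similar in both
# parts (an assumption, not something the slides specify).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print("train subjects:", len(X_train))
print("test subjects:", len(X_test))
print("ASD share in train:", round(y_train.mean(), 3))
print("ASD share in test:", round(y_test.mean(), 3))
```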
Methods: Preprocessing
Dekhil et al. (2018). Using
resting state functional MRI
to build a personalized
autism diagnosis system.
PLOS ONE, 13(10)
Methods: Algorithms
Models
Julia using the scikit-learn package
LogisticRegression: limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) solver
MLPClassifier: single hidden layer of 5 neurons, trained with a stochastic gradient-based
optimizer
GridSearchCV → logistic regression
GridSearchCV → neural network
KFold & cross_validate → all models evaluated with 5-fold cross-validation
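The baseline setup listed above could look roughly like the sketch below. It assumes Python's scikit-learn rather than the Julia wrapper, uses synthetic placeholder data, and leaves MLPClassifier on its default solver; `max_iter`, `random_state`, and fold shuffling are illustrative choices, not settings taken from the slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.neural_network import MLPClassifier

# Synthetic placeholder data with the same shape as the study's data set.
rng = np.random.default_rng(0)
y = np.array([1] * 123 + [0] * 160)  # 1 = ASD, 0 = TD
X = rng.normal(loc=y[:, None] * 1.0, scale=1.0, size=(283, 4))

models = {
    # Baseline logistic regression with the L-BFGS solver.
    "logistic_regression": LogisticRegression(solver="lbfgs"),
    # Baseline network: one hidden layer of 5 neurons.
    "shallow_mlp": MLPClassifier(
        hidden_layer_sizes=(5,), max_iter=500, random_state=0
    ),
}

# 5-fold cross-validation; report mean accuracy and its SD across folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
summary = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring="accuracy")
    fold_accuracy = scores["test_score"]
    summary[name] = (fold_accuracy.mean(), fold_accuracy.std())
    print(f"{name}: mean={summary[name][0]:.3f}, sd={summary[name][1]:.3f}")
```

The SD across folds is the quantity reported next to each accuracy in the results: a large value means the score depends heavily on which subjects land in the training folds.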
Results: Baseline Models
Logistic Regression Model
Precision: 89%
Recall: 84%
F1-score: 85%
Accuracy: 87.6%, SD = 3.3%
Shallow Neural Network (5 nodes)
Precision: 84%
Recall: 81%
F1-score: 81%
Accuracy: 62.9%, SD = 23.2%
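For reference, the four reported metrics can be computed from confusion-matrix counts as in the sketch below. The counts are hypothetical, and the slides do not say whether precision, recall, and F1 were averaged across classes, so this shows only the single-class (ASD as positive) definitions.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1, and accuracy from confusion counts."""
    precision = tp / (tp + fp)           # share of predicted ASD that is ASD
    recall = tp / (tp + fn)              # share of true ASD that was found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a 57-subject test set (not results from the study).
metrics = classification_metrics(tp=20, fp=5, fn=5, tn=27)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```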
Best Logistic Regression Model (C = 0.9, Newton-CG)
Values shown as Best (Baseline)
Precision: 89% (89%)
Recall: 84% (84%)
F1-score: 85% (85%)
Accuracy: 88.1% (87.6%)
SD: 3.6% (3.3%)
Best Neural Network (SGD, 5 hidden layers of 20 nodes)
Values shown as Best (Baseline)
Precision: 87% (84%)
Recall: 84% (81%)
F1-score: 84% (81%)
Accuracy: 88.0% (62.9%)
SD: 3.9% (23.2%)
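A hedged sketch of how a GridSearchCV search might arrive at settings like those above. It assumes Python's scikit-learn and synthetic placeholder data; the parameter grids are illustrative guesses, since the slides report only the winning settings and not the values that were searched.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier

# Synthetic placeholder data with the same shape as the study's data set.
rng = np.random.default_rng(0)
y = np.array([1] * 123 + [0] * 160)  # 1 = ASD, 0 = TD
X = rng.normal(loc=y[:, None] * 1.0, scale=1.0, size=(283, 4))

cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Hypothetical grid over inverse regularization strength C and solver.
logreg_grid = {
    "C": [0.1, 0.5, 0.9, 1.0],
    "solver": ["lbfgs", "newton-cg", "liblinear"],
}
logreg_search = GridSearchCV(
    LogisticRegression(max_iter=1000), logreg_grid, cv=cv, scoring="accuracy"
)
logreg_search.fit(X, y)

# Hypothetical grid over optimizer and network shape:
# (5,) is one hidden layer of 5 nodes; (20,) * 5 is five layers of 20 nodes.
mlp_grid = {
    "solver": ["adam", "sgd"],
    "hidden_layer_sizes": [(5,), (20,) * 5],
}
mlp_search = GridSearchCV(
    MLPClassifier(max_iter=300, random_state=0),
    mlp_grid,
    cv=cv,
    scoring="accuracy",
)
mlp_search.fit(X, y)

print("logistic regression:", logreg_search.best_params_)
print("neural network:", mlp_search.best_params_)
```

On placeholder data the selected parameters carry no meaning; the point is the search structure, in which each candidate is scored by mean cross-validated accuracy.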
Discussion
Best predictor: logistic regression
model with inverse regularization
strength of 0.9 and Newton-CG
solver
Neural network with 5 hidden layers
of 20 nodes each was much less sensitive
to training data changes than
the baseline neural network
SD of 3.9% rather than 23.2%
Although accuracy scores were not
high enough for these models to be
used alone, use in conjunction with
expert clinicians can help minimize
human error
Average accuracy of 88.1% with
best logistic regression model
Average accuracy of 88.0% with
best neural network
Discussion
Mellema et al. (2022): dense
feedforward neural network (DFNN)
maximum AUROC of 80.4% and
79%
Mellema et al. (2022): SVM with a
linear kernel and logistic regression
with ridge regularization
AUROC of 70.4% and 69.4%
Aghdam et al. (2019):
70.45% accuracy with Adam
70% accuracy with Adamax
Reiter et al. (2021):
classification accuracies in
validation samples were
62.5%, 65%, 70%, and
73.75%.
Limitations and Future Directions
- Larger sample size
- Various brain regions
- Different models for higher accuracy
- More deep learning options
References
Aghdam, M. A., Sharifi, A., & Pedram, M. M. (2019). Diagnosis of autism spectrum disorders in young
children based on resting-state functional magnetic resonance imaging data using Convolutional Neural
Networks. Journal of Digital Imaging, 32(6), 899–918. https://doi.org/10.1007/s10278-019-00196-1
Dekhil, O., Hajjdiab, H., Shalaby, A., Ali, M. T., Ayinde, B., Switala, A., Elshamekh, A., Ghazal, M., Keynton,
R., Barnes, G., & El-Baz, A. (2018). Using resting state functional MRI to build a personalized autism
diagnosis system. PLOS ONE, 13(10). https://doi.org/10.1371/journal.pone.0206351
Mellema, C. J., Nguyen, K. P., Treacher, A., & Montillo, A. (2022). Reproducible neuroimaging features for
diagnosis of autism spectrum disorder with machine learning. Scientific Reports, 12(1).
https://doi.org/10.1038/s41598-022-06459-2
Reiter, M. A., Jahedi, A., Fredo, A. R., Fishman, I., Bailey, B., & Müller, R.-A. (2020). Performance of
machine learning classification models of autism using resting-state fMRI is contingent on sample
heterogeneity. Neural Computing and Applications, 33(8), 3299–3310.
https://doi.org/10.1007/s00521-020-05193-y