Comparative Analyses of Linear Regression & Neural Network Models for fMRI Autism Discrimination to Aid in Diagnosis

Nabil Salehiyan

Background

●Purpose: Create a method to assist clinicians and researchers in determining ASD diagnosis using data from fMRI scans and machine learning algorithms.

●Autism spectrum disorder (ASD) is a neurobiological disorder.

○Communication, behavioral, and social deficits

○1 in every 160 children is affected (Mellema et al., 2022)

○Occurs on a spectrum from mild to moderate to severe

●Many individuals require life-long support in several forms

○Treatment costs could rise to $1 trillion by 2025 if prevalence continues to rise as it has over the last decade (Leigh and Du, 2015)

●Early diagnosis and intervention are the most important factors for improving prognostic outcomes and reducing economic costs

Eslami et al., 2021

Problems with traditional diagnosis:

●Relatively subjective

○Unstructured clinical interviews with various informants

○Subjective rating scales

○Direct and indirect behavioral observation

●Lack of uniformity

●Potential to over-, under-, or misdiagnose the disorder

●Limited number of experts in the field

Machine Learning (ML) Methods as an accurate diagnostic tool:

●Incorporate neuroimaging features from structural and functional MRI (sMRI and fMRI) data to reveal putative known and unknown biomarkers

●Improve traditional diagnosis methods and contribute to understanding of ASD etiology

Other Approaches

Aghdam et al. (2019)

●Data: rs-fMRI of young children, 2D slice for each axis

●Methods: Mixture of experts with CNNs

●Train accuracy:

○70.45% with Adam

○70% with Adamax

Reiter et al. (2021)

●Data: rs-fMRI of 6- to 18-year-olds

●Methods: Random forest

●Validation accuracy:

○Ranged from 62.5% to 73.75%

Mellema et al. (2022)

●Data: fMRI and sMRI features, processed with fconn1000 pipeline

●Methods: compared 12 models

●AUROC with DFNN:

○Ranged from 79% to 80.4%

Our Approach

Our dataset consisted of rs-fMRI data from 6- to 18-year-olds. We applied several machine learning techniques and compared them to determine which model was the most accurate. Multiple methods achieved higher accuracy on our dataset than previous papers reported on theirs; the two best were a logistic regression model (C = 0.9, Newton-CG solver) and a neural network with 5 hidden layers of 20 nodes each.

Methods: Data Set

●283 subjects from the National Database for Autism Research (123 ASD & 160 TD)

●Dekhil et al. (2018) assigned every subject a membership score for 4 brain regions to discriminate ASD/TD

●80/20 train/test split, motivated by the Pareto principle that roughly 80% of consequences come from 20% of causes
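The split described above can be sketched as follows. This is a minimal illustration in Python's scikit-learn (the study reports using Julia), and it rests on several assumptions: the feature matrix is random placeholder data with 4 columns standing in for the membership scores, and the stratification and random seed are illustrative choices, not details from the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 283 subjects x 4 features (assumed layout).
# Values are random stand-ins, NOT the study's membership scores.
rng = np.random.default_rng(0)
n_asd, n_td = 123, 160
X = rng.random((n_asd + n_td, 4))
y = np.array([1] * n_asd + [0] * n_td)  # 1 = ASD, 0 = TD

# 80/20 train/test split. Stratifying on y (an assumption) keeps the
# ASD/TD ratio similar in the training and test portions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print("train:", X_train.shape, "test:", X_test.shape)
```

With 283 subjects, a 0.2 test fraction leaves 226 subjects for training and 57 for testing.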

Methods: Preprocessing

Dekhil et al. (2018). Using resting state functional MRI to build a personalized autism diagnosis system. PLOS ONE, 13(10)

Methods: Algorithms

Models

●Implemented in Julia using the scikit-learn package

●LogisticRegression → limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) solver

●MLPClassifier → neural network with a single hidden layer of 5 neurons, trained with a stochastic gradient-based optimizer

●GridSearchCV → hyperparameter search for logistic regression

●GridSearchCV → hyperparameter search for the neural network

●KFold & cross_validate → all models evaluated with 5-fold cross-validation
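A minimal sketch of the two baseline models under 5-fold cross-validation, written against Python's scikit-learn because the class names above match that API (the study reports using Julia). The data are random placeholders, and the "adam" solver, iteration limit, shuffling, and seeds are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.neural_network import MLPClassifier

# Placeholder data (NOT the study's): 283 subjects, 4 assumed features.
rng = np.random.default_rng(0)
X = rng.random((283, 4))
y = np.array([1] * 123 + [0] * 160)  # 1 = ASD, 0 = TD

# 5-fold cross-validation; shuffling matters here because the
# placeholder labels are ordered (all ASD first, then all TD).
cv = KFold(n_splits=5, shuffle=True, random_state=0)

models = {
    # Baseline logistic regression with the L-BFGS solver.
    "logistic_regression": LogisticRegression(solver="lbfgs"),
    # Baseline network: one hidden layer of 5 neurons. "adam" is
    # scikit-learn's stochastic gradient-based optimizer (assumed here).
    "shallow_mlp": MLPClassifier(
        hidden_layer_sizes=(5,), solver="adam", max_iter=1000, random_state=0
    ),
}

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring="accuracy")
    acc = scores["test_score"]  # one accuracy value per fold
    results[name] = (acc.mean(), acc.std())
    print(f"{name}: mean accuracy = {acc.mean():.3f}, SD = {acc.std():.3f}")
```

Because the inputs are random, the printed accuracies hover near chance and say nothing about the study's results.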

Results: Baseline Models

Logistic Regression Model

●Precision: 89%

●Recall: 84%

●F1-score: 85%

●Accuracy: 87.6%, SD = 3.3%

Shallow Neural Network (5 nodes)

●Precision: 84%

●Recall: 81%

●F1-score: 81%

●Accuracy: 62.9%, SD = 23.2%
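For reference, the metrics reported above can be computed as in the sketch below (Python's scikit-learn, assumed for illustration). The labels, predictions, and fold accuracies are hypothetical, and the sketch does not claim to reproduce the averaging scheme the study used for precision, recall, and F1.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels and predictions (1 = ASD, 0 = TD), not study data.
# Confusion counts: TP = 3, FN = 1, FP = 1, TN = 5.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3/4
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 3/4
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two
accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 8/10

# Accuracy mean and SD across cross-validation folds
# (hypothetical per-fold accuracies).
fold_acc = np.array([0.85, 0.90, 0.88, 0.84, 0.93])
mean_acc, sd_acc = fold_acc.mean(), fold_acc.std()

print(precision, recall, f1, accuracy)
print(f"mean = {mean_acc:.3f}, SD = {sd_acc:.3f}")
```

A large SD across folds, as with the shallow network, indicates that accuracy depends strongly on which subjects land in the training folds.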

Best Logistic Regression Model (C = 0.9, Newton-CG)

Values shown as: Best (Baseline)

●Precision: 89% (89%)

●Recall: 84% (84%)

●F1-score: 85% (85%)

●Accuracy: 88.1% (87.6%)

○SD: 3.6% (3.3%)

Best Neural Network (SGD, 5 hidden layers of 20 nodes)

Values shown as: Best (Baseline)

●Precision: 87% (84%)

●Recall: 84% (81%)

●F1-score: 84% (81%)

●Accuracy: 88.0% (62.9%)

○SD: 3.9% (23.2%)
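A hyperparameter search of the kind that selects such "best" models could look like the sketch below, using GridSearchCV from Python's scikit-learn (assumed for illustration; the study reports using Julia). The grids are hypothetical: they include the reported winning settings (C = 0.9 with Newton-CG; SGD with 5 hidden layers of 20 nodes) alongside a few arbitrary alternatives, and the data are random placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier

# Placeholder data (NOT the study's): 283 subjects, 4 assumed features.
rng = np.random.default_rng(0)
X = rng.random((283, 4))
y = np.array([1] * 123 + [0] * 160)  # 1 = ASD, 0 = TD

cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Hypothetical grid over inverse regularization strength C and solver.
logreg_grid = {
    "C": [0.1, 0.5, 0.9, 1.0],
    "solver": ["lbfgs", "newton-cg", "liblinear"],
}
logreg_search = GridSearchCV(
    LogisticRegression(max_iter=1000), logreg_grid, cv=cv, scoring="accuracy"
)
logreg_search.fit(X, y)

# Hypothetical grid over network shape and optimizer.
mlp_grid = {
    "hidden_layer_sizes": [(5,), (20,) * 5],  # baseline vs 5 layers of 20
    "solver": ["adam", "sgd"],
}
mlp_search = GridSearchCV(
    MLPClassifier(max_iter=300, random_state=0),
    mlp_grid,
    cv=cv,
    scoring="accuracy",
)
mlp_search.fit(X, y)

print("logistic regression:", logreg_search.best_params_)
print("neural network:", mlp_search.best_params_)
```

On random data the selected parameters are arbitrary; the point is only the search structure, where each candidate is scored by its mean cross-validated accuracy.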

Discussion

●Best predictor: logistic regression model with inverse regularization strength of 0.9 and Newton-CG solver

●Neural network with 5 hidden layers of 20 nodes each was much less sensitive to changes in the training data than the baseline neural network

○SD of 3.9% rather than 23.2%

●Although accuracy scores were not high enough for these models to be used alone, use in conjunction with expert clinicians can help minimize human error

○Average accuracy of 88.1% with best logistic regression model

○Average accuracy of 88.0% with best neural network

Discussion

●Mellema et al. (2022): dense feedforward neural network (DFNN)

○maximum AUROC of 80.4% and 79%

●Mellema et al. (2022): SVM with a linear kernel and logistic regression with ridge regularization

○AUROC of 70.4% and 69.4%

●Aghdam et al. (2019):

○70.45% accuracy with Adam

○70% accuracy with Adamax

●Reiter et al. (2021):

○classification accuracies in validation samples were 62.5%, 65%, 70%, and 73.75%
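The prior results listed here are reported as AUROC in some cases and as accuracy in others. The sketch below (Python's scikit-learn, with hypothetical scores and an assumed 0.5 decision threshold) shows that the two metrics can differ for the same predictions, so cross-paper comparisons are approximate.

```python
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical labels and predicted probabilities (1 = ASD, 0 = TD).
y_true = [0, 0, 0, 1, 1, 1]
y_score = [0.1, 0.4, 0.6, 0.45, 0.7, 0.9]

# Accuracy needs hard labels, so it depends on a threshold (0.5 assumed).
y_pred = [int(s >= 0.5) for s in y_score]
accuracy = accuracy_score(y_true, y_pred)

# AUROC is threshold-free: the fraction of (ASD, TD) pairs in which
# the ASD subject receives the higher score.
auroc = roc_auc_score(y_true, y_score)

print(f"accuracy = {accuracy:.3f}, AUROC = {auroc:.3f}")
```

Here 4 of 6 subjects are classified correctly, while 8 of the 9 ASD/TD pairs are ranked correctly.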

Limitations

- Small sample size (283 subjects); a larger sample is needed

- Limited set of brain regions; other regions should be examined

- Other models may achieve higher accuracy

- Few deep learning options were explored

References

●Aghdam, M. A., Sharifi, A., & Pedram, M. M. (2019). Diagnosis of autism spectrum disorders in young children based on resting-state functional magnetic resonance imaging data using Convolutional Neural Networks. Journal of Digital Imaging, 32(6), 899–918. https://doi.org/10.1007/s10278-019-00196-1

●Dekhil, O., Hajjdiab, H., Shalaby, A., Ali, M. T., Ayinde, B., Switala, A., Elshamekh, A., Ghazal, M., Keynton, R., Barnes, G., & El-Baz, A. (2018). Using resting state functional MRI to build a personalized autism diagnosis system. PLOS ONE, 13(10). https://doi.org/10.1371/journal.pone.0206351

●Mellema, C. J., Nguyen, K. P., Treacher, A., & Montillo, A. (2022). Reproducible neuroimaging features for diagnosis of autism spectrum disorder with machine learning. Scientific Reports, 12(1). https://doi.org/10.1038/s41598-022-06459-2

●Reiter, M. A., Jahedi, A., Fredo, A. R., Fishman, I., Bailey, B., & Müller, R.-A. (2021). Performance of machine learning classification models of autism using resting-state fMRI is contingent on sample heterogeneity. Neural Computing and Applications, 33(8), 3299–3310. https://doi.org/10.1007/s00521-020-05193-y