### Build a Classification Model using Monthly Changes in Radiology Imaging and Clinical Data

Multiple studies, such as Lahmiri and Shmuel (2018) and Zhang and Sejdić (2019), apply various machine learning algorithms to find the model that best classifies the stages of Alzheimer's disease. In this implementation, features are selected based on two papers (Goyal et al., 2018; Kruthika et al., 2019a). The main categories of features are radiology imaging measures, e.g., MRI, PET and CSF measures, and clinical data, e.g., cognitive tests and information about the patient. New features are created to capture the monthly change in the selected features, following Young et al. (2014b), which also modelled changes in biomarkers over time. Missing values are replaced with 0.0 and the values are scaled. The categorical variables, e.g., gender and marital status, are converted into dummy variables. The data is then divided into training and test sets. The models are taken from the scikit-learn library and trained on the new features and dummy variables. The algorithms use the one-vs-rest (OvR) strategy for training, and the test set is used for evaluation.
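
A minimal sketch of this preprocessing pipeline, using pandas and scikit-learn; the column names and values below are made-up stand-ins (the real feature names are assumptions, not taken from the study's data):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy longitudinal data: two visits per patient (hypothetical columns).
df = pd.DataFrame({
    "patient_id": [1, 1, 2, 2, 3, 3],
    "month":      [0, 6, 0, 6, 0, 6],
    "hippocampus_volume": [7.1, 6.9, 6.5, 6.0, np.nan, 5.8],
    "mmse_score": [29, 28, 26, 24, 30, 29],
    "gender": ["F", "F", "M", "M", "F", "F"],
    "diagnosis": ["NL", "NL", "MCI", "Dementia", "NL", "NL"],
})

# Monthly change per biomarker: difference between consecutive visits
# of the same patient, divided by the months elapsed.
num_cols = ["hippocampus_volume", "mmse_score"]
change = df.groupby("patient_id")[num_cols + ["month"]].diff()
for c in num_cols:
    df[c + "_monthly_change"] = change[c] / change["month"]

df = df.fillna(0.0)                          # missing values -> 0.0
df = pd.get_dummies(df, columns=["gender"])  # categorical -> dummies

feature_cols = [c for c in df.columns if c not in ("patient_id", "diagnosis")]
X = StandardScaler().fit_transform(df[feature_cols])  # scale the values
y = df["diagnosis"]

# Divide into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
```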

#### Implementation, Evaluation and Result of Logistic Regression

Logistic regression calculates the probability of an event for a categorical target variable. It is implemented with the scikit-learn function LogisticRegression(), with the multiclass option set to one-vs-rest (OvR) and the penalty set to L1 (LASSO) regularization, as used by Lee et al. (2016). The model resulted in an average AUROC score of 0.595.
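
A sketch of this setup on synthetic stand-in data (an assumption, not the study's features): LASSO corresponds to the L1 penalty in LogisticRegression, which requires the liblinear or saga solver, and the OneVsRestClassifier wrapper makes the OvR strategy explicit:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# L1 (LASSO) regularization needs the liblinear or saga solver; the
# OvR wrapper fits one binary classifier per class.
clf = OneVsRestClassifier(LogisticRegression(penalty="l1", solver="liblinear"))
clf.fit(X_tr, y_tr)

# Macro-averaged one-vs-rest AUROC, the metric reported in the text.
auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```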

#### Implementation, Evaluation and Result of Linear Discriminant Analysis

Linear discriminant analysis estimates the probability of an input belonging to each class and was used by Mehdipour Ghazi et al. (2019). It is implemented with the scikit-learn function LinearDiscriminantAnalysis(). Singular value decomposition is used as the solver because it does not compute the covariance matrix. The model resulted in an average AUROC score of 0.593.
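
A comparable sketch for LDA, again on synthetic stand-in data (an assumption); `solver="svd"` is scikit-learn's default and avoids forming the covariance matrix:

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The SVD solver skips the covariance matrix, so it also works when
# there are more features than samples.
clf = LinearDiscriminantAnalysis(solver="svd").fit(X_tr, y_tr)

# Macro-averaged one-vs-rest AUROC.
auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```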

#### Implementation, Evaluation and Result of K-nearest Neighbours

K-nearest neighbours assigns an input to the most common class among its nearest neighbours. It is implemented with the scikit-learn function KNeighborsClassifier(), using five nearest neighbours with all points in a neighbourhood weighted equally. The model resulted in an average AUROC score of 0.612.
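
A sketch of the KNN configuration on synthetic stand-in data (an assumption); `n_neighbors=5` and `weights="uniform"` are the settings described above:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Five neighbours, all weighted equally; predict_proba is then the
# fraction of the five neighbours that belong to each class.
clf = KNeighborsClassifier(n_neighbors=5, weights="uniform").fit(X_tr, y_tr)

auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```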

#### Implementation, Evaluation and Result of Decision Tree

A decision tree splits the data through a cascade of questions based on the most discriminative contrasts in the input. It is implemented with the scikit-learn function DecisionTreeClassifier(). The "best" strategy is used to select the split at each node, and "gini" is the function used to measure the quality of a split. The model resulted in an average AUROC score of 0.651.
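
A sketch with those splitter and criterion settings, again on synthetic stand-in data (the data and the fixed seed are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# "best" chooses the best split at each node; "gini" scores split quality.
clf = DecisionTreeClassifier(splitter="best", criterion="gini",
                             random_state=0).fit(X_tr, y_tr)

auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```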

#### Implementation, Evaluation and Result of Random Forest

Random forest is an ensemble of decision trees. It is implemented with the scikit-learn function RandomForestClassifier(). The number of trees is 10 and the maximum depth of each tree is set to 1. The model resulted in an average AUROC score of 0.581.
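
A sketch of that configuration on synthetic stand-in data (an assumption); with `max_depth=1` each tree is a decision stump, which limits what the ensemble can express and may explain the lower score:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 10 trees, each limited to depth 1 (a stump per tree).
clf = RandomForestClassifier(n_estimators=10, max_depth=1,
                             random_state=0).fit(X_tr, y_tr)

auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```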

#### Implementation, Evaluation and Result of Gaussian Naive Bayes

Naive Bayes predicts the class of an input using conditional probability. Gaussian naive Bayes is implemented because some of the features are continuous, alongside the categorical values present in the data. It is implemented with the scikit-learn function GaussianNB() using default parameters. The model resulted in an average AUROC score of 0.524.
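
A sketch of the default GaussianNB() fit on synthetic stand-in data (an assumption); the model fits one Gaussian per feature and class, then classifies via Bayes' rule:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Default parameters, as in the text.
clf = GaussianNB().fit(X_tr, y_tr)

auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```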

#### Implementation, Evaluation and Result of Support Vector Machine

A support vector machine (SVM) finds the hyperplane that best separates the classes. It is implemented with the scikit-learn function SVC() using the library's default parameters. The model resulted in an average AUROC score of 0.60.
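
A sketch on synthetic stand-in data (an assumption). One caveat: SVC() with default parameters does not expose predict_proba, so `probability=True` is added here purely to make the AUROC computation possible; that flag is an addition of this sketch, not part of the defaults described above:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# probability=True enables predict_proba (via Platt scaling); everything
# else is left at the library defaults (RBF kernel, C=1.0).
clf = SVC(probability=True, random_state=0).fit(X_tr, y_tr)

auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```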

The figure below is a normalized confusion matrix for SVM.

It is a visual representation in which each row denotes the true class and each column the predicted class, so it measures performance at a fixed threshold. The diagonal elements give the fraction of each class that is correctly predicted, i.e., 0.72 for normal (NL), 0.16 for MCI and 0.59 for dementia, while the off-diagonal elements show how often a class is confused with the others. SVM is not an appropriate model for the selected features because the diagonal values are low when the classifier's threshold is fixed at 0.5.
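
A matrix of this kind can be computed as sketched below on synthetic stand-in data (an assumption); `normalize="true"` divides each row by its total, so every row sums to 1 and the diagonal holds the per-class recall:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC().fit(X_tr, y_tr)

# Rows are true classes, columns predicted classes; normalizing by the
# true class makes each row sum to 1.
cm = confusion_matrix(y_te, clf.predict(X_te), normalize="true")
```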

The figure below is the AUROC curve for SVM; the AUROC measures the probability that a randomly chosen positive sample is ranked above a randomly chosen negative sample.

A random model obtains an AUROC score of 0.5, so a useful classifier should score above 0.5. The curve measures the performance of the model without fixing the threshold: it plots a point for every possible threshold and therefore helps in selecting a threshold suited to the use case. With the threshold unfixed, SVM is better at predicting dementia against normal and MCI (AUROC score of 0.66) and normal against MCI and dementia (AUROC score of 0.64). However, it is only slightly better than random at predicting MCI against normal and dementia, with an AUROC score of 0.54.
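
The per-class curves behind such a figure can be computed as sketched here on synthetic stand-in data (an assumption); each class is binarized against the other two, and `probability=True` is added to SVC only so that probability scores are available:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.svm import SVC

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

proba = SVC(probability=True, random_state=0).fit(X_tr, y_tr).predict_proba(X_te)

# One-vs-rest ROC curve and AUROC for each class: every point on a
# curve corresponds to one possible decision threshold.
y_bin = label_binarize(y_te, classes=[0, 1, 2])
curves, scores = {}, {}
for k in range(3):
    fpr, tpr, thresholds = roc_curve(y_bin[:, k], proba[:, k])
    curves[k] = (fpr, tpr)
    scores[k] = roc_auc_score(y_bin[:, k], proba[:, k])
```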

#### Implementation, Evaluation and Result of Neural Network

A neural network loosely simulates the human brain. It is implemented with the scikit-learn function MLPClassifier(), with a single hidden layer of 100 neurons and the "relu" activation function. The model resulted in an average AUROC score of 0.619.
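
A sketch of this configuration on synthetic stand-in data (an assumption); `max_iter` is raised here only so the toy run converges, which is an addition of this sketch:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One hidden layer of 100 neurons with ReLU activation, as in the text;
# the higher max_iter is an assumption to ensure convergence.
clf = MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                    max_iter=1000, random_state=0).fit(X_tr, y_tr)

auroc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```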

#### Comparison of Developed Models

The figure below shows the average multiclass AUROC score (CADDementia, 2014) for the different machine learning algorithms. This metric assigns equal weight to the classification of each class.

It shows that decision tree, neural network and SVM are the top three performing models, with decision tree performing best. A decision tree maps non-linear relationships and is easy to interpret; however, it tends to overfit and does not handle non-numeric data well. SVM also separates classes in a multi-dimensional space but is equally prone to overfitting.

The figure below shows the AUROC score per class for the different algorithms.

The AUROC score per class measures how well a model distinguishes one class from the others. For example, SVM distinguishes normal patients from the other two classes with an AUROC score of 0.605. The figure also shows that all the algorithms achieve higher AUROC scores for normal and dementia than for MCI.
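
The per-class scores underlying such a comparison can be computed as sketched below on synthetic stand-in data (the dataset, the two example models and the seed are all assumptions of this sketch):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import label_binarize
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class stand-in for the NL / MCI / dementia data (assumption).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

y_bin = label_binarize(y_te, classes=[0, 1, 2])
per_class = {}
for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("knn", KNeighborsClassifier())]:
    proba = clf.fit(X_tr, y_tr).predict_proba(X_te)
    # AUROC of each class against the other two (one-vs-rest).
    per_class[name] = [roc_auc_score(y_bin[:, k], proba[:, k])
                       for k in range(3)]
```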
