Various machine-learning classification techniques have previously been employed to classify brain states in healthy and disease populations using functional magnetic resonance imaging (fMRI). These methods generally use supervised classifiers that are sensitive to outliers and require labeling of training data to generate a predictive model. Density-based clustering, which overcomes these issues, is a popular unsupervised learning approach whose utility for high-dimensional neuroimaging data has not been previously evaluated. Its advantages include insensitivity to outliers and the ability to work with unlabeled data. Unlike the popular k-means clustering, it does not require the number of clusters to be specified. In this study, we compare the performance of two popular density-based clustering methods, DBSCAN and OPTICS, in accurately identifying individuals with three stages of cognitive impairment, including Alzheimer’s disease. We used static and dynamic functional connectivity features for clustering, which capture the strength and temporal variation of brain connectivity, respectively. To assess the robustness of clustering to noise/outliers, we propose a novel method called recursive-clustering using additive-noise (R-CLAN). Results demonstrated that both clustering algorithms were effective, although OPTICS with dynamic connectivity features performed best in terms of cluster purity (95.46%) and robustness to noise/outliers. This study demonstrates that density-based clustering can accurately and robustly identify diagnostic classes in an unsupervised way using brain connectivity.
Since the successful emergence of functional neuroimaging, a new challenge has surfaced: can a strong correlation be established between brain activity (as measured by functional magnetic resonance imaging, fMRI) and the cognitive state of an individual? More specifically, can we accurately classify neurological diseases based on fMRI data? In response to this, machine learning classifiers have been employed on neuroimaging features to generate models that, within some accuracy, predict the cognitive and disease states to which new data belong. There are two major categories of machine learning classification techniques: supervised and unsupervised. Supervised learning, commonly used in fMRI studies, involves splitting the dataset into training and test data. Each member of the training data is given a ‘label’ indicating the class (or group) to which it belongs. The classifier is then ‘trained’ on this data to determine a generalized model (hence ‘supervised’ learning). The classification accuracy is then measured by testing the model on the test data with known labels. Conversely, in unsupervised learning, patterns within the entire dataset are used to ‘cluster’ the data without any pre-assigned labels, and cluster purity, rather than accuracy, is measured post hoc against the known ground truth. As such, unsupervised learning is agnostic to pre-assigned labels, and thus determines inherent classes instead of fitting a model based on classes provided by us. Although the ‘cluster purity’ given by clustering is, in principle, the same as the ‘classification accuracy’ given by supervised classifiers (both give the percentage of correct classifications against the known ground truth), we use the term ‘cluster purity’ in this work to highlight that this metric was obtained through unsupervised clustering, and not through conventional supervised classification.
fMRI studies generally use supervised learning methods to classify disease or cognitive states. Unsupervised learning methods are generally used for spatially and temporally clustering voxel signals to localize brain activity, for dimensionality reduction or feature selection before applying a supervised learning algorithm to the data, or for mapping specific brain patterns to a cognitive state. A few reports have adopted unsupervised learning methods as state classifiers across individuals.
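To make the cluster purity metric discussed in this section concrete, the following is a minimal Python sketch, not the implementation used in this study. It assumes that each cluster is credited with its most common ground-truth class, that unclustered points carry the label -1 (the convention followed by, for example, scikit-learn's DBSCAN and OPTICS), and that such noise points count as incorrect. The function name `cluster_purity` and the class labels "A", "B" and "C" are hypothetical placeholders.

```python
from collections import Counter, defaultdict

# Assumed convention: points left unclustered are labelled -1.
NOISE = -1


def cluster_purity(cluster_labels, true_labels, noise_label=NOISE):
    """Fraction of points that fall in a cluster whose majority
    ground-truth class matches their own class.

    Assumption for this sketch: noise points stay in the denominator
    but never count as correct.
    """
    if len(cluster_labels) != len(true_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("at least one point is required")

    # Tally ground-truth classes inside each (non-noise) cluster.
    members = defaultdict(Counter)
    for cluster, truth in zip(cluster_labels, true_labels):
        if cluster != noise_label:
            members[cluster][truth] += 1

    # Each cluster contributes the size of its majority class.
    correct = sum(max(counts.values()) for counts in members.values())
    return correct / len(true_labels)


# Toy example: nine subjects, three hypothetical diagnostic classes.
clusters = [0, 0, 0, 1, 1, 1, 2, 2, -1]
truth = ["A", "A", "B", "B", "B", "B", "C", "C", "C"]
print(f"cluster purity = {cluster_purity(clusters, truth):.3f}")  # 7/9 -> 0.778
```

In the toy example, clusters 0, 1 and 2 contribute 2, 3 and 2 correctly grouped subjects, and the single noise point contributes none, giving 7/9. The ground-truth labels are consulted only after clustering, which is what distinguishes this measure from the accuracy of a supervised classifier.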