Dataset from Study: Pierre Bellec, Carlton Chu, François Chouinard-Decorte, Yassine Benhajali, Daniel S. Margulies, R. Cameron Craddock (2017). The Neuro Bureau ADHD-200 Preprocessed repository. NeuroImage, 144(Part B), 275-286. doi:10.1016/j.neuroimage.2016.06.034
This notebook contains an analysis of the ADHD-200 dataset available through Nilearn. The dataset contains resting-state fMRI data from 40 subjects and their phenotypic information. Half of the subjects are patients diagnosed with ADHD and the remaining half are healthy controls; the subjects in the study are all children and adolescents. The analysis in this notebook is my attempt to predict ADHD diagnosis using resting-state fMRI data.
I have modified the phenotypic file and use an imported file ("myphenotypics.csv") instead of the phenotypic file that comes with the Nilearn dataset, since the latter does not contain information differentiating healthy and patient subject types. I derived that information from the dataset supplied at http://preprocessed-connectomes-project.org/adhd200/download.html.
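The exact edit is not shown in this notebook; below is a minimal sketch of how such a file could be assembled. The file names and the site table's column selection are assumptions, not the actual steps I ran.
import pandas as pd
#hypothetical reconstruction: join nilearn's phenotypic table with the ADHD-200
#site file that carries the diagnosis, keyed on the subject ID
nilearnPheno = pd.read_csv('adhd_phenotypic.csv')        #exported from the nilearn dataset
sitePheno = pd.read_csv('adhd200_site_phenotypic.csv')   #from preprocessed-connectomes-project.org
merged = pd.merge(nilearnPheno, sitePheno[['Subject', 'subject_type', 'adhd']], on='Subject')
merged.to_csv('myphenotypics.csv', index=False)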
This project uses machine learning to classify patients with ADHD against healthy controls, using data from the ADHD-200 study. My aim is to gain experience processing fMRI data with programming languages as well as with FSL. I will experiment with different machine learning techniques (algorithms/classifiers, cross-validation methods for multi-voxel pattern analysis, and fine-tuning the hyperparameters of the classifying models) and compare how each implementation performs.
This project has used the following technologies:
nilearn
ipywidgets
plotly.express
scikit-learn
Jupyter Notebook
pandas
#import the ADHD-200 data (n_subjects=None downloads all 40 subjects)
from nilearn import datasets
data = datasets.fetch_adhd(n_subjects=None)
#optional smoothing step, not used here:
#from nilearn.image import smooth_img
#data = smooth_img("dataset/subject_*.nii")
Phenotypic info for the subjects is included with the data, but I will perform some cleaning to start.
#import phenotypic data
import pandas
#I will use a manually edited file for the phenotypic data, which I created by combining
#the nilearn dataset file with http://preprocessed-connectomes-project.org/adhd200/download.html
phenos = pandas.read_csv("/Users/shubhaviarya/nilearn_data/adhd/myphenotypics.csv")
phenos
| Unnamed: 0 | Subject | Rest.Scan | MeanFD | NumFD_greater_than_0.20 | rootMeanSquareFD | FDquartile.top1.4thFD. | PercentFD_greater_than_0.20 | MeanDVARS | MeanFD_Jenkinson | ... | sess_1_rest_6_eyes | sess_1_anat_1 | sess_1_which_anat | sess_2_rest_1 | sess_2_rest_1_eyes | sess_2_rest_2 | sess_2_rest_2_eyes | sess_2_anat_1 | defacing_ok | defacing_notes
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
0 | 1 | 2014113 | rest_1 | 0.0576 | 2 | 0.2400 | 0.0944 | 1.6000 | 16.1677 | 1.3868 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
1 | 2 | 3902469 | rest_1 | 0.0580 | 0 | 0.2409 | 0.0931 | 0.0000 | 17.4188 | 1.2040 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | 3 | 4275075 | rest_1 | 0.0789 | 0 | 0.2808 | 0.1520 | 0.0000 | 17.7796 | 1.8105 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | 4 | 7774305 | rest_1 | 0.0679 | 0 | 0.2606 | 0.1054 | 0.0000 | 16.7169 | 1.5137 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | defaced part of front skull/brain |
4 | 5 | 1019436 | rest_1 | 0.0904 | 0 | 0.3006 | 0.1927 | 0.0000 | 19.6124 | 1.6769 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
5 | 6 | 3699991 | rest_1 | 0.1113 | 15 | 0.3337 | 0.2524 | 9.8039 | 19.8704 | 2.6121 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
6 | 7 | 3154996 | rest_1 | 0.0881 | 0 | 0.2968 | 0.1619 | 0.0000 | 18.6120 | 1.8118 | ... | NaN | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
7 | 8 | 3884955 | rest_1 | 0.0988 | 8 | 0.3144 | 0.1881 | 6.4000 | 17.3609 | 1.9543 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
8 | 9 | 27034 | rest_1 | 0.0484 | 1 | 0.2201 | 0.0860 | 0.3831 | 21.5895 | 0.9262 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
9 | 10 | 4134561 | rest_1 | 0.0439 | 4 | 0.2095 | 0.0849 | 1.5267 | 25.0028 | 0.9207 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
10 | 11 | 27018 | rest_1 | 0.0622 | 5 | 0.2494 | 0.1221 | 1.9157 | 19.7246 | 1.0707 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
11 | 12 | 6115230 | rest_1 | 0.0539 | 2 | 0.2322 | 0.1040 | 0.7634 | 23.1873 | 1.1637 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | deface front part of skull/brain |
12 | 13 | 27037 | rest_1 | 0.0686 | 0 | 0.2619 | 0.1121 | 0.0000 | 21.8621 | 1.5228 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
13 | 14 | 8409791 | rest_1 | 0.0567 | 4 | 0.2380 | 0.1068 | 1.5267 | 22.7119 | 1.0041 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
14 | 15 | 27011 | rest_1 | 0.1212 | 1 | 0.3482 | 0.2302 | 0.3817 | 21.6233 | 2.4654 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
15 | 16 | 3007585 | rest_1 | 0.0528 | 0 | 0.2298 | 0.1094 | 0.0000 | 25.1240 | 0.9446 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
16 | 17 | 8697774 | rest_1 | 0.0527 | 0 | 0.2296 | 0.0800 | 0.0000 | 2.0728 | 1.1396 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
17 | 18 | 9750701 | rest_1 | 0.0654 | 0 | 0.2557 | 0.1026 | 0.0000 | 2.1544 | 1.3514 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
18 | 19 | 10064 | rest_1 | 0.0623 | 0 | 0.2496 | 0.0926 | 0.0000 | 2.0866 | 0.9943 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
19 | 20 | 21019 | rest_1 | 0.0575 | 0 | 0.2398 | 0.0955 | 0.0000 | 2.0339 | 1.1973 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
20 | 21 | 10042 | rest_1 | 0.0559 | 0 | 0.2365 | 0.0922 | 0.0000 | 2.2915 | 1.0089 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
21 | 22 | 10128 | rest_1 | 0.0689 | 0 | 0.2624 | 0.1132 | 0.0000 | 2.1422 | 1.2641 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
22 | 23 | 2497695 | rest_1 | 0.0482 | 0 | 0.2195 | 0.0739 | 0.0000 | 2.1269 | 1.0257 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
23 | 24 | 4164316 | rest_1 | 0.0774 | 11 | 0.2782 | 0.1786 | 6.2147 | 2.2470 | 1.4483 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
24 | 25 | 1552181 | rest_1 | 0.0408 | 0 | 0.2021 | 0.0665 | 0.0000 | 12.8089 | 0.8528 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25 | 26 | 4046678 | rest_1 | 0.1139 | 9 | 0.3375 | 0.2162 | 11.3924 | 16.4176 | 2.7597 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
26 | 27 | 23012 | rest_1 | 0.0569 | 0 | 0.2386 | 0.1044 | 0.0000 | 16.8908 | 1.3229 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
27 | 28 | 1679142 | rest_1 | 0.1482 | 0 | 0.3849 | 0.2947 | 0.0000 | 18.4691 | 2.2433 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
28 | 29 | 1206380 | rest_1 | 0.0719 | 1 | 0.2681 | 0.1294 | 1.2658 | 16.9621 | 2.1269 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
29 | 30 | 23008 | rest_1 | 0.0801 | 7 | 0.2831 | 0.1710 | 8.9744 | 13.4263 | 1.7741 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
30 | 31 | 4016887 | rest_1 | 0.0879 | 3 | 0.2965 | 0.1516 | 3.7975 | 17.4466 | 2.5734 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
31 | 32 | 1418396 | rest_1 | 0.0713 | 2 | 0.2670 | 0.1471 | 2.5316 | 17.4328 | 1.8083 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
32 | 33 | 2950754 | rest_1 | 0.0523 | 2 | 0.2287 | 0.0864 | 0.8439 | 20.3974 | 1.1459 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
33 | 34 | 3994098 | rest_1 | 0.0547 | 0 | 0.2340 | 0.0993 | 0.0000 | 21.5907 | 1.3200 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN |
34 | 35 | 3520880 | rest_1 | 0.0509 | 0 | 0.2255 | 0.0743 | 0.0000 | 17.8432 | 1.3762 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
35 | 36 | 1517058 | rest_1 | 0.0733 | 0 | 0.2708 | 0.1450 | 0.0000 | 18.3401 | 1.4190 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
36 | 37 | 9744150 | rest_1 | 0.0547 | 0 | 0.2338 | 0.0966 | 0.0000 | 19.1607 | 1.4302 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
37 | 38 | 1562298 | rest_1 | 0.0722 | 2 | 0.2686 | 0.1246 | 0.8439 | 25.0944 | 1.8281 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
38 | 39 | 3205761 | rest_1 | 0.0679 | 8 | 0.2605 | 0.1572 | 3.3755 | 23.8591 | 1.3947 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
39 | 40 | 3624598 | rest_1 | 0.0653 | 14 | 0.2556 | 0.1404 | 5.9072 | 8.2800 | 1.6411 | ... | NaN | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
40 rows × 66 columns
I'll extract each subject's ID from the NIfTI file paths (the parent directory name is the subject ID) and then merge the fMRI file paths into the phenotypic data.
print(data.func)
['/Users/shubhaviarya/nilearn_data/adhd/data/0010042/0010042_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0010064/0010064_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0010128/0010128_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0021019/0021019_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0023008/0023008_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0023012/0023012_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027011/0027011_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027018/0027018_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027034/0027034_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027037/0027037_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1019436/1019436_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1206380/1206380_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1418396/1418396_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1517058/1517058_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1552181/1552181_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1562298/1562298_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1679142/1679142_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/2014113/2014113_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/2497695/2497695_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/2950754/2950754_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3007585/3007585_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3154996/3154996_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3205761/3205761_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3520880/3520880_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3624598/3624598_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3699991/3699991_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3884955/3884955_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3902469/3902469_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3994098/3994098_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4016887/4016887_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4046678/4046678_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4134561/4134561_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4164316/4164316_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4275075/4275075_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/6115230/6115230_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/7774305/7774305_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/8409791/8409791_rest_tshift_RPI_voreg_mni.nii.gz', 
'/Users/shubhaviarya/nilearn_data/adhd/data/8697774/8697774_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/9744150/9744150_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/9750701/9750701_rest_tshift_RPI_voreg_mni.nii.gz']
#get subject IDs from the file paths: the parent directory name is the subject ID, e.g.
#.../adhd/data/0010042/0010042_rest_tshift_RPI_voreg_mni.nii.gz -> '0010042'
import os
fileNames = []
for path in data.func:
    fileNames.append(os.path.basename(os.path.dirname(path)))
print(fileNames)
['0010042', '0010064', '0010128', '0021019', '0023008', '0023012', '0027011', '0027018', '0027034', '0027037', '1019436', '1206380', '1418396', '1517058', '1552181', '1562298', '1679142', '2014113', '2497695', '2950754', '3007585', '3154996', '3205761', '3520880', '3624598', '3699991', '3884955', '3902469', '3994098', '4016887', '4046678', '4134561', '4164316', '4275075', '6115230', '7774305', '8409791', '8697774', '9744150', '9750701']
#creating a dataframe of file paths with subject IDs
myFiles = pandas.DataFrame(data.func, columns=['path'])
myFiles['Subject'] = fileNames
#cast to int so the IDs match the integer 'Subject' column in phenos (drops leading zeros)
myFiles['Subject'] = myFiles['Subject'].astype(int)
print(myFiles['Subject'])
0 10042 1 10064 2 10128 3 21019 4 23008 5 23012 6 27011 7 27018 8 27034 9 27037 10 1019436 11 1206380 12 1418396 13 1517058 14 1552181 15 1562298 16 1679142 17 2014113 18 2497695 19 2950754 20 3007585 21 3154996 22 3205761 23 3520880 24 3624598 25 3699991 26 3884955 27 3902469 28 3994098 29 4016887 30 4046678 31 4134561 32 4164316 33 4275075 34 6115230 35 7774305 36 8409791 37 8697774 38 9744150 39 9750701 Name: Subject, dtype: int64
#add phenotypic data to the file paths by merging on the Subject ID
myPheno = pandas.merge(phenos, myFiles, on='Subject')
#sex, handedness, adhd (0 = no, 1 = yes) and subject_type come through as plain values
#from the edited CSV, so no byte-decoding step is needed here
Now let us look at the merged data. I will also save the phenotypic data to a CSV in the form I need (see the snippet after the table below).
myPheno
| Unnamed: 0 | Subject | Rest.Scan | MeanFD | NumFD_greater_than_0.20 | rootMeanSquareFD | FDquartile.top1.4thFD. | PercentFD_greater_than_0.20 | MeanDVARS | MeanFD_Jenkinson | ... | sess_1_anat_1 | sess_1_which_anat | sess_2_rest_1 | sess_2_rest_1_eyes | sess_2_rest_2 | sess_2_rest_2_eyes | sess_2_anat_1 | defacing_ok | defacing_notes | path
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
0 | 1 | 2014113 | rest_1 | 0.0576 | 2 | 0.2400 | 0.0944 | 1.6000 | 16.1677 | 1.3868 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/201... |
1 | 2 | 3902469 | rest_1 | 0.0580 | 0 | 0.2409 | 0.0931 | 0.0000 | 17.4188 | 1.2040 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/390... |
2 | 3 | 4275075 | rest_1 | 0.0789 | 0 | 0.2808 | 0.1520 | 0.0000 | 17.7796 | 1.8105 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/427... |
3 | 4 | 7774305 | rest_1 | 0.0679 | 0 | 0.2606 | 0.1054 | 0.0000 | 16.7169 | 1.5137 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | defaced part of front skull/brain | /Users/shubhaviarya/nilearn_data/adhd/data/777... |
4 | 5 | 1019436 | rest_1 | 0.0904 | 0 | 0.3006 | 0.1927 | 0.0000 | 19.6124 | 1.6769 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/101... |
5 | 6 | 3699991 | rest_1 | 0.1113 | 15 | 0.3337 | 0.2524 | 9.8039 | 19.8704 | 2.6121 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/369... |
6 | 7 | 3154996 | rest_1 | 0.0881 | 0 | 0.2968 | 0.1619 | 0.0000 | 18.6120 | 1.8118 | ... | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/315... |
7 | 8 | 3884955 | rest_1 | 0.0988 | 8 | 0.3144 | 0.1881 | 6.4000 | 17.3609 | 1.9543 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/388... |
8 | 9 | 27034 | rest_1 | 0.0484 | 1 | 0.2201 | 0.0860 | 0.3831 | 21.5895 | 0.9262 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
9 | 10 | 4134561 | rest_1 | 0.0439 | 4 | 0.2095 | 0.0849 | 1.5267 | 25.0028 | 0.9207 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/413... |
10 | 11 | 27018 | rest_1 | 0.0622 | 5 | 0.2494 | 0.1221 | 1.9157 | 19.7246 | 1.0707 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
11 | 12 | 6115230 | rest_1 | 0.0539 | 2 | 0.2322 | 0.1040 | 0.7634 | 23.1873 | 1.1637 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | deface front part of skull/brain | /Users/shubhaviarya/nilearn_data/adhd/data/611... |
12 | 13 | 27037 | rest_1 | 0.0686 | 0 | 0.2619 | 0.1121 | 0.0000 | 21.8621 | 1.5228 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
13 | 14 | 8409791 | rest_1 | 0.0567 | 4 | 0.2380 | 0.1068 | 1.5267 | 22.7119 | 1.0041 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/840... |
14 | 15 | 27011 | rest_1 | 0.1212 | 1 | 0.3482 | 0.2302 | 0.3817 | 21.6233 | 2.4654 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
15 | 16 | 3007585 | rest_1 | 0.0528 | 0 | 0.2298 | 0.1094 | 0.0000 | 25.1240 | 0.9446 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/300... |
16 | 17 | 8697774 | rest_1 | 0.0527 | 0 | 0.2296 | 0.0800 | 0.0000 | 2.0728 | 1.1396 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/869... |
17 | 18 | 9750701 | rest_1 | 0.0654 | 0 | 0.2557 | 0.1026 | 0.0000 | 2.1544 | 1.3514 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/975... |
18 | 19 | 10064 | rest_1 | 0.0623 | 0 | 0.2496 | 0.0926 | 0.0000 | 2.0866 | 0.9943 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/001... |
19 | 20 | 21019 | rest_1 | 0.0575 | 0 | 0.2398 | 0.0955 | 0.0000 | 2.0339 | 1.1973 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
20 | 21 | 10042 | rest_1 | 0.0559 | 0 | 0.2365 | 0.0922 | 0.0000 | 2.2915 | 1.0089 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/001... |
21 | 22 | 10128 | rest_1 | 0.0689 | 0 | 0.2624 | 0.1132 | 0.0000 | 2.1422 | 1.2641 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/001... |
22 | 23 | 2497695 | rest_1 | 0.0482 | 0 | 0.2195 | 0.0739 | 0.0000 | 2.1269 | 1.0257 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/249... |
23 | 24 | 4164316 | rest_1 | 0.0774 | 11 | 0.2782 | 0.1786 | 6.2147 | 2.2470 | 1.4483 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/416... |
24 | 25 | 1552181 | rest_1 | 0.0408 | 0 | 0.2021 | 0.0665 | 0.0000 | 12.8089 | 0.8528 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/155... |
25 | 26 | 4046678 | rest_1 | 0.1139 | 9 | 0.3375 | 0.2162 | 11.3924 | 16.4176 | 2.7597 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/404... |
26 | 27 | 23012 | rest_1 | 0.0569 | 0 | 0.2386 | 0.1044 | 0.0000 | 16.8908 | 1.3229 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
27 | 28 | 1679142 | rest_1 | 0.1482 | 0 | 0.3849 | 0.2947 | 0.0000 | 18.4691 | 2.2433 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/167... |
28 | 29 | 1206380 | rest_1 | 0.0719 | 1 | 0.2681 | 0.1294 | 1.2658 | 16.9621 | 2.1269 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/120... |
29 | 30 | 23008 | rest_1 | 0.0801 | 7 | 0.2831 | 0.1710 | 8.9744 | 13.4263 | 1.7741 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
30 | 31 | 4016887 | rest_1 | 0.0879 | 3 | 0.2965 | 0.1516 | 3.7975 | 17.4466 | 2.5734 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/401... |
31 | 32 | 1418396 | rest_1 | 0.0713 | 2 | 0.2670 | 0.1471 | 2.5316 | 17.4328 | 1.8083 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/141... |
32 | 33 | 2950754 | rest_1 | 0.0523 | 2 | 0.2287 | 0.0864 | 0.8439 | 20.3974 | 1.1459 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/295... |
33 | 34 | 3994098 | rest_1 | 0.0547 | 0 | 0.2340 | 0.0993 | 0.0000 | 21.5907 | 1.3200 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/399... |
34 | 35 | 3520880 | rest_1 | 0.0509 | 0 | 0.2255 | 0.0743 | 0.0000 | 17.8432 | 1.3762 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/352... |
35 | 36 | 1517058 | rest_1 | 0.0733 | 0 | 0.2708 | 0.1450 | 0.0000 | 18.3401 | 1.4190 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/151... |
36 | 37 | 9744150 | rest_1 | 0.0547 | 0 | 0.2338 | 0.0966 | 0.0000 | 19.1607 | 1.4302 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/974... |
37 | 38 | 1562298 | rest_1 | 0.0722 | 2 | 0.2686 | 0.1246 | 0.8439 | 25.0944 | 1.8281 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/156... |
38 | 39 | 3205761 | rest_1 | 0.0679 | 8 | 0.2605 | 0.1572 | 3.3755 | 23.8591 | 1.3947 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/320... |
39 | 40 | 3624598 | rest_1 | 0.0653 | 14 | 0.2556 | 0.1404 | 5.9072 | 8.2800 | 1.6411 | ... | pass | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/362... |
40 rows × 67 columns
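The save step itself is not shown above; a minimal sketch follows, where the output filename is my own choice, not one used elsewhere in this notebook:
#save the merged phenotypic table to a CSV for reuse (hypothetical filename)
myPheno.to_csv('/Users/shubhaviarya/nilearn_data/adhd/myPheno_merged.csv', index=False)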
Now that the file names are matched with the phenotypic data, let us make subsets for patients and controls.
#create lists of file paths for patients and controls
myPatients = []
myControls = []
for i in myPheno.index:
    if myPheno.loc[i, 'subject_type'] == 'Patient':
        myPatients.append(myPheno.loc[i, 'path'])
    else:
        myControls.append(myPheno.loc[i, 'path'])
myPatients
['/Users/shubhaviarya/nilearn_data/adhd/data/2014113/2014113_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4275075/4275075_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1019436/1019436_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3154996/3154996_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027034/0027034_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027018/0027018_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027037/0027037_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0027011/0027011_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/8697774/8697774_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0010064/0010064_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0010042/0010042_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/2497695/2497695_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1552181/1552181_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0023012/0023012_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1206380/1206380_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4016887/4016887_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/2950754/2950754_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3520880/3520880_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/9744150/9744150_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3205761/3205761_rest_tshift_RPI_voreg_mni.nii.gz']
myControls
['/Users/shubhaviarya/nilearn_data/adhd/data/3902469/3902469_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/7774305/7774305_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3699991/3699991_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3884955/3884955_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4134561/4134561_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/6115230/6115230_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/8409791/8409791_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3007585/3007585_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/9750701/9750701_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0021019/0021019_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0010128/0010128_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4164316/4164316_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/4046678/4046678_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1679142/1679142_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/0023008/0023008_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1418396/1418396_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3994098/3994098_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1517058/1517058_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/1562298/1562298_rest_tshift_RPI_voreg_mni.nii.gz', '/Users/shubhaviarya/nilearn_data/adhd/data/3624598/3624598_rest_tshift_RPI_voreg_mni.nii.gz']
The code below creates an interactive application using Plotly Express and JupyterDash that plots a histogram of subject age, colored by sex, for the subject type selected in a dropdown.
#HISTOGRAM code
import plotly.express as pltex
from jupyter_dash import JupyterDash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

#Loading data
myDF = myPheno

#Building application
appl = JupyterDash(__name__)
appl.layout = html.Div([
    html.H1("Age"),
    dcc.Graph(id='graph'),
    html.Label([
        "Subject type",
        dcc.Dropdown(
            id='subject_type', clearable=False,
            value='Patient', options=[
                {'label': c, 'value': c}
                for c in myDF.subject_type.unique() #all unique values
            ])
    ]),
])

#callback to update graph when the dropdown changes
@appl.callback(
    Output('graph', 'figure'),
    [Input("subject_type", "value")]
)
def update_fig(subject_type):
    return pltex.histogram(
        myDF[myDF["subject_type"] == subject_type], x="age", color="sex"
    )

#Running visualization and displaying the result inline in the notebook
appl.run_server(mode='inline')
This analysis uses the BASC multiscale atlas to define ROIs; I will work with the 64-ROI parcellation.
#import atlas
myParcellations = datasets.fetch_atlas_basc_multiscale_2015(version='sym')
atlasFile = myParcellations.scale064
#visualize atlas
from nilearn import plotting
plotting.plot_roi(atlasFile, draw_cross = False)
<nilearn.plotting.displays.OrthoSlicer at 0x7fe03376cb50>
Now I will generate correlation matrices for each subject and then add them to the phenotypic data.
from nilearn.input_data import NiftiLabelsMasker
from nilearn.connectome import ConnectivityMeasure
#create mask
myMask = NiftiLabelsMasker(labels_img=atlasFile, standardize=True, memory='nilearn_cache',verbose=1)
#initializing correlation measure; vectorize=True returns the flattened lower triangle
#of each correlation matrix (diagonal discarded), ready to be used as a feature vector
correlationMeasure = ConnectivityMeasure(kind='correlation', vectorize=True, discard_diagonal=True)
import pandas as pd
#initializing empty dataframe
allFeatures = pd.DataFrame(columns=['features', 'file'])
for num, subject in enumerate(data.func):
    timeSeries = myMask.fit_transform(subject, confounds=data.confounds[num])
    #creating a region x region correlation matrix
    correlationMatrix = correlationMeasure.fit_transform([timeSeries])[0]
    #adding features and filename to dataframe
    allFeatures = allFeatures.append({'features': correlationMatrix, 'file': data.func[num]}, ignore_index=True)
    #keeping track of status
    print('finished %s of %s' % (num+1, len(data.func)))
[NiftiLabelsMasker.fit_transform] loading data from /Users/shubhaviarya/nilearn_data/basc_multiscale_2015/template_cambridge_basc_multiscale_nii_sym/template_cambridge_basc_multiscale_sym_scale064.nii.gz
Resampling labels
finished 1 of 40
... (the same verbose masker output repeats for each subject) ...
finished 40 of 40
#create pandas dataframe of features and phenotypic data
complete = pandas.merge(myPheno, allFeatures, left_on='path', right_on='file')
Here is the pandas dataframe with the complete demographic information plus a column holding each subject's vectorized correlation matrix as an array.
complete
| Unnamed: 0 | Subject | Rest.Scan | MeanFD | NumFD_greater_than_0.20 | rootMeanSquareFD | FDquartile.top1.4thFD. | PercentFD_greater_than_0.20 | MeanDVARS | MeanFD_Jenkinson | ... | sess_2_rest_1 | sess_2_rest_1_eyes | sess_2_rest_2 | sess_2_rest_2_eyes | sess_2_anat_1 | defacing_ok | defacing_notes | path | features | file
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
0 | 1 | 2014113 | rest_1 | 0.0576 | 2 | 0.2400 | 0.0944 | 1.6000 | 16.1677 | 1.3868 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/201... | [0.099702746, -0.18896824, 0.04915938, 0.25369... | /Users/shubhaviarya/nilearn_data/adhd/data/201... |
1 | 2 | 3902469 | rest_1 | 0.0580 | 0 | 0.2409 | 0.0931 | 0.0000 | 17.4188 | 1.2040 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/390... | [0.089416295, -0.0077906298, 0.15645671, 0.289... | /Users/shubhaviarya/nilearn_data/adhd/data/390... |
2 | 3 | 4275075 | rest_1 | 0.0789 | 0 | 0.2808 | 0.1520 | 0.0000 | 17.7796 | 1.8105 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/427... | [0.06138464, -0.038983215, 0.04661025, 0.08573... | /Users/shubhaviarya/nilearn_data/adhd/data/427... |
3 | 4 | 7774305 | rest_1 | 0.0679 | 0 | 0.2606 | 0.1054 | 0.0000 | 16.7169 | 1.5137 | ... | NaN | NaN | NaN | NaN | NaN | yes | defaced part of front skull/brain | /Users/shubhaviarya/nilearn_data/adhd/data/777... | [-0.05577563, -0.22510526, 0.31184578, 0.33335... | /Users/shubhaviarya/nilearn_data/adhd/data/777... |
4 | 5 | 1019436 | rest_1 | 0.0904 | 0 | 0.3006 | 0.1927 | 0.0000 | 19.6124 | 1.6769 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/101... | [0.24521679, -0.13658871, -0.05749052, 0.24750... | /Users/shubhaviarya/nilearn_data/adhd/data/101... |
5 | 6 | 3699991 | rest_1 | 0.1113 | 15 | 0.3337 | 0.2524 | 9.8039 | 19.8704 | 2.6121 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/369... | [0.0646003, -0.029432628, 0.16057898, 0.293279... | /Users/shubhaviarya/nilearn_data/adhd/data/369... |
6 | 7 | 3154996 | rest_1 | 0.0881 | 0 | 0.2968 | 0.1619 | 0.0000 | 18.6120 | 1.8118 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/315... | [0.2347607, -0.31418574, 0.090578176, 0.027886... | /Users/shubhaviarya/nilearn_data/adhd/data/315... |
7 | 8 | 3884955 | rest_1 | 0.0988 | 8 | 0.3144 | 0.1881 | 6.4000 | 17.3609 | 1.9543 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/388... | [-0.19471535, -0.19354574, 0.055207245, 0.1772... | /Users/shubhaviarya/nilearn_data/adhd/data/388... |
8 | 9 | 27034 | rest_1 | 0.0484 | 1 | 0.2201 | 0.0860 | 0.3831 | 21.5895 | 0.9262 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... | [0.27563015, 0.10243464, 0.22862752, 0.1330881... | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
9 | 10 | 4134561 | rest_1 | 0.0439 | 4 | 0.2095 | 0.0849 | 1.5267 | 25.0028 | 0.9207 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/413... | [0.056150664, 0.16192757, 0.20446236, 0.245703... | /Users/shubhaviarya/nilearn_data/adhd/data/413... |
10 | 11 | 27018 | rest_1 | 0.0622 | 5 | 0.2494 | 0.1221 | 1.9157 | 19.7246 | 1.0707 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... | [-0.017702665, -0.0321446, -0.2318833, -0.0850... | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
11 | 12 | 6115230 | rest_1 | 0.0539 | 2 | 0.2322 | 0.1040 | 0.7634 | 23.1873 | 1.1637 | ... | NaN | NaN | NaN | NaN | NaN | yes | deface front part of skull/brain | /Users/shubhaviarya/nilearn_data/adhd/data/611... | [0.048174337, -0.038958076, 0.34462577, 0.0785... | /Users/shubhaviarya/nilearn_data/adhd/data/611... |
12 | 13 | 27037 | rest_1 | 0.0686 | 0 | 0.2619 | 0.1121 | 0.0000 | 21.8621 | 1.5228 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... | [0.036746763, 0.12673074, -0.14303601, 0.02012... | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
13 | 14 | 8409791 | rest_1 | 0.0567 | 4 | 0.2380 | 0.1068 | 1.5267 | 22.7119 | 1.0041 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/840... | [0.3284404, 0.14619106, 0.18256181, 0.24275021... | /Users/shubhaviarya/nilearn_data/adhd/data/840... |
14 | 15 | 27011 | rest_1 | 0.1212 | 1 | 0.3482 | 0.2302 | 0.3817 | 21.6233 | 2.4654 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... | [0.2435922, 0.036132835, 0.088765755, -0.05950... | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
15 | 16 | 3007585 | rest_1 | 0.0528 | 0 | 0.2298 | 0.1094 | 0.0000 | 25.1240 | 0.9446 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/300... | [0.17836411, -0.057775058, -0.09736896, 0.0838... | /Users/shubhaviarya/nilearn_data/adhd/data/300... |
16 | 17 | 8697774 | rest_1 | 0.0527 | 0 | 0.2296 | 0.0800 | 0.0000 | 2.0728 | 1.1396 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/869... | [0.26257858, -0.01471501, 0.3056692, 0.4140661... | /Users/shubhaviarya/nilearn_data/adhd/data/869... |
17 | 18 | 9750701 | rest_1 | 0.0654 | 0 | 0.2557 | 0.1026 | 0.0000 | 2.1544 | 1.3514 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/975... | [0.34606516, -0.11297555, 0.12377982, 0.393092... | /Users/shubhaviarya/nilearn_data/adhd/data/975... |
18 | 19 | 10064 | rest_1 | 0.0623 | 0 | 0.2496 | 0.0926 | 0.0000 | 2.0866 | 0.9943 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/001... | [0.05906585, 0.023257246, 0.2309234, 0.2589358... | /Users/shubhaviarya/nilearn_data/adhd/data/001... |
19 | 20 | 21019 | rest_1 | 0.0575 | 0 | 0.2398 | 0.0955 | 0.0000 | 2.0339 | 1.1973 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... | [0.29425192, 0.034056704, 0.10205513, 0.091403... | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
20 | 21 | 10042 | rest_1 | 0.0559 | 0 | 0.2365 | 0.0922 | 0.0000 | 2.2915 | 1.0089 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/001... | [0.38576013, 0.19752927, -0.034569763, 0.26150... | /Users/shubhaviarya/nilearn_data/adhd/data/001... |
21 | 22 | 10128 | rest_1 | 0.0689 | 0 | 0.2624 | 0.1132 | 0.0000 | 2.1422 | 1.2641 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/001... | [0.46423686, -0.05559261, -0.18878943, 0.42604... | /Users/shubhaviarya/nilearn_data/adhd/data/001... |
22 | 23 | 2497695 | rest_1 | 0.0482 | 0 | 0.2195 | 0.0739 | 0.0000 | 2.1269 | 1.0257 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/249... | [0.3090028, -0.20101869, 0.15623687, 0.123762,... | /Users/shubhaviarya/nilearn_data/adhd/data/249... |
23 | 24 | 4164316 | rest_1 | 0.0774 | 11 | 0.2782 | 0.1786 | 6.2147 | 2.2470 | 1.4483 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/416... | [0.18266006, 0.053213164, 0.13737457, 0.301389... | /Users/shubhaviarya/nilearn_data/adhd/data/416... |
24 | 25 | 1552181 | rest_1 | 0.0408 | 0 | 0.2021 | 0.0665 | 0.0000 | 12.8089 | 0.8528 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/155... | [0.24331683, 0.11997671, 0.15477629, 0.365618,... | /Users/shubhaviarya/nilearn_data/adhd/data/155... |
25 | 26 | 4046678 | rest_1 | 0.1139 | 9 | 0.3375 | 0.2162 | 11.3924 | 16.4176 | 2.7597 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/404... | [0.26400584, 0.09710038, 0.16714007, 0.4725717... | /Users/shubhaviarya/nilearn_data/adhd/data/404... |
26 | 27 | 23012 | rest_1 | 0.0569 | 0 | 0.2386 | 0.1044 | 0.0000 | 16.8908 | 1.3229 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... | [0.29637876, -0.3057408, -0.2015853, 0.0247812... | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
27 | 28 | 1679142 | rest_1 | 0.1482 | 0 | 0.3849 | 0.2947 | 0.0000 | 18.4691 | 2.2433 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/167... | [0.2544344, 0.14044009, -0.04573749, 0.4189829... | /Users/shubhaviarya/nilearn_data/adhd/data/167... |
28 | 29 | 1206380 | rest_1 | 0.0719 | 1 | 0.2681 | 0.1294 | 1.2658 | 16.9621 | 2.1269 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/120... | [0.20193276, 0.08832485, 0.20624438, 0.2480782... | /Users/shubhaviarya/nilearn_data/adhd/data/120... |
29 | 30 | 23008 | rest_1 | 0.0801 | 7 | 0.2831 | 0.1710 | 8.9744 | 13.4263 | 1.7741 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/002... | [-0.031444725, 0.14899744, 0.22901624, 0.28501... | /Users/shubhaviarya/nilearn_data/adhd/data/002... |
30 | 31 | 4016887 | rest_1 | 0.0879 | 3 | 0.2965 | 0.1516 | 3.7975 | 17.4466 | 2.5734 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/401... | [0.26914153, 0.21769321, 0.02183819, 0.5068154... | /Users/shubhaviarya/nilearn_data/adhd/data/401... |
31 | 32 | 1418396 | rest_1 | 0.0713 | 2 | 0.2670 | 0.1471 | 2.5316 | 17.4328 | 1.8083 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/141... | [0.13874619, 0.22254132, 0.3190973, 0.3857753,... | /Users/shubhaviarya/nilearn_data/adhd/data/141... |
32 | 33 | 2950754 | rest_1 | 0.0523 | 2 | 0.2287 | 0.0864 | 0.8439 | 20.3974 | 1.1459 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/295... | [0.34258807, 0.039793734, 0.34796214, 0.254069... | /Users/shubhaviarya/nilearn_data/adhd/data/295... |
33 | 34 | 3994098 | rest_1 | 0.0547 | 0 | 0.2340 | 0.0993 | 0.0000 | 21.5907 | 1.3200 | ... | NaN | NaN | NaN | NaN | NaN | yes | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/399... | [0.17507254, -0.058690447, 0.28509158, 0.48999... | /Users/shubhaviarya/nilearn_data/adhd/data/399... |
34 | 35 | 3520880 | rest_1 | 0.0509 | 0 | 0.2255 | 0.0743 | 0.0000 | 17.8432 | 1.3762 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/352... | [0.30443344, 0.21947744, 0.14888616, 0.4287756... | /Users/shubhaviarya/nilearn_data/adhd/data/352... |
35 | 36 | 1517058 | rest_1 | 0.0733 | 0 | 0.2708 | 0.1450 | 0.0000 | 18.3401 | 1.4190 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/151... | [0.25477687, 0.19738586, 0.22110052, 0.3100055... | /Users/shubhaviarya/nilearn_data/adhd/data/151... |
36 | 37 | 9744150 | rest_1 | 0.0547 | 0 | 0.2338 | 0.0966 | 0.0000 | 19.1607 | 1.4302 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/974... | [0.14122805, 0.039847065, 0.13463525, 0.395856... | /Users/shubhaviarya/nilearn_data/adhd/data/974... |
37 | 38 | 1562298 | rest_1 | 0.0722 | 2 | 0.2686 | 0.1246 | 0.8439 | 25.0944 | 1.8281 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/156... | [0.18526037, 0.17634061, 0.4151504, 0.49105507... | /Users/shubhaviarya/nilearn_data/adhd/data/156... |
38 | 39 | 3205761 | rest_1 | 0.0679 | 8 | 0.2605 | 0.1572 | 3.3755 | 23.8591 | 1.3947 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/320... | [0.063356735, 0.0003985194, 0.0142550655, 0.29... | /Users/shubhaviarya/nilearn_data/adhd/data/320... |
39 | 40 | 3624598 | rest_1 | 0.0653 | 14 | 0.2556 | 0.1404 | 5.9072 | 8.2800 | 1.6411 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/shubhaviarya/nilearn_data/adhd/data/362... | [0.3069562, -0.089238256, 0.09645871, 0.333239... | /Users/shubhaviarya/nilearn_data/adhd/data/362... |
40 rows × 69 columns
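As a sanity check on the feature dimension: with 64 ROIs, vectorize=True with discard_diagonal=True keeps the lower triangle of each correlation matrix, i.e. 64 × 63 / 2 = 2016 values per subject. A quick check against the dataframe above:
#each feature vector should hold n_rois * (n_rois - 1) / 2 = 2016 correlations
n_rois = 64
print(n_rois * (n_rois - 1) // 2)    #2016
print(len(complete['features'][0]))  #should match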
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure, savefig
#stack the per-group feature vectors into subjects x features arrays for display
patientFeatures = list(complete.loc[complete['subject_type']=='Patient']['features'])
controlFeatures = list(complete.loc[complete['subject_type']=='Control']['features'])
figure(figsize=(16,6))
plt.subplot(1,2,1)
plt.imshow(patientFeatures, aspect='auto')
plt.colorbar()
plt.title('Patients')
plt.xlabel('features')
plt.ylabel('subjects')
plt.subplot(1,2,2)
plt.imshow(controlFeatures, aspect='auto')
plt.colorbar()
plt.title('Controls')
plt.xlabel('features')
plt.ylabel('subjects')
savefig('myFeatures.png', transparent=True)
This section contains the main data analysis. I will be predicting ADHD diagnosis here. The features used are the correlation matrices generated above, and the diagnosis labels are contained in the subject_type column of our phenotypic data.
I will first split the data into training and testing sets, with a ratio of 80/20.
from sklearn.model_selection import train_test_split
#Split the sample into training/testing sets with an 80/20 ratio
x_Train, xVal, y_Train, yVal = train_test_split(list(complete['features']),  #x
                                                complete['subject_type'],    #y
                                                test_size=0.2,               #80/20 ratio
                                                shuffle=True,                #shuffle dataset
                                                stratify=complete['subject_type'],
                                                random_state=242)
My starting classifier will be a linear support vector machine, SVC(kernel='linear') from scikit-learn, since it is highly recommended for classification problems with small sample sizes.
I will be using 10-fold cross-validation to get a rough benchmark of performance for each classifier, with macro-averaged F1 as my performance metric. After each run I will look at the performance of the classifier across the folds as well as the average performance.
#building SVC classifier
from sklearn.svm import SVC
my_svc = SVC(kernel='linear')
#F1 score by averaging each fold
from sklearn.model_selection import cross_val_score
import numpy as np
svcScore = cross_val_score(my_svc, x_Train, y_Train, cv=10, scoring='f1_macro')
print(np.mean(svcScore))
print(svcScore)
0.7233333333333334 [1. 1. 0.25 0.66666667 0.25 1. 0.4 1. 1. 0.66666667]
Based on the above results, the linear SVC performs well, with an average F1 score of ~0.72, though the per-fold scores vary considerably.
I will now try gradient boosting as my classifier. The gradient boosting model will use a greater number of estimators and a larger max depth than the defaults in order to try to improve performance.
#building gradient boosting classifier
from sklearn.ensemble import GradientBoostingClassifier
myBoost = GradientBoostingClassifier(n_estimators=500,
                                     max_depth=4,
                                     random_state=242)
#train model (not strictly needed here: cross_val_score clones and refits it per fold)
myBoost.fit(x_Train, y_Train)
#F1 score by averaging each fold
from sklearn.model_selection import cross_val_score
import numpy as np
myBoostScore = cross_val_score(myBoost, x_Train, y_Train, cv=10, scoring='f1_macro')
print(np.mean(myBoostScore))
print(myBoostScore)
0.38499999999999995 [0.73333333 0.5 0. 0.4 0.25 0.25 0.4 0.25 0.66666667 0.4 ]
Based on the above results, the gradient boosting model is highly variable and does not come close to the SVC's performance. Next, I will try K-nearest neighbors as my classifier.
#KNN
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()
knnScore = cross_val_score(knn, x_Train, y_Train, cv=10, scoring='f1_macro')
print(np.mean(knnScore))
print(knnScore)
0.4716666666666667 [1. 0.73333333 0.25 0.25 0.25 0.4 0.66666667 0.25 0.25 0.66666667]
Based on the above performance, K-nearest neighbors does poorly with default parameters. Given the large gap between KNN and the other classifiers, I will not tune and retry it.
Now I will try a Random Forest classifier. I will increase the number of estimators like I did with the gradient boost model.
#Random Forest
from sklearn.ensemble import RandomForestClassifier
myRfc = RandomForestClassifier(n_estimators = 500, random_state = 242)
rfcScore = cross_val_score(myRfc, x_Train, y_Train, cv=10, scoring='f1_macro')
print(np.mean(rfcScore))
print(rfcScore)
0.5233333333333332
[0.73333333 0.5        0.25       0.66666667 0.25       1.         0.66666667 0.66666667 0.25       0.25      ]
Based on the above results, the random forest performed reasonably but not as well as the linear SVC. With some parameter adjustment the same performance might be achievable, but since the random forest is more complex and takes longer to train, I will keep the SVC as the final model.
I will now see if I can improve the performance of my SVC model by tweaking its hyperparameters. With a linear kernel, the only hyperparameter to tune is C. I will create a range of values for C and compare them using cross-validation.
from sklearn.model_selection import validation_curve
cRange = 10. ** np.arange(-3,8) # candidate values for C, from 1e-3 to 1e7
trainScores, validScores = validation_curve(my_svc, x_Train, y_Train,
                                            param_name="C", param_range=cRange,
                                            cv=10,
                                            scoring='f1_macro')
#Creating a pandas DataFrame of the results (the 'C' column holds the index
#into cRange; the actual values are applied as tick labels below)
t_scores = pandas.DataFrame(trainScores).stack().reset_index()
t_scores.columns = ['C', 'Fold', 'Score']
t_scores['Type'] = 'Train'
v_scores = pandas.DataFrame(validScores).stack().reset_index()
v_scores.columns = ['C', 'Fold', 'Score']
v_scores['Type'] = 'Validate'
val_curves = pandas.concat([t_scores, v_scores]).reset_index(drop=True)
#Plotting the performance of different values of C
import seaborn as sns
myPlot = sns.catplot(x='C', y='Score', hue='Type', data=val_curves, kind='point')
myPlot.set_xticklabels(cRange, rotation=90)
<seaborn.axisgrid.FacetGrid at 0x7fe033787d30>
Based on the above plot, the model performs best around a C value of 0.1, but the difference is minor.
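If I wanted to lock that value in, re-scoring the linear SVC at C = 0.1 would look like the following (a sketch; I did not run this in the original analysis):
#sketch (not run): re-score the linear SVC at the best-looking C value
svc_c01 = SVC(kernel='linear', C=0.1)
print(np.mean(cross_val_score(svc_c01, x_Train, y_Train, cv=10, scoring='f1_macro')))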
As one more experiment, I will change the SVC kernel to 'rbf' (the default), which lets me adjust both C and gamma. I will use a grid search to see whether an optimized RBF kernel gives a better result than the linear kernel.
#RBF SVC classifier
from sklearn.model_selection import GridSearchCV
rbf_svc = SVC(kernel='rbf')
cRange = 10. ** np.arange(-3,8)
gammaRange = 10. ** np.arange(-8,3)
param_grid = dict(gamma=gammaRange, C=cRange)
myGrid = GridSearchCV(rbf_svc, param_grid=param_grid, cv=10)
myGrid.fit(x_Train, y_Train)
GridSearchCV(cv=10, estimator=SVC(), param_grid={'C': array([1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06, 1.e+07]), 'gamma': array([1.e-08, 1.e-07, 1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01, 1.e+02])})
print(myGrid.best_params_)
{'C': 10000000.0, 'gamma': 1e-08}
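One caveat: since no scoring argument was passed, GridSearchCV ranked the grid by SVC's default scorer (accuracy), while all my other comparisons used macro-F1. To keep the selection criterion consistent, the search could be rerun as in this sketch (not run here):
#sketch (not run): the same grid search, but selecting on macro-F1
myGrid_f1 = GridSearchCV(SVC(kernel='rbf'), param_grid=param_grid,
                         cv=10, scoring='f1_macro')
#myGrid_f1.fit(x_Train, y_Train); print(myGrid_f1.best_params_)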
#Note: these values differ from the grid-search optimum reported above
#(C=10000000.0, gamma=1e-08); this C/gamma pair is the one evaluated below
rbf_svc = SVC(kernel='rbf', C=100.0, gamma=0.001)
rbf_svcScore = cross_val_score(rbf_svc, x_Train, y_Train, cv=10, scoring='f1_macro')
print(np.mean(rbf_svcScore))
print(rbf_svcScore)
0.7233333333333334
[1.         1.         0.25       0.66666667 0.25       1.         0.4        1.         1.         0.66666667]
Based on the above results, the SVC with an RBF kernel and tuned hyperparameters performs the same as the SVC with a linear kernel, so I could select either as my final model. I will choose the RBF-kernel SVC because it leaves more room for adjustment if I spend more time fine-tuning the model.
I will now run the model on the remaining data (testing data) and check how accurately it performs.
#Testing
from sklearn.metrics import f1_score, accuracy_score
rbf_svc.fit(x_Train, y_Train)
predictedResult = rbf_svc.predict(xVal)
print('F1:', f1_score(yVal, predictedResult, pos_label='Patient'))
print('Accuracy:', accuracy_score(yVal, predictedResult))
F1: 0.5454545454545454
Accuracy: 0.375
Based on the above results, an F1 score of 0.55 is not bad for a binary classification problem, although the accuracy of 0.38 is below chance given that my dataset was almost perfectly balanced between control and patient subjects. I will check how the model handles each class by looking at the confusion matrix.
import matplotlib.pyplot as plt
#plot_confusion_matrix was deprecated in scikit-learn 1.0; use the class method instead
from sklearn.metrics import ConfusionMatrixDisplay
myDisplay = ConfusionMatrixDisplay.from_estimator(rbf_svc, xVal, yVal, cmap=plt.cm.Blues, normalize=None)
myDisplay.ax_.set_title('SVC ADHD Labels')
print(myDisplay.confusion_matrix)
[[0 4]
 [1 3]]
Based on the above matrix, the model does well at identifying patients (3 of 4) but fails on controls, labeling all four of them as patients. Since my dataset was very small (40 subjects in total, 32 for training), it is difficult to improve the classifiers much further. Overall I am satisfied with this application of machine learning classifiers to predicting ADHD diagnosis from a given sample, and I would like to build on it with a larger dataset in the future, as the approach looks promising.
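To quantify the per-class behavior visible in the matrix, scikit-learn's classification_report prints precision, recall, and F1 for each class (a sketch; this output was not part of the original run):
#sketch (not run): per-class precision, recall, and F1 on the test set
from sklearn.metrics import classification_report
print(classification_report(yVal, predictedResult))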
Data was prepared by extracting subject IDs from the NIfTI file paths and merging them with the phenotypic data to create the final dataset. This made it easy to subset attributes for the data visualizations. The last step was to generate a time series and a correlation matrix for each subject; the correlation matrices were then added to each subject's phenotypic record.
Plots of the fMRI features were created showing the average activation for both patients and controls, and a Plotly Express interactive histogram of age was built for the two groups. I also created several other visualizations, including the average correlation matrices for controls and patients.
My primary goal with this project was to predict an ADHD diagnosis from fMRI data. I tried and assessed various machine learning techniques on the dataset: I split the data 80/20 into training and testing sets, evaluated each classifier on the training data with 10-fold cross-validation using the macro-F1 score as the performance measure, and used a grid search to fine-tune hyperparameters.
Classifier models:
Both the linear support vector machine classifier (SVC) and the SVC with an RBF kernel performed equally well, but I chose the RBF-kernel SVC as my final model so I could experiment with and modify it further on a larger dataset in the future. The grid search selected C = 10000000.0 and gamma = 1e-08 as the best parameters, although the model I actually evaluated used C = 100.0 and gamma = 0.001. I then used this model to predict ADHD diagnoses on the testing set, where it achieved a final F1 score of 0.55 and an accuracy of 0.38.
An F1 score of 0.55 is not bad for a binary classification task given such a small sample size and an even (50/50) split of patients and controls, but there is certainly room to improve the performance with other techniques. My analysis used 64 ROIs, but I could use up to 444 (the number supported by the atlas); increasing the number of features may improve the model's performance. Alternatively, dimensionality reduction could shrink the feature space available to the model, and fine-tuning the hyperparameters of more complex models could also raise performance; one such combination is sketched below.
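As one concrete sketch of the dimensionality-reduction idea (assuming the features are the correlation-matrix vectors used above; I did not run this), a PCA step can be chained in front of the SVC and the number of components tuned together with C and gamma:
#sketch (not run): PCA + SVC in a single tunable pipeline
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
pipe = Pipeline([('pca', PCA()), ('svc', SVC(kernel='rbf'))])
pipe_grid = {
    'pca__n_components': [5, 10, 20], #must stay below the ~29 samples per training fold
    'svc__C': 10. ** np.arange(-2, 3),
    'svc__gamma': 10. ** np.arange(-4, 1),
}
pipeSearch = GridSearchCV(pipe, pipe_grid, cv=10, scoring='f1_macro')
#pipeSearch.fit(x_Train, y_Train); print(pipeSearch.best_params_)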
I gained a lot of experience working with fMRI data and applying machine learning techniques to neuroimaging analysis. The available documentation, readings, and textbook made it easier to learn and apply the tools and techniques I implemented in this project, and I was able to further improve my programming skills, specifically as applied to fMRI data. In the future, I would like to keep improving my fMRI analysis skills and experiment with more advanced machine learning algorithms.