Machine learning for neuroimaging with scikit-learn
Alexandre Abraham, Fabian Pedregosa, Michael Eickenberg, Philippe Gervais, Andreas Mueller, Jean Kossaifi, Alexandre Gramfort, Bertrand Thirion, Gaël Varoquaux
Abstract
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
Keywords: Python; machine learning; neuroimaging; scikit-learn; statistical learning.
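The two families of methods named in the abstract can be sketched with scikit-learn on synthetic data. This is a minimal illustration, not the authors' pipeline: the arrays stand in for masked brain images (one row per image, one column per voxel), and every variable name, array size, and effect size below is an assumption chosen for the example. It uses the current `sklearn.model_selection` module for cross-validation.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_voxels = 500  # hypothetical number of voxels inside a brain mask

# --- Supervised learning: decoding a condition label from images ---
n_samples = 80
labels = np.repeat([0, 1], n_samples // 2)          # two experimental conditions
images = rng.normal(size=(n_samples, n_voxels))     # synthetic "activation images"
images[labels == 1, :20] += 1.0                     # 20 voxels carry the signal

# Standardize each voxel, then fit a linear SVM; score with 5-fold CV.
decoder = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
decoding_scores = cross_val_score(decoder, images, labels, cv=5)
print("mean decoding accuracy:", decoding_scores.mean())

# --- Unsupervised learning: spatial ICA on synthetic "resting-state" data ---
n_timepoints, n_sources = 200, 3
true_maps = rng.laplace(size=(n_sources, n_voxels))       # non-Gaussian spatial maps
time_courses = rng.normal(size=(n_timepoints, n_sources))
noise = 0.1 * rng.normal(size=(n_timepoints, n_voxels))
time_series = time_courses @ true_maps + noise            # (time, voxels)

# Spatial ICA treats voxels as samples, so the data matrix is transposed.
ica = FastICA(n_components=n_sources, random_state=0)
estimated_maps = ica.fit_transform(time_series.T)         # (voxels, components)
print("estimated maps shape:", estimated_maps.shape)
```

With real data, the synthetic arrays would be replaced by voxel time series or activation maps extracted from image files with a brain mask; that loading step is outside what this sketch covers.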
Source: PubMed