VBA: a probabilistic treatment of nonlinear models for neurobiological and behavioural data

Jean Daunizeau, Vincent Adam, Lionel Rigoux

Abstract

This work is part of an ongoing effort toward a computational (quantitative and refutable) understanding of human neuro-cognitive processes. Many sophisticated models for behavioural and neurobiological data have flourished during the past decade. Most of these models are partly unspecified (i.e. they have unknown parameters) and nonlinear, which makes them difficult to pair with a formal statistical data analysis framework. In turn, this compromises the reproducibility of model-based empirical studies. This work exposes a software toolbox that provides generic, efficient and robust probabilistic solutions to the three problems of model-based analysis of empirical data: (i) data simulation, (ii) parameter estimation/model selection, and (iii) experimental design optimization.
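For orientation, the class of models the toolbox addresses can be summarized as a nonlinear state-space model. The notation below is a generic restatement (the symbols are ours, not necessarily the paper's):

$$
x_{t+1} = f(x_t, \theta, u_t) + \eta_t, \qquad
y_t = g(x_t, \phi, u_t) + \epsilon_t,
$$

where $x_t$ are hidden states, $y_t$ are observed data, $u_t$ is the experimental input (design), $\theta$ and $\phi$ are unknown evolution and observation parameters, and $\eta_t$ and $\epsilon_t$ are state and measurement noise, respectively. Data simulation, parameter estimation/model selection, and design optimization all operate on this generative form.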

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. The experimental cycle.
The experimental cycle summarizes the interaction between modelling, experimental work and statistical data analysis. One starts with new competing hypotheses about a system of interest. These are then embodied into a set of candidate models that are to be compared with each other given empirical data. One then designs an experiment that is maximally discriminative with respect to the candidate models. Data acquisition and analysis then proceed, the conclusion of which serves to generate a new set of competing hypotheses, etc… Adapted from .
Figure 2. The mean-field/Laplace approximation.
The variational Bayesian approach furnishes an approximation to the marginal posterior densities of subsets of unknown model parameters (here, two parameters θ1 and θ2). The 2D landscape depicts a (true) joint posterior density, and the two black lines are the corresponding marginal posterior densities of θ1 and θ2, respectively. The mean-field approximation basically describes the joint posterior density as the product of the two marginal densities (black profiles). In turn, stochastic dependencies between parameter subsets are replaced by deterministic dependencies between their posterior sufficient statistics. The Laplace approximation further assumes that the marginal densities can be described by Gaussian densities (red profiles).
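In equations, the two approximations in Figure 2 amount to the following (a sketch of the standard variational-Bayes setup, in our own notation):

$$
p(\theta_1, \theta_2 \mid y) \approx q(\theta_1)\, q(\theta_2)
\quad \text{(mean-field)}, \qquad
q(\theta_i) = \mathcal{N}(\mu_i, \Sigma_i)
\quad \text{(Laplace)},
$$

where the sufficient statistics $(\mu_i, \Sigma_i)$ are obtained by maximizing the free energy $F(q) = \langle \ln p(y, \theta) \rangle_q + H[q]$, a lower bound on the log model evidence $\ln p(y \mid m)$.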
Figure 3. Selection error rate and the Laplace-Chernoff risk.
The (univariate) prior predictive densities of two generative models m1 (blue) and m2 (green) are plotted as a function of data y, given an arbitrary design u. The dashed grey line shows the marginal predictive density that captures the probabilistic prediction of the whole comparison set {m1, m2}. The area under the curve (red) measures the model selection error rate, which depends upon the discriminability between the two prior predictive densities p(y|m1,u) and p(y|m2,u). This is precisely what the Laplace-Chernoff risk measures. Adapted from .
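For reference, the selection error rate depicted in red can be written in its standard decision-theoretic form (notation is ours):

$$
P(e \mid u) = \int \Big[ 1 - \max_i \, p(m_i \mid y, u) \Big]\, p(y \mid u)\, dy,
\qquad
p(y \mid u) = \sum_i p(y \mid m_i, u)\, p(m_i).
$$

The Laplace-Chernoff risk is an analytical approximation to this quantity, which becomes cheap to evaluate under Gaussian (Laplace) approximations to the prior predictive densities.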
Figure 4. Comparison of asymmetric utility and asymmetric learning rate.
This figure summarizes the analysis of choice and value data using models that assume asymmetric utility, asymmetric learning rate, both asymmetries or none. Upper left: trial-by-trial feedback history (either negative, neutral or positive). Grey neutral feedbacks correspond to ‘no-go’ choices. Upper right: trial-by-trial dynamics of the true value (red), the measured value (black) and the agent's binary go(1)/no-go(0) choices (black dots). Middle left: posterior probability of the four models given simulated choice data. Middle right: same format, given value data. Lower left: family posterior probabilities for both partitions of the model space, given choice data (left: family ‘yes’ = {‘utility’, ‘both’} vs family ‘no’ = {‘learning’, ‘none’}; right: family ‘yes’ = {‘learning’, ‘both’} vs family ‘no’ = {‘utility’, ‘none’}). Lower right: same format, given value data.
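A minimal Python sketch of the two asymmetries may help fix ideas. Parameter names (alpha_pos, util_loss, etc.) are hypothetical, and the paper's exact parameterization may differ:

```python
import numpy as np

def asymmetric_rw_update(q, reward, alpha_pos=0.3, alpha_neg=0.1,
                         util_gain=1.0, util_loss=2.0):
    """One trial of a Rescorla-Wagner value update with both asymmetries."""
    # Asymmetric utility: losses can be weighted more heavily than gains.
    utility = util_gain * reward if reward >= 0 else util_loss * reward
    delta = utility - q                          # prediction error
    # Asymmetric learning rate: different speeds for +/- prediction errors.
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta

# Example: value tracking over a random feedback sequence.
rng = np.random.default_rng(0)
q = 0.0
for r in rng.choice([-1, 0, 1], size=20):
    q = asymmetric_rw_update(q, r)
```

Setting util_gain = util_loss recovers the ‘learning’ model, setting alpha_pos = alpha_neg recovers the ‘utility’ model, and setting both recovers the ‘none’ model.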
Figure 5. Online design optimization for DCM comparison.
This figure summarizes the simulation of online design optimization, with the aim of best discriminating between two brain network models (m1 and m2) given fMRI time series. In this case, the problem reduces to deciding whether or not to introduce the second experimental factor (here, attentional modulation) on top of the first factor (photic stimulation). Upper left: the two network models to be compared given fMRI data (top/bottom: with/without attentional modulation of the V1→V5 connection). Upper middle: block-by-block temporal dynamics of the design efficiency of both types of blocks. Green (resp. blue) dots correspond to blocks with (resp. without) attentional modulation. Upper right: scan-by-scan temporal dynamics of the optimized (online) design. Lower left: scan-by-scan temporal dynamics of the simulated fMRI signal (blue: V1, green: V5). Lower middle: block-by-block temporal dynamics of the 95% posterior confidence intervals on the estimated modulatory effect (under the model that includes the modulatory effect). The green line depicts the strength of the simulated effect. Lower right: block-by-block temporal dynamics of the log Bayes factor comparing the two models.
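Schematically, the online procedure scores each candidate block type by its current design efficiency and runs the most efficient one. The sketch below assumes a user-supplied efficiency function (hypothetical), e.g. one minus the Laplace-Chernoff risk given the current posteriors:

```python
import numpy as np

def choose_next_block(candidate_designs, efficiency):
    """Greedy online design optimization over a set of candidate blocks.

    `efficiency(u)` is a hypothetical user-supplied callable returning a
    scalar score for design u given the current posterior beliefs, e.g.
    one minus the Laplace-Chernoff risk of the model comparison.
    """
    scores = [efficiency(u) for u in candidate_designs]
    return candidate_designs[int(np.argmax(scores))]

# Example with a dummy efficiency function (placeholder only):
designs = ["photic", "photic+attention"]
best = choose_next_block(designs, efficiency=lambda u: len(u))
```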
Figure 6. Comparison of deterministic and stochastic dynamical systems.
This figure summarizes the VB comparison of deterministic (upper row) and stochastic (lower row) variants of a Lorenz dynamical system, given data simulated under the stochastic variant of the model. Upper left: fitted data (x-axis) is plotted against simulated data (y-axis), for the deterministic case. Perfect model fit would align all points on the red line. Lower left: same format, for the stochastic case. Upper middle: 95% posterior confidence intervals on hidden-states dynamics. Recall that for deterministic systems, uncertainty in the hidden states arises from evolution parameters' uncertainty. Lower middle: same format, stochastic system. Upper right: residuals' empirical autocorrelation (y-axis) as a function of temporal lag (x-axis), for the deterministic system. Lower right: same format, stochastic system.
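For concreteness, here is a minimal Python simulation of the two Lorenz variants; state_noise_std = 0 gives the deterministic system, and a positive value gives the stochastic one via a crude Euler-Maruyama step (see Kloeden & Platen for proper SDE integrators). Parameter values are the classic defaults, not necessarily those used in the paper's simulations:

```python
import numpy as np

def simulate_lorenz(T=1000, dt=0.01, state_noise_std=0.0, seed=0,
                    sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Simulate a (possibly stochastic) Lorenz system with Euler-Maruyama."""
    rng = np.random.default_rng(seed)
    x = np.empty((T, 3))
    x[0] = (1.0, 1.0, 28.0)
    for t in range(T - 1):
        drift = np.array([sigma * (x[t, 1] - x[t, 0]),
                          x[t, 0] * (rho - x[t, 2]) - x[t, 1],
                          x[t, 0] * x[t, 1] - beta * x[t, 2]])
        noise = state_noise_std * np.sqrt(dt) * rng.standard_normal(3)
        x[t + 1] = x[t] + dt * drift + noise     # Euler-Maruyama step
    return x

deterministic = simulate_lorenz()                  # deterministic variant
stochastic = simulate_lorenz(state_noise_std=2.0)  # stochastic variant
```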
Figure 7. Comparison of delayed and non-delayed dynamical systems.
This figure summarizes the VB comparison of non-delayed (upper row) and delayed (lower row) variants of a linear deterministic dynamical system, given data simulated under the delayed variant of the model. This figure uses the same format as Figure 6.
Figure 8. Comparison of white and auto-correlated state-noise.
This figure summarizes the VB comparison of stochastic systems driven by either white (upper row) or auto-correlated (lower row) state noise. This figure uses the same format as Figure 6.
Figure 9. Effect of the micro-time resolution.
This figure summarizes the effect of relying on either a slow (upper row) or a fast (lower row) micro-time resolution when inverting nonlinear dynamical systems. Left: same format as Figure 6. Upper middle: estimated hidden-states dynamics at the slow micro-time resolution (data samples are depicted using dots). Lower middle: same format, fast micro-time resolution. Upper right: parameters' posterior correlation matrix at the slow micro-time resolution. Lower right: same format, fast micro-time resolution.
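The micro-time idea is simply to take several integration steps between successive data samples. A sketch, with a hypothetical evolution function f(x, u):

```python
import numpy as np

def integrate_with_microtime(f, x0, inputs, dt_sample, n_micro):
    """Euler integration with n_micro micro-time steps per data sample.

    A fast micro-time resolution (large n_micro) reduces discretization
    error when the system evolves quickly between samples, at the cost
    of extra computation. `f(x, u)` is a user-supplied evolution function.
    """
    dt = dt_sample / n_micro
    x, states = np.asarray(x0, dtype=float), []
    for u in inputs:                 # one entry per data sample
        for _ in range(n_micro):     # hidden micro-time steps
            x = x + dt * f(x, u)
        states.append(x.copy())      # keep only on-sample states
    return np.array(states)
```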
Figure 10. Binary data classification.
This figure exemplifies a classification analysis, which is used to infer on the link between a continuous variable X and binary data y. The analysis is conducted on data simulated under either a null model (H0: no link) or a sigmoid mapping (H1). Upper left: classification accuracy, in terms of the Monte-Carlo average probability of correct prediction under both types of data (left: H1, right: H0), for the training dataset. The green dots show the expected classification accuracy using the true values of each model's parameters. The dotted red line depicts chance level. Upper right: same format, test dataset (no model fitting). Lower left: same format, for the log Bayes factor comparing H1 and H0, given the training dataset. Lower right: same format, given the full (train+test) dataset.
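A self-contained sketch of the simulation logic under both hypotheses (the numbers and the crude gradient fit are ours, purely for illustration; the toolbox's classifier and priors will differ):

```python
import numpy as np

def simulate_and_classify(n=200, beta=3.0, under_h1=True, seed=0):
    """Simulate binary data y from a continuous variable X under H1
    (sigmoid link) or H0 (no link), fit a one-parameter logistic model
    on a training half, and score prediction accuracy on the test half."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    p = 1.0 / (1.0 + np.exp(-beta * x)) if under_h1 else np.full(n, 0.5)
    y = rng.random(n) < p
    x_tr, y_tr = x[: n // 2], y[: n // 2]
    x_te, y_te = x[n // 2 :], y[n // 2 :]
    b = 0.0                                    # crude gradient-ascent fit
    for _ in range(500):
        pred = 1.0 / (1.0 + np.exp(-b * x_tr))
        b += 0.1 * np.mean((y_tr - pred) * x_tr)
    acc = np.mean((1.0 / (1.0 + np.exp(-b * x_te)) > 0.5) == y_te)
    return acc       # near 0.5 under H0, well above chance under H1
```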
Figure 11. Random-effect analysis.
This figure exemplifies a random-effect GLM analysis, which is used to infer on the group mean of an effect of interest. The analysis is conducted on data simulated under either a null model (H0: group mean is zero) or a non-zero RFX model (H1). Left: Monte-Carlo average of the VB-estimated group mean under H1, given both types of data (left: H1, right: H0). Right: same format, for the log Bayes factor comparing H1 and H0.
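In our own notation, the generative model behind this kind of analysis is the usual two-level (random-effect) GLM:

$$
y_i = \mu + \eta_i + \epsilon_i, \qquad
\eta_i \sim \mathcal{N}(0, \sigma^2_{\text{group}}), \qquad
\epsilon_i \sim \mathcal{N}(0, \sigma^2_{\text{within}}),
$$

where $y_i$ is subject $i$'s effect estimate, $\mu$ is the group mean of interest, and H0 corresponds to $\mu = 0$.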
Figure 12. Random-effect group-BMS.
This figure exemplifies a random-effect group-BMS analysis, which is used to infer on the best model at the group level. The analysis is conducted on two groups of 32 subjects, whose data were simulated under either a ‘full’ (group 1) or a ‘reduced’ (group 2) model. Upper left: simulated data (y-axis) plotted against fitted data (x-axis), for a typical simulation. Lower left: histograms of the log Bayes factor comparing the two models, for both groups (red: group 1, blue: group 2). Upper middle: model attributions for group 1. The posterior probability for each subject is coded on a black-and-white colour scale (black = 1, white = 0). Lower middle: same format, group 2. Upper right: exceedance probabilities for group 1. The red line indicates the usual 95% threshold. Lower right: same format, group 2.
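Exceedance probabilities can be obtained by Monte-Carlo sampling from the Dirichlet posterior over model frequencies (cf. Stephan et al. 2009); a minimal sketch:

```python
import numpy as np

def exceedance_probabilities(alpha, n_samples=100_000, seed=0):
    """Probability that each model is more frequent than all others,
    given Dirichlet posterior counts `alpha` from a group-BMS analysis."""
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(alpha, size=n_samples)   # sampled model frequencies
    winners = np.argmax(r, axis=1)             # most frequent model per sample
    return np.bincount(winners, minlength=len(alpha)) / n_samples

# Example: two models, posterior counts favouring the first.
print(exceedance_probabilities([24.0, 10.0]))
```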
Figure 13. Improving Q-learning models with inversion diagnostics.
This figure demonstrates the added value of Volterra decompositions when deriving learning models with changing learning rates. Upper left: simulated beliefs (blue/red: outcome probability for the first/second action, green/magenta: volatility of the outcome contingency for the first/second action) of the Bayesian volatile learner (y-axis) plotted against trials (x-axis). Lower left: estimated hidden states of the deterministic variant of the dynamic learning rate model (blue/green: first/second action value, red: learning rate). This model corresponds to the standard Q-learning model (the learning rate is constant over time). Upper middle: estimated hidden states of the stochastic variant of the dynamic learning rate model (same format). Note the wide posterior uncertainty around the learning rate estimates. Lower middle: Volterra decomposition of the stochastic learning rate (blue: agent's chosen action, green: winning action, red: winning-action instability). Upper right: estimated hidden states of the augmented Q-learning model (same format as before). Lower right: Volterra decomposition of the augmented Q-learning model's learning rate (same format as before).
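The diagnostic itself is essentially a first-order Volterra (lagged linear) decomposition of an estimated hidden-state trajectory onto candidate explanatory inputs. A least-squares sketch of the idea (function and argument names are ours):

```python
import numpy as np

def volterra_decomposition(state, inputs, n_lags=16):
    """Regress a hidden-state time series (e.g. an estimated learning
    rate) onto lagged copies of candidate inputs (e.g. chosen action,
    winning action, winning-action instability). Returns one first-order
    Volterra kernel (length n_lags) per input."""
    T, K = len(state), inputs.shape[1]
    X = np.zeros((T, K * n_lags))
    for k in range(K):
        for lag in range(n_lags):
            X[lag:, k * n_lags + lag] = inputs[: T - lag, k]
    kernels, *_ = np.linalg.lstsq(X, state, rcond=None)
    return kernels.reshape(K, n_lags)
```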

References

    1. Stephan KE, Friston KJ, Frith CD (2009) Dysconnection in schizophrenia: from abnormal synaptic plasticity to failures of self-monitoring. Schizophr Bull 35(3): 509–527
    2. Schmidt A, Smieskova R, Aston J, Simon A, Allen P, et al. (2013) Brain connectivity abnormalities predating the onset of psychosis: correlation with the effect of medication. JAMA Psychiatry 70(9): 903–912
    3. Schofield T, Penny W, Stephan KE, Crinion J, Thompson AJ, et al. (2012) Changes in auditory feedback connections determine the severity of speech processing deficits after stroke. J Neurosci 32: 4260–4270
    4. Moran R, Symmonds M, Stephan K, Friston K, Dolan R (2011) An in vivo assay of synaptic function mediating human cognition. Curr Biol 21: 1320–1325
    5. Daunizeau J, David O, Stephan KE (2011) Dynamic causal modelling: a critical review of the biophysical and statistical foundations. NeuroImage 58(2): 312–322
    6. Daunizeau J, Den Ouden HEM, Pessiglione M, Stephan KE, Kiebel SJ, et al. (2010) Observing the observer (I): meta-Bayesian models of learning and decision-making. PLoS ONE 5(12): e15554
    7. Mathys C, Daunizeau J, Friston K, Stephan K (2011) A Bayesian foundation for individual learning under uncertainty. Front Hum Neurosci 5: 39
    8. Beal M (2003) Variational algorithms for approximate Bayesian inference. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK.
    9. Friston KJ, Mattout J, Trujillo-Barreto N, Ashburner J, Penny W (2007) Variational free energy and the Laplace approximation. NeuroImage 34: 220–234
    10. Kloeden PE, Platen E (1999) Numerical Solution of Stochastic Differential Equations. Springer-Verlag. ISBN 3-540-54062-8
    11. Daunizeau J, Stephan KE, Friston KJ (2012) Stochastic dynamic causal modelling of fMRI data: should we care about neural noise? NeuroImage 62: 464–481
    12. Robert C (2007) The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation. Springer.
    13. Myung JI, Pitt MA (2009) Optimal experimental design for model discrimination. Psychol Rev 116: 499–518
    14. Daunizeau J, Preuschoff K, Friston KJ, Stephan KE (2011) Optimizing experimental design for comparing models of brain function. PLoS Comput Biol 7(11): e1002280
    15. Daunizeau J, Friston KJ, Kiebel SJ (2009) Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models. Physica D 238: 2089–2118
    16. Stephan KE, Penny WD, Daunizeau J, Moran RJ, Friston KJ (2009) Bayesian model selection for group studies. NeuroImage 46: 1004–1017
    17. Friston K, Penny W (2011) Post hoc Bayesian model selection. NeuroImage 56: 2089–2099
    18. Bach DR, Daunizeau J, Friston KJ, Dolan RJ (2010) Dynamic causal modelling of anticipatory skin conductance responses. Biol Psychol 85: 163–170
    19. Daw ND (2008) The cognitive neuroscience of motivation and learning. Soc Cogn 26: 593–620
    20. Thorndike EL (1911) Animal Intelligence. New York: Macmillan.
    21. Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF (eds) Classical Conditioning II: Current Research and Theory. New York: Appleton-Century-Crofts, pp 64–99.
    22. Kahneman D, Tversky A (1984) Choices, values, and frames. Am Psychol 39(4): 341–350
    23. Penny W, Joao M, Flandin G, Daunizeau J, Stephan KE, et al. (2010) Comparing families of dynamic causal models. PLoS Comput Biol 6(3): e1000709
    24. Sporns O, Chialvo DR, Kaiser M, Hilgetag CC (2004) Organization, development and function of complex brain networks. Trends Cogn Sci 8(9): 418–425
    25. Tononi G, Sporns O, Edelman GM (1994) A measure for brain complexity: relating functional segregation and integration in the nervous system. Proc Natl Acad Sci USA 91: 5033–5037
    26. Friston KJ, Harrison L, Penny WD (2003) Dynamic causal modelling. NeuroImage 19: 1273–1302
    27. Stephan KE, Kasper L, Harrison L, Daunizeau J, et al. (2008) Nonlinear dynamic causal models for fMRI. NeuroImage 42: 649–662
    28. Stephan KE, Weiskopf N, Drysdale PM, Robinson PA, Friston KJ (2007) Comparing hemodynamic models with DCM. NeuroImage 38: 387–401
    29. Friston KJ, Li B, Daunizeau J, Stephan KE (2011) Network discovery with DCM. NeuroImage 56: 1202–1221
    30. Li B, Daunizeau J, Stephan KE, Penny W, Hu D, Friston KJ (2011) Generalised filtering and stochastic DCM for fMRI. NeuroImage 58(2): 442–457
    31. Büchel C, Friston KJ (1997) Modulation of connectivity in visual pathways by attention: cortical interactions evaluated with structural equation modelling and fMRI. Cereb Cortex 7: 768–778
    32. David O, Kiebel SJ, Harrison LM, Mattout J, Kilner JM, et al. (2006) Dynamic causal modeling of evoked responses in EEG and MEG. NeuroImage 30(4): 1255–1272
    33. Den Ouden HEM, Daunizeau J, Roiser J, Friston KJ, Stephan KE (2010) Striatal prediction error modulates cortical coupling. J Neurosci 30: 3210–3219
    34. Sutton R, Barto A (1998) Reinforcement Learning. MIT Press. ISBN 0-585-02445-6
    35. Rigoux L, Stephan K, Friston K, Daunizeau J (2013) Bayesian model selection for group studies - revisited. NeuroImage 84: 971–985
    36. Festinger L (1985) A Theory of Cognitive Dissonance. Stanford, CA: Stanford University Press. ISBN 0-8047-0131-8
    37. Bickel W, Odum A, Madden G (1999) Impulsivity and cigarette smoking: delay discounting in current, never, and ex-smokers. Psychopharmacology 146(4): 447–454
    38. Meyniel F, Sergent C, Rigoux L, Daunizeau J, Pessiglione M (2013) A neuro-computational account of how the human brain decides when to have a break. Proc Natl Acad Sci USA 110(7): 2641–2646
    39. Hodgkin AL, Huxley AF (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve. J Physiol 117: 500–544
    40. FitzHugh R (1961) Impulses and physiological states in theoretical models of nerve membrane. Biophys J 1: 445–466
    41. Jansen BH, Rit VG (1995) Electroencephalogram and visual evoked potential generation in a mathematical model of coupled cortical columns. Biol Cybern 73: 357–366
    42. Amari S (1977) Dynamics of pattern formation in lateral-inhibition type neural fields. Biol Cybern 27: 77–87
