
Linking across levels of computation in model-based cognitive neuroscience

Michael J Frank
Cognitive, Linguistic & Psychological Sciences
Brown Institute for Brain Science
Michael_Frank@Brown.edu

Abstract: Computational approaches to cognitive neuroscience encompass multiple levels of analysis, from detailed biophysical models of neural activity to abstract algorithmic or normative models of cognition, with several levels in between. Despite often strong opinions on the ‘right’ level of modeling, there is no single panacea: attempts to link biological with higher level cognitive processes require a multitude of approaches. Here I argue that these disparate approaches should not be viewed as competitive, nor should they be accessible only to researchers already endorsing the particular level of modeling. Rather, insights gained from one level of modeling should inform modeling endeavors at the levels above and below it. One way to achieve this synergism is to link levels of modeling by quantitatively fitting the behavioral outputs of detailed mechanistic models with higher level descriptions. If the fits are reasonable (e.g., similar to those achieved when applying high level models to human behavior), one can then derive plausible links between mechanism and computation. Model-based cognitive neuroscience approaches can then be employed to manipulate or measure neural function motivated by the candidate mechanisms, and to test whether these are related to high level model parameters. I describe several examples of this approach in the domains of reward-based learning, cognitive control, and decision making, and show how neural and algorithmic models have each informed or refined the other.

1 Introduction

Cognitive neuroscience is inherently interested in linking levels of analysis, from biological mechanism to cognitive and behavioral phenomena. But there are not just two levels; rather, there is a continuum of many. One can consider the implications of particular ion channel conductances and receptors, the morphological structure of individual neurons, the translation of mRNA, intracellular molecular signaling cascades involved in synaptic plasticity, and so forth. At the cognitive level, the field typically discusses constructs such as working memory, executive control, reinforcement learning, and episodic memory, to name a few. In between these levels reside architectures of neural systems, such as the frontal cortex, parietal cortex, hippocampus and basal ganglia, the interactions among all of these systems, and their modulations by neurotransmitters in response to relevant task events.

Computational models greatly facilitate the linking of levels of analysis, because they force one to be explicit about one's assumptions, to provide a unifying coherent framework, and to specify the computational objectives of any given cognitive problem, which provide constraints on interpreting the underlying mechanisms.

Nevertheless, the question remains of which level of modeling to use. The field of computational neuroscience, for example, typically considers how low level mechanisms can give rise to higher level “behaviors”, where behavior here is defined in terms of the changes in membrane potentials of individual neurons or even compartments within neurons, or in terms of synchrony of neural firing across populations of cells. The field of computational cognitive science, on the other hand, considers how behavioral phenomena might be interpreted as optimizing some computational goal, like minimizing effort costs, maximizing expected future reward, or optimally trading off uncertainty about multiple sources of perceptual and cognitive information to make inferences about causal structure. In between, computational cognitive neuroscience considers how mechanisms within neural systems can solve tradeoffs, for example between pattern separation and pattern completion in hippocampal networks (O’Reilly & McClelland, 1994), between updating and maintenance of working memory in prefrontal cortex (Braver & Cohen, 2000; Frank, Loughry & O’Reilly, 2001), or between speed and accuracy in perceptual decision making as a function of connectivity between cortex and basal ganglia (Bogacz, Wagenmakers, Forstmann & Nieuwenhuis, 2010; Lo & Wang, 2006). Even within this limited set of examples, models span multiple levels of description, from those using detailed spiking neurons to higher level computations capturing reaction time distributions, where latent estimated parameters are correlated with neural measures extracted from functional imaging. In general, theorists and experimentalists are happy to “live” at the one level of analysis they find intuitively most appealing, even though there is large variation in views of what constitutes the “right” level. Although there is a rich literature in mathematical psychology on how to select the most parsimonious model that best accounts for data without overfitting, this issue primarily pertains to applications in which one quantitatively fits data, and not to the endeavor of constructing a generative model. For example, if given only error rates and reaction time distributions, a mathematical psychologist will select a minimalist model that can best account for these data with a few free parameters, but this model will not include the internal dynamics of neural activities. Some researchers also perform model selection together with functional imaging to select the model that best accounts for correlations between brain areas and psychological parameters (e.g., Forstmann et al., 2008; Jahfari et al., 2012), but even here the neural data are relatively sparse, and the observations do not include access to internal circuitry dynamics, neurotransmitters, etc. – even though everyone appreciates that those dynamics drive the observed measurements.

The reason everyone appreciates this claim is that the expert cognitive neuroscientist has amassed an informative prior on valid models based on a large body of research that spans multiple methods, species, and analysis tools. Nevertheless, it is simply not feasible to incorporate these informative priors into a quantitative model selection process given much sparser data (e.g., given BOLD fMRI data and behavioral responses, one would not be advised to try to identify the latent parameters of a detailed neuronal model that includes receptor affinities, membrane potential time constants, etc.).

In this chapter, I advocate an alternative multi-level modeling strategy to address this issue. This strategy involves critical integration of two levels to derive predictions for experiments and to perform quantitative fits to data. In one prong, theoreticians create detailed neural models of interacting brain areas, neurotransmitters, etc., constrained by a variety of observations. These models attempt to specify interactions among multiple neural mechanisms, and can show how the network dynamics recapitulate those observed in electrophysiological data, and how perturbation of those dynamics by altering neurotransmitter levels or receptor affinities can lead to observable changes in behavior that qualitatively match those reported in the literature. In a second prong, the modeler constructs a higher level computational model, motivated by the mathematical psychology literature, which summarizes the basic cognitive mechanism. Examples of this level include simple reinforcement learning models from the machine learning literature (e.g., Q learning; Watkins & Dayan, 1992), or sequential sampling models of decision making such as the drift diffusion model (see Ratcliff & McKoon, 2008, for a review). These models should be constructed such that the parameters are identifiable, meaning that if one generates fake data from the model, one should be able to reliably recover the generative parameters and to differentiate changes due to alterations in one parameter from changes due to others. Ideally, the experimental task design will be informed by this exercise prior to conducting the experiment, so that the task provides conditions that are diagnostic of differences in underlying parameters. The identifiability of a model is a property not only of the model itself but also of the task conditions. To see this, consider a model in which task difficulty of one sort or another is thought to selectively impact a given model parameter. One might include two or more difficulty levels, but the degree to which these influence behavior, and thus lead to observable differences that can be captured by the relevant model parameter, often interacts with the other task and model parameters.
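As a concrete illustration of this identifiability check, the following sketch generates fake data from a simple Q-learning agent with known parameters and then recovers them by maximum likelihood over a grid. The bandit payoffs, parameter values, and grid resolution are all illustrative assumptions, not taken from the chapter:

```python
import math
import random

# Illustrative two-armed bandit; reward probabilities are arbitrary choices.
REWARD_P = (0.8, 0.2)

def softmax(q, beta):
    e = [math.exp(beta * v) for v in q]
    s = sum(e)
    return [v / s for v in e]

def simulate(alpha, beta, n_trials, rng):
    """Generate fake (choice, reward) data from the generative Q-learning model."""
    q = [0.0, 0.0]
    data = []
    for _ in range(n_trials):
        p = softmax(q, beta)
        c = 0 if rng.random() < p[0] else 1
        r = 1.0 if rng.random() < REWARD_P[c] else 0.0
        data.append((c, r))
        q[c] += alpha * (r - q[c])  # delta-rule update from the prediction error
    return data

def neg_log_lik(alpha, beta, data):
    """Negative log likelihood of the data under candidate (alpha, beta)."""
    q = [0.0, 0.0]
    nll = 0.0
    for c, r in data:
        p = softmax(q, beta)
        nll -= math.log(p[c] + 1e-12)
        q[c] += alpha * (r - q[c])
    return nll

def recover(data):
    """Brute-force grid search for the maximum likelihood estimate."""
    grid_a = [i / 20 for i in range(1, 20)]   # learning rate 0.05 .. 0.95
    grid_b = [0.5 * i for i in range(1, 21)]  # inverse temperature 0.5 .. 10.0
    return min(((a, b) for a in grid_a for b in grid_b),
               key=lambda ab: neg_log_lik(ab[0], ab[1], data))

rng = random.Random(1)
true_alpha, true_beta = 0.3, 3.0
data = simulate(true_alpha, true_beta, 1000, rng)
alpha_hat, beta_hat = recover(data)
```

With sparse or poorly designed task conditions (e.g., a single difficulty level), the likelihood surface flattens and the recovered values drift from the generative ones, which is exactly the failure mode the task-design exercise is meant to catch.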

Given an identifiable model and appropriate task, the next step is to link the levels of modeling to each other. Here one generates data from the detailed neural model exposed to the task, such that it produces data at the same level as one would obtain from a given experiment – error rates and RT distributions, for example, or perhaps also some summary statistic of neural activity in a given simulated brain area in one condition vs. another, as one might obtain with fMRI or EEG. Then these outputs are treated just as one treats human (or other animal) data: one assumes they were generated by the higher level model (or several candidate higher level models), and model selection and parameter optimization proceed exactly as they would when fitting these models to actual human data. The purpose of this exercise is to determine which of the higher level models best summarizes the effective computations of the detailed model. Further, one can perform systematic parameter manipulations in the neural model to determine whether these have observable selective effects on the higher level parameter estimates. If the fit is reasonable (for example, if the measures of model fit are similar to those obtained by fitting the higher level model to human data), this process can lead to a higher level computational description of neural circuit function not directly afforded by the detailed neural dynamic simulations. (This approach is complementary to that taken by Bogacz and colleagues, who have used a single level of computation but made observations about how distinct aspects of the computational process can be mapped onto distinct nuclei within the cortico-basal ganglia network (Bogacz & Gurney, 2007). In Bogacz’s work the computation afforded by this circuitry is identical to that of the optimal Bayesian model of decision making. In our models described below, these computations emerge from nonlinear neural dynamics and the mapping is approximate, not exact, which allows for potential refinements in the higher level description that best characterizes human cognition and behavior.)
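A minimal sketch of this fitting exercise, using a toy stand-in for the detailed model: a noisy Go/NoGo race whose weights are adjusted by dopamine-like burst and dip signals generates choices, and a softmax Q-learning model is then fit to those choices exactly as one would fit human data. The mechanistic structure and all parameter values here are illustrative assumptions, not the published network models:

```python
import math
import random

# Illustrative bandit payoffs; arbitrary choices for the sketch.
REWARD_P = (0.8, 0.2)

def mechanistic_agent(n_trials, rng):
    """Toy 'detailed' model: a noisy race between Go and NoGo weights.
    Reward strengthens Go for the chosen action (burst analog); nonreward
    strengthens NoGo (dip analog)."""
    go, nogo = [1.0, 1.0], [1.0, 1.0]
    data = []
    for _ in range(n_trials):
        act = [go[i] - nogo[i] + rng.gauss(0.0, 0.5) for i in range(2)]
        c = 0 if act[0] >= act[1] else 1
        r = 1.0 if rng.random() < REWARD_P[c] else 0.0
        if r > 0.5:
            go[c] += 0.1
        else:
            nogo[c] += 0.1
        data.append((c, r))
    return data

def neg_log_lik(alpha, beta, data):
    """Likelihood of the mechanistic model's choices under softmax Q-learning."""
    q = [0.0, 0.0]
    nll = 0.0
    for c, r in data:
        e = [math.exp(beta * v) for v in q]
        nll -= math.log(e[c] / sum(e) + 1e-12)
        q[c] += alpha * (r - q[c])
    return nll

rng = random.Random(3)
data = mechanistic_agent(500, rng)
grid = [(a / 10, b) for a in range(1, 10)
        for b in (0.0, 0.5, 1.0, 2.0, 4.0, 8.0)]
best_a, best_b = min(grid, key=lambda ab: neg_log_lik(ab[0], ab[1], data))
nll_best = neg_log_lik(best_a, best_b, data)
nll_chance = 500 * math.log(2)  # coin-flip baseline (beta = 0 reproduces it)
```

The grid includes beta = 0, which reproduces a coin-flip baseline, so the fitted likelihood can be compared against chance to judge whether the higher level model captures the mechanistic model's behavior at all.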

When successful, this exercise can also motivate experiments in which the same biological manipulation is performed on actual participants (e.g., a medication manipulation, brain stimulation, or a pseudoexperimental manipulation via genetics). Indeed, by linking across levels of computation one derives precise, falsifiable predictions about which of the higher level observable and identifiable parameters will be affected, and in which direction, by the manipulation. It can also provide informative constraints on interpreting how component processes are altered as a function of mental illness (Maia & Frank, 2011). Of course, one may derive these predictions intuitively based on one's own understanding of neural mechanisms, but there are various instances in which explicit simulations with more detailed models can lend insight into interactions that may not have been envisioned otherwise.

2 Applications to reinforcement learning, decision making and cognitive control

The above “recipe” for linking levels of computation is relatively abstract. The rest of this chapter focuses on concrete examples from my lab. We study the neurocomputational mechanisms of cortico-basal ganglia functions – including action selection, reinforcement learning and cognitive control. For theory development, we leverage a combination of two distinct levels of computation. First, our lab and others have simulated interactions within and between corticostriatal circuits via dynamical neural systems models which specify the roles of particular neural mechanisms (Frank, 2005, 2006; Frank & Claus, 2006; Ratcliff & Frank, 2012; Gurney, Prescott, & Redgrave, 2001; Frank, Loughry, & O’Reilly, 2001; Wiecki, Riedinger, Mayerhofer, Schmidt & Frank, 2009; Wiecki & Frank, 2010). These models are motivated by anatomical, physiological, and functional constraints. The mechanisms included in the detailed neural models were originally based on data from animal models, but integrated together into a systems-level functional model (i.e., one that has objectives and links to behavior). As such, they have reciprocally inspired rodent researchers to test, and validate, key model predictions by using genetic engineering methods to manipulate activity in separable corticostriatal pathways (Kravitz, Tye & Kreitzer, 2012; Hikida, Kimura, Wada, Funabiki & Nakanishi, 2010; Tai, Lee, Benavidez, Bonci & Wilbrecht, 2012). Second, we adopt and refine higher level mathematical models to analyze the functional properties of the neurocognitive systems, affording a principled computational interpretation and allowing for tractable quantitative fits to brain-behavior relationships (Cavanagh et al., 2011; Cavanagh, Frank, Klein & Allen, 2010; Ratcliff & Frank, 2012; Doll, Hutchinson & Frank, 2011; Badre, Doll, Long & Frank, 2012; Doll, Jacobs, Sanfey & Frank, 2009). Examples of the two levels of description are shown in Figure 1 (neural systems) and Figures 2, 3 and 4 (abstractions).

For empirical support, we design computerized tasks, sensitive to the hypothesized neural computations, that probe reinforcement learning, cognitive control, and reward-based decision making under uncertainty. We provide quantitative estimates of individual performance parameters using mathematical models, yielding objective assessments of the degree to which subjects rely on specific computations when learning and making decisions (Frank & Claus, 2006; Cavanagh et al., 2011; Frank & O’Reilly, 2006; Frank, Samanta, Moustafa & Sherman, 2007; Samejima, Ueda, Doya & Kimura, 2005). We assess how these parameters vary with markers of neural activity (EEG, fMRI), and how they are altered as a function of illness, brain stimulation, pharmacology, and genetics (Cavanagh et al., 2011; Frank & O’Reilly, 2006; Pessiglione, Seymour, Flandin, Dolan & Frith, 2006; Frank et al., 2007; Maia & Frank, 2011).

This approach has contributed to a coherent depiction of frontostriatal function and testable predictions. As one example, we have identified mechanisms underlying two distinct forms of impulsivity. The first stems from a deficit in learning from “negative reward prediction errors” (when decision outcomes are worse than expected), dependent on dips in dopamine and their resultant effects on activity and plasticity in a subpopulation of striatal neurons expressing D2 dopamine receptors. Deficiencies in this mechanism lead to a failure to properly consider negative outcomes of prospective decisions, and hence to a bias to focus primarily on gains. The second form of impulsivity stems from a failure to adaptively pause the decision process given conflicting evidence about the values of alternative actions, dependent on communication between frontal cortex and the subthalamic nucleus (STN), which temporarily provides a soft brake on motor output. Deficiencies in this mechanism lead to rash choices, i.e., a failure to consider the values of all the decision options, and instead to quickly accept or reject an option based on its value alone. Notably, these two forms of impulsivity are differentially impacted by distinct forms of treatment in Parkinson’s disease – medications that produce elevations in dopamine, and deep brain stimulation of the STN – supporting the posited mechanisms by which these treatments alter neural circuitry (Cavanagh et al., 2011; Frank et al., 2007; Zaghloul et al., 2012). Further insights come from linking these precise mechanisms to their high level computations. To do so, we next present a more detailed overview of the neural model, followed by its linking to abstract computations.

3 Cortico-Striatal Interactions during Choice and Learning

The basal ganglia (BG) are a collection of subcortical structures that are anatomically, neurochemically, and functionally linked (Gurney et al., 2001; Mink, 1996; Alexander & Crutcher, 1990; Frank, 2011). Through a network of interconnected loops with the frontal cortex, they modulate motor, cognitive, and affective functions. The defining characteristic of this interaction is a “gating function”: the frontal cortex first generates candidate options based on their prior history of execution in the sensory context, and the BG facilitates selection (Gurney et al., 2001) of one of these candidates given their relative learned reward values (Frank, 2005, 2011).

Choice: Two main projection pathways from the striatum go through different BG nuclei on the way to thalamus and up to cortex. Activity in the direct “Go” pathway provides evidence in favor of facilitating the candidate cortical action, by disinhibiting thalamocortical activity for the action with the highest reward value. Conversely, activity in the indirect “NoGo” pathway indicates that the action is maladaptive and hence should not be gated. Thus for any given choice, direct pathway neurons convey the positive evidence in favor of that action based on learned reward history, whereas indirect pathway neurons signal the negative evidence (the likelihood of leading to a negative outcome). The action most likely to be gated is a function of the relative difference in Go/NoGo activity for each action (Frank, 2005). Dopamine influences the cost/benefit tradeoff by modulating the balance of activity in these pathways via differential effects on D1 and D2 receptors, thereby modulating choice incentive (whether choices are determined by positive or negative potential outcomes). Similar principles apply to the selection of cognitive actions (notably, working memory gating) in BG-prefrontal cortical circuits (Frank, 2005, 2011; Frank et al., 2001). Finally, conflict between competing choices, represented in mediofrontal/premotor cortices, activates the subthalamic nucleus (STN) via the hyperdirect pathway. In turn, STN activity delays action selection, making it more difficult for striatal Go signals to facilitate a choice and buying more time to settle on the optimal choice (Frank, 2006); in abstract formulations, this is equivalent to a temporary elevation in the decision threshold (Ratcliff & Frank, 2012) (see below; Figure 3).

Figure 1. Neural network model of a single cortico-basal ganglia circuit. Sensory and motor (pre-SMA) cortices project to the basal ganglia. Two opposing “Go” and “NoGo” (direct and indirect) pathways regulate action facilitation and suppression based on reward evidence for and against each decision option. Dopamine (DA) modulates activity levels and plasticity in these populations, influencing both choice and learning. The ‘hyperdirect’ pathway from cortex to STN provides a temporary Global NoGo signal inhibiting the selection of all actions, particularly under conditions of decision conflict (co-activation of competing pre-SMA units). GPi/e: Globus Pallidus internal/external segment.
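The threshold interpretation can be sketched with a standard drift diffusion simulation in which high-conflict trials are modeled, purely as an illustrative assumption, by a transiently raised decision threshold; the drift rate, threshold values, and trial counts below are arbitrary:

```python
import math
import random

def simulate_ddm(drift, threshold, n_trials, rng, dt=0.01):
    """Euler simulation of a drift diffusion process with symmetric bounds.
    A trial ends when the accumulator crosses +threshold (correct) or
    -threshold (error). Returns (accuracy, mean RT in seconds)."""
    n_correct, total_rt = 0, 0.0
    sqdt = math.sqrt(dt)
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while abs(x) < threshold:
            x += drift * dt + sqdt * rng.gauss(0.0, 1.0)  # signal + noise
            t += dt
        if x >= threshold:
            n_correct += 1
        total_rt += t
    return n_correct / n_trials, total_rt / n_trials

rng = random.Random(7)
# Low-conflict trials: baseline threshold.
acc_lo, rt_lo = simulate_ddm(drift=0.5, threshold=1.0, n_trials=1000, rng=rng)
# High-conflict trials: STN-like transient braking modeled as a raised threshold.
acc_hi, rt_hi = simulate_ddm(drift=0.5, threshold=2.0, n_trials=1000, rng=rng)
```

Raising the threshold trades speed for accuracy: responses slow down, but more of the accumulated evidence is used before committing, matching the behavioral signature attributed to STN-mediated braking under conflict.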

Learning: The models simulate phasic changes in dopamine levels that occur during positive and negative reward prediction errors (the difference between expected and obtained reward), and their effects on plasticity in the two striatal pathways. Phasic bursts of dopamine cell firing during positive prediction errors act as teaching signals that drive Go learning of rewarding behaviors via D1 receptor stimulation (Frank, 2005; Schultz, 2002; Frank, Seeberger & O’Reilly, 2004). Conversely, negative prediction errors lead to pauses in dopamine firing (Schultz, 2002), supporting NoGo learning to avoid unrewarding choices via D2 receptor disinhibition. An imbalance in learning or choice in these pathways can lead to a host of aberrant neurological and psychiatric symptoms (Maia & Frank, 2011).

4 Higher level descriptions

Thus far we have considered basic mechanisms in dynamical models of corticostriatal circuits and their resultant effects on behavior. However, these models are complex (cf. Figure 1; this is the 'base' model, and other variants build from there to include multiple circuits and their interactions). Moreover, because they simulate internal neural dynamics consistent with electrophysiological data, they require many more parameters than are necessary to relate to behavior alone. Higher level computational descriptions allow us to abstract away from the detailed implementation. In particular, the tendency to incrementally learn from reward prediction errors and to select among multiple candidate options has been fruitfully modeled using reinforcement learning (RL) models inherited from the computer science literature. These models summarize the valuation of alternative actions, reflecting the contributions of the striatal units in the neural models, in terms of simple "Q values" which are incremented and decremented as a function of reward prediction errors. A simple choice function is used to compare Q values across all options and to stochastically select the one with the highest predicted value: this function summarizes the effective computations of the gating circuitry that facilitates cortical actions (noise in both cortex and striatum gives rise to the stochastic choice function). An asymmetry in learning from positive or negative prediction errors (reflecting high or low dopamine levels and their effects on activity/plasticity) can be captured by using separate learning rates (Frank, D'Lauro & Curran, 2007; Doll et al., 2011). However, a better depiction of the neural model allows for not only differential modulation of learning, but also differential modulation of choice incentive during action selection (Collins & Frank, in prep). This model uses separate Q values, QG and QN, to represent the Go and NoGo pathways, respectively, each with its own learning rate, and where the 'activity', i.e. the current QG or QN value, further influences learning. The choice probability is then a function of the relative difference between the QG and QN values for each decision option, with a gain factor that can differentially weigh the influences of QG vs. QN. This gain factor can be varied to simulate dopamine effects on choice incentive by boosting the extent to which decisions are made based on learned QG or QN values, even after learning has taken place.
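The QG/QN scheme just described can be sketched in a few lines. This is a minimal illustration: the learning rates, gains, critic, and training task below are illustrative assumptions, not the published parameterization of the Collins & Frank model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_actions = 2
QG = np.ones(n_actions)     # Go (D1) pathway action values
QN = np.ones(n_actions)     # NoGo (D2) pathway action values
V = np.zeros(n_actions)     # critic: expected reward per action
alpha_g, alpha_n, alpha_c = 0.1, 0.1, 0.2   # separate pathway learning rates
gain_g, gain_n = 1.0, 1.0   # choice-incentive gains (dopamine-like)
beta = 2.0                  # softmax inverse temperature

def action_probs():
    # Choice depends on the gain-weighted difference between Go and NoGo values
    act = beta * (gain_g * QG - gain_n * QN)
    p = np.exp(act - act.max())
    return p / p.sum()

def update(a, reward):
    delta = reward - V[a]               # reward prediction error
    V[a] += alpha_c * delta
    # The current QG/QN 'activity' multiplies the update, as described above
    QG[a] += alpha_g * QG[a] * delta
    QN[a] += alpha_n * QN[a] * (-delta)

# Toy task: action 0 always yields reward +1; action 1 always yields -1
for _ in range(2000):
    a = int(rng.choice(n_actions, p=action_probs()))
    update(a, 1.0 if a == 0 else -1.0)
```

After training, QG favors the rewarded action while QN favors avoiding the punished one; raising gain_g relative to gain_n then shifts choice toward the learned QG values even with learning frozen, mimicking a dopamine manipulation at the time of choice.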

Because these models are simple and minimal, they can be used to quantitatively fit behavioral data, and to determine whether the best fitting parameters vary as a function of biological manipulations. Moreover, because there is a clear mapping from these models to the neural versions, there are strong a priori reasons to manipulate particular neural mechanisms and to test whether the resulting estimates of computational parameters are altered as predicted.

Figure 2 Effects of (A) dopamine medication manipulations in Parkinson's disease (Frank et al., 2004) and (B) genotypes related to striatal D1 and D2 pathways in healthy participants on choice accuracy (Frank, Moustafa, Haughey, Curran & Hutchinson, 2007; Doll et al., 2011). "Choose Pos" assesses the ability to choose the probabilistically most rewarded (positive) action based on previous Go learning; "Avoid Neg" assesses the ability to avoid the probabilistically most punished (negative) action based on previous NoGo learning. (C) Quantitative fits with a reinforcement learning (RL) model capture these choice dissociations by assigning asymmetric Go vs. NoGo learning rates that vary as a function of PD, medications, and genotype (Frank & Fossella, 2011). Unlike studies linking candidate genes to complex disease phenotypes, where findings often fail to replicate (Munafò, Stothart & Flint, 2009), this linking of neurogenetic markers of corticostriatal function to specific computational processes has been replicated across multiple experiments.

Indeed, evidence validating these multi-level model mechanisms has mounted over the last decade across species. Monkey recordings combined with Q-learning model fits indicate that separate populations of striatal cells code for positive and negative Q values associated with action facilitation and suppression (Samejima et al., 2005; Ford & Everling, 2009; Watanabe & Munoz, 2009; Lau & Glimcher, 2008). In mice, targeted manipulations confirm selective roles of direct and indirect pathways in the facilitation and suppression of behavior (Kravitz et al., 2010), which are necessary and sufficient to induce reward and punishment learning, respectively (Kravitz et al., 2012; Hikida et al., 2010). Phasic stimulation or inhibition of dopamine neurons induces reward/approach and aversive/avoidance learning, respectively (Tsai et al., 2009; Tan et al., 2012). Synaptic plasticity studies reveal dual mechanisms for potentiation and depression in the two pathways as a function of D1 and D2 receptors (Shen, Flajolet, Greengard & Surmeier, 2008), as in the models. In humans, striatal dopamine manipulation influences the degree to which individuals learn more from positive or negative outcomes (Figure 2), with DA elevations enhancing reward learning but impairing punishment learning, and vice versa for DA depletion (Frank & O'Reilly, 2006; Pessiglione et al., 2006; Bódi et al., 2009; Frank, Santamaria, O'Reilly & Willcutt, 2007; Frank et al., 2004; Voon et al., 2010). Quantitative fits using RL models reveal that these effects can be accounted for by differential effects of dopamine manipulations on learning rates from positive and negative prediction errors. Moreover, in the absence of any acute manipulation, individual differences in these fit learning rate parameters are associated with genetic polymorphisms that differentially impact the efficacy of striatal D1 and D2 pathways (Frank & Fossella, 2011; Frank & Hutchinson, 2009; Frank, Moustafa, Haughey, Curran & Hutchinson, 2007; Doll et al., 2011; Frank, Doll, Oas-Terpstra & Moreno, 2009) (Figure 2).

One of the advantages of high level models, besides being simpler and more naturally used for quantitative behavioral fits, is that they can also include relevant processes that are out of scope in the neural versions. For example, when humans perform a "reinforcement learning task", they are not only incrementally learning probabilistic stimulus-action-outcome associations and choosing between them, but also engaging in other cognitive strategies involving hypothesis testing and working memory. Fitting their behavior with an RL model alone, no matter how well that model summarizes the corticostriatal learning process and its contribution to behavior, is then misleading, because it will capture variance that is really due to working memory capacity by absorbing it into the learning rate parameters of the RL process. Collins & Frank (2012) showed clear evidence of such effects by manipulating the number of stimuli in the set to be learned (and hence working memory load). They found that when using RL models alone, without factoring in working memory, one needed to include a separate learning rate for each set size to capture the data, and that a gene related to prefrontal but not striatal function was predictive of this learning rate. However, when an augmented model which included a capacity-limited working memory process was used, the overall fits to the data were improved, and the RL process could be captured by a single learning rate that applied across all set sizes. Further, the learning rate in this best fitting model varied with striatal genetic function, whereas the prefrontal gene was now related to working memory capacity.
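The logic of that dissociation can be sketched with a toy mixture of an incremental RL learner and a capacity-limited, one-shot working memory store. The names, capacity, and weights below are illustrative assumptions, not the fitted Collins & Frank (2012) model.

```python
import numpy as np

rng = np.random.default_rng(1)

n_actions = 3
alpha, beta = 0.1, 8.0          # single RL learning rate; softmax gain
wm_capacity = 3                 # number of stimuli working memory can hold
wm_weight = 0.9                 # reliance on WM when the set fits in capacity

def run_block(set_size, n_trials=300):
    Q = np.full((set_size, n_actions), 1.0 / n_actions)
    wm = {}                     # one-shot store: stimulus -> last rewarded action
    correct = rng.integers(0, n_actions, size=set_size)
    n_hits = 0.0
    for _ in range(n_trials):
        s = int(rng.integers(set_size))
        p_rl = np.exp(beta * Q[s])
        p_rl = p_rl / p_rl.sum()
        if set_size <= wm_capacity and s in wm:
            # Mixture policy: mostly WM, residual weight on incremental RL
            p = wm_weight * np.eye(n_actions)[wm[s]] + (1 - wm_weight) * p_rl
        else:
            p = p_rl
        a = int(rng.choice(n_actions, p=p))
        r = float(a == correct[s])
        Q[s, a] += alpha * (r - Q[s, a])   # RL with one learning rate throughout
        if r == 1.0:
            wm[s] = a
        n_hits += r
    return n_hits / n_trials

acc_small = run_block(set_size=2)   # fits in WM: fast, near-ceiling learning
acc_large = run_block(set_size=6)   # exceeds WM: slow incremental RL
```

With set size 2 the working memory store dominates and accuracy is near ceiling, while with set size 6 behavior falls back on slow incremental RL; an RL-only fit of both blocks would therefore demand set-size-dependent learning rates even though a single RL rate generated the data.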

On the other hand, algorithmic RL models that only predict choice probability miss out on the dynamics of choice, reflected in RT distributions, which emerge naturally from the neural model because it is a process model. First, firing rate noise throughout the network produces variance in the speed with which an action is gated. Second, the action value of the candidate option impacts not only the likelihood of selecting that option relative to its competitors, but also the speed with which that option is selected. Finally, as mentioned above, when multiple candidate options have similar frequencies of execution based on their choice history (that is, when there is conflict or choice entropy), this elicits hyperdirect pathway activity from mediofrontal cortex to the STN, which provides a temporary brake on the striatal gating process, thereby slowing down response time and increasing the likelihood of settling on the optimal response (Frank, 2006).

High level descriptions of process models have been used extensively to simulate the dynamics of simple decision making in cognitive psychology for over three decades. In particular, the drift diffusion model (DDM) belongs to a class of sequential sampling models in which noisy evidence is accumulated in favor of one of two options, and a choice is executed once this evidence crosses a critical decision threshold. The slope at which evidence accumulates is called the drift rate and reflects the ease of the decision. These models capture not only choice proportions and mean RT, but the entire shape of the RT distribution for correct and erroneous responses.
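The accumulation process just described can be simulated directly; the sketch below is an Euler discretization with illustrative parameter values, not fit to any dataset.

```python
import numpy as np

rng = np.random.default_rng(2)

# Standard DDM: evidence starts at z, drifts at mean rate v with
# unit-variance Gaussian noise, and a response is made when it crosses
# 0 or the decision threshold a.
def simulate_ddm(v=0.3, a=2.0, z=1.0, dt=0.002, max_t=10.0):
    x, t = z, 0.0
    while 0.0 < x < a and t < max_t:
        x += v * dt + rng.normal(0.0, np.sqrt(dt))
        t += dt
    return x >= a, t       # (upper-boundary response?, RT in seconds)

sims = [simulate_ddm() for _ in range(1000)]
p_upper = float(np.mean([hit for hit, _ in sims]))
mean_rt = float(np.mean([t for _, t in sims]))
```

For these settings the upper boundary is reached on roughly two-thirds of trials (analytically, with unit diffusion, the probability is (1 - e^(-2vz)) / (1 - e^(-2va))), and the simulated RTs form the right-skewed distributions characteristic of the DDM (Figure 3C).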


Figure 3 (A) Decision conflict-induced response time slowing in PD patients and controls ("Seniors"). Hi Conflict: alternative actions have similar reward probabilities (high entropy). Lo Conflict: alternative actions have qualitatively different reward probabilities. DBS reverses conflict-induced slowing, leading to impulsive choice (large effect for suboptimal "error" choices). (B) STN firing rate in the neural model surges during action selection, to a greater extent during high conflict trials, delaying responding (not shown). (C) Two-choice decision making is captured by the drift diffusion model. Evidence accumulates for one option over the other from a starting point "z" at a particular average rate "v"; choices are made when this evidence crosses the decision threshold ("a"). Noisy accumulation leads to variability in response times (example RT distributions shown). Dynamics of STN function are captured in this framework by an increased decision threshold when conflict is detected (red curve), followed by a collapse to a static asymptotic value, similar to STN activity in (B); see Ratcliff & Frank, 2012. Quantitative fits of the BG model with the DDM show that STN strength parametrically modulates decision threshold, as supported by experimental manipulations of the STN.

Notably, when fitting the behavioral outputs of the neural model with the DDM, we found that parametric manipulations of both corticostriatal and STN output projection strengths were related to the estimated decision threshold, with corticostriatal strength decreasing threshold (see Forstmann et al., 2008) and STN strength increasing threshold (Ratcliff & Frank, 2012; Wiecki & Frank, in press).

Studies with Parkinson's patients on and off STN deep brain stimulation provide an opportunity to test the impact of interference with the STN pathway, which can also lead to clinical impulsivity. Indeed, this procedure provides a selective disruption of conflict-induced slowing, without impacting learning (Frank et al., 2007; Coulthard et al., 2012) (Figure 3). We have extended this finding in three critical ways. First, EEG revealed that in healthy participants and patients off DBS, the amount of medial prefrontal (mPFC) theta-band activity during high conflict trials was predictive, on a trial-to-trial basis, of the amount of conflict-induced RT slowing. STN-DBS reversed this relationship, presumably by interfering with hyperdirect pathway function, without altering mPFC theta itself. Second, we developed a toolbox for hierarchical Bayesian parameter estimation allowing us to estimate the impact of trial-to-trial variations in neural activity on decision parameters (Wiecki, Sofer & Frank, in prep; http://ski.clps.brown.edu/hddm_docs). We found that mPFC theta was predictive of decision threshold adjustments (and not other decision parameters), and, moreover, that DBS reversed this mPFC-threshold relationship (Cavanagh et al., 2011). Third, electrophysiological recordings within the STN revealed decision conflict-related activity in a similar time and frequency range as mPFC in both humans and monkeys (Cavanagh et al., 2011; Aron, Behrens, Smith, Frank & Poldrack, 2007; Isoda & Hikosaka, 2007, 2008; Zaghloul et al., 2012). These findings thus provide support for a computational account of hyperdirect pathway function, and a potential explanation for the impulsivity that can sometimes result from DBS.

Thus far we have considered the ability of existing abstract formulations to summarize the computations of more detailed neural models, providing a link between levels. It is also possible, however, that aspects of the neural models, if valid, should alter the way we think about the abstract formulation. In the above example, we claimed that the STN is involved in regulating decision threshold. Consider its internal dynamics, however (Figure 3B). STN activity is not static throughout a trial, but rather exhibits an initial increase in activity, which then subsides with time during the action selection process. Moreover, the initial STN surge is larger and more prolonged when there is higher decision conflict. This model dynamic is supported by electrophysiological evidence in both monkeys and humans (Isoda & Hikosaka, 2008; Zaghloul et al., 2012), and implies that STN effects on preventing BG gating should be transient and decrease with time, implying a collapsing rather than fixed decision threshold. Functionally, this collapsing threshold ensures that a decision is eventually made, preventing decision paralysis (a collapsing threshold is optimal when there are response deadlines; Frazier & Yu, 2008). Indeed, quantitative fits using the DDM to capture RT distributions of the BG model showed that a collapsing threshold provided a good account of the model's behavior, notably with the temporal dynamics of the best fitting exponentially collapsing threshold matching reasonably well the dynamics of STN activity, despite the fact that the DDM fits had no access to this activity but only to RT distributions (Ratcliff & Frank, 2012). This study also found that when fitting human behavioral data in the same reward conflict decision-making task, fits were improved when assuming a higher and collapsing threshold in conflict trials, compared to the fixed threshold model.
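The distinction between a fixed and a collapsing threshold can be made concrete by letting the boundary relax exponentially, loosely mirroring the decaying STN surge. The parameterization below is an illustrative sketch, not the Ratcliff & Frank (2012) fits.

```python
import numpy as np

rng = np.random.default_rng(3)

# DDM with symmetric boundaries that collapse exponentially: the threshold
# starts elevated at a0 and relaxes toward an asymptote a_inf with time
# constant tau.
def simulate_rt(v=0.2, a0=2.0, a_inf=0.8, tau=0.5, dt=0.002, max_t=10.0):
    x, t = 0.0, 0.0
    while t < max_t:
        a = a_inf + (a0 - a_inf) * np.exp(-t / tau)  # collapsing boundary
        if abs(x) >= a:
            break
        x += v * dt + rng.normal(0.0, np.sqrt(dt))
        t += dt
    return t

# A huge tau freezes the boundary at a0, i.e. a fixed high threshold
rts_collapsing = [simulate_rt() for _ in range(300)]
rts_fixed = [simulate_rt(tau=1e9) for _ in range(300)]
```

The collapsing variant yields faster decisions on average for the same initial boundary height, and guarantees that a decision is eventually reached even when the evidence is weak.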

This last result supports the assertion that neural mechanism constraints can be included to refine higher level descriptions. However, we must also admit that we do not have well constrained neural mechanistic models for all cognitive processes. The next example I turn to is the exploration-exploitation tradeoff in reinforcement learning, a process studied in machine learning for many years but only recently considered in the cognitive neurosciences.

5 Beyond basic mechanisms: uncertainty-driven exploration and hierarchical learning

Often individuals need to explore alternative courses of action to maximize potential gains. But how does one know when to explore rather than exploit learned values? Basic RL models usually assume a degree of random exploration, but a more efficient strategy is to keep track of the uncertainty about value estimates, and to guide exploration toward the action with higher uncertainty (Dayan & Sejnowski, 1996). We have reported evidence for just such a mechanism, whereby trial-by-trial behavioral adjustments are quantitatively related to a Bayesian model estimate of relative outcome uncertainty. In this case, there is no existing neural model for how this relative uncertainty measure is encoded or updated as a function of reward experiences. Nevertheless, individual differences in the employment of this uncertainty-driven exploration strategy are predicted by genetic variations in the COMT (catechol-O-methyltransferase) gene, which is related to prefrontal cortical dopamine function (Frank et al., 2009). Further, a recent model-based fMRI study (Badre et al., 2012) revealed that the rostrolateral prefrontal cortex (RLPFC) parametrically tracks the relative uncertainty between outcome values, preferentially so in "Explorers" (defined based on behavioral fits alone). In EEG, relative uncertainty is reflected by variations in theta power over RLPFC (in contrast to the mPFC indices of conflict noted above), again preferentially in Explorers (Cavanagh, Figueroa, Cohen & Frank, 2011). These converging data across modeling, behavior, EEG, genetics and fMRI indicate a potential prefrontal strategy for exploring and overriding reward-based action selection in the BG. Notably, patients with schizophrenia, specifically those with anhedonia, exhibit profound reductions in uncertainty-driven exploration (Strauss et al., 2011). Thus this measure has potential relevance for understanding motivational alterations in clinical populations, and motivates the development of mechanistic models of how relative uncertainty estimates are computed and updated in populations of prefrontal neurons.
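The intuition behind uncertainty-driven exploration can be illustrated with Beta posteriors over Bernoulli reward probabilities. The bonus weight phi and this particular formulation are illustrative assumptions, not the model fit in Frank et al. (2009).

```python
import numpy as np

phi = 2.0   # weight on the uncertainty bonus (illustrative)

def posterior_mean_sd(alpha, beta):
    # Mean and standard deviation of Beta(alpha, beta) belief distributions
    n = alpha + beta
    mean = alpha / n
    sd = np.sqrt(alpha * beta / (n ** 2 * (n + 1.0)))
    return mean, sd

def choose(alpha, beta):
    # Score each action by its value estimate plus an uncertainty bonus,
    # so exploration is directed at the more uncertain option
    mean, sd = posterior_mean_sd(alpha, beta)
    return int(np.argmax(mean + phi * sd)), mean, sd

# Action 0: 19 rewards in 28 tries (well estimated). Action 1: 1 reward in
# 2 tries (poorly estimated). Counts include the Beta(1, 1) prior.
alpha = np.array([20.0, 2.0])
beta = np.array([10.0, 2.0])
a, mean, sd = choose(alpha, beta)
```

Here the agent selects the action with the lower mean value (0.5 vs. about 0.67) because its estimate is far more uncertain; choice adjustments proportional to this relative uncertainty are exactly the signature that distinguishes Explorers.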

Figure 4. The exploration-exploitation tradeoff. Top: probability density functions representing the degree of belief in the values of two alternative actions. One action has a higher value estimate, but with uncertainty, quantified by Bayesian updating of belief distributions as a function of experience. Exploration is predicted to occur when alternative actions have relatively higher uncertainty. Approximately half of participants ("Explorers") employ this exploration strategy, with choice adjustments proportional to relative uncertainty. Bottom: The degree of uncertainty-driven exploration estimated by the model parameter varies as a function of a genetic variant in COMT, associated with prefrontal dopamine (Frank et al., 2009). fMRI data show that activity in RLPFC varies parametrically with relative uncertainty in Explorers (Cavanagh et al., 2011; Badre et al., 2012). The EEG topographical map shows theta-band activity correlating with relative uncertainty, maximal over rostrolateral electrodes.

As the field matures, it becomes less clear which level of modeling motivated the other, and this is a good thing, as mutual constraints become available. Collins & Frank (in press) confronted the situation in which a learner has to decide whether, when entering a new context, the rules dictating links between states, actions and outcomes ("task-sets") should be re-used from those experienced in previous contexts, or whether instead a new task-set should be created and learned. They developed a high level "context-task-set" (C-TS) computational model based on non-parametric Bayesian methods (Dirichlet process mixtures), describing how the learner can cluster contexts around task-set rules, generalizable to novel situations. This model was motivated by analogous clustering models in category learning (e.g., Anderson, 1991; Sanborn, Griffiths & Navarro, 2010), but applied to hierarchical cognitive control, and as such was similarly motivated by the hierarchical structure of prefrontal cortical basal ganglia networks and modeling implementations thereof (Frank & Badre, 2012; Koechlin & Summerfield, 2007; Reynolds & O'Reilly, 2009; Botvinick, Niv & Barto, 2009). They also constructed a refined hierarchical PFC-BG network which confronted the same tasks, and showed that its functionality is well mimicked by the C-TS model. Quantitative model fitting linking these levels showed that particular neural mechanisms were associated with specific C-TS model parameters. For example, the prior tendency to re-use vs. create new structure in C-TS, captured by the Dirichlet alpha parameter, was directly related to the sparseness of the connectivity matrix from contextual input to PFC (Figure 5). Thus, in this case, there existed well established and validated models of interactions between PFC and BG during learning, working memory, and action selection (including some hierarchical implementations), but the computations afforded by the novel C-TS model further inspired refinement and elaboration of the network. In turn, this exercise reciprocally allowed us to derive more specific predictions about mechanisms leading to differential response times and error patterns (which were confirmed behaviorally), and to marry the reinforcement learning models described previously with the cognitive control mechanisms involving decision threshold regulation.
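The Dirichlet process prior at the heart of a C-TS-style model can be sketched via its Chinese restaurant process representation; the parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# Chinese restaurant process over context -> task-set assignments: each new
# context joins an existing task-set with probability proportional to how
# many contexts it already covers, or creates a new task-set with probability
# proportional to alpha.
def crp_assignments(n_contexts, alpha):
    counts = []          # number of contexts assigned to each task-set
    assignments = []     # task-set index chosen for each context
    for _ in range(n_contexts):
        total = sum(counts) + alpha
        p = np.array(counts + [alpha]) / total   # last entry: new task-set
        k = int(rng.choice(len(p), p=p))
        if k == len(counts):
            counts.append(1)          # create a new task-set
        else:
            counts[k] += 1            # re-use an existing task-set
        assignments.append(k)
    return assignments

# Number of distinct task-sets created for 50 contexts, across 200 runs
few = [max(crp_assignments(50, alpha=0.1)) + 1 for _ in range(200)]
many = [max(crp_assignments(50, alpha=5.0)) + 1 for _ in range(200)]
```

With small alpha, most contexts are clustered onto a handful of task-sets (favoring re-use and generalization), whereas large alpha creates many distinct task-sets, paralleling the sparseness manipulation in Figure 5.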

Figure 5 Left: Schematic of the hierarchical corticostriatal network model for creating task-sets (TS), which are gated into prefrontal cortex depending on the context (C). The lower motor loop selects motor actions M depending on the selected TS in PFC and the current sensory state S. The same TS can be reused across contexts, supporting clustering and generalization of behaviors, or, if needed, a new TS can be gated, preventing interference in learned state-action mappings between different contexts/TS. Right: Parametric manipulation of the sparseness of the connectivity matrix from Context to PFC (enforcing a prior tendency to encode distinct C's as distinct PFC TS) is well fit by an increased Dirichlet process clustering parameter in the C-TS model, which creates and re-uses TS according to non-parametric Bayesian methods. From Collins & Frank (2013).
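The two-loop gating scheme in Figure 5 can be reduced to a toy lookup sketch (hypothetical context, task-set, and action names; the actual network learns these mappings via RL rather than hard-coding them):

```python
# Upper loop: the context C gates a task-set (TS) into PFC.
# Lower loop: the selected TS plus the stimulus S determine the motor action M.
context_to_ts = {"C1": "TS1", "C2": "TS1", "C3": "TS2"}   # C1 and C2 share TS1
ts_policy = {
    ("TS1", "S1"): "M1", ("TS1", "S2"): "M2",
    ("TS2", "S1"): "M3", ("TS2", "S2"): "M4",
}

def select_action(context, stimulus):
    ts = context_to_ts[context]        # upper loop gates TS into PFC
    return ts_policy[(ts, stimulus)]   # lower loop maps (TS, S) -> action

# Re-using TS1 across C1 and C2 yields generalization; gating the distinct
# TS2 for C3 shields its state-action mappings from interference.
assert select_action("C1", "S1") == select_action("C2", "S1") == "M1"
assert select_action("C3", "S1") == "M3"
```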


One novel finding from this modeling work was that in such environments, the STN mechanism, previously linked only to decision making and impulsivity, plays a key role in learning. In particular, the simulations showed that early in the trial, when there is uncertainty about the identity of the PFC task-set, this conflict between alternative PFC states activated the STN, preventing the motor loop from responding. This process ensures that the PFC state is resolved prior to motor action selection, so that when the outcome arrives, stimulus-action learning is conditionalized on the selected PFC state. As the STN contribution is reduced, there is increasing interference in learning across task-sets, and hence learning is less efficient. This novel theory specifying the role of the STN in conditionalizing learning on PFC state needs to be tested empirically (e.g. with DBS or fMRI); although each component is grounded in prior empirical and theoretical work, the theory would not likely have emerged without this multi-level modeling endeavor.

In related work, Frank & Badre (2012) considered hierarchical learning tasks with multidimensional stimuli using two levels of modeling. A Bayesian mixture-of-experts model summarized how participants may learn hierarchical structure of the type, “if the color is red, then the response is determined by the shape, whereas if the color is blue, the response is determined by the orientation.” Quantitative modeling showed that estimated attention to the hierarchical expert was linked to speeded learning in hierarchical conditions and, when fit to a PFC-BG network, was related to a measure of gating policy abstraction, learned via RL, in the hierarchical connections from PFC to striatum. Badre & Frank (2012) then used model-based fMRI to show that in participants, estimated attention to hierarchical structure was linked to PFC-BG activity within a particular rostrocaudal level of the network, consistent with the “second-order” rule level of the task.
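The conflict-hold logic of the STN mechanism can be sketched as follows. This is a simplified sketch, not the spiking network model: the entropy measure, threshold value, and `stn_gain` parameter are assumptions standing in for STN dynamics.

```python
import math

def entropy(p):
    """Shannon entropy of a probability distribution (conflict measure)."""
    return -sum(q * math.log(q) for q in p if q > 0)

def stn_gate(pfc_beliefs, threshold=0.3, stn_gain=1.0):
    """Hold responding while PFC task-set identity is uncertain.

    pfc_beliefs: probabilities over candidate task-sets. The STN "hold"
    signal scales with conflict (entropy); the motor loop may respond only
    once the scaled conflict falls below threshold. Setting stn_gain=0
    mimics a reduced STN contribution: responses are released under
    conflict, so outcomes are credited to the wrong task-set more often.
    """
    conflict = entropy(pfc_beliefs)
    return stn_gain * conflict < threshold   # True -> release the response

assert not stn_gate([0.5, 0.5])              # high conflict: STN holds
assert stn_gate([0.95, 0.05])                # resolved PFC state: respond
assert stn_gate([0.5, 0.5], stn_gain=0.0)    # weak STN: premature release
```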

6 Concluding Comments

The examples described above demonstrate mutual, reciprocal constraints between models of neural circuitry and physiology and models of computational function. This exercise leads to multiple testable predictions using model-based cognitive neuroscience methods. Ultimately, models are judged on their predictive power, and as such they can inspire informative experiments that are valuable even to those who question the validity or assumptions of either of the levels of modeling employed.

7 Exercises

1. Give examples of implementational neural models and higher-level algorithmic models in any domain. What sorts of data do these models attempt to capture?

2. Think of some examples in which an abstract model exists but would benefit from a mechanistic elaboration for making cognitive neuroscience predictions.

3. Can you think of potential advantages of combining the models? How about some pitfalls?

4. Describe how dopamine may contribute both to learning and to choice incentive (the degree to which decisions are made based on positive vs. negative consequences).

5. Reinforcement learning models and sequential sampling models of decision making have been largely separate literatures in mathematical psychology, yet each of these classes of models has been fit to the basal ganglia neural model described in this chapter. Read Bogacz & Larsen (2011) for a complementary approach to linking these formulations within an algorithmic framework.

6. Conversely, read Wong & Wang (2006) for a complementary example of a single neural model capturing dynamics of decision making and working memory.

8 Solutions

1. Daniel Durstewitz has several detailed neural models of prefrontal dopamine mechanisms in working memory. These are complemented by algorithmic models of working memory updating (e.g. see Cohen, Braver & Brown, 2002 for a discussion of both levels). In this case, the Durstewitz models capture attractor dynamics and effects of D1 receptors on sodium and potassium currents, etc., whereas the algorithmic models simulate performance in working memory tasks as a function of reward prediction errors.

2. Uncertainty-driven exploration (see section 5) is one example.

3. The advantages are discussed throughout this chapter. Pitfalls: the assumptions of either level of modeling may be flawed, and one level might actually detract from the other.

4. See sections 3 and 4.

5. –

6. –
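One common way to link the two model classes in exercise 5 can be sketched in a few lines. This is a sketch under simple assumptions, not the chapter's basal ganglia model or Bogacz & Larsen's derivation: the learned Q-value difference sets the drift rate of a diffusion process, so choices sharpen and speed up as values are learned. Function names and parameter values here are illustrative.

```python
import math
import random

def ddm_choice(drift, threshold=1.0, dt=0.01, noise=1.0, rng=None):
    """Drift-diffusion process: accumulate noisy evidence until a bound.

    Returns (choice, reaction_time); choice 0 if the upper bound is hit first.
    """
    rng = rng or random.Random(0)
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        x += drift * dt + noise * math.sqrt(dt) * rng.gauss(0, 1)
        t += dt
    return (0 if x > 0 else 1), t

def q_learn_with_ddm(reward_probs=(0.8, 0.2), lr=0.1, trials=500, scale=3.0):
    """Q-values are updated from outcomes via prediction errors; their
    difference sets the drift rate on the next trial."""
    rng = random.Random(1)
    q = [0.5, 0.5]
    for _ in range(trials):
        choice, _rt = ddm_choice(scale * (q[0] - q[1]), rng=rng)
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        q[choice] += lr * (reward - q[choice])   # delta-rule update
    return q
```

Early on the Q-values are equal, drift is zero, and choices are slow and random; as learning separates the values, drift grows and responses become faster and more accurate, which is the qualitative signature both literatures aim to explain.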

9 Further Reading

1. Collins & Frank (2013) present two levels of modeling describing the interactions between cognitive control and learning needed to construct task-set rules generalizable to novel situations. This endeavor reaps the benefits of both RL models and models of the temporal dynamics of decision making, showing how each constrains the other. It also shows theoretically how a non-parametric Bayesian approach to task-set clustering can be implemented in hierarchical PFC-BG circuitry.

2. Wang (2012) reviews neural models of decision making and their relation to normative theory.

3. Brittain et al. (2012) present evidence for STN involvement in deferring choice under response conflict in a non-reward-based task, complementing findings described in this chapter.

References

Alexander, G.E. & Crutcher, M.D. (1990). Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends in Neurosciences, 13(7), 266-71.

Anderson, J.R. (1991). The adaptive nature of human categorization. Psychological Review, 98(3), 409-429.

Aron, A.R., Behrens, T.E., Smith, S., Frank, M.J. & Poldrack, R. (2007). Triangulating a cognitive control network using diffusion-weighted magnetic resonance imaging (MRI) and functional MRI. Journal of Neuroscience, 27(14), 3743-52.

Badre, D., Doll, B.B., Long, N.M. & Frank, M.J. (2012). Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron, 73, 595-607.

Bódi, N., Kéri, S., Nagy, H., Moustafa, A., Myers, C.E., Daw, N., Dibó, G., Takáts, A., Bereczki, D. & Gluck, M.A. (2009). Reward-learning and the novelty-seeking personality: a between- and within-subjects study of the effects of dopamine agonists on young Parkinson's patients. Brain, 132, 2385-95.

Bogacz, R. & Gurney, K. (2007). The basal ganglia and cortex implement optimal decision making between alternative actions. Neural Computation, 19(2), 442-77.

Bogacz, R. & Larsen, T. (2011). Integration of reinforcement learning and optimal decision-making theories of the basal ganglia. Neural Computation, 23(4), 817-51.

Bogacz, R., Wagenmakers, E.J., Forstmann, B.U. & Nieuwenhuis, S. (2010). The neural basis of the speed-accuracy tradeoff. Trends in Neurosciences, 33(1), 10-16.

Botvinick, M.M., Niv, Y. & Barto, A.C. (2009). Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition, 113(3), 262-80.

Braver, T.S. & Cohen, J.D. (2000). On the control of control: The role of dopamine in regulating prefrontal function and working memory. In S. Monsell & J. Driver (Eds.), Control of cognitive processes: Attention and performance XVIII (pp. 713-737). Cambridge, MA: MIT Press.

Brittain, J.S., Watkins, K.E., Joundi, R.A., Ray, N.J., Holland, P., Green, A.L., Aziz, T.J. & Jenkinson, N. (2012). A role for the subthalamic nucleus in response inhibition during conflict. Journal of Neuroscience, 32(39), 13396-401.

Cavanagh, J.F., Figueroa, C.M., Cohen, M.X. & Frank, M.J. (2011). Frontal theta reflects uncertainty and unexpectedness during exploration and exploitation. Cerebral Cortex, 22(11), 2575-86.

Cavanagh, J.F., Frank, M.J., Klein, T.J. & Allen, J.J.B. (2010). Frontal theta links prediction error to behavioral adaptation in reinforcement learning. NeuroImage, 49(4), 3198-3209.

Cavanagh, J.F., Wiecki, T.V., Cohen, M.X., Figueroa, C.M., Samanta, J., Sherman, S.J. & Frank, M.J. (2011). Subthalamic nucleus stimulation reverses mediofrontal influence over decision threshold. Nature Neuroscience, 14(11), 1462-1467.

Cohen, J.D., Braver, T.S. & Brown, J.W. (2002). Computational perspectives on dopamine function in prefrontal cortex. Current Opinion in Neurobiology, 12(2), 223-9.

Collins, A.G.E. & Frank, M.J. (2012). How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience, 35(7), 1024-35.

Collins, A.G.E. & Frank, M.J. (2013). Cognitive control over learning: Creating, clustering and generalizing task-set structure. Psychological Review.

Coulthard, E.J., Bogacz, R., Javed, S., Mooney, L.K., Murphy, G., Keeley, S. & Whone, A.L. (2012). Distinct roles of dopamine and subthalamic nucleus in learning and probabilistic decision making. Brain.

Dayan, P. & Sejnowski, T. (1996). Exploration bonuses and dual control. Machine Learning, 25, 5-22.

Doll, B.B., Hutchison, K.E. & Frank, M.J. (2011). Dopaminergic genes predict individual differences in susceptibility to confirmation bias. Journal of Neuroscience, 31(16), 6188-98.


Doll, B.B., Jacobs, W.J., Sanfey, A.G. & Frank, M.J. (2009). Instructional control of reinforcement learning: a behavioral and neurocomputational investigation. Brain Research, 1299, 74-94.

Ford, K.A. & Everling, S. (2009). Neural activity in primate caudate nucleus associated with pro- and antisaccades. Journal of Neurophysiology, 102(4), 2334-41.

Forstmann, B.U., Dutilh, G., Brown, S., Neumann, J., von Cramon, D.Y., Ridderinkhof, K.R. & Wagenmakers, E.J. (2008). Striatum and pre-SMA facilitate decision-making under time pressure. Proceedings of the National Academy of Sciences of the United States of America, 105(45), 17538-42.

Forstmann, B.U., Jahfari, S., Scholte, H.S., Wolfensteller, U., van den Wildenberg, W.P. & Ridderinkhof, K.R. (2008). Function and structure of the right inferior frontal cortex predict individual differences in response inhibition: a model-based approach. Journal of Neuroscience, 28(39), 9790-6.

Frazier, P. & Yu, A.J. (2008). Sequential hypothesis testing under stochastic deadlines. Advances in Neural Information Processing Systems, 20, 465-72. Cambridge, MA: MIT Press.

Frank, M.J. & Badre, D. (2012). Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis. Cerebral Cortex, 22(3), 509-26.

Frank, M.J. & Claus, E.D. (2006). Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychological Review, 113(2), 300-26.

Frank, M.J., Doll, B.B., Oas-Terpstra, J. & Moreno, F. (2009). Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nature Neuroscience, 12(8), 1062-8.

Frank, M.J. & Fossella, J.A. (2011). Neurogenetics and pharmacology of learning, motivation, and cognition. Neuropsychopharmacology, 36, 133-152.

Frank, M.J. & Hutchison, K. (2009). Genetic contributions to avoidance-based decisions: striatal D2 receptor polymorphisms. Neuroscience, 164(1), 131-140.

Frank, M.J. & O'Reilly, R.C. (2006). A mechanistic account of striatal dopamine function in human cognition: psychopharmacological studies with cabergoline and haloperidol. Behavioral Neuroscience, 120(3), 497-517.

Frank, M.J. (2005). Dynamic dopamine modulation in the basal ganglia: a neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. Journal of Cognitive Neuroscience, 17(1), 51-72.

Frank, M.J. (2006). Hold your horses: a dynamic computational role for the subthalamic nucleus in decision making. Neural Networks, 19(8), 1120-36.

Frank, M.J. (2011). Computational models of motivated action selection in corticostriatal circuits. Current Opinion in Neurobiology, 21, 381-6.

Frank, M.J., Santamaria, A., O'Reilly, R. & Willcutt, E. (2007). Testing computational models of dopamine and noradrenaline dysfunction in attention deficit/hyperactivity disorder. Neuropsychopharmacology, 32(7), 1583-99.

Frank, M.J., D'Lauro, C. & Curran, T. (2007). Cross-task individual differences in error processing: neural, electrophysiological, and genetic components. Cognitive, Affective, & Behavioral Neuroscience, 7(4), 297-308.

Frank, M.J., Loughry, B. & O'Reilly, R.C. (2001). Interactions between frontal cortex and basal ganglia in working memory: a computational model. Cognitive, Affective, & Behavioral Neuroscience, 1(2), 137-60.


Frank, M.J., Moustafa, A.A., Haughey, H., Curran, T. & Hutchison, K. (2007). Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proceedings of the National Academy of Sciences, 104(41), 16311-6.

Frank, M.J., Samanta, J., Moustafa, A.A. & Sherman, S.J. (2007). Hold your horses: Impulsivity, deep brain stimulation and medication in Parkinsonism. Science, 318, 1309-1312.

Frank, M.J., Seeberger, L.C. & O'Reilly, R.C. (2004). By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science, 306(5703), 1940-3.

Gurney, K., Prescott, T.J. & Redgrave, P. (2001). A computational model of action selection in the basal ganglia. II. Analysis and simulation of behaviour. Biological Cybernetics, 84(6), 411-23.

Hikida, T., Kimura, K., Wada, N., Funabiki, K. & Nakanishi, S. (2010). Distinct roles of synaptic transmission in direct and indirect striatal pathways to reward and aversive behavior. Neuron, 66(6), 896-907.

Isoda, M. & Hikosaka, O. (2007). Switching from automatic to controlled action by monkey medial frontal cortex. Nature Neuroscience, 10(2), 240-8.

Isoda, M. & Hikosaka, O. (2008). Role for subthalamic nucleus neurons in switching from automatic to controlled eye movement. Journal of Neuroscience, 28(28), 7209-18.

Jahfari, S., Verbruggen, F., Frank, M.J., Waldorp, L.J., Colzato, L., Ridderinkhof, K.R. & Forstmann, B.U. (2012). How preparation changes the need for top-down control of the basal ganglia when inhibiting premature actions. Journal of Neuroscience, 32(32), 10870-8.

Koechlin, E. & Summerfield, C. (2007). An information theoretical approach to the prefrontal executive function. Trends in Cognitive Sciences, 11(6), 229-35.

Kravitz, A.V., Freeze, B.S., Parker, P.R.L., Kay, K., Thwin, M.T., Deisseroth, K. & Kreitzer, A.C. (2010). Regulation of parkinsonian motor behaviours by optogenetic control of basal ganglia circuitry. Nature, 466(7306), 622-6.

Kravitz, A.V., Tye, L.D. & Kreitzer, A.C. (2012). Distinct roles for direct and indirect pathway striatal neurons in reinforcement. Nature Neuroscience, 15, 816-818.

Lau, B. & Glimcher, P.W. (2008). Value representations in the primate striatum during matching behavior. Neuron, 58(3), 451-63.

Lo, C.C. & Wang, X.J. (2006). Cortico-basal ganglia circuit mechanism for a decision threshold in reaction time tasks. Nature Neuroscience, 9(7), 956-63.

Maia, T.V. & Frank, M.J. (2011). From reinforcement learning models to psychiatric and neurological disorders. Nature Neuroscience, 14, 154-162.

Mink, J.W. (1996). The basal ganglia: focused selection and inhibition of competing motor programs. Progress in Neurobiology, 50(4), 381-425.

Munafò, M.R., Stothart, G. & Flint, J. (2009). Bias in genetic association studies and impact factor. Molecular Psychiatry, 14, 119-20.

O'Reilly, R.C. & McClelland, J.L. (1994). Hippocampal conjunctive encoding, storage, and recall: avoiding a tradeoff. Hippocampus, 4(6), 661-82.

Pessiglione, M., Seymour, B., Flandin, G., Dolan, R.J. & Frith, C.D. (2006). Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature, 442(7106), 1042-1045.

Ratcliff, R. & Frank, M.J. (2012). Reinforcement-based decision making in corticostriatal circuits: Mutual constraints by neurocomputational and diffusion models. Neural Computation, 24, 1186-1229.

Ratcliff, R. & McKoon, G. (2008). The diffusion decision model: theory and data for two-choice decision tasks. Neural Computation, 20(4), 873-922.

Reynolds, J.R. & O'Reilly, R.C. (2009). Developing PFC representations using reinforcement learning. Cognition, 113(3), 281-92.

Samejima, K., Ueda, Y., Doya, K. & Kimura, M. (2005). Representation of action-specific reward values in the striatum. Science, 310(5752), 1337-40.

Sanborn, A.N., Griffiths, T.L. & Navarro, D.J. (2010). Rational approximations to rational models: Alternative algorithms for category learning. Psychological Review, 117(4), 1144-1167.

Schultz, W. (2002). Getting formal with dopamine and reward. Neuron, 36(2), 241-63.

Shen, W., Flajolet, M., Greengard, P. & Surmeier, D.J. (2008). Dichotomous dopaminergic control of striatal synaptic plasticity. Science, 321(5890), 848-51.

Strauss, G.P., Frank, M.J., Waltz, J.A., Kasanova, Z., Herbener, E.S. & Gold, J.M. (2011). Deficits in positive reinforcement learning and uncertainty-driven exploration are associated with distinct aspects of negative symptoms in schizophrenia. Biological Psychiatry, 69, 424-431.

Tai, L.H., Lee, A.M., Benavidez, N., Bonci, A. & Wilbrecht, L. (2012). Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value. Nature Neuroscience, 15, 1281-89.

Tan, K.R., Yvon, C., Turiault, M., Mirabekov, J.J., Doehner, J., Labouèbe, G., Deisseroth, K., Tye, K.M. & Lüscher, C. (2012). GABA neurons of the VTA drive conditioned place aversion. Neuron, 73, 1173-83.

Tsai, H.C., Zhang, F., Adamantidis, A., Stuber, G.D., Bonci, A., de Lecea, L. & Deisseroth, K. (2009). Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science, 324, 1080-84.

Voon, V., Pessiglione, M., Brezing, C., Gallea, C., Fernandez, H.H., Dolan, R.J. & Hallett, M. (2010). Mechanisms underlying dopamine-mediated reward bias in compulsive behaviors. Neuron, 65(1), 135-42.

Wang, X.J. (2012). Neural dynamics and circuit mechanisms of decision-making. Current Opinion in Neurobiology, 22, 1039-1046.

Watanabe, M. & Munoz, D.P. (2009). Neural correlates of conflict resolution between automatic and volitional actions by basal ganglia. European Journal of Neuroscience, 30(11), 2165-76.

Watkins, C.J.C.H. & Dayan, P. (1992). Q-learning. Machine Learning, 8, 279-92.

Wiecki, T.V. & Frank, M.J. (2010). Neurocomputational models of motor and cognitive deficits in Parkinson's disease. Progress in Brain Research, 183, 275-97.

Wiecki, T.V. & Frank, M.J. (in press). A computational model of inhibitory control in frontal cortex and basal ganglia. Psychological Review.

Wiecki, T.V., Riedinger, K., Meyerhofer, A., Schmidt, W. & Frank, M.J. (2009). A neurocomputational account of catalepsy sensitization induced by D2 receptor blockade in rats: context dependency, extinction, and renewal. Psychopharmacology, 204, 265-77.

Wiecki, T.V., Sofer, I. & Frank, M.J. (2012). Hierarchical Bayesian parameter estimation of Drift Diffusion Models (Version 0.4RC1) [software]. Available from http://ski.clps.brown.edu/hddm_docs/.

Wong, K.F. & Wang, X.J. (2006). A recurrent network mechanism of time integration in perceptual decisions. Journal of Neuroscience, 26(4), 1314-28.

Zaghloul, K., Weidemann, C.T., Lega, B.C., Jaggi, J.L., Baltuch, G.H. & Kahana, M.J. (2012). Neuronal activity in the human subthalamic nucleus encodes decision conflict during action selection. Journal of Neuroscience, 32(7), 2453-60.
