package pbwm
0.0.0-20240905221516-6df987c31e31
Repository: https://github.com/princetoncompmemlab/neurodiff_leabra.git
Documentation: pkg.go.dev

# README

PBWM Version 1

See the sir2 example for a working model.

In the cemer (C++ emergent) versioning system, this is version 5 of PBWM, reflecting a number of intermediate unpublished versions.

In the Go framework, it is version 1.

PBWM is the prefrontal-cortex basal-ganglia working memory model (O'Reilly & Frank, 2006), in which the basal ganglia (BG) drives gating of PFC working memory maintenance, dynamically switching it between updating with new information vs. maintaining existing information. It was originally inspired by existing data, biology, and theory about the role of the BG in motor action selection, and by the LSTM (long short-term memory) computational model of Hochreiter & Schmidhuber (1997), which solved limitations of existing recurrent backpropagation networks by adding dynamic input and output gates. These LSTM models have experienced a significant resurgence, along with backpropagation neural networks in general.

The simple computational idea is that the BG gating signals fire phasically to disinhibit corticothalamic loops through the PFC, enabling the robust maintenance of new information there. In the absence of such gating signals, the PFC will continue to maintain existing information. The output of the BG through the GPi (globus pallidus internal segment) and homologous SNr (substantia nigra pars reticulata) represents a major bottleneck with a relatively small number of neurons, meaning that each BG gating output affects a relatively large group of PFC neurons. One idea is that these BG gating signals target different PFC hypercolumns or stripes -- these correspond to Pools of neurons within the layers in the current implementation.

In the current version, we integrate with the broader DeepLeabra framework (in the deep directory) that incorporates the separation between superficial and deep layers in cortex and their connections with the thalamus: the thalamocortical loops are principally between the deep layers. Thus, within a given PFC area, you can have the superficial layers being more sensitive to current inputs, while the deep layers are more robustly maintaining information through the thalamocortical loops.

Furthermore, it allows a unification of maintenance and output gating, both of which effectively open a gate between superficial and deep layers (via the thalamocortical loops) -- deep layers drive the principal output of frontal areas (e.g., in M1, deep layers directly drive motor actions through subcortical projections). In PFC, deep layers are a major source of top-down activation to other cortical areas, in keeping with the idea of executing "cognitive actions" that influence processing elsewhere in the brain. The only real difference is whether the neurons exhibit significant sustained maintenance, or are only phasically activated by gating. Both profiles are widely observed (e.g., Sommer & Wurtz, 2000).

The key, more complex computational challenges are:

  • How to actually sequence the updating of PFC from maintaining prior information to now encoding new information, which requires some mechanism for clearing out the prior information.

  • How maintenance and output gating within a given area are organized and related to each other.

  • Learning when the BG should drive update vs. maintain signals, which is particularly challenging because of the temporally-delayed nature of the consequences of an earlier gating action -- you only know whether it was useful to maintain something later, when you need it. This is the temporal credit assignment problem.

Updating

For the updating question, we compute a BG gating signal in the middle of the 1st quarter (set by GPiThal.Timing.GateQtr) of the overall AlphaCycle of processing (cycle 18 of 25, per the GPiThal.Timing.Cycle parameter), which has the immediate effect of clearing out the existing PFC activations (see PFCLayer.Maint.Clear), such that by the end of the next quarter (2), the new information is sufficiently represented in the superficial PFC neurons. At the end of the 2nd quarter (per PFCLayer.DeepBurst.BurstQtr), the superficial activations drive updating of the deep layers (via the standard deep CtxtGe computation), to maintain the new information. In keeping with the beta-frequency cycle of the BG / PFC circuit (20 Hz, 50 msec period), we support a second round of gating in the middle of the 3rd quarter (again set by GPiThal.Timing.GateQtr), followed by maintenance updating of the deep layers at the end of the 4th quarter.
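
As a concrete illustration of this timing, here is a minimal, hypothetical sketch (not the package's actual API) of how the gating point could be detected from quarter and cycle counters, assuming 25-cycle quarters and the defaults described above (gating quarters 1 and 3, gating cycle 18):

// Hypothetical illustration only -- mirrors the role of GPiThal.Timing
// (GateQtr, Cycle) described above; not the actual pbwm types.
type GateTiming struct {
	GateQtrs []int // quarters in which gating is computed (e.g., 1 and 3)
	Cycle    int   // cycle within the quarter at which gating is checked (e.g., 18)
}

// IsGateCycle reports whether gating should be evaluated at the given
// quarter (1-4) and cycle within that quarter (1-25).
func (gt *GateTiming) IsGateCycle(qtr, cyc int) bool {
	if cyc != gt.Cycle {
		return false
	}
	for _, q := range gt.GateQtrs {
		if q == qtr {
			return true
		}
	}
	return false
}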

For PFCout layers (with PFCLayer.Gate.OutGate set), there is an OutQ1Only option (default true) which, together with PFCLayer.DeepBurst.BurstQtr set to Q1, causes output gating to update at the end of the 1st quarter, giving it more time to drive output responding. The 2nd beta-frequency gating comes too late in a standard AlphaCycle-based update sequence to drive output, so it is not useful for output gating. However, supporting two phases of maintenance updating allows stripes cleared by output gating (see next subsection) to update in the 2nd half of the alpha cycle, which is useful.

In summary, for PFCmnt maintenance gating:

  • Q1, cycle 18: BG gating, PFC clearing of any existing act
  • Q2, end: Super -> Deep (CtxtGe)
  • Q3, cycle 18: BG gating, PFC clearing
  • Q4, end: Super -> Deep (CtxtGe)

And PFCout output gating:

  • Q1, cycle 18: BG gating -- triggers clearing of corresponding Maint stripe
  • Q1, end: Super -> Deep (CtxtGe) so Deep can drive network output layers

Maint & Output Organization

For the organization of Maint and Out gating, we make the simplifying assumption that each hypercolumn ("stripe") of maintenance PFC has a corresponding output stripe, so you can separately decide to maintain something for an arbitrary amount of time, and subsequently use that information via output gating. A key question then becomes: what happens to the maintained information? Empirically, many studies show a sudden termination of active maintenance at the point of an action using maintained information (Sommer & Wurtz, 2000), which makes computational sense: "use it and lose it". In addition, it is difficult to come up with a good positive signal to independently drive clearing: it is much easier to know when you do need information than to know the point at which you no longer need it. Thus, we have output gating clear the corresponding maintenance stripe (there is an option to turn this off, if you want to experiment). The availability of "open" stripes for subsequent maintenance after this clearing seems to be computationally beneficial in our tests.
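
The 1-to-1 relationship between out and maint stripes can be sketched as follows (hypothetical helper, not the package API): when an out stripe gates, the maint stripe at the same pool index is cleared.

// Hypothetical sketch of the 1-to-1 out -> maint clearing relationship.
// outGated[i] is true if out stripe i gated this quarter; clearMaint
// stands in for the actual pool-level decay of maintained activity.
func ClearMaintFromOut(outGated []bool, clearMaint func(stripe int)) {
	for i, gated := range outGated {
		if gated {
			clearMaint(i) // "use it and lose it": clear the matching maint stripe
		}
	}
}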

Learning

Finally, for the learning question, we adopt a computationally powerful form of trace-based dopamine-modulated learning (in MatrixTracePrjn), where each BG gating action leaves a synaptic trace, which is finally converted into a weight change as a function of the next phasic dopamine signal, providing a summary "outcome" evaluation of the net value of the recent gating actions. This directly solves the temporal credit assignment problem, by allowing the synapses to bridge the temporal gap between action and outcome, over a reasonable time window, with multiple such gating actions separately encodable.
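
The core logic can be sketched as follows (hypothetical names; see MatrixTracePrjn and TraceSyn for the actual implementation): each gating event adds a Hebbian co-activity term to a per-synapse trace, and the next phasic dopamine signal converts the accumulated trace into a weight change and clears it.

// Hypothetical sketch of trace-based, dopamine-modulated learning.
// Tr accumulates gated Hebbian co-activity; da converts it to a weight change.
type TraceSynSketch struct {
	Tr float32 // accumulated trace since the last phasic dopamine signal
}

// OnGate is called when the receiving stripe gates: add sender-receiver
// co-activity to the trace (gated Hebbian term).
func (ts *TraceSynSketch) OnGate(sendAct, recvAct float32) {
	ts.Tr += sendAct * recvAct
}

// OnDa is called at the next phasic dopamine signal: the trace times the
// dopamine value becomes the weight change, and the trace is cleared.
func (ts *TraceSynSketch) OnDa(da, lrate float32) (dwt float32) {
	dwt = lrate * da * ts.Tr
	ts.Tr = 0
	return dwt
}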

Biologically, we suggest that widely-studied synaptic tagging mechanisms have the appropriate properties for this trace mechanism. Extensive research has shown that these synaptic tags, based on actin fiber networks in the synapse, can persist for up to 90 minutes, and when a subsequent strong learning event occurs, the tagged synapses are also strongly potentiated (Redondo & Morris, 2011, Rudy, 2015, Bosch & Hayashi, 2012).

This form of trace-based learning is very effective computationally, because it does not require any other mechanisms to enable learning about the reward implications of earlier gating events. In earlier versions of the PBWM model, we relied on CS (conditioned stimulus) based phasic dopamine to reinforce gating, but this scheme requires that the PFC maintained activations function as a kind of internal CS signal, and that the amygdala learn to decode these PFC activation states to determine if a useful item had been gated into memory.

Under the trace-based framework, CS-driven DA effectively serves to reinforce sub-goal actions that lead to the activation of a CS, which in turn predicts the final reward outcome. Thus, the CS DA provides an intermediate, bridging kind of reinforcement, evaluating the set of actions leading up to that point -- a kind of "check point" of success prior to getting the real thing.

Layers

Here are the details about each different layer type in PBWM:

  • MatrixLayer: this is the dynamic gating system representing the matrix units within the dorsal striatum of the basal ganglia. The MatrixGo layer contains the "Go" (direct pathway) units (DaR = D1), while the MatrixNoGo layer contains the "NoGo" (indirect pathway, DaR = D2) units. The Go units, expressing more D1 receptors, increase their weights from dopamine bursts and decrease weights from dopamine dips, and vice-versa for the NoGo units with more D2 receptors (see the sketch after this list). In a change that is more consistent with BG biology than earlier versions of this model, most of the competition to select the final gating action happens in the GPe and GPi (with the hyperdirect pathway to the subthalamic nucleus also playing a critical role, but not included in this more abstracted model), with only a relatively weak level of competition within the Matrix layers. We also combine the maintenance and output gating stripes in the same Matrix layer, which allows them all to compete with each other here, and more importantly in the subsequent GPi and GPe stripes. This competitive interaction is critical for allowing the system to learn to properly coordinate when it is appropriate to update / store new information for maintenance vs. when it is important to select from currently stored representations via output gating.

  • GPeNoGo: This layer provides a first round of competition among all the NoGo stripes, which critically prevents the model from driving NoGo to all of the stripes at once. Indeed, there is physiological and anatomical evidence for NoGo unit collateral inhibition onto other NoGo units. Without this NoGo-level competition, models frequently ended up in a state where all stripes were inhibited by NoGo -- and when nothing happens, nothing can be learned, so the model essentially fails at that point.

  • GPiThalLayer: Has a strong competition for selecting which stripe gets to gate, based on projections from the MatrixGo units, and on the NoGo influence from GPeNoGo, which can effectively veto a few of the possible stripes to prevent gating. We have combined the functions of the GPi (or SNr) and the thalamus into a single abstracted layer, which has the excitatory kind of output we would expect from the thalamus, but also implements the stripe-level competition mediated by the GPi / SNr. If there is more overall Go than NoGo activity, then the GPiThal unit gets activated, which effectively establishes an excitatory loop through the corresponding deep layers of the PFC, with which the thalamic neurons are bidirectionally interconnected. This layer uses the GateLayer framework to update GateState, which is broadcast to the Matrix and PFC layers so they have current gating-state information.

  • PFCLayer: Uses super vs. deep dynamics, with gating (via GateState values broadcast from GPiThal) determining when super drives deep. Actual maintenance in the deep layer can be shaped using PFCDyn fixed dynamics, which provide a simple way of generating a temporally-evolving activation pattern over the layer, with the minimal case being stable fixed maintenance. Gating in an out stripe drives clearing of maintenance in the corresponding mnt stripe.
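
As referenced in the MatrixLayer item above, the sign of dopamine-driven learning depends on the dominant receptor type. A minimal sketch (hypothetical standalone definitions mirroring the package's DaReceptors D1R / D2R constants; the actual learning lives in MatrixTracePrjn):

// Hypothetical sketch of the D1 vs. D2 sign of dopamine-driven learning.
type DaR int

const (
	D1R DaR = iota // Go / direct pathway: bursts increase weights, dips decrease
	D2R            // NoGo / indirect pathway: bursts decrease weights, dips increase
)

// DaLearnSign returns the effective dopamine signal for learning:
// same sign as da for D1 (Go) units, opposite sign for D2 (NoGo) units.
func DaLearnSign(r DaR, da float32) float32 {
	if r == D2R {
		return -da
	}
	return da
}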

Dopamine Layers

See the rl package for the DALayer interface and simpler forms of phasic dopamine algorithms, including Rescorla-Wagner and Temporal Differences (TD).

Given that PBWM minimally requires an RW-level "primary value" dopamine signal, basic models can use this as follows:

  • Rew, RWPred, SNc: The Rew layer represents the reward activation driven on the Recall trials based on whether the model gets the problem correct or not, with either a 0 (error, no reward) or 1 (correct, reward) activation. RWPred is the prediction layer that learns based on dopamine signals to predict how much reward will be obtained on this trial. The SNc is the final dopamine unit activation, reflecting reward prediction errors. When outcomes are better (worse) than expected or states are predictive of reward (no reward), this unit will increase (decrease) activity. For convenience, tonic (baseline) states are represented here with zero values, so that phasic deviations above and below this value are observable as positive or negative activations. (In the real system negative activations are not possible, but negative prediction errors are observed as a pause in dopamine unit activity, such that firing rate drops from baseline tonic levels). Biologically the SNc actually projects dopamine to the dorsal striatum, while the VTA projects to the ventral striatum, but there is no functional difference in this level of model.
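
For orientation, the Rescorla-Wagner style dopamine computation described above amounts to a simple reward prediction error (a sketch, not the rl package's actual code): the SNc phasic dopamine value is the difference between the received reward and the RWPred prediction, relative to a zero tonic baseline.

// Hypothetical sketch of the Rescorla-Wagner style dopamine signal:
// phasic DA = actual reward - predicted reward, relative to a zero tonic
// baseline (negative values stand in for pauses in dopamine firing).
func PhasicDA(rew, rewPred float32) float32 {
	return rew - rewPred
}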

Implementation Details

Network

The pbwm.Network provides "wizard" methods for constructing and configuring standard PBWM and RL components.

It extends the core Cycle method called every cycle of updating as follows:

func (nt *Network) Cycle(ltime *leabra.Time) {
	nt.Network.Network.Cycle(ltime) // basic version from leabra.Network (not deep.Network, which calls DeepBurst)
	nt.GateSend(ltime) // GateLayer (GPiThal) computes gating, sends to other layers
	nt.RecGateAct(ltime) // Record activation state at time of gating (in ActG neuron var)
	nt.DeepBurst(ltime) // Act -> Burst (during BurstQtr) (see deep for details)
	nt.SendMods(ltime) // send modulators (DA)
}

which determines the additional steps of computation after the activations have been updated in the current cycle, supporting the extra gating and DA modulation functions.

From deep.Network, there is a key addition to the QuarterFinal method that calls DeepCtxt, which in turn calls SendCtxtGe and CtxtFmGe -- this is how the deep layers get their "context" inputs from the corresponding superficial layers (mediated through layer 5IB neurons in the biology, which burst periodically). This is when the PFC layers update deep from super.
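
Schematically, this context step can be sketched as a weighted sum of superficial Burst activity onto the deep layer (a hypothetical stand-in for the deep package's SendCtxtGe / CtxtFmGe logic, ignoring normalization and other details):

// Schematic sketch of the super -> deep context step: at the end of a
// BurstQtr quarter, superficial Burst (5IB) activations become the deep
// layer's CtxtGe context input via the context projection weights.
// wts is indexed as wts[recvDeep][sendSuper].
func DeepCtxtSketch(superBurst []float32, wts [][]float32) []float32 {
	deepCtxtGe := make([]float32, len(wts))
	for ri, rw := range wts {
		var ge float32
		for si, w := range rw {
			ge += w * superBurst[si]
		}
		deepCtxtGe[ri] = ge
	}
	return deepCtxtGe
}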

GPiThal and GateState

GPiThalLayer is the source of the key GateState values:

GateState.Cnt provides the key tracker of gating state. It is separately updated in each layer -- GPiThal only broadcasts the basic Act and Now signals. For PFC, Cnt takes the following values (summarized in the sketch after this list):

  • -1 = initialized to this value, not maintaining.
  • 0 = just gated -- any time the GPiThal activity exceeds the gating threshold (at the specified Timing.Cycle), the counter is reset (re-gate).
  • >= 1 = maintaining -- first gating goes to 1 in QuarterFinal of the BurstQtr gating quarter, and counts up thereafter.
  • <= -1 = not maintaining -- when cleared, reset to -1 in Quarter_Init just following the clearing quarter, and counts down thereafter.
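
These semantics can be summarized in a small sketch (field names follow the GateState description above; this is not the full struct):

// Sketch of the gating-state values described above (not the full GateState).
type GateStateSketch struct {
	Act float32 // gating activation of the corresponding GPiThal unit
	Now bool    // true only on the cycle when gating occurred
	Cnt int     // gating counter, with the values listed above
}

// Maintaining reports whether the stripe is currently maintaining
// (Cnt >= 1: gated and counting up; Cnt <= -1: cleared / not maintaining).
func (gs *GateStateSketch) Maintaining() bool {
	return gs.Cnt >= 1
}

// JustGated reports whether the stripe gated at the current gating cycle.
func (gs *GateStateSketch) JustGated() bool {
	return gs.Cnt == 0
}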

All gated PBWM layers are of type GateLayer, which just has the infrastructure to maintain GateState values and synchronize them across layers.

PFCLayer

PFCLayer supports both mnt and out gating types, and both Super and Deep PFC layers.

  • Cycle:

    • ActFmG calls Gating, which is only run on Super layers and only does something when GateState.Now = true: it calls DecayStatePool if GateState.Act > 0 (i.e., the stripe has gated) to clear out any existing activation, and resets Cnt = 0 to indicate just-gated. For out layers, it also clears the corresponding mnt stripe.

    • BurstFmAct for Super layers applies gating from the Cnt state to the Burst activations (which reflect 5IB activity as gated by the BG, and are what is sent to the deep layer during SendCtxtGe).

  • QuarterFinal for Super calls GateStateToDeep to copy updated GateState info computed in Gating over to the corresponding Deep layer.

    • SendCtxtGe (called after QuarterFinal by the Network in a separate pass) updates GateState.Cnt for the Super and Deep layers, incrementing Cnt for maintaining layers and decrementing it for non-maintaining ones. Super then sends CtxtGe to Deep.

    • CtxtFmGe (only on Deep) gets the CtxtGe value from Super (always) and calls DeepMaint, which applies the PFCDyn dynamics to the CtxtGe currents if those are in use (see the sketch below). It saves the initial CtxtGe as a Maint neuron-level value, which is visible as the Cust1 variable in NetView and is used to multiply the dynamics by the original activation strength.
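
A sketch of the dynamics application mentioned above (hypothetical; see PFCDyn and DeepMaint for the real version): the saved Maint value scales a time-varying dynamic function of the number of quarters since gating.

// Hypothetical sketch of fixed PFC maintenance dynamics: the initially
// gated context strength (maint) is multiplied by a dynamic function of
// time since gating (indexed here by the GateState.Cnt quarter counter).
func DynCtxtGe(maint float32, cnt int, dyn func(cnt int) float32) float32 {
	if cnt < 1 {
		return 0 // not maintaining
	}
	return maint * dyn(cnt) // dyn == 1 everywhere gives stable fixed maintenance
}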

In summary, for PFCmnt maintenance gating:

  • Q1, cycle 18: BG gating, PFC clearing of any existing act: Gating call
  • Q2, end: Super -> Deep (CtxtGe): QuarterFinal based on BurstFmAct gated Burst vals
  • ...

And PFCout output gating:

  • Q1, cycle 18: BG gating -- triggers clearing of corresponding Maint stripe
  • Q1, end: Super -> Deep (CtxtGe) so Deep can drive network output layers

TODO

  • Matrix uses net_gain = 0.5 -- why?? important for SIR2?, but not SIR1

  • patch -- not essential for SIR1, test in SIR2

  • TAN -- not essential for SIR1, test for SIR2

  • del_inhib -- delta inhibition -- SIR1 MUCH faster learning without! test for SIR2

  • slow_wts -- not important for SIR1, test for SIR2

  • GPe, GPi learning too -- allows Matrix to act like a hidden layer!

  • Currently only supporting 1-to-1 Maint and Out prjns -- Out gating automatically clears same pool in maint -- could explore different arrangements

References

Bosch, M., & Hayashi, Y. (2012). Structural plasticity of dendritic spines. Current Opinion in Neurobiology, 22(3), 383–388. https://doi.org/10.1016/j.conb.2011.09.002

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

O'Reilly, R. C., & Frank, M. J. (2006). Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Computation, 18(2), 283–328.

Redondo, R. L., & Morris, R. G. M. (2011). Making memories last: The synaptic tagging and capture hypothesis. Nature Reviews Neuroscience, 12(1), 17–30. https://doi.org/10.1038/nrn2963

Rudy, J. W. (2015). Variation in the persistence of memory: An interplay between actin dynamics and AMPA receptors. Brain Research, 1621, 29–37. https://doi.org/10.1016/j.brainres.2014.12.009

Sommer, M. A., & Wurtz, R. H. (2000). Composition and topographic organization of signals sent from the frontal eye field to the superior colliculus. Journal of Neurophysiology, 83(4), 1979–2001.

# Constants

Appetitive is a positive valence US (food, water, etc).
Aversive is a negative valence US (shock, threat, etc).
D1R primarily expresses Dopamine D1 Receptors -- dopamine is excitatory and bursts of dopamine lead to increases in synaptic weight, while dips lead to decreases -- direct pathway in dorsal striatum.
D2R primarily expresses Dopamine D2 Receptors -- dopamine is inhibitory and bursts of dopamine lead to decreases in synaptic weight, while dips lead to increases -- indirect pathway in dorsal striatum.
Maint is maintenance gating -- toggles active maintenance in PFC.
MaintOut for maint and output gating.
Out is output gating -- drives deep layer activation.

# Variables

NeuronVars are the pbwm neurons plus some custom variables that sub-types use for their algo-specific cases -- need a consistent set of overall network-level vars for display / generic interface.
SynVarsAll is the pbwm collection of all synapse-level vars (includes TraceSynVars).

# Structs

AChSrcLayer is the basic type of layer that sends ACh to other layers.
DaHebbPrjn does dopamine-modulated Hebbian learning -- i.e., the 3-factor learning rule: Da * Recv.Act * Send.Act.
Params for effects of dopamine (Da) based modulation, typically adding a Da-based term to the Ge excitatory synaptic input.
GateLayer is a layer that cares about thalamic (BG) gating signals, and has slice of GateState fields that a gating layer will update.
GateShape defines the shape of the outer pool dimensions of gating layers, organized into Maint and Out subsets which are arrayed along the X axis with Maint first (to the left) then Out.
GateState is gating state values stored in layers that receive thalamic gating signals including MatrixLayer, PFCLayer, GPiThal layer, etc -- use GateLayer as base layer to include.
GPiGateParams has gating parameters for gating in GPiThal layer, including threshold.
GPiNeuron contains extra variables for GPiThalLayer neurons -- stored separately.
GPiThalLayer represents the combined Winner-Take-All dynamic of GPi (SNr) and Thalamus.
GPiThalPrjn accumulates per-prjn raw conductance that is needed for separately weighting NoGo vs. Go inputs.
GPiTimingParams has timing parameters for gating in the GPiThal layer.
pbwm.Layer is the base layer type for the PBWM framework -- has variables for the layer-level neuromodulatory variables: dopamine, ach, serotonin.
MatrixLayer represents the dorsal matrisome MSN's that are the main Go / NoGo gating units in BG driving updating of PFC WM in PBWM.
MatrixNeuron contains extra variables for MatrixLayer neurons -- stored separately.
MatrixParams has parameters for Dorsal Striatum Matrix computation These are the main Go / NoGo gating units in BG driving updating of PFC WM in PBWM.
MatrixTracePrjn does dopamine-modulated, gated trace learning, for Matrix learning in PBWM context.
ModLayer provides DA modulated learning to basic Leabra layers.
pbwm.Network has methods for configuring specialized PBWM network components.
PFC dynamic behavior element -- defines the dynamic behavior of deep layer PFC units.
PFCGateParams has parameters for PFC gating.
PFCLayer is a Prefrontal Cortex BG-gated working memory layer.
PFCMaintParams for PFC maintenance functions.
PFCNeuron contains extra variables for PFCLayer neurons -- stored separately.
Params for trace-based learning in the MatrixTracePrjn.
TraceSyn holds extra synaptic state for trace projections.

# Interfaces

GateLayerer is an interface for GateLayer layers.
PBWMLayer defines the essential algorithmic API for PBWM at the layer level.
PBWMPrjn defines the essential algorithmic API for PBWM at the projection level.

# Type aliases

DaReceptors for D1R and D2R dopamine receptors.
GateTypes for region of striatum.
NeurVars are indexes into extra PBWM neuron-level variables.
PFCDyns is a slice of dyns.
Valences for Appetitive and Aversive valence coding.