package rl
Version: 0.0.0-20240905221516-6df987c31e31
Repository: https://github.com/princetoncompmemlab/neurodiff_leabra.git
Documentation: pkg.go.dev

# README

Reinforcement Learning and Dopamine

The rl package provides core infrastructure for dopamine neuromodulation and reinforcement learning, including the Rescorla-Wagner learning algorithm (RW) and Temporal Differences (TD) learning, and a minimal ClampDaLayer that can be used to send an arbitrary DA signal.

  • da.go defines a simple DALayer interface for getting and setting dopamine values, and a SendDA list of layer names with convenience methods and the ability to send dopamine to any layer that implements the DALayer interface (a sketch of this pattern follows this list).

  • The RW and TD DA layers use the CyclePost layer-level method to send DA to other layers at the end of each cycle, after activations are updated. Thus DA lags by one cycle, which typically should not be a problem.

  • See the separate pvlv package for the full biologically based PVLV model built on top of this basic DA infrastructure.
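
The sketch below illustrates the pattern the first two bullets describe: a getter/setter dopamine interface and a CyclePost-style broadcast at the end of a cycle. The method names, toy types, and the cyclePost helper are illustrative assumptions based on this README, not the package's actual implementation.

```go
package main

import "fmt"

// DALayer is a layer carrying a dopamine value (method names assumed).
type DALayer interface {
	GetDA() float32   // current dopamine level
	SetDA(da float32) // set dopamine level
}

// toyLayer is a hypothetical stand-in for a leabra layer.
type toyLayer struct {
	name string
	da   float32
}

func (t *toyLayer) GetDA() float32   { return t.da }
func (t *toyLayer) SetDA(da float32) { t.da = da }

// cyclePost mimics the CyclePost step: after activations update, the DA
// source broadcasts its dopamine to receivers, so receivers see DA with
// a one-cycle lag.
func cyclePost(src DALayer, recv []DALayer) {
	for _, ly := range recv {
		ly.SetDA(src.GetDA())
	}
}

func main() {
	src := &toyLayer{name: "RWDa", da: 0.8}
	tgt := &toyLayer{name: "Recv"}
	cyclePost(src, []DALayer{tgt})
	fmt.Println(tgt.GetDA()) // 0.8
}
```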

# Functions

AddClampDaLayer adds a ClampDaLayer of the given name.
AddRWLayers adds a simple Rescorla-Wagner (PV-only) dopamine system, with a primary Reward layer, an RWPred prediction layer, and a dopamine layer that computes the difference between reward and prediction.
AddTDLayers adds the standard TD temporal-differences layers, generating a DA signal.
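
As a rough illustration of what these helpers construct, here is a minimal sketch mirroring the shape of AddRWLayers. The toy types are hypothetical; the real function operates on a *leabra.Network with positioning arguments whose exact signatures are not shown here.

```go
package main

import "fmt"

// layer and network are hypothetical stand-ins for leabra's layer and
// *leabra.Network types.
type layer struct{ name string }

type network struct{ layers []*layer }

func (n *network) addLayer(name string) *layer {
	ly := &layer{name: name}
	n.layers = append(n.layers, ly)
	return ly
}

// addRWLayers mirrors the shape of AddRWLayers described above: a
// primary Reward layer, an RWPred prediction layer, and a DA layer that
// computes the difference between them.
func addRWLayers(n *network, prefix string) (rew, rwPred, rwDa *layer) {
	rew = n.addLayer(prefix + "Rew")
	rwPred = n.addLayer(prefix + "RWPred")
	rwDa = n.addLayer(prefix + "DA")
	return
}

func main() {
	net := &network{}
	rew, pred, da := addRWLayers(net, "")
	fmt.Println(rew.name, pred.name, da.name) // Rew RWPred DA
}
```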

# Variables

The package also exports nine variables for which no description is provided by the author; see the pkg.go.dev listing for their names and types.

# Structs

ClampAChLayer is an Input layer that just sends its activity as the acetylcholine signal.
ClampDaLayer is an Input layer that just sends its activity as the dopamine signal.
RWDaLayer computes a dopamine (DA) signal based on a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework).
RWPredLayer computes reward prediction for a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework).
RWPrjn does dopamine-modulated learning for reward prediction: DWt = Da * Send.Act. Typically used in RWPredLayer to generate reward predictions.
TDDaLayer computes a dopamine (DA) signal as the temporal difference (TD) between the TDRewIntegLayer activations in the minus and plus phase.
TDRewIntegLayer is the temporal differences reward integration layer.
TDRewIntegParams are parameters for the reward integrator layer.
TDRewPredLayer is the temporal differences reward prediction layer.
TDRewPredPrjn does dopamine-modulated learning for reward prediction: DWt = Da * Send.ActQ0 (activity on the *previous* timestep). Typically used in TDRewPredLayer to generate reward predictions.
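
The two projection types above both implement a dopamine-modulated delta rule; a minimal sketch of the weight changes, assuming a learning-rate factor lrate (not stated in the summaries above):

```go
package main

import "fmt"

// rwDWt: Rescorla-Wagner projection weight change, driven by the
// sending unit's current activity (Send.Act).
func rwDWt(lrate, da, sendAct float32) float32 {
	return lrate * da * sendAct
}

// tdDWt: TD reward-prediction projection weight change, driven by the
// sending unit's activity on the previous timestep (Send.ActQ0).
func tdDWt(lrate, da, sendActQ0 float32) float32 {
	return lrate * da * sendActQ0
}

func main() {
	// Positive DA increases weights from active senders; negative DA
	// decreases them.
	fmt.Println(rwDWt(0.1, 0.5, 1.0))  // 0.05
	fmt.Println(tdDWt(0.1, -0.5, 0.8)) // -0.04
}
```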

# Interfaces

AChLayer is an interface for a layer with acetylcholine neuromodulator on it.
DALayer is an interface for a layer with dopamine neuromodulator on it.

# Type aliases

SendACh is a list of layers to send acetylcholine to.
SendDA is a list of layers to send dopamine to.
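
A hedged sketch of the SendDA pattern (SendACh is analogous): a list of layer names plus a method that pushes a dopamine value to each named layer implementing DALayer. The name-to-layer map used here is an illustrative assumption; the real type resolves names through the network.

```go
package main

import "fmt"

// DALayer is the dopamine interface (only the setter is needed here).
type DALayer interface{ SetDA(da float32) }

// toyLayer is a hypothetical stand-in for a leabra layer.
type toyLayer struct{ da float32 }

func (t *toyLayer) SetDA(da float32) { t.da = da }

// SendDA is a list of layer names to send dopamine to.
type SendDA []string

// SendDA pushes da to every named layer found in byName; layers that do
// not implement DALayer would simply be absent from the map.
func (sd SendDA) SendDA(byName map[string]DALayer, da float32) {
	for _, nm := range sd {
		if ly, ok := byName[nm]; ok {
			ly.SetDA(da)
		}
	}
}

func main() {
	tgt := &toyLayer{}
	sd := SendDA{"Recv"}
	sd.SendDA(map[string]DALayer{"Recv": tgt}, 0.6)
	fmt.Println(tgt.da) // 0.6
}
```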