package rl
Version: 0.0.0-20240905221516-6df987c31e31
Repository: https://github.com/princetoncompmemlab/neurodiff_leabra.git
Documentation: pkg.go.dev

# README

Reinforcement Learning and Dopamine

The rl package provides core infrastructure for dopamine neuromodulation and reinforcement learning, including the Rescorla-Wagner learning algorithm (RW) and Temporal Differences (TD) learning, and a minimal ClampDaLayer that can be used to send an arbitrary DA signal.

  • da.go defines a simple DALayer interface for getting and setting dopamine values, and a SendDA list of layer names with convenience methods and the ability to send dopamine to any layer that implements the DALayer interface (a sketch of this pattern follows this list).

  • The RW and TD DA layers use the CyclePost layer-level method to send DA to other layers at the end of each cycle, after activations are updated. Thus DA lags by one cycle, which typically should not be a problem.

  • See the separate pvlv package for the full biologically based PVLV model built on top of this basic DA infrastructure.
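
The sketch below illustrates the pattern the first two bullets describe: a getter/setter dopamine interface and a CyclePost-style broadcast at the end of a cycle. The method names, toy types, and the cyclePost helper are illustrative assumptions based on this README, not the package's actual implementation.

```go
package main

import "fmt"

// DALayer is a layer carrying a dopamine value (method names assumed).
type DALayer interface {
	GetDA() float32   // current dopamine level
	SetDA(da float32) // set dopamine level
}

// toyLayer is a hypothetical stand-in for a leabra layer.
type toyLayer struct {
	name string
	da   float32
}

func (t *toyLayer) GetDA() float32   { return t.da }
func (t *toyLayer) SetDA(da float32) { t.da = da }

// cyclePost mimics the CyclePost step: after activations update, the DA
// source broadcasts its dopamine to receivers, so receivers see DA with
// a one-cycle lag.
func cyclePost(src DALayer, recv []DALayer) {
	for _, ly := range recv {
		ly.SetDA(src.GetDA())
	}
}

func main() {
	src := &toyLayer{name: "RWDa", da: 0.8}
	tgt := &toyLayer{name: "Recv"}
	cyclePost(src, []DALayer{tgt})
	fmt.Println(tgt.GetDA()) // 0.8
}
```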

# Functions

AddClampDaLayer adds a ClampDaLayer of the given name.
AddRWLayers adds a simple Rescorla-Wagner (PV-only) dopamine system, with a primary Reward layer, an RWPred prediction layer, and a dopamine layer that computes the difference between reward and prediction.
AddTDLayers adds the standard TD temporal-differences layers, generating a DA signal.
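
As a rough illustration of what these helpers construct, here is a minimal sketch mirroring the shape of AddRWLayers. The toy types are hypothetical; the real function operates on a *leabra.Network with positioning arguments whose exact signatures are not shown here.

```go
package main

import "fmt"

// layer and network are hypothetical stand-ins for leabra's layer and
// *leabra.Network types.
type layer struct{ name string }

type network struct{ layers []*layer }

func (n *network) addLayer(name string) *layer {
	ly := &layer{name: name}
	n.layers = append(n.layers, ly)
	return ly
}

// addRWLayers mirrors the shape of AddRWLayers described above: a
// primary Reward layer, an RWPred prediction layer, and a DA layer that
// computes the difference between them.
func addRWLayers(n *network, prefix string) (rew, rwPred, rwDa *layer) {
	rew = n.addLayer(prefix + "Rew")
	rwPred = n.addLayer(prefix + "RWPred")
	rwDa = n.addLayer(prefix + "DA")
	return
}

func main() {
	net := &network{}
	rew, pred, da := addRWLayers(net, "")
	fmt.Println(rew.name, pred.name, da.name) // Rew RWPred DA
}
```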

# Variables

The package also exports nine variables for which no description is provided by the author; see the pkg.go.dev listing for their names and types.

# Structs

ClampAChLayer is an Input layer that just sends its activity as the acetylcholine signal.
ClampDaLayer is an Input layer that just sends its activity as the dopamine signal.
RWDaLayer computes a dopamine (DA) signal based on a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework).
RWPredLayer computes reward prediction for a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework).
RWPrjn does dopamine-modulated learning for reward prediction: DWt = Da * Send.Act. Typically used in RWPredLayer to generate reward predictions.
TDDaLayer computes a dopamine (DA) signal as the temporal difference (TD) between the TDRewIntegLayer activations in the minus and plus phase.
TDRewIntegLayer is the temporal differences reward integration layer.
TDRewIntegParams are parameters for the reward integrator layer.
TDRewPredLayer is the temporal differences reward prediction layer.
TDRewPredPrjn does dopamine-modulated learning for reward prediction: DWt = Da * Send.ActQ0 (activity on the *previous* timestep). Typically used in TDRewPredLayer to generate reward predictions.
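
The two projection types above both implement a dopamine-modulated delta rule; a minimal sketch of the weight changes, assuming a learning-rate factor lrate (not stated in the summaries above):

```go
package main

import "fmt"

// rwDWt: Rescorla-Wagner projection weight change, driven by the
// sending unit's current activity (Send.Act).
func rwDWt(lrate, da, sendAct float32) float32 {
	return lrate * da * sendAct
}

// tdDWt: TD reward-prediction projection weight change, driven by the
// sending unit's activity on the previous timestep (Send.ActQ0).
func tdDWt(lrate, da, sendActQ0 float32) float32 {
	return lrate * da * sendActQ0
}

func main() {
	// Positive DA increases weights from active senders; negative DA
	// decreases them.
	fmt.Println(rwDWt(0.1, 0.5, 1.0))  // 0.05
	fmt.Println(tdDWt(0.1, -0.5, 0.8)) // -0.04
}
```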

# Interfaces

AChLayer is an interface for a layer with acetylcholine neuromodulator on it.
DALayer is an interface for a layer with dopamine neuromodulator on it.

# Type aliases

SendACh is a list of layers to send acetylcholine to.
SendDA is a list of layers to send dopamine to.
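
A hedged sketch of the SendDA pattern (SendACh is analogous): a list of layer names plus a method that pushes a dopamine value to each named layer implementing DALayer. The name-to-layer map used here is an illustrative assumption; the real type resolves names through the network.

```go
package main

import "fmt"

// DALayer is the dopamine interface (only the setter is needed here).
type DALayer interface{ SetDA(da float32) }

// toyLayer is a hypothetical stand-in for a leabra layer.
type toyLayer struct{ da float32 }

func (t *toyLayer) SetDA(da float32) { t.da = da }

// SendDA is a list of layer names to send dopamine to.
type SendDA []string

// SendDA pushes da to every named layer found in byName; layers that do
// not implement DALayer would simply be absent from the map.
func (sd SendDA) SendDA(byName map[string]DALayer, da float32) {
	for _, nm := range sd {
		if ly, ok := byName[nm]; ok {
			ly.SetDA(da)
		}
	}
}

func main() {
	tgt := &toyLayer{}
	sd := SendDA{"Recv"}
	sd.SendDA(map[string]DALayer{"Recv": tgt}, 0.6)
	fmt.Println(tgt.da) // 0.6
}
```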