Please use this identifier to cite or link to this item: https://doi.org/10.1093/cercor/bhab456
DC Field | Value
dc.title | A Nonlinear Hidden Layer Enables Actor-Critic Agents to Learn Multiple Paired Association Navigation.
dc.contributor.author | Kumar, M Ganesh
dc.contributor.author | Tan, Cheston
dc.contributor.author | Libedinsky, Camilo
dc.contributor.author | Yen, Shih-Cheng
dc.contributor.author | Tan, Andrew YY
dc.date.accessioned | 2022-04-08T07:36:17Z
dc.date.available | 2022-04-08T07:36:17Z
dc.date.issued | 2022-01-17
dc.identifier.citation | Kumar, M Ganesh, Tan, Cheston, Libedinsky, Camilo, Yen, Shih-Cheng, Tan, Andrew YY (2022-01-17). A Nonlinear Hidden Layer Enables Actor-Critic Agents to Learn Multiple Paired Association Navigation. Cereb Cortex. ScholarBank@NUS Repository. https://doi.org/10.1093/cercor/bhab456
dc.identifier.issn | 10473211
dc.identifier.issn | 14602199
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/218744
dc.description.abstract | Navigation to multiple cued reward locations has been increasingly used to study rodent learning. Though deep reinforcement learning agents have been shown to be able to learn the task, they are not biologically plausible. Biologically plausible classic actor-critic agents have been shown to learn to navigate to single reward locations, but which biologically plausible agents are able to learn multiple cue-reward location tasks has remained unclear. In this computational study, we show versions of classic agents that learn to navigate to a single reward location, and adapt to reward location displacement, but are not able to learn multiple paired association navigation. The limitation is overcome by an agent in which place cell and cue information are first processed by a feedforward nonlinear hidden layer with synapses to the actor and critic subject to temporal difference error-modulated plasticity. Faster learning is obtained when the feedforward layer is replaced by a recurrent reservoir network.
dc.publisher | Oxford University Press (OUP)
dc.source | Elements
dc.subject | Hebbian plasticity
dc.subject | reinforcement learning
dc.subject | temporal difference error
dc.type | Article
dc.date.updated | 2022-04-07T04:35:36Z
dc.contributor.department | DEAN'S OFFICE (ENGINEERING)
dc.contributor.department | PHYSIOLOGY
dc.description.doi | 10.1093/cercor/bhab456
dc.description.sourcetitle | Cereb Cortex
dc.published.state | Unpublished
Appears in Collections: Staff Publications, Elements, Students Publications

Files in This Item:
File | Description | Size | Format | Access Settings | Version
Cerebral Cortex resubmission 2111051235 NUS ScholarBank.pdf | Accepted version | 5.15 MB | Adobe PDF | OPEN | Post-print

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.