a Computation and Neural Systems, California Institute of Technology, Pasadena, California, USA
Key Words: reinforcement learning; learning rate; least squares learning; dopaminergic system; reward anticipation; prediction risk; uncertainty; adaptive encoding
Address for correspondence: Peter Bossaerts, m/c 228-77 California Institute of Technology, Pasadena, CA 91125, USA. Voice: +1-626-395-4028; fax: +1-626-405-9841. pbs{at}rioja.caltech.edu
This article analyzes the simple Rescorla-Wagner learning rule from the vantage point of least squares learning theory. In particular, it suggests how measures of risk, such as prediction risk, can be used to adjust the learning constant in reinforcement learning. It argues that prediction risk is most effectively incorporated by scaling the prediction errors. This way, the learning rate needs adjusting only when the covariance between optimal predictions and past (scaled) prediction errors changes. Evidence is discussed suggesting that the dopaminergic system in the (human and nonhuman) primate brain encodes prediction risk, and that prediction errors are indeed scaled with prediction risk (adaptive encoding).
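As an illustration of the idea of scaling prediction errors by prediction risk, the following is a minimal sketch, not the article's own formulation. It assumes that prediction risk is tracked as a running average of squared prediction errors and that errors are scaled by dividing by the square root of that estimate. The function name, the parameter names, and both of these choices are assumptions made for illustration.

```python
def rescorla_wagner(rewards, learning_rate=0.1, risk_rate=0.1,
                    scale_errors=True, initial_risk=1.0):
    """Rescorla-Wagner update with optional risk-scaled prediction errors.

    Assumptions (not taken from the article): prediction risk is a running
    average of squared prediction errors, and a scaled error is the raw
    error divided by the square root of that running average.
    """
    value = 0.0          # current reward prediction
    risk = initial_risk  # running estimate of expected squared error
    history = []
    for reward in rewards:
        error = reward - value                 # raw prediction error
        risk += risk_rate * (error ** 2 - risk)  # update risk estimate
        # With scaling, the error is expressed in units of its typical size,
        # so learning_rate is in reward units per unit of scaled error.
        signal = error / risk ** 0.5 if scale_errors else error
        value += learning_rate * signal
        history.append((value, risk))
    return value, risk, history


if __name__ == "__main__":
    # Unscaled rule on a constant reward: the prediction approaches 1.0.
    final_value, final_risk, _ = rescorla_wagner([1.0] * 200, scale_errors=False)
    print(final_value, final_risk)
```

With `scale_errors=False` the function reduces to the plain Rescorla-Wagner rule. With scaling switched on, the step size depends on the error relative to its recent typical magnitude rather than on its raw size.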