Accurately Explaining Exploratory Decisions in a Non-Stationary Bandit Problem: A Recovery Study of the Kalman Filter Model
Abstract: Daw, O’Doherty, Dayan, Seymour, and Dolan (2006) claim that a model combining a Kalman filter with a softmax choice rule can explain human decisions in a non-stationary four-armed bandit task. This paper evaluates whether the model's parameters can be recovered accurately while keeping the original conditions intact as far as possible. Three parameters could not be recovered, which indicates serious identification problems. We conclude that the model must be used with caution, and we include suggestions for improving recovery.
Included: Internship report and scripts (appendix).
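
For readers unfamiliar with the model class named in the abstract, the following is a minimal sketch of one trial of a Kalman filter learner paired with a softmax choice rule. It is not taken from the report's scripts. The function names, argument names (obs_var, decay, decay_center, diff_var, beta), and the update equations in this textbook form are assumptions made for illustration only.

```python
import numpy as np


def softmax_probs(means, beta):
    """Choice probabilities from estimated mean payoffs (softmax rule).

    beta is a hypothetical name for the inverse-temperature parameter.
    """
    means = np.asarray(means, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    z = beta * (means - means.max())
    weights = np.exp(z)
    return weights / weights.sum()


def kalman_update(means, variances, arm, reward,
                  obs_var, decay, decay_center, diff_var):
    """One-trial Kalman filter update for a bandit with drifting payoffs.

    Assumed form: only the chosen arm is corrected by the observed reward,
    then every arm's belief decays toward decay_center and gains diff_var
    of extra uncertainty to reflect the non-stationary payoffs.
    """
    means = np.array(means, dtype=float)          # copies, inputs untouched
    variances = np.array(variances, dtype=float)

    # Kalman gain: how strongly the prediction error moves the estimate.
    gain = variances[arm] / (variances[arm] + obs_var)
    means[arm] += gain * (reward - means[arm])
    variances[arm] *= 1.0 - gain

    # Between-trial drift applied to all arms.
    means = decay * means + (1.0 - decay) * decay_center
    variances = decay ** 2 * variances + diff_var
    return means, variances
```

In a recovery study of this kind, one would typically simulate choices with known parameter values, refit the model to the simulated choices, and compare the fitted values with the generating ones. The sketch covers only the per-trial model step, not the fitting procedure.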