Volatile environment.
To assess the performance of the t-RNN model in a volatile environment, we conducted an additional analysis similar to the one described in the main text (see Validating t-RNN using synthetic behavior), with agents simulated in a two-armed bandit task. However, instead of a fixed reward probability for each arm, the expected reward schedule was governed by a random walk with a drift rate of 0.025 and lower and upper bounds of 0.15 and 0.85, respectively. The findings align with those reported in the main text: t-RNN outperformed the alternative methods in both action prediction and parameter estimation. We therefore conclude that our conclusions regarding t-RNN performance generalize to two-armed bandit tasks in a volatile environment.
(PDF)
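The volatile reward schedule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the source specifies only the drift rate (0.025) and the bounds (0.15, 0.85), so the Gaussian step noise, the clipping at the bounds, the number of trials, and the function name `bounded_random_walk` are all assumptions made for this sketch.

```python
import numpy as np

def bounded_random_walk(n_trials, n_arms=2, drift=0.025,
                        lower=0.15, upper=0.85, seed=0):
    """Simulate per-arm expected reward values as a bounded random walk.

    Each arm starts at a value drawn uniformly within [lower, upper] and,
    on every trial, takes a Gaussian step (SD = drift, an assumption) that
    is clipped back into the bounds.
    """
    rng = np.random.default_rng(seed)
    values = rng.uniform(lower, upper, size=n_arms)
    schedule = np.empty((n_trials, n_arms))
    for t in range(n_trials):
        # Gaussian drift step, clipped to stay within [lower, upper]
        values = np.clip(values + rng.normal(0.0, drift, size=n_arms),
                         lower, upper)
        schedule[t] = values
    return schedule

# Example: a 200-trial schedule for a two-armed bandit
schedule = bounded_random_walk(n_trials=200)
```

An alternative design choice would be to reflect steps off the bounds rather than clip them; both keep the schedule within the stated range, and the source does not specify which was used.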