Losses resulting from deliberate exploration trigger beta oscillations in frontal cortex
We examined neural signatures of directed exploration by contrasting magnetoencephalographic beta power changes (16-30 Hz) between disadvantageous and advantageous choices in a two-choice probabilistic reward task. Recordings were made after participants had learned the task contingencies, that is, after they had acquired a value-based inner model. Rare disadvantageous choices made by participants therefore represent deliberate exploratory quests for information about the less familiar option. The study yielded two main findings. First, decisions leading to disadvantageous choices took more time and involved greater suppression of beta-band oscillations than decisions leading to advantageous choices. This increased recruitment of neural resources strongly suggests that disadvantageous decisions are deliberately exploratory and conflict with the acquired utility model. Second, the outcomes of disadvantageous and advantageous choices had a qualitatively different impact on feedback-related beta oscillations: following disadvantageous choices, only losses (but not gains) induced late beta synchronization in frontal cortex. Our results are consistent with a role of frontal beta oscillations in stabilizing the neural representation of the selected behavioral rule when an exploratory strategy conflicts with value-based behavior. Punishment for an exploratory choice, being congruent with that choice's low value in the reward history, is more likely to be strengthened through punishment-related beta oscillations than the representation of its competitor, the inner utility model.
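As a rough illustration of the central measure, the contrast of beta power changes between choice types could be sketched as follows. This is a minimal sketch on synthetic data, not the analysis pipeline used in the study: the FFT-based power estimate, the decibel baseline normalization, the simulated 20 Hz rhythm, and all function names (`band_power`, `power_change_db`, `simulate_epochs`) are assumptions made for illustration only. Only the 16-30 Hz band is taken from the text.

```python
import numpy as np

BETA_BAND = (16.0, 30.0)  # Hz, the band named in the abstract


def band_power(epochs, sfreq, band=BETA_BAND):
    """Mean spectral power within `band` for each trial (rows = trials)."""
    epochs = np.asarray(epochs, dtype=float)
    freqs = np.fft.rfftfreq(epochs.shape[-1], d=1.0 / sfreq)
    power = np.abs(np.fft.rfft(epochs, axis=-1)) ** 2
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return power[..., mask].mean(axis=-1)


def power_change_db(task_epochs, baseline_epochs, sfreq):
    """Trial-averaged band power change relative to baseline, in dB.

    Negative values correspond to suppression (desynchronization),
    positive values to synchronization.
    """
    task = band_power(task_epochs, sfreq).mean()
    base = band_power(baseline_epochs, sfreq).mean()
    return 10.0 * np.log10(task / base)


def simulate_epochs(amplitude, n_trials, n_samples, sfreq, rng):
    """Hypothetical data: a 20 Hz rhythm with random phase plus white noise."""
    t = np.arange(n_samples) / sfreq
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_trials, 1))
    rhythm = amplitude * np.sin(2.0 * np.pi * 20.0 * t + phases)
    return rhythm + 0.1 * rng.standard_normal((n_trials, n_samples))


rng = np.random.default_rng(0)
sfreq, n_samples, n_trials = 500.0, 500, 40  # assumed: 1 s epochs at 500 Hz

# Assumed amplitudes: beta is weaker during decisions than at baseline,
# and weakest before disadvantageous (exploratory) choices.
baseline = simulate_epochs(1.0, n_trials, n_samples, sfreq, rng)
advantageous = simulate_epochs(0.8, n_trials, n_samples, sfreq, rng)
disadvantageous = simulate_epochs(0.5, n_trials, n_samples, sfreq, rng)

adv_db = power_change_db(advantageous, baseline, sfreq)
dis_db = power_change_db(disadvantageous, baseline, sfreq)
contrast_db = dis_db - adv_db  # negative: stronger suppression when exploring

print(f"advantageous: {adv_db:.2f} dB")
print(f"disadvantageous: {dis_db:.2f} dB")
print(f"contrast: {contrast_db:.2f} dB")
```

With these assumed amplitudes, both conditions show a negative power change, and the change is more negative for the simulated disadvantageous trials. This mirrors the direction of the first finding, but only because the synthetic data were built that way.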