Dominic Rigby

Spurious Rewards: Rethinking Training Signals in RLVR

Date: 4th June 2025

arXiv link

Key Points:

Key Methods: