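A reinforcement learning setting in which the agent receives a nonzero reward signal only rarely (for example, only upon reaching a goal state), so most transitions carry no feedback and exploration becomes particularly challenging.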
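
To make the definition concrete, here is a minimal sketch of such a setting. The `SparseChain` environment, its `length` and `horizon` parameters, and the `random_policy_success_rate` helper are all hypothetical names invented for this illustration; they do not come from any particular library. The agent walks along a one-dimensional chain and is rewarded only when it reaches the far end, so an undirected (random) policy rarely sees any reward at all.

```python
import random


class SparseChain:
    """Hypothetical 1-D chain: reward is 1.0 only on reaching the last state."""

    def __init__(self, length=20, horizon=50):
        self.length = length      # number of states on the chain
        self.horizon = horizon    # maximum steps per episode
        self.reset()

    def reset(self):
        self.pos = 0
        self.t = 0
        return self.pos

    def step(self, action):
        # action is +1 (move right) or -1 (move left); position is clipped at 0.
        self.pos = max(0, self.pos + action)
        self.t += 1
        reached_goal = self.pos >= self.length - 1
        reward = 1.0 if reached_goal else 0.0   # every other transition gives 0
        done = reached_goal or self.t >= self.horizon
        return self.pos, reward, done


def random_policy_success_rate(env, episodes=1000, seed=0):
    """Fraction of episodes in which a uniformly random policy gets any reward."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(episodes):
        env.reset()
        done = False
        reward = 0.0
        while not done:
            _, reward, done = env.step(rng.choice((-1, 1)))
        successes += reward > 0   # the only nonzero reward arrives at the goal
    return successes / episodes


if __name__ == "__main__":
    rate = random_policy_success_rate(SparseChain())
    print(f"random-policy success rate: {rate:.3f}")
```

Under these assumptions, most random episodes end at the horizon with a return of zero, which is the difficulty the definition refers to: without some directed exploration strategy, the agent seldom observes the signal it is supposed to learn from.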