By teaching agents to learn from environmental feedback and explore alternative paths when they fail, LEAFE improves their problem-solving capacity across multiple attempts (Pass@k) more than methods that optimize only for a single successful outcome.
This paper introduces LEAFE, a training method that helps AI agents learn from their mistakes during long interactions with environments. Rather than optimizing only for final success, LEAFE teaches agents to reflect on feedback, backtrack to earlier decisions, try alternative approaches, and internalize these recovery strategies.
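
To make the Pass@k metric mentioned above concrete, here is a minimal sketch of how it is commonly estimated. The summary does not say how the paper computes Pass@k, so this assumes the standard unbiased estimator used in code-generation evaluation: given n sampled attempts on a task, of which c succeed, Pass@k is the probability that at least one of k attempts drawn without replacement is a success. The function name `pass_at_k` and its parameters are illustrative and not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate Pass@k from n sampled attempts, c of which succeeded.

    Assumption: this is the common unbiased estimator
    1 - C(n - c, k) / C(n, k); the paper may compute Pass@k differently.
    """
    if n < 1 or not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require n >= 1, 0 <= c <= n, and 1 <= k <= n")
    # comb(n - c, k) is 0 when k > n - c, i.e. every size-k subset of the
    # attempts must contain at least one success, so the estimate is 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical numbers: 10 attempts on one task, 3 of them successful.
    for k in (1, 5, 10):
        print(f"pass@{k} = {pass_at_k(10, 3, k):.4f}")
```

Under this estimator, an agent that rarely succeeds on the first try but reliably recovers on later attempts scores low at k = 1 and high at larger k, which is the kind of multi-attempt capacity the summary says LEAFE improves.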