Embodied agents can continuously improve without retraining by organizing their experiences, diagnosing failures in detail, and using those insights to constrain and guide planning at test time.
Steve-Evolving is a framework that helps AI agents learn from their experiences in open-world environments such as Minecraft. Instead of updating model weights, it organizes what the agent learns into structured experiences, diagnoses in detail why actions succeed or fail, and uses those insights to guide future planning through retrieved skills and safety guardrails.
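
To make the loop described above concrete, here is a minimal sketch in Python of how structured experiences might be stored and turned into planning context. It is an illustration under stated assumptions, not the framework's actual implementation: the names `Experience`, `ExperienceStore`, and `build_planner_context`, the record fields, and the word-overlap retrieval are all hypothetical, and the sample records are made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Experience:
    """One structured record of an attempted action (hypothetical schema)."""
    task: str
    action: str
    success: bool
    diagnosis: str  # short explanation of why the action succeeded or failed


@dataclass
class ExperienceStore:
    """In-memory store of experiences; no model weights are touched."""
    records: list[Experience] = field(default_factory=list)

    def add(self, exp: Experience) -> None:
        self.records.append(exp)

    def retrieve(self, task: str) -> tuple[list[str], list[str]]:
        """Return (skills, guardrails) relevant to a task.

        Relevance here is naive word overlap, a stand-in (assumption) for
        whatever retrieval method a real system would use.
        """
        words = set(task.lower().split())
        skills: list[str] = []
        guardrails: list[str] = []
        for exp in self.records:
            if words & set(exp.task.lower().split()):
                if exp.success:
                    # Successful experiences become reusable skills.
                    skills.append(f"{exp.action} ({exp.diagnosis})")
                else:
                    # Diagnosed failures become constraints on the planner.
                    guardrails.append(f"avoid {exp.action}: {exp.diagnosis}")
        return skills, guardrails


def build_planner_context(store: ExperienceStore, task: str) -> str:
    """Assemble the text a planner would receive at test time."""
    skills, guardrails = store.retrieve(task)
    lines = [f"Task: {task}"]
    lines += [f"Skill: {s}" for s in skills]
    lines += [f"Guardrail: {g}" for g in guardrails]
    return "\n".join(lines)


# Illustrative records only; not data from the framework.
store = ExperienceStore()
store.add(Experience("mine iron ore", "use stone pickaxe", True,
                     "tool tier sufficient"))
store.add(Experience("mine iron ore", "use wooden pickaxe", False,
                     "tool tier too low"))
print(build_planner_context(store, "mine iron ore"))
```

The point of the sketch is the division of labor: improvement comes from appending diagnosed records to the store and from what `retrieve` returns at planning time, while the planner itself stays unchanged.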