Language models can improve themselves in production by learning from actual user interactions: they extract knowledge from deployment experience and feed it back into training, without requiring access to the original environment.
This paper introduces Online Experiential Learning (OEL), a system that lets language models continuously improve by learning from real interactions during deployment. Instead of relying only on offline training data, OEL extracts useful knowledge from user interactions, then updates the model with this knowledge without needing access to the original environment.
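The two stages described above (extract knowledge from logged interactions, then update the model from that knowledge alone) can be pictured with a toy loop. This is only an illustrative sketch, not the paper's implementation: the summary does not say how extraction or updating is performed, so `Interaction`, `ToyModel`, `extract_knowledge`, and `oel_round` are hypothetical names, and the "update" here merely stores strings in place of a real training step.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interaction:
    """One logged deployment exchange (hypothetical record shape)."""
    prompt: str
    response: str
    feedback: str  # assumed signal, e.g. a user correction; may be empty


@dataclass
class ToyModel:
    """Stand-in for a language model; keeps absorbed knowledge as strings."""
    knowledge: List[str] = field(default_factory=list)

    def update(self, notes: List[str]) -> None:
        # Placeholder for a real training step: store each new note once.
        for note in notes:
            if note not in self.knowledge:
                self.knowledge.append(note)


def extract_knowledge(logs: List[Interaction]) -> List[str]:
    """Hypothetical extractor: keep only interactions that carry feedback."""
    return [f"{log.prompt} -> {log.feedback}" for log in logs if log.feedback.strip()]


def oel_round(model: ToyModel, logs: List[Interaction]) -> ToyModel:
    """One round: extract notes from logs, then update using the notes only.

    The update step never touches the environment that produced the logs,
    mirroring the "no access to the original environment" property.
    """
    notes = extract_knowledge(logs)
    model.update(notes)
    return model


if __name__ == "__main__":
    logs = [
        Interaction("unit of torque?", "joules", "newton-metres"),
        Interaction("hi", "hello", ""),
    ]
    model = oel_round(ToyModel(), logs)
    print(model.knowledge)
```

The point of the sketch is the separation of concerns: only `extract_knowledge` reads the interaction logs, and `update` sees nothing but the extracted notes, so the training side needs no live connection to the deployment setting.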