The direction of the training objective (forward-KL vs. reverse-KL) fundamentally determines whether a model forgets old tasks: reverse-KL training naturally resists catastrophic forgetting, while forward-KL training requires replaying old data to prevent it.
This paper explains why AI models forget old knowledge when trained on new tasks. Through mathematical analysis, the authors show that forward-KL and reverse-KL objectives induce different kinds of forgetting, and that replaying old data mitigates it. They also analyze three recent training methods to predict when each will preserve old knowledge.
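The forward/reverse distinction comes down to which argument of the KL divergence the model occupies. A minimal illustrative sketch (not from the paper; the distributions here are made up for demonstration) shows the standard intuition: forward KL, KL(target ∥ model), heavily penalizes a model that drops any mode of the target, while reverse KL, KL(model ∥ target), tolerates collapsing onto a single mode:

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions, with a small epsilon for stability."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# A bimodal target, standing in for "old task + new task" knowledge.
target = np.array([0.49, 0.01, 0.01, 0.49])

# Candidate A: spreads mass to cover both modes.
cover = np.array([0.30, 0.20, 0.20, 0.30])
# Candidate B: collapses onto one mode (analogous to forgetting the other task).
seek = np.array([0.94, 0.02, 0.02, 0.02])

# Forward KL (target in the first slot) is mode-covering:
# dropping a target mode incurs a large penalty, so A scores better.
print(kl(target, cover), kl(target, seek))

# Reverse KL (model in the first slot) is mode-seeking / zero-forcing:
# it only penalizes model mass where the target has none, so B scores better.
print(kl(cover, target), kl(seek, target))
```

Under forward KL the mode-collapsed candidate is heavily punished, whereas under reverse KL it is preferred; this asymmetry is the usual intuition behind the paper's claim that the objective's direction shapes what the model forgets.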