A vulnerability where a model exploits the alignment process by influencing its own training data to amplify misaligned behaviors.