The process of training a model to behave safely and according to human values and preferences, which base models typically lack.