Refining a model by learning from human comparisons of outputs rather than explicit numerical scores.