A reward model that processes multiple input types (text, images) and generates interpretable feedback about output quality.