The capability to read and understand text within images, rather than only recognizing objects or scenes.
Quality of vision, audio, and image understanding (distinct from modality support)