Multimodal Comprehension

behavior

The ability of an AI model to understand and reason over several types of input data (such as images and text) at the same time.

Related Capabilities

Quality of vision, audio, and image understanding (distinct from modality support)

Multi-step reasoning, logic puzzles, mathematical problem-solving