Multimodal Dialogue

behavior

A conversational interaction in which the model understands and responds to inputs that combine text and images over a natural back-and-forth exchange.
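As a rough illustration of the definition above, the following sketch models such an exchange as a list of turns, each holding text and image parts. Every name here (`TextPart`, `ImagePart`, `Turn`, `is_multimodal_dialogue`) is hypothetical, and the check itself is only one possible reading of the definition: it assumes a dialogue qualifies when at least one user turn mixes text with an image and the model has replied at least once.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TextPart:
    text: str


@dataclass
class ImagePart:
    # Hypothetical image reference; a real system might carry bytes or a URL.
    image_ref: str


Part = Union[TextPart, ImagePart]


@dataclass
class Turn:
    role: str  # assumed to be either "user" or "model"
    parts: List[Part] = field(default_factory=list)


def is_multimodal_dialogue(turns: List[Turn]) -> bool:
    """Assumed criterion: some user turn combines text and an image,
    and the model has responded at least once."""
    has_model_reply = any(t.role == "model" for t in turns)
    has_mixed_user_input = any(
        t.role == "user"
        and any(isinstance(p, TextPart) for p in t.parts)
        and any(isinstance(p, ImagePart) for p in t.parts)
        for t in turns
    )
    return has_model_reply and has_mixed_user_input


# A two-turn exchange: the user sends text plus an image, the model answers.
example = [
    Turn("user", [TextPart("What is in this photo?"), ImagePart("photo_1")]),
    Turn("model", [TextPart("It shows a bicycle leaning against a wall.")]),
]
```

Under this sketch, a conversation made up only of text turns would not count, which matches the definition's emphasis on combined text and image input.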

Related Capabilities

Instruction Following

Adhering to complex, structured, or constrained instructions

1379

Multimodal

Quality of vision, audio, and image understanding (distinct from modality support)

439