A computer vision task that identifies and locates specific objects within an image by drawing bounding boxes around them.
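As a rough illustration of what "identify and locate" means in practice, a detector's output for one image can be thought of as a list of labeled boxes. The sketch below is an assumption for illustration only: the `Detection` class, the `keep_confident` helper, the corner-based pixel coordinates, and the sample values are all hypothetical and not taken from any particular library.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """One detected object: what it is (label) and where it is (box)."""
    label: str
    score: float  # detector confidence, assumed to lie in [0, 1]
    x_min: float  # box corners in pixel coordinates (assumed convention)
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        """Area of the box in square pixels; zero if the box is degenerate."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep only detections whose score meets the threshold."""
    return [d for d in detections if d.score >= threshold]


# Hypothetical detector output for a single image.
detections = [
    Detection("dog", 0.92, 40, 60, 200, 220),
    Detection("cat", 0.31, 210, 80, 300, 190),
]
confident = keep_confident(detections)
```

Each entry answers both halves of the task: the label says which object was found, and the four coordinates say where it sits in the image.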
Quality of vision, audio, and image understanding (distinct from modality support)