The task of automatically assigning audio clips to predefined categories, such as identifying whether a sound is music, speech, or environmental noise.
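As a rough illustration of the task, the sketch below assigns synthetic clips to one of two predefined categories. Everything in it is an assumption made for the example: the labels `tone` and `noise` stand in for real categories such as music or speech, the two features (zero-crossing rate and RMS energy) and the nearest-centroid rule are one simple choice among many, and all function names are hypothetical. Practical systems typically use richer features and learned models.

```python
import math
import random


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def rms(samples):
    """Root-mean-square amplitude of the clip."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def features(samples):
    """Two-dimensional feature vector describing one clip."""
    return (zero_crossing_rate(samples), rms(samples))


def make_tone(rng, n=2000, rate=8000):
    """Synthetic stand-in for a tonal clip: a sine wave at a random pitch."""
    freq = rng.uniform(200.0, 600.0)
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


def make_noise(rng, n=2000):
    """Synthetic stand-in for a noisy clip: uniform white noise."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def train_centroids(labelled_clips):
    """Average the feature vectors of each category's training clips."""
    sums = {}
    for label, clip in labelled_clips:
        zcr, energy = features(clip)
        (sum_zcr, sum_energy), count = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = ((sum_zcr + zcr, sum_energy + energy), count + 1)
    return {
        label: (total[0] / count, total[1] / count)
        for label, (total, count) in sums.items()
    }


def classify(clip, centroids):
    """Return the category whose centroid is closest to the clip's features."""
    f = features(clip)
    return min(centroids, key=lambda label: math.dist(f, centroids[label]))


rng = random.Random(0)  # fixed seed so the example is repeatable
training = [("tone", make_tone(rng)) for _ in range(5)]
training += [("noise", make_noise(rng)) for _ in range(5)]
centroids = train_centroids(training)

print(classify(make_tone(rng), centroids))
print(classify(make_noise(rng), centroids))
```

The two synthetic categories separate mainly on zero-crossing rate: white noise changes sign on roughly half of its sample pairs, while a low-frequency sine changes sign far less often, so even this very small classifier can tell them apart.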