A technique in which only a subset of a model's parameters is used for each input, reducing the computational cost per input while maintaining performance.
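As a rough illustration, the sketch below shows one way such per-input parameter selection could work: a small gate scores several parameter blocks ("experts") and only the top-k are evaluated. The definition above does not name a specific mechanism, so top-k gating is an assumption here, and every name (`sparse_forward`, `gate`, `experts`, `DIM`, `NUM_EXPERTS`, `K`) is hypothetical.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a short list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sparse_forward(x, gate, experts, k=2):
    # Score every expert, but evaluate only the k highest-scoring ones.
    scores = matvec(gate, x)  # one score per expert
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        y = matvec(experts[i], x)  # parameters of unselected experts are never touched
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top


# Hypothetical sizes: 8 experts of 4x4 weights each, 2 active per input.
rng = random.Random(0)
DIM, NUM_EXPERTS, K = 4, 8, 2
experts = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(NUM_EXPERTS)
]
gate = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

x = [0.5, -1.0, 0.25, 2.0]
output, used = sparse_forward(x, gate, experts, k=K)
active_fraction = K / NUM_EXPERTS  # share of expert parameters used for this input
```

With these assumed sizes, each input touches 2 of the 8 expert blocks, so the expert computation scales with `K` rather than with `NUM_EXPERTS`; the gate itself is still evaluated in full.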