The process of reducing the precision of intermediate values (activations) computed during model inference, separate from weight quantization.