Using different numerical precisions (e.g., 8-bit, 4-bit) for different parts of a model to reduce memory and computation.