A quantization level in which model weights are stored using only 4 bits per value (16 possible levels), shrinking storage to roughly one-eighth that of 32-bit floating point at the cost of some accuracy.
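A minimal sketch of the idea, using symmetric absmax quantization (one common scheme, shown here as an illustration rather than the method any particular library uses): each float weight is mapped to a signed integer in the 4-bit range [-8, 7] with a single per-tensor scale, and dequantized by multiplying back.

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Symmetric (absmax) 4-bit quantization: map floats to ints in [-8, 7]."""
    scale = np.abs(weights).max() / 7.0  # 7 = largest magnitude representable in signed 4-bit
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the 4-bit codes."""
    return q.astype(np.float32) * scale

w = np.array([0.42, -1.30, 0.07, 0.91], dtype=np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale)
# Rounding error per weight is at most half a quantization step (scale / 2);
# this rounding error is the "cost of some accuracy" in the definition above.
```

In practice two 4-bit codes are packed into each byte to realize the size reduction; the `int8` array here is just for clarity.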