A quantization method that represents model weights using only 4 bits per weight instead of the standard 32-bit floating-point format, reducing weight memory by roughly 8x at a small cost in precision.
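A minimal sketch of one common 4-bit scheme, symmetric absmax quantization; the function names and the NumPy implementation here are illustrative, not a specific library's API:

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Symmetric absmax quantization to signed 4-bit integers in [-8, 7]."""
    # Map the largest-magnitude weight to the top of the 4-bit range.
    scale = np.abs(weights).max() / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from 4-bit codes."""
    return q.astype(np.float32) * scale

w = np.array([0.42, -1.30, 0.07, 0.95], dtype=np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
```

In practice two 4-bit codes are packed into each byte, and real systems quantize per block or per channel (a separate scale for each group of weights) rather than one scale for the whole tensor, which keeps the rounding error small.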