A quantization method that represents model weights as 8-bit integers instead of the standard 32-bit floating-point values, reducing weight memory usage by approximately 75% (each value occupies one byte instead of four) while typically preserving most of the model's accuracy.
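A minimal sketch of the idea using symmetric (absmax) quantization in NumPy; the function names are illustrative, and real libraries add refinements such as per-channel scales and outlier handling:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric (absmax) quantization: map float32 weights onto the
    integer range [-127, 127] using a single scale factor."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# int8 storage is one quarter the size of float32: ~75% memory saved
assert q.nbytes * 4 == w.nbytes
```

Dequantization is lossy: each weight is recovered only to within half the scale factor, which is why accuracy drops slightly rather than staying identical.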