Weight and Activation Quantization (W8A8) — Glossary — ThinkLLM

A quantization method that compresses both a model's stored weights and its intermediate activations to 8-bit precision, significantly reducing memory and compute requirements.
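A minimal sketch of the idea in NumPy, assuming symmetric per-tensor quantization (one scale per tensor, zero-point fixed at 0); the `quantize_sym` helper, shapes, and random data are illustrative, not from the glossary:

```python
import numpy as np

def quantize_sym(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization of float32 values to int8."""
    scale = float(np.max(np.abs(x))) / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

# Toy weights and activations standing in for one linear layer.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)).astype(np.float32)   # weights
A = rng.normal(size=(8, 3)).astype(np.float32)   # activations

W_q, w_scale = quantize_sym(W)   # W8: 8-bit weights
A_q, a_scale = quantize_sym(A)   # A8: 8-bit activations

# The matmul runs on int8 inputs with int32 accumulation;
# a single float multiply rescales the result back to real units.
out_int32 = W_q.astype(np.int32) @ A_q.astype(np.int32)
out = out_int32.astype(np.float32) * (w_scale * a_scale)

ref = W @ A  # float32 reference
print(np.max(np.abs(out - ref)))  # small quantization error
```

Because both operands are 8-bit, the dominant matmul can use integer arithmetic, which is where the memory and compute savings come from; real deployments typically use per-channel weight scales and calibrated or dynamic activation scales rather than this per-tensor toy.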