Reducing a model's numerical precision (e.g., from 16-bit to 4-bit) to shrink memory usage and speed up inference.
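To make the idea concrete, here is a minimal sketch of symmetric per-tensor quantization to a signed 4-bit range, using NumPy. This is an illustration of the general technique, not any particular library's implementation; the function names and the choice of the [-8, 7] integer range are assumptions for the example.

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Symmetric per-tensor quantization to the signed 4-bit range [-8, 7]."""
    scale = np.max(np.abs(weights)) / 7.0  # map the largest magnitude to 7
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the quantized integers."""
    return q.astype(np.float32) * scale

w = np.array([0.12, -0.5, 0.33, 0.9], dtype=np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
# rounding error per value is bounded by half the quantization step (scale / 2)
```

Each weight now needs only 4 bits plus a shared scale factor instead of 16 bits, at the cost of a small, bounded reconstruction error; real systems typically quantize per-channel or per-group rather than per-tensor to keep that error low.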