A heavily compressed variant of the BERT language model with far fewer parameters, designed for fast, low-memory inference on resource-constrained devices.