A transformer-based neural network architecture optimized for understanding language through masked language prediction during training.
World knowledge accuracy, recall of facts and relationships
Adhering to complex, structured, or constrained instructions