A training technique in which randomly selected words (tokens) in a text are hidden, or masked, and the model learns to predict them from the surrounding context.
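The masking step can be sketched in a few lines of Python. This is a minimal illustration, not any particular model's recipe: the `[MASK]` placeholder, the `mask_tokens` helper, the whitespace tokenization, and the default 15% masking rate are all assumptions chosen for the example.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder symbol for a hidden word


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens.

    Returns (masked, targets), where `targets` maps each hidden
    position to the original word the model would have to predict.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    masked = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            targets[i] = tok
    return masked, targets


# Assumed toy input: a sentence split on whitespace.
sentence = "the cat sat on the mat".split()
masked, targets = mask_tokens(sentence, mask_prob=0.3, seed=1)
print(masked)   # the sentence with some words replaced by [MASK]
print(targets)  # position -> hidden word (the prediction targets)
```

During training, the model would receive `masked` as input and be scored only on how well it recovers the words stored in `targets`; the unmasked words serve as context.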
World knowledge accuracy, recall of facts and relationships