A neural network architecture that processes text by weighing the relationships between all words in a sequence at once (self-attention) rather than reading them one at a time, and that forms the foundation of modern large language models.
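
The idea of relating every word to every other word at once can be illustrated with a minimal sketch of scaled dot-product self-attention, the mechanism usually used for this. The sketch rests on several assumptions that are not part of the definition above: the three-word input and its 2-dimensional embeddings are made up, the function names `softmax` and `self_attention` are hypothetical, and the queries, keys, and values are taken to be the raw input vectors, whereas a trained model would first apply learned projection matrices.

```python
import math

def softmax(row):
    """Turn a list of scores into weights that sum to 1."""
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys, and values are the input vectors themselves;
    a real model would apply learned weight matrices first.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this word against every word in the sequence, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # Blend all word vectors according to the attention weights.
        mixed = [sum(w * vec[i] for w, vec in zip(weights, vectors))
                 for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional embeddings for a three-word input.
words = ["the", "cat", "sat"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings)
for word, row in zip(words, weights):
    print(word, [round(w, 3) for w in row])
```

Each printed row shows how strongly one word attends to every word in the input. The explicit loop is only for readability; practical implementations typically compute all rows together as a single matrix operation, which is what allows the words to be processed simultaneously.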