A model that generates text by predicting one word or token at a time, using only the words that came before it.