A model architecture that maps a sequence of input tokens to a sequence of output tokens, possibly of a different length; commonly used for tasks such as machine translation and summarization.
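
As a rough illustration of the input-sequence-to-output-sequence interface, the sketch below encodes the input once and then emits output tokens one at a time until an end marker appears. It is a toy under stated assumptions, not a trained model: the names `encode`, `decode_step`, `seq2seq`, and the `<eos>` marker are hypothetical, and the "decoder" is a hand-written rule that reverses the input where a real system would use learned networks.

```python
from typing import List, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence marker (assumption)


def encode(tokens: List[str]) -> Tuple[str, ...]:
    """Toy encoder: a real model would map tokens to learned vectors."""
    return tuple(tokens)


def decode_step(context: Tuple[str, ...], generated: List[str]) -> str:
    """Toy decoder step: emit the input tokens in reverse order, then EOS.

    A real decoder would predict the next token from the encoded context
    and the tokens generated so far.
    """
    position = len(generated)
    if position >= len(context):
        return EOS
    return context[len(context) - 1 - position]


def seq2seq(tokens: List[str], max_len: int = 50) -> List[str]:
    """Encode the input once, then generate output tokens one at a time."""
    context = encode(tokens)
    output: List[str] = []
    while len(output) < max_len:
        next_token = decode_step(context, output)
        if next_token == EOS:
            break  # the decoder decides when the output sequence ends
        output.append(next_token)
    return output


if __name__ == "__main__":
    print(seq2seq(["a", "b", "c"]))
```

The `max_len` cap is a common safeguard in this kind of loop, since generation otherwise depends on the decoder eventually producing the end marker.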