Training a model to solve a complete task directly from raw input (like document images) to final output, without breaking it into separate intermediate steps.