When you need diverse answers to open-ended questions, routing each query to the best model beats using any single model, and a lightweight router can be trained to make that selection automatically.
This paper shows that different language models excel at generating diverse answers to different open-ended questions: no single model is best across all prompts. The authors therefore train a router, a small model that predicts which LLM is most likely to produce the best answers for a given question, and use it to select a model dynamically per prompt.
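As a rough illustration of the routing idea, the sketch below trains a toy router from labeled examples of which model performed best on past prompts, then picks a model for a new prompt by feature overlap. Everything here (the model names, the training pairs, bag-of-words features, and the nearest-centroid scoring) is a hypothetical stand-in, not the paper's actual router architecture or training setup.

```python
from collections import Counter

# Hypothetical model pool; the paper routes among real LLMs.
MODELS = ["model_a", "model_b"]

def featurize(prompt: str) -> Counter:
    # Toy bag-of-words features; a real router would use learned embeddings.
    return Counter(prompt.lower().split())

# Hypothetical training data: (prompt, model that gave the best answers).
TRAIN = [
    ("brainstorm creative names for a coffee shop", "model_a"),
    ("suggest unusual plot twists for a mystery novel", "model_a"),
    ("list possible causes of a python segfault", "model_b"),
    ("debug this failing unit test", "model_b"),
]

# "Training": accumulate one bag-of-words centroid per model.
centroids = {m: Counter() for m in MODELS}
for prompt, model in TRAIN:
    centroids[model].update(featurize(prompt))

def route(prompt: str) -> str:
    """Pick the model whose training centroid overlaps the prompt most."""
    feats = featurize(prompt)
    def overlap(m: str) -> int:
        return sum(min(feats[w], centroids[m][w]) for w in feats)
    return max(MODELS, key=overlap)
```

With this toy data, `route("brainstorm names for a new novel")` selects `model_a` and `route("debug a failing python test")` selects `model_b`; the point is only that routing is an ordinary supervised prediction problem over prompts.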