Combining conversation with visual content (multimodality) improves learning in STEM, but conversation alone can create a false sense of understanding without actual learning gains.
This study compares three ways to learn biology: a conversational AI with images and text, one with text only, and a traditional search interface. Students using the multimodal conversational system learned best and felt most satisfied, while text-only conversation felt easier but didn't improve learning—showing that engagement doesn't always mean better outcomes.