Concept Bottleneck Models can now work reliably across text and images by jointly addressing concept detection and information leakage, enabling interpretable AI without sacrificing accuracy.
This paper introduces f-CBM, a framework for building interpretable multimodal AI models that make predictions through human-understandable concepts. The key innovation is solving two problems simultaneously: accurately detecting concepts, and preventing "leakage", where information beyond the stated concepts slips through the bottleneck into the final prediction, undermining the model's interpretability.
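To make the bottleneck idea concrete, here is a minimal sketch of a *generic* concept bottleneck model, not the paper's f-CBM: an input is first mapped to a small set of concept scores, and the label head is allowed to see only those scores. All names, dimensions, and the linear layers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConceptBottleneck:
    """Generic CBM sketch (not f-CBM): x -> concept scores -> label logits.

    Because the label head receives ONLY the k concept activations,
    every prediction can be traced back to those concepts. Leakage is
    the failure mode where the concept scores encode extra information
    the label head exploits beyond the concepts' stated meanings.
    """

    def __init__(self, d_in, k_concepts, n_classes):
        # Illustrative linear layers with small random weights.
        self.W_c = rng.normal(0.0, 0.1, (d_in, k_concepts))      # concept detector
        self.W_y = rng.normal(0.0, 0.1, (k_concepts, n_classes)) # label head

    def forward(self, x):
        c = sigmoid(x @ self.W_c)  # concept scores in [0, 1], one per concept
        logits = c @ self.W_y      # prediction built from concepts only
        return c, logits

model = ConceptBottleneck(d_in=16, k_concepts=4, n_classes=3)
x = rng.normal(size=(2, 16))
concepts, logits = model.forward(x)
print(concepts.shape, logits.shape)  # (2, 4) (2, 3)
```

The bottleneck is structural: deleting or intervening on a concept score directly changes the prediction, which is what makes the model's reasoning inspectable.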