Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models

Arnas Uselis, Andrea Dittadi, Seong Joon Oh | February 27, 2026 | arXiv

Key Takeaway

For AI models to recognize new combinations of familiar concepts, their internal representations must be mathematically linear and orthogonal—a s...

Summary

This paper explains why neural networks need to organize information in a specific geometric way to recognize familiar concepts in new combinations. The researchers prove that for a model to generalize to unseen combinations of concepts, its internal representations must decompose into separate, perpendicular components for each concept.
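To make the geometry concrete, here is a minimal toy sketch in Python. It is not the paper's proof or experimental setup. The concept names, the 4-dimensional vectors, and the helpers `embed` and `decode` are all hypothetical and chosen only for illustration. Each embedding is the sum of one vector per concept (linearity), and the shape vectors are perpendicular to the color vectors (orthogonality), so a per-concept linear readout still works on a combination it was never defined for.

```python
# Hypothetical concept vectors: shape and color occupy orthogonal subspaces.
SHAPE = {"circle": [1.0, 0.0, 0.0, 0.0], "square": [0.0, 1.0, 0.0, 0.0]}
COLOR = {"red": [0.0, 0.0, 1.0, 0.0], "blue": [0.0, 0.0, 0.0, 1.0]}


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def embed(shape, color):
    """Linear (additive) embedding: the sum of one vector per concept value."""
    return [s + c for s, c in zip(SHAPE[shape], COLOR[color])]


def decode(embedding, codebook):
    """Linear readout: pick the concept value with the largest dot product."""
    return max(codebook, key=lambda name: dot(embedding, codebook[name]))


# Every shape vector is perpendicular to every color vector,
# so the color component contributes nothing to the shape readout.
assert all(dot(s, c) == 0.0 for s in SHAPE.values() for c in COLOR.values())

# The readouts are defined per concept, not per combination, so any
# pairing (here "square" + "blue") decodes the same way.
unseen = embed("square", "blue")
print(decode(unseen, SHAPE), decode(unseen, COLOR))  # prints: square blue
```

If the shape and color vectors overlapped instead, the color component would leak into the shape scores, and the readout could fail on pairings it had not been tuned for.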

architecture reasoning evaluation

Key Terms

compositional-generalization linear-representation-hypothesis orthogonal-representations embedding-geometry