LLMs generate more structurally consistent analogies than humans, preserving relational patterns in embedding space more faithfully. This suggests the parallelogram model itself is sound, and that humans are simply inconsistent analogy-makers.
This paper compares how humans and LLMs generate word analogies (A:B::C:D problems). While previous research suggested that the geometric "parallelogram" model poorly explains human-generated analogies, this work shows that LLM-generated analogies align more closely with the parallelogram structure than human ones do.
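The parallelogram model mentioned above can be sketched concretely: the answer D to A:B::C:D is taken to be the word whose embedding is nearest to C + (B − A), the fourth corner of a parallelogram. The toy 2-D vectors and helper function below are illustrative assumptions, not the paper's actual embeddings or method:

```python
import numpy as np

# Hand-picked toy 2-D "embeddings" chosen so the relation woman - man
# equals queen - king; real systems use learned vectors (word2vec, GloVe).
emb = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([3.0, 0.0]),
    "queen": np.array([3.0, 1.0]),
    "apple": np.array([0.0, 3.0]),  # distractor word
}

def solve_analogy(a, b, c, vocab):
    """Complete A:B::C:D under the parallelogram model: D ≈ C + (B - A)."""
    target = vocab[c] + (vocab[b] - vocab[a])

    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    # Pick the nearest remaining word by cosine similarity,
    # excluding the three cue words themselves.
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(solve_analogy("man", "woman", "king", emb))  # → queen
```

A "structurally consistent" analogy in this sense is one where B − A and D − C point in nearly the same direction, which is what the paper finds LLM outputs satisfy more often than human outputs.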