LLMs perform well on familiar games but fail when payoff structures change, suggesting they rely on memorized patterns rather than on an understanding of the underlying strategic principles.
This paper tests whether large language models can genuinely reason about game theory or merely reproduce answers seen in training. The researchers created modified versions of classic games (Prisoner's Dilemma and Rock-Paper-Scissors), altering payoffs and relabeling actions, to see whether LLMs adapted their strategies to the new incentive structure.
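To make the experimental idea concrete, the sketch below shows why a payoff change invalidates a memorized answer: a small dominant-strategy solver applied to a standard Prisoner's Dilemma and to a modified matrix where cooperation dominates. The payoff values and function names here are illustrative assumptions, not taken from the paper.

```python
def dominant_strategy(payoffs):
    """Return the row player's weakly dominant action, or None.

    payoffs[action][opponent_action] -> row player's payoff.
    """
    actions = list(payoffs)
    for a in actions:
        if all(
            all(payoffs[a][opp] >= payoffs[b][opp] for opp in payoffs[a])
            for b in actions if b != a
        ):
            return a
    return None

# Standard Prisoner's Dilemma payoffs: defection dominates.
standard = {
    "cooperate": {"cooperate": 3, "defect": 0},
    "defect":    {"cooperate": 5, "defect": 1},
}

# Hypothetical modified payoffs: cooperation now dominates, so a model
# that recalls the memorized "always defect" answer gets it wrong.
modified = {
    "cooperate": {"cooperate": 5, "defect": 3},
    "defect":    {"cooperate": 2, "defect": 1},
}

print(dominant_strategy(standard))  # defect
print(dominant_strategy(modified))  # cooperate
```

The same check generalizes to relabeled actions: the solver's answer depends only on the numbers, which is exactly the invariance the paper probes in LLMs.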