LLMs can reason about financial fundamentals when aided by retrieval, but they struggle significantly with trading signals and time-series patterns, a critical gap for real-world financial decision-making.
FinTradeBench is a benchmark of 1,400 questions that tests how well AI models reason about financial decisions when they must combine company fundamentals (from financial reports) with trading signals (from stock price patterns). The benchmark reveals that current AI models struggle with numerical reasoning and time-series data, even when given access to relevant information.