Current document-reasoning agents succeed through exhaustive search rather than strategic thinking; to handle real-world document workflows efficiently, they need better planning abilities, not just more attempts.
This paper introduces MADQA, a benchmark of 2,250 questions over 800 PDF documents, designed to test whether AI agents navigate documents strategically or merely search at random. The researchers found that while agents match human accuracy on some questions, they rely on brute-force trial-and-error rather than deliberate planning, falling 20% short of optimal performance.