LifeSim: Long-Horizon User Life Simulator for Personalized Assistant Evaluation

Feiyu Duan, Xuanjing Huang, Zhongyu Wei | March 12, 2026 | arXiv

Key Takeaway

Current LLMs struggle with implicit user intentions and long-term preference modeling: they can handle immediate requests but fail to infer what users actually need or to remember their preferences over extended interactions.

Summary

LifeSim creates realistic simulated users with beliefs, desires, and intentions to test how well AI assistants handle long-term, multi-scenario interactions. The benchmark evaluates whether an assistant can understand both explicit requests and hidden user needs, maintain an accurate user profile over time, and provide contextually appropriate responses across 1,200 diverse life scenarios.

evaluation agents applications

Key Terms

belief-desire-intention user-simulator implicit-intention long-horizon-evaluation