A Model-Free Universal AI

Yegon Kim, Juho Lee | February 26, 2026 | arXiv

Key Takeaway

You don't need to model the environment to build an optimal AI agent: learning action values directly can be just as powerful.

Summary

This paper introduces AIQI, described as the first agent to learn optimal behavior in general reinforcement learning without building an explicit model of its environment. Instead of predicting how the world works, it directly learns which actions produce the best outcomes. The result is theoretical: it shows that model-free approaches can match the performance of model-based agents in this setting.
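To make "learning action values directly" concrete, here is a minimal sketch of ordinary tabular Q-learning in Python. It is not AIQI, whose construction this summary does not describe; it only illustrates the model-free idea on a small scale. The five-state chain environment, the `step` function, and all hyperparameters are assumptions made up for this illustration. Note that the agent never stores or estimates the transition rule; it only updates a table of action values from the rewards it observes.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical toy chain: action 1 moves right, action 0 moves left.

    Reward is 1.0 only on reaching the last state, which ends the episode.
    The learner calls this as a black box and never inspects its rules.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values from experience alone (model-free)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy choice, breaking ties between equal values at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best]
                )
            next_state, reward, done = step(state, action, n_states)
            # Q-learning update: move the estimate toward the bootstrapped target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(q_learning())` returns one action index per state. In this toy chain the non-terminal states should end up preferring action 1 (move right), since that is the only route to the reward.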

reasoning training agents

Key Terms

model-free learning, Q-learning, reinforcement learning, universal induction