You can make LLMs predict continuous numeric values more efficiently by adding a tiny learned head that works with frozen representations, rather than decoding text or fine-tuning the entire model.
RELISH is a lightweight method for predicting numeric values directly from an LLM's internal representations. Instead of generating numbers as text, it attaches a small learned component that iteratively refines a latent state through attention over the frozen token representations, then outputs a single number. It outperforms existing approaches while adding only 0.01–0.04% extra parameters.
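To make the mechanism concrete, here is a minimal NumPy sketch of a head in this style: a learned latent vector cross-attends over frozen token representations for a few refinement steps, then a linear projection maps the latent to one scalar. All names, shapes, and the number of steps are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class LatentRegressionHead:
    """Hypothetical sketch of a RELISH-style regression head.

    A learned latent attends over frozen token representations for a few
    refinement steps, then a linear layer maps it to a single scalar.
    """
    def __init__(self, d_model, n_steps=3, seed=0):
        rng = np.random.default_rng(seed)
        self.z0 = rng.normal(0, 0.02, d_model)            # learned initial latent
        self.Wq = rng.normal(0, 0.02, (d_model, d_model)) # query projection
        self.Wk = rng.normal(0, 0.02, (d_model, d_model)) # key projection
        self.Wv = rng.normal(0, 0.02, (d_model, d_model)) # value projection
        self.w_out = rng.normal(0, 0.02, d_model)         # scalar output weights
        self.b_out = 0.0
        self.n_steps = n_steps
        self.d = d_model

    def __call__(self, H):
        """H: (seq_len, d_model) frozen token representations -> scalar."""
        z = self.z0
        K, V = H @ self.Wk, H @ self.Wv   # keys/values fixed across steps
        for _ in range(self.n_steps):
            q = z @ self.Wq
            attn = softmax(q @ K.T / np.sqrt(self.d))  # (seq_len,) weights
            z = z + attn @ V                           # residual refinement
        return float(z @ self.w_out + self.b_out)
```

Note the parameter budget: roughly three d×d matrices plus two d-vectors, which for typical hidden sizes is a vanishing fraction of a multi-billion-parameter model, consistent with the overhead figure above.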