You can make LLMs predict continuous numeric values more efficiently by adding a tiny learned head that works with frozen representations, rather than decoding text or fine-tuning the entire model.
RELISH is a lightweight method for predicting numeric values directly from an LLM's internal representations. Instead of generating numbers as text, it attaches a small learned component that iteratively refines a latent state through attention over the frozen token representations, then outputs a single number. It outperforms existing approaches while adding only 0.01–0.04% extra parameters.
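To make the mechanism concrete, here is a minimal NumPy sketch of a head in this style: a learned latent vector cross-attends over frozen token representations for a few refinement steps, then a linear projection maps the latent to one scalar. All names, shapes, and the number of steps are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class LatentRegressionHead:
    """Hypothetical sketch of a RELISH-style regression head.

    A learned latent attends over frozen token representations for a few
    refinement steps, then a linear layer maps it to a single scalar.
    """
    def __init__(self, d_model, n_steps=3, seed=0):
        rng = np.random.default_rng(seed)
        self.z0 = rng.normal(0, 0.02, d_model)            # learned initial latent
        self.Wq = rng.normal(0, 0.02, (d_model, d_model)) # query projection
        self.Wk = rng.normal(0, 0.02, (d_model, d_model)) # key projection
        self.Wv = rng.normal(0, 0.02, (d_model, d_model)) # value projection
        self.w_out = rng.normal(0, 0.02, d_model)         # scalar output weights
        self.b_out = 0.0
        self.n_steps = n_steps
        self.d = d_model

    def __call__(self, H):
        """H: (seq_len, d_model) frozen token representations -> scalar."""
        z = self.z0
        K, V = H @ self.Wk, H @ self.Wv   # keys/values fixed across steps
        for _ in range(self.n_steps):
            q = z @ self.Wq
            attn = softmax(q @ K.T / np.sqrt(self.d))  # (seq_len,) weights
            z = z + attn @ V                           # residual refinement
        return float(z @ self.w_out + self.b_out)
```

Note the parameter budget: roughly three d×d matrices plus two d-vectors, which for typical hidden sizes is a vanishing fraction of a multi-billion-parameter model, consistent with the overhead figure above.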