A technique that dynamically adjusts how much a model explores new outputs versus exploiting known good ones.