The smaller, faster neural network in speculative decoding that cheaply proposes a short run of candidate tokens, which the larger main model then verifies, keeping the candidates it agrees with and discarding the rest.
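As a rough illustration of the propose-then-verify loop described above, here is a minimal sketch of one greedy speculative decoding step. It makes several assumptions that are not in the definition: both models are stand-in Python callables that map a token prefix to a single next token, the names `speculative_step`, `toy_draft`, and `toy_main` are hypothetical, and verification is done by exact greedy match rather than the probabilistic acceptance rule used in sampling-based variants.

```python
from typing import Callable, List

# Hypothetical interface: a "model" maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(draft: Model, main: Model, prefix: List[int], k: int = 4) -> List[int]:
    """Return the tokens produced by one draft-and-verify round (greedy variant)."""
    # 1. The draft model proposes k candidate tokens, one after another.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The main model checks each candidate in order. A real system would
    #    score all k positions in a single batched forward pass; this sketch
    #    loops for clarity.
    accepted: List[int] = []
    for token in proposed:
        expected = main(prefix + accepted)
        if token != expected:
            # First disagreement: keep the main model's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(token)

    # 3. Every candidate was accepted, so the main model adds one more token.
    accepted.append(main(prefix + accepted))
    return accepted


# Toy stand-ins (assumptions for illustration only).
def toy_main(tokens: List[int]) -> int:
    # Counts upward modulo 10.
    return (tokens[-1] + 1) % 10


def toy_draft(tokens: List[int]) -> int:
    # Agrees with toy_main except right after token 3, where it guesses 0.
    return 0 if tokens[-1] == 3 else (tokens[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_step(toy_draft, toy_main, [0], k=4))
```

In this greedy variant the output always matches what the main model would have produced on its own. The draft model only affects how many tokens are confirmed per round, which is where the speedup would come from when its guesses are usually right.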