Converting trajectories or behaviors discovered during exploration into a trainable policy that can be deployed.