Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD — ThinkLLM