Learning to Present: Inverse Specification Rewards for Agentic Slide Generation — ThinkLLM