Evaluating LLM-Based Test Generation Under Software Evolution — ThinkLLM