MIDST Challenge at SaTML 2025: Membership Inference over Diffusion-models-based Synthetic Tabular data

Masoumeh Shafieinejad, Xi He, Mahshid Alinoori, John Jewell, Sana Ayromlou et al. | March 19, 2026 | arXiv

Key Takeaway

Synthetic tabular data from diffusion models may not be as privacy-safe as assumed: membership inference attacks can still reveal whether specific records were in the training data.

Summary

The MIDST challenge evaluates how well synthetic tabular data generated by diffusion models resists membership inference attacks. Researchers developed new attack methods to test whether the synthetic data truly hides information about individuals in the original dataset, and to measure privacy risk across different types of tabular data structures.
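To make the idea of a membership inference attack on synthetic tabular data concrete, here is a minimal sketch of one simple baseline: scoring each candidate record by its distance to the closest synthetic record. This is an illustrative assumption, not one of the attacks developed for the challenge. The function names (`dcr_score`, `tpr_at_fpr`), the toy data, and the choice of true-positive rate at a fixed false-positive rate as the success measure are all hypothetical choices made for this example. It also assumes purely numeric, already-scaled features; real tabular data with categorical columns would need encoding first.

```python
import math

def dcr_score(record, synthetic_rows):
    """Negative distance to the closest synthetic record.

    Higher score means the candidate sits closer to some synthetic row,
    which this baseline treats as weak evidence of training membership.
    """
    return -min(math.dist(record, row) for row in synthetic_rows)

def tpr_at_fpr(scores, labels, max_fpr=0.1):
    """Best true-positive rate over thresholds with false-positive rate <= max_fpr.

    labels: 1 for records that were in the training data, 0 otherwise.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best

# Toy data (hypothetical): three synthetic rows and four candidate records.
synthetic = [(0.1, 0.2), (0.9, 0.8), (0.5, 0.5)]
candidates = [(0.1, 0.21), (0.88, 0.8), (0.3, 0.9), (0.7, 0.1)]
labels = [1, 1, 0, 0]  # the first two candidates play the role of training members

scores = [dcr_score(c, synthetic) for c in candidates]
result = tpr_at_fpr(scores, labels, max_fpr=0.0)
print(result)
```

In this toy setup the two "member" candidates lie almost on top of synthetic rows, so a distance threshold separates them from the non-members without any false positives. Real synthetic data is rarely this leaky, which is why stronger attacks are needed to measure the actual risk.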

safety evaluation data

Key Terms

membership inference attack, diffusion models, synthetic data, differential privacy