Diffusion Models for Fast and Accurate Simulations of Low-Level CMS Experiment Data


One of the important aspects of searches for new physics at the Large Hadron Collider (LHC) involves the identification and reconstruction of single particles, jets and event topologies of interest in collision events. The End-to-End Deep Learning (E2E) project in the CMS experiment focuses on the development of these reconstruction and identification tasks with innovative deep learning approaches.

Diffusion-based generative models are strong candidates for Fast Simulation models. The idea of this project is to build a diffusion-based ML model that learns the underlying structure of the data and can in turn be used to generate novel samples from the given distribution. This project also aims to explore conditional diffusion models, which generate specific types of data given a certain input to the model.
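As a rough illustration of what a diffusion-based model builds on, the sketch below shows only the forward (noising) step commonly used in denoising diffusion models: a clean sample is mixed with Gaussian noise according to a variance schedule, and a network is later trained to undo that noise. Everything here is an assumption made for illustration, not part of the project specification: the linear schedule and its values, the function names, and the four-pixel example input are hypothetical. The sketch uses only the Python standard library; a real implementation would most likely use PyTorch tensors and a learned denoising network, and a conditional variant would additionally pass a label or other input to that network.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return per-step noise variances beta_1..beta_T (values are illustrative)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) up to each step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * v + sigma * e for v, e in zip(x0, noise)]
    return xt, noise

rng = random.Random(0)  # fixed seed so the sketch is reproducible
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Hypothetical flattened detector-image pixels (not real CMS data).
x0 = [0.0, 1.5, 0.2, 0.0]
xt, noise = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

In a training loop, the returned `noise` would serve as the regression target for the denoising network given `xt` and `t`; that network and the reverse sampling procedure are deliberately left out of this sketch.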


Total project length: 175/350 hours.

Task ideas

Expected results

Difficulty level



Python, PyTorch and some previous experience in Machine Learning.


Please use this link to access the test for this project.


Please DO NOT contact mentors directly by email. Instead, please email ml4-sci@cern.ch with Project Title and include your CV and test results. The mentors will then get in touch with you.

Corresponding Project

Participating Organizations