Masked Auto-Encoders for Efficient End-to-End Particle Reconstruction and Compression for the CMS Experiment


A key part of searches for new physics at the Large Hadron Collider (LHC) is the identification and reconstruction of single particles, jets, and event topologies of interest in collision events. The End-to-End Deep Learning (E2E) project in the CMS experiment develops these reconstruction and identification tasks using innovative deep learning approaches.

The data involved in these tasks are often sparse, low-level detector information, which is computationally expensive to store and process. Recent work has shown masked auto-encoders to be a viable alternative to generic autoencoder-based architectures for data compression and for downstream transfer-learning tasks. The aim of this project is to develop state-of-the-art Vision-Transformer-based hybrid masked auto-encoders that achieve competitive reconstruction and transfer-learning scores compared to pre-existing methods.
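To make the masking idea concrete, here is a minimal sketch of the patch-masking step that masked auto-encoders are generally built around: the input is split into patches, a random subset is hidden, and only the visible patches are passed to the encoder. It is written in plain Python for readability and is not the project's implementation. The 8x8 toy image, the patch size of 4, the mask ratio of 0.75, and the helper names `patchify` and `random_mask` are all assumptions made for illustration.

```python
import random


def patchify(image, patch):
    """Split a 2D list (H x W) into non-overlapping patch x patch blocks.

    Patches are returned in row-major order, each flattened to a list.
    """
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    patches = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            patches.append(
                [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return patches


def random_mask(num_patches, mask_ratio, rng):
    """Pick patches to hide uniformly at random.

    Returns (visible_ids, masked_ids), each sorted in ascending order.
    """
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    rng.shuffle(order)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Toy sparse "detector image" (hypothetical): mostly zeros with two hits.
image = [[0.0] * 8 for _ in range(8)]
image[1][2] = 3.5
image[5][6] = 1.2

patches = patchify(image, patch=4)  # 4 patches of 16 values each
visible, masked = random_mask(len(patches), mask_ratio=0.75, rng=random.Random(0))

# Only the visible patches would be fed to the encoder; a decoder would then
# be trained to reconstruct the values of the masked patches.
encoder_input = [patches[i] for i in visible]
```

Because the encoder only sees the visible subset, a high mask ratio reduces the amount of data it has to process, which is part of why this family of models is attractive for sparse detector data. A real model would perform the same steps on batched tensors in a framework such as PyTorch.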


Total project length: 175/350 hours.

Task ideas

Expected results

Difficulty level



C++, Python, PyTorch, TensorFlow, and some previous experience in deep learning.


Please use this link to access the test for this project.


Please DO NOT contact mentors directly by email. Instead, please email with Project Title and include your CV and test results. The mentors will then get in touch with you.

Corresponding Project

Participating Organizations