Evolutionary and Transformer Models for Symbolic Regression


Symbolic regression can rapidly provide solutions to scientific problems that have high computational complexity or are even intractable. It discovers a symbolic expression that describes data, such as a physical law. Current work in symbolic regression focuses either on evolutionary (genetic programming) approaches or on transformer-based solutions. This project will explore a combination of these ideas to build a new symbolic regression tool that can be applied to many problems in science. As a concrete testbed for these new algorithms, the project will focus on predicting physical quantities such as cross sections in high-energy physics. A cross section measures the probability that a particular process takes place in the interaction of elementary particles. Its measurement provides a testable link between theory and experiment, and on the theory side it is obtained mainly by calculating the squared amplitude.
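To make the evolutionary side of this concrete, here is a minimal sketch of symbolic regression by mutation-only evolutionary search. It is not the project's method and uses no transformer component. The toy target `x*x + x`, the operator set, the tree encoding as nested tuples, and all function names (`random_tree`, `mutate`, `evolve`, and so on) are assumptions chosen for illustration; full genetic programming would also use crossover.

```python
import operator
import random

# Hypothetical operator set for this sketch: binary +, -, * only.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def random_tree(rng, depth):
    """Grow a random expression tree; leaves are 'x' or a small constant."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else float(rng.randint(-2, 2))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def size(tree):
    """Count the nodes of a tree (used to cap expression growth)."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


def to_str(tree):
    """Render a tree as a fully parenthesised infix string."""
    if isinstance(tree, tuple):
        return f"({to_str(tree[1])} {tree[0]} {to_str(tree[2])})"
    return str(tree)


def mutate(rng, tree):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, 2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left), right)
    return (op, left, mutate(rng, right))


def mse(tree, xs, ys):
    """Mean squared error of the tree's predictions against the data."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def evolve(xs, ys, generations=60, pop_size=30, seed=0, max_size=31):
    """Elitist search: keep the pop_size best of parents plus mutated children."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng, rng.choice(population)) for _ in range(pop_size)]
        children = [c for c in children if size(c) <= max_size]
        ranked = sorted(population + children, key=lambda t: mse(t, xs, ys))
        population = ranked[:pop_size]
    return population[0]


# Toy data sampled from the assumed target law y = x*x + x.
XS = [i / 10 for i in range(-10, 11)]
YS = [x * x + x for x in XS]

if __name__ == "__main__":
    best = evolve(XS, YS)
    print(to_str(best), mse(best, XS, YS))
```

Because selection is elitist, the best error never increases from one generation to the next, but nothing guarantees that the search recovers the exact target expression. A transformer-based approach would instead predict the expression as a token sequence, and combining the two is the subject of this project.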


Total project length: 175/350 hours.

Task ideas and expected results


Significant experience with Transformer machine learning models in Python (preferably using PyTorch).

Difficulty Level



Please use this link to access the test for this project.


Please DO NOT contact mentors directly by email. Instead, please email ml4-sci@cern.ch with the project title and include your CV and test results. The mentors will then get in touch with you.

Corresponding Project

Participating Organizations