Evolutionary and Transformer Models for Symbolic Regression

Description

Symbolic regression searches for a symbolic expression that describes a dataset, such as a physical law. It can rapidly provide solutions to scientific problems that are computationally expensive or even intractable. Current work in symbolic regression focuses on either evolutionary/genetic programming approaches or transformer-based solutions. This project will explore a combination of these ideas towards a new symbolic regression tool that can be applied to many problems in science. As a concrete testbed for these new algorithms, the project will focus on predicting physical quantities such as cross sections in high-energy physics. A cross section is a measure of the probability that a particular process takes place in the interaction of elementary particles, and it provides a testable link between theory and experiment. Theoretically, it is obtained mainly by calculating the squared amplitude.

Duration

Total project length: 175/350 hours.

Task ideas and expected results

Develop symbolic regression models based on evolutionary and transformer approaches.
Develop a hybrid method based on elements of both approaches.
Benchmark these models on synthetic and high-energy physics datasets.
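
To make the evolutionary side of these tasks concrete, the following is a minimal sketch of symbolic regression on a synthetic dataset. It is an illustration only and not the project's method: the primitive set, the tuple-based tree encoding, and all function names (`random_tree`, `evaluate`, `mse`, `mutate`, `evolve`) are assumptions made for this example. It also uses mutation with truncation selection and elitism only, and omits the crossover step that a full genetic programming system would normally include.

```python
import math
import random

# Hypothetical primitive set for this sketch: binary operators and terminals.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0]


def random_tree(rng, depth=3):
    """Grow a random expression tree encoded as nested (op, left, right) tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def mse(tree, xs, ys):
    """Mean squared error of the tree on the data; non-finite errors map to inf."""
    total = 0.0
    for x, y in zip(xs, ys):
        d = evaluate(tree, x) - y
        total += d * d
    err = total / len(xs)
    return err if math.isfinite(err) else float("inf")


def mutate(tree, rng, depth=2):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth), right)
    return (op, left, mutate(right, rng, depth))


def evolve(xs, ys, rng, pop_size=200, generations=30, keep=50):
    """Mutation-only evolutionary search; the best `keep` trees always survive."""
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda t: mse(t, xs, ys))
        history.append(mse(population[0], xs, ys))
        survivors = population[:keep]
        children = [mutate(rng.choice(survivors), rng) for _ in range(pop_size - keep)]
        population = survivors + children
    best = min(population, key=lambda t: mse(t, xs, ys))
    return best, history


# Synthetic target for this sketch: y = x^2 + x, sampled on [-2, 2].
xs = [i / 4 for i in range(-8, 9)]
ys = [x * x + x for x in xs]
best, history = evolve(xs, ys, random.Random(0))
print("best expression:", best)
print("best error:", mse(best, xs, ys))
```

Because the survivors are carried over unchanged, the best error can never get worse from one generation to the next, although the search is not guaranteed to recover the exact target expression. A transformer-based approach would instead predict the expression tokens directly from the data, and a hybrid method could, for example, use such predictions to seed the initial population.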

Requirements

Significant experience with transformer machine learning models in Python (preferably using PyTorch).

Difficulty Level

Advanced

Mentors

Eric Reinhardt (University of Alabama)
Abdulhakim Alnuqaydan (Qassim University)
Sergei Gleyzer (University of Alabama)
Neeraj Anand (Indian Institute of Technology Dhanbad)
Harrison Prosper (Florida State University)
Nobuchika Okada (University of Alabama)
Marco Knipfer (University of Erlangen-Nürnberg)

Please DO NOT contact mentors directly by email. Instead, please email ml4-sci@cern.ch with the project title and include your CV and test results. The mentors will then get in touch with you.

Corresponding Project

SYMBA