Symbolic regression discovers a symbolic expression that describes data, such as a physical law. It can rapidly provide solutions to scientific problems that are computationally expensive or even intractable to solve directly. Previous work has explored Transformer models combined with genetic algorithms or reinforcement learning. Future work on this project might extend those approaches, but could also explore alternatives such as incorporating Kolmogorov-Arnold layers or novel LLM-based methods. As a concrete testbed for these new algorithms, the project will focus on predicting physical quantities such as cross sections in high-energy physics. A cross section is a measure of the probability that a particular process takes place in the interaction of elementary particles. It provides a testable link between theory and experiment. Theoretically, it is obtained mainly by calculating the squared amplitude.
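As a toy illustration of what "discovering a symbolic expression that describes data" means, the sketch below searches a small, hand-written library of candidate expressions and fits one scale constant to each. It is not the method used in this project: the brute-force search stands in for the Transformer, genetic, or reinforcement-learning search described above, and the candidate library, the function names `fit_scale` and `symbolic_regression`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical candidate library: expression string -> function of x.
# A real system would generate expressions instead of enumerating a fixed list.
CANDIDATES = {
    "x": lambda x: x,
    "x**2": lambda x: x**2,
    "sin(x)": np.sin,
    "exp(x)": np.exp,
    "1/x**2": lambda x: 1.0 / x**2,
}


def fit_scale(f, x, y):
    """Return the least-squares constant c for y ~ c * f(x) and the resulting MSE."""
    fx = f(x)
    c = float(fx @ y / (fx @ fx))
    mse = float(np.mean((c * fx - y) ** 2))
    return c, mse


def symbolic_regression(x, y):
    """Pick the candidate expression with the lowest mean squared error."""
    best = None
    for name, f in CANDIDATES.items():
        c, mse = fit_scale(f, x, y)
        if best is None or mse < best[2]:
            best = (name, c, mse)
    return best


# Synthetic data generated from a known expression (illustrative only).
x = np.linspace(1.0, 5.0, 50)
y = 3.0 / x**2

name, c, mse = symbolic_regression(x, y)
print(f"best expression: {c:.3f} * {name} (mse={mse:.2e})")
```

The search space here is five expressions with one free constant each. Realistic expression spaces grow combinatorially, which is why learned or evolutionary search strategies are of interest.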
Total project length: 175/350 hours.
Significant experience with Transformer machine learning models in Python (preferably using PyTorch).
Intermediate
Please DO NOT contact mentors directly by email. Questions should instead be directed to ml4-sci@cern.ch, which is forwarded to the mentors. To submit your proposal, CV, and test task solutions, please use this Google form.