State-of-the-art sequence-to-sequence (seq2seq) models have yielded spectacular advances in neural machine translation (NMT) (see, for example, Ref1). Recently, these models have been successfully applied to symbolic mathematics by treating it as translation from one sequence of symbols to another (Ref2). Many other tasks can be construed as translation in the same way. The goal of this proposed GSoC project is to create a tool that automatically provides an accurate symbolic representation of a histogram by framing the problem as translation from a histogram to a symbolic function. We call the project Fast Accurate Symbolic Empirical Representation Of Histograms (FASEROH).
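To make the translation framing concrete, here is a minimal sketch of how one training pair might be built: a histogram becomes a source sequence of discrete tokens, and the function that generated it becomes a target sequence of symbols. Everything in it is an assumption made for illustration rather than the project's actual design: the helper names `make_histogram` and `histogram_to_tokens`, the `H0`..`H9` token vocabulary, the 10-level quantization, and the prefix-notation target are all hypothetical.

```python
import random


def make_histogram(samples, n_bins, lo, hi):
    """Count samples falling into n_bins equal-width bins on [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        if lo <= x < hi:
            # min() guards against floating-point rounding at the upper edge.
            index = min(n_bins - 1, int((x - lo) / width))
            counts[index] += 1
    return counts


def histogram_to_tokens(counts, n_levels=10):
    """Quantize bin contents, relative to the tallest bin, into source tokens.

    The token names "H0".."H9" are a hypothetical vocabulary.
    """
    peak = max(counts) or 1  # avoid division by zero for an empty histogram
    return [f"H{min(n_levels - 1, int(n_levels * c / peak))}" for c in counts]


# Source sequence: a histogram of samples drawn from a unit Gaussian.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(5000)]
counts = make_histogram(samples, n_bins=20, lo=-4.0, hi=4.0)
source_tokens = histogram_to_tokens(counts)

# Target sequence: the unnormalized shape exp(-x^2 / 2) written in prefix
# notation (an assumed encoding), i.e. exp(neg(div(pow(x, 2), 2))).
target_tokens = ["exp", "neg", "div", "pow", "x", "2", "2"]

print(source_tokens)
print(target_tokens)
```

A seq2seq model would then be trained on many such `(source_tokens, target_tokens)` pairs. Coarse quantization discards information about the bin contents, so a real encoding would likely need finer resolution or continuous inputs.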
Total project length: 175 hours.
Requirements: Python and previous experience in machine learning.
Please DO NOT contact mentors directly by email. Instead, please email ml4-sci@cern.ch with Project Title and include your CV and test results. The mentors will then get in touch with you.