Each HEP frontier presents its own Big Data challenges, inviting the use of AI/ML to tackle them. Here we choose three specific challenges, one from each of the Energy, Intensity, and Cosmic Frontiers, that can be tackled during the school by small project teams.
Each has a dataset associated with it, which can be either downloaded to your local (or remote) computing resource or imported into Google Colab. Your team might then pick one of the approaches described in the lectures and try to apply it. We provide a number of tutorial notebooks below that introduce the datasets and offer some possible starting points.
On the last Thursday of the school, we will hear very short presentations from each project team in a common slide deck, and award various small prizes.
For maximum community value, project teams should plan to submit their project notebook back to this repo via a pull request, so everyone can benefit from their hard work. Fork this repo and get to work!
Have a look at the Getting Started slides to get started with GitHub and Google Colab.
Energy Frontier: here, the challenge is to develop ML models for LHC jets.
These could be for classification, or generative modeling.
We provide a dataset to explore that includes various boosted jets, including high-level jet features, jet-images, and per-particle features.
Many thanks to SSI lecturer Jennifer Ngadiuba, from whose recent course the materials for this challenge are drawn!
Cosmic Frontier: here, the challenge is to develop methods for mapping Dark Matter in the Universe from weak lensing data, after exploring some related inverse problems using LSST-like imaging data. We provide suitable weak lensing datasets. Many thanks to SSI Lecturer François Lanusse for the materials for this challenge, which are based on the materials used at the Quarks2Cosmos conference!
Intensity Frontier: here, the challenge is to... Many thanks to SSI Organizer Kazu Terao for the materials for this challenge!
Prerequisites for the course include basic knowledge of GitHub, Colab, and Python. You are therefore required to go through these slides, as well as the following two Python basics notebooks, before the course:
python_intro_part1.ipynb
- Quickstart
- Indentation
- Comments
- Variables
- Conditions and `if` statements
- Arrays
- Strings
- Loops: `while` and `for`
- Dictionaries
python_intro_part2.ipynb
- Functions
- Classes/Objects
- Inheritance
- Modules
- JSON data format
- Exception Handling
- File Handling
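As a quick self-check on several of the topics listed above (functions, classes, inheritance, modules, JSON, exception handling, and file handling), here is a minimal sketch. The class names, the helper functions, and the file name `particles.json` are hypothetical and chosen only for illustration; they do not come from the notebooks.

```python
import json
import os
import tempfile


class Particle:
    """Base class holding a name and a mass in GeV (illustrative only)."""

    def __init__(self, name, mass_gev):
        self.name = name
        self.mass_gev = mass_gev

    def to_dict(self):
        return {"name": self.name, "mass_gev": self.mass_gev}


class ChargedParticle(Particle):
    """Subclass that adds an electric charge, to show inheritance."""

    def __init__(self, name, mass_gev, charge):
        super().__init__(name, mass_gev)
        self.charge = charge

    def to_dict(self):
        record = super().to_dict()
        record["charge"] = self.charge
        return record


def save_particles(particles, path):
    """Write the particles to `path` as a JSON list of dictionaries."""
    with open(path, "w") as handle:
        json.dump([p.to_dict() for p in particles], handle)


def load_particles(path):
    """Return the records stored in `path`, or [] if it is missing or invalid."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return []


particles = [Particle("photon", 0.0), ChargedParticle("muon", 0.1057, -1)]
path = os.path.join(tempfile.mkdtemp(), "particles.json")
save_particles(particles, path)

records = load_particles(path)                          # round trip through JSON
missing = load_particles(path + ".does_not_exist")      # handled exception -> []
```

If every line of this reads comfortably, you are likely well prepared for the tutorial notebooks.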
We've organized a variety of tutorial notebooks below, grouped by Frontier (after some more general tutorials you may find helpful). Note that your project might well benefit from techniques you pick up by looking for tutorials across the Frontiers...
- Intro to NumPy: numpy_intro.ipynb
- Intro to Pandas: pandas_intro.ipynb
- Intro to Matplotlib: matplotlib_intro.ipynb
- Intro to PyTorch: pytorch_intro.ipynb and pytorch_NeuralNetworks.ipynb
- Intro to PyTorch Geometric: 1.IntroToPyG.ipynb
- Node classification with PyG on Cora citation dataset: 2.KCNodeClassificationPyG.ipynb
- Graph classification with PyG on molecular prediction dataset: 3.TUGraphClassification.ipynb
- Introduction to the dataset and tasks [slides: GettingStarted.pdf]
- Dataset exploration: 1.LHCJetDatasetExploration.ipynb
- MLP implementation with Keras: 2.JetTaggingMLP.ipynb
- Conv2D implementation with Keras: 3.JetTaggingConv2D.ipynb
- Conv1D implementation with Keras: 4.JetTaggingConv1D.ipynb
- GRU for LHC jet tagging task: 5.JetTaggingRNN.ipynb
- Graph classification with PyG on LHC jet dataset: 6.JetTaggingGCN.ipynb
- Transformer model for LHC jet tagging with TensorFlow: 7.JetTaggingTransformer.ipynb
- Anomaly detection for LHC jets with AE: 8.JetAnomalyDetectionAE.ipynb
- Anomaly detection for LHC jets with VAE: 9.JetAnomalyDetectionVAE.ipynb
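The idea behind autoencoder-based anomaly detection, the topic of the last two notebooks above, can be sketched in a few lines: fit a compressing model on background-like events only, then flag events that it reconstructs poorly. The sketch below is not the notebooks' implementation. It assumes synthetic Gaussian feature vectors instead of the jet dataset, and it swaps the neural network for a linear autoencoder fitted with an SVD (equivalent to PCA) so that it runs with NumPy alone. All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for per-jet feature vectors (NOT the school dataset):
# "background" events lie close to a 2-dimensional subspace of an 8-dimensional space.
n_features, n_latent = 8, 2
mixing = rng.normal(size=(n_latent, n_features))


def make_background(n_events):
    latent = rng.normal(size=(n_events, n_latent))
    return latent @ mixing + 0.05 * rng.normal(size=(n_events, n_features))


train = make_background(2000)
test_background = make_background(500)
test_anomalies = 2.0 * rng.normal(size=(500, n_features))  # not confined to the subspace

# "Train" the linear autoencoder: keep the top principal directions of the background.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:n_latent]  # shape (n_latent, n_features)


def encode(x):
    return (x - mean) @ components.T


def decode(z):
    return z @ components + mean


def anomaly_score(x):
    """Mean squared reconstruction error per event."""
    return np.mean((x - decode(encode(x))) ** 2, axis=1)


# Choose the threshold so that about 1% of training background events are flagged.
threshold = np.quantile(anomaly_score(train), 0.99)
false_positive_rate = np.mean(anomaly_score(test_background) > threshold)
detection_rate = np.mean(anomaly_score(test_anomalies) > threshold)
```

A nonlinear autoencoder or a VAE replaces `encode` and `decode` with trained networks, but the scoring and thresholding steps keep the same shape.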
1.PartI-DifferentiableForwardModel.ipynb
- How to write a probabilistic forward model for galaxy images with Jax + TensorFlow Probability
- How to optimize parameters of a Jax model
- Write a forward model of ground-based galaxy images
2.PartII-GenerativeModels.ipynb
- Write an Auto-Encoder in Jax+Haiku
- Build a Normalizing Flow in Jax+Haiku+TensorFlow Probability
- Bonus: Learn a prior by Denoising Score Matching
- Build a generative model of galaxy morphology from Space-Based images
3.PartIII-VariationalInference.ipynb
- Solve inverse problem by MAP
- Learn how to sample from the posterior using Variational Inference
- Bonus: Learn to sample with SDE
- Recover high-resolution posterior images for HSC galaxies
- Propose an inpainting model for masked regions in HSC galaxies
- Bonus: Demonstrate single band deblending!
- Open challenge
4.MappingDarkMatterDataChallenge.ipynb
- Use Jax to write a differentiable model for weak gravitational lensing
- Use an analytic Gaussian prior to solve the inverse problem (Wiener Filtering)
- Use Denoising Score Matching to learn the score of a prior distribution
- Use Stochastic Differential Equations for sampling from the posterior
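To make the Wiener filtering step above concrete, here is a minimal NumPy sketch under strong simplifying assumptions: the forward operator is the identity (pure denoising, with no shear-to-convergence mapping), the noise is white with a known variance, and the prior power spectrum is a toy Lorentzian-like function chosen only for illustration. Under a Gaussian prior and Gaussian noise, the posterior mean is obtained by weighting each Fourier mode of the data by S / (S + N), where S and N are the signal and noise power. The challenge notebook itself uses Jax and a lensing forward model, so treat this as an illustration of the principle and not as its implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix = 64
noise_sigma = 1.0

# Assumed toy prior power spectrum on the FFT grid (k in cycles per pixel).
kx = np.fft.fftfreq(n_pix)
ky = np.fft.fftfreq(n_pix)
k = np.sqrt(kx[:, None] ** 2 + ky[None, :] ** 2)
prior_power = 50.0 / (1.0 + (k / 0.05) ** 2)

# Draw a Gaussian random field with that spectrum by colouring white noise.
white = rng.normal(size=(n_pix, n_pix))
signal = np.fft.ifft2(np.sqrt(prior_power) * np.fft.fft2(white)).real

# Observe the field with additive white noise.
data = signal + noise_sigma * rng.normal(size=(n_pix, n_pix))

# Wiener filter: per-mode weight S / (S + N). With NumPy's unnormalised FFT the
# factor n_pix**2 is common to signal and noise power, so it cancels in the ratio.
noise_power = noise_sigma ** 2
wiener_weights = prior_power / (prior_power + noise_power)
estimate = np.fft.ifft2(wiener_weights * np.fft.fft2(data)).real

mse_noisy = np.mean((data - signal) ** 2)
mse_wiener = np.mean((estimate - signal) ** 2)
```

Comparing `mse_wiener` with `mse_noisy` shows how much the Gaussian prior helps. The learned priors in the later steps aim to improve on this baseline where the true field is non-Gaussian.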
- Pattern Recognition and Machine Learning, Bishop (2006)
- Deep Learning, Goodfellow et al. (2016) -- link
- Introduction to machine learning, Murray (2010) -- video lectures
- Stanford ML courses -- link