Skip to content

CMS Dark matter analysis implementation using Spark and HDF5

Notifications You must be signed in to change notification settings

abuchanan/spark-hdf5-cms

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

73 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Goal

Can big data tools and High Performance Computing (HPC) resources benefit data- and compute-intensive statistical analyses in high energy physics (HEP) to improve time-to-physics?

Synopsis

We use Spark to implement an active Large Hadron Collider (LHC) analysis, searching for Dark Matter with the Compact Muon Solenoid (CMS) detector as our use case. Our input data is in HDF5 format, we provide a custom HDF5 to Spark DataFrame reader. We also provide several examples to manipulate multiple DataFrames using UDFs, aggregations, etc. We provide functions specific to the CMS data, and evaluate the performance on the supercomputing resources provided by National Energy Research Scientific Computing Center (NERSC).

About

CMS Dark matter analysis implementation using Spark and HDF5

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Scala 88.0%
  • Python 9.0%
  • R 3.0%