Merge pull request #88 from sliwowitz/cms_alpaka
Project: Alpaka task-parallel constructs for CMS
davidlange6 authored Mar 12, 2024
2 parents f376e64 + 4f8182c commit b594bd9
Showing 2 changed files with 82 additions and 30 deletions.
52 changes: 52 additions & 0 deletions projects/cms-alpaka.yml
@@ -0,0 +1,52 @@
---
name: Alpaka for CMS
postdate: 2024-03-06
categories:
- Analysis tools
- Open science
- Computing
durations:
- 3 months
experiments:
- CMS
skillset:
- C++
- CUDA
status:
- Available
project:
- IRIS-HEP
location:
- Any
commitment:
- Any
program:
- IRIS-HEP fellow
shortdescription: Extending the [Alpaka](https://github.com/alpaka-group/alpaka) performance portability library with task-parallel constructs for the [CMS pixel reconstruction](https://github.com/cms-patatrack/pixeltrack-standalone/)
description: >
This project proposes to extend the [Alpaka](https://github.com/alpaka-group/alpaka) performance portability library with task-parallel constructs,
such as task graphs and cooperative groups, and to evaluate their performance in the [pixel track](https://github.com/cms-patatrack/pixeltrack-standalone/)
reconstruction software of the CMS experiment at CERN. As data volume and complexity surge, the CMS pixel
reconstruction process, crucial for accurate particle tracking and collision event analysis, demands optimized
computational strategies for timely data processing. Alpaka, facilitating development across diverse
hardware architectures by providing a unified API for writing parallel software for CPUs, GPUs, and FPGAs,
will be extended with task graph and cooperative groups APIs to meet this demand.
Integrating task graphs into Alpaka will streamline the scheduling and execution of interdependent tasks,
optimizing resource utilization and reducing time-to-solution for complex data analyses.
Cooperative groups will facilitate more flexible and efficient thread collaboration, crucial
for fine-grained parallelism and dynamic workload distribution. These developments aim to
improve the performance and scalability of CMS pixel track reconstruction algorithms,
ensuring faster and more accurate data analysis for high-energy physics research.
Students participating in this project will have the opportunity to contribute to
a state-of-the-art C++ library, working directly with a developer group that follows
good software engineering practices. They will gain hands-on experience in developing
and implementing advanced computational solutions within a real-world scientific framework,
strengthening their skills in high-performance computing and software development
in a collaborative research environment.
contacts:
- name: Jiri Vyskocil
email: [email protected]
- name: Volodymyr Bezguba
email: [email protected]
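The task-graph idea in the description above can be illustrated with a toy scheduler. This is a minimal Python sketch, not Alpaka's API (the proposal itself targets C++): the function `run_task_graph`, the `tasks` and `deps` dictionaries, and the four toy stages are all hypothetical names chosen for illustration. It only shows the core contract of a task graph, namely that each task runs after every task it depends on and receives their results.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_task_graph(tasks, deps):
    """Run every task after all of its dependencies.

    tasks: maps a task name to a callable (hypothetical toy stages).
    deps:  maps a task name to the ordered list of task names it depends on.
    Returns the execution order and the result of each task.
    """
    order = []
    results = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        inputs = [results[d] for d in deps.get(name, ())]
        results[name] = tasks[name](*inputs)
        order.append(name)
    return order, results


# Toy graph: "sort" and "total" both depend on "load"; "report" needs both.
tasks = {
    "load": lambda: [3, 1, 2],
    "sort": lambda xs: sorted(xs),
    "total": lambda xs: sum(xs),
    "report": lambda s, t: (s, t),
}
deps = {
    "load": [],
    "sort": ["load"],
    "total": ["load"],
    "report": ["sort", "total"],
}

order, results = run_task_graph(tasks, deps)
```

Here `sort` and `total` are independent of each other, so a real task-graph runtime would be free to execute them concurrently; this sketch runs them sequentially and leaves their relative order unspecified.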
60 changes: 30 additions & 30 deletions projects/uproot-awkwardforth-refactor.yml
@@ -20,39 +20,39 @@ commitment:
program:
- IRIS-HEP fellow
shortdescription: >
Keeping the functionality of Uproot's accelerated reading through AwkwardForth, but
making it more maintainable by removing mutable state/coding it in a functional style.
description: >
Uproot is a Python library for reading and writing ROOT files (the most common file
format in particle physics). While it is relatively fast at reading "columnar" data,
either arrays of numbers or arrays of numbers that are grouped into variable-length
lists, any other data type requires iteration, which is a performance limitation in
the Python language. ("for" loops in Python are hundreds of times slower than in compiled
languages.) To improve this situation, we introduced a domain-specific language (DSL)
called AwkwardForth, in which loops are much faster to execute than they are in Python
(again by factors of hundreds). This language was created in 2021 (https://arxiv.org/abs/2102.13516)
and added to Uproot in 2022 (https://arxiv.org/abs/2303.02202). In the end, an example
data structure (std::vector<std::vector<float>>) could be read 400× faster with
AwkwardForth than with Python. Users of Uproot don't have to opt in or change their
code; it just runs faster.
That would be the end of the story, except that the AwkwardForth-generating code in
Uproot has been very hard to maintain. In part, it's because it's doing something
complicated: generating code that runs later or generating code that generates code
that runs later. But it is also more complicated than it needs to be, with Python
objects that change their own attributes in arbitrary ways as information about what
AwkwardForth needs to be generated accumulates. The code would be much easier to read
and reason about if it were stateless or append-only (see: functional programming),
and it easily could be. The goal of this project is to restructure the AwkwardForth-generating
code in a functional style, to "remove the moving parts."
To be clear, the project will not require you to understand the AwkwardForth that is
being generated (though that's not a bad thing), and it will not require you to figure
out how to generate the right AwkwardForth for a given data type. This part of the problem
has been solved and there are many unit tests that can check correctness, to allow you to
do test-driven development. The project is about software engineering: how to structure
code so that it can be read and understood, while keeping the problem-solving aspect
unchanged.
contacts:
- name: Ioana Ifrim
email: [email protected]
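The "stateless or append-only" style that the description asks for can be sketched as follows. This is a hedged illustration, not Uproot's actual code: the `Snippet` class, the `generate` function, and the placeholder instructions (`READ_LENGTH`, `BEGIN_LOOP`, `READ_FLOAT`, `END_LOOP`) are hypothetical and are not real AwkwardForth. The point is only that each generation step returns a new immutable value instead of mutating attributes on a shared object.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Snippet:
    """Immutable, append-only record of generated code lines."""

    lines: Tuple[str, ...] = ()

    def append(self, *new_lines: str) -> "Snippet":
        # Return a new Snippet; the original is never modified.
        return Snippet(self.lines + new_lines)


def generate(type_name: str, code: Snippet = Snippet()) -> Snippet:
    """Generate placeholder reader code for 'float' or 'vector<T>' type names.

    The emitted instructions are made-up tokens, not real AwkwardForth.
    """
    if type_name == "float":
        return code.append("READ_FLOAT")
    if type_name.startswith("vector<") and type_name.endswith(">"):
        inner = type_name[len("vector<"):-1]
        opened = code.append("READ_LENGTH", "BEGIN_LOOP")
        # Recurse on the element type, then close the loop we opened.
        return generate(inner, opened).append("END_LOOP")
    raise ValueError(f"unsupported type: {type_name}")


nested = generate("vector<vector<float>>")

# Passing an existing Snippet extends it without changing the original.
base = Snippet(("HEADER",))
extended = generate("float", base)
```

Because no step modifies its input, any intermediate `Snippet` can be inspected or compared in a unit test without worrying about what ran before it, which is the property the project description calls removing "the moving parts."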