A multiphase multiphysics dataset and benchmarks for scientific machine learning

HPCForge/BubbleML

BubbleML

Paper

A multiphase, multiphysics dataset of boiling processes. These simulations can be used to model datacenter cooling systems, such as liquid coolant flowing across a GPU. They can also model the cooling of nuclear waste, where a pool of liquid sits on a heated surface.

SubCooled Temperature

We hope that BubbleML will be useful to members of the Thermal Science community who are interested in exploring and applying machine learning techniques. We also believe this dataset offers interesting challenges to the scientific machine learning community: handling multiphase data, handling complex boundary conditions, achieving stability in long auto-regressive rollouts, etc.

Documentation and Examples

Documentation discussing the data fields, format, and relevant parameters can be found in bubbleml_data/DOCS.md. We also provide a set of examples illustrating how to use the dataset.

The examples are Jupyter Notebooks showing how to read and visualize BubbleML and how to train a Fourier Neural Operator on the BubbleML dataset. They are stand-alone and use a small, downsampled version of the Subcooled Pool Boiling study. The examples are intended to show (1) how to load the dataset, (2) how to read tensors from the dataset, and (3) how to set up model training for the dataset. Extended descriptions can be found in bubbleml_data/DOCS.md. To run the examples, follow the environment setup for the SciML code.
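As a rough illustration of step 3, the sketch below pairs time-ordered frames into (input, target) samples with a sliding window, which is one common way to prepare simulation data for a next-step prediction model. It is not taken from the BubbleML notebooks: the function name `make_training_pairs`, the `history` and `horizon` parameters, and the use of plain integers as stand-in frames are all assumptions made for illustration. The notebooks remain the reference for how the dataset is actually loaded and batched.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def make_training_pairs(
    frames: Sequence[Frame], history: int = 1, horizon: int = 1
) -> List[Tuple[List[Frame], List[Frame]]]:
    """Slide a window over time-ordered frames.

    Each sample holds `history` consecutive input frames followed by
    the next `horizon` frames as prediction targets.
    (Hypothetical helper, not part of the BubbleML code base.)
    """
    if history < 1 or horizon < 1:
        raise ValueError("history and horizon must both be at least 1")

    pairs = []
    last_start = len(frames) - history - horizon
    for start in range(last_start + 1):
        inputs = list(frames[start:start + history])
        targets = list(frames[start + history:start + history + horizon])
        pairs.append((inputs, targets))
    return pairs


# Stand-in data: each "frame" is just its timestep index. In practice a
# frame would be a tensor of simulation fields read from the dataset.
frames = list(range(6))
pairs = make_training_pairs(frames, history=2, horizon=1)

for inputs, targets in pairs:
    print(inputs, "->", targets)
```

A larger `horizon` yields multi-step targets, which may be useful when training for longer rollouts, at the cost of fewer samples per simulation.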

Download BubbleML

BubbleML is publicly available and open source. We provide links to download each study in bubbleml_data/README.md.

Extending BubbleML

It's possible that BubbleML will not match your needs. For instance, in BubbleML's current iteration, each study varies one parameter. One obvious extension is to vary multiple parameters at once, such as both the heater and liquid temperatures, which will lead to different phenomena. Another idea is running low-resolution simulations to study upscaling models. And, of course, some labs may just want to generate very large datasets containing hundreds or thousands of individual simulations!
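To make the multi-parameter idea concrete, here is a minimal sketch that enumerates every combination of two parameters as a list of run configurations. The parameter names, the values, and the `run_id` format are hypothetical placeholders rather than BubbleML's actual settings, and the sketch does not invoke Flash-X; it only shows how quickly the number of runs grows when two parameters vary together.

```python
from itertools import product
from typing import Dict, List, Sequence, Union

RunConfig = Dict[str, Union[str, float]]


def build_sweep(
    heater_temps: Sequence[float], liquid_temps: Sequence[float]
) -> List[RunConfig]:
    """Return one configuration per (heater, liquid) temperature pair.

    Hypothetical helper for illustration; field names are assumptions.
    """
    return [
        {"run_id": f"run_{i:03d}", "heater_temp": heater, "liquid_temp": liquid}
        for i, (heater, liquid) in enumerate(product(heater_temps, liquid_temps))
    ]


# Illustrative values in arbitrary units, not the values used in BubbleML.
heater_temps = [90.0, 100.0, 110.0]
liquid_temps = [50.0, 60.0]

sweep = build_sweep(heater_temps, liquid_temps)

for config in sweep:
    print(config)
```

With three heater values and two liquid values this yields six runs; each configuration would then need to be translated into whatever input format the simulation setup expects.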

To support such efforts, we provide a reproducibility capsule for running your own boiling simulations with Flash-X. It includes lab notebooks for running simulations, as well as the analysis scripts and submission files used to generate BubbleML.

Models

Checkpoints for the models mentioned in the paper, along with their respective results, are listed in the model zoo. (Note: metrics will not necessarily match the paper. We hope that this page serves as a "live" listing that shows the best results thus far.)

Running SciML Code

Please refer to the SciML README.md.

Running Optical Flow Benchmarks

Please refer to the Optical Flow README.md.

Citation

If you have found BubbleML useful in your research, please consider citing the following paper:

@inproceedings{
    hassan2023bubbleml,
    title={Bubble{ML}: A Multi-Physics Dataset and Benchmarks for Machine Learning},
    author={Sheikh Md Shakeel Hassan and Arthur Feeney and Akash Dhruv and Jihoon Kim and 
            Youngjoon Suh and Jaiyoung Ryu and Yoonjin Won and Aparna Chandramowlishwaran},
    booktitle={Advances in Neural Information Processing Systems},
    year={2023},
    url={https://openreview.net/forum?id=0Wmglu8zak}
}