A package for machine learning inference in FPGAs. We create firmware implementations of machine learning algorithms using a high-level synthesis (HLS) language. We translate models from traditional open-source machine learning packages into HLS that can be configured for your use case!
If you have any questions, comments, or ideas regarding hls4ml or just want to show us how you use hls4ml, don't hesitate to reach us through the discussions tab.
For more information visit the webpage: https://fastmachinelearning.org/hls4ml/
Detailed tutorials on how to use hls4ml's various functionalities can be found here.
pip install hls4ml
To install the extra dependencies for profiling:
pip install hls4ml[profiling]
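To confirm that the installation succeeded, one option is to query the installed package metadata from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper written for illustration and is not part of hls4ml.

```python
from importlib import metadata


def installed_version(package="hls4ml"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if hls4ml is installed, otherwise None
print(installed_version())
```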
import hls4ml
# Fetch a Keras model from our example repository
# This will download our example model to your working directory and return an example configuration file
config = hls4ml.utils.fetch_example_model('KERAS_3layer.json')
# You can print the configuration to see some default parameters
print(config)
# Convert it to an HLS project
hls_model = hls4ml.converters.keras_to_hls(config)
# Print full list of example models if you want to explore more
hls4ml.utils.fetch_example_list()
Building a project with Xilinx Vivado HLS (after downloading and installing from here)
Note: Vitis HLS is not yet supported. Vivado HLS versions between 2018.2 and 2020.1 are recommended.
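The recommended version range in the note above can be checked programmatically. The following is a rough sketch, assuming the version is available as a string such as `2019.2`; `parse_version` and `is_recommended` are hypothetical helpers for illustration and are not provided by hls4ml.

```python
import re

# Bounds taken from the note above (2018.2 to 2020.1, inclusive)
MIN_VERSION = (2018, 2)
MAX_VERSION = (2020, 1)


def parse_version(text):
    """Extract (year, release) from a string such as '2019.2' or 'v2019.2.1'."""
    match = re.search(r"(\d{4})\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_recommended(text):
    """Return True if the version falls inside the recommended range."""
    return MIN_VERSION <= parse_version(text) <= MAX_VERSION


print(is_recommended("2019.2"))
print(is_recommended("2020.2"))
```

Comparing `(year, release)` tuples avoids the pitfalls of comparing version strings or floats directly.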
# Use Vivado HLS to synthesize the model
# This might take several minutes
hls_model.build()
# Print out the report if you want
hls4ml.report.read_vivado_report('my-hls-test')
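Because `hls_model.build()` depends on an installed Vivado HLS, it can help to confirm that the tool is reachable before starting a synthesis run that may take several minutes. The sketch below assumes the executable is named `vivado_hls` and is expected on your `PATH`; adjust the name if your installation differs. `find_tool` is a hypothetical helper, not part of hls4ml.

```python
import shutil


def find_tool(executable="vivado_hls"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


path = find_tool()
if path is None:
    # Assumption: the build step needs this executable to be discoverable
    print("vivado_hls not found on PATH; hls_model.build() would likely fail")
else:
    print(f"Using {path}")
```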
If you use this software in a publication, please cite the software:
@software{fastml_hls4ml,
author = {{FastML Team}},
title = {fastmachinelearning/hls4ml},
year = 2023,
publisher = {Zenodo},
version = {v0.8.1},
doi = {10.5281/zenodo.1201549},
url = {https://github.com/fastmachinelearning/hls4ml}
}
and the first publication:
@article{Duarte:2018ite,
author = "Duarte, Javier and others",
title = "{Fast inference of deep neural networks in FPGAs for particle physics}",
eprint = "1804.06913",
archivePrefix = "arXiv",
primaryClass = "physics.ins-det",
reportNumber = "FERMILAB-PUB-18-089-E",
doi = "10.1088/1748-0221/13/07/P07027",
journal = "JINST",
volume = "13",
number = "07",
pages = "P07027",
year = "2018"
}
Additionally, if you use specific features developed in later papers, please cite those as well. For example, CNNs:
@article{Aarrestad:2021zos,
author = "Aarrestad, Thea and others",
title = "{Fast convolutional neural networks on FPGAs with hls4ml}",
eprint = "2101.05108",
archivePrefix = "arXiv",
primaryClass = "cs.LG",
reportNumber = "FERMILAB-PUB-21-130-SCD",
doi = "10.1088/2632-2153/ac0ea1",
journal = "Mach. Learn. Sci. Tech.",
volume = "2",
number = "4",
pages = "045015",
year = "2021"
}
@article{Ghielmetti:2022ndm,
author = "Ghielmetti, Nicol\`{o} and others",
title = "{Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml}",
eprint = "2205.07690",
archivePrefix = "arXiv",
primaryClass = "cs.CV",
reportNumber = "FERMILAB-PUB-22-435-PPD",
doi = "10.1088/2632-2153/ac9cb5",
journal = "Mach. Learn. Sci. Tech.",
year = "2022"
}
binary/ternary networks:
@article{Loncar:2020hqp,
author = "Ngadiuba, Jennifer and others",
title = "{Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML}",
eprint = "2003.06308",
archivePrefix = "arXiv",
primaryClass = "cs.LG",
reportNumber = "FERMILAB-PUB-20-167-PPD-SCD",
doi = "10.1088/2632-2153/aba042",
journal = "Mach. Learn. Sci. Tech.",
volume = "2",
pages = "015001",
year = "2021"
}
If you benefited from participating in our community, we ask that you please acknowledge the Fast Machine Learning collaboration, and particular individuals who helped you, in any publications. Please use the following text for this acknowledgment:
We acknowledge the Fast Machine Learning collective as an open community of multi-domain experts and collaborators. This community and <names of individuals>, in particular, were important for the development of this project.
We gratefully acknowledge previous and current support from the U.S. National Science Foundation (NSF) Harnessing the Data Revolution (HDR) Institute for Accelerating AI Algorithms for Data Driven Discovery (A3D3) under Cooperative Agreement No. PHY-2117997, U.S. Department of Energy (DOE) Office of Science, Office of Advanced Scientific Computing Research under the Real‐time Data Reduction Codesign at the Extreme Edge for Science (XDR) Project (DE-FOA-0002501), DOE Office of Science, Office of High Energy Physics Early Career Research Program (DE-SC0021187, DE-0000247070), and the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant No. 772369).