Releases · fastmachinelearning/hls4ml
edelweiss 0.8.1
What's Changed
- Fix for #905 by @calad0i in #906
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #921
- Fix logos in README.md by @vloncar in #930
- Fix writer precision when fp bits >= 14 by @calad0i in #909
- Let repack_stream optimizer inherit original precision by @calad0i in #907
- Update A3D3 grant no. by @schsu in #941
- Add precision inheritance when generating stream clone by @calad0i in #911
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #942
- Quartus multi out with stream fix by @calad0i in #908
- Fix profiling for Keras LSTM layers. by @Landay7 in #940
- Fix for multiple inputs that may get out of order by @jmduarte in #937
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #944
- Bump actions/upload-artifact from 3 to 4 by @dependabot in #943
- better replace_node fn by @calad0i in #934
- bump to 0.8.1 by @jmitrevs in #945
New Contributors
Full Changelog: v0.8.0...v0.8.1
edelweiss 0.8.0
What's Changed
- Decouple pipeline style from strategy by @vloncar in #781
- Don't use reader in ModelGraph and layers by @vloncar in #770
- Remove tf_to_hls by @vloncar in #795
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #796
- Fix parsing of QConv2DBatchnorm weights by @vloncar in #802
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #801
- Discussion - Inlined Conv slows down latency significantly (up to x15 - x20) by @bo3z in #800
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #807
- Fix over-allocation of bits for quantised po2 by @bo3z in #806
- Propagate zeros from Conv layers to multiplication config by @bo3z in #797
- Fix Vitis Conv1D/2D latency strategy by @vloncar in #815
- Improved parsing of pytorch models using torch.FX - Clean by @JanFSchulte in #799
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #816
- Support for parsing nested models by @vloncar in #794
- Fix loading weights in n-dim dense -> 1x1 conv by @vloncar in #821
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #828
- Fix loading weights in GarNetStacked and GarNet internal array precisions by @joshlerner in #827
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #830
- Fix profiling for GRU/LSTM by @drankincms in #833
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #835
- remove obsolete and unused docker directory by @jmitrevs in #836
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #842
- Remove obsolete parameter mapping between pytorch and keras by @JanFSchulte in #847
- Make binary CNN match between Keras and hls4ml by @jmitrevs in #804
- No longer make ExponentPrecisionType and XnorPrecisionType inherit from IntegerPrecisionType by @jmitrevs in #845
- Add support for flattening to the pytorch parser by @JanFSchulte in #852
- Add option to configure IP version by @AdrianAlan in #851
- Bug fix for named nn.Sequential in pytorch parser by @JanFSchulte in #848
- Add QDepthwiseConv2D, DepthwiseConv2D, DepthwiseConv1D support by @jmitrevs in #834
- Symbolic expressions in hls4ml by @vloncar in #660
- Update dependencies, add testing extras by @jmitrevs in #837
- Bump actions/checkout from 3 to 4 by @dependabot in #866
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #869
- try to use new runners for gitlab CI by @jmitrevs in #879
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #880
- Fix weight precision format string by @vloncar in #877
- add acknowledgments by @jmduarte in #862
- Support for quantized SeparableConv1D/2D by @vloncar in #861
- Speed up Keras profiling by @AdrianAlan in #863
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #882
- Fix profiling SeparableConv1D and SeparableConv2D by @qberthet in #891
- Add support for filt_height==1 for streaming quartus conv2d by @jmitrevs in #886
- Fix config structure name in pragma for SeparableConv1D by @qberthet in #884
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #895
- Fix bit overflow with softmax by @calad0i in #887
- bump 0.8.0rc1 by @jmitrevs in #915
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #902
- Add funding acknowledgements by @jmduarte in #918
- Fix fetching models from example-models repo by @vloncar in #919
- add blank line to make rst format correct by @jmitrevs in #923
- Update default FPGA part number from KU115 to VU13P by @jmduarte in #924
- update to 0.8.0 by @jmitrevs in #925
New Contributors
- @pre-commit-ci made their first contribution in #796
- @joshlerner made their first contribution in #827
- @qberthet made their first contribution in #891
Full Changelog: v0.7.1...v0.8.0
edelweiss 0.8.0rc1
What's Changed
- Decouple pipeline style from strategy by @vloncar in #781
- Don't use reader in ModelGraph and layers by @vloncar in #770
- Remove tf_to_hls by @vloncar in #795
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #796
- Fix parsing of QConv2DBatchnorm weights by @vloncar in #802
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #801
- Discussion - Inlined Conv slows down latency significantly (up to x15 - x20) by @bo3z in #800
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #807
- Fix over-allocation of bits for quantised po2 by @bo3z in #806
- Propagate zeros from Conv layers to multiplication config by @bo3z in #797
- Fix Vitis Conv1D/2D latency strategy by @vloncar in #815
- Improved parsing of pytorch models using torch.FX - Clean by @JanFSchulte in #799
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #816
- Support for parsing nested models by @vloncar in #794
- Fix loading weights in n-dim dense -> 1x1 conv by @vloncar in #821
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #828
- Fix loading weights in GarNetStacked and GarNet internal array precisions by @joshlerner in #827
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #830
- Fix profiling for GRU/LSTM by @drankincms in #833
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #835
- remove obsolete and unused docker directory by @jmitrevs in #836
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #842
- Remove obsolete parameter mapping between pytorch and keras by @JanFSchulte in #847
- Make binary CNN match between Keras and hls4ml by @jmitrevs in #804
- No longer make ExponentPrecisionType and XnorPrecisionType inherit from IntegerPrecisionType by @jmitrevs in #845
- Add support for flattening to the pytorch parser by @JanFSchulte in #852
- Add option to configure IP version by @AdrianAlan in #851
- Bug fix for named nn.Sequential in pytorch parser by @JanFSchulte in #848
- Add QDepthwiseConv2D, DepthwiseConv2D, DepthwiseConv1D support by @jmitrevs in #834
- Symbolic expressions in hls4ml by @vloncar in #660
- Update dependencies, add testing extras by @jmitrevs in #837
- Bump actions/checkout from 3 to 4 by @dependabot in #866
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #869
- try to use new runners for gitlab CI by @jmitrevs in #879
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #880
- Fix weight precision format string by @vloncar in #877
- add acknowledgments by @jmduarte in #862
- Support for quantized SeparableConv1D/2D by @vloncar in #861
- Speed up Keras profiling by @AdrianAlan in #863
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #882
- Fix profiling SeparableConv1D and SeparableConv2D by @qberthet in #891
- Add support for filt_height==1 for streaming quartus conv2d by @jmitrevs in #886
- Fix config structure name in pragma for SeparableConv1D by @qberthet in #884
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #895
- Fix bit overflow with softmax by @calad0i in #887
- bump 0.8.0rc1 by @jmitrevs in #915
New Contributors
- @pre-commit-ci made their first contribution in #796
- @joshlerner made their first contribution in #827
- @qberthet made their first contribution in #891
Full Changelog: v0.7.1...v0.8.0rc1
delphinium 0.7.1
What's Changed
- bump version to v0.7.0 by @jmduarte in #778
- Fix for 2D conv layers in the special case of io_parallel with full parallelization by @drankincms in #760
- Fix RNN layers when strategy=resource by @vloncar in #780
- Update Jenkins test environment to avoid dependency hell by @vloncar in #786
- Explicitly set strategy for pointwise conv by @vloncar in #785
- Minor docs fixes for 0.7.1 by @vloncar in #788
- bump 0.7.1 by @jmitrevs in #791
Full Changelog: v0.7.0...v0.7.1
delphinium
What's Changed
- fix conv1d io_parallel resource by @jmitrevs in #403
- Speed up CI tests by @thesps in #407
- Fix GlobalPooling1D Layers by @jmduarte in #399
- Fix batched multiple inputs by @jmduarte in #414
- Fixed 'qkeras_mnist_dense' example build problem #423 by @siorpaes in #424
- Update for pyyaml 6.0 by @thesps in #435
- `axi_stream_driver` update by @nicologhielmetti in #420
- Reshape fixes: don't repack stream for flatten; remove final reshape by @jmduarte in #443
- Fix Conv2D with `io_type = io_parallel` & `Strategy: Resource` by @thesps in #448
- Support applying Softmax over multidimensional tensors by @vloncar in #384
- Disable some unsupported layers by @thesps in #447
- Fixes: quantized_relu & unsigned profiling part II by @thesps in #441
- GarNet and GarNetStack in config.py by @yiiyama in #344
- support ZeroPadding layers by @jmduarte in #480
- New backend development framework by @vloncar in #395
- Register `ApplyAlpha` layer templates by @thesps in #499
- Parsing extended by @nicologhielmetti in #501
- Remove intermediate casting in product by @jmitrevs in #490
- Add QKeras as a package dependency by @vloncar in #511
- Copy flows from config by @thesps in #510
- VivadoAccelerator backend updates by @thesps in #508
- Optimized look-up table by @nemerchiedde in #527
- Upsampling2D test case by @ChiRuiChen in #520
- Support UpSampling1D by @vloncar in #475
- RNN support (part 1) by @vloncar in #521
- Quartus Custom Matrix Multiplication & Quantization by @bo3z in #523
- Vivado-equivalent implementation of Softmax on Quartus by @bo3z in #540
- Ensure 2 bits for scale in po2 quantizers by @vloncar in #531
- Link update by @bkmgit in #519
- Fix removal of nodes ingested by multiple downstream nodes by @jmduarte in #544
- Enable SeparableConv2d by @jmduarte in #547
- Extension API by @vloncar in #528
- change string ReuseFactor to int by @jmitrevs in #416
- Make the size of bn scale and bias what they really are by @jmitrevs in #532
- Raise runtime error when a layer is named `input` by @jmduarte in #482
- fix insertion before a node with multiple inputs + support additional broadcasting by @jmduarte in #551
- Pointwise conv1d/2d resource by @jmduarte in #471
- Quartus Embedding Layer by @bo3z in #548
- Fix for QActivations passed as an argument by @AdrianAlan in #553
- Don't override precision directly in the QKeras optimizer by @vloncar in #567
- Remove the in/out size from top function by @vloncar in #559
- Transpose2d, Concatenate2d, and up to 3 Clones for io_stream by @jmduarte in #402
- Remove io_serial as io_stream and add some more info in docs. by @Duchstf in #334
- Update docs for v0.6.0 by @thesps in #453
- Use correct number of args for multiple outputs by @apfusco in #487
- Fixed a few typos in the documentation by @pitmonticone in #467
- returning integer from _compute_n_samples by @JochiSt in #537
- Providing support for Alveo boards by @selwyn96 in #552
- Make layer names case sensitive in config. by @jmitrevs in #577
- Add issue and PR templates by @jmduarte in #582
- Vivado Backend GRU/LSTM support by @drankincms in #560
- Update CI template syntax by @thesps in #593
- Update flow dependencies by @vloncar in #588
- Fix parsing of ZeroPadding layers by @vloncar in #595
- remove cppname by @jmitrevs in #562
- Remove email helpline from the docs by @vloncar in #601
- Fixes for GRU/LSTM in Vivado backend by @drankincms in #598
- Remove io_serial by @vloncar in #609
- Fix test_graph by @vloncar in #611
- Override parent backend optimizer passes with derived backend passes by @thesps in #597
- Enforce function pipelining when using io_parallel with Resource strategy by @vloncar in #605
- FIFO depth optimization by @nicologhielmetti in #509
- Add tracing support for the quartus backend by @jmitrevs in #583
- Quartus streaming support for Activations, Dense & Batch Normalization by @bo3z in #557
- QConv alpha != 1 bug fix by @bo3z in #612
- Quartus Stream Embedding by @bo3z in #625
- change master to main by @jmitrevs in #602
- Edit order of the optimizers in the flow so that BramFactor is followed by @jmitrevs in #621
- Softmax LUT Optimization by @bo3z in #570
- Quartus Synthesis Flow Improvement by @bo3z in #618
- Quartus Extensions by @bo3z in #628
- Quartus GRU by @bo3z in #596
- Quartus Merge layers by @bo3z in #634
- fix nondefault project name handling by @jmitrevs in #626
- Fix parsing of logic synthesis reports by @vloncar in #639
- Fix conv1d stream implementation hls directives by @Jonathan-Shoemaker in #635
- Implementation and optimizations linked to Simple-RNN and LSTM for qu… by @nemerchiedde in #575
- Softsign optimization by @nemerchiedde in #585
- Parallel CNNs, Pooling & Image Layers for Quartus Backend by @bo3z in #561
- Quartus Streaming Softsign (PR #585 contd.) by @bo3z in #655
- Remove final reshapes even for Quartus by @jmitrevs in #661
- Unrolled CNN implementation by @vloncar in #600
- the strategy was not propagated in the pytest by @jmitrevs in #663
- Fix keras model loading issue with loading model with KerasH5 by @calad0i in #664
- append applied_flows container before filling instead of after by @jmitrevs in #641
- set version using `setuptools_scm` by @jmduarte in #479
- Argmax Softmax by @bo3z in #627
- Fix version extraction in Sphinx config by @vloncar in #669
- Add requested citations to README by @jmduarte in #615
- skip BatchNorm fusion when input/output is used multiple times by @jmduart...
delphinium rc1
What's Changed
- fix conv1d io_parallel resource by @jmitrevs in #403
- Speed up CI tests by @thesps in #407
- Fix GlobalPooling1D Layers by @jmduarte in #399
- Fix batched multiple inputs by @jmduarte in #414
- Fixed 'qkeras_mnist_dense' example build problem #423 by @siorpaes in #424
- Update for pyyaml 6.0 by @thesps in #435
- `axi_stream_driver` update by @nicologhielmetti in #420
- Reshape fixes: don't repack stream for flatten; remove final reshape by @jmduarte in #443
- Fix Conv2D with `io_type = io_parallel` & `Strategy: Resource` by @thesps in #448
- Support applying Softmax over multidimensional tensors by @vloncar in #384
- Disable some unsupported layers by @thesps in #447
- Fixes: quantized_relu & unsigned profiling part II by @thesps in #441
- GarNet and GarNetStack in config.py by @yiiyama in #344
- support ZeroPadding layers by @jmduarte in #480
- New backend development framework by @vloncar in #395
- Register `ApplyAlpha` layer templates by @thesps in #499
- Parsing extended by @nicologhielmetti in #501
- Remove intermediate casting in product by @jmitrevs in #490
- Add QKeras as a package dependency by @vloncar in #511
- Copy flows from config by @thesps in #510
- VivadoAccelerator backend updates by @thesps in #508
- Optimized look-up table by @nemerchiedde in #527
- Upsampling2D test case by @ChiRuiChen in #520
- Support UpSampling1D by @vloncar in #475
- RNN support (part 1) by @vloncar in #521
- Quartus Custom Matrix Multiplication & Quantization by @bo3z in #523
- Vivado-equivalent implementation of Softmax on Quartus by @bo3z in #540
- Ensure 2 bits for scale in po2 quantizers by @vloncar in #531
- Link update by @bkmgit in #519
- Fix removal of nodes ingested by multiple downstream nodes by @jmduarte in #544
- Enable SeparableConv2d by @jmduarte in #547
- Extension API by @vloncar in #528
- change string ReuseFactor to int by @jmitrevs in #416
- Make the size of bn scale and bias what they really are by @jmitrevs in #532
- Raise runtime error when a layer is named `input` by @jmduarte in #482
- fix insertion before a node with multiple inputs + support additional broadcasting by @jmduarte in #551
- Pointwise conv1d/2d resource by @jmduarte in #471
- Quartus Embedding Layer by @bo3z in #548
- Fix for QActivations passed as an argument by @AdrianAlan in #553
- Don't override precision directly in the QKeras optimizer by @vloncar in #567
- Remove the in/out size from top function by @vloncar in #559
- Transpose2d, Concatenate2d, and up to 3 Clones for io_stream by @jmduarte in #402
- Remove io_serial as io_stream and add some more info in docs. by @Duchstf in #334
- Update docs for v0.6.0 by @thesps in #453
- Use correct number of args for multiple outputs by @apfusco in #487
- Fixed a few typos in the documentation by @pitmonticone in #467
- returning integer from _compute_n_samples by @JochiSt in #537
- Providing support for Alveo boards by @selwyn96 in #552
- Make layer names case sensitive in config. by @jmitrevs in #577
- Add issue and PR templates by @jmduarte in #582
- Vivado Backend GRU/LSTM support by @drankincms in #560
- Update CI template syntax by @thesps in #593
- Update flow dependencies by @vloncar in #588
- Fix parsing of ZeroPadding layers by @vloncar in #595
- remove cppname by @jmitrevs in #562
- Remove email helpline from the docs by @vloncar in #601
- Fixes for GRU/LSTM in Vivado backend by @drankincms in #598
- Remove io_serial by @vloncar in #609
- Fix test_graph by @vloncar in #611
- Override parent backend optimizer passes with derived backend passes by @thesps in #597
- Enforce function pipelining when using io_parallel with Resource strategy by @vloncar in #605
- FIFO depth optimization by @nicologhielmetti in #509
- Add tracing support for the quartus backend by @jmitrevs in #583
- Quartus streaming support for Activations, Dense & Batch Normalization by @bo3z in #557
- QConv alpha != 1 bug fix by @bo3z in #612
- Quartus Stream Embedding by @bo3z in #625
- change master to main by @jmitrevs in #602
- Edit order of the optimizers in the flow so that BramFactor is followed by @jmitrevs in #621
- Softmax LUT Optimization by @bo3z in #570
- Quartus Synthesis Flow Improvement by @bo3z in #618
- Quartus Extensions by @bo3z in #628
- Quartus GRU by @bo3z in #596
- Quartus Merge layers by @bo3z in #634
- fix nondefault project name handling by @jmitrevs in #626
- Fix parsing of logic synthesis reports by @vloncar in #639
- Fix conv1d stream implementation hls directives by @Jonathan-Shoemaker in #635
- Implementation and optimizations linked to Simple-RNN and LSTM for qu… by @nemerchiedde in #575
- Softsign optimization by @nemerchiedde in #585
- Parallel CNNs, Pooling & Image Layers for Quartus Backend by @bo3z in #561
- Quartus Streaming Softsign (PR #585 contd.) by @bo3z in #655
- Remove final reshapes even for Quartus by @jmitrevs in #661
- Unrolled CNN implementation by @vloncar in #600
- the strategy was not propagated in the pytest by @jmitrevs in #663
- Fix keras model loading issue with loading model with KerasH5 by @calad0i in #664
- append applied_flows container before filling instead of after by @jmitrevs in #641
- set version using `setuptools_scm` by @jmduarte in #479
- Argmax Softmax by @bo3z in #627
- Fix version extraction in Sphinx config by @vloncar in #669
- Add requested citations to README by @jmduarte in #615
- skip BatchNorm fusion when input/output is used multiple times by @jmduart...
coris
What's Changed
- `VivadoAccelerator` backend: target `pynq-z2` and `zcu102` boards directly from hls4ml by @nicologhielmetti (see the sketch after this list)
- Updated `PyTorch` and `ONNX` converters by @Duchstf
- `line_buffer` Conv2D implementation for `io_stream`: reduced resource usage and latency by @Keb-L, @violatingcp, @vloncar
- Support `QConv2DBatchnorm` layer from `QKeras` by @nicologhielmetti
- Improved profiling plots - easier to compare original vs `hls4ml` converted models by @maksgraczyk
- Better derivation of data types for `QKeras` models by @jmduarte, @thesps
- Improved CI by @thesps
- More support for models with branches, skip connections, `Merge` and `Concatenate` layers by @jmduarte, @vloncar
- Support for `Dense` layers over multi-dimensional tensors by @vloncar
- Overall improvements by @vloncar, @jmduarte, @thesps, @jmitrevs & others
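A minimal sketch of targeting one of these boards through the `VivadoAccelerator` backend from the Python API; the model file, config granularity, and output directory below are illustrative placeholders, and the exact keyword arguments may differ between hls4ml versions:

```python
import hls4ml
from tensorflow.keras.models import load_model

# Trained Keras model to deploy (path is a placeholder)
model = load_model('my_model.h5')

# Generate a default per-model hls4ml configuration
config = hls4ml.utils.config_from_keras_model(model, granularity='model')

# Convert with the VivadoAccelerator backend, targeting the pynq-z2 board
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='VivadoAccelerator',
    board='pynq-z2',
    output_dir='my_model_pynq',
)
hls_model.compile()
```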
New Contributors
- @siorpaes made their first contribution in #424
- @jmitrevs made their first contribution in #403
- @anders-wind made their first contribution in #302
- @KOVI89alipes made their first contribution in #318
- @maksgraczyk made their first contribution in #323
- @Keb-L made their first contribution in #332
- @ConsVin made their first contribution in #307
- @nicologhielmetti made their first contribution in #298
Full Changelog: v0.5.0...v0.6.0
bartsia
What's new:
- Streaming IO layer implementations, especially of Convolutional layers, accessed through the config with `IOType: io_stream`. Scales CNN support to much larger models than previously possible (see arXiv:2101.05108); a config sketch follows after this list
- New documentation and API reference
- Further optimizations for QKeras / quantization aware training. A 'shift' operation is now used for `po2` quantizers
- Allow redefinition of weights directory for standalone project compilation
- `profiling` for PyTorch models
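As a rough illustration only, a minimal project config sketch (a Python dict in the style of the yml config) enabling the streaming implementations; the model file, part number, clock period, and precision below are placeholders, not prescriptions, and the accepted keys may vary between hls4ml versions:

```python
import hls4ml

# Project config in the style of the yml file; paths, part, and precision
# below are illustrative placeholders only
config = {
    'KerasH5': 'my_cnn.h5',               # full saved Keras model (placeholder)
    'OutputDir': 'my_cnn_stream',
    'ProjectName': 'myproject',
    'XilinxPart': 'xcku115-flvb2104-2-i',
    'ClockPeriod': 5,
    'IOType': 'io_stream',                 # streaming layer implementations
    'HLSConfig': {
        'Model': {'Precision': 'ap_fixed<16,6>', 'ReuseFactor': 1},
    },
}

hls_model = hls4ml.converters.keras_to_hls(config)
hls_model.compile()
```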
Deprecated:
`IOType: io_serial` is deprecated and superseded by the new `IOType: io_stream`
Bugfixes:
- Fix to Initiation Interval and different min/max latency for `Strategy: Resource`
- Fix warnings in `hls4ml` command line script flow
- Write yml config from Python API - for mixed API / command line flow
v0.5.0-beta
Pre-release of hls4ml version `v0.5.0`.
What's new:
- Streaming IO layer implementations, especially of Convolutional layers, accessed through the config with `io_type: io_stream`. Scales CNN support to much larger models than previously possible (see paper)
- New documentation and API reference
- Further optimizations for QKeras / quantization aware training. A 'shift' operation is now used for `po2` quantizers
- Allow redefinition of weights directory for standalone project compilation
aster
What's new:
- Support for GarNet layer (see paper)
- Input layer precision added to config generator utility
- New 'SkipOptimizers' config option. Now you can run all Optimizers by default (as in v0.3.0) but subtract any specified by 'SkipOptimizers', e.g. `hls_config['SkipOptimizers'] = ['fuse_consecutive_batch_normalization']` (see the sketch after this list)
- Print out the latency report from Cosimulation
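A minimal end-to-end sketch of the option; the fetched example model and the optimizer name are illustrative, and the `fetch_example_model`/`keras_to_hls` helpers are assumed here as one way to obtain and consume a project config:

```python
import hls4ml

# Start from a project config, e.g. one fetched for an example model
# (the model name here is an illustrative placeholder)
hls_config = hls4ml.utils.fetch_example_model('KERAS_3layer.json')

# All optimizers run by default; list the ones to leave out by name
hls_config['SkipOptimizers'] = ['fuse_consecutive_batch_normalization']

hls_model = hls4ml.converters.keras_to_hls(hls_config)
hls_model.compile()
```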
Bugfixes:
- Fixes related to tensorflow 2.3: new Functional API, changes to handling of Input layer
- Fix error with config generator utility and activation layers for `granularity='name'`
- Fix issue with reloading of emulation library after configuration change
- Fix to handling of layers with `use_bias=False` and merged Dense and BatchNormalization