
Updated the reduce function to no longer make unnecessary copies - … #302

Merged (1 commit) on Mar 15, 2021

Conversation

@anders-wind (Contributor) commented Mar 10, 2021

…saving O(N log N) memory. Also making it C++ standard compliant by removing a zero-sized array.

This solves: #300

@anders-wind (Contributor, Author) commented Mar 10, 2021

I don't have access to a Linux machine currently, but I will try to get @KOVI89alipes to run the code.
Do you have some CI actions set up to try some of these things?

@vloncar (Contributor) commented Mar 10, 2021

I am running some CI tests on this (the server was down before). We'll see if anything regressed.

@anders-wind (Contributor, Author)

Looks like CI checks out :) Is there anything you would like us to try manually?

@thesps (Contributor) commented Mar 11, 2021

Indeed, the KERAS_3layer performance is unchanged.

I made this standalone test as well, and everything looks good: identical resources, latency, II, and numerical performance with the old and new version.

> Is there anything you would like us to try manually?

Given this started as a solution to #300, we should make sure this actually works on Windows. It would be good if you or @KOVI89alipes can check that.

And as @vloncar says we need to check that a model using Pooling and io_stream is unaffected. The dummy model in #299 should do the trick. I'll use it to test your PR as well as that one.

> saving a memory usage of NlogN.

FYI this isn't really a factor in evaluating this: the axes of performance are FPGA resource consumption, latency, throughput (II) and numerical accuracy. On those counts the new version is identical to the old - which is good, and expected.

@thesps (Contributor) commented Mar 11, 2021

> And as @vloncar says we need to check that a model using Pooling and io_stream is unaffected. The dummy model in #299 should do the trick. I'll use it to test your PR as well as that one.

I've done this. Again, looks good: no change to any of the metrics with the new vs old reduce function.

So now I think we just need the confirmation that it does compile on Windows.

@anders-wind (Contributor, Author)

> FYI this isn't really a factor in evaluating this: the axes of performance are FPGA resource consumption, latency, throughput (II) and numerical accuracy. On those counts the new version is identical to the old - which is good, and expected.

Okay - I have very limited knowledge of FPGA design and code generation, but good to hear that the generated code is at least identical.

@KOVI89alipes (Contributor)

I checked this fix; it compiles with both GCC 10.1 and MSVC (VS2017) with no extra warnings.

@thesps (Contributor) commented Mar 12, 2021

And how about in Vivado HLS on Windows? It needs to work with whichever compilers they supply.

@KOVI89alipes (Contributor)

Vivado 2020.1
Windows 10

Both CSim and HLS synthesis passed

@thesps thesps merged commit 0d23584 into fastmachinelearning:master Mar 15, 2021
calad0i pushed a commit to calad0i/hls4ml that referenced this pull request Jul 1, 2023
Updated the reduce function to no longer make unnecessary copies - …