
Vitis HLS backend #629

Merged: 26 commits into fastmachinelearning:main, Mar 31, 2023
Conversation

@vloncar (Contributor) commented Aug 3, 2022

Description

An initial attempt at supporting Vitis HLS. Introduces a new backend (Vitis) that takes most of its functionality over from the Vivado backend and changes things a little. I tried to bring the two backends closer together, so the changes that Vitis requires but are benign in the Vivado backend are ported to the Vivado backend as well. The more extensive ones are implemented as overrides of the existing HLS routines (currently only the streaming implementations needed this). The biggest change in Vitis HLS is the stricter use of pragmas, so some workarounds we did before are no longer possible and require changes. For example, setting stream depth in a loop is no longer possible (the loop variable is not constant, so the pragma refuses to use it), which means the encoded CNN implementation doesn't work and is removed from the Vitis backend. There are also big issues with the Resource strategy in general. Regardless of which of the three implementations I try to use, the compiler reintroduces the modulo operator and synthesizes a urem core, which adds significantly to the latency. I'll try to resolve this later unless someone volunteers (yeah, unlikely).

Some plots comparing the latency and resource usage between the two will follow.
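As a quick illustration (not part of this PR's diff), selecting the new backend from Python should look like the usual hls4ml conversion flow, just with backend='Vitis'; the model file, output directory and FPGA part below are placeholders:

import hls4ml
from tensorflow.keras.models import load_model

# Placeholder Keras model; any architecture supported by hls4ml would do.
model = load_model('my_model.h5')

# Default per-model configuration, then conversion using the new Vitis backend.
config = hls4ml.utils.config_from_keras_model(model, granularity='model')
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='Vitis',              # the backend introduced by this PR
    output_dir='my_prj_vitis',
    part='xcvu13p-flga2577-2-e',  # placeholder FPGA part
)
hls_model.compile()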

Type of change

  • New feature (non-breaking change which adds functionality)

Tests

TODO.
We will also need to extend the synthesis tests to include Vitis HLS (probably meaning we either fully switch to CERN machines, or we split over multiple machines at Fermilab).

Test Configuration:

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation. -> Later
  • My changes generate no new warnings. -> In fact they do warn the user if they try something not supported
  • I have added tests that prove my fix is effective or that my feature works. -> Later

@jmitrevs (Contributor) commented Aug 3, 2022

Thanks for this PR. It will be very useful for me. When I tried synthesizing code that has many 1D CNNs with Vitis, I got the following error.

ERROR: [HLS 214-272] In function 'void nnet::kernel_shift_1d<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 32u>, config8>(nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 32u> const&, nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 32u>::value_type*)', Pragma conflict happens on 'INLINE' and 'PIPELINE' pragmas: same function (firmware/nnet_utils/nnet_conv_stream.h:161:0)

I will try commenting out the pipeline pragma, but this is shared with Vivado.

@jmitrevs (Contributor) commented Aug 3, 2022

Some warnings I get:

WARNING: [HLS 207-5551] unexpected pragma argument 'instances', expects function/operation (firmware/nnet_utils/nnet_conv1d_latency.h:67:24)
WARNING: [HLS 207-5551] unexpected pragma argument 'instances', expects function/operation (firmware/nnet_utils/nnet_conv1d_latency.h:145:24)
WARNING: [HLS 207-5556] Only for/while/do support the pipeline  pragma (firmware/nnet_utils/nnet_dense_stream.h:21:9)
WARNING: [HLS 214-113] Either use an argument of the function or declare the variable inside the dataflow loop body (firmware/myproject.cpp:57:73)
Resolution: For help on HLS 214-113 see www.xilinx.com/cgi-bin/docs/rdoc?v=2021.2;t=hls+guidance;d=214-113.html
WARNING: [HLS 214-113] Either use an argument of the function or declare the variable inside the dataflow loop body (firmware/myproject.cpp:57:77)
Resolution: For help on HLS 214-113 see www.xilinx.com/cgi-bin/docs/rdoc?v=2021.2;t=hls+guidance;d=214-113.html
WARNING: [HLS 214-113] Either use an argument of the function or declare the variable inside the dataflow loop body (firmware/myproject.cpp:69:72)

Sadly, the line buffer 1D CNN implementation still seems to take very long to synthesize (which is why I was using encoded with Vivado_HLS).

@vloncar (Contributor, Author) commented Aug 4, 2022

I fixed the warnings. The current CNN implementations for io_parallel are going to be replaced by #600, so I didn't test them thoroughly. The warning in dense_stream was due to me forgetting to commit the change 🤦‍♂️

@vloncar (Contributor, Author) commented Aug 19, 2022

I would say this is ready for review/merge. It's far from complete, as it is missing the resource strategy and some warnings about the interfaces still need tweaks, but it mostly works.

I tested this on a range of individual layers and toy models. I'm attaching the script used to build the models and run synthesis (build.py), the script to compare the results and produce the tables (compare.py) and the shell scripts to automate the test. To run, one can just do:

./build_vivado.sh && ./build_vitis.sh && ./compare.sh

(note that you'll get your results faster if you manually run the build steps in parallel 😉)

I'm also including the actual results. Overall, I see slight differences, like slightly lower LUT usage and a cycle spared here and there, always in favor of Vitis.

The failing SimpleRNN test is a separate bug that will be addressed in another PR.

vitis_test.zip
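For readers who don't want to open the archive, a minimal sketch of what a build script along the lines of build.py might do (a hypothetical reconstruction from the description above; the toy layer sizes and output directories are made up, the real script is in vitis_test.zip):

import hls4ml
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

def build_and_synthesize(backend, output_dir):
    # Toy model exercising a single Dense layer.
    model = Sequential([Dense(32, activation='relu', input_shape=(16,))])
    config = hls4ml.utils.config_from_keras_model(model, granularity='model')
    hls_model = hls4ml.converters.convert_from_keras_model(
        model, hls_config=config, backend=backend, output_dir=output_dir
    )
    hls_model.compile()
    # Run C synthesis only (no cosim/export) and return the parsed report.
    return hls_model.build(csim=False, synth=True, export=False)

vivado_report = build_and_synthesize('Vivado', 'toy_dense_vivado')
vitis_report = build_and_synthesize('Vitis', 'toy_dense_vitis')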

@vloncar marked this pull request as ready for review August 19, 2022 14:36
@jmitrevs (Contributor)
Are there still issues with the resource implementation? I am testing this on a 1D CNN, and the compiler seems to fail when I use latency. This particular model has a width of 11 with 32 channels, and 64 filters of width 9.

@jmitrevs (Contributor)
Using the resource implementation, I have successfully used this backend for a 1D CNN Vitis workflow.
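For anyone reproducing this, switching between the two strategies is typically done through the hls4ml configuration dictionary; a minimal sketch (the toy model, reuse factor and directory name are placeholders based on the layer described above):

import hls4ml
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.models import Sequential

# Placeholder 1D CNN roughly matching the layer described in the comments above.
model = Sequential([Conv1D(64, 9, activation='relu', input_shape=(11, 32))])

config = hls4ml.utils.config_from_keras_model(model, granularity='model')
config['Model']['Strategy'] = 'Resource'  # 'Latency' is the default
config['Model']['ReuseFactor'] = 8        # placeholder reuse factor

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, backend='Vitis', output_dir='cnn1d_vitis_resource'
)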

super().__init__()

def write_nnet_utils_overrides(self, model):
###################
Contributor:

The comment should be made clearer (and changed to a docstring)

Contributor (Author):

Really? Pretty sure it is obvious to you and everyone who looked what this does. People who can't figure this out shouldn't be looking at the codebase in the first place 😄. Jokes aside, as mentioned in the other comment, docstrings will be addressed in a separate PR.

@jmitrevs (Contributor)
The FINN people mentioned the default pipeline time changing. Are there any issues related to that?

@jmitrevs previously approved these changes Sep 15, 2022


class ValidateConvImplementation(OptimizerPass):

Contributor:

Would prefer docstrings in new classes that are added describing their purpose.

Contributor (Author):

I'm planning a new PR that adds docstrings to most of the remaining codebase, not just Vitis parts.

from hls4ml.report import parse_vivado_report


class VitisBackend(VivadoBackend):
@jmitrevs (Contributor) commented Sep 18, 2022:

I have some concerns about inheriting from VivadoBackend, for two reasons. One is that it seems to violate the "is a" requirement for inheritance. Maybe the common features can be factored out? The other is more long-term: in the future, when Vitis HLS becomes our primary backend, do we still want to inherit from VivadoBackend?

Contributor (Author):

I would agree that having a parent XilinxBackend from which VivadoBackend and VitisBackend are derived would be more elegant, but if we plan on deprecating and ultimately phasing out one and switching to the other, I think this is fine. The clean solution would require another level in the hierarchy of backend inheritance, which may confuse the developers of flows. Once we are happy with the state of Vitis (i.e., when we gather more feedback from users, which seems very positive so far), we can move stuff from VivadoBackend to VitisBackend and make Vivado the "override" of Vitis.
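For illustration only (this is not what the PR implements), the factored-out hierarchy discussed above might look roughly like the following, with XilinxBackend as a hypothetical common parent; the method and defaults shown are simplified placeholders:

from hls4ml.backends import FPGABackend

class XilinxBackend(FPGABackend):
    """Hypothetical shared base: writers, templates and flows common to Vivado and Vitis HLS."""

    def create_initial_config(self, part='xcvu13p-flga2577-2-e', clock_period=5, io_type='io_parallel'):
        # Project settings shared by both toolchains (simplified placeholder).
        return {'Part': part, 'ClockPeriod': clock_period, 'IOType': io_type}

class VivadoBackend(XilinxBackend):
    """Vivado HLS specifics: legacy pragma behavior, encoded CNN implementation, etc."""

class VitisBackend(XilinxBackend):
    """Vitis HLS specifics: stricter pragma checks, overridden streaming routines, etc."""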

@jmitrevs dismissed their stale review September 18, 2022 18:47

We discussed at the hls4ml meeting that we would prefer discussing this PR in an hls4ml meeting first.

@vloncar (Contributor, Author) commented Oct 20, 2022

Since 94dbe80 there is support for the resource strategy in Dense/Conv layers. The one corner case (when rf > n_in and rf % n_in != 0) still uses urem cores, but that is rare. What remains is to follow up this change through the new CNN implementation, which uses a modified version of the matrix-vector multiplication based on the resource strategy.
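As a toy illustration of the general trick involved (in plain Python, not the actual index pattern used in the HLS resource kernels), an index computed with the modulo operator, which the compiler maps to a urem core, can be replaced by a counter that wraps explicitly:

def weight_indices_with_modulo(n_in, rf):
    # Straightforward version: the % operator is what gets synthesized as a urem core.
    return [k % n_in for k in range(n_in * rf)]

def weight_indices_without_modulo(n_in, rf):
    # Equivalent version: a running index that wraps at n_in, no modulo needed.
    indices = []
    in_index = 0
    for _ in range(n_in * rf):
        indices.append(in_index)
        in_index += 1
        if in_index == n_in:
            in_index = 0
    return indices

assert weight_indices_with_modulo(8, 3) == weight_indices_without_modulo(8, 3)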

@thesps (Contributor) commented Oct 31, 2022

What about the interplay with VivadoAccelerator backend? The IPs generated from Vitis HLS should be perfectly compatible with those block designs, and it would be good to take advantage of that.

@jmduarte added the please test label (Trigger testing by creating local PR branch) Nov 17, 2022
@jmitrevs (Contributor) commented Dec 2, 2022

I get the following synthesis error with the Vitis backend.

WARNING: [HLS 214-356] Array transformation on pointer indexing may lead to incorrect bank selection. Use array index instead of pointer. (firmware/nnet_utils/nnet_conv1d_resource.h:95:11)
ERROR: [HLS 214-384] in function 'void nnet::conv_1d_resource_cl<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>(ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>*, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>*, config2::weight_t*, config2::bias_t*) (.268)': Cannot apply array transformation pragma/directive because of full array load/store. (firmware/nnet_utils/nnet_conv1d_resource.h:95:11)
ERROR: [HLS 200-1715] Encountered problem during source synthesis

@jmitrevs (Contributor)

When working with CNNs I get lots of warnings of the type:

WARNING: [HLS 214-113] Either use an argument of the function or declare the variable inside the dataflow loop body (cnn/firmware/vplane.cpp:57:73)
Resolution: For help on HLS 214-113 see www.xilinx.com/cgi-bin/docs/rdoc?v=2022.2;t=hls+guidance;d=214-113.html

inside a dataflow section. The warning seems to be because the weights and biases are global variables, not local variables. As a test, I put the weights inside the function body, right after the #pragma HLS DATAFLOW (by moving the includes to be there instead of in the header file), and the warning went away. How concerned should we be about this warning?

@jmduarte (Member) left a comment:

I think this line will need to be updated for Vitis (at least 2022.2):

exec vivado -mode batch -source vivado_synth.tcl >@ stdout

It appears we're supposed to use the xsct command-line tool now instead of invoking vitis directly, but the syntax seems to have changed and just doing xsct vivado_synth.tcl doesn't work.

@brkortma

Hello All, is this pull request still under review? I would be interested in using Vitis_HLS in combination with hls4ml and also willing to do some testing if required.

Please let me know!

Best,
Bryan

@vloncar (Contributor, Author) commented Mar 15, 2023

Yes, it is still under consideration for merging; hopefully we'll decide on Friday whether to include it in this release cycle (as an experimental feature) or postpone it to the next release. In fact, I'm expanding the test suite to include Vitis right now. Expect some commits later today that will rebase this onto the current main branch. Feel free to test it in any way you need. We would be interested in feedback on QoR differences between Vitis and Vivado, especially anything that works worse on Vitis. So far we're aware of issues with the resource strategy, and @jmitrevs had more success with relaxed clock uncertainty. The reduce operation (used by, e.g., pooling layers) also uses more resources in some cases.
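For anyone who wants to contribute such feedback, a rough sketch of comparing the two backends' synthesis reports (assuming both projects have already been built, and that hls4ml's report parser handles the Vitis project layout; the directory names are placeholders and the report keys may differ between tool versions):

from hls4ml.report import parse_vivado_report

# Placeholder paths to previously built Vivado- and Vitis-backend projects.
vivado = parse_vivado_report('toy_dense_vivado')
vitis = parse_vivado_report('toy_dense_vitis')

for key in ('BestLatency', 'WorstLatency', 'BRAM_18K', 'DSP48E', 'FF', 'LUT'):
    print(key, vivado['CSynthesisReport'].get(key), vitis['CSynthesisReport'].get(key))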

@brkortma

Right, I would be quite keen on these changes once they become available. Currently I'm setting up a test setup with the Vitis_port branch from this pull request. I can definitely provide some testing on the Vitis end. However, I currently don't have Vivado_HLS set up, so I don't think I can provide comparisons between the two.

You mentioned some commits to be rebased onto the current main branch later today. Which main branch is this (of which fork/repo)? Could you point me to it?

Best,
Bryan

@vloncar (Contributor, Author) commented Mar 15, 2023

This is the latest Vitis branch; I meant that I will resolve the current merge conflicts so it can be merged into the main branch of hls4ml, and those commits will appear here.

@brkortma

Ahh great news, thanks very much. I'll keep an eye on it then. If you need any specific testing feel free to let me know.

@brkortma

Hi, I was trying to build my model but it's taking quite a long time. Do you perhaps have any small test models that I can try to build, which you have verified to work with Vivado 2022.1 (vitis_hls)?

@jmduarte added this to the v0.7.0 milestone Mar 24, 2023
* Add quantized sigmoid, fix quantized tanh for QKeras (fastmachinelearning#569)

* snapshot of beginnings

* make version that works for Vivado, error for Quartus

* Change order of precision from quantizer

* add hard sigmoid and tanh

* fix setting of slope and shift type

* revert config parsing--seems a little strange but works

* fix hard_sigmoid and hard_tanh for streaming

* update pytest for quantized tanh and sigmoid

* remove inadvertently included matoplotlib

* add special case when W == min_width.

* fix merge of main

* Go back to having AP_TRN and AP_WRP as defaults

* handle case when use_real_tanh is not defined

* make the activations use AP_RND_CONV (and AP_SAT) by default

* remove use of use_real_tanh in test since not always supported

* fix incorrect default types for Keras (not QKeras) hard_sigmoid

* Mostly fix up things for Quartus

* get rid of intermediate cast

* fix an i++ compilation issue

* Quartus seems to not like ac_fixed<1,0,false>, so make 2 bits.

* fix activation quantizer

* make sat, round defeult activation parameters, don't need to set again

* Make the slope and shift not be configurable for HardActivation

* some pre-commit fixes

* pre-commint //hls to // hls fixes

* update CI version

* fixes for parsing errors from pre-commits

* remove qactivation from list of activation_layers

* print_vivado_report function for nicer reports (fastmachinelearning#730)

* print_vivado_report function for fancier reports

* Fancy reports (#51)

* fix uram divide by 0

* add test

* fix parsing of vsynth in 2020.1; add test

* Update test_report.py

* exclude pregenerated reports

---------

Co-authored-by: Javier Duarte <[email protected]>

---------

Co-authored-by: Jovan Mitrevski <[email protected]>
Co-authored-by: Vladimir <[email protected]>
@jmduarte added the enhancement and please test labels and removed the please test label Mar 25, 2023
@jmitrevs merged commit 3dbf777 into fastmachinelearning:main Mar 31, 2023
@jmduarte mentioned this pull request Apr 9, 2023
calad0i pushed a commit to calad0i/hls4ml that referenced this pull request Jul 1, 2023
@Andre-coder

I'm trying to run a model on vitis_hls 2023.2 and am getting the following error after successfully extracting the model from hls4ml.

make: *** [Makefile.rules:318: csim.exe] Error 1
ERROR: [SIM 211-100] 'csim_design' failed: compilation error(s).
INFO: [SIM 211-3] *************** CSIM finish ***************
INFO: [HLS 200-111] Finished Command csim_design CPU user time: 6.72 seconds. CPU system time: 0.77 seconds. Elapsed time: 15.3 seconds; current allocated memory: 0.211 MB.
4
while executing
"source build_prj.tcl"
("uplevel" body line 1)
invoked from within
"uplevel #0 [list source $tclfile]

I would be glad to get any help on this matter. Thank you!

Labels: enhancement, please test (Trigger testing by creating local PR branch)
Projects: None yet
6 participants