
🏁 Windows Support #266

Merged
merged 15 commits into main on Sep 10, 2024

Conversation

burgholzer
Member

@burgholzer burgholzer commented Aug 26, 2024

Description

This PR adds continuous testing and deployment for Windows.
To this end, it switches to the reusable MQT workflows for CI and CD.
Along the way, it fixes a couple of errors that only revealed themselves on Windows.

Checklist:

  • The pull request only contains commits that are related to it.
  • I have added appropriate tests and documentation.
  • I have made sure that all CI jobs on GitHub pass.
  • The pull request introduces no new warnings and follows the project's style guidelines.

@burgholzer burgholzer added the enhancement (Enhancement to existing feature), dependencies (Pull requests that update a dependency file), continuous integration (Anything related to the CI setup), packaging (Anything related to Python packaging), usability (Anything related to usability), python (Pull requests that update Python code), minor (Minor version update), code quality (Code quality improvements), and refactor (Changes that refactor the code base) labels Aug 26, 2024
@burgholzer burgholzer self-assigned this Aug 26, 2024
@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.


codecov bot commented Aug 26, 2024

Codecov Report

Attention: Patch coverage is 65.67164% with 23 lines in your changes missing coverage. Please review.

Project coverage is 83.6%. Comparing base (361c4ae) to head (ceea10c).
Report is 80 commits behind head on main.

Files with missing lines    Patch %   Lines
src/UFDecoder.cpp           45.9%     20 Missing ⚠️
include/Code.hpp            81.2%     3 Missing ⚠️

@@           Coverage Diff           @@
##            main    #266     +/-   ##
=======================================
- Coverage   83.6%   83.6%   -0.1%     
=======================================
  Files         49      49             
  Lines       4162    4159      -3     
  Branches     372     372             
=======================================
- Hits        3481    3477      -4     
- Misses       681     682      +1     
Flag     Coverage Δ
cpp      83.6% <56.6%> (-0.3%) ⬇️
python   83.5% <100.0%> (+0.1%) ⬆️

Files with missing lines                                 Coverage Δ
...mation_decoding/simulators/memory_experiment_v2.py    89.6% <ø> (+5.1%) ⬆️
...mation_decoding/simulators/quasi_single_shot_v2.py    69.9% <100.0%> (ø)
...alog_information_decoding/simulators/simulation.py    82.7% <100.0%> (ø)
src/mqt/qecc/cc_decoder/decoder.py                       98.2% <100.0%> (ø)
include/Code.hpp                                         79.5% <81.2%> (-0.4%) ⬇️
src/UFDecoder.cpp                                        61.2% <45.9%> (-2.6%) ⬇️

@burgholzer burgholzer linked an issue Aug 27, 2024 that may be closed by this pull request
@burgholzer burgholzer force-pushed the infrastructure-update branch from d334c28 to 8dda0b1 Compare August 31, 2024 11:24
@burgholzer burgholzer changed the title ✨ Infrastructure Update 🏁 Windows Support Aug 31, 2024
@burgholzer burgholzer removed the dependencies (Pull requests that update a dependency file), python (Pull requests that update Python code), code quality (Code quality improvements), refactor (Changes that refactor the code base), and packaging (Anything related to Python packaging) labels Aug 31, 2024
@burgholzer
Member Author

Alright. This is looking much better now already.

@pehamTom I just removed the timeouts from the state prep tests and that seems to have done it. Although some of these tests (sometimes) seem to take forever. Do you see any kind of way to reduce the tests or to speed them up?
Some of the (simulation) tests also sporadically fail because a certain threshold is not reached. Could you maybe fix the random seed for these kind of tests so that the results are reproducible and less flaky?
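Fixing the seed usually means threading an explicit RNG through the simulation. A minimal sketch, assuming a NumPy-based simulation and pytest-style tests (`run_simulation` and the threshold value are hypothetical stand-ins, not the project's actual API):

```python
import numpy as np


def run_simulation(rng: np.random.Generator, shots: int = 1000) -> float:
    """Toy stand-in for a decoding simulation: returns an observed rate."""
    samples = rng.random(shots)
    return float(np.mean(samples < 0.1))


def test_threshold_reproducible():
    # A fixed seed makes the sampled error rate deterministic, so the
    # threshold assertion can no longer fail sporadically across CI runs.
    rng = np.random.default_rng(seed=42)
    rate = run_simulation(rng)
    assert abs(rate - 0.1) < 0.05
```

With `np.random.default_rng(seed)`, the same seed always reproduces the same sample stream, so reruns of the test see identical results.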

@lucasberent could you please look at the change here: b58b604
This was necessary to fix one of the tests on Windows. However, the test now fails on Ubuntu and macOS. Do you see any reason for that?
I was somewhat suspicious that the following two lines are not the same despite having the same comment:

@pehamTom
Member

pehamTom commented Sep 2, 2024

> @pehamTom I just removed the timeouts from the state prep tests and that seems to have done it. Although some of these tests (sometimes) seem to take forever. Do you see any kind of way to reduce the tests or to speed them up? Some of the (simulation) tests also sporadically fail because a certain threshold is not reached. Could you maybe fix the random seed for these kind of tests so that the results are reproducible and less flaky?

Well, the timeout is there, so the SAT solver doesn't spend a long time on unsatisfiable instances. I can try to trim down the tests further. But again, there are not many small code instances one can meaningfully test.

@burgholzer
Member Author

> > @pehamTom I just removed the timeouts from the state prep tests and that seems to have done it. Although some of these tests (sometimes) seem to take forever. Do you see any kind of way to reduce the tests or to speed them up? Some of the (simulation) tests also sporadically fail because a certain threshold is not reached. Could you maybe fix the random seed for these kind of tests so that the results are reproducible and less flaky?
>
> Well, the timeout is there, so the SAT solver doesn't spend a long time on unsatisfiable instances. I can try to trim down the tests further. But again, there are not many small code instances one can meaningfully test.

At least on Windows that has led to basically none of the synthesis tasks returning a circuit (see the CI logs from a couple commits ago).
It's odd, but not completely unexpected, that performance on Windows is so different from the Unix systems.
Maybe relaxing the timeouts a little bit instead of removing them would be even better.
Just seeking a pragmatic solution here.
The current CI run shows that it works in principle given a long enough timeout.

@lucasberent
Collaborator

lucasberent commented Sep 3, 2024

> @lucasberent could you please look at the change here: b58b604 This was necessary to fix one of the tests on Windows. However, the test now fails on Ubuntu and macOS. Do you see any reason for that? I was somewhat suspicious that the following two lines are not the same despite having the same comment:

The test should be correct as is; the comment is wrong, and the estimate should be [1,0,0].

@burgholzer
Member Author

> > Alright. This is looking much better now already.
> >
> > @pehamTom I just removed the timeouts from the state prep tests and that seems to have done it. Although some of these tests (sometimes) seem to take forever. Do you see any kind of way to reduce the tests or to speed them up? Some of the (simulation) tests also sporadically fail because a certain threshold is not reached. Could you maybe fix the random seed for these kind of tests so that the results are reproducible and less flaky?
> >
> > @lucasberent could you please look at the change here: b58b604 This was necessary to fix one of the tests on Windows. However, the test now fails on Ubuntu and macOS. Do you see any reason for that? I was somewhat suspicious that the following two lines are not the same despite having the same comment:
>
> The test should be correct as is; the comment is wrong, and the estimate should be [1,0,0].

In that case, something is going wrong in the decoder on Windows as it always yields [0, 0, 0]. Is there anything that you could think of that could cause this?
I'll revert the test change for now.

@lucasberent
Collaborator

> In that case, something is going wrong in the decoder on Windows as it always yields [0, 0, 0]. Is there anything that you could think of that could cause this? I'll revert the test change for now.

Perhaps something with the np arrays? Maybe using 'astype' helps; otherwise it's unclear to me.
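One concrete platform difference that can bite here: with NumPy < 2, the default integer dtype of an array built from Python ints is 32-bit on Windows but 64-bit on Linux and macOS. A small sketch of pinning the dtype explicitly with `astype` (the array contents are illustrative, not the actual decoder data):

```python
import numpy as np

# With NumPy < 2, np.array([...]) of Python ints defaults to int32 on
# Windows and int64 on Linux/macOS, which can silently change behavior
# wherever dtypes are compared or values overflow.
syndrome = np.array([1, 0, 0])        # platform-dependent integer dtype
estimate = syndrome.astype(np.int64)  # pin an explicit dtype everywhere

assert estimate.dtype == np.int64
assert estimate.tolist() == [1, 0, 0]
```

Passing an explicit `dtype=` at array creation achieves the same thing without the extra copy that `astype` makes.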

@burgholzer burgholzer force-pushed the infrastructure-update branch from 9885cf5 to 02f604d Compare September 9, 2024 10:38
@burgholzer
Member Author

Ok. pymatching 2.2.1 already puts a pin on numpy < 2 so we don't have to.
Hopefully, all tests pass now and this can be merged. I'll then proceed to release a new version, which will be the first one to ship Windows wheels 🎉
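Because the pin arrives transitively, the project's own metadata can stay loose. A hypothetical pyproject.toml fragment illustrating the idea (not this repository's actual file):

```toml
[project]
dependencies = [
    # pymatching 2.2.1 already constrains numpy to < 2,
    # so an explicit "numpy<2" pin here would be redundant.
    "pymatching>=2.2.1",
]
```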

Also cleaned up the PR history a little bit so it becomes clearer what changed.

@burgholzer burgholzer force-pushed the infrastructure-update branch from 02f604d to 6b2ab17 Compare September 9, 2024 10:45
@burgholzer burgholzer force-pushed the infrastructure-update branch 2 times, most recently from fb9cbb8 to 069d624 Compare September 9, 2024 20:40
@burgholzer
Member Author

@pehamTom sorry that I have to bother you again, but I can't seem to get the ft simulation tests to work under Windows with Python 3.12, even after increasing the number of shots by a factor of 10.
The logical error rate seems to be quite off in the two ft simulation tests.
Windows with Python 3.9 seems to be fine and is not complaining.
Any idea what could be causing this?

@pehamTom
Member

> @pehamTom sorry that I have to bother you again, but I can't seem to get the ft simulation tests to work under Windows with Python 3.12, even after increasing the number of shots by a factor of 10. The logical error rate seems to be quite off in the two ft simulation tests. Windows with Python 3.9 seems to be fine and is not complaining. Any idea what could be causing this?

It was another timeout issue with the state preparation. Since some optimal circuits are synthesized for the simulation, they timed out and returned a non-verified circuit.

But now the time to run the tests is 77 minutes on Windows with Python 3.12. I don't think that is an acceptable amount of time. I think the only way to circumvent this is to either not test the SAT solution for the state preparation and verification circuit synthesis or test the individual SAT formulas used in the encoding separately.

@burgholzer
Member Author

> > @pehamTom sorry that I have to bother you again, but I can't seem to get the ft simulation tests to work under Windows with Python 3.12, even after increasing the number of shots by a factor of 10. The logical error rate seems to be quite off in the two ft simulation tests. Windows with Python 3.9 seems to be fine and is not complaining. Any idea what could be causing this?
>
> It was another timeout issue with the state preparation. Since some optimal circuits are synthesized for the simulation, they timed out and returned a non-verified circuit.
>
> But now the time to run the tests is 77 minutes on Windows with Python 3.12. I don't think that is an acceptable amount of time. I think the only way to circumvent this is to either not test the SAT solution for the state preparation and verification circuit synthesis or test the individual SAT formulas used in the encoding separately.

Yeah. That's somewhat rough.
What about the most pragmatic solution of not running these tests under Windows but running them under all other operating systems?
It's not ideal, but at least we don't completely give up testing the functionality.
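Pytest supports exactly this kind of platform-conditional skip. A sketch, assuming the suite uses pytest (the test name and reason string are hypothetical):

```python
import sys

import pytest

# Skip only on Windows; the test still runs on Linux and macOS.
skip_on_windows = pytest.mark.skipif(
    sys.platform == "win32",
    reason="SAT-based synthesis is too slow on Windows runners",
)


@skip_on_windows
def test_ft_simulation():
    # Placeholder body; the real test would run the ft simulation here.
    ...
```

The marker keeps the skip visible in the test report (reported as "skipped" with the reason), rather than silently dropping coverage on one platform.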

@pehamTom
Member

> > > @pehamTom sorry that I have to bother you again, but I can't seem to get the ft simulation tests to work under Windows with Python 3.12, even after increasing the number of shots by a factor of 10. The logical error rate seems to be quite off in the two ft simulation tests. Windows with Python 3.9 seems to be fine and is not complaining. Any idea what could be causing this?
> >
> > It was another timeout issue with the state preparation. Since some optimal circuits are synthesized for the simulation, they timed out and returned a non-verified circuit.
> > But now the time to run the tests is 77 minutes on Windows with Python 3.12. I don't think that is an acceptable amount of time. I think the only way to circumvent this is to either not test the SAT solution for the state preparation and verification circuit synthesis or test the individual SAT formulas used in the encoding separately.
>
> Yeah. That's somewhat rough. What about the most pragmatic solution of not running these tests under Windows but running them under all other operating systems? It's not ideal, but at least we don't completely give up testing the functionality.

That would be fine by me.

@burgholzer burgholzer force-pushed the infrastructure-update branch from 3088d85 to c3dc83a Compare September 10, 2024 20:09
@burgholzer burgholzer force-pushed the infrastructure-update branch from c3dc83a to 16e4a65 Compare September 10, 2024 20:32
@burgholzer burgholzer force-pushed the infrastructure-update branch from 16e4a65 to 25eb6b8 Compare September 10, 2024 20:45
@burgholzer burgholzer merged commit 3cf545e into main Sep 10, 2024
42 of 43 checks passed
@burgholzer burgholzer deleted the infrastructure-update branch September 10, 2024 21:52
@burgholzer
Member Author

Everything is passing and I am glad that this is finally in 🎉
Nice step forward!

Labels
continuous integration (Anything related to the CI setup), enhancement (Enhancement to existing feature), minor (Minor version update), usability (Anything related to usability)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Fix py wheels windows
3 participants