Fix GPUParticles are not rendered for older AMD GPUs with OpenGL+Angle #96413
OK on LG Nexus 7 -
I have commented out all the “renames”. The 2D Platformer also looked normal. Should other MRPs be tested?
/*
#define packUnorm4x8 godot_packUnorm4x8
#define unpackUnorm4x8 godot_unpackUnorm4x8
#define packSnorm4x8 godot_packSnorm4x8
#define unpackSnorm4x8 godot_unpackSnorm4x8
#define packHalf2x16 godot_packHalf2x16
#define unpackHalf2x16 godot_unpackHalf2x16
#define packUnorm2x16 godot_packUnorm2x16
#define unpackUnorm2x16 godot_unpackUnorm2x16
#define packSnorm2x16 godot_packSnorm2x16
#define unpackSnorm2x16 godot_unpackSnorm2x16
*/
Particles and Shader are most important here. Maybe your Adreno device is fine? I'm not really into the 'Adreno problems', but checking the git history:
If you are lucky, you do not suffer from any of those problems of that wrong exposure or buggy implementation of the |
Hmmm, this section of code has created a lot of problems for us (as you can see in the history). The old We should introduce the same logic in a branchless manner, like in #73332. Or perhaps we should do something more robust and use the "ultrafast method" from https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats (described here: https://stackoverflow.com/a/60047308).
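For readers who want to see what a branchless half-to-float decode can look like, here is a minimal CPU-side sketch in C. It only illustrates the bit manipulation. It is not the shader code from this PR and not a verbatim copy of the linked answer. The names `float_bits`, `bits_float` and `half_to_float_branchless` are made up for this example, and it assumes a 32-bit IEEE 754 `float`. Note the stated limitation: an exponent of 31 (Inf/NaN) is not special-cased, so `0x7C00` decodes as 65536.0 rather than infinity.

```c
#include <stdint.h>
#include <string.h>

/* Bit-cast helpers (hypothetical names); memcpy avoids aliasing problems. */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Decode one binary16 value without branches.
 * Assumption: exponent 31 (Inf/NaN) is treated like any other exponent. */
static float half_to_float_branchless(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t e = (uint32_t)(h & 0x7C00u) >> 10;  /* 5-bit exponent */
    uint32_t m = (uint32_t)(h & 0x03FFu) << 13;  /* mantissa at float position */

    /* Biased float exponent of m read as a number: 127 + floor(log2(m)),
     * or 0 when m == 0. Used to normalize subnormal halves. */
    uint32_t v = float_bits((float)m) >> 23;

    /* Normal numbers: rebias the exponent (127 - 15 = 112). */
    uint32_t normal = (uint32_t)(e != 0) * (((e + 112u) << 23) | m);

    /* Subnormals: move the leading mantissa bit into the implicit position.
     * The "& 31u" keeps the shift count defined in C when m == 0. */
    uint32_t subnormal = (uint32_t)((e == 0) & (m != 0)) *
        (((v - 37u) << 23) | ((m << ((150u - v) & 31u)) & 0x007FE000u));

    return bits_float(sign | normal | subnormal);
}
```

The comparisons evaluate to 0 or 1 and act as masks, so each case is computed unconditionally and only the matching one survives the final OR.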
Given that #95797 only happens with ANGLE, I suspect it is more likely a precision issue rather than a problem with the actual conversion. It's likely that inserting the branch just forces different codegen by the driver. I would see if using high precision helps before committing to changing the conversion code.
Funnily enough, the lines still did not appear on all the older AMD devices I tested (with ANGLE, of course). That is also fixed by this PR.
Is this something you will check? I can help with testing if needed!
I don't have a device that can reproduce your issue, so I can't really check. I am just skeptical about this solution, as it removes the check for a 0 exponent and replaces it with a check for infinity. Both might be needed, but it's clear the root of the problem is bad code gen by the driver. Since OpenGL works fine on the same hardware, I suspect the issue comes from the difference between OpenGL and OpenGL ES. The most likely culprit is floating point precision, as desktop GL drivers almost always ignore precision.
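To make the two checks concrete, here is a plain branching decoder in C that handles both the zero exponent (zero and subnormals) and the all-ones exponent (infinity and NaN). It is a hypothetical CPU-side reference for comparing results, not Godot's shader code. The name `half_to_float_reference` is invented for this sketch, and it assumes a 32-bit IEEE 754 `float`.

```c
#include <stdint.h>
#include <string.h>

/* Reference decode of one binary16 value with every case spelled out. */
static float half_to_float_reference(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t e = ((uint32_t)h >> 10) & 0x1Fu;  /* 5-bit exponent */
    uint32_t m = (uint32_t)h & 0x03FFu;        /* 10-bit mantissa */
    uint32_t bits;

    if (e == 0) {
        if (m == 0) {
            bits = sign;                       /* signed zero */
        } else {
            /* Subnormal: shift until the implicit bit (0x400) is set. */
            uint32_t shift = 0;
            while ((m & 0x0400u) == 0) { m <<= 1; shift++; }
            bits = sign | ((113u - shift) << 23) | ((m & 0x03FFu) << 13);
        }
    } else if (e == 31) {
        bits = sign | 0x7F800000u | (m << 13); /* infinity or NaN */
    } else {
        bits = sign | ((e + 112u) << 23) | (m << 13); /* normal number */
    }

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Dropping the `e == 0` branch breaks zero and subnormal inputs, while dropping the `e == 31` branch turns infinity into a large finite value, which is why both may be needed.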
Is there something I can try here? I am currently trying out the fast algorithm. I also attached a project that contains nodes for all the issues we had in the past, for fast checking. I will keep it up to date if more come up. |
Tested the new fast algorithm solution and it works! It works on my system, where everything was fine before, and also on the broken system. I can check with other GPUs in the next few days as well (e.g. Intel and Nvidia). An Adreno device would be nice as well. Checked that the following issues do not reappear (as in the project attached on this PR):
Using a better and faster algorithm for the float conversions
Oh well. This approach is a few more instructions than we had before which is not ideal, but it is more correct and handles more edge cases.
I looked deeper into my precision theory and I am no longer sure about it: in theory, all unspecified floats and ints should be high precision. So it may be something else in the driver code gen causing the issue.
Since you have confirmed that this works using many of the prior bug reports, I think it makes sense to go with this.
The final thing is the licensing issue. There were concerns expressed about the licence previously. Since the code is basically a one-liner, I don't think a licence can be attached to it. The FSF considers any chunk of code fewer than 15 lines to be too trivial to copyright.
Thanks!
Fixes: #95797
Using the algorithm (bottom of the paper) as proposed by @clayjohn: https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats
Made sure that my fix does not cause any regressions (see project below):
This should be reviewed by someone much more familiar with this code than me.
GPUParticles now work fine on older AMD GPUs as specified in the issue, as well as on e.g. my much newer GPU, which did not suffer from this problem (worked before, works after). I plan to test this on other GPUs as well soon.
cc @Alex2782 - testing on an Adreno 3xx device would be nice as well :)
cc @JonqsGames - since you did the initial change that caused this 'regression' in #72914
cc @clayjohn - as you wrote and reviewed most of the OpenGL stuff :D
Project for testing, that contains all issues we once had regarding the conversion:
ParticleBug.zip
Everything is named after its GH issue, which is 'linked' in the description, so you can check everything quickly.
Tested project above on: