Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regression in testUsdImagingGLInstancing_nestedInstance #2097

Closed
pmolodo opened this issue Nov 17, 2022 · 20 comments
Closed

regression in testUsdImagingGLInstancing_nestedInstance #2097

pmolodo opened this issue Nov 17, 2022 · 20 comments

Comments

@pmolodo
Copy link
Contributor

pmolodo commented Nov 17, 2022

Description of Issue

Somewhere in the commit range d27b1be..9d21755, we started seeing consistent failures of the testUsdImagingGLInstancing_nestedInstance:

Baseline:
image

Generated:
image

Possibly related, or may just be a red herring - noticed this in my error log:

------- Stderr: -------
Warning: in RunTest at line 724 of /buildAgent/work/f651207570849f20/USD/pxr/usdImaging/usdImagingGL/unitTestGLDrawing.cpp -- Draw mode  not supported!
Warning: in RunTest at line 736 of /buildAgent/work/f651207570849f20/USD/pxr/usdImaging/usdImagingGL/unitTestGLDrawing.cpp -- Cull style  not supported!

Steps to Reproduce

  1. Enable GL image tests, build, and run testUsdImagingGLInstancing_nestedInstance

System Information (OS, Hardware)

Ubuntu 20.04.5 LTS, NVIDIA A10G

Package Versions

USD 9d21755

@sunyab
Copy link
Contributor

sunyab commented Nov 17, 2022

I believe this is fixed in the latest set of changes pushed to dev. Can you retry the test on your side and see if that's the case for you?

@pmolodo
Copy link
Contributor Author

pmolodo commented Nov 18, 2022

We're still seeing this on latest dev, 43671cf.

I'll see if I can set up an automated bisect to get an exact commit where the test first started failing for us...

@pmolodo
Copy link
Contributor Author

pmolodo commented Nov 18, 2022

Hi - so it looks like the commit the introduced the regression was bf18190 - which was actually where one of our MRs was merged, #1856.

And I'm now remembering that we had the same problem earlier - in the discussion for that MR, you can see essentially the same image pair that I've posted here. @wrma6 mentioned that a change related to testUsdImagingGLInstancing_nestedInstance was fixed, which I took to mean the rendering error we were seeing was resolved... but it would seem not.

I think we were always getting this result, but the test fix now properly exposed it. I'm assuming the test is passing on your infrastructure?

@sunyab
Copy link
Contributor

sunyab commented Nov 18, 2022

Yes, I double-checked the test results from prior to my last push to dev and the test was passing. I double-checked the images and they look correct as well.

I think the test had been failing internally with the push before my last push, but it got fixed sometime afterwards. So I'm surprised it's still failing for you. I can re-run the tests tomorrow with the latest set of internal changes to see how things look.

@sunyab
Copy link
Contributor

sunyab commented Nov 19, 2022

Filed as internal issue #USD-7782

@FlorianZ
Copy link
Contributor

FlorianZ commented Jan 5, 2023

Closing this as fixed, but feel free to re-open if this is still failing for you.

@FlorianZ FlorianZ closed this as completed Jan 5, 2023
@pmolodo
Copy link
Contributor Author

pmolodo commented Jan 6, 2023

I just checked again, and I'm still seeing this on the latest dev (dce6cf9).

This time my system info was:

OS: Ubuntu 20.04.5 LTS
GPU: NVIDIA GeForce RTX 3080
Driver: 510.108.03

@pmolodo
Copy link
Contributor Author

pmolodo commented Jan 6, 2023

Also, I don't think I have permission to re-open the issue. Should I create a new one?

@spiffmon spiffmon reopened this Jan 6, 2023
@spiffmon
Copy link
Member

spiffmon commented Jan 6, 2023

I'll also reopen the internal ticket. @pmolodo , were you or others going to work on fixing the regression, or should we triage this? May be tricky if we cannot reproduce it locally!

@pmolodo
Copy link
Contributor Author

pmolodo commented Jan 6, 2023

Thanks, @spiffmon.
Unfortunately, I don't think I'm going to be able to tackle this immediately, as it's not really a blocker for us, (we've disabled the test in our build system), and focus around here is more on our RTX renderer than Storm support.
On the other hand, I hate seeing bugs / regressions go unaddressed, and I can't expect you folks to tackle a bug that you can't reproduce.
It seems that the situation we're in is that this bug is reliably reproducible on all our systems, and not on any of your systems. So I'll at least see if I can reproduce on my own personal Ubuntu box (which should be relatively vanilla), using a straight invocation of build_usd.py, and see where that leaves us.

@spiffmon
Copy link
Member

spiffmon commented Jan 6, 2023

Much appreciated, @pmolodo !

@pmolodo
Copy link
Contributor Author

pmolodo commented Jan 11, 2023

So I was eventually able to reproduce the issue on my personal Ubuntu box, and what's more, was able to create a Dockerfile that can be used to build (and test, if you have nvidia-docker2 installed):

https://github.com/pmolodo/usd_docker_ubuntu/blob/testUsdImagingGLInstancing_nestedInstance/Dockerfile

Instructions for doing so are in the Dockerfile.

Unless you never use NVIDIA hardware, I'm guessing that the key difference is Ubuntu. (Probably you folks use some rpm-based distribution?) Probably in execution environment, rather than build environment, since we actually do all our builds on an old CentOS image. Though it would be good to have that confirmed too.

Anyway, let me know if you're able to replicate the issue using this!

@davidgyu
Copy link
Member

Thanks @pmolodo I've been able to repro using your Dockerfile, but interestingly not when using a clean dev build on the same host. This is also using a personal Ubuntu box:

OS: Ubuntu 20.04.5 LTS
GPU: NVIDIA RTX 3080 Ti
Driver: 525.60.11

Being able to repro is super helpful, though! Hopefully, more info soon.

@pmolodo
Copy link
Contributor Author

pmolodo commented Jan 17, 2023

Glad you were able to repro! Odd that it didn't with a build on the same host... Out of curiosity, did you both build and test on the Docker image, or just build and then test on the host?

@davidgyu
Copy link
Member

I've only been able to run the repro inside of docker. When I try to run from the docker build on the host I hit GLIBC version errors on my Ubuntu 20.04 system.

@davidgyu
Copy link
Member

Alright! The existing test baseline represents an invalid result. The two meshes which are rendering incorrectly are quad dominant subdivision surface meshes bound to a triangle ptex texture. Also, the ptexFaceOffset values specified in the test asset are invalid. I've updated the test to use a quad ptex texture with correct ptexFaceOffset values and I'm seeing consistent results.
I've also filed an internal ticket to improve error checking for these cases since ideally we'd have caught these errors earlier.

@pmolodo
Copy link
Contributor Author

pmolodo commented Jan 18, 2023

Great work, thanks! 🙏

@davidgyu
Copy link
Member

davidgyu commented Jan 18, 2023

One more thing... I did add --ptex to the arg list for the build_usd.py run step in the Dockerfile since Ptex is not enabled by default. I think this by itself might be enough to get the tests passing, but the underlying test asset fixes are good to have as well.
We've also fixed the spurious warning message mentioned above in the initial post on this issue.

@pmolodo
Copy link
Contributor Author

pmolodo commented Jan 19, 2023

Hmm... Does that mean this test should only be run if there's ptex support enabled?

@davidgyu
Copy link
Member

Yes, that's correct. We have a ticket open here to fix this.
I suspect this got overlooked when we added the test registration to CMake since we always have Ptex enabled in our internal builds.

pixar-oss pushed a commit that referenced this issue Jan 27, 2023
Updated the argument handling for the "-shading" and "-cullStyle"
options to initialize defaults and only report warnings for unhandled
modes.

Without this fix we get spurious warnings when running through any
of the current unit tests which use the defaults for these arguments.
Also, fixed the diagnostic text to avoid potential confusion
with UsdGeomModelAPI drawMode.

See #2097

(Internal change: 2259725)
pmolodo added a commit to pmolodo/OpenUSD that referenced this issue Aug 4, 2023
specifically:
- testUsdImagingGLInstancing_nestedInstance
- testUsdImagingGLInstancing_SceneIndex_nestedInstance

See discussion here:

PixarAnimationStudios#2097 (comment)
aaye pushed a commit to aaye/OpenUSD that referenced this issue Sep 24, 2023
specifically:
- testUsdImagingGLInstancing_nestedInstance
- testUsdImagingGLInstancing_SceneIndex_nestedInstance

See discussion here:

PixarAnimationStudios#2097 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants