Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[docs] py docstrings: when generating docstrings, ignore old __DOC.py #2578

Merged

Conversation

pmolodo
Copy link
Contributor

@pmolodo pmolodo commented Aug 8, 2023

Description of Change(s)

py docstrings: when generating docstrings, ignore old __DOC.py

...otherwise, the old generated python docstrings will cause new docstring generation to be skipped... but the new docstrings are still written out as empty

The result was that, before this change, every other run of docstring generation would wipe most of the docstrings

NOTE:
This PR is one of several targeting the python docstring generation process. To see all these PRs in a branch, see here.

  • I have verified that all unit tests pass with the proposed changes
  • I have submitted a signed Contributor License Agreement

@jesschimein
Copy link
Collaborator

Filed as internal issue #USD-8572

@dsyu-pixar
Copy link

@pmolodo first of all, huge thanks for doing this change -- I have been burned several times by this issue while testing python doc string builds.

I might be paranoid, but I am slightly worried that your change to Tf that accesses the new env var is a little subtle and could result in confusion down the road. I tried an experiment that does things a little differently, specifically, I removed the section of cdWriterDocstring.py -> __getOutputFormat that bails if it finds an existing doc for the given pyobj. In other words, __getOutputFormat just always overwrites existing doc strings. This seems like reasonable behavior to me -- the comment in the code about not overwriting existing docstrings because "it may be custom authored in the C++ wrap files" I think is coming from internal use of the script at Pixar, and I don't think applies to the USD libs.

Let me know what you think about this alternate approach.

@pmolodo
Copy link
Contributor Author

pmolodo commented Oct 17, 2023

@dsyu-pixar - I'm not sure if that's the best approach or not - there are, in fact, some custom-authored python docstrings in USD that would be overwritten - ie, see here and here.

It's possible the new docs generated from parsing C++ might be better (?), but we should probably at least confirm that before making this change.

As for the env var - I think having the ability to disable using the __DOC.py files is potentially a handy feature in it's own right - ie, if you want the fastest possible loads for some reason (tests, perhaps?), or they get corrupted somehow, or you're doing debugging / testing, etc.

However - if you really don't like the env var approach, we could try an approach where we delete __DOC.py files in cdWriterDocstring.py before generating new ones. Might be a bit tricky to implement if you're trying to call it on a single module or specific subset of modules, though, as deleting just a subset might get tricky due to module import interdependencies.

@dsyu-pixar
Copy link

Thanks @pmolodo and good point about some of the custom-authored strings in Sdf and Vt. My proposed approach is probably not the way to go.

In a review of this change, there was some concern about the Tf-envvar approach. I apologize for the back-and-forth on this change, but would it be possible to try the delete-file approach that you mentioned?

@pmolodo
Copy link
Contributor Author

pmolodo commented Nov 15, 2023

Hi - so thinking a bit more, I think doing the "right thing" for single-module docstring rewrites, and only deleting the specific files that are going to be written, would be a pain to implement.

Going back to the general approach you mentioned earlier, about changing the code so it overwrites the docstrings... what if we simply marked the auto-generated docstrings somehow? ie, write some sentinel attribute (._auto_gen_docstring), or simply keep a set of all names that have auto-generated docstrings at the module level? (I think I prefer the set approach.)

@pmolodo pmolodo force-pushed the pr/fix-docstring-repeat-builds branch from 6d1b59b to 1a7b35b Compare November 17, 2023 02:39
@pmolodo
Copy link
Contributor Author

pmolodo commented Nov 17, 2023

Ignore my previous comment... generating the list of files isn't hard, we can figure it out beforehand. I've pushed a version that works in the manner you proposed.

It also includes a new flag --cacheParsing that I added while testing because it significantly speeds up re-runs, if the xml files haven't changed. Not very useful outside of testing, though, so I can axe that commit if you'd prefer.

...this significantly speeds up subsequent runs, which is useful when
iterating / developing
...otherwise, the old generated python docstrings will cause new docstring
generation to be skipped... but the new docstrings are still written out
as empty

The result was that, before this change, every other run of docstring
generation would wipe most of the docstrings
@pmolodo pmolodo force-pushed the pr/fix-docstring-repeat-builds branch from 1a7b35b to 9270d58 Compare November 29, 2023 19:25
@pmolodo
Copy link
Contributor Author

pmolodo commented Nov 29, 2023

Rebased to fix merge conflicts...

@pixar-oss pixar-oss merged commit ea86325 into PixarAnimationStudios:dev Dec 11, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants