-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wip Remove docstring substitutions #57683
wip Remove docstring substitutions #57683
Conversation
So it seems that the advantage of docstring substitutions is that you only document certain things in one place. Otherwise, you have to repeat doc text in multiple places. That could create a future maintenance burden, if you have to change the description of a parameter in multiple places. So maybe the solution would be to write a special code that does the substitutions for the purpose of doing the code validation. Not sure how hard that would be. |
There have been a number of times where I've noticed something wrong with the docs, went to go correct it only to see that I'd have to tease apart and redo a lot of the sharing. Not having the time (and many times, the desire) to do so, I either raised and issue or let it slip. While I do agree that we may end up with inconsistencies, in my experience I think our docstrings would be a lot better if we made them easier to edit. It is also possible to set up tests to check that e.g. |
Awesome to see that automation took care of a lot of this! Big +1. Maintaining docstrings as purely text is worth the tradeoff of maintaining complex docstring automation and sharing. It will also be more straightforward for new contributors as well. If we want to maintain consistency, we can do that through unit testing instead (we also do this with the docstrings of |
😄 same A bit of copy-and-pasting for docs isn't too bad, and as you've both noted, it's possible to write unit tests to ensure that some docstrings stay in sync If there's no objections, I'll set aside some time to make this happen then (but my first priority is to properly handle pyarrow dtypes in the interchange protocol) |
If you can do this, I'm all for it. My concern here is that most contributors won't realize that they have to change some docstring in 4 different files without some kind of unit test checking that. |
While reusing docstrings can surely make things easier given the amount of duplicated methods in pandas, I do agree that in the current state it's probably creating more trouble than the value it adds. For example, the Seems from the comments here that in general we're happy to fully get rid of the duplication, but I don't think a final decision has been made. I see some discussion in different issues and PRs, mostly here, #57578 and #57710, but since the impact is quite big in term of changes to the code, what do you think about creating a specific issue to make a global decision (maybe even a PDEP), on whether we want to do reusing at all, in which well defined cases would it be acceptable if any, and how are we planning to move forward with the changes (maybe some automation like a script to try to identify all the docstrings being reused, and keep an inventory to work on)? Maybe I'm too cautious, but feels like we may end up in a situation where we're not well aligned among the core dev team, and we have lots of PRs related to this, also mixed with other possibly conflicting docstring efforts. |
now that the docstring validation has been noticeably sped up in CI, I don't think this is as important - closing then, thanks for comments |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.