Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve docstrings for better discoverability #7378

Open
maawoo opened this issue Dec 14, 2022 · 9 comments
Open

Improve docstrings for better discoverability #7378

maawoo opened this issue Dec 14, 2022 · 9 comments

Comments

@maawoo
Copy link
Contributor

maawoo commented Dec 14, 2022

What is your issue?

I noticed that the docstrings of the aggregation methods are mostly written in the same style, e.g.: "Reduce this Dataset's data by applying xy along some dimension(s).". Let's say a user is interested in calculating the variance and searches for the appropriate method. Neither xarray.DataArray.var nor xarray.Dataset.var will be returned (see here), because "variance" is not mentioned at all in the docstrings. Same problem exists for other methods like .std, .prod, .cumsum, .cumprod, and probably others.

#6793 is related, but I guess it already has enough tasks.

@maawoo maawoo added the needs triage Issue that has not been reviewed by xarray team member label Dec 14, 2022
@TomNicholas TomNicholas added topic-documentation and removed needs triage Issue that has not been reviewed by xarray team member labels Dec 14, 2022
@TomNicholas
Copy link
Member

That's a useful observation, thank you @maawoo!

This comes from the way we generate our code for the many different aggregations xarray can perform. We actually use this script to automatically generate all the source code for all the aggregations in this file. That script has a template that is filled in for each method.

Currently the template looks like this

TEMPLATE_REDUCTION_SIGNATURE = '''
def {method}(
        self,
        dim: Dims = None,
        *,{extra_kwargs}
        keep_attrs: bool | None = None,
        **kwargs: Any,
    ) -> {obj}:
        """
        Reduce this {obj}'s data by applying ``{method}`` along some dimension(s).
        Parameters
        ----------'''

where in the case of variance the method is just var so "variance" isn't in the generated docstring anywhere.

How might we fix this? One immediate thought that might help is to change the template to use a method_name and a long_name, where method_name is var but long_name is variance for example. This shouldn't be particularly difficult, and we would welcome a PR if you would be interested in contributing? We would help you out 🙂

Or we might change the docstrings in some other, more granular way. Adding examples to aggregation methods would also have to deal with the fact they are autogenerated #6793

@maawoo
Copy link
Contributor Author

maawoo commented Dec 19, 2022

Hi @TomNicholas,
thanks for your response! I'd be happy to contribute and will see what I can do once I find some spare time 🙂

@mahamtariq58
Copy link

Hi @TomNicholas , I would like to contribute to this issue.Could I be assigned to it?

@TomNicholas
Copy link
Member

Hi @mahamtariq58, thanks for your interest! We don't normally assign issues to individuals, you are just welcome to have a go at solving any issue that interests you.

@mahamtariq58
Copy link

Okay thank you @TomNicholas

@Amisha2778
Copy link

Hey @TomNicholas
I am an Outreachy applicant. I would like to work on this issue.

@TomNicholas
Copy link
Member

Hi @Amisha2778 - great to hear you are interested. You don't need my permission - please have a go at solving any issue that looks interesting to you, and please ask questions if you have any difficulties!

@Ravenin7
Copy link
Contributor

Ravenin7 commented Apr 1, 2023

Hey @TomNicholas . You mentioned one way of fixing this issue would be to also use long_name in the template to be more descriptive. Is there an already existing list of these names somewhere that could be used for this purpose or would one have to create these names for each method?
Also, if it is the latter, do you have any suggestion(s) for how I should figure these names out? I wouldn't have known that var stood for variance if I hadn't read it here. I may have guessed it as xarray.Dataset.var suggests to see numpy.var which is about computing variance but I don't want to guess wrong.

@dcherian
Copy link
Contributor

dcherian commented Apr 2, 2023

would one have to create these names for each method?

Yes I think so.

xarray.Dataset.var suggests to see numpy.var which is about computing variance but I don't want to guess wrong.

Yes things like var, std etc. are pretty standard so you should able to find them. If not, feel free to ask !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants