Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: Test no-file for source #493

Merged
merged 3 commits into from
Aug 21, 2023
Merged

TST: Test no-file for source #493

merged 3 commits into from
Aug 21, 2023

Conversation

larsoner
Copy link
Collaborator

Not sure this is the best solution but it at least allows things in MNE-Python to pass. Turns out the real issue was us subclassing list and that having to associated file. But I added another test that failed a different way along the way as well for where we create functions dynamically.

cc @stefmolin @stefanv since you've probably thought about this more than I have! And @jarrodmillman since we should probably get some form of this in for 1.6.

BTW the time to test docstrings in MNE-python went from this before #476:

5.15s call     mne/tests/test_docstring_parameters.py::test_docstring_parameters

to:

49.58s call     mne/tests/test_docstring_parameters.py::test_docstring_parameters

So we should indeed prioritize time/efficiency at some point soon :)

@@ -125,8 +127,12 @@ def extract_ignore_validation_comments(filepath: os.PathLike) -> Dict[int, List[
dict[int, list[str]]
Mapping of line number to a list of checks to ignore.
"""
with open(filepath) as file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would tweak this slightly.

numpydoc_ignore_comments = {}

try:
    with open(filepath) as file:
        last_declaration = 1
        declarations = ["def", "class"]
        for token in tokenize.generate_tokens(file.readline):
            if token.type == tokenize.NAME and token.string in declarations:
                last_declaration = token.start[0]
            if token.type == tokenize.COMMENT:
                match = re.match(IGNORE_COMMENT_PATTERN, token.string)
                if match:
                    rules = match.group(1).split(",")
                    numpydoc_ignore_comments[last_declaration] = rules
except (OSError, TypeError):  # can be None, nonexistent, or unreadable
    pass

return numpydoc_ignore_comments

Copy link
Collaborator Author

@larsoner larsoner Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually prefer the current approach as I always try to limit the scope of try/except as much as possible. With your code there is a single return path, which is nice. However the scoping for the try/except is broader, which means you risk accidentally catching a TypeError somewhere in the remaining 10 lines that do more than just "try to open the file". And I don't think the interpretation of the current code is too bad: "try to open the file and if you can't, return an empty dict. if you can, proceed to parse it"

numpydoc/tests/test_validate.py Outdated Show resolved Hide resolved
@larsoner
Copy link
Collaborator Author

I also realized that extract_ignore_validation_comments was being called for every :py:obj: to be validated, and that this is probably a big overhead when you only need to do it once per source file (as each of those could have tens/hundreds/thousands(?) of docstrings in them). Based on this:

$ find mne-python/mne -name "*.py" | wc -l
609
$ find matplotlib/lib/ -name "*.py" | wc -l
290
$ find pandas/pandas/ -name "*.py" | wc -l
1376
$ find scipy/scipy/ -name "*.py" | wc -l
924
$ find numpy/numpy/ -name "*.py" | wc -l
495

I added a @functools.lru_cache(maxsize=2000) to extract_ignore_validation_comments. This dropped the time from ~50s to ~10s for MNE-Python, which seems well worth the slight memory usage increase required to store the output dicts.

@jarrodmillman
Copy link
Member

Thanks!

@jarrodmillman jarrodmillman merged commit d9a3dfe into numpy:main Aug 21, 2023
23 checks passed
@stefanv stefanv added this to the 1.6.0 milestone Aug 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants