Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate use of lru_cache on methods #5670

Closed
DanielNoord opened this issue Jan 13, 2022 · 15 comments · Fixed by #5674 or #6182
Closed

Investigate use of lru_cache on methods #5670

DanielNoord opened this issue Jan 13, 2022 · 15 comments · Fixed by #5674 or #6182
Assignees
Labels
Maintenance Discussion or action around maintaining pylint or the dev workflow performance
Milestone

Comments

@DanielNoord
Copy link
Collaborator

DanielNoord commented Jan 13, 2022

Problem

I stumbled upon this video which explains problems with using lru_cache on methods as it leaks self. I haven't watched the full video yet but watched the previous video and saw the PR at pytest here: pytest-dev/pytest#9148

This is something we're doing ourselves as well and could be a potential source of memory leaks and other problems. I think this warrants an investigation and probably a refactor.

I have also opened pylint-dev/astroid#1344 to track this.


I think after the investigation of internal use it might make sense to perhaps turn this into a checker? I'll report after I have finished my investigation whether that might be useful.

@DanielNoord DanielNoord added Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling Maintenance Discussion or action around maintaining pylint or the dev workflow performance and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Jan 13, 2022
@DanielNoord DanielNoord self-assigned this Jan 13, 2022
DanielNoord added a commit to DanielNoord/pylint that referenced this issue Jan 13, 2022
DanielNoord added a commit to DanielNoord/pylint that referenced this issue Jan 13, 2022
Pierre-Sassoulas pushed a commit that referenced this issue Jan 13, 2022
@DanielNoord DanielNoord added this to the 2.13.0 milestone Jan 14, 2022
@asottile
Copy link
Contributor

you probably also want to ban @functools.cache on methods -- glad you enjoyed my video :)

@DanielNoord
Copy link
Collaborator Author

Thanks @asottile! Wasn't sure whether if you would appreciate being tagged but I did wonder if there were other troublesome functions. I'll open an issue for cache in a couple of minutes.
I do wonder though, in the docs of functools there is an example of turning cached_property into a stacking of property and cache. Wouldn't that create the same issue?

@asottile
Copy link
Contributor

yeah that would likely have the same issues

@mbyrnepr2
Copy link
Member

@DanielNoord good job on your work here.
I was wondering, however, does this checker give a counter-intuitive warning to users?

The lru_cache is core Python functionality & the Python developers seem to be saying here that this is expected behaviour of the lru cache: https://bugs.python.org/issue19859.
My feeling is this checker would mean Pylint is now warning users who are using it correctly against doing so; and there are ways to misuse any part of the language.

@DanielNoord
Copy link
Collaborator Author

Hmm perhaps you are right. It is at Warning level now. Would you want to completely remove the checker, or downgrade it to either Convention or Refactor?

@mbyrnepr2
Copy link
Member

I think especially the line in the warning - lru_cache shouldn't be used on a method as it creates memory leaks contradicts the views of the core Python developers in the issue linked above. Because of that, it feels controversial to say it shouldn't be used.
So, I think that wording could change to something like Method arguments are kept alive for the duration of the program when decorating a method with lru_cache etc. This would simply be stating a fact and a user may be more comfortable knowing that nothing is wrong with that per-se and that it is simply expected behaviour that Pylint is giving a heads-up about.

@Pierre-Sassoulas
Copy link
Member

I like the phrasing from Radomir Dopieralski's suggestion:

Warning! lru_cache will keep references to all the arguments for which it keeps cached values, which prevents them from being freed from memory when there are no other references. This can lead to memory leaks when you call a function with lru_cache on a lot of short-lived objects [without settings lru_cache's max_size argument]

I think the phrasing from the proposed warning need to be clear that this is not unwanted all the time. Maybe we only need to display the warning if max_size is None or not set ?

@DanielNoord
Copy link
Collaborator Author

That's actually a good improvement. I don't have time today but should have before 2.13 releases. I'll change the checker to:

  • Update message and description
  • Only trigger when max_size is None or not set

@DanielNoord
Copy link
Collaborator Author

I'm moving this to the 2.14 milestone pre-maturely as #5791 addresses all the issues raised that need to be fixed before 2.13 launched.

I'm keeping this open as reminder to myself that I need to implement the check for functools.cache as well.

@DanielNoord DanielNoord modified the milestones: 2.13.0, 2.14.0 Feb 10, 2022
@DanielNoord
Copy link
Collaborator Author

Note to self: we should also disallow max_size=None. We're using that in astroid and we're not raising the warning therefore.

@mdegat01
Copy link

mdegat01 commented Apr 4, 2022

What is the reason pylint requires explicitly providing a value for maxsize over just using the default value of 128? It seems like this should only raise an issue if maxsize=None is added and not if maxsize is omitted. Unless 128 is not good for another reason?

@jacobtylerwalls
Copy link
Member

Thanks for the report, that looks like an oversight. Since that's a false positive in a released version, that's eligible for a patch release. (As opposed to this ticket, which looks like it's still open as an enhancement tracker for functools.cache.) Could you open a new bug report? Thanks!

@jacobtylerwalls
Copy link
Member

@DanielNoord would you mind opening a new issue for the below? I'd like to close this ticket and force all discussion into new issues.

I'm keeping this open as reminder to myself that I need to implement the check for functools.cache as well.

@jacobtylerwalls
Copy link
Member

jacobtylerwalls commented Apr 4, 2022

I'm keeping this open as reminder to myself that I need to implement the check for functools.cache as well.

BTW, another reason to go to an issue first before a PR for that latter issue, is that I'm likely -1 on doing so. Unlike lru_cache(maxSize=None), there's no other way to call it. So I'm essentially in agreement with:

My feeling is this checker would mean Pylint is now warning users who are using it correctly against doing so; and there are ways to misuse any part of the language.~

EDIT:
I guess I'm okay with such a change because the implementation of cache is just a synonym for lru_cache(max_size=None), so if we're going to warn about it once we should just warn about it consistently. 👍🏻

@DanielNoord
Copy link
Collaborator Author

@jacobtylerwalls I have openend a PR implementing the issue. Should have done so earlier. Let me know if you still want me to create a new issue 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Maintenance Discussion or action around maintaining pylint or the dev workflow performance
Projects
None yet
6 participants