-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recommend Counter() in place of defaultdict(int) #323
Comments
This appears to solve an apparent memory leak.
Howdy - Good suggestion. TIL about Has the memory problem been fixed in later pythons (i.e. in >= 3.10) with defaultdict? If not, has a bug been raised? I'm torn with this one as it feels we could get it fixed in cpython, but maybe recommending it could be worth it. Would love others opinions here ... |
I've not confirmed on recent Python, but assume that like Python 3.9 they do the same - i.e. they probably behave the same way as per the current documentation. I don't think it is a bug, just a potentially surprising feature. I can understand if the calling factory function to make default objects were expensive then the current behaviour may have advantages, but with |
I'm happy to help with implementing this one. As it's not clear bug and just a recommendation, maybe it should be disabled by default? |
If this is accepted as in scope, I agree it falls under the |
Yup - I'll accept an opinionated for this - Lets just make sure we share the memory savings etc. - Thanks! |
…) instead of defaultdict(int)
…) instead of defaultdict(int)
Please have a look at #489 if it's what you meant. I checked the code given below on python 3.10 and 3.12 and in both cases using Counter instead of defaultdict(int) resulted in reducing memory usage from ~1000MB to flat ~20 MB Btw. I'm sorry it took me so much time to finally have a look at it 😞 |
…) instead of defaultdict(int)
…tead of defaultdict(int) (#489) Co-authored-by: djvdq <[email protected]>
Implemented in #489 |
I am suggesting a new check to recommend replacing
defaultdict(int)
withCounter()
, both of which are from the standard librarycollections
, due to the former having a major gotcha with memory usage.Consider this test case:
The apparent memory leak with
defaultdict
is due to the documented behaviour of recording the missing keys in the dictionary with the default value, https://docs.python.org/3/library/collections.html#collections.defaultdict says:I assume I'm not the only person to have used
defaultdict
without appreciating this, and in the case ofdefaultdict(int)
there is good alternative inCounter()
The text was updated successfully, but these errors were encountered: