Unnecessary list comprehension in sum, min, max #3259

Closed
janosh opened this issue Feb 27, 2023 · 13 comments · Fixed by #10759
Labels: rule (Implementing or modifying a lint rule)

Comments

@janosh commented Feb 27, 2023

It's not part of PIE802, so maybe this should go somewhere else, but I noticed that #3149 doesn't include built-in aggregators like sum(), min(), max().

The suggestion is to auto-fix:

# error
sum([i**2 for i in range(10)])

# ok
sum(i**2 for i in range(10))

Maybe this is covered by flake8-comprehensions? I looked but couldn't find it.

@matthewlloyd (Contributor) commented

Hmm, it looks like these used to be covered by flake8-comprehensions rule C407, which included these eight built-ins:

  • all
  • any
  • frozenset
  • max
  • min
  • sorted
  • sum
  • tuple

C407 was dropped in its entirety in 2021, mostly due to bugs introduced by the subtle behavior changes: adamchainz/flake8-comprehensions#247

There was a proposal in 2022 to add it back for any and all, which have the potential for more significant performance gains due to the possibility of short-circuiting the rest of the generator: adamchainz/flake8-comprehensions#426

The proposal did not get implemented because flake8-pie added PIE802 to cover any and all. Of course, these two are the most likely to cause a change in behavior due to short-circuiting of side effects, whereas the others are guaranteed to consume the entire generator.

Short-circuiting aside, they all reduce memory consumption.
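
As an illustration of the memory point (a minimal sketch, not from the original discussion), sys.getsizeof on the argument objects shows the difference between materializing the list and passing a generator:

import sys

n = 1_000_000

# The comprehension builds the whole list before sum() sees a single element.
as_list = [i * i for i in range(n)]
# The generator expression holds only a small frame; items are produced one at a time.
as_gen = (i * i for i in range(n))

print(sys.getsizeof(as_list))  # megabytes for the list object alone (element objects are extra)
print(sys.getsizeof(as_gen))   # a few hundred bytes, independent of n

print(sum(as_list), sum(as_gen))  # both compute the same total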

@Skylion007 (Contributor) commented

This would be a great candidate for an opinionated Ruff rule, though, as these are useful checks in many cases even if they do have some false positives. I would even go so far as to recommend including itertools functions as well, since they all take in iterables anyway.
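
As a purely illustrative sketch of what extending the check to itertools could flag (the exact function list was not settled in this thread), functions like itertools.chain.from_iterable accept any iterable, so the generator form works identically:

import itertools

lines = ["a b", "c d e", "f"]

# A check could flag this: the intermediate list of lists is never needed.
flat = list(itertools.chain.from_iterable([line.split() for line in lines]))

# Equivalent suggestion: pass the generator expression directly.
flat_lazy = list(itertools.chain.from_iterable(line.split() for line in lines))

assert flat == flat_lazy == ["a", "b", "c", "d", "e", "f"]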

@janosh (Author) commented Mar 2, 2023

I would even go so far as to recommend including itertools functions as well since they all take in iterables anyway.

The more, the merrier! :)

@charliermarsh added the rule (Implementing or modifying a lint rule) label on Mar 2, 2023
@JonathanPlasse (Contributor) commented

We should also include more-itertools.
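
For example (a hypothetical case, assuming the more-itertools package is installed), more_itertools.first_true can short-circuit just like any/all, so a list-comprehension argument both wastes memory and defeats the early exit:

from more_itertools import first_true

values = range(1, 10_000)

# Comprehension: builds the full list of multiples of 97 before first_true sees any of them.
hit = first_true([v for v in values if v % 97 == 0])

# Generator: first_true stops pulling items as soon as it finds the first truthy one.
hit_lazy = first_true(v for v in values if v % 97 == 0)

assert hit == hit_lazy == 97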

@matthewlloyd (Contributor) commented

An autofix for this is not particularly safe, of course - it can change the behavior of code even when there is no short-circuiting involved:

# Shared state: bumped on every construction, read inside __add__/__radd__.
COUNTER = 0


class Addable:
    def __init__(self, i, source=None):
        global COUNTER
        COUNTER += 1
        self.i = i
        print(f"__init__({self.i}, source={source})")

    def __add__(self, other):
        return Addable(COUNTER + self.i + other.i, source="__add__")

    def __radd__(self, other):
        return Addable(
            COUNTER + self.i + (other.i if isinstance(other, Addable) else other),
            source="__radd__",
        )

    def __repr__(self):
        return str(self.i)


# Eager: all three Addable objects exist before sum() starts folding them.
print("List comprehension:")
COUNTER = 0
result_listcomp = sum([Addable(i, source="listcomp") for i in range(3)])
print(result_listcomp)
print()

# Lazy: construction and folding interleave, so COUNTER differs at each step.
print("Generator:")
COUNTER = 0
result_generator = sum(Addable(i, source="generator") for i in range(3))
print(result_generator)
print()

assert result_listcomp == result_generator

Output:

List comprehension:
__init__(0, source=listcomp)
__init__(1, source=listcomp)
__init__(2, source=listcomp)
__init__(3, source=__radd__)
__init__(8, source=__add__)
__init__(15, source=__add__)
15

Generator:
__init__(0, source=generator)
__init__(1, source=__radd__)
__init__(1, source=generator)
__init__(5, source=__add__)
__init__(2, source=generator)
__init__(12, source=__add__)
12

Traceback (most recent call last):
  File "/home/mil/Development/uavforecast/test.py", line 36, in <module>
    assert result_listcomp == result_generator
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

@carljm (Contributor) commented Apr 2, 2024

The rationale for this lint in the case of any and all is strong, because any and all can short-circuit and the generator version can potentially avoid many loops in the case of a large iterable; the comprehension prevents short-circuiting.

For cases like min, max, and sum, where the input iterable will always be fully exhausted, in Python 3.12 the comprehension version is generally a bit faster than the generator version. (I'm using https://gist.github.com/carljm/39ccaaddf3ef7a4cb018264b7282f05f to benchmark, derived from https://gist.github.com/smithdc1/7825842972afe05ad2f457e8b9fbc890.) Results for sum, max, and min are all pretty much the same:

.....................
list_comprehension small: Mean +- std dev: 1.39 us +- 0.01 us
.....................
generator small: Mean +- std dev: 2.04 us +- 0.01 us
.....................
list_comprehension large: Mean +- std dev: 17.5 ms +- 0.5 ms
.....................
generator large: Mean +- std dev: 23.6 ms +- 0.5 ms
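
For anyone wanting to reproduce this without the gist, a rough stand-in using only the stdlib timeit module might look like the sketch below (numbers will vary by machine and Python version):

import timeit

cases = {"small": "range(100)", "large": "range(1_000_000)"}

for label, rng in cases.items():
    reps = 10_000 if label == "small" else 20
    listcomp = timeit.timeit(f"sum([i * i for i in {rng}])", number=reps)
    genexpr = timeit.timeit(f"sum(i * i for i in {rng})", number=reps)
    print(f"{label}: list comprehension {listcomp:.3f}s vs generator {genexpr:.3f}s over {reps} runs")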

The current documentation for the any/all version of the lint is a bit misleading, IMO. It says the performance benefit comes from avoiding the overhead of list creation, but it uses an example case that (accidentally?) maximizes the short-circuiting benefits (the first item in range(1000) is 0, so all will short-circuit immediately); the actual benefit there is entirely from the short-circuiting. If we modify that benchmark ever so slightly to use the numbers from 1 to 1000 instead of 0 to 1000, the claimed 40x performance benefit vanishes, and again the comprehension version is a bit faster:

➜ python -m timeit "all([i for i in range(1, 1000)])"
20000 loops, best of 5: 12.1 usec per loop

➜ python -m timeit "all(i for i in range(1, 1000))"
20000 loops, best of 5: 18 usec per loop

The reason for this is that while the comprehension does have the overhead of creating a list, generators also carry quite a bit of interpreter overhead, since they have to yield on each iteration. In particular, a builtin like sum/min/max, which is written in C and can use the underlying C-level tp_next directly on a list, is very efficient at iterating a list, but much less efficient with a generator that has to yield through the interpreter on every iteration; this more than cancels out the list allocation overhead.

I don't think we should over-pivot here based on current CPython performance, as performance characteristics change over time, and there's a reasonable possibility that faster-cpython may be able to improve the performance of generators in future versions. And even today, the generator version will be much more memory-efficient for large data than the comprehension one.

But I do want to double-check: given the above performance context, do we still think it makes sense to extend this autofix to sum, min, and max? The fix is semantically unsafe and will probably make performance a bit worse, but it will improve memory usage.

@carljm (Contributor) commented Apr 2, 2024

cc @charliermarsh @MichaReiser ^

@zanieb (Member) commented Apr 2, 2024

I'm a bit torn.

I'm not sure if I'm convinced of the "unsafety" from #3259 (comment) — I feel like we might have other safe fixes that can be broken by operator overloads and global state. Is there more that makes it semantically unsafe?

@charliermarsh (Member) commented

Wow, learned a lot from your comment -- thank you!

I think the "unsafety" is okay given that we already have the same problem (in theory) with any and all, unless I'm mistaken. I care more about whether we should recommend a pattern that fares worse on the microbenchmarks. I'm leaning towards saying "yes, we should still add sum, min, and max", since (1) it does improve memory usage (so it's not totally negative), (2) the performance hit is likely non-critical (though, on that note, we should improve the existing docs), and (3) it is a "simplification" of the user's code to pass the generator directly.

Perhaps we leave this as an open question for a day or so, though, to give others on the issue time to chime in.

@carljm (Contributor) commented Apr 2, 2024

we might have other safe fixes that can be broken by operator overloads and global state. Is there more that makes it semantically unsafe?

I don't think there's more than that; the unsafety is due to laziness. With the comprehension version, the value expression (i.e. expr in sum([expr for x in y])) is evaluated eagerly for all elements of the inner iterable first, and then the resulting values are processed one by one by the outer fold function (sum or min or max). With the generator version, the value expression is lazily evaluated one item at a time and each result is immediately processed by the fold function. This ordering difference can be seen in @matthewlloyd's example. For the difference to be observable, there does need to be some kind of shared (not necessarily global) state between the elements, and folding an element needs to have a side effect on that state, which I think does imply an override of some magic method (either addition or comparison).

The existing autofix for any and all is much more unsafe, for the same reason it can provide a perf win: short-circuiting. This unsafety doesn't require any magic methods with shared state, it just requires any kind of externally visible side effect of iteration at all, because with the short-circuiting generator version, that side effect may just not happen for a bunch of items in the iterable.
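
A boiled-down illustration of that point (not taken from the thread): with any, the generator rewrite can skip side effects that the comprehension version always executes:

seen = []

def record(x):
    # An externally visible side effect of iterating.
    seen.append(x)
    return x > 0

# Comprehension: record() runs for every element before any() inspects the results.
any([record(x) for x in range(5)])
print(seen)  # [0, 1, 2, 3, 4]

# Generator: any() stops at the first truthy result, so the later calls never happen.
seen.clear()
any(record(x) for x in range(5))
print(seen)  # [0, 1]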

My feeling is that unsafety is unsafety, even if you have to use advanced features of the language to trigger it -- people stretch Python in all kinds of ways! So I would consider this autofix unsafe in all variants. But I'm not opposed to extending it anyway! It makes sense to me that lots of code might not really care about the perf either way and may as well just be simpler.

I'll push a docs-only PR for now to clarify the current docs, and wait a day or so on the actual autofix extension.

@zanieb (Member) commented Apr 2, 2024

Sounds good to me! Thanks for the response.

carljm added a commit that referenced this issue Apr 2, 2024
…-all (C419) (#10744)

Ref #3259; see in particular
#3259 (comment)

## Summary

Improve the accuracy of the docs for this lint rule/fix.

## Test Plan

Generated the docs locally and visited the page for this rule:

![Screenshot 2024-04-02 at 4 56 40 PM](https://github.com/astral-sh/ruff/assets/61586/64f25cf6-edfe-4682-ac8e-7e21b834f5f2)

---------

Co-authored-by: Zanie Blue <[email protected]>
@matthewlloyd (Contributor) commented Apr 3, 2024

Agree with @carljm's insightful and well-presented analysis. Thanks for fixing the original docs - yes, the range(0, 1000) was accidental. Good catch; in hindsight, 40x was too good to be true.

Also agree with @charliermarsh's comment "yes, we should still add sum, min, and max" and the reasoning. The performance penalty is small and may change as CPython evolves, and I wonder to what extent it's offset by time savings at the next GC due to the reduced memory consumption. I also like the simpler code.

I think they should all be marked unsafe, as the "all" and "any" fixes can change both execution extent and order, and the others at least the order. If there are other autofixes currently marked as safe which could similarly change execution semantics in the context of dunder overload magic (with or without side effects on shared state), given how widespread that is, my vote would be to mark them unsafe too.

@carljm (Contributor) commented Apr 3, 2024

The attached PR adds sum, min, and max, as requested in the initial issue here.

I'm inclined to call that complete for this issue, and defer itertools and more_itertools funcs. One thought there (thanks @AlexWaygood) is that in the future if ruff gains type support, we could do this generally for any function typed as taking an iterable, which would naturally adapt to future additions to itertools. That seems a lot nicer than hardcoding lists of functions for modules that may change over time.

carljm added a commit that referenced this issue Apr 3, 2024
…check (C419) (#10759)

Fixes #3259 

## Summary

Renames `UnnecessaryComprehensionAnyAll` to
`UnnecessaryComprehensionInCall` and extends the check to `sum`, `min`,
and `max`, in addition to `any` and `all`.

## Test Plan

Updated snapshot test.

Built docs locally and verified the docs for this rule still render
correctly.