Making Pylint faster #497
Wow, the effort put into this is very impressive. Thank you!
Sorry, ignore my comments if you saw them; I remembered that we haven't removed 2.7 from tox yet.
@nickdrozd What do you think we can do to help prevent performance regressions? At first glance, some of this code would make me want to refactor it away because of DRY (but I understand the trade-off).
astroid/node_classes.py (outdated)
@@ -967,6 +967,10 @@ def pytype(self):
        :rtype: str
        """

    def get_children(self):
        for elt in self.elts:
This could use `yield from`, as @brycepg mentioned. We still have some bits of Python 2 compatibility in astroid, but we are in the process of removing it from both pylint and astroid (so if your PR fails on Python 2 for now, feel free to ignore that).
astroid/node_classes.py (outdated)
@@ -1172,6 +1176,9 @@ def __init__(self, name=None, lineno=None, col_offset=None, parent=None):

        super(AssignName, self).__init__(lineno, col_offset, parent)

    def get_children(self):
        return (_ for _ in ())
Ditto.
astroid/node_classes.py (outdated)
@@ -1287,7 +1297,49 @@ class Arguments(mixins.AssignTypeMixin, NodeNG):

        :type: NodeNG
        """

    def get_children(self):
        if self.args is not None:
This one would also benefit from `yield from` (like the entire PR). But I'd also like to have these grouped, as in:

    yield from self.defaults
    yield from self.kwonlyargs
    yield from ...

The reason is that the blocks seem to be independent only in the nature of the value that gets yielded, but other than that, it's all the same.
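For illustration, here is a minimal sketch of that grouped form. `ArgumentsSketch` is a hypothetical stand-in, not astroid's real `Arguments` class, and the `or ()` guard for a `None` field is an assumption standing in for the `is not None` check visible in the diff.

```python
class ArgumentsSketch:
    """Hypothetical stand-in for astroid's Arguments node (not the real class)."""

    def __init__(self, args=None, defaults=(), kwonlyargs=()):
        self.args = args              # may be None, as the diff's null check implies
        self.defaults = defaults
        self.kwonlyargs = kwonlyargs

    def get_children(self):
        # Assumption: `or ()` turns a None field into an empty iterable,
        # so every group can be written as one `yield from` line.
        yield from self.args or ()
        yield from self.defaults
        yield from self.kwonlyargs


node = ArgumentsSketch(args=["a", "b"], defaults=[1], kwonlyargs=["k"])
children = list(node.get_children())
empty = list(ArgumentsSketch().get_children())
```

Each group then differs only in the attribute it names, which is the point of the suggestion.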
astroid/node_classes.py (outdated)
@@ -644,6 +645,30 @@ def nodes_of_class(self, klass, skip_klass=None):
                for matching in child_node.nodes_of_class(klass, skip_klass):
                    yield matching

    def get_assign_nodes(self):
I would prefer these to be private. Some comments before them mentioning why they are like this would also help in the future, in case someone wonders why these couldn't have been made more DRY.
Thank you for doing this amazing work, @nickdrozd! There's not much to comment on in this PR; I left a couple of comments on things that we can improve, but overall it looks like a pretty good gain in performance. Also, I noticed from your yappi output that there might be other places with tons of calls where we should have fewer, such as in the transforms.
Thanks for the kind words! I was somewhat worried that these changes A few points:
`get_children` is elegant and flexible and slow.

    def get_children(self):
        for field in self._astroid_fields:
            attr = getattr(self, field)
            if attr is None:
                continue
            if isinstance(attr, (list, tuple)):
                for elt in attr:
                    yield elt
            else:
                yield attr

It iterates over a list, dynamically accesses attributes, does null checks, and does type checking. This function gets called a lot, and all that extra work is a real drag on performance. In most cases there isn't any need to do any of these checks. Take an `Assign` node for instance:

    def get_children(self):
        for elt in self.targets:
            yield elt
        yield self.value

It's known in advance that `Assign` nodes have a list of targets and a value, so just yield those without checking anything.
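A rough way to see the difference in isolation is to time both versions on a toy node. This is only a sketch: `GenericNode` and `SpecificAssign` are hypothetical stand-ins rather than astroid classes, and the timings it prints depend on the machine and interpreter.

```python
import timeit


class GenericNode:
    """Hypothetical node using the generic, field-driven walk."""

    _astroid_fields = ("targets", "value")

    def __init__(self, targets, value):
        self.targets = targets
        self.value = value

    def get_children(self):
        for field in self._astroid_fields:
            attr = getattr(self, field)
            if attr is None:
                continue
            if isinstance(attr, (list, tuple)):
                for elt in attr:
                    yield elt
            else:
                yield attr


class SpecificAssign(GenericNode):
    """Same data, but the field layout is hard-coded."""

    def get_children(self):
        yield from self.targets
        yield self.value


generic = GenericNode(["a", "b"], "v")
specific = SpecificAssign(["a", "b"], "v")

# Both walks must produce the same children in the same order.
same_children = list(generic.get_children()) == list(specific.get_children())

generic_time = timeit.timeit(lambda: list(generic.get_children()), number=20000)
specific_time = timeit.timeit(lambda: list(specific.get_children()), number=20000)
print(f"generic: {generic_time:.4f}s  specific: {specific_time:.4f}s")
```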
The check was being repeated unnecessarily in a tight loop.
`nodes_of_class` is a very flexible method, which is great for use in client code (e.g. Pylint). However, that flexibility requires a great deal of runtime type checking:

    def nodes_of_class(self, klass, skip_klass=None):
        if isinstance(self, klass):
            yield self

        if skip_klass is None:
            for child_node in self.get_children():
                for matching in child_node.nodes_of_class(klass, skip_klass):
                    yield matching
            return

        for child_node in self.get_children():
            if isinstance(child_node, skip_klass):
                continue
            for matching in child_node.nodes_of_class(klass, skip_klass):
                yield matching

First, the node has to check its own type to see whether it's of the desired class. Then the `skip_klass` flag has to be checked to see whether anything needs to be skipped. If so, the type of every yielded node has to be checked to see if it should be skipped. This is fine for calling code whose arguments can't be known in advance ("Give me all the `Assign` and `ClassDef` nodes, but skip all the `BinOp`s, `YieldFrom`s, and `Global`s."), but in Astroid itself, every call to this function can be known in advance. There's no need to do any type checking if all the nodes know how to respond to certain requests. Take `get_assign_nodes` for example. The `Assign` nodes know that they should yield themselves and then yield their `Assign` children. Other nodes know in advance that they aren't `Assign` nodes, so they don't need to check their own type, just immediately yield their `Assign` children. Overly specific functions like `get_yield_nodes_skip_lambdas` certainly aren't very elegant, but the tradeoff is to take advantage of knowing how the library code works to improve speed.
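To make the idea concrete, here is a small sketch of a type-specific search next to the generic one. The `Node`, `Assign`, and `Name` classes are simplified, hypothetical stand-ins; only the method names `nodes_of_class` and `get_assign_nodes` come from the PR, and the bodies are an illustration rather than the actual patch.

```python
class Node:
    """Hypothetical base node; children are kept in a plain list."""

    def __init__(self, *children):
        self.children = list(children)

    def get_children(self):
        yield from self.children

    def nodes_of_class(self, klass):
        # Generic search: every visited node pays for an isinstance check.
        if isinstance(self, klass):
            yield self
        for child in self.get_children():
            yield from child.nodes_of_class(klass)

    def get_assign_nodes(self):
        # A non-Assign node already knows it is not an Assign,
        # so it only delegates to its children.
        for child in self.get_children():
            yield from child.get_assign_nodes()


class Assign(Node):
    def get_assign_nodes(self):
        # An Assign yields itself, then searches its children.
        yield self
        for child in self.get_children():
            yield from child.get_assign_nodes()


class Name(Node):
    pass


inner = Assign(Name())
tree = Node(Assign(Name(), inner), Name())

generic_result = list(tree.nodes_of_class(Assign))
specific_result = list(tree.get_assign_nodes())
```

The type check is replaced by method dispatch: which `get_assign_nodes` runs is decided by the node's class, so no `isinstance` call happens during the walk.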
The PR has been updated to
Looks pretty good. I think it's a good idea to keep
I agree with Bryce that Nevertheless, this was a fun patch, @nickdrozd! Thank you so much for contributing this work.
I recently learned how to do Python profiling with `yappi`. `pylint` has always seemed slow to me, so I decided to see if it could be sped up.

Here is the call graph from running `pylint` against https://github.com/PyCQA/pycodestyle/blob/master/pycodestyle.py:
On these graphs, nodes represent function calls, and the brighter the
node, the more time spent in that function. Each node has three
numbers: 1) the total time spent in the function, including its
subcalls; 2) the total time spent in that function but not its
subcalls; and 3) the total number of times the function was called.
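The same three numbers (time including subcalls, time excluding subcalls, call count) can be read from most Python profilers. Below is a minimal sketch using the standard library's `cProfile` and `pstats` instead of `yappi`, which produced the graphs in this PR; `leaf` and `caller` are made-up functions for the example, and `get_stats_profile` requires Python 3.9 or later.

```python
import cProfile
import pstats


def leaf(n):
    return sum(range(n))


def caller():
    return [leaf(1000) for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
caller()
profiler.disable()

# func_profiles maps a function name to its aggregated statistics.
profile = pstats.Stats(profiler).get_stats_profile()
leaf_stats = profile.func_profiles["leaf"]
caller_stats = profile.func_profiles["caller"]

for name, entry in (("caller", caller_stats), ("leaf", leaf_stats)):
    # cumtime: time including subcalls; tottime: time excluding subcalls.
    print(name, entry.cumtime, entry.tottime, entry.ncalls)
```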
I was somewhat surprised to learn (although it seems obvious in retrospect) that the `pylint` code itself is not especially slow and that most time is being spent in `astroid` functions. Looking at that graph, it's obvious that `nodes_of_class` and `get_children` are bottlenecks, and optimizing those functions could have a big impact. (To explain the numbers a bit, `nodes_of_class` is taking up almost 60% of total CPU time, and more than a third of that time is being spent in `get_children`.)

First, I timed three runs of `pylint` with `astroid` on master to get a benchmark:
Next, I timed three runs with `astroid` on the commit `Add type-specific get_children`. This commit gives each (or almost each) node class its own `get_children` method instead of having them all use the same generic function. This sped things up:
Here is the call graph with that change:

As compared with the previous graph, `nodes_of_class` takes slightly less total CPU time, but of the time it does take, less of it is spent in its subcalls. This is because the type-specific `get_children` calls go much faster.
Okay, so `nodes_of_class` is now the sole bottleneck. First, I applied the commit `Move nodes_of_class null check out of inner loop`, which, as the name suggests, just shuffles around the null check logic in that function. This provided a modest speedup:

According to the call graph (which I won't bother to post), this change saved about .2% of total CPU time, which is not bad for such a small change.
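As a sketch of what hoisting a null check out of a loop looks like, the two generators below return the same nodes. The first tests `skip_klass is not None` once per child, the second tests it once per call. The "inside" version is an assumption about the general shape of such code, not a copy of the pre-change astroid source, and the classes are hypothetical stand-ins.

```python
class Node:
    def __init__(self, *children):
        self.children = children


class Assign(Node):
    pass


class Lambda(Node):
    pass


def walk_check_inside(node, klass, skip_klass=None):
    if isinstance(node, klass):
        yield node
    for child in node.children:
        # The None test is repeated for every child.
        if skip_klass is not None and isinstance(child, skip_klass):
            continue
        yield from walk_check_inside(child, klass, skip_klass)


def walk_check_hoisted(node, klass, skip_klass=None):
    if isinstance(node, klass):
        yield node
    if skip_klass is None:
        # Fast path: no per-child test at all.
        for child in node.children:
            yield from walk_check_hoisted(child, klass, skip_klass)
        return
    for child in node.children:
        if isinstance(child, skip_klass):
            continue
        yield from walk_check_hoisted(child, klass, skip_klass)


tree = Node(Assign(), Lambda(Assign()), Node(Assign()))
all_inside = list(walk_check_inside(tree, Assign))
all_hoisted = list(walk_check_hoisted(tree, Assign))
skip_inside = list(walk_check_inside(tree, Assign, Lambda))
skip_hoisted = list(walk_check_hoisted(tree, Assign, Lambda))
```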
Finally, I applied a much larger change, the commit `Add type-specific nodes_of_class`. This commit adds a few functions that are just like `nodes_of_class` except that they apply only to specific node searches. This eliminates the need to do expensive runtime type checking. Replacing calls to `nodes_of_class` within `astroid` with these new functions sped things up significantly:

Here is the call graph:

It's a little hard to interpret, but by my count `nodes_of_class` went from taking ~58% of total CPU time to ~48%.
Note that all of this data comes from running `pylint`, as I said earlier, against just a single file, `pycodestyle.py`. Larger `pylint` targets (projects with lots of subdirectories, for instance) have different profiles, and different functions become more prominent (`get_children` and `nodes_of_class` are slow in all circumstances, however). I have ideas for more optimizations, which I will come back with after these first changes are taken care of.