Pylint slow when run on script with pandas #2198
Comments
No, this is not the expected behaviour, there is probably a check that triggers a deep pandas inference leading to this result. You can try to ignore it with |
I'm definitely seeing this on
It's fine if I disable all checks. I tried finding the 'culprit' check ... but a variety of checks cause the issue, and I stopped after ten or so. |
Sorry, about that. I hit the close issue by mistake. |
Does this mean there are more than 10 checks that ignore the ignore-modules=pandas directive? |
As mentioned by @PCManticore that directive seems to have no impact. Some tests using the following:
Results with above code (i.e. no
with
Note: i7-7700HQ, so reasonable CPU. |
As mentioned earlier,
It's not intended to ignore all the errors that happen to occur with a given module. Also if anyone wants to investigate which checks contribute to the slowness of |
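For anyone who wants to look for the slow checks as suggested above, here is a minimal sketch of one way to do it: run pylint once per message with everything else disabled and compare wall-clock times. It assumes pylint is installed in the current interpreter and relies only on the `--disable=all` and `--enable=<message>` options. The helper names (`pylint_command`, `time_command`, `time_checks`) and the message names in the usage comment are illustrative, not something taken from this thread.

```python
import subprocess
import sys
import time


def pylint_command(target, check=None):
    """Build a pylint command line; if `check` is given, enable only that message."""
    cmd = [sys.executable, "-m", "pylint"]
    if check is not None:
        cmd += ["--disable=all", f"--enable={check}"]
    cmd.append(target)
    return cmd


def time_command(cmd):
    """Run `cmd` and return elapsed wall-clock seconds (exit status is ignored)."""
    start = time.perf_counter()
    subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False
    )
    return time.perf_counter() - start


def time_checks(target, checks):
    """Map each message name to the time pylint needs with only it enabled."""
    return {check: time_command(pylint_command(target, check)) for check in checks}


# Hypothetical usage (file name and message names are placeholders):
# for name, seconds in time_checks("script.py", ["no-member", "unused-import"]).items():
#     print(f"{name}: {seconds:.1f}s")
```

Since several checks may trigger the same inference, a slow run for one message does not prove that the check itself is at fault; the numbers only narrow down where to look.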
While I have not yet found the root cause, I traced this problem down to the 1.6.2 release of astroid. I took a simple test program like the one above and varied the installed versions of Pandas, Numpy, Pylint, and astroid. The Pandas, Numpy, and Pylint versions do not matter at all; I tested several versions of each from Summer 2017 up until today. But astroid <= 1.6.1 took only about 20-30 seconds, whereas anything >= 1.6.2 took 8-10 minutes! This also applies to the 2.x.y releases of Pylint and astroid: they take forever to analyse the simple test program. |
Thanks @SeppMe. I don't know off the top of my head what features shipped with astroid 1.6, but most likely there's something odd going on with the inference, which triggers these abnormal running times. |
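When comparing timings across package versions like this, it helps to record exactly which versions were installed for each measurement. The following is a small sketch, assuming Python 3.8+ (for `importlib.metadata`). The function names and the report format are made up for illustration, and the elapsed time is expected to come from whatever timing method you already use.

```python
from importlib import metadata


def installed_versions(packages=("pylint", "astroid", "pandas", "numpy")):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report_line(elapsed_seconds, versions):
    """Format one measurement together with the versions it was taken with."""
    parts = [
        f"{name}={version or 'not installed'}"
        for name, version in sorted(versions.items())
    ]
    return f"{' '.join(parts)}: {elapsed_seconds:.1f}s"


# Hypothetical usage, with `elapsed` measured elsewhere:
# print(report_line(elapsed, installed_versions()))
```

Pasting one such line per environment into a report makes it easier for others to see which combination of versions a given timing belongs to.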
It's astroid. |
Is there any update on this? |
Did someone run |
@kapsh Could we just revert that commit? |
@kapsh and @dickreuter No one has gotten to work on this ticket just yet. We're still trying to fix the issues created after the 2.0 launch, so we haven't had the time to investigate this issue or any other reported performance issues. Bear with us while we work our way through the backlog to get to this issue, or investigate the root cause yourselves and send a PR to fix the problem. |
Thanks for letting me know. But please note that the whole package is completely unusable at the moment after version 1.6.2. So not sure what other errors you're looking at, but most likely this problem deserves a higher priority. |
FYI @dickreuter , it seems to be 5-10x faster since I last tried it:
pylint: 7c103cd
|
@dickreuter That's a bit of an exaggeration that the package is completely unusable. As @kodonnell mentioned, do make sure to test with the latest version. |
@PCManticore thanks for the feedback! I understand that your hands are full, and I didn't mean to blame anyone. Unfortunately I barely understand what that commit is doing; I have seen only pylint's codebase and don't know anything about astroid. @dickreuter personally I wouldn't rush into it: there should be reasons for that commit, and reverting it could break more things. I didn't check this, though. @kodonnell that's interesting. Confirmed with pylint 2.1.1 & astroid 2.0.3 (5 seconds vs 35). It doesn't help much in my case (my project using pandas is still stuck on Python 2), but generally it's good news. |
Installing an old version of pylint and astroid can help. |
Have you tried increasing the number of jobs? Does it really work? |
I most definitely cannot observe anything getting better with the most current versions. Simple testcases, just as above, one with an import pandas, another without. |
@SeppMe Thanks for letting us know, we'll get to it. This issue is now part of the |
Can you provide more detail about the |
@kodonnell There's nothing formal per se, just a GitHub project to track all the issues that are related to performance: https://github.com/PyCQA/pylint/projects/3. This doesn't include the issues on |
Great news, can't wait to try it out! 👍 Thanks! |
@PCManticore Thank you very much! When do you plan to release Astroid v2.2.0 with this fix? I am debating whether to update my organization's |
For now installing from |
Hi,
It still takes quite some time on a macOS i7:
With pylint release 2.2.2 it takes roughly the same time |
@sp-daniel-pinyol Please report a separate issue. From a quick look it seems that both 1.9 and 2.2 exhibit the same behaviour, it doesn't seem to be caused by the regression which caused this particular issue with pandas. |
Any estimate when we'll get this released? many thanks |
@dickreuter I'll release 2.3 somewhere in February, in the meantime you can use the dev release. |
@dickreuter We've just switched to using |
Yes, it’s really odd that it takes months for a release that important. It should be done ASAP. Currently pylint is totally unusable. The only way is to take it directly from GitHub. |
@dickreuter you can install the dev package if you want. |
It’s tricky for large corporations as we can only use packages from anaconda. |
My company switched to using the dev release, and it reduced the duration of our entire CI/CD from 25+ minutes to 4 minutes. |
@dickreuter Cool, quick question: can those large corporations pay for provided support of one of these tools they're using, like |
I believe Microsoft has sponsored engineers who work solely on open source work not tied to the company (e.g., Lodash). Also, I know Stripe has sponsored several open source developers before. However, @dickreuter, if you state your case more publicly, just as you have here, I feel it is reasonable for companies - even small startups like mine - to donate. |
Can’t donate, but I contribute. But what I can’t do is the release. If you tell me how and give me the keys, I’m happy to do it.
|
Pylint taking ~7 minutes on 1500 files. Tried upgrading, tried ignoring pandas, and not seeing any improvements. Has anybody found a solution with pylint + pandas that doesn't take minutes to run? We already separated various rules into different pylintrc files and run those in parallel in an attempt to speed things up |
The issue has been fixed in the latest version of pylint. You may need to regenerate your pylintrc.
|
Actually, I'm realizing it's prospector causing the slowness. Pylint run on its own takes no time at all. |
According to pylint-dev/pylint#2198, the latest version fixed the performance issues. Let's test it on runbot; if it works, the runbot Dockerfile can be adapted accordingly. It could fix things like this: http://runbot.odoo.com/runbot/build/513891
I'm experiencing major slowdowns for checking pandas/numpy files. A 366 line file takes 128 seconds to check. pylint --version:
|
pylint is still taking over a minute to lint a short .py file that contains a very long string (65510 chars). I'm not sure if it's related to this issue, or if I should open a new issue. pylint 2.6.0 |
Probably due to #4062 instead. |
Sample script
Running pylint
pylint --version output
Q1. Is this expected behaviour?
Q1a. If so is there a way to make pylint ignore pandas?