Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Naive cythonization for performance #606

Closed
wants to merge 5 commits into from
Closed

WIP: Naive cythonization for performance #606

wants to merge 5 commits into from

Conversation

kodonnell
Copy link

WIP - mainly for my testing at the moment, and submitted as a PR to allow for discussion/review etc. Do not expect the code to be pretty and up to standard as yet, etc.

As proposed and validated in the pylint issue, it could be possible to 'naively' cythonize the astroid codebase for performance. With the changes up to c9fef18, the results are as follows:

master

$ cd astroid
$ for i in {1..5}; do time pylint ../pycodestyle/pycodestyle.py > /dev/
null; done

real    0m6.018s
user    0m5.547s
sys     0m0.359s

real    0m5.970s
user    0m5.609s
sys     0m0.297s

real    0m6.027s
user    0m5.625s
sys     0m0.281s

real    0m6.064s
user    0m5.781s
sys     0m0.250s

real    0m6.087s
user    0m5.625s
sys     0m0.391s

naive_cython

$ git checkout naive_cython
$ python setup.py build_ext --inplace
$ for i in {1..5}; do time pylint ../pycodestyle/pycodestyle.py > /dev/null; done

real    0m5.191s
user    0m4.859s
sys     0m0.266s

real    0m5.099s
user    0m4.750s
sys     0m0.281s

real    0m5.181s
user    0m4.797s
sys     0m0.313s

real    0m5.152s
user    0m4.813s
sys     0m0.313s

real    0m5.219s
user    0m4.766s
sys     0m0.375s

So roughly going from 6s to 5s. That's not bad for 'free' (especially on something that can run for hours). NB:

  • not a proper benchmark for all the usual reasons. If someone has a standard set of tests they can run, that'd be great - I'd be interested to see e.g. the performance difference on more complex projects.
  • this is running in Ubuntu 1604 WSL on a windows machine ... but I assume it's still indicative. Might vary a bit on other platforms though.

Tasks:

  • do completely naive cython
  • confirm with maintainer(s):
    • will we ship cython? does it need to be e.g. 50% faster before we do? etc. Or, should we create a fork of Cython which people can choose to use (e.g. with pylint) - and maybe if enough people adopt it and it holds up, then we migrate to it for good?
    • if we switch to Cython, when?: we don't want to be continually keeping this PR up to date etc.
    • should I bother now with other 'naive' cythonizations (e.g. setting types etc.). I.e. no code refactoring, and it should be recognizable by currently developers, except with cdef int i etc. scattered around the show. If we could get a Cython expert to run their eye over things, they'll probably find a bunch of other easy naive optimizations. My feeling is that if we're moving to Cython anyway, we might as well make a reasonably compelling case for it, and get the low-hanging fruit before we release it.
    • should I have all files cythonized (which is easier) or should I only try to cythonize the ones which are performance critical? For me the former is easier, as I can just blindly rename *.py to *.pyx
  • based on the answers to above:
    • do all the stuff to make this a properly distributable cython/python package
    • maybe do a fair bit of regression testing: compare output of pylint on a bunch of sample projects, to ensure it's behaving the same. We could compare performance too. (NB: this could then form the basis of a performance regression suite etc.)
    • do a blog post etc. and communicate it with users.

Future:

  • tackle pylint too. Note that there might be some scope for faster performance if we merge the libraries (as we can do cdefs for functions etc. which will be faster ... though possible not sufficiently so to justify it)
  • refactor performance critical stuff into non-naive Cython. I don't know how far to go down the rabbit hole. On the one hand, there's maintainability etc. to think about. On the other, performance is very important to astroid/pylint - pylint has a lot of users (client and CI builds etc.), and could be even more popular if it wasn't so slow, as there's no real alternative to it. In addition, being in Cython, it does open avenues for much more 'straightforward' optimization (i.e. just move python -> c without much thinking), as opposed to clever/harder solutions (like caching etc.) Not my call.

@kodonnell
Copy link
Author

@PCManticore if you could provide some feedback on the questions above (I'll update them with answers as they're provided), and maybe CC any other relevant people for their ideas, that'd be great.

@kodonnell
Copy link
Author

Any comments @PCManticore? I was awaiting some feedback/direction before proceeding.

@PCManticore
Copy link
Contributor

Going to check this PR this week, sorry for the waiting! Feel free to proceed for now.

@ceridwen
Copy link
Contributor

I've been meaning to take a look at this, but unfortunately I'm very busy just right now. I have three preliminary thoughts:

  1. Turning too much of the inference code into C/Cython would probably reduce the number of people who could contribute patches and require dropping support for PyPy. I would guess that most of the performance gains will come from a small set of the code base. Is this patch taking the approach the standard library does, with an optional C/Cython extension to improve performance?

  2. I have some guesses as to where the expensive code paths are, but profiling would be better. @brycepg put together a corpus for benchmarking here: https://github.com/brycepg/pylint-corpus. Have you tried it against this? See also Performance Benchmarks / Integration tests pylint#1954 . We really need performance benchmarks in the tests for performance improvements to be maintained, just with nickdrozd's patches we've already had some regressions.

  3. I would bet that significant performance improvement would come from just using better data structures and optimizing to reduce object creation in the inference.

@kodonnell
Copy link
Author

Thanks @PCManticore - I'm still unsure of what direction to take, so I'll await your comments.

@ceridwen FYI that it's very little code, so nothing really to review. I've basically just compiled the existing python-only code with cython - no C specifics.

Turning too much of the inference code into C/Cython would probably reduce the number of people who could contribute patches ...

Agreed - see my last point in the PR. FYI Cython code can just be pure python (which is this PR - the code is still python - except for some minor tweaks like dynamic method allocation and removing const as a variable name) - it's just compiled with Cython (which tends to improve pure python speed by 20-30%). I haven't written any C code etc.

Also - how many people contribute to the core of astroid already? (I.e. if not many people make major changes then it's less of an argument.)

and require dropping support for PyPy.

Good point. How many people use PyPy for running this? Seems like a relatively small minority. And if they're only using it to run pylint then it's not a major - that's a post-processing task etc. If they're only using PyPy for performance, then they might not need to anyway if this approach is faster anyway.

We really need performance benchmarks

Totally agreed - see my comment in the original PR.

Have you tried it against [pylint-corpus] ...

Nope - I didn't know about that.

just with nickdrozd's patches we've already had some regressions.

This is somewhat in favor of using Cython, as in the last paragraph of my original PR comment. Current performance optimizations have to be rather clever and often sacrifice maintainability/interpretability for (often minor) speed improvements. With Cython, we wouldn't have to worry about that.

Specifically to your point on this PR (in it's current state) - it's very unlikely there will be regressions with the PR as it is, since it's performance depends only on Cython (which is well tested etc.) and not code changes.

I would bet that significant performance improvement would come from just using better data structures and optimizing to reduce object creation in the inference.

Me too. But a 20% gain without me even understanding the code is pretty nice. I'll probably leave the smarter stuff to those more capable.

Is this patch taking the approach the standard library does, with an optional C/Cython extension to improve performance?

No idea - moving to Cython is a big change, so this PR is more of a "look what's possible - what do you think we should do?" piece. I'm not the best person to make these decisions (by a long way) - I'm just hopefully proposing something which others can build on.

@nickdrozd
Copy link
Contributor

I'm not familiar with Cython. What exactly is it doing to get the speedup? Will it require messing with the code in a substantial way?

In any case, this is a cool approach. A while back I said "It would certainly be nice if a bird flew by and dropped Pylint-rewritten-in-C into our hands, but short of that I doubt it will ever happen", so it's nice to see that it happened 🐈

@kodonnell
Copy link
Author

@nickdrozd

What exactly is it doing to get the speedup?

No idea sorry - I'm no expert in Cython. If you want, you can check out the C code it generates ...

Will it require messing with the code in a substantial way?

Check out the PR - there are very few code changes. Otherwise see my comments in the original PR around how much the code can/should change.

A while back I said "It would certainly be nice if a bird flew by and dropped Pylint-rewritten-in-C into our hands, but short of that I doubt it will ever happen", so it's nice to see that it happened 🐈

I remember reading this. Note that it's not quite true - I haven't rewritten anything in C. Nearly all the files will run with normal python just fine. I could even use the more traditional method for dynamically setting methods on classes, and then all the code would still run in pure python. However, if you do compile it with Cython (which does convert it to C along the way), it'll run faster.

@kodonnell
Copy link
Author

@PCManticore did you ever get a chance to review this? I could possibly close this otherwise.

@PCManticore
Copy link
Contributor

Thanks @kodonnell I haven't had a chance to review this closely up until now. But now I am not entirely convinced that this is the way to go in terms of improving the performance of astroid. I agree with the general sentiment of @ceridwen that this will mean both less contributors that what we already have and dropping support for Python, for a somewhat minor increase in performance. We also don't have that much experience with Cython and I am afraid on jumping on magic performance improvement tools without a modicum of experience to be able to debug any unexpected side effect. And finally, the proposed future changes seems to require a lot of work in both implementation and maintaining astroid post cythonization, which I seriously doubt we have time for. All in all, I think it's best to leave astroid pure Python for on and add corresponding performance improvements with Python code.

@PCManticore PCManticore closed this Jun 2, 2019
@kodonnell
Copy link
Author

Thanks @PCManticore. I tend to disagree that small/minor cythonization will affect contributions, or that it's a "magic ... tool" (as it's used in quite a few popular projects), or that the performance may be minor (as I didn't even try to optimise anything ... it might be the case that we could get a 10x performance increase by tweaking a few lines of code). However, I'm happy to close this as it probably doesn't make much sense to pursue this without at least a core contributor being more familiar with Cython.

For anyone reading this later - if you are a Cython ninja, I'd be really interested to see what some slightly-more-than-naive Cythonizing could do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants