pickling=True in save_module seems to slow down completions #937
Can you test on the parso branch? You might need to install parso for that. I have changed quite a few things about pickling (and moved the parser out of Jedi).
Hit a stacktrace. Commit: b9271cf, and I installed parso.
I ran
The problem, in my opinion: even when you have a lot of RAM, Jedi won't use it to its benefit; instead it caches everything to disk (caching is important, but it doesn't always require dropping data from memory). I have a 15" MacBook Pro (2017), and even with its blazing-fast NVMe SSD it struggles to read the cache, and the constant I/O also wears the SSD. So... why not create a RAM disk as a temporary solution? It solves the slow pickling reads and writes and even uses less CPU. I know it's not perfect, but until Jedi handles caching and memory usage reliably we don't have much choice. So this is my setup for now:
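(The exact commands didn't make it into this export; a rough sketch of such a setup, assuming a macOS RAM disk created with diskutil/hdiutil and jedi's cache_directory setting, might look like the following. This is illustrative only, not the original poster's actual setup.)

```python
# Illustrative sketch only -- not the original poster's actual setup.
# Assumes a RAM disk has already been created on macOS, e.g. with:
#   diskutil erasevolume HFS+ "RAMDisk" $(hdiutil attach -nomount ram://2097152)
# which mounts a ~1 GB volume at /Volumes/RAMDisk.
import jedi

# Point jedi's on-disk cache at the RAM-backed volume so pickled parser
# caches are read and written in memory instead of hitting the SSD.
jedi.settings.cache_directory = '/Volumes/RAMDisk/jedi'
```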
Also, this consumes a lot of memory, but to be honest I'd rather have a snappy coding experience with less multitasking than the other way around.
I think we should start using the JSON standard for serialization. Check this out: https://konstantin.blog/2010/pickle-vs-json-which-is-faster/. JSON is several orders of magnitude faster than Pickle. Best of all, it is human-readable! This is great for debugging purposes too.
Be my guest in implementing it with JSON. If you do, I just want to see the performance difference. :)
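(For anyone who wants to check the claim, a rough micro-benchmark along these lines would do — illustrative only, since results depend heavily on the payload and the Python version, and jedi's actual cached objects are parser trees that json cannot serialize without a custom encoding:)

```python
# Rough micro-benchmark: pickle vs. json on a nested, JSON-friendly payload.
import json
import pickle
import timeit

payload = {'modules': [{'name': 'mod%d' % i,
                        'names': ['attr%d' % j for j in range(50)]}
                       for i in range(1000)]}

pickle_time = timeit.timeit(
    lambda: pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL), number=100)
json_time = timeit.timeit(lambda: json.dumps(payload), number=100)

print('pickle: %.3fs  json: %.3fs' % (pickle_time, json_time))
```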
@kbatbouta Is pickling really the source of the problem? What's your environment/editor? Does using RAM make any noticeable difference in your case? In fact, I'm having a similar issue. I'm using vscode + vscode-python and tensorflow 1.5. Completions for the root tensorflow module (
I'm not sure. How new is your computer? SSD or HD?
Well, it's a few years old already (i5-2500K @ 3.30GHz), but it's still not the slowest piece of hardware. Usually ~/.cache/jedi is located on an SSD, but I've just run an experiment: I deleted the jedi cache and symlinked ~/.cache/jedi into /dev/shm to make sure it lives only in memory and doesn't hit the disk. The initial run was slow, of course, but after the caches were generated it didn't make any difference; the numbers are pretty much the same. That's why I doubt the problem is with caches:
But those are just "synthetic" tests; I haven't checked it in VSCode yet (gotta do it in the evening at home). I wonder whether there are any low-hanging fruits in terms of jedi performance, such as a) running jedi (or parts of it) in PyPy, or b) parallelizing completion (inference, analysis)? I'm not sure whether b) is trivial or even possible at all, though.
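(For reference, the /dev/shm experiment described above can be reproduced roughly like this — an illustrative sketch, not the commenter's exact commands; Linux-only, since it relies on tmpfs:)

```python
# Illustrative reconstruction of the experiment above.
# Replaces ~/.cache/jedi with a symlink into /dev/shm (tmpfs), so the cache
# lives in memory and never touches the disk.
import os
import shutil

cache_dir = os.path.expanduser('~/.cache/jedi')
shm_dir = '/dev/shm/jedi-cache'

if os.path.islink(cache_dir):
    os.remove(cache_dir)
elif os.path.isdir(cache_dir):
    shutil.rmtree(cache_dir)            # drop the existing on-disk cache

os.makedirs(shm_dir, exist_ok=True)     # tmpfs-backed directory
os.symlink(shm_dir, cache_dir)          # ~/.cache/jedi -> /dev/shm/jedi-cache
```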
I have tried PyPy, but it seems to be slower than normal jedi (even after disabling pickling, which is what Armin Rigo told me could be an issue). For parallelizing there are certainly possibilities, but I doubt I'll ever tackle it; it just doesn't make much sense in Python for jedi. I guess one of my next projects is to create something like an "index" that caches most of these things, which would make jedi work a bit more like PyCharm. The pre-calculated index would generally mean very good speed. However, until that is done, performance for tensorflow and numpy will probably suffer. Sorry for that, but it's really not easy to fix because of the huge size of those code bases.
Looking at the with-cache tensorflow case and using cProfile, most of the time is spent in get_definition and get_defined_names. The time to deserialize the pickled object only affects the uncached runs.
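(For anyone wanting to reproduce that kind of measurement, a sketch along these lines works — it assumes the pre-0.16 jedi API where Script(source, line, column) exposes completions(); the function names that dominate the profile will vary with the jedi/parso version:)

```python
# Sketch of profiling a tensorflow completion with cProfile.
# Assumes the jedi API of this era: Script(source, line, column).completions().
import cProfile
import jedi

source = "import tensorflow as tf\ntf."
script = jedi.Script(source, 2, 3)      # cursor right after "tf."

# Sort by cumulative time to see where completion time is actually spent
# (e.g. get_definition / get_defined_names vs. unpickling the cache).
cProfile.run('script.completions()', sort='cumtime')
```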
That is pretty much what I expected. I have tried to optimize these things before, but I think it's pretty hard without a lot of additional caching.
It looks like Python 3.6 is significantly faster for the with-cache tensorflow case (and most other cases). Python 3.6:
Python 2.7:
IMO the issues with pickling=True have been fixed in parso. Please refer to #910 for numpy/matplotlib/tensorflow slowness. |
Environment:
If I force pickling to False in save_module, getting recommendations with no cache is much faster. Change:
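(The actual diff wasn't captured in this export. A hypothetical way to get the same effect without editing jedi itself is to wrap save_module so every call passes pickling=False; the import path and signature below are assumptions and differ between jedi versions:)

```python
# Hypothetical sketch only -- the reporter's real change was an edit inside
# jedi, which wasn't captured here. The location of save_module is an
# assumption and varies with the jedi version.
import functools

from jedi.parser import cache as parser_cache  # assumed location of save_module

_original_save_module = parser_cache.save_module

@functools.wraps(_original_save_module)
def _save_module_without_pickling(*args, **kwargs):
    kwargs['pickling'] = False          # skip writing the pickled cache to disk
    return _original_save_module(*args, **kwargs)

parser_cache.save_module = _save_module_without_pickling
```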
Results from master:
Results with pickling = False: