Move completion into its own daemon for python #358

Closed
kovidgoyal opened this issue Jun 4, 2013 · 21 comments · Fixed by #578

Comments

@kovidgoyal

One of the goals of YCM (as I understand it) is to be fast. Unfortunately, using jedi for completion means that on larger projects, that goal is not achieved, for several reasons:

  1. Sometimes jedi takes several seconds to return completions in newly opened, large .py files. This means vim does not respond to key strokes for extended periods of time.

  2. I notice general sluggishness when moving line by line through text. This is probably because of completion that is triggered and aborted or because the GIL pauses the UI thread while the backend thread is working.

  3. I tend to keep vim sessions open for days of continuous hacking, which means that memory consumption in the vim process goes to multi-gigabyte levels. Using a daemon means that the daemon could be restarted, without needing to restart the entire vim session.

A possible solution is to move completion activity into a separate daemon process (one per vim instance). vim can communicate with it over stdin/stdout, so that it is easy to make it cross-platform, and use select() to keep from blocking the UI. I implemented something similar for the powerline project https://github.com/kovidgoyal/powerline-daemon a little while ago and I am willing to do the work for YCM as well, provided this is something you are on board with; otherwise, I'll just cook up a quick hack for myself.
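A minimal sketch of the stdin/stdout idea, assuming a Unix-like system (select() on pipes does not work on Windows). The worker here is a hypothetical stand-in that just upper-cases each request line; it is not YCM or powerline-daemon code, and the names `start_worker`, `poll_reply` and `request` are made up for illustration.

```python
import select
import subprocess
import sys

# Hypothetical worker: a fresh interpreter that upper-cases each line it reads.
WORKER_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def start_worker():
    """Launch the worker as a separate process connected by unbuffered pipes."""
    return subprocess.Popen(
        [sys.executable, "-u", "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,  # raw pipes, so select() sees exactly what is readable
    )


def poll_reply(worker, timeout):
    """Return one reply line if it arrives within `timeout` seconds, else None."""
    readable, _, _ = select.select([worker.stdout], [], [], timeout)
    if not readable:
        return None  # nothing yet; the caller is free to handle keystrokes
    return worker.stdout.readline().decode("utf-8").rstrip("\n")


def request(worker, text, poll_interval=0.05, max_polls=200):
    """Send one request and poll for the answer without blocking for long."""
    worker.stdin.write((text + "\n").encode("utf-8"))
    for _ in range(max_polls):
        reply = poll_reply(worker, poll_interval)
        if reply is not None:
            return reply
    return None  # gave up; the worker may be stuck and could be restarted
```

Because the worker is an ordinary child process, it can be killed and relaunched without touching the editor session, which is the restart property described in point 3.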

If you do decide you want me to implement it, I'll need some guidance as to the best place to implement it. I was originally planning to just use it for jedi completion, since that is what I care about, but it may be better to make it usable for all backends, since this problem is likely to recur with other backends in the future. Also, there may be some technical reasons that make what I am proposing infeasible; I'm not yet familiar enough with the codebase to be sure.

Oh and nice to see you again, Valloric, good job with YCM :)

@kovidgoyal
Author

After more investigation, the problem seems to be worse than I first thought. I tried the following YCM config:

let g:ycm_collect_identifiers_from_tags_files = 1
let g:ycm_filetype_specific_completion_to_disable = {"python":1}

On a large project (calibre, if you care, ~500K LoC) with a 4.5MB tags file, moving the cursor around in normal mode becomes slow. This slowness remains even after disabling the CursorMoved and CursorHold autocommands from youcompleteme.vim, which leads me to believe it's a GIL issue, though why YCM should be doing background processing in normal mode is not clear to me. That it is doing so I infer from the fact that CPU usage climbs while scrolling and goes back to zero when the cursor is just sitting there. Further, the increase in CPU usage is ~3x the increase when the tags file is not loaded. Is there someplace else that I need to disable autocommands?

You should be able to reproduce the normal mode cursor slowness by generating a tags file for YCM itself (thanks to Boost the tags file is 720MB for YCM and the slowness is extremely noticeable). Generate the tags file by running

ctags -R --fields=+l

in the YCM root.

That implies, at least to me, that all completion needs to be moved into a daemon and the in-vim ycm should just become a dumb client.

@oblitum
Contributor

oblitum commented Jun 4, 2013

I wonder, is handling tags files currently a focus of YCM?

As I'm not a very long-time Vim user, I never got hooked on them, so I have never used them while developing for C++, and what YCM provides without them is fine for me. I look forward to the day when ctags files and IDE completion databases aren't needed anymore. As for the other languages, I don't know whether working together with tags files is actually needed with YCM.

@kovidgoyal
Author

I only tried using tags because, as I said, jedi is too slow (for large projects). Using tags is faster, but still not fast enough. Having noticeable latency when scrolling with hjkl in normal mode is not acceptable, at least to me. And this is on my top-end dev machine; I don't even want to try it on my laptop :)

@JazzCore
Contributor

JazzCore commented Jun 4, 2013

@kovidgoyal When working on Python files, two backends are active: the identifier and jedi completers.

The identifier completer uses C++ threading (I don't know much about it) and does some processing in normal mode (it parses identifiers on the CursorHold autocmd).
Jedi uses this event loop to get the completions. When moving line by line, the ShouldUseNow method is polled on every cursor move, but Jedi is called only after a dot. I don't get the sluggishness when moving in a file, so I can't say for sure what causes it.
As for not responding to keystrokes, I don't really get it. Do you mean it blocks the GUI? That's not supposed to happen, since it uses a backend thread and YCM itself aborts a completion when the user types something (bad wording here; the code is more understandable).

Tags parsing happens in the identifier completer (see 454a961). However, the actual parsing happens only when entering a buffer (the OnFileReadyToParse completer event).

I'm not sure about daemon idea, can you please explain how it differs from current threaded backend communication?

@kovidgoyal
Author

The difference is simple: in Python there is a global interpreter lock (GIL), which means that only one thread can be active at a time. So if you are running some CPU-intensive task in a background thread, that thread will cause the main GUI thread in vim to block (if it is in Python). In practice, the Python threads release the GIL periodically (after every x Python bytecode instructions in Python 2.x and after every x milliseconds in Python 3.2+). This is why you don't notice GIL problems when there is no heavy CPU usage.

Therefore, a CPU intensive background python thread can cause both sluggishness and ignored keystrokes in the main GUI thread of vim.

When using a separate process, there is no GIL, and by using select() to communicate with the worker process, you can guarantee that the main vim thread is never blocked.
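The effect can be measured with a small experiment. This is only a sketch: the main thread stands in for Vim's UI thread, `spin` is a made-up CPU-bound background task, and the actual numbers depend on the Python version, the build and the machine, so none are quoted here.

```python
import threading
import time


def spin(stop):
    """CPU-bound loop that never blocks, so it gives up the GIL only when forced to."""
    n = 0
    while not stop.is_set():
        n += 1
    return n


def worst_tick_delay(ticks=50, tick=0.001, with_background_thread=False):
    """Sleep `tick` seconds `ticks` times and return the worst overshoot in seconds.

    The overshoot approximates how late the "UI" thread gets to run again
    after it wanted to wake up.
    """
    stop = threading.Event()
    worker = None
    if with_background_thread:
        worker = threading.Thread(target=spin, args=(stop,))
        worker.start()
    worst = 0.0
    try:
        for _ in range(ticks):
            start = time.perf_counter()
            time.sleep(tick)
            overshoot = time.perf_counter() - start - tick
            worst = max(worst, overshoot)
    finally:
        stop.set()
        if worker is not None:
            worker.join()
    return worst
```

Comparing `worst_tick_delay()` with `worst_tick_delay(with_background_thread=True)` on an interpreter that has a GIL should show the main thread waking up later when the background thread is busy. Running the busy loop in a separate process instead removes that contention, since each process has its own interpreter.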

@zhaocai
Contributor

zhaocai commented Jun 4, 2013

👍 I do notice lag sometimes. I guess the root cause is GIL as @kovidgoyal stated.

@JazzCore
Contributor

JazzCore commented Jun 4, 2013

@kovidgoyal Why not use multiprocessing instead?

@kovidgoyal
Author

Because multiprocessing does fork() without exec() on Unix. That means the entire memory footprint of vim is duplicated*, open file handles/sockets are duplicated, various locks are left in an inconsistent state and so on. For example, some other plugins I use in vim use inotify extensively, all the watches that inotify creates will (probably) be duped and left open in the worker process.

*On linux fork uses copy-on-write for memory duplication, however, because python uses refcounting, that makes no practical difference.
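For contrast, here is a sketch of starting the worker as a fresh interpreter with `subprocess`, so that it does not carry over the parent's imported modules or heap. The child script and the name `fresh_child_report` are hypothetical. On Python 3, `subprocess.Popen` also defaults to `close_fds=True` on POSIX, so descriptors other than stdin/stdout/stderr are not passed on to the child.

```python
import json
import os
import subprocess
import sys

# Hypothetical child: reports its own pid and how many modules it has loaded.
CHILD_SOURCE = (
    "import json, os, sys\n"
    "print(json.dumps({'pid': os.getpid(), 'modules': len(sys.modules)}))\n"
)


def fresh_child_report():
    """Run a brand-new interpreter and return what it says about itself."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        check=True,
    )
    report = json.loads(completed.stdout.decode("utf-8"))
    report["parent_pid"] = os.getpid()
    return report
```

Newer Python versions (3.4 and later) also let `multiprocessing` use a "spawn" start method instead of a bare fork, which was not an option on the Python 2 interpreters embedded in Vim at the time of this discussion.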

@JazzCore
Contributor

JazzCore commented Jun 4, 2013

Right, good point, daemon would be a nice addition.

It's better to make this available to all backends, since besides Jedi there is already a filename completer that uses Python threads, and there may be more in the future.

Another concern is that it should be cross-platform, since YCM is, or should at least fall back to the current implementation.

@kovidgoyal
Author

I also think it would be best to make it available to all backends. Basically move the entire logic into the worker process and make the vim part just a dumb client that communicates with the worker.

Let us see if Valloric agrees, at least in principle. Then we can discuss the best way to implement it.

@kurojishi

I second @kovidgoyal's idea. The worst problem with YCM on big Python projects is the lag; it takes a lot of time to index everything.

@Valloric
Member

Sorry for the delay in responding, I knew I was going to have to write a big response to this issue thread and I couldn't find the time. Here goes:

One of the goals of YCM (as I understand it) is to be fast. Unfortunately, using jedi for completion means that on larger projects, that goal is not achieved, for several reasons:

I've used the YCM jedi_completer on large Python projects (bigger than Calibre) with great success. The first time you query for semantic completions in a Python file it takes a bit for Jedi to do its thing, but after that first completion invocation, it's pretty damn fast. At least in my experience.

  1. Sometimes jedi takes several seconds to return completions in newly opened, large .py files. This means vim does not respond to key strokes for extended periods of time.

It will appear as if Vim's GUI thread has been blocked, but if you try to type it will "unblock". YCM polls the complete_check() Vim function to check for new key input while processing your completion request in the background; if new keystrokes are detected, the operation is aborted.

Vim displays this in a terrible way; the GUI appears blocked, but typing unblocks it.

This is also sadly not perfect, and on occasion the GUI thread is in fact blocked by some YCM operation. I'm immensely annoyed when this happens.
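The polling pattern described above can be sketched in plain Python. This is not YCM's code: `input_pending` is a hypothetical callable standing in for Vim's complete_check(), and `compute` stands in for the completer query running on a background thread.

```python
import threading
import time


def run_abortable(compute, input_pending, poll_interval=0.01):
    """Run `compute` on a background thread, polling `input_pending` meanwhile.

    Returns the computed value, or None if new input arrived first (or if
    `compute` raised). An abandoned thread is left to finish on its own.
    """
    result = {}

    def target():
        result["value"] = compute()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    while worker.is_alive():
        if input_pending():
            return None  # the user typed something: drop this request
        worker.join(poll_interval)  # wait briefly, then poll again
    return result.get("value")
```

Note that the poll itself runs on the main thread, so it only helps if that thread actually gets scheduled between polls.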

  2. I notice general sluggishness when moving line by line through text. This is probably because of completion that is triggered and aborted or because the GIL pauses the UI thread while the backend thread is working.

This sounds like this FAQ question. YCM doesn't do anything computationally intensive on cursor move at all. In fact, the only thing that is done on cursor move in normal mode is updating Syntastic diagnostics if you're in a C-family file. So in a Python file, literally nothing is done on cursor move. See the code for more details.

So this is highly unlikely to be caused by YCM. My money is on an out-of-date Syntastic version. Something about YCM rubs the old Syntastic the wrong way. I've heard the "slow when moving the cursor around" complaint about a dozen times now and every single time the issue was resolved by updating the user's copy of Syntastic.

This slowness remains even after disabling the CursorMoved and CursorHold autocommands from youcompleteme.vim, which leads me to believe it's a GIL issue, though why YCM should be doing background processing in normal mode is not clear to me.

You just proved it's not YCM code that's doing it. There are no other CursorMoved autocommands other than the ones at the top of autoload/youcompleteme.vim. If you commented out those autocommands, literally no YCM code is run on cursor move.

The size of the tags file is also a red herring. I've used a 30 MB tags file without any slowness. YCM processes the tags file in a background thread in the C++ code (with the background thread created and managed by the C++, not the Python), so not even the GIL could be an issue. YCM also processes your tags file only once until the next time the tags file is modified.
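The "process once until the file is modified" behaviour can be illustrated with a small cache keyed on the file's modification time and size. This is a Python sketch of the idea only; YCM does this work in C++ and its actual implementation is not shown in this thread. `TagsCache`, `parse_tags` and `demo` are made-up names.

```python
import os
import tempfile


def parse_tags(path):
    """Collect tag names: the first tab-separated field of each non-metadata line."""
    names = set()
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith("!_TAG_") or "\t" not in line:
                continue  # skip ctags header lines and malformed lines
            names.add(line.split("\t", 1)[0])
    return names


class TagsCache:
    """Re-parse a tags file only when its mtime or size has changed."""

    def __init__(self, parse=parse_tags):
        self._parse = parse
        self._entries = {}  # path -> ((mtime_ns, size), parsed names)
        self.parse_count = 0

    def get(self, path):
        info = os.stat(path)
        stamp = (info.st_mtime_ns, info.st_size)
        cached = self._entries.get(path)
        if cached is None or cached[0] != stamp:
            self.parse_count += 1
            cached = (stamp, self._parse(path))
            self._entries[path] = cached
        return cached[1]


def demo():
    """Parse a small tags file, read it again, then modify it and read once more."""
    cache = TagsCache()
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "tags")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("!_TAG_FILE_SORTED\t1\t//\n")
            handle.write("main\tmain.py\t/^def main():$/\n")
        first = cache.get(path)
        cache.get(path)  # unchanged file: served from the cache
        with open(path, "a", encoding="utf-8") as handle:
            handle.write("helper\tutil.py\t/^def helper():$/\n")
        second = cache.get(path)  # size changed: parsed again
    return first, second, cache.parse_count
```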

  3. I tend to keep vim sessions open for days of continuous hacking, which means that memory consumption in the vim process goes to multi-gigabyte levels. Using a daemon means that the daemon could be restarted, without needing to restart the entire vim session.

I usually have a Vim instance open for weeks at work, so I'm in the same boat. I'm not usually bothered by the memory consumption, but I'll get to the daemon part below.

A possible solution is to move completion activity into a separate daemon process

This has been on my mind for months now actually. Vim goes out of its way to block the GUI thread and it's hard to prevent it from strangling itself. So yes, YCM will be split into a client & server model. The server will probably be an actual HTTP server that exposes a JSON interface. Either that or Apache Thrift. I'll probably go with the JSON API and switch to Thrift if the JSON overhead ends up being significant. I doubt that it will be though.

If you do decide you want me to implement it, I'll need some guidance as to the best place to implement it.

Don't worry about it, I'll implement it some time in the future. There are several other concerns about future features that need to be kept in mind when building the daemon so it's not simple stuff.

Oh and nice to see you again, Valloric, good job with YCM :)

Hi to you too Kovid. :) I'm glad you like YCM and with some luck we'll have the issues you raised sorted out.

@kovidgoyal
Author

Sorry for the delay in responding, I knew I was going to have to write a big response to this issue thread and I couldn't find the time. Here goes:

No problem, we're all busy people.

One of the goals of YCM (as I understand it) is to be fast. Unfortunately, using jedi for completion means that on larger projects, that goal is not achieved, for several reasons:

I've used the YCM jedi_completer on large Python projects (bigger than Calibre) with great success. The first time you query for semantic completions in a Python file it takes a bit for Jedi to do its thing, but after that first completion invocation, it's pretty damn fast. At least in my experience.

It isn't simply the size, it's also the local context. So for example if
you have a file that imports nothing sitting in a huge project, there is
no penalty. However if you have files that happen to import a lot of
code and have a large completion space, then you start getting the slowdown.
The slowdown also seems to be correlated with the length of the session.
So the same file will not show a slowdown in the beginning, but if you
hack on it for a day or two without restarting vim, it starts getting
sluggish.

  1. Sometimes jedi takes several seconds to return completions in newly opened, large .py files. This means vim does not respond to key strokes for extended periods of time.

It will appear as if Vim's GUI thread has been blocked, but if you try to type it will "unblock". YCM polls the complete_check() Vim function to check for new key input while processing your completion request in the background; if new keystrokes are detected, the operation is aborted.

That does not match up with my experience. I see the GUI thread being
blocked for several seconds (not responding to keystrokes) fairly often
when completing for the first time in a large file (by large, I mean a
file that imports lots of other code). Note that if the GUI thread is
waiting on the GIL, keystroke events may be discarded, depending on the
type of vim. So for gvim running in X11 which uses an async protocol, if
the main thread is blocked the keystrokes will be discarded, so polling
complete_check() won't help.

This is a fairly complex issue, things that can influence it include the
version of python, for instance in python 3.2+ python threading was
refactored to ensure predictable release of the GIL after a specific
interval (50ms I think). On older versions the interpreter releases the
lock only after a fixed number of bytecode instructions. Unfortunately,
there is no way to know how long a particular bytecode instruction might
actually take to execute. The other major factor that could influence
this is the type of event loop, X11, console, xterm, whatever OS X uses
and so on.
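Which policy a given interpreter uses can be checked at runtime. The helper name `gil_release_policy` is made up for this sketch. For reference, CPython documents the default switch interval on 3.2+ as 0.005 seconds (5 ms) rather than 50 ms, and the default check interval on Python 2 as 100 bytecode instructions.

```python
import sys


def gil_release_policy():
    """Report how this interpreter decides when a thread must give up the GIL."""
    if hasattr(sys, "getswitchinterval"):
        # Python 3.2+: time-based, value in seconds.
        return ("seconds", sys.getswitchinterval())
    # Older interpreters: count-based, value in bytecode instructions.
    return ("bytecodes", sys.getcheckinterval())
```

On 3.2+ the interval can be changed with `sys.setswitchinterval()`, although a thread stuck inside a long-running C call that holds the GIL will still not yield until that call returns.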

  2. I notice general sluggishness when moving line by line through text. This is probably because of completion that is triggered and aborted or because the GIL pauses the UI thread while the backend thread is working.

This sounds like this FAQ question. YCM doesn't do anything computationally intensive on cursor move at all. In fact, the only thing that is done on cursor move in normal mode is updating Syntastic diagnostics if you're in a C-family file. So in a Python file, literally nothing is done on cursor move. See the code for more details.

So this is highly unlikely to be caused by YCM. My money is on an out-of-date Syntastic version. Something about YCM rubs the old Syntastic the wrong way. I've heard the "slow when moving the cursor around" complaint about a dozen times now and every single time the issue was resolved by updating the user's copy of Syntastic.

No, I read the FAQ. I keep all my vim plugins updated from their github
repos with a cron job and in fact I have contributed to syntastic in the
past. Also, disabling YCM causes the sluggishness to go away,
re-enabling it causes the sluggishness to return. That does not prove it
is YCM (my vim setup is very complex, with a lot of plugins, so it could
be an interaction with something else). But it is highly suggestive.

This slowness remains even after disabling the CursorMoved and CursorHold autocommands from youcompleteme.vim, which leads me to believe it's a GIL issue, though why YCM should be doing background processing in normal mode is not clear to me.

You just proved it's not YCM code that's doing it. There are no other CursorMoved autocommands other than the ones at the top of autoload/youcompleteme.vim. If you commented out those autocommands, literally no YCM code is run on cursor move.

The size of the tags file is also a red herring. I've used a 30 MB tags file without any slowness. YCM processes the tags file in a background thread in the C++ code (with the background thread created and managed by the C++, not the Python), so not even the GIL could be an issue. YCM also processes your tags file only once until the next time the tags file is modified.

OK, I'll take your word for it, it's likely the same interaction as
before in that case. In any case since you plan to implement a daemon,
the question is moot. I can use a bit of local hackery to get around the
situation until you implement a proper daemonized backend.

  3. I tend to keep vim sessions open for days of continuous hacking, which means that memory consumption in the vim process goes to multi-gigabyte levels. Using a daemon means that the daemon could be restarted, without needing to restart the entire vim session.

I usually have a Vim instance open for weeks at work, so I'm in the same boat. I'm not usually bothered by the memory consumption, but I'll get to the daemon part below.

I agree it's not a major issue, except when I work on my laptop, which
at the moment is somewhat RAM constrained. Time to get a new one :)

A possible solution is to move completion activity into a separate daemon process

This has been on my mind for months now actually. Vim goes out of its way to block the GUI thread and it's hard to prevent it from strangling itself. So yes, YCM will be split into a client & server model. The server will probably be an actual HTTP server that exposes a JSON interface. Either that or Apache Thrift. I'll probably go with the JSON API and switch to Thrift if the JSON overhead ends up being significant. I doubt that it will be though.

That is great news, but might I ask, why an HTTP server? Do you envisage
people using the server over a network? There will be considerable
overhead in communicating with an HTTP server instead of using a simple
serialization protocol on a unix socket or even stdin/stdout.

Also, just a warning from my experience with the calibre server, finding
free ports to run a network server that binds to anything other than
localhost is pretty difficult, especially on windows with its antivirus
software. Of course, the userbase for YCM is likely to be somewhat more
technically educated than that for calibre.

Don't worry about it, I'll implement it some time in the future. There are several other concerns about future features that need to be kept in mind when building the daemon so it's not simple stuff.

Excellent, I will await this work with great anticipation. I have a fair
bit of experience writing python daemons on Unix, so feel free to ping
me if you want to discuss something in the design/implementation.

@Valloric
Member

OK, I'll take your word for it, it's likely the same interaction as
before in that case. In any case since you plan to implement a daemon,
the question is moot. I can use a bit of local hackery to get around the
situation until you implement a proper daemonized backend.

The planned daemon does make it moot, but I'm convinced the issue is solvable without it. If you commented out the autocommands and you still get lag on cursor move, then something in your Vim configuration is interacting badly with YCM in some way. The autocommands at the top of the file you were looking at are the only ones in YCM.

You can further test out your theory that it's caused by GIL contention by putting the following in your vimrc: let g:ycm_filetype_specific_completion_to_disable = {'python': 1}. This will disable the jedi completer. Also comment out the CursorMoved autocommands. These two things together will turn off all YCM code that runs on cursor move and all the Python background threads that might be running. The identifier completer uses C++ bg threads so that's never going to touch the GIL.

If you still get lag when moving the cursor, it's not YCM code that's doing it. As I mentioned, old Syntastic did something crazy (reasons unknown) when it was loaded with YCM. It could be something similar to that if you're sure your Syntastic is very recent.

That is great news, but might I ask, why an HTTP server? Do you envisage
people using the server over a network? There will be considerable
overhead in communicating with an HTTP server instead of using a simple
serialization protocol on a unix socket or even stdin/stdout.

Like I mentioned, there are other concerns in play. I'd like to have one server on my workstation and then have multiple Vim/Emacs/SublimeText instances connect to it (or one per instance, user-configurable). The HTTP server will be a proof-of-concept of this. It's super generic, low on maintenance and easy to connect to from various languages and on every platform (Windows doesn't have unix sockets for instance). It will be more overhead, sure, but I'll hold off on the optimization until I know it's not premature. Apache Thrift would give me the same feature set but with more performance at the cost of some ease-of-use and maintenance burden.

But again, I'm going to keep it simple until the need arises to make it complex.

The flexibility of possibly running the server over the network might come in handy for some other use cases I have in mind.
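A rough sketch of what such an HTTP server with a JSON interface could look like, using only the Python standard library. Everything specific here is an assumption made for illustration: the `/completions` path, the `prefix` and `completions` field names and the word list are invented and are not the API that was eventually shipped. The server binds to 127.0.0.1 on port 0, so the OS picks a free port, and it runs in a thread only to keep the example self-contained; the real design would run it as a separate process.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

WORDS = ["complete", "completion", "completer", "daemon"]  # stand-in data


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/completions":  # hypothetical endpoint
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        query = json.loads(self.rfile.read(length).decode("utf-8"))
        prefix = query.get("prefix", "")
        matches = [word for word in WORDS if word.startswith(prefix)]
        body = json.dumps({"completions": matches}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server():
    """Bind to localhost on an OS-chosen free port and serve in the background."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def complete(port, prefix):
    """Client side: POST a JSON query and decode the JSON answer."""
    request = urllib.request.Request(
        "http://127.0.0.1:%d/completions" % port,
        data=json.dumps({"prefix": prefix}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Ignore any proxy settings so the request really goes to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))["completions"]
```

The client would read the chosen port from `server.server_address[1]`; in a real setup the editor plugin would need some way to learn that port from the server process it launched.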

@oblitum
Contributor

oblitum commented Jun 11, 2013

About sluggishness: beware of the new regexpengine, enabled by default in the latest Vim sources. I came across unreasonable sluggishness while moving the cursor because of it; switching to the old engine (set regexpengine=1) brought everything back to normal speed, free of flickers.

@kovidgoyal
Author

You can further test out your theory that it's caused by GIL contention by putting the following in your vimrc: let g:ycm_filetype_specific_completion_to_disable = {'python': 1}. This will disable the jedi completer. Also comment out the CursorMoved autocommands. These two things together will turn off all YCM code that runs on cursor move and all the Python background threads that might be running. The identifier completer uses C++ bg threads so that's never going to touch the GIL.

If you still get lag when moving the cursor, it's not YCM code that's doing it. As I mentioned, old Syntastic did something crazy (reasons unknown) when it was loaded with YCM. It could be something similar to that if you're sure your Syntastic is very recent.

I did do that, (it was what prompted my second post in this thread), my results were:

  1. Disabling YCM, lag goes away
  2. Disabling the python completer, lag goes away
  3. Disabling the python completer, enabling the tags based completer,
    lag returns (less severe, but still present), for large tags files.

Don't worry about it, as I said the daemon makes it moot, there's no
point wasting time investigating an issue that will go away eventually.

Until you get around to implementing it, I can work around the problem by
just disabling the semantic completer and/or restarting vim when it gets
slow. It's a little annoying, but I can live with it.

@kovidgoyal
Author

Just FYI, I managed to fix the slow hjkl movement in normal mode, you were right, it had nothing to do with YCM (well YCM exacerbated the underlying problem, which was why I initially suspected it). Apologies for the false alarm.

The slowness on completion issue remains, but as we discussed that will be fixed by the eventual move to a daemon.

@Valloric
Member

Valloric commented Oct 9, 2013

A new version of YCM that's been split into client-server model is ready for testing. More information in this ycm-users thread. Please take it out for a test drive!

@kovidgoyal
Author

Awesome thanks, I'll jump on it in a few days, when my workload clears a bit.

@kovidgoyal
Author

After a few minutes of editing this file: https://github.com/kovidgoyal/calibre/blob/master/src/calibre/gui2/tweak_book/file_list.py

ycmd.py pegs the CPU at 100% and leaks memory continuously at about 20 MB/s

Vanilla YCM works normally with that file. If there is some more debugging I can do, let me know.

Vim 7.4.52 on Linux (Gentoo, amd64), with YCM installed as follows:

git clone https://github.com/Valloric/YouCompleteMe.git
cd YouCompleteMe
git checkout --track origin/ycmd
git submodule update --init --recursive
./install.sh

@Valloric
Member

@kovidgoyal Could you open a new issue for that? Thanks.

bijancn pushed a commit to bijancn/YouCompleteMe that referenced this issue Jul 26, 2016
[READY] Fix wrong constant used in Typescript completer

It sets a timeout of 100 seconds, which is a little long.

Should not conflict with PR ycm-core#358.

bijancn pushed a commit to bijancn/YouCompleteMe that referenced this issue Jul 26, 2016
[READY] Making ycmd compatible with Python 3

This one was a doozy.

The `futurize` tool helped, but it probably accounted for 1% of the work. The `str` vs `bytes` vs `unicode` thing affects _everything_ and all of it needed to be handled manually, so this PR touches practically every single file in the repo. As such, **until we merge this PR, no other code-changing PR should be merged.** We need to get to a state where we have green Travis for Python 3 before any other PRs start landing.

Note that this new ycmd can't be used with YCM (yet) because of `ycm_client_support.so` and all the old hackery around it (have I mentioned I hate YCM's omnicompleter?). Everything should work once I port YCM to be Python 3 compatible; in the meantime, we can play around with ycmd using `example_client.py` (which now only works on Python 3 BTW).

Note that YCM will need to have special logic so that ycmd gets built with support for the same version of Python that is compiled into Vim. YCM that runs in Python 2 inside Vim won't work with a ycmd running in  Python 3 (and vice-versa). Fun times ahead.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 22, 2021