Cache file handles rather than opening them each time (shadow filesystem) #447

krassowski · 2020-12-31T16:53:27Z

What are you trying to do?

The current virtual documents implementation opens file for read and then write for each document modification. This takes time, even for me running on Linux and SSD. This may be a bigger performance issue for Windows users, especially with HDD. We can cache the file handles in a Python dictionary.

What is new in your approach and why do you think it will be successful?

My preliminary benchmarking shows that opening file once reduces the entire operation from ~1 ms to ~0.7 ms. The differences are presumably more notable on non-SSD drives and on Windows with the default NTFS where operations are known to have a higher cost than on the default Linux and Mac file systems (sources: my own experience, this discussion from MS WSL project, countless SO posts).

Who cares? If you are successful, what difference will it make?

I expect that this small performance increase will compound to a noticeable improvement for users with HDD and Windows, especially when multiple LSP server queries are being sent in a row, e.g. when quickly typing code in the editor with autocomplete enabled, which is a critical feature and often pointed out as having poor performance by the users. Of note, the recent work on highlights debouncing and token checking reduced the number of queries sent to the LSP server when typing quickly allowing for better performance already (see #433).

What are the risks?

It requires a caching mechanism for EditableFile, including:

a way to close files upon exit: easy (atexit),
a way to purge closed files from cache (or risk a memory leak): hard
potential concurrency issues?

What are the mid-term and final “exams” to check for success?

The CI robot tests runs on average noticeably faster on Windows (~1% time decrease is a success)

The text was updated successfully, but these errors were encountered:

krassowski · 2020-12-31T18:24:50Z

Cache invalidation could happen on textDocument/didClose. We need to implement this anyway because we are not currently clearing the disk used by opened files (also, should ramisk get implemented as in #356, this would become a memory leak).

krassowski added the package:jupyter-lsp label Dec 31, 2020

krassowski changed the title ~~Cache file handles rather than opening them each time (virtual documents)~~ Cache file handles rather than opening them each time (shadow filesystem) Dec 31, 2020

krassowski mentioned this issue Dec 31, 2020

Implement caching of shadow virtual file handles #448

Closed

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cache file handles rather than opening them each time (shadow filesystem) #447

Cache file handles rather than opening them each time (shadow filesystem) #447

krassowski commented Dec 31, 2020 •

edited

Loading

krassowski commented Dec 31, 2020

Cache file handles rather than opening them each time (shadow filesystem) #447

Cache file handles rather than opening them each time (shadow filesystem) #447

Comments

krassowski commented Dec 31, 2020 • edited Loading

What are you trying to do?

What is new in your approach and why do you think it will be successful?

Who cares? If you are successful, what difference will it make?

What are the risks?

What are the mid-term and final “exams” to check for success?

krassowski commented Dec 31, 2020

krassowski commented Dec 31, 2020 •

edited

Loading