Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid indexing the workspace for single-file mode #13770

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

dhruvmanila
Copy link
Member

@dhruvmanila dhruvmanila commented Oct 16, 2024

Summary

This PR updates the language server to avoid indexing the workspace for single-file support.

What's a single-file support?

When a user opens the file directly in an editor, and not the folder that represents the workspace, the editor usually can't determine the workspace root. This means that during initializing the server, the workspaceFolders field will be empty / nil.

Now, in this case, the server defaults to using the current working directory which is a reasonable default assuming that the directory would point to the one where this open file is present. This would allow the server to index the directory itself for any config file, if present.

It turns out that in VS Code the current working directory in the above scenario is the system root directory / and so the server will try to index the entire root directory which would take a lot of time. This is the issue as described in astral-sh/ruff-vscode#627. To reproduce, refer astral-sh/ruff-vscode#627 (comment).

This PR updates the indexer to avoid traversing the workspace to read any config file that might be present. The first commit (8dd2a31) refactors the initialization and introduces two structs Workspaces and Workspace. The latter struct includes a field to determine whether it's the default workspace. The second commit (61fc39b) utilizes this field to avoid traversing.

Closes: #11366

Test Plan

When using the ruff server from this PR, we see that the server starts quickly as seen in the logs. Next, when I switch to the release binary, it starts indexing the root directory.

@dhruvmanila dhruvmanila added the server Related to the LSP server label Oct 16, 2024
@dhruvmanila dhruvmanila force-pushed the dhruv/single-file-support branch 2 times, most recently from ea90647 to 2166588 Compare October 16, 2024 09:13
@dhruvmanila dhruvmanila marked this pull request as ready for review October 16, 2024 09:45
Copy link
Contributor

github-actions bot commented Oct 16, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new structs are great!

I'm not entirely sure that I understand the full extent of the change. Why are we skipping the current directory if it is the default workspace? Doesnt' that have an impact for non vs code users? Would you mind explaining in more detail how this works in today and how it affects different editors?

Comment on lines 113 to 114
// Add any settings from above the workspace root. The ones from the workspace root will be
// added only if it's not the default workspace.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: What interests me is why we're skipping the root itself if it is the default workspace. I think that would also make the comment easier to understand if it explains why we sometimes skip the root vs when we consider it.

@dhruvmanila
Copy link
Member Author

I'm not entirely sure that I understand the full extent of the change. Why are we skipping the current directory if it is the default workspace?

We will consider any config file that's present in the current directory but not the ones that are present in any nested directories.

The client didn't provide any workspace during initialization therefore we use the current working directory as the default workspace.

Doesnt' that have an impact for non vs code users? Would you mind explaining in more detail how this works in today and how it affects different editors?

For VS Code, if the user opened only a single file in the editor then there won't be any workspace visible in the sidebar. The user can't open any files that are inside any of the nested directories. Here, the limitation would be that if there is any config file present in the directory containing the open file, that won't be considered. This is because the current working directory of the editor process is the system root directory.

For Neovim, if the user opened a single file directly via nvim test.py, then the client will try to find the root directory for it by looking for any of the relevant files like pyproject.toml, etc. For example, for ruff this is described at https://github.com/neovim/nvim-lspconfig/blob/541f3a2781de481bb84883889e4d9f0904250a56/lua/lspconfig/configs/ruff.lua#L3-L7. Now, if those files aren't present then the workspaceFolders will be empty during initialization where the server will assume the current working directory. In this case, the path would be same as that of the nvim process. The neat thing here is that if the user opened a file which is nested in the workspace and that workspace contains the root file, the Neovim client will send a request to add a new workspace folder that's pointing to the directory containing this new file. So, those settings will be considered by the server because they've been indexed.

@dhruvmanila
Copy link
Member Author

For Neovim,

With the following tree structure:

.
├── nested
│   ├── nested.py
│   └── pyproject.toml
└── test.py
   0.000071500s  INFO main ruff_server::server: No workspace(s) were provided during initialization. Using the current working directory as a default workspace: /Users/dhruv/projects/ruff-temp
   0.000151750s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp, using default settings
   0.004234667s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp
   0.004946542s DEBUG ruff:worker:3 ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/test.py
   0.005483375s  INFO     ruff:main ruff_server::server: Configuration file watcher successfully registered
   1.204698542s  INFO     ruff:main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp/nested
   1.246196250s DEBUG ruff:worker:4 ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/nested.py

While, if we remove the nested/pyproject.toml file, then:

   0.000400084s  INFO main ruff_server::server: No workspace(s) were provided during initialization. Using the current working directory as a default workspace: /Users/dhruv/projects/ruff-temp
   0.000877750s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp, using default settings
   0.012916500s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp
   0.014122625s  INFO ruff:main ruff_server::server: Configuration file watcher successfully registered
   0.014338834s DEBUG ruff:worker:0 ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/test.py
   1.608874959s DEBUG ruff:worker:2 ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/nested.py

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I think I understand now what's happening but the code in RuffSettingsIndex::new with the early return and the skip doesn't make it obvious what's happening.

I suggest that we add more extensive documentation why we don't skip the workspace directory if it is the default (because it won't take the recursive path).

We should also add a comment to the early return for the default workspace explaining why we exit early.

I'm still somewhat uncertain about the change itself because it means that we won't find any nested configurations for editors that don't pass a workspace root by default. The way I remember it is that VS Code is the exception and most other editors don't provide a workspace root. That would mean that this change is a breaking change for all those users, because their setup only continues to work if the ruff setting is in the workspace root (root of the cwd)

@dhruvmanila
Copy link
Member Author

I'm still somewhat uncertain about the change itself because it means that we won't find any nested configurations for editors that don't pass a workspace root by default. The way I remember it is that VS Code is the exception and most other editors don't provide a workspace root. That would mean that this change is a breaking change for all those users, because their setup only continues to work if the ruff setting is in the workspace root (root of the cwd)

Neovim provides workspace root, it uses files like pyproject.toml, ruff.toml, and .ruff.toml to find it. I can look into what other editors are doing.

@dhruvmanila
Copy link
Member Author

Sublime Text using the LSP implementation as https://github.com/sublimelsp/LSP provides the workspace folders during initialization. It seems to be similar to VS Code minus the multi-root workspace support. When I open a folder in the editor, it passes it as the workspace folder but if I just open a single file, then it doesn't provide any folder. But, unlike VS Code, the std::env::current_dir for Sublime Text when a single file is opened returns /Applications/Sublime Text.app/Contents/MacOS for me:

   0.000099083s  INFO main ruff_server::server: No workspace(s) were provided during initialization. Using the current working directory as a default workspace...
   0.009809375s  INFO main ruff_server::session::index: Registering workspace: /Applications/Sublime Text.app/Contents/MacOS

The conclusion is that it wouldn't be a breaking change in Sublime Text because it already behaves correctly and the server does unnecessary work by indexing.


Helix also behaves in a similar manner as Neovim as it determines the workspace root based on certain set of files: https://github.com/helix-editor/helix/blob/d1b8129491124ce6068e95ccc58a7fefb1c9db45/languages.toml#L861. And, it also sends a request when a nested folder is opened with a root file in it which Helix would consider as a workspace folder (similar to Neovim):

2024-10-16T22:07:44.889 helix_lsp::transport [INFO] ruff -> {"jsonrpc":"2.0","method":"workspace/didChangeWorkspaceFolders","params":{"event":{"added":[{"name":"nested","uri":"file:///Users/dhruv/projects/ruff-temp/nested"}],"removed":[]}}}

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for testing this. The case I'm concerned about is #10398 but what I understand from your comments is that neovim handles this correctly because it sets the cwd to the file's parent directory.

@dhruvmanila
Copy link
Member Author

Thanks for testing this. The case I'm concerned about is #10398 but what I understand from your comments is that neovim handles this correctly because it sets the cwd to the file's parent directory.

That issue seems unrelated to what's being changed here. Can you explain more on what your concern is about?

Regardless, I'll update the RuffSettingsIndex::new to be more clearer in what's happening using comments and possible code.

Comment on lines +492 to +496
tracing::info!(
"No workspace(s) were provided during initialization. \
Using the current working directory as a default workspace: {}",
current_dir.display()
);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also changed this from warn to info. I don't think it's a warning i.e., not something that the user should be concerned of.

@MichaReiser
Copy link
Member

Isn't it related because it introduced the fallback to the current working directory if no workspace path was provided and this PR changes the behavior for this exact use case?

@dhruvmanila
Copy link
Member Author

Isn't it related because it introduced the fallback to the current working directory if no workspace path was provided and this PR changes the behavior for this exact use case?

Yes.

So, for Neovim specifically, it might not pass a working directory which means that the server should be running in single-file mode where we don't need to traverse the current directory (the behavior as is in this PR). If a user opened any file that's nested inside the current directory and if it contained a config file, then Neovim will make sure to inform the server about it by adding a new workspace folder using DidChangeWorkspaceFolders notification.

What this means is that the server is doing extra work for single-file mode without this PR.

@dhruvmanila dhruvmanila changed the title Avoid indexing the workspace for single-file support Avoid indexing the workspace for single-file mode Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
server Related to the LSP server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Native server indexes the entire project when no local config found
2 participants