Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Run tasks" calls same Task Provider repeatedly. #7496

Open
tsmaeder opened this issue Apr 3, 2020 · 6 comments
Open

"Run tasks" calls same Task Provider repeatedly. #7496

tsmaeder opened this issue Apr 3, 2020 · 6 comments
Labels
bug bugs found in the application performance issues related to performance tasks issues related to the task system

Comments

@tsmaeder
Copy link
Contributor

tsmaeder commented Apr 3, 2020

Description

I have written a VS Code extension that registers a task provider for type "foobar" and prints to the console when "provideTasks()" is called.
I have a workspace where I configure 5 tasks of type "npm" in /.theia/tasks.json. When I now run the command "Run Task", the console output shows that "provideTasks" is called 6 times before the dialog for selecting the task to run is opened.
In general, the number of times "provideTasks()" is called is n+1, where n is the number of tasks in "tasks.json".

@tsmaeder tsmaeder added bug bugs found in the application performance issues related to performance tasks issues related to the task system labels Apr 3, 2020
@tsmaeder
Copy link
Contributor Author

tsmaeder commented Apr 3, 2020

The culprit seems to be here:

const detected = await this.providedTaskConfigurations.getTaskToCustomize(cus, rootFolder);

getTaskToCustomize() in turn calls getTasks(), which unconditionally fetches tasks from providers. We should really always use cached contributed tasks on only refetch when some action explicitly asks for it.

@akosyakov
Copy link
Member

cc @elaihau

@elaihau elaihau self-assigned this Apr 7, 2020
elaihau added a commit that referenced this issue Apr 9, 2020
- detected tasks are fetched N+1 times when users click "Run Tasks" from
the menu, where N is the number of customized detected tasks in the
tasks.json.

- fixes #7496

Signed-off-by: Liang Huang <[email protected]>
elaihau added a commit that referenced this issue Apr 16, 2020
- detected tasks are fetched N+1 times when users click "Run Tasks" from
the menu, where N is the number of customized detected tasks in the
tasks.json.

- fixes #7496

Signed-off-by: Liang Huang <[email protected]>
@elaihau elaihau removed their assignment Apr 26, 2020
@tsmaeder
Copy link
Contributor Author

tsmaeder commented Jun 8, 2020

Thinking about this (and my approach for a fix in #7633) a bit more...
Obviously, we would like to fetch tasks from task providers as few times as possible. This is because fetching them has a non-negligible performance impact (which brought me to file this issue).
There are two aspects to this issue: first: at which points in time can the set of contributed tasks change. And the second is: what are the consequences if we work with an outdated set of tasks.
The first question is easy to answer: any time. Since there are no restrictions on the implementation of task providers, we could implement one that returns a new set of randomly generated tasks each time we ask it. So we can never be 100% sure that the tasks we're working with are the correct set of tasks or that the task we have in hand have all their prerequisites met. That implies that when we execute a contributed task, we must check the prerequisites and fail the task when they are not met. This does not mean we have to explicitly check the preconditions: building an npm module that has been deleted, for example, may simply return a non-zero exit code.
However, often contributed tasks depend on the state of the project we're working on. That state is typically represented as files on disk. So while Theia is executing action on behalf of the user the set of tasks is unlikely to change. Also, the set of tasks is unlikely to change from second to second: typically the set of tasks changes because of something the user does, like checking out a project from git.
It turns out the second question has already been answered: since we never can be sure to have the correct set of contributed tasks, there are no negative consequences to working with outdated tasks: the only negative consequences are that we are not fulfilling user expectations. These expectations are that the set of tasks he wants to execute is available and that tasks presented by the system execute successfully.
So if there is no right "scope" we can use for keeping tasks ("caching"), what would be a reasonable scope? I would propose that a single user command ("Run Task...") would be that scope: we would fetch the set of tasks once per user action and keep the same set of tasks until that user command completes.

@tsmaeder
Copy link
Contributor Author

tsmaeder commented Jun 8, 2020

In principle, there are two ways to establish such a scope:

  1. Cache invalidation
    This is the approach used in my PoC PR: we keep an implicit scope somewhere and when we start a user action, we invalidate that cache. The advantage is that we don't have to pass the set of tasks around. The disadvantage is that we may miss a call site where we should invalidate the cache so that we work with old tasks.
  2. Explicit Scope
    The approach is to never keep the set of tasks as state, but to fetch tasks once and have it as a required parameter to be passed around wherever the tasks are needed. The advantage is that ensure that a token is obtained by all clients of the task service: thus we are sure that no-one works with out-of-date tasks by accident. Downside is that this change is syntactically breaking. Note that with the implicit cache, the change is still behaviourally breaking, event though it still compiles.

@tsmaeder
Copy link
Contributor Author

A first try was based on cache invalidation: #7633

@tsmaeder
Copy link
Contributor Author

The second one it based on an explicit token.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug bugs found in the application performance issues related to performance tasks issues related to the task system
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants