Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the SCM Plugin API from the latest vscode #9045

Merged
merged 1 commit into from
Mar 15, 2021
Merged

Conversation

vinokurig
Copy link
Contributor

@vinokurig vinokurig commented Feb 9, 2021

Signed-off-by: Igor Vinokur [email protected]

What it does

Update the SCM Plugin API methods from the latest vscoe.

related CQ: http://dev.eclipse.org/ipzilla/show_bug.cgi?id=23027
fixes #7953

How to test

  1. Remove "@theia/git": "^1.10.0" from examples/browser/package.json file.
  2. Download vscode-git-1.49.3 and copy it to the plugins folder.
  3. Start Theia and open a project cloned from GitHub e.g. https://github.com/vinokurig/environment-variable.git
  4. Open browser logs console and make some git operations with the help of the scm view.
    See: no repeating root ERROR Child node 'file-change-tree-root' does not belong to this 'file-change-tree-root' tree. errors.

Review checklist

Reminder for reviewers

@vinokurig vinokurig added the scm issues related to the source control manager label Feb 9, 2021
@vinokurig vinokurig changed the title Update SCM Plugin API from the latest vscode Update the SCM Plugin API from the latest vscode Feb 10, 2021
@westbury
Copy link
Contributor

@vinokurig I tried this out with vscode-git-1.49.3 and https://github.com/westbury/theia which has a couple of outstanding PRs. I did get some tree errors in the console, though not many. They occurred when I committed changes. So I can understand this PR better, can you explain why those Child node 'file-change-tree-root' does not belong to this 'file-change-tree-root' tree errors are occurring and why it is fixed by the code changes in this PR.

Screenshot from 2021-02-11 00-08-28

@vinokurig
Copy link
Contributor Author

@westbury The file-change-tree-root' does not belong to this 'file-change-tree-root error is caused by repeating requests to update the SCM tree. In the current master SCM plugin API doesn't filter the git events and send requests to update the SCM tree on every change by the vscode git plugin. That's why we see a lot of repeating errors.
But even with this updated SCM API we still have double update event on a git index update. First one is to update the tree and the second one is to update the the status bar commands. The SCM API is designed to send a general update event (this.onDidChangeEmitter.fire()) on any type of update:

updateSourceControl(features: SourceControlProviderFeatures): void {
this.features = { ...this.features, ...features };
this.onDidChangeEmitter.fire();
if (typeof features.commitTemplate !== 'undefined') {
this.onDidChangeCommitTemplateEmitter.fire(this.commitTemplate!);
}
if (typeof features.statusBarCommands !== 'undefined') {
this.onDidChangeStatusBarCommandsEmitter.fire(this.statusBarCommands!);
}
}

I've added a debounce to the SCM tree listener:
this.toDisposeOnRepositoryChange.push(provider.onDidChange(() =>
debounce(() => this.root = this.createTree(), 200)())
);

So after the fixup there is no file-change-tree-root' does not belong to this 'file-change-tree-root errors at all.

Copy link
Member

@azatsarynnyy azatsarynnyy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've checked it and I can confirm that there are no file-change-tree-root-related errors anymore in the browser's console. Thanks for fixing that annoying bug!

@vinokurig
Copy link
Contributor Author

@vince-fugnitto @westbury I am going to merge it in a few days if there are no objections?

@westbury
Copy link
Contributor

@vince-fugnitto @westbury I am going to merge it in a few days if there are no objections?

Yes, sorry I did not get back sooner. I am a little concerned that this solution is rather trying to hide the problem rather than fix it. The change to debounce the two calls into one seems to be a bit of a hack, apart from the fact that this compromises performance by adding a 200ms delay. Would it not be better to fix it so that a second call to repository onDidChange works without error, and preferably is efficient when there are no changes? I realize we are constrained by the plugin API.

I don't understand this well, and I could well be wrong. Can you wait till Monday 22nd so I can look at this in more detail? I think you have done great work identifying the problem and possible solutions, as this has been around for a while and no one had fixed it.

@vinokurig
Copy link
Contributor Author

@westbury

Can you wait till Monday 22nd so I can look at this in more detail?

Yes, sure.

@westbury
Copy link
Contributor

@vinokurig It would be better if we just stopped generating errors when createTree is called before the previous createTree completed. You can see how this works by using a different id each time, i.e. in the createTree method in scm-tree-model.ts, replace the id with something like id: 'file-change-tree-root' + (++this.suffix),. If you do this, and remove the debounce, then you will not see the error. We should be sure that the previous createTree stops when the next one is called (unless we can be more specific about what is dirty).

Also, if we were to stick with the debounce solution, we should be using throttle. We then get the refresh straight away, with a guarantee that the next createTree won't be called until the previous call has finished (or at least it won't be called until 200ms has passed, by which time we are just hoping that in most environments the previous call has finished). Another option would be to use a promise, like 'defer', so the call to createTree will be made as soon as the previous one completed.

@vinokurig
Copy link
Contributor Author

@westbury

@vinokurig It would be better if we just stopped generating errors when createTree is called before the previous createTree completed. You can see how this works by using a different id each time, i.e. in the createTree method in scm-tree-model.ts, replace the id with something like id: 'file-change-tree-root' + (++this.suffix),. If you do this, and remove the debounce, then you will not see the error. We should be sure that the previous createTree stops when the next one is called (unless we can be more specific about what is dirty).

Setting a unique id fixed the problem, Could you please take a look one more time?

@westbury
Copy link
Contributor

westbury commented Mar 1, 2021

@vinokurig Sorry, I did mention that if one uses a different root then the error would not occur. I was really trying to suggest alternative solutions that did not involve arbitrary delays, rather than say it was the complete solution. Making the change in your last commit on its own really is only disabling the message. As we would never have two root object instances with the same id, the test will never cause the message to be issued, and we may as well have just removed the message.

That check, that the root is current, is important because it causes the original tree refresh to terminate when a new root is set. By returning from setChildren early, it is bypassing the fireNodeRefreshed call below. That call ultimately results in expanded nodes being refreshed recursively. We do want to be sure we continue to stop the refresh of the old tree.

I do note that the return value from the refresh (undefined if the root node changed) is not being checked. If you look at TreeExpansionServiceImp.init(), all children continue to be refreshed even after a refresh failed. That may be another reason why one occasionally sees the message appear many times together (especially for non-SCM trees). If a new root was set while processing a node with many children then a message will appear for each child.

IMO, I think just removing the console log is the best solution. A root change is a known situation which is not a problem. It does not have to be logged, just handled properly which I believe it is. Personally I would add some comments to explain that a root change is ok and we just have to stop refreshing the old tree. @akosyakov does this sound right to you?

@vinokurig vinokurig mentioned this pull request Mar 2, 2021
1 task
@vinokurig
Copy link
Contributor Author

@akosyakov What do you think about removing the error logging?

if (this.nodes[root.id] && this.nodes[root.id] !== root) {
console.error(`Child node '${parent.id}' does not belong to this '${root.id}' tree.`);
return undefined;
}

@vinokurig
Copy link
Contributor Author

@westbury @vince-fugnitto @akosyakov So what's the plan? Is it OK to remove the console log when tree root is changed?

@vince-fugnitto
Copy link
Member

@westbury @vince-fugnitto @akosyakov So what's the plan? Is it OK to remove the console log when tree root is changed?

@vinokurig I do not yet fully understand the code in this area to comment if the log is no longer necessary but I feel like it is.
The code was introduce to fix infinite recursion in the tree (5b45e12), and it is something that is still present with #8911 (comment) (we actually end up running out of memory). If not for this message we probably would not identify such issues?

From the commit:

It causes 2 parallel refreshes and with bad timing on workspaces with multiple roots can lead to infinity loop and eventually max call stack error. This resolves the issue generically for trees by allowing parallel refreshes without resetting a root.

@tsmaeder
Copy link
Contributor

tsmaeder commented Mar 5, 2021

I did mention that if one uses a different root then the error would not occur.

@westbury you wouldn't get the error, but we would still have multiple tree updates going on in parallel 🤷 .
@vinokurig I have to agree with Nigel here that this PR will make the problem not show up, but it will still be there. I haven't looked through all changes, but if we merge the PR, we certainly shouldn't close the issue. The problem will still occur anytime someone sets a value for Tree.root before a previous update has not finished.
I think wee need to fix this in the Tree implementation. In principle, there are only two ways to handle this: you wait with the second update or you cancel the first one. We also need to think about calls to refresh() that are not starting at the root. Parallel updates to unrelated subtrees are fine.
I think the performance problems I'm seeing have to do with the fact that we're refreshing the tree hundreds of times in parallel.

I would vote we merge this PR: it improves the scm plugin implementation and makes Theia better (for me at least). We should fix the underlying problems, but I don't have a ready solution for that.

@vinokurig
Copy link
Contributor Author

@westbury @vince-fugnitto @akosyakov
I've tried to fix the problem of multiple SCM update events like by using the onDidChangeResources event like in vscode, but it causes another problem: the SCM resources are not updated when the event is fired. It works in vscode because they listen to the onDidChange event as well and update the SCM groups separately: https://github.com/microsoft/vscode/blob/0657df6e82ff9e6900c32a7edffce9aa98879857/src/vs/workbench/contrib/scm/browser/scmViewPane.ts#L1069
I've reverted the hack which adds a timestamp to the SCM tree id, so in the current state the console error could be still reproduced, but there are no more repeating errors!
So what do you think about merging this PR without closing the issue and then update the SCM tree widget to be able to refresh groups separately?

@westbury
Copy link
Contributor

westbury commented Mar 9, 2021

@tsmaeder currently the refresh of the tree stops if something else sets a new root, as described in paragraph 2 of #9045 (comment) . The only way to get two refreshes running simultanously on the same tree is, as you said, if the second refresh in on a child node. I have not seen this happen, but I can see that if one expanded a node during a refresh then it could be possible. A possible fix would be to check not just that the root node is the same instance but also check all the ancestors up to the root node.

@vince-fugnitto I don't understand how it can be useful to keep the offending console logging. We know that another refresh can be started before the previous has finished. How is it useful to log that? The commit you referenced was a refactoring that fixed a problem where the root was actually being changed back if a root change was detected. As far as I know, this fix worked and I know of no remaining problems other than the child-node refresh problem. And when a refresh of a child-node occurs concurrently the console log will not be triggered.

I agree that it's fine to leave this as is so we can get this PR merged. It otherwise looks fine to me but I don't really understand all the changes in scm and scm-main. Either someone else needs to look at that code or I need to spend more time understanding it.

@vince-fugnitto
Copy link
Member

@vince-fugnitto I don't understand how it can be useful to keep the offending console logging. We know that another refresh can be started before the previous has finished. How is it useful to log that?

@westbury as I mentioned previously, I don't have the necessary knowledge in the area (unless I spend time to investigate) to determine if the logging is necessary or not, but I did encounter issues in #8911 where the logging seems to indicate a more serious error? I ended up in a a state where the log happened recursively until I ran out of memory. For such a case is the log not useful unless I'm mistaken?

image

@tsmaeder
Copy link
Contributor

tsmaeder commented Mar 10, 2021

@westbury the problem is not the console message. The problem is that performance really drops when we have a lot of refresh calls. Since Tree.refresh() is API, you can call it as many times as you want, even when you don't set a new root. Therefore, I don't think checking "is it the same root object" is a sufficient stop condition for parallel updates. De-duplicating refresh calls seems a reasonable thing to do: refresh on the same tree node is idempotent so you only ever need to keep one future refresh for the same node. When I see an API like "refresh()", I would expect it to be efficient if nothing has changed. I don't really see an indication that you shouldn't call it frequently or repeatedly and the implementation will figure out if nothing relevant has changed.
As I said this PR fixes the immediate problem, but there's work to be done in the future.

@vinokurig
Copy link
Contributor Author

@westbury

Either someone else needs to look at that code or I need to spend more time understanding it.

This PR has been already approved, may be you would like to suggest some one else to review this PR?

@westbury
Copy link
Contributor

@tsmaeder

the problem is not the console message.

But if I read the 'how to test' section of this PR, the removal of the console message is the requirement for testing of this PR to pass.

The problem is that performance really drops when we have a lot of refresh calls.

Yes, I don't disagree with that. I even commented that this could be fixed by checking all nodes up to the root and detecting if a later refresh has been performed on any of those nodes. But that is not the issue that this PR intended to cover.

@vince-fugnitto The console message may be useful but I thought many people were complaining about it filling up the logs. Was not the point that because the refresh can be called "frequently or repeatedly and the implementation will figure out if nothing relevant has changed", we should not be polluting the log when this happens.

I feel there is nothing we disagree on but are somehow just misunderstanding each other. Let's leave the issue open and comment in it the remaining performance issue. Let's leave the console log exactly as is, for now.

@vinokurig

This PR has been already approved, may be you would like to suggest some one else to review this PR?

I had understood that @azatsarynnyy had tested for console logs, but you still needed someone to review the code and verify the Plugin API side. We don't use the Git plugin so I did not consider this my area and I just thought I should clarify that so unreviewed code was not accidentally merged. I don't know anyone to suggest so I will review the Plugin API stuff later today or tomorrow.

Copy link
Contributor

@westbury westbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume everyone is ok with the drop of support for older Git plugins. I had a couple of minor comments/questions on the code but it looks good to me.

/**
* Diffs two *sorted* arrays and computes the splices which apply the diff.
*/
export function sortedDiff<T>(before: ReadonlyArray<T>, after: ReadonlyArray<T>, compare: (a: T, b: T) => number): Splice<T>[] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you need to add this function? I don't see this being used by the Theia codebase nor by the Git built-in or any other vscode build-in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the sortedDiff function to the scm.ts

/**
* Represents the validation type of the Source Control input.
*/
export enum SourceControlInputBoxValidationType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a shame this needs to be copied here too in order to expose it as proposed API. It means we have three copies of the same interface, here, theia-proposed.d.ts, and ScmInputIssueType in the frontend. I'm not sure what to suggest that might be better so that is ok.

@vinokurig vinokurig force-pushed the che-7953 branch 2 times, most recently from 5a6a4a0 to f276007 Compare March 15, 2021 13:56
@vinokurig vinokurig merged commit 4e75591 into master Mar 15, 2021
@vinokurig vinokurig deleted the che-7953 branch March 15, 2021 14:18
@github-actions github-actions bot added this to the 1.12.0 milestone Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scm issues related to the source control manager
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using GIT plugin generates an error on the console
5 participants