Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --parallel/--no-parallel option to download:plugins #11112

Conversation

xoriath
Copy link
Contributor

@xoriath xoriath commented May 6, 2022

The default behaviour is unchanged and is still --parallel,
however this change adds --no-parallel or --parallel false
as an option to not download every listed plugin in parallel.

We are seeing some issues both with ECONNRESET and ZlibBuffer
intermittently (probably due to network errors/throttling)

This fixes #11110

Signed-off-by: Morten Engelhardt Olsen [email protected]

What it does

Add new argument --parallel (default: true) which will either download plugins in serial or in parallel

How to test

There is no real user visible changes, the only difference is how quickly the target folder is created and the plugins extracted (parallel vs serial)

Review checklist

Reminder for reviewers

@xoriath xoriath force-pushed the add-no-parallel-option-to-download-plugins branch from 2f003c7 to f8fd03a Compare May 6, 2022 15:43
Copy link
Member

@vince-fugnitto vince-fugnitto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution 👍

In order to accept your changes please be sure to sure to sign the eclipse contributor agreement (eca) with the same email as your authorship.

@vince-fugnitto vince-fugnitto added theia-cli issues related to the theia-cli builtins Issues related to VS Code builtin extensions labels May 6, 2022
@xoriath
Copy link
Contributor Author

xoriath commented May 6, 2022

@vince-fugnitto I could have sworn... gimme a moment :)

@xoriath
Copy link
Contributor Author

xoriath commented May 6, 2022

hmmmm, I'm failing spectacularly at getting the ECA to validate...

@vince-fugnitto
Copy link
Member

hmmmm, I'm failing spectacularly at getting the ECA to validate...

I do see that there is a valid eca account for your email

image

And your authorship and sign-off match:

From f8fd03abbc2a953eb7919a3524bcdbb6058fb15a Mon Sep 17 00:00:00 2001
From: Morten Engelhardt Olsen <[email protected]>
Date: Fri, 6 May 2022 08:36:39 -0700
Subject: [PATCH] Add --parallel/--no-parallel option to download:plugins

The default behaviour is unchanged and is still --parallel,
however this change adds --no-parallel or --parallel false
as an option to not download every listed plugin in parallel.

We are seeing some issues both with ECONNRESET and ZlibBuffer
intermittently (probably due to network errors/throttling)

This fixes #11110

Signed-off-by: Morten Engelhardt Olsen <[email protected]>

So perhaps it's just the tool taking time to update.

@xoriath
Copy link
Contributor Author

xoriath commented May 6, 2022

I sent an email to [email protected].. we'll see :)

dev-packages/cli/src/theia.ts Outdated Show resolved Hide resolved
dev-packages/cli/src/download-plugins.ts Outdated Show resolved Hide resolved
@xoriath xoriath force-pushed the add-no-parallel-option-to-download-plugins branch from f8fd03a to a44ad3a Compare May 6, 2022 16:32
@xoriath
Copy link
Contributor Author

xoriath commented May 6, 2022

ECA check fixed...

The default behaviour is unchanged and is still --parallel,
however this change adds --no-parallel or --parallel false
as an option to not download every listed plugin in parallel.

We are seeing some issues both with ECONNRESET and ZlibBuffer
intermittently (probably due to network errors/throttling)

This fixes eclipse-theia#11110

Signed-off-by: Morten Engelhardt Olsen <[email protected]>
@xoriath xoriath force-pushed the add-no-parallel-option-to-download-plugins branch from a44ad3a to 547e2d8 Compare May 6, 2022 20:56
@ccorn
Copy link

ccorn commented May 7, 2022

For testing, I have locally merged #11113 and this branch at 547e2d8.

My test command:

rm -rf plugins; yarn download:plugins --no-parallel

Improvement: This no longer silently ignores extensions for which no compatible version can be found, but throws:

--- resolving 1 extension-packs ---
Error: No download url for extension pack ms-vscode.js-debug-companion (undefined)
    at /home/ccorn/BUILD/theia/dev-packages/cli/lib/download-plugins.js:83:27
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
    at async Promise.all (index 72)
    at async downloadPlugins (/home/ccorn/BUILD/theia/dev-packages/cli/lib/download-plugins.js:75:46)
    at async Object.handler (/home/ccorn/BUILD/theia/dev-packages/cli/lib/theia.js:206:13)

Which is actually true: The earliest version has engine.vscode == "^1.55.0", which is later than VSCODE_DEFAULT_API_VERSION = 1.53.2.

Therefore I have added ms-vscode.js-debug-companion to theiaPluginsExcludeIds.

The bad news: I still get lots of parallel connections and ECONNRESET:

--- resolving 1 extension-packs ---
Error: socket hang up
    at connResetException (internal/errors.js:609:14)
    at TLSSocket.socketOnEnd (_http_client.js:458:23)
    at TLSSocket.emit (events.js:326:22)
    at endReadableNT (_stream_readable.js:1241:12)
    at processTicksAndRejections (internal/process/task_queues.js:84:21) {
  code: 'ECONNRESET'
}

Network traffic continues long after that, and even after network traffic has ended,
the command does not terminate, with lsof -i listing between 3 and 7 remaining open node connections. This behavior seems to be the same as of the v1.25.0 release.

Worse: After manually aborting, the plugins directory only contains the extensions listed in theiaPlugins, nothing more. Where have the pack-bundled extensions gone? I do not recall similar behavior of the v1.25.0 release.

My workaround for the v1.25.0 release was to replace the extension pack with an explicit list in theiaPlugins, as in previous releases.

So, no success yet, but improvement in error checking.
The problem (still) is in resolving the extension packs.

@ccorn ccorn mentioned this pull request May 7, 2022
1 task
@vince-fugnitto
Copy link
Member

@ccorn I'm not sure I follow, the behavior should be the following:

  1. download all plugins listed under theiaPlugins.
  2. if an extension is an extensionPack, then resolve the extensions it declares:
    • the resolvement should download a version of the extension (listed by id) that satisfies the framework's declared supported api.
    • the resolvement of an extension should skip an extension if already downloaded from step 1.
    • the resolvement should not download an extension that is part of the exclusion list.

If you were to abort before the resolvement of the extensionPack then of course you would not see the extension under your folder.

My workaround for the v1.25.0 release was to replace the extension pack with an explicit list in theiaPlugins, as in previous releases.

As I mentioned previously, the theiaPlugins we define in this repo are for our purposes only. There is nothing different in the 1.25.0 release, the download:plugins script has not been updated for over 2 months now. This repo simply updated our own builtins for test purposes to make use of an extensionPack instead of listing each individually, but it does not affect you downstream.

@ccorn
Copy link

ccorn commented May 8, 2022

If you were to abort before the resolvement of the extensionPack then of course you would not see the extension under your folder

Just tried it again, with #11113 but without this merge request. This time it's a ZBufError:

--- resolving 1 extension-packs ---
Error: ZBufError
    at Gunzip.<anonymous> (/home/ccorn/BUILD/theia/node_modules/bent/src/nodejs.js:36:57)
    at Gunzip.emit (events.js:314:20)
    at Zlib.zlibOnError [as onerror] (zlib.js:185:8)

And I do have to abort again, but without #11112 at least some of the bundled plugins show up after the abort. With 547e2d8, none show up.

This repo simply updated our own builtins for test purposes to make use of an extensionPack instead of listing each individually

That is the point: Resolving extension packs uses another code path and/or a new scale of connection parallelism which turns out to be problematic. Well, even the usual downloading of theiaPlugins sometimes failed, but resolving of extension packs has never succeeded here, which is why I have reverted to an all-explicit plugin list, as in v1.24.0 (but with ID-based names where the extension pack would have created those).

@ccorn
Copy link

ccorn commented May 8, 2022

More experimenting: I have listed all bundled plugins explicitly in theiaPlugins, but kept the extension pack listed as well. Thus, resolving the extension pack would be attempted, but should find all extensions downloaded already. Would the script succeed in this case?

No. Both without and with this merge request (547e2d8), the steps logged as --- downloading plugins --- and --- collecting extension-packs --- went well (and the plugins directory populated accordingly); then in --- resolving 1 extension-packs --- I still got a ZBufError or ECONNRESET as reported before. Again, I noticed lots of network traffic, and after that traffic had died down, I still had to abort the script.

Without 547e2d8, many (but not all) messages like
- [extension ID]: already downloaded - skipping were output.
With 547e2d8, no such messages appeared.

@colin-grant-work colin-grant-work self-requested a review May 10, 2022 14:14
@xoriath
Copy link
Contributor Author

xoriath commented May 11, 2022

So, I don't have a problem by just blindly doing the talking to open-vsx in serial... let me add a commit you can try for that...

To not send all the search requests against OVSX
in one big promise, just resolve them slowly one-by-one.

The download itself is still heeding the --parallel/--no-parallel
setting.

Signed-off-by: Morten Engelhardt Olsen <[email protected]>
@ccorn
Copy link

ccorn commented May 11, 2022

So, I don't have a problem by just blindly doing the talking to open-vsx in serial... let me add a commit you can try for that...

Thanks, I will try it! But be prepared for feedback delays measured in hours, in case your changes succeed in making the includeAllVersion queries run to completion here.

@ccorn
Copy link

ccorn commented May 12, 2022

OK, I locally merged #11112 (8b22b76) with #11113 (51a7899), and time yarn download:plugins --no-parallel succeeded, running to completion without errors.

Not only did it succeed, but it did so within under 30 minutes.
This is remarkable because when I (earlier) used a sequence of curl commands to simulate the API requests with includeAllVersions and store the responses, that took 186 minutes (yes, 3h) for 73 non-excluded plugins with a total response size of 1,323,992,876 bytes.
(I now suspect that Theia's code uses compression for transfers, which in this case is ideally suited because of the high redundancy in the version info.)

I solemnly declare that this fix works for me.
You probably want to polish this, so I'll keep watching.

(Of course, the gross amount of data transferred for version-checking is still good material for a precious rant, but not within the scope of this pull request.)

@colin-grant-work
Copy link
Contributor

(Of course, the gross amount of data transferred for version-checking is still good material for a precious rant, but not within the scope of this pull request.)

I'd encourage taking the rant to the Open VSX repository, since it seems that they have structured their responses to include such a massive amount of redundant information.

@xoriath
Copy link
Contributor Author

xoriath commented May 12, 2022

yeah, that's for OpenVSX :) However it does look like doing the queries against OpenVSX in parallel does not really gain much w.r.t time, so I don't actually see much issue doing this resolving stage in serial... Not sure if you have any differing opinion, @colin-grant-work / @vince-fugnitto ?

@colin-grant-work
Copy link
Contributor

colin-grant-work commented May 12, 2022

No, I agree - if the user chooses to do things serially, we should try to do everything that we can serially, including the queries to Open VSX about versions.

This plugin does not have a download url, which is now an error where it before was a warning.
@xoriath
Copy link
Contributor Author

xoriath commented May 12, 2022

But what if he chooses parallel... In my mind resolving this could also go in serial then just to make the code easier to follow, and I at least see no speed improvement in the async version of the queries... Granted I don't have a setup with a whole lot of extension packs but...

@xoriath
Copy link
Contributor Author

xoriath commented May 17, 2022

@vince-fugnitto / @colin-grant-work do you guys think more is needed on this PR?

@colin-grant-work
Copy link
Contributor

@xoriath, apologies for the delay. I've been slowly chewing on the question of whether to run the Open VSX requests in parallel or sequence. Let me take another look this afternoon and I'll get back to you.

@xoriath
Copy link
Contributor Author

xoriath commented May 17, 2022

No rush, @colin-grant-work , I've just pondered and ended up that changing the queries back to parallel (or make them actually heed the new switch) is... a bit more meh... So at least in my mind I would like to put this in and see if we alleviate the issues that we have. If resolving against open-vsx is too slow, then deal with that in round 2.

However it looks like eclipse/openvsx#379 is discussing the data response format in itself.

Copy link
Contributor

@colin-grant-work colin-grant-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be working as promised. 👍 If necessary, we can modify the VSX calls to occur in sequence as well separately.

@xoriath xoriath requested a review from vince-fugnitto May 19, 2022 17:12
@ccorn
Copy link

ccorn commented May 19, 2022

The latest commit (f1f66c1) in this PR, while needed for testing, would be duplicating part of #11113, so I think you should merge 8b22b76 instead or update #11113 accordingly.

I also think it is useful to keep the two essential commits (--no-parallel and serial version query) separate (i. e. no squashing). Because if OpenVSX succeeds in reducing their response size by a factor of 30, then the (reduced) total time needed could be dominated by connection buildup latencies, and then you might want to retry parallel version queries by tentatively reverting only 8b22b76 and keeping the --no-parallel commit.

@colin-grant-work colin-grant-work merged commit 8ba808a into eclipse-theia:master May 25, 2022
@xoriath xoriath deleted the add-no-parallel-option-to-download-plugins branch May 25, 2022 16:45
colin-grant-work pushed a commit that referenced this pull request May 25, 2022
@vince-fugnitto
Copy link
Member

I think we should either fix the implementation or revert the commit for now since we only respect the parallel flag for the initial download, but we do not respect it for the resolution of packs and dependencies which are now handled sequentially.

@colin-grant-work
Copy link
Contributor

I've put up #11201 which respects parallel / no-parallel for all parts of the download process.

@ccorn
Copy link

ccorn commented May 25, 2022

@colin-grant-work: Shouldn't the revert take place on the master branch? To me this looks as if the master branch has the changes, whereas this PR branch has them reverted, and I suppose it should be the other way around.

@colin-grant-work
Copy link
Contributor

colin-grant-work commented May 25, 2022

My PR is not a reversion of this PR - just an adjustment. In particular, this PR made it so that all plugin version information was resolved before any plugins were downloaded. That didn't change the logic, but it did mean that there was no visible feedback for a long period while calls to Open VSX were made to determine the versions to be downloaded. My PR changes that sequence so that for each plugin, we determine the version, then download it, ensuring that we get some feedback for each plugin as we go and avoiding the impression of long hang.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
builtins Issues related to VS Code builtin extensions theia-cli issues related to the theia-cli
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: Limit simultaneous plugin downloads
4 participants