Download the list of packages before requesting from a registry #1375

s-ludwig · 2018-02-12T10:21:23Z

Currently when downloading package information, each registry is queried in sequence, failing and falling back to the next one if a package doesn't exist or isn't accessible due to any other kind of error.

This has a few implications:

Performance: A failing query usually takes about as long as a successful one, meaning that with multiple additional registries configured, the total time required for querying all packages can go up proportionally. Note that apart from the approach proposed here, this point can also be mitigated by allowing multiple packages to be queried at once.
Privacy: The current approach potentially gives away the names of the used packages to other registries than the one actually containing those packages. This can leak private information from an internal company network whenever the internal registry server is not available.
Bad error messages: It's uncertain whether a package really doesn't exist on a particular server, or if there is some kind of more general error condition. If downloading the package list fails, it is a clear indication of the latter. Also, with the list of packages cached locally, the error message can even list a specific error for the registr(y/ies) that are supposed to contain the package in question, instead of just "Couldn't be found on any of the registries".
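
The list-first lookup described above can be sketched roughly as follows. This is an illustration in Python, not dub's actual (D) code: the `Registry` class, its fields, and the `resolve` function are hypothetical names, and registries are modeled as in-memory data rather than real HTTP endpoints.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registry:
    """Hypothetical stand-in for a configured registry."""
    name: str
    # Package names from the downloaded list; None means the list
    # download itself failed (registry unreachable or misconfigured).
    package_list: Optional[set]


def resolve(package: str, registries: list) -> tuple:
    """Return (registries to query, per-registry diagnostics)."""
    candidates = []
    diagnostics = []
    for reg in registries:
        if reg.package_list is None:
            diagnostics.append(f"{reg.name}: package list unavailable")
        elif package in reg.package_list:
            candidates.append(reg.name)
        else:
            diagnostics.append(f"{reg.name}: package not listed")
    return candidates, diagnostics


registries = [
    Registry("internal", None),                 # currently unreachable
    Registry("public", {"vibe-d", "diet-ng"}),
]

# A public package is requested only from the registry that lists it.
public_hit = resolve("vibe-d", registries)

# An internal-only package name is never sent to the public registry,
# and the diagnostics say why each registry was skipped.
internal_miss = resolve("acme-internal", registries)
```

Under these assumptions, a package query is sent only to registries whose list contains the name, and a failed lookup can report a specific reason per registry instead of a generic "not found".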

wilzbach · 2018-02-12T14:36:10Z

See #1366 for using a single request to download the package with its dependencies.

MartinNowak · 2018-02-13T15:27:49Z

Please consider the bad experience we had with a metadata-cache in the past (see #755 and #528). Too often the cached information is outdated and the commands that require registry interaction are expected to deliver up-to-date information. Downloading a full index every time is likely slower than what we have atm. I considered differential updates of the index edits (like a mysql binlog), but the effort didn't seem appropriate compared to #1366.

Privacy: The current approach potentially gives away the names of the used packages to other registries than the one actually containing those packages.

Dub clients do query custom servers before the default server and defaults can be turned off, so that should be fine for such use-cases.

Bad error messages: It's uncertain whether a package really doesn't exist on a particular server, or if there is some kind of more general error condition.

That's handled by 404 vs. 5xx errors.

So overall I think a full index on the client side comes with a lot of new technical issues and requires quite some work, while hardly solving our current problems. The only plus side is that, on a registry outage, dub could fall back to the local index (possibly using outdated information). But I think it's simpler to improve availability of the registry to solve that problem.

s-ludwig · 2018-02-14T09:18:36Z

No, the plan would be to always download the list first before trying to download further package information from a particular registry - caching is not the intention. It would of course be implemented with proper support for modification checks, so that it only gets re-downloaded when things have actually changed. Also note that the file is currently 13 KB uncompressed, 6 KB compressed, so that's also not going to be a problem in the foreseeable future.
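
One way such a modification check could work is an HTTP conditional request. The sketch below is in Python and assumes the registry would answer with an `ETag` and honor `If-None-Match`; that is an assumption, not something the thread confirms. `PackageListClient`, `Transport`, and `fake_registry` are hypothetical names, and the transport is a simulated function rather than a real HTTP call.

```python
from typing import Callable, Optional

# transport(etag) -> (status, etag, body); a hypothetical stand-in for an
# HTTP GET that sends `If-None-Match: <etag>` when etag is not None.
Transport = Callable[[Optional[str]], tuple]


class PackageListClient:
    def __init__(self, transport: Transport):
        self._transport = transport
        self._etag = None
        self._names = None

    def refresh(self) -> list:
        """Ask the registry for the list, re-downloading only on change."""
        status, etag, body = self._transport(self._etag)
        if status == 304:
            # Not modified: keep the previously downloaded copy.
            return self._names
        if status != 200:
            raise RuntimeError(f"package list request failed: {status}")
        self._etag = etag
        self._names = body.split()
        return self._names


def fake_registry(etag):
    """Simulated server whose list never changes."""
    current = '"v1"'
    if etag == current:
        return 304, current, ""
    return 200, current, "diet-ng vibe-d"


client = PackageListClient(fake_registry)
first = client.refresh()    # full download
second = client.refresh()   # revalidated, no body transferred
```

The registry is still contacted on every run, so the list is always validated against the server; only the transfer of an unchanged body is skipped.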

Dub clients do query custom servers before the default server and defaults can be turned off, so that should be fine for such use-cases.

Yes, but if such a server is not reachable for some reason, then the request will simply fall back to the public registry, revealing internal package names to it. I'm assuming that even internal projects will use some public packages and do not necessarily mirror them on their internal registry.

That's handled by 404 vs. 5xx errors.

Getting a 5xx error will still not tell whether a package is found on that server. A 404 could also happen due to a misconfiguration of the web server, for example, and would also lead to the wrong conclusions. Having (or not having) the full list of packages up front, on the other hand, allows generating a very precise error message.

It should also not be underestimated that configuring a single private registry will almost double the amount of time spent to fetch information for a public package, because the private registry will always be queried first.

MartinNowak · 2018-02-20T13:49:34Z

Also note that the file is currently 13 KB uncompressed, 6 KB compressed, so that's also not going to be a problem in the foreseeable future.

That's just the package index list without version information, right?
I'm not sure that I really want to get into dealing with privacy concerns, it's only the package names after all. There are interesting techniques like search indexing encrypted files that might work here as well, also somewhat similar to https://www.signal.org/blog/private-contact-discovery/. But are we really up for adding complexity for that?

s-ludwig · 2018-02-20T14:22:07Z

Yes, just the names. It's also the error messages that would suddenly make sense. I'm not saying that we have to make this a priority, though, but it's something that I would not dismiss.

Speaking of multiple registries, does the multi-package query currently work correctly for dependency trees that span multiple registries?

MartinNowak · 2018-02-24T13:37:01Z

Speaking of multiple registries, does the multi-package query currently work correctly for dependency trees that span multiple registries?

Yes, it will remove packages from the query list that are in the response from a registry, still WIP though.
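
The behavior described here can be illustrated with a short Python sketch. It is not dub's implementation (which was still WIP at the time): `query_all` is a hypothetical name, each registry is simulated as a dictionary, and the metadata values are placeholders.

```python
def query_all(wanted: set, registries: list) -> tuple:
    """registries: list of (name, {package: info}) in query order."""
    remaining = set(wanted)
    found = {}
    for name, packages in registries:
        if not remaining:
            break
        # One multi-package request per registry (simulated by a dict lookup).
        response = {p: packages[p] for p in remaining if p in packages}
        for pkg in response:
            found[pkg] = name
        # Packages answered by this registry are not asked for again.
        remaining.difference_update(response)
    return found, remaining


registries = [
    ("internal", {"acme-core": "metadata"}),
    ("public", {"vibe-d": "metadata", "diet-ng": "metadata"}),
]

found, missing = query_all({"acme-core", "vibe-d", "no-such-pkg"}, registries)
```

In this sketch the first registry still sees every requested name, including those it does not host, which is the exposure the package-list proposal aims to avoid.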

s-ludwig added the enhancement label Feb 12, 2018

MartinNowak added the 2-re label Feb 13, 2018

wilzbach mentioned this issue Mar 31, 2018

Non-optional dependency diet-ng of vibe-d:http not found in dependency tree!?. dlang/ci#190

Closed
