Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to download GitHub release asset after #69

Closed
ronsaldo opened this issue Dec 9, 2021 · 10 comments
Closed

Failure to download GitHub release asset after #69

ronsaldo opened this issue Dec 9, 2021 · 10 comments

Comments

@ronsaldo
Copy link

ronsaldo commented Dec 9, 2021

I am having issues on downloading a GitHub release asset with zinc in Pharo 9. I suspect this might be related to an encoding issue that might be invalidating a security token that is required for downloading the artifact after the redirect. The issue can be replicated with the following snippet on a playground:

ZnClient new
	accept: 'application/octet-stream'; "Required by GitHub"
	url: 'https://api.github.com/repos/ronsaldo/abstract-gpu/releases/assets/49144132';
	followRedirects: true;
	downloadTo: 'agpu.tar.gz'
@svenvc
Copy link
Owner

svenvc commented Dec 10, 2021

The redirect URL seems to point to AWS S3 which seems sensitive to some URL encoding issues. I have to study this a bit more, but I have seen this before in other contexts. Either encoding is too eager or too loose, there is not one solution.

Here is my initial workspace:

meta := ZnClient new
	forJsonREST;
	url: 'https://api.github.com/repos/ronsaldo/abstract-gpu/releases/assets/49144132'; 
	get.

ZnClient new
	url: (meta at: 'browser_download_url');
	downloadTo: '/tmp/agpu.tar.gz';
	yourself.

ZnClient new
	url: (meta at: 'browser_download_url');
	get.

ZnLogEvent logToTranscript.

logEvents := OrderedCollection new.
ZnLogEvent announcer when: ZnLogEvent do: [ :event | logEvents add: event ].
logEvents.

redirect := ZnClient new
	dontFollowRedirects;
	url: (meta at: 'browser_download_url');
	get;
	response.
	
redirect location.
redirect headers at: #Location.

The last two expressions show the difference between the Location URL in unparsed/parsed form.

@svenvc
Copy link
Owner

svenvc commented Dec 10, 2021

ZnUrl's encoding strategy is currently based on https://datatracker.ietf.org/doc/html/rfc3986/#appendix-A

Specifically, the query part allows unencoded $/ and $; which are encoded by the redirect/location URL from GitHub towards AWS S3.

Zn parses the URL (correctly even if it encodes more than needed), then renders it again (without the encoding), hence it is different. The AWS S3 signature in the query part most probably depends on the URL being literally identical.

I have to read a bit more to figure out what can be done.

@ClotildeToullec
Copy link

ClotildeToullec commented Dec 13, 2021

We have a similar issue with the Moose images. They also are downloaded from a GitHub release.
The problem never occurred before last week (around December 6, 2021)

The issue can be reproduced by trying to download a Moose image in the Pharo launcher:

In the image creation UI: Official Distributions, Moose Suite 9.0 (development).

This snippet reproduces it:

url := 'https://github.com/moosetechnology/Moose/releases/download/continuous/Moose9-development.zip'.

ZnClient new
        signalProgress: true;
        url: url;
        enforceHttpSuccess: true;
        downloadTo: '/tmp/moose-test'

@svenvc
Copy link
Owner

svenvc commented Dec 13, 2021

With the following commit (update Zinc HTTP Components from this repository),

df181e9

you can specify a different encoding strategy for the query part of URLs, which makes the initial example pass

meta := ZnClient new
  forJsonREST;
  url: 'https://api.github.com/repos/ronsaldo/abstract-gpu/releases/assets/49144132'; 
  get.

ZnOptions globalDefault clone
  at: #queryKeyValueSafeSet put: (ZnOptions queryKeyValueSafeSet \ '/;');
  during: [
    ZnClient new
      url: (meta at: 'browser_download_url');
      get ].

or the equivalent with the downloadTo:

@svenvc
Copy link
Owner

svenvc commented Dec 13, 2021

The Moose download passes as follows:

ZnOptions globalDefault clone
  at: #queryKeyValueSafeSet put: (ZnOptions queryKeyValueSafeSet \ '/;');
  during: [ | url |
    url := 'https://github.com/moosetechnology/Moose/releases/download/continuous/Moose9-development.zip'.
    ZnClient new
      signalProgress: true;
      url: url;
      enforceHttpSuccess: true;
      downloadTo: '/tmp/moose-test' ]

@svenvc
Copy link
Owner

svenvc commented Dec 13, 2021

It is also possible to modify the localOptions of a specific client instance:

ZnClient new in: [ :client |
  client localOptions at: #queryKeyValueSafeSet put: (ZnOptions queryKeyValueSafeSet \ '/;').
  client get: (meta at: 'browser_download_url') ].

@ronsaldo
Copy link
Author

Thank you very much. The workaround works.

@svenvc
Copy link
Owner

svenvc commented Dec 15, 2021

Thanks for the feedback.

BTW, here is another way to customise the local options of a client, as a single expression:

ZnClient new 
  withOptions: [ :options |
    options at: #queryKeyValueSafeSet put: (ZnOptions queryKeyValueSafeSet \ '/;') ];
  get: (meta at: 'browser_download_url').

@demarey
Copy link
Contributor

demarey commented Dec 16, 2021

Hi @svenvc,

I can confirm it also works for Moose download:

url := 'https://github.com/moosetechnology/Moose/releases/download/continuous/Moose9-development.zip'.

ZnClient new 
  signalProgress: true;
  url: url;
  enforceHttpSuccess: true;
  withOptions: [ :options |
    options at: #queryKeyValueSafeSet put: (ZnOptions queryKeyValueSafeSet \ '/;') ];
  downloadTo: '/tmp/moose-test'.

If I understand well, '/;' characters should not be encoded with a percent. Is it a recent change of github? Zinc worked like a charm before. How did you debug this issue? It is interesting if we have a similar problem in the future.

Thanks for the quick fix.

@svenvc
Copy link
Owner

svenvc commented Dec 16, 2021

Thank you @demarey for testing and for the feedback.

How I found the problem ? The final redirect failed, everything looked reasonable, but the URL was very complicated and I spotted a difference between the URL in the location header field and the one actually being followed. Zinc always parses and re-renders URLs. I know that this is a sensitive area.

Percent encoding can be done on a scale: very safely, encoding everything except for a small set of safe characters, which most URL parsers will accept (Zinc does), or encoding as little as possible, following the specs to the max.

Now, in this case (I guess something changed on their end), the final URL looks like an AWS S3 access with an embedded signature (AWS v4 signature). That code is very picky about encoding, formatting, ordering, etc - because it signs a very specific input. Somehow the validation code has certain expectations about the general encoding of the resulting URL. We are forced to follow (at least for now). But that is not (yet) strong enough an argument to change the default (which also came about from some trial and error, I remember I once had it on encode everything and that gave issues too).

All this has now been recorded and we will see how it goes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants