A clarification in document #16

Open
stkim1 opened this issue Feb 18, 2017 · 2 comments
@stkim1

stkim1 commented Feb 18, 2017

Hello @Redundancy,

I recently stumbled upon go-sync and am quite impressed with the underlying implementation and the tests that accompany it. Test cases easily double (or in some cases nearly triple) the workload, and it's a bit discouraging to see them go unnoticed by others. Plus, go-sync appears to have broader platform coverage, as it stays away from <netinet/in.h> and <arpa/inet.h>. 👍

A few questions came up as I followed the code base.

The README points out that:

The ZSync mechanism has the weakness that HTTP1.1 ranged requests are not always well supported by CDN providers and ISP proxies. When issues happen, they're very difficult to respond to correctly in software (if possible at all). Using HTTP 1.0 and fully completed GET requests would be better, if possible.

This is a very compelling point, as it is not unusual to run into such issues with low-end hosting services for unknown reasons. Looking into DoRequest of HTTPBlockSource, where the request header is composed, however, we can see that the operation specifically requests HTTP 1.1 and Range. I expected to find an HTTP 1.0 fallback to counter a possible failure, but I could not find one.
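To make the concern concrete, here is a minimal sketch of a ranged GET that checks whether the Range header was actually honoured. It is not go-sync's code: `fetchRange` and `runDemo` are hypothetical names, and the local test server stands in for a real host. A 206 status means the range was served; a 200 means something along the path ignored it and sent the whole body.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchRange (hypothetical helper) asks for bytes [start, end] inclusive and
// reports whether the server honoured the Range header (206) or ignored it
// and returned the full body (200).
func fetchRange(client *http.Client, url string, start, end int64) (data []byte, partial bool, err error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, false, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))

	resp, err := client.Do(req)
	if err != nil {
		return nil, false, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusPartialContent:
		partial = true
	case http.StatusOK:
		partial = false // the Range header was ignored somewhere on the path
	default:
		return nil, false, fmt.Errorf("unexpected status %s", resp.Status)
	}
	data, err = io.ReadAll(resp.Body)
	return data, partial, err
}

// runDemo serves a small payload from a local test server that supports
// ranges (via http.ServeContent) and fetches bytes 4..7 from it.
func runDemo() (string, bool, error) {
	payload := []byte("0123456789abcdef")
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeContent(w, r, "payload.bin", time.Time{}, bytes.NewReader(payload))
	}))
	defer srv.Close()

	data, partial, err := fetchRange(srv.Client(), srv.URL, 4, 7)
	return string(data), partial, err
}

func main() {
	data, partial, err := runDemo()
	fmt.Println(data, partial, err)
}
```

A caller that sees `partial == false` knows it received the entire file rather than the requested block, which is the case a fallback would have to handle.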

I might not have a comprehensive understanding of how go-sync is built. If you could point me in the right direction, it would be very much appreciated.

Thank you very much for this charming work!

@Redundancy Redundancy self-assigned this Feb 22, 2017
@Redundancy
Owner

You're right - the existing implementation does do something that I explicitly warn against. Note that hypothetically speaking, including cache-busting headers in the responses can help.

The main reason that I don't do it is that it's quite prescriptive to decide to use HTTP1.0 and come up with an appropriate scheme for chopping up the payload. The ideal situation would often be to match the granularity to your units of change (as described a bit lower down).

However, at least theoretically, it should not be too difficult to provide a different implementation of HttpRequester that could source block ranges from potentially multiple files, and fall back to downloading and caching an entire file if a ranged request failed. The work that would be required would largely be to decide how to chop up the input, lay that out as URIs, and whether that should be handled as a dynamic concern by a webserver or a static one (say, S3).
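As a rough illustration of the fallback half of that idea, the sketch below tries a ranged GET first and, if the range is not honoured, keeps the full body and serves later blocks from that copy. Everything here is an assumption for illustration: `fallbackRequester`, its `DoRequest(start, end)` signature, and `runDemo` are hypothetical and do not claim to match go-sync's actual interfaces. It caches in memory and holds a lock across the network call, which a real implementation would likely want to revisit.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// fallbackRequester (hypothetical) tries a ranged GET and, when the range is
// ignored, caches the whole file and answers later requests from the cache.
type fallbackRequester struct {
	client *http.Client
	url    string

	mu    sync.Mutex
	whole []byte // full file, nil until a fallback has happened
}

// DoRequest returns bytes [start, end] inclusive.
func (f *fallbackRequester) DoRequest(start, end int64) ([]byte, error) {
	f.mu.Lock()
	defer f.mu.Unlock()

	if f.whole == nil {
		req, err := http.NewRequest(http.MethodGet, f.url, nil)
		if err != nil {
			return nil, err
		}
		req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
		resp, err := f.client.Do(req)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		body, err := io.ReadAll(resp.Body)
		if err != nil {
			return nil, err
		}
		switch resp.StatusCode {
		case http.StatusPartialContent:
			return body, nil // the range was honoured
		case http.StatusOK:
			f.whole = body // range ignored: this is the entire file, keep it
		default:
			return nil, fmt.Errorf("unexpected status %s", resp.Status)
		}
	}

	if start < 0 || end < start || end >= int64(len(f.whole)) {
		return nil, errors.New("requested range is outside the cached file")
	}
	out := make([]byte, end-start+1)
	copy(out, f.whole[start:end+1]) // copy so callers cannot alter the cache
	return out, nil
}

// runDemo uses a test server that ignores Range and always sends everything.
func runDemo() (first, second string, hits int32, err error) {
	payload := []byte("0123456789abcdef")
	var count int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt32(&count, 1)
		w.Write(payload)
	}))
	defer srv.Close()

	r := &fallbackRequester{client: srv.Client(), url: srv.URL}
	a, err := r.DoRequest(4, 7)
	if err != nil {
		return "", "", 0, err
	}
	b, err := r.DoRequest(10, 15)
	if err != nil {
		return "", "", 0, err
	}
	return string(a), string(b), atomic.LoadInt32(&count), nil
}

func main() {
	fmt.Println(runDemo())
}
```

In the demo the second block is answered from the cache, so the server is contacted only once. Splitting the payload across multiple URIs, as described above, would be a separate decision layered on top of this.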

In order to achieve this, it would probably be best to embed the Verifier into the requester so that it can handle and respond to more cases internally (See BlockSourceBase).

The intent is that the use of Go's interfaces allows delegation of responsibility - the HttpRequester and BlockSourceBase should be almost entirely pluggable, and how you would get the information about the file splits could be an external concern.

@stkim1
Author

stkim1 commented Mar 1, 2017

I appreciate your input. I've been digging in and have started drawing diagrams to picture how the code is designed. It seems, though, that writing test cases against possible failures, especially those related to HTTP connections, could be a practical way to get up to speed on the design and build the HTTP 1.0 fallback.

Coincidentally, you've stated that network error handling is one of the improvements you're looking for, and it seems to be the top priority for production readiness mentioned in #15.

I have had a very pleasant experience with network simulators such as Network Link Conditioner, albeit only on OS X. I don't believe that particular utility could be used in such test cases, but the idea of an artificial conditioner that simulates network issues should still be valid to some degree.

In search of similar tools, I've encountered tylertreat/comcast and Shopify/toxiproxy.

Both projects are heavily oriented toward CLI usage, but I think it should be possible to incorporate one of them into the tests within a reasonable time. I'll start building test cases and open PRs. I hope you can spare some time to review the additional tests and guide me in covering the corner cases properly.
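Independently of those tools, the idea of injecting failures can be tried with only Go's standard library. The sketch below is an assumption-laden illustration, not part of go-sync and not a use of comcast or toxiproxy: `flakyHandler` drops the TCP connection for the first few requests, and `getWithRetry` is a hypothetical client helper whose behaviour a test could then check.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// flakyHandler (hypothetical) aborts the first `failures` requests by closing
// the connection without a response, then serves payload normally.
func flakyHandler(payload []byte, failures int32) http.Handler {
	var seen int32
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&seen, 1) <= failures {
			if hj, ok := w.(http.Hijacker); ok {
				if conn, _, err := hj.Hijack(); err == nil {
					conn.Close() // simulate a dropped connection
					return
				}
			}
			// Fallback if the connection cannot be hijacked.
			http.Error(w, "injected failure", http.StatusServiceUnavailable)
			return
		}
		w.Write(payload)
	})
}

// getWithRetry (hypothetical) performs up to `attempts` GETs and returns the
// body together with the number of attempts that were made.
func getWithRetry(client *http.Client, url string, attempts int) ([]byte, int, error) {
	var lastErr error
	for i := 1; i <= attempts; i++ {
		resp, err := client.Get(url)
		if err != nil {
			lastErr = err
			continue
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err == nil && resp.StatusCode == http.StatusOK {
			return body, i, nil
		}
		if err == nil {
			err = fmt.Errorf("unexpected status %s", resp.Status)
		}
		lastErr = err
	}
	return nil, attempts, lastErr
}

// runDemo injects two dropped connections and allows up to five attempts.
func runDemo() (string, int, error) {
	srv := httptest.NewServer(flakyHandler([]byte("block data"), 2))
	defer srv.Close()

	body, tries, err := getWithRetry(srv.Client(), srv.URL, 5)
	return string(body), tries, err
}

func main() {
	fmt.Println(runDemo())
}
```

A handler like this covers only abrupt disconnects; latency, bandwidth limits, and partial responses are the kinds of conditions where an external conditioner would still be useful.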
