Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added (and set as default) a cURL-based method for fetching HTTP content #624

Merged
merged 1 commit into from
Aug 9, 2016

Conversation

julienCXX
Copy link

@julienCXX julienCXX commented Aug 6, 2016

Implements proposal in #531.

[EDIT 2016-08-10] Relates to sebsauvage#114 (virtualtam)

@ArthurHoaro
Copy link
Member

Thanks!

A few remarks:

5.5.0 Added the cURL resource as the first argument to the CURLOPT_PROGRESSFUNCTION callback.

  • the documentation should be updated in Github's wiki, it'll be included in the repo at the next release.
  • it looks like you're downloading the remote page in every case, while the original function only download it if it find a 200 HTTP code. That's not a bad idea, because Shaarli will only make 1 request and users don't usually shaare error pages, however both method should have the same behaviour.

@julienCXX
Copy link
Author

I think it is fixed now, for the first remark.

For the latter one, I can return an empty body on 40x HTTP codes, even though the body would be still downloaded.
Alternatively, I can make cURL stop immediately when it encounters one of the said codes. But the original headers would not be returned.
Finally, I could send 2 requests, in the fashion of the fallback method, but that would make the code less readable.

@ArthurHoaro
Copy link
Member

Nah, I like what you did, removing the or code = 200 here is a better idea, IMHO.

. ',en-US;q=0.7,en;q=0.3';

if (!function_exists('curl_init')) {
return get_http_response_fallback($cleanUrl, $timeout, $maxBytes,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the arguments list is too long, we usually use one line per argument, as stated in PSR-2.

@julienCXX
Copy link
Author

Nah, I like what you did, removing the or code = 200 here is a better idea, IMHO.

OK to do that, but I would remove the condition on line 159 too.

@ArthurHoaro
Copy link
Member

No. We make an extra request changing one parameter if we don't get a HTTP 200. However, if we already have a valid response, the extra request is not necessary.

@julienCXX
Copy link
Author

OK, I made all the changes you stated. And also:

  • in cURL method: put received headers with the same name in a sub-array, like in the fallback method
  • in fallback method: found a way to send Accept and Accept-Language headers, like in the cURL method
    Thus, with the fallback method, http://services.gisgraphy.com/ now works and title of http://android.izzysoft.de/ appears in English.

I also noticed that get_headers() follows redirects by default, thus get_redirected_headers() might be useless. The trick here is that it mixes-up headers from the original page with all its redirects, which happened to me when building the cURL method (as I explained it in #531). And it explains the weird headers you got with https://fr.atlassian.com/git/tutorials/comparing-workflows/forking-workflow/.

@ArthurHoaro
Copy link
Member

I also noticed that get_headers() follows redirects by default, thus get_redirected_headers() might be useless.

You're right, I don't know how I didn't see that. A bit of clean up could be done later.

I'll run a few tests but I'm OK with the current PR. Could you squash your commits and update the wiki?

@julienCXX
Copy link
Author

Could you squash your commits and update the wiki?

Yes, I can. Which commit structure do you want?

@ArthurHoaro
Copy link
Member

You changed one file for one feature, I'd say one commit is fine.

@ArthurHoaro
Copy link
Member

Tested, it works fine.

* @param int $maxBytes maximum downloaded bytes
* @param string $userAgent "User-Agent" header
* @param string $acceptLanguage "Accept-Language" header
* @param int $maxRedr maximum amonut of redirections followed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*amount

@virtualtam
Copy link
Member

+1 for merging, PR tested with a few known-to-be-problematic web sites listed in #531
(sorry, had to find a small typo somewhere :D )

Btw, this list of problematic websites could be used to add functional (vs. unitary) test suites, to check Shaarli code against real-world conditions. Let me know if there's interest, this would be quite easy to add through PHPUnit's configuration.

@ArthurHoaro
Copy link
Member

@virtualtam It would be interesting indeed. However, aren't functional tests supposed to test the application a whole, meaning fire up a web server, login, etc., instead of components?

@julienCXX Thanks again!

@ArthurHoaro ArthurHoaro modified the milestones: 0.8.0, 0.8.1 Aug 9, 2016
@ArthurHoaro ArthurHoaro merged commit d0d3623 into shaarli:master Aug 9, 2016
@virtualtam
Copy link
Member

Well, you can test both; usually functional tests cover a whole feature in real-world conditions, but can also be used to check features that cannot be covered through unitary tests, e.g. accessing external services. We actually do have such tests, I think most of them are related to HttpUtils.

@julienCXX julienCXX deleted the pr-curl-http-fetch branch August 10, 2016 19:43
virtualtam added a commit to virtualtam/Shaarli that referenced this pull request Aug 12, 2016
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Additions:
- [all][env] set $TERM=dumb to avoid debconf-related issues
- [all][pkg] install ca-certificates
- [all][pkg] cleanup APT cache after installing packages
- [dev] refactor Dockerfile

TODO:
- [prod][php] use Composer to resolve PHP dependencies
- [prod] refactor Dockerfile

Signed-off-by: VirtualTam <[email protected]>
virtualtam added a commit to virtualtam/Shaarli that referenced this pull request Aug 12, 2016
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [all][env] set $TERM=dumb to avoid debconf-related issues
- [all][pkg] install ca-certificates
- [all][pkg] cleanup APT cache after installing packages
- [dev] refactor Dockerfile
- [prod][master][php] use Composer to resolve PHP dependencies
- [prod][master] refactor Dockerfile

TODO:
- [prod][stable][php] use Composer to resolve PHP dependencies
- [prod][stable] refactor Dockerfile

Signed-off-by: VirtualTam <[email protected]>
virtualtam added a commit to virtualtam/Shaarli that referenced this pull request Aug 12, 2016
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [all][env] set $TERM=dumb to avoid debconf-related issues
- [all][pkg] install ca-certificates
- [all][pkg] cleanup APT cache after installing packages
- [dev] refactor Dockerfile
- [prod][master][php] use Composer to resolve PHP dependencies
- [prod][master] refactor Dockerfile

Commented modifications:
- [prod][stable][php] use Composer to resolve PHP dependencies
- [prod][stable] refactor Dockerfile

Signed-off-by: VirtualTam <[email protected]>
virtualtam added a commit to virtualtam/Shaarli that referenced this pull request Aug 12, 2016
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [all][env] set $TERM=dumb to avoid debconf-related issues
- [all][pkg] install ca-certificates
- [all][pkg] cleanup APT cache after installing packages
- [dev] refactor Dockerfile
- [prod][master][php] use Composer to resolve PHP dependencies
- [prod][master] refactor Dockerfile
- [prod][stable] refactor Dockerfile

Commented modifications:
- [prod][stable][php] use Composer to resolve PHP dependencies

Signed-off-by: VirtualTam <[email protected]>
virtualtam added a commit to virtualtam/Shaarli that referenced this pull request Aug 14, 2016
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [all][env] set $TERM=dumb to avoid debconf-related issues
- [all][pkg] install ca-certificates
- [all][pkg] cleanup APT cache after installing packages
- [dev] refactor Dockerfile
- [prod][master] refactor Dockerfile
- [prod][master][php] use Composer to resolve PHP dependencies

Signed-off-by: VirtualTam <[email protected]>
virtualtam added a commit to virtualtam/Shaarli that referenced this pull request Aug 14, 2016
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624
Relates to shaarli#633

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [prod][stable] refactor Dockerfile
- [prod][stable] set $TERM=dumb to avoid debconf-related issues
- [prod][stable] install ca-certificates
- [prod][stable] cleanup APT cache after installing packages
- [prod][stable] use Composer to resolve PHP dependencies

Signed-off-by: VirtualTam <[email protected]>
ArthurHoaro pushed a commit that referenced this pull request Nov 5, 2016
Relates to #607
Relates to #612
Relates to #624

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [all][env] set $TERM=dumb to avoid debconf-related issues
- [all][pkg] install ca-certificates
- [all][pkg] cleanup APT cache after installing packages
- [dev] refactor Dockerfile
- [prod][master] refactor Dockerfile
- [prod][master][php] use Composer to resolve PHP dependencies

Signed-off-by: VirtualTam <[email protected]>
ArthurHoaro pushed a commit that referenced this pull request Nov 5, 2016
Relates to #607
Relates to #612
Relates to #624
Relates to #633

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [prod][stable] refactor Dockerfile
- [prod][stable] set $TERM=dumb to avoid debconf-related issues
- [prod][stable] install ca-certificates
- [prod][stable] cleanup APT cache after installing packages
- [prod][stable] use Composer to resolve PHP dependencies

Signed-off-by: VirtualTam <[email protected]>
portailp pushed a commit to PortailPro/Shaarli that referenced this pull request Mar 20, 2017
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [all][env] set $TERM=dumb to avoid debconf-related issues
- [all][pkg] install ca-certificates
- [all][pkg] cleanup APT cache after installing packages
- [dev] refactor Dockerfile
- [prod][master] refactor Dockerfile
- [prod][master][php] use Composer to resolve PHP dependencies

Signed-off-by: VirtualTam <[email protected]>
portailp pushed a commit to PortailPro/Shaarli that referenced this pull request Mar 20, 2017
Relates to shaarli#607
Relates to shaarli#612
Relates to shaarli#624
Relates to shaarli#633

See https://github.com/shaarli/Shaarli/wiki/Server-requirements

Modifications:
- [prod][stable] refactor Dockerfile
- [prod][stable] set $TERM=dumb to avoid debconf-related issues
- [prod][stable] install ca-certificates
- [prod][stable] cleanup APT cache after installing packages
- [prod][stable] use Composer to resolve PHP dependencies

Signed-off-by: VirtualTam <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants