Bizarre issue with Diffbot using guzzlehttp #50

Open
jonathantullett opened this issue Oct 8, 2016 · 7 comments

Comments

@jonathantullett

I've created a Crawl API job which has a few hundred results. I'm trying to get the results using type:article (so $bot->search("type:article") with setNum to "all") and it's throwing an exception:

PHP Warning:  curl_multi_exec(): Unable to create temporary file, Check permissions in temporary files directory. in /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php on line 106

Warning: curl_multi_exec(): Unable to create temporary file, Check permissions in temporary files directory. in /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php on line 106
PHP Fatal error:  Uncaught GuzzleHttp\Exception\RequestException: cURL error 23: Failed writing body (2749 != 16384) (see http://curl.haxx.se/libcurl/c/libcurl-errors.html) in /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php:187
Stack trace:
#0 /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php(150): GuzzleHttp\Handler\CurlFactory::createRejection(Object(GuzzleHttp\Handler\EasyHandle), Array)
#1 /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php(103): GuzzleHttp\Handler\CurlFactory::finishError(Object(GuzzleHttp\Handler\CurlMultiHandler), Object(GuzzleHttp\Handler\EasyHandle), Object(GuzzleHttp\Handler\CurlFactory))
#2 /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(179): GuzzleHttp\Handler\CurlFactory::finish(Object(GuzzleHttp\Handler\CurlMultiHandler), Object(GuzzleHttp\Handler\EasyHandle), Object(GuzzleHttp\Handler\CurlFactory))
#3 /home/tullettj/websites/c in /home/tullettj/websites/core-code/lib/vendor/php-http/guzzle6-adapter/src/Promise.php on line 127

Fatal error: Uncaught GuzzleHttp\Exception\RequestException: cURL error 23: Failed writing body (2749 != 16384) (see http://curl.haxx.se/libcurl/c/libcurl-errors.html) in /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php:187
Stack trace:
#0 /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php(150): GuzzleHttp\Handler\CurlFactory::createRejection(Object(GuzzleHttp\Handler\EasyHandle), Array)
#1 /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php(103): GuzzleHttp\Handler\CurlFactory::finishError(Object(GuzzleHttp\Handler\CurlMultiHandler), Object(GuzzleHttp\Handler\EasyHandle), Object(GuzzleHttp\Handler\CurlFactory))
#2 /home/tullettj/websites/core-code/lib/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(179): GuzzleHttp\Handler\CurlFactory::finish(Object(GuzzleHttp\Handler\CurlMultiHandler), Object(GuzzleHttp\Handler\EasyHandle), Object(GuzzleHttp\Handler\CurlFactory))
#3 /home/tullettj/websites/c in /home/tullettj/websites/core-code/lib/vendor/php-http/guzzle6-adapter/src/Promise.php on line 127

So I've played with the setNum values and 60 seems to be the magic number. If I query for 60 or fewer, it's fine; if I go for 61 or above, it throws this exception.

Have you seen this before, @Swader? It's a bit of a head scratcher (I have ~2 GB free in the temporary files directory).

Thanks!
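
One way to narrow this down is to check whether PHP itself can spill a large stream to disk under the same user and SAPI that run the script. The sketch below is a diagnostic under stated assumptions, not a confirmed diagnosis: Guzzle 6 writes response bodies to a `php://temp` stream when no `sink` option is given, and `php://temp` keeps data in memory up to 2 MB by default before switching to a temporary file. The function name `checkTempSpill` and its report keys are hypothetical, made up for this example.

```php
<?php
// Diagnostic sketch (hypothetical helper, not part of Guzzle or the Diffbot client).
// Writes more than 2 MB to php://temp to force the switch to a temporary file.

function checkTempSpill(int $bytes = 3 * 1024 * 1024): array
{
    $dir = sys_get_temp_dir();
    $report = [
        'temp_dir' => $dir,               // directory PHP uses for temporary files
        'writable' => is_writable($dir),  // can this process write there?
        'spill_ok' => false,              // did the full write succeed?
    ];

    $stream = fopen('php://temp', 'w+');
    $chunk = str_repeat('x', 16384);      // same chunk size as in the cURL error above
    $written = 0;

    while ($written < $bytes) {
        $n = @fwrite($stream, $chunk);
        if ($n === false || $n < strlen($chunk)) {
            break;                        // short or failed write: stop here
        }
        $written += $n;
    }
    fclose($stream);

    $report['spill_ok'] = $written >= $bytes;
    return $report;
}

print_r(checkTempSpill());
```

If `writable` or `spill_ok` comes back false, free disk space is probably not the problem; permissions, `open_basedir`, or the temp directory setting for that PHP process would be the next things to look at.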

@jonathantullett
Author

I've run it with a few other searches and the threshold values are arbitrary. I thought it might be memory_limit related, but the script's configured with a memory_limit of -1 (so, unlimited).

@jonathantullett changed the title from "Bizarre issue with guzzlehttp" to "Bizarre issue with Diffbot using guzzlehttp" on Oct 8, 2016
@Swader
Owner

Swader commented Oct 9, 2016

That'll happen with large bodies :( See this and this.

Let me know if you manage to hack past it.

@jonathantullett
Author

@Swader I've been working around this so far by decreasing the number of results downloaded if there's an exception thrown.

However, I'm now starting to see it being thrown when only a single result (setNum(1)) is being requested. This is rather problematic. Can you think of any way around this, or do we just have to consider them bad searches?
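
For reference, the workaround described above (retry with fewer results whenever an exception is thrown) could look roughly like this. It is a minimal sketch, not the code actually in use: `fetchWithBackoff` and the `$fetch` callable are hypothetical stand-ins for whatever wraps the search call with `setNum()`, and halving is just one possible way to shrink the request.

```php
<?php
// Sketch of the "ask for fewer results on failure" workaround.
// $fetch is a hypothetical callable that takes a result count and returns results.

function fetchWithBackoff(callable $fetch, int $num, int $min = 1): array
{
    while (true) {
        try {
            return ['num' => $num, 'result' => $fetch($num)];
        } catch (\Exception $e) {
            if ($num <= $min) {
                // Even the smallest request fails; give up and surface the error.
                throw $e;
            }
            // Halve the request size, but never go below the minimum.
            $num = max($min, intdiv($num, 2));
        }
    }
}

// Stand-in fetcher that fails above 60 results, mimicking the behaviour reported here.
$fake = function (int $n): array {
    if ($n > 60) {
        throw new \RuntimeException('cURL error 23: Failed writing body');
    }
    return array_fill(0, $n, 'article');
};

$out = fetchWithBackoff($fake, 200);
echo $out['num'], "\n"; // prints 50 (200 fails, 100 fails, 50 succeeds)
```

As the comment above notes, this approach has a floor: once a request for a single result fails, there is nothing smaller to fall back to and the exception is rethrown.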

@Swader
Owner

Swader commented May 5, 2017

@jonathantullett I'm sorry about the delay, didn't see this until now - I'll play around with it when I find time. It's still related to the above links from what I can tell, so I'll just have to modify the underlying stack to the tac method without implicitly relying on Guzzle to handle everything and it should work. This would, however, increase dependency on curl. I'll think about the best solution for everyone.

@Swader
Owner

Swader commented Nov 30, 2017

@jonathantullett to continue our discussion from Support: how are you making the hundreds of search calls? I think I may be misunderstanding what's going on, as I've been unable to reproduce the hung calls. Can you share your code?

@jonathantullett
Author

@Swader this is a different issue. This one is replicated by trying to download setNum($XX) articles for a search (I use the min time on the search), and I see the problem on a number of searches - often related to the size of the pages being returned.

I’ll find a search which is showing the issue and post it later (not at home at the moment) but this is completely unrelated to the dangling HTTPS connection issue.

@Swader
Owner

Swader commented Nov 30, 2017

No, I know, I just had no other way to ping you here directly 😬 A new issue for the hung calls would be appreciated.
