Memory leak #525
Comments
Hi @nerg4l, do you want to work on a PR to fix the HTML parser? As for CurlDispatcher, I'm not sure why this happens. Maybe, after passing the stream to the response, it should be closed and cleared here?
Hello, I will open a PR later today to fix the parser. As for the other leak, I'm not sure the issue is with the stream function. I tried to use
I'm not sure whether #527 solves the problem, but it definitely reduces the occurrence.
Let's continue the conversation here. Performance after the recent changes:
With Guzzle client:
Calling Would you accept a PR for adding a
Okay, but I cannot understand the reason for this difference. Guzzle uses cURL under the hood, so I don't see why this happens. Is it possible that Guzzle has a cache to avoid performing the same query twice? In your tests, you're making the requests to the same URL. The Embed implementation consumes 4 MB less memory; the only difference is in the time.
Tested with 300 YouTube links:
I had to add
It looks like you were partially correct and that Guzzle's memory usage is now the problematic one. Code:
Great, so I guess the default CurlDispatcher is good enough, right?
I think it's good to go. I will open a new issue if needed.
I was trying to fetch oEmbed information in bulk, but I ran into multiple memory issues.
\HtmlParser\Parser
First, `libxml_use_internal_errors(true)` uses an internal buffer which can fill up when `libxml_clear_errors` is not called. The most-voted entry in the "User Contributed Notes" on https://www.php.net/manual/en/function.libxml-use-internal-errors.php mentions this.
https://github.com/oscarotero/html-parser/blob/0c5b619bdc7ac061f06a667d913e2af708ee3231/src/Parser.php#L77-L87
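To illustrate the buffering behaviour, here is a minimal sketch. It assumes the DOM extension is loaded; `parseHtmlClearingErrors` is a hypothetical helper written for this example and is not part of html-parser. Whether the sample markup actually produces buffered errors depends on the libxml2 version in use.

```php
<?php
// Hypothetical helper (not part of html-parser): parse HTML and always
// empty libxml's internal error buffer afterwards.
function parseHtmlClearingErrors(string $html): DOMDocument
{
    // Buffer parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    try {
        $document->loadHTML($html);
    } finally {
        libxml_clear_errors();                  // release the buffered errors
        libxml_use_internal_errors($previous);  // restore the caller's setting
    }
    return $document;
}

libxml_use_internal_errors(true);
$invalid = '<html><body><foo>unknown tag</foo></p></body></html>';

// Without clearing, any errors reported by each parse stay in the buffer.
for ($i = 0; $i < 3; $i++) {
    (new DOMDocument())->loadHTML($invalid);
}
$accumulated = count(libxml_get_errors());
libxml_clear_errors();

// With the helper, the buffer is emptied after every parse.
for ($i = 0; $i < 3; $i++) {
    parseHtmlClearingErrors($invalid);
}
$remaining = count(libxml_get_errors());

printf("buffered without clearing: %d, with clearing: %d\n", $accumulated, $remaining);
```

In a long-running bulk job, the first loop is the pattern that lets the buffer grow with every parsed document; the helper keeps it empty regardless of how many documents are processed.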
I fixed that by parsing in batches and calling `libxml_clear_errors` between each parsing.
\Embed\Http\CurlDispatcher
I wasn't able to completely verify the source of this leak. However, the error complains about `stream_get_contents`, and profiling does show that this function uses the most memory.
I replaced `\Embed\Http\CurlDispatcher` with `\GuzzleHttp\Client` and memory usage fell from 300+ MB to only 84 MB.
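One idea raised in the thread is to close the body stream once its contents have been read. The following is only a sketch of that idea: `readAndClose` is a hypothetical helper, `php://temp` stands in for the response body stream, and this is not the actual CurlDispatcher code. Whether closing the stream addresses the leak reported here was not verified.

```php
<?php
// Hypothetical helper: read a stream from the start, then release it.
function readAndClose($stream): string
{
    rewind($stream);
    $contents = stream_get_contents($stream);
    fclose($stream); // free the buffer held by the stream resource
    return $contents === false ? '' : $contents;
}

$before = memory_get_usage();

for ($i = 0; $i < 100; $i++) {
    // Stand-in for a response body written by cURL.
    $stream = fopen('php://temp', 'w+');
    fwrite($stream, str_repeat('x', 100000));
    $body = readAndClose($stream);
}

$after = memory_get_usage();
printf("last body: %d bytes, memory growth: %d bytes\n", strlen($body), $after - $before);
```

Comparing `memory_get_usage()` before and after a batch, as done here, is a cheap way to check whether a change to stream handling affects memory growth before reaching for a full profiler.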