Memory leak in consumer when using lz4 compression #1021
Thanks for the great bug report! I found the issue and will put up a PR to fix.
It looks like there may also be a bug in lz4tools that could cause a crash during lz4 decompression (assuming the #1024 fix is applied to kafka-python): darkdragn/lz4tools#8
We might consider switching to https://github.com/python-lz4/python-lz4 for primary lz4 support going forward. It seems to be more active.
Thanks for the amazingly fast turnaround @dpkp!
I added python-lz4 support to #1024 -- can you take a look and see if it seems ok?
A few issues popping up w/ python-lz4: it does not work with pypy, and its frame encoding does not work with the older "broken" lz4 code used by kafka brokers prior to 0.10. These aren't devastating problems, and certainly better than random segfaults on decode, but not as straightforward as I'd like.
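For reference, a minimal sketch of the python-lz4 frame API being discussed as a replacement. The import is guarded because the `lz4` package may not be installed; note this produces the *standard* LZ4 frame format, which per the comment above would not interoperate with the non-standard framing used by pre-0.10 kafka brokers.

```python
# Sketch: round-trip a payload through the python-lz4 frame API.
# Guarded import, since the `lz4` package may not be available.
try:
    import lz4.frame

    payload = b"some kafka message bytes" * 64
    framed = lz4.frame.compress(payload)       # standard LZ4 frame encoding
    restored = lz4.frame.decompress(framed)
    assert restored == payload
    status = "lz4 frame round-trip ok"
except ImportError:
    status = "python-lz4 not installed"

print(status)
```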
Given the complexity here w/ both options, I think I'm going to defer this until after the next release (which I'm hoping to push out in the next few days).
#1024 appears to be working as expected. Thanks again!
Well, I take it back -- I managed to get the lz4 patch passing tests again and I'm going to merge for release. Hooray!
I encountered an issue where using Output of the
File in question:
I'm using
You might get more traction opening this as a new ticket... comments on closed issues/PRs typically get lost.
The memory leak reported in lz4 0.18.1 has now been fixed, and the fix is in the new release 0.18.2. Thanks to @bt-wil for reporting this and giving a test case. I hope you guys will continue to reach out if you see any issues with lz4 - am keen to work with you, as I am a big fan, and user, of kafka :) |
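A leak of this kind can be screened for with a small stdlib-only harness like the sketch below. It measures heap growth across repeated codec round-trips with `tracemalloc`; `zlib` stands in for lz4 here (an assumption, chosen because the lz4 binding may not be installed), and the threshold is a rough heuristic, not a proof of leak-freedom.

```python
import tracemalloc
import zlib

def check_bounded_growth(codec_roundtrip, iterations=200, limit_bytes=1_000_000):
    """Return True if repeated round-trips do not steadily grow traced memory.

    A leaky codec binding would retain allocations on every call, so the
    delta over `iterations` calls would exceed `limit_bytes`.
    """
    payload = b"x" * 10_000
    tracemalloc.start()
    codec_roundtrip(payload)                      # warm-up allocation
    baseline, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        codec_roundtrip(payload)
    current, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (current - baseline) < limit_bytes

# zlib stands in for lz4 here; a leaky binding would make this return False.
leak_free = check_bounded_growth(lambda d: zlib.decompress(zlib.compress(d)))
print(leak_free)
```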
Opening a PR to check if tests pass with the new version. If so, we'll want to bump `requirements-dev.txt` as well. Many thanks to @jonathanunderwood for his diligent work here: #1021 (comment)
I've noticed a problem with unbounded memory usage in the consumer when handling lz4-compressed payloads. The symptoms point to a memory leak of some description. I've created a test case which should allow easy reproduction of the issue.
I've tried experimenting with various consumer settings such as `fetch_max_bytes`, `max_partition_fetch_bytes`, and `max_in_flight_requests_per_connection` in an attempt to lessen the buffering requirements of each individual request to Kafka. In all cases, memory usage of consumer processes has continued to rise until such a point that those processes are killed by the host system.

I can additionally confirm that this issue is present in the latest PyPI release (1.3.2) as well as `master`, and that the issue only manifests when using `lz4` compression specifically; `gzip` appears to be working fine, for example.

Any help is greatly appreciated. Thanks!
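The settings mentioned above can be collected as keyword arguments for `KafkaConsumer`. A minimal sketch, assuming placeholder values for the broker address and topic (neither is given in the report); the constructor call is left commented out since it requires a live broker:

```python
# Consumer settings from the report, gathered as keyword arguments.
# The values below are illustrative placeholders, not from the report.
consumer_config = {
    "bootstrap_servers": "localhost:9092",       # placeholder broker address
    "fetch_max_bytes": 1 * 1024 * 1024,          # cap bytes per fetch response
    "max_partition_fetch_bytes": 256 * 1024,     # cap bytes per partition
    "max_in_flight_requests_per_connection": 1,  # limit concurrent fetches
}

# Requires kafka-python and a running broker, so left commented out:
# from kafka import KafkaConsumer
# consumer = KafkaConsumer("my-topic", **consumer_config)
# for message in consumer:
#     handle(message.value)

print(sorted(consumer_config))
```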