Kafka hashed partitioner #374

Closed
siniG opened this issue May 3, 2015 · 3 comments

siniG commented May 3, 2015

Hi,
In HashedPartitioner you use Python's built-in hash function:
idx = hash(key) % size #line 12
The built-in hash is not consistent across processes; its result depends on the running Python environment.
For example, hash('123') can produce a different partition each time a Python process is restarted.
Is there a chance of using a different hash function instead? I'd recommend murmur hash (e.g. mmh3).
Thanks
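A minimal sketch to reproduce the behaviour described above. It assumes Python 3.3 or later, where string hash randomization is on by default and can be controlled with the PYTHONHASHSEED environment variable. The helper name hash_in_fresh_process and the partition count of 4 are made up for illustration and are not part of kafka-python.

```python
import os
import subprocess
import sys


def hash_in_fresh_process(key, seed):
    """Return hash(key) as computed by a newly started Python interpreter.

    `seed` is passed through PYTHONHASHSEED. The value "random" mimics an
    ordinary restart; a fixed integer pins the hash seed for that process.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", "--", key],
        env=env,
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    size = 4  # hypothetical number of partitions
    for attempt in range(3):
        h = hash_in_fresh_process("123", "random")
        # Same expression as in the issue: idx = hash(key) % size
        print("restart", attempt, "-> partition", h % size)
```

Each simulated restart may report a different partition for the same key, which is the inconsistency this issue is about.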

dpkp added this to the 0.9.4 Release milestone May 17, 2015
Owner

dpkp commented May 17, 2015

I think we should attempt to partition records consistently with the mainline Java client. The code there is fairly simple: abs(murmur2(key)) % numPartitions

see https://github.com/apache/kafka/blob/0.8.2/clients/src/main/java/org/apache/kafka/clients/producer/internals/Partitioner.java#L69
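For reference, here is a rough pure-Python sketch of the abs(murmur2(key)) % numPartitions scheme. It is written from my reading of the Java client and is not the code that ended up in kafka-python. The seed 0x9747b28c, the multiplier 0x5bd1e995, and treating abs as masking off the sign bit are assumptions about the Java implementation that should be checked against the linked source; the function names murmur2 and partition_for are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def murmur2(data: bytes) -> int:
    """32-bit MurmurHash2 of `data`, returned as an unsigned integer.

    Assumed to mirror the Java client's murmur2; verify the constants there.
    """
    m = 0x5BD1E995
    r = 24
    length = len(data)
    h = (0x9747B28C ^ length) & MASK32

    # Mix the input four bytes at a time, read as little-endian words.
    n_blocks = length // 4
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * m) & MASK32
        k ^= k >> r
        k = (k * m) & MASK32
        h = (h * m) & MASK32
        h ^= k

    # Fold in the remaining 1 to 3 bytes, if any.
    tail = data[4 * n_blocks:]
    if len(tail) == 3:
        h ^= tail[2] << 16
    if len(tail) >= 2:
        h ^= tail[1] << 8
    if len(tail) >= 1:
        h ^= tail[0]
        h = (h * m) & MASK32

    # Final avalanche.
    h ^= h >> 13
    h = (h * m) & MASK32
    h ^= h >> 15
    return h


def partition_for(key: bytes, num_partitions: int) -> int:
    """Pick a partition; 'abs' is taken here as clearing the sign bit."""
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions
```

Because the result depends only on the key bytes and the partition count, partition_for(b"123", 8) gives the same answer in every process, unlike hash(key) % size.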

dpkp commented May 17, 2015

Changing the key partitioning function has implications for anyone running a KeyedProducer and relying on parallel consumers that are organized around the partitioned keys. I think this change requires at least a minor version bump when released (0.10).

dpkp self-assigned this Jun 9, 2015

dpkp commented Jun 11, 2015

Give Murmur2Partitioner a try and let me know if you run into any other issues.

dpkp closed this as completed Jun 11, 2015