GH-38366: [Java] Fix Murmur hash on buffers less than 4 bytes #38368

Merged
merged 1 commit into from
Oct 20, 2023

Conversation

manolama
Contributor

@manolama manolama commented Oct 19, 2023

Rationale for this change

The MurmurHash implementation produced collisions on small input values (buffers shorter than 4 bytes).

What changes are included in this PR?

Fix the iteration for small and tail values that are not 4 bytes in length.
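
For readers unfamiliar with the tail step, here is a minimal sketch of how a 32-bit MurmurHash3 typically handles the bytes left over after the last full 4-byte block. This is the generic public algorithm, not the code changed in this PR; the class name `Murmur3Sketch` and method `hash32` are hypothetical, and Arrow's hasher may differ in constants and structure.

```java
// Hypothetical standalone sketch of MurmurHash3 (x86, 32-bit); not Arrow's implementation.
final class Murmur3Sketch {
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    private Murmur3Sketch() {}

    static int hash32(byte[] data, int seed) {
        int h = seed;
        int len = data.length;
        int fullBlocks = len / 4;

        // Body: consume complete little-endian 4-byte blocks.
        for (int i = 0; i < fullBlocks; i++) {
            int idx = i * 4;
            int k = (data[idx] & 0xff)
                    | ((data[idx + 1] & 0xff) << 8)
                    | ((data[idx + 2] & 0xff) << 16)
                    | ((data[idx + 3] & 0xff) << 24);
            h ^= mixK(k);
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }

        // Tail: the remaining 1 to 3 bytes must still be mixed in.
        // Skipping this step makes every input shorter than 4 bytes
        // of the same length hash to the same value.
        int tailStart = fullBlocks * 4;
        int k = 0;
        switch (len & 3) {
            case 3:
                k ^= (data[tailStart + 2] & 0xff) << 16;
                // fall through
            case 2:
                k ^= (data[tailStart + 1] & 0xff) << 8;
                // fall through
            case 1:
                k ^= (data[tailStart] & 0xff);
                h ^= mixK(k);
                break;
            default:
                break;
        }

        // Finalization: fold in the length, then avalanche the bits.
        h ^= len;
        return fmix(h);
    }

    private static int mixK(int k) {
        k *= C1;
        k = Integer.rotateLeft(k, 15);
        k *= C2;
        return k;
    }

    private static int fmix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

With the tail handled this way, two inputs of the same length that differ only in their trailing bytes, for example `new byte[]{1}` and `new byte[]{2}`, hash to different values, because each mixing step is invertible.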

Are these changes tested?

Yes

Are there any user-facing changes?

Unlikely, unless someone was using the MurmurHash functions to persist a hash value.

@github-actions

⚠️ GitHub issue #38366 has been automatically assigned in GitHub to PR creator.

Member

@lidavidm lidavidm left a comment

Good catch, thank you!

@lidavidm lidavidm changed the title from "GH-38366: [Java][Memory] Fix Murmur hash on buffers less than 4 bytes" to "GH-38366: [Java] Fix Murmur hash on buffers less than 4 bytes" Oct 20, 2023
@github-actions github-actions bot added the "awaiting merge" label and removed the "awaiting review" label Oct 20, 2023
@lidavidm lidavidm merged commit 4bbd48d into apache:main Oct 20, 2023
15 of 16 checks passed
@lidavidm lidavidm removed the "awaiting merge" label Oct 20, 2023
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 4bbd48d.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 23, 2023
…pache#38368)

### Rationale for this change

Using the `MurmurHash` implementation would cause collisions on small input values.

### What changes are included in this PR?

Fix the iteration for small and tail values that are not 4 bytes in length.

### Are these changes tested?

Yes

### Are there any user-facing changes?
Unlikely unless someone was using the `MurmurHash` functions to persist a hash value.

* Closes: apache#38366

Authored-by: Chris Larsen <[email protected]>
Signed-off-by: David Li <[email protected]>
JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 25, 2023
…pache#38368)

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…pache#38368)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…pache#38368)

Successfully merging this pull request may close these issues.

[Java][Memory] Murmur hash implementation failing to hash less than 4 bytes.
2 participants