Support more hash functions for tuple, unify FunctionHash template. [CLICKHOUSE-3490] #3451

Merged 5 commits on Nov 1, 2018


CurtizJ commented Oct 23, 2018

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en


static UInt64 mergeHashes(UInt64 h1, UInt64 h2)
{
    return IntHash64Impl::apply(h1) ^ h2;
}

If the main hash function is cryptographic, we must use a cryptographic hash function to combine the hashes too.
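
To make the concern concrete, here is a minimal sketch of why a combiner of the form `mix(h1) ^ h2` adds no one-wayness of its own. `mixStandIn` is a hypothetical stand-in for `IntHash64Impl::apply` (a murmur-style 64-bit finalizer is assumed here; it is not copied from the ClickHouse source), and `solveSecondHash` is an illustrative helper, not part of the PR. The argument relies only on the shape of the combiner.

```cpp
#include <cstdint>

// Hypothetical stand-in for IntHash64Impl::apply: a murmur-style 64-bit finalizer.
// The exact constants do not matter for the argument below.
inline std::uint64_t mixStandIn(std::uint64_t x)
{
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

// Same shape as the combiner under review.
inline std::uint64_t mergeHashes(std::uint64_t h1, std::uint64_t h2)
{
    return mixStandIn(h1) ^ h2;
}

// Given h1 and any desired combined value, solve for the h2 that produces it.
// XOR is its own inverse, so no search is needed.
inline std::uint64_t solveSecondHash(std::uint64_t h1, std::uint64_t target)
{
    return target ^ mixStandIn(h1);
}
```

Because the combining step can be undone with a single XOR, the result is only as strong as the weakest step in the chain, which is the motivation for combining with the cryptographic function itself.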

There are two ways to do this:

  1. combine(x, y) = sipHash64(x || y), where || denotes concatenation.
  2. If the tuple has more than one element, update the hash state (SipHash::update, MD5_update, etc.) with the length of each element and then with its contents, and finalize at the end.
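
The second variant can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation from the PR: `ToyHashState` is a hypothetical, non-cryptographic stand-in (64-bit FNV-1a) for a real state such as SipHash or MD5, and `encodeLength`, `frameTuple`, and `hashTuple` are names invented for the example. Elements are modelled as byte strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy incremental hasher (64-bit FNV-1a), used only as a stand-in for a real
// state such as SipHash or MD5. It is NOT cryptographic.
class ToyHashState
{
public:
    void update(const char * data, std::size_t size)
    {
        for (std::size_t i = 0; i < size; ++i)
        {
            state ^= static_cast<unsigned char>(data[i]);
            state *= 0x100000001b3ULL;
        }
    }

    std::uint64_t finalize() const { return state; }

private:
    std::uint64_t state = 0xcbf29ce484222325ULL;
};

// Encodes a length as 8 little-endian bytes, independent of host endianness.
inline std::string encodeLength(std::uint64_t length)
{
    std::string out(8, '\0');
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<char>((length >> (8 * i)) & 0xff);
    return out;
}

// The exact byte stream that variant 2 feeds to the hash state:
// length of each element, followed by its contents.
inline std::string frameTuple(const std::vector<std::string> & elements)
{
    std::string framed;
    for (const auto & element : elements)
    {
        framed += encodeLength(element.size());
        framed += element;
    }
    return framed;
}

// Variant 2: update the state with length then contents per element, finalize once.
inline std::uint64_t hashTuple(const std::vector<std::string> & elements)
{
    ToyHashState state;
    for (const auto & element : elements)
    {
        const std::string length = encodeLength(element.size());
        state.update(length.data(), length.size());
        state.update(element.data(), element.size());
    }
    return state.finalize();
}
```

The length prefix is what keeps element boundaries unambiguous: with plain concatenation, the tuples ("ab", "c") and ("a", "bc") would feed identical bytes to the hash, whereas `frameTuple` produces different byte streams for them.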


SELECT murmurHash3_64(1, 2, 3);
SELECT murmurHash3_64(1, 3, 2);
SELECT murmurHash3_64('a', [1, 2, 3], 4);

Please also add a test with a tuple argument, or with multiple arguments where one of them is a tuple.

3533626746
2388617433
2708309598
3012058918

I don't understand why this was changed...

It was simply changed to use intHash32 or intHash64, as was already the case for the other non-cryptographic hash functions.
I don't think this is the expected behaviour.

alexey-milovidov merged commit 82933e9 into ClickHouse:master on Nov 1, 2018

alexey-milovidov commented:

Even halfMD5 was switched to intHash for integers. This is totally wrong. I will fix it.

alexey-milovidov added a commit that referenced this pull request Nov 1, 2018
alexey-milovidov added a commit that referenced this pull request Nov 1, 2018
Fixed idiosyncrasy with hash functions introduced in #3451