Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expression: POC implementation of Vitess hashing algorithm. #20972

Closed
wants to merge 3 commits into from

Conversation

bezmax
Copy link
Contributor

@bezmax bezmax commented Nov 11, 2020

What problem does this PR solve?

Issue Number: close #20971

Problem Summary: Implementing vitess_hash function to support efficient migration between tidb <-> vitess.

What is changed and how it works?

Proposal: #20971

What's Changed: New builtin_misc function added implementing vitess_hash algorithm.

Related changes

In order PRs need to be merged:

  1. Adding vitess_hash function code to tipb tipb#198
  2. Adding vitess_hash function to ast parser#1089
  3. (This PR) expression: POC implementation of Vitess hashing algorithm. #20972

Check List

Tests

  • Unit test
  • Integration test

Release note

  • Added vitess_hash function which implements Vitess' hashing algorithm to determine sharding key.

@bezmax bezmax requested a review from a team as a code owner November 11, 2020 01:17
@bezmax bezmax requested review from XuHuaiyu and removed request for a team November 11, 2020 01:17
@ti-srebot ti-srebot added the first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. label Nov 11, 2020
@CLAassistant
Copy link

CLAassistant commented Nov 11, 2020

CLA assistant check
All committers have signed the CLA.

@sre-bot
Copy link
Contributor

sre-bot commented Nov 11, 2020

@bezmax
Copy link
Contributor Author

bezmax commented Nov 12, 2020

The build is failing as it depends on changes to tipb and parser to be merged first. The links to respective PRs are in description.

@bezmax bezmax changed the title Draft: POC implementation of Vitess hashing algorithm. POC implementation of Vitess hashing algorithm. Nov 12, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Nov 19, 2020

@sre-bot
Copy link
Contributor

sre-bot commented Nov 19, 2020

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what's changed

Or if the count of mainly changed packages are more than 3, use

  • *: what's changed

@ghost ghost changed the title POC implementation of Vitess hashing algorithm. expression: POC implementation of Vitess hashing algorithm. Nov 19, 2020
@qw4990 qw4990 requested review from Reminiscent and qw4990 November 23, 2020 03:49
}

// evalString evals VITESS_HASH(int64|string|decimal).
func (b *builtinVitessHashSig) evalString(row chunk.Row) (string, bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we implement the vectorized expression function for builtinVitessHashSig in expression/builtin_miscellaneous_vec.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Reminiscent Could you explain what are "vectorized expression functions" in this context?

Copy link
Contributor

@Reminiscent Reminiscent Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Reminiscent Could you explain what are "vectorized expression functions" in this context?

You can see this blog for more information. In short, the expression function builtinVitessHashSig.evalString here processes one row of data at a time. And if you implement the vectorized expression function for builtinVitessHashSig.vecEvalString in expression/builtin_miscellaneous_vec.go, it can process a batch at a time. The blog introduces more information.

defer s.cleanEnv(c)
tk := testkit.NewTestKit(c, s.store)
tk.MustExec("use test")
tk.MustQuery("select vitess_hash(30375298039) from t").Check(testkit.Rows("\x03\x12\x65\x66\x1E\x5F\x11\x33"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we create a table t first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not create t for the reason that there are many other function tests that do not do that. For example tests for str_to_date, hour, time, etc. If you think it's important - I will add t.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to add some statements like

tk.MustExec("drop table if exists t;")
tk.MustExec("create table t(a int);")

for safty.

@Reminiscent
Copy link
Contributor

@qw4990 @XuHuaiyu PTAL

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 23, 2021
@ti-chi-bot
Copy link
Member

@bezmax: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-srebot
Copy link
Contributor

@bezmax, please update your pull request.

1 similar comment
@ti-srebot
Copy link
Contributor

@bezmax, please update your pull request.

@ti-srebot
Copy link
Contributor

@bezmax PR closed due to no update for a long time. Feel free to reopen it anytime.

@ti-srebot ti-srebot closed this Mar 20, 2021
@bezmax
Copy link
Contributor Author

bezmax commented Mar 23, 2021

Would like this PR to be re-opened. Thanks!

@kolbe
Copy link
Contributor

kolbe commented Mar 23, 2021

/reopen

@ti-chi-bot
Copy link
Member

@kolbe: Failed to re-open PR: state cannot be changed. The bezmax-vitess-hash branch was force-pushed or recreated.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bezmax
Copy link
Contributor Author

bezmax commented Mar 23, 2021

Given this PR got closed, I re-created it with all the fixes applied here: #23493

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/expression first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/execution SIG execution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for vitess' hashing algorithm
9 participants