feat: add --user_percent option #561

Merged
merged 32 commits into from
Apr 9, 2020

Conversation

jrconlin
Member

@jrconlin jrconlin commented Apr 1, 2020

Description

Adds the following flags:

  • --user_percent divides the users into blocks and moves the specified block. It takes a value formatted as "block#:percentage". Block numbers are 1-based. For example,
    --user_percent=2:33 divides the total distinct users into non-overlapping blocks of approximately 33% each, then moves the second block (e.g. users 33-65 of a zero-indexed list of 100). Users left over after the even division are appended to the last block (e.g. for --user_percent=3:33, users 66-99 would be copied over, a total of 34 users).

  • --ms_delay pauses this many milliseconds between spanner transaction writes (default 0). Each spanner transaction contains --write_chunk rows (default 1000). (Note: --write_chunk is a rename of the unused --readchunk option.)
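
As a rough illustration of the block arithmetic described for --user_percent, here is a minimal sketch. It is not the code in migrate_node.py: the function names `parse_user_percent` and `block_bounds` are hypothetical, and the rounding rule (floor division, with leftovers going to the last block) is an assumption chosen to reproduce the 100-user example above.

```python
from typing import Tuple


def parse_user_percent(value: str) -> Tuple[int, int]:
    """Split a "block#:percentage" string into (block, percent)."""
    block, percent = value.split(":")
    return int(block), int(percent)


def block_bounds(total_users: int, block: int, percent: int) -> Tuple[int, int]:
    """Return zero-based (start, end) indexes for a block; end is exclusive.

    Assumption: each block holds floor(total * percent / 100) users and
    any leftover users are appended to the last block.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    num_blocks = 100 // percent
    if not 1 <= block <= num_blocks:
        raise ValueError("block must be between 1 and %d" % num_blocks)
    size = total_users * percent // 100
    start = (block - 1) * size  # block numbers are 1-based
    end = total_users if block == num_blocks else start + size
    return start, end


block, percent = parse_user_percent("3:33")
start, end = block_bounds(100, block, percent)
print(start, end - 1, end - start)
```

With 100 users, "2:33" selects indexes 33 through 65 and "3:33" selects 66 through 99, which matches the 34-user last block in the description.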

Testing

Local testing can be used to verify that users are divided into groups properly. --dryrun should help by preventing data from actually being written to spanner.

Issue(s)

Issue #407 #571

@jrconlin jrconlin added 🚧WIP Work in Progress 2 Estimate - s - This is a small change with clearly defined parameters. labels Apr 1, 2020
@jrconlin jrconlin requested a review from a team April 1, 2020 23:05
The `--user_percent` option will divvy up the users into blocks and
move the specified block. It takes an option formatted as
"block#:percentage". Block numbers are 1 based. For example,
--user_percent=2:33 will divide the total distinct users into
non-overlapping blocks of approximately 33%, and then move the second
block (e.g. the 33-65th users in the list). Extra users that may not be
evenly divided into percentage blocks will be appended to the last
block. (e.g. for `--user_percent=3:33`, users 66-99 would be copied
over, a total of 34 users)

Issue #407
Use `ms_delay` to pause between spanner transaction `--readchunk`s. This allows
some primitive throttling for feeding spanner data.
Reminder: `readchunk` sets the max number of items to try to write per chunk
to spanner in any given transaction, default value 1000.
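
The throttling idea in this commit message could be sketched roughly as follows. This is an illustrative assumption, not the migration tool's implementation: `write_throttled`, `chunked`, and the `commit` callback are hypothetical names, and the callback stands in for whatever actually performs the spanner transaction.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(rows: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def write_throttled(
    rows: Sequence,
    write_chunk: int = 1000,
    ms_delay: int = 0,
    commit: Callable[[Sequence], None] = lambda batch: None,
    dryrun: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write rows one chunk per transaction, pausing ms_delay between chunks."""
    written = 0
    for batch in chunked(rows, write_chunk):
        if not dryrun:
            commit(batch)  # hypothetical stand-in for one spanner transaction
        written += len(batch)
        if ms_delay:
            sleep(ms_delay / 1000.0)  # milliseconds to seconds
    return written


batches: List[Sequence] = []
count = write_throttled(list(range(2500)), ms_delay=5, commit=batches.append)
print(count, [len(b) for b in batches])
```

Passing `sleep` as a parameter is only a convenience for testing the pause without waiting in real time.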
tools/user_migration/migrate_node.py: 3 outdated review threads (resolved)
@jrconlin jrconlin requested a review from pjenvey April 3, 2020 00:38
pjenvey previously approved these changes Apr 6, 2020
pjenvey previously approved these changes Apr 8, 2020
@pjenvey pjenvey merged commit f3c9751 into master Apr 9, 2020
@jrconlin jrconlin deleted the user_migration3 branch April 9, 2020 21:08
Labels
2 Estimate - s - This is a small change with clearly defined parameters. 🚧WIP Work in Progress
3 participants