Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Tree #54

Merged
merged 12 commits into from
Jul 27, 2018
Merged

Use Tree #54

merged 12 commits into from
Jul 27, 2018

Conversation

BillDorn
Copy link
Contributor

Rewrite of library to store contexts and configurations in a trie style tree, this scales better than the current method.
Most internal methods and variables are changed so client code that relies on these will likely break, support can be added if needed. Public methods should match previous behavior in all cases.
Perf measurements based off the preexisting benchmark for reads and initial load time:

Input YCB(ops/sec) Tree(ops/sec) Improvement
Base 472,447 2,985,223 6.3x
y20 (large config / large dimensions) 29,010 245,589 8.5x
y20 (large config / small dimensions) 40,699 280,518 6.9x
y20 (small config / large dimensions) 93,800 5,046,250 53.8x
y20 (small config / small dimensions) 230,637 3,181,432 13.8x
y20 (small config / large dimensions / big context) 98,767 5,032,506 51.0x
y20 (large config / large dimensions / big context) 18,654 173,161 9.3x
y20 (medium config / jumbo dimensions / tiny context) 119,894 865,038 7.2x
y20 (medium config / jumbo dimensions / medium context) 23,115 481,654 20.8x
y20 (medium config / jumbo dimensions / bigger context) 0.46 167,406 364,000x
Input YCB(ops/sec) Tree(ops/sec) Improvement
Load Base 153,850 205,738 1.3x
Load (large config / large dimensions) 53 148 2.8x
Load (small config / large dimensions) 845 3,991 4.7x
Load (small config / small dimensions) 12,579 29,781 2.4x
Load (large config / small dimensions) 241 391 1.6x
Load (medium config / jumbo dimensions) 586 872 1.5x

@drewfish
Copy link
Contributor

drewfish commented Jun 5, 2018

Cool thanks! It might take me a while to review this. Since this a big change it might make sense to bump the major version number (even though the API is the same). What do you think?

@drewfish
Copy link
Contributor

drewfish commented Jun 5, 2018

Oh yeah when we were using this heavily in production we noticed that the large amount of object creations was causing garbage collection pressure. Not sure if there's something we can do in the tests (existing or new) to capture the GC events and compare them between the old & new code.

@redonkulus
Copy link
Collaborator

Overall, I'm ok with the changes. I'm not intimately familiar with the previous code so I don't know the inns and outs of each change. However, if upon testing with our production code that the same input produces the same output and there are no performance regressions, we can publish it with a major version bump.

@redonkulus
Copy link
Collaborator

After testing internally and validating the same output is produced with these changes, we want to start testing this. Therefore, I'm going to merge this and publish under 2.x.

Thanks @BillDorn!

@redonkulus redonkulus merged commit 1beec82 into yahoo:master Jul 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants