Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose an unload method for SBT nodes #784

Merged
merged 3 commits into from
Jan 15, 2020
Merged

Expose an unload method for SBT nodes #784

merged 3 commits into from
Jan 15, 2020

Conversation

luizirber
Copy link
Member

From #778 (comment):

Another comment: For search we only need to load internal nodes once (they will never be checked again). This helps saving total memory consumed (because we can unload the internal node after checking it). The feature/unload branch expose this in the find function, but needs more tests.

This is not so useful for gather, because internal nodes might be checked more than once, but might lead to a "low-memory" mode that always unload internal node data, or a mixed approach where we cache the internal node data for frequently accessed nodes.

A dirty version of this is in the unassigned.py I wrote for @taylorreiter, but I would rather avoid having to dig into private fields like this and have a proper method =P

This is the PR exposing the proper method.

Checklist

  • Is it mergeable?
  • make test Did it pass the tests?
  • make coverage Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

@codecov
Copy link

codecov bot commented Dec 5, 2019

Codecov Report

Merging #784 into master will decrease coverage by 0.94%.
The diff coverage is 80.86%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #784      +/-   ##
==========================================
- Coverage   79.37%   78.42%   -0.95%     
==========================================
  Files          83       82       -1     
  Lines        7035     6889     -146     
  Branches      469      469              
==========================================
- Hits         5584     5403     -181     
- Misses       1151     1185      +34     
- Partials      300      301       +1
Flag Coverage Δ
#pytests 89.47% <86.95%> (-0.46%) ⬇️
#rusttests 48.54% <45.83%> (-0.17%) ⬇️
Impacted Files Coverage Δ
sourmash/cli/sig/describe.py 88.67% <100%> (ø) ⬆️
sourmash/sourmash_args.py 96.03% <100%> (ø) ⬆️
sourmash/sig/__main__.py 82.5% <33.33%> (ø) ⬆️
src/core/src/sketch/minhash.rs 34.4% <45.83%> (-0.03%) ⬇️
sourmash/commands.py 83.54% <50%> (-0.15%) ⬇️
sourmash/signature.py 87.06% <88.46%> (-7.97%) ⬇️
sourmash/_compat.py 50% <0%> (-50%) ⬇️
src/core/src/index/storage.rs 69.64% <0%> (-2.58%) ⬇️
src/core/src/index/bigsi.rs 60.21% <0%> (-0.66%) ⬇️
sourmash/lca/lca_utils.py 96% <0%> (-0.62%) ⬇️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4119b43...b6879ee. Read the comment docs.

@luizirber luizirber added this to the post-3.0 milestone Dec 30, 2019
@luizirber luizirber force-pushed the feature/unload branch 2 times, most recently from ed4a6de to b4025ee Compare January 7, 2020 06:04
@luizirber
Copy link
Member Author

I defaulted it to False in Index.find(), but changed search to set unload_data to True. I think this makes sense since sourmash search only checks the index once, and if you're doing more advanced uses of search (like keeping the index in memory, and doing repeated searches) you can then set unload_data to False.

An example of this in action:
image
(ignore the increase in rust_sig, other things happening there)

thoughts @ctb?

@luizirber luizirber modified the milestones: post-3.0, 3.1 Jan 14, 2020
@luizirber luizirber changed the title [WIP] expose an unload method for SBT nodes E[WIP] expose an unload method for SBT nodes Jan 14, 2020
@luizirber luizirber changed the title E[WIP] expose an unload method for SBT nodes Expose an unload method for SBT nodes Jan 14, 2020
@luizirber luizirber merged commit cba11c5 into master Jan 15, 2020
@luizirber luizirber deleted the feature/unload branch January 15, 2020 18:15
@luizirber luizirber restored the feature/unload branch January 15, 2020 18:15
@luizirber luizirber deleted the feature/unload branch January 15, 2020 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants