content: require backing store for checkpoint #6255

Open
wants to merge 6 commits into
base: master

chu11
Member

@chu11 chu11 commented Sep 3, 2024

Problem: A backing store is required for content.flush but it is not required for content.checkpoint-put. This is inconsistent and can lead to checkpointing problems down the line.

Require content.checkpoint-put to only work if there is a backing store available. As a consequence, remove code that handled "cached" checkpoints when a backing store is not available.
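
To illustrate the intended behavior, here is a minimal standalone sketch, not the flux-core implementation. The `struct content_state` type, its fields, and both function names are hypothetical stand-ins, and the use of `ENOSYS` as the failure code is an assumption made for the example. It shows both operations failing the same way when no backing store is loaded, instead of the checkpoint being cached in memory.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the content module's state.
 * This is NOT the actual flux-core structure. */
struct content_state {
    bool backing_loaded;     /* true once a backing store is available */
    const char *checkpoint;  /* last checkpoint value accepted, or NULL */
    int dirty;               /* number of cache entries awaiting flush */
};

/* Flush dirty entries to the backing store.
 * Returns 0 on success, or -1 with errno set to ENOSYS (assumed
 * error code) when no backing store is loaded. */
int content_flush (struct content_state *c)
{
    assert (c != NULL);
    if (!c->backing_loaded) {
        errno = ENOSYS;
        return -1;
    }
    c->dirty = 0;
    return 0;
}

/* Store a checkpoint.  With the change described above, this fails the
 * same way as content_flush() when no backing store is loaded, rather
 * than keeping a "cached" checkpoint in memory. */
int content_checkpoint_put (struct content_state *c, const char *value)
{
    assert (c != NULL && value != NULL);
    if (!c->backing_loaded) {
        errno = ENOSYS;
        return -1;
    }
    c->checkpoint = value;
    return 0;
}
```

In this sketch a caller that performs a flush and then a checkpoint-put sees a single, consistent failure mode: either both calls can succeed or both report that the backing store is missing.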

@chu11 chu11 force-pushed the issue6251_content_checkpoint_require_backing branch 3 times, most recently from 431e867 to 72db5f3 Compare September 4, 2024 20:45
Problem: An accidental 'd' was added to remove, making it "removed".

Fix spelling.
Problem: A test in t0028-content-backing-none.t incorrectly
calls checkpoint_put when it should call checkpoint_get.

Fix invalid test.
Problem:  The typical message unpack style is to place key names
and storage pointers on the same line, but that is not done in
several locations in the content and content backing modules.

Correct code style to be more consistent with the rest of flux-core.
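
As a rough illustration of the layout this commit describes, the sketch below keeps each key name on the same line as the pointer that receives its value. To stay self-contained it does not call any flux-core function: `struct pair`, `unpack_ints()`, and `decode_example()` are hypothetical helpers invented for this example, as are the key names.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message representation: a flat list of integer fields. */
struct pair {
    const char *key;
    int value;
};

/* Hypothetical unpack helper.  The variable arguments are
 * (const char *key, int *out) pairs terminated by a NULL key.
 * Returns 0 on success, -1 if a requested key is not present. */
int unpack_ints (const struct pair *msg, size_t len, ...)
{
    va_list ap;
    const char *key;
    int rc = 0;

    assert (msg != NULL);
    va_start (ap, len);
    while ((key = va_arg (ap, const char *)) != NULL) {
        int *out = va_arg (ap, int *);
        size_t i;

        for (i = 0; i < len; i++) {
            if (strcmp (msg[i].key, key) == 0) {
                *out = msg[i].value;
                break;
            }
        }
        if (i == len) { /* key not found */
            rc = -1;
            break;
        }
    }
    va_end (ap);
    return rc;
}

/* Layout being illustrated: one "key", pointer pair per line, so it is
 * easy to see which variable receives which field. */
int decode_example (const struct pair *msg, size_t len, int *flags, int *seq)
{
    return unpack_ints (msg, len,
                        "flags", flags,
                        "sequence", seq,
                        (const char *)NULL);
}
```

Pairing each key with its destination on one line makes a mismatched or missing argument easier to spot in review than listing all keys first and all pointers afterwards.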
Problem: A backing store is required for content.flush but it
is not required for content.checkpoint-put.  This is inconsistent
and can lead to checkpointing problems down the line.

Require content.checkpoint-put to only work if there is a backing
store available.  As a consequence, remove code that handled
"cached" checkpoints when a backing store is not available.

Fixes flux-framework#6251
Problem: Now that the content backing store is required for checkpoints,
many tests fail.

Remove tests that previously assumed that checkpointing worked without
a content backing store.  Adjust some tests that now have a new
error message.
Problem: There is no coverage to ensure that the "none" backing
store works identically to having no backing store loaded.

Add coverage in t0028-content-backing-none.t.
@chu11 chu11 force-pushed the issue6251_content_checkpoint_require_backing branch from 72db5f3 to 9363a9a Compare December 18, 2024 18:31
@chu11
Member Author

chu11 commented Dec 18, 2024

rebased and re-pushed, just didn't want this series of fixes to be forgotten :-) (along w/ #6240 and #6260)

@garlick
Member

garlick commented Dec 18, 2024

I guess one caveat to this change is that, if running without a backing store, you could no longer reload the KVS module without losing all the data. Do we care about that?


codecov bot commented Dec 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.63%. Comparing base (c9eb3a8) to head (9363a9a).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6255      +/-   ##
==========================================
+ Coverage   83.61%   83.63%   +0.01%     
==========================================
  Files         522      522              
  Lines       87734    87650      -84     
==========================================
- Hits        73356    73302      -54     
+ Misses      14378    14348      -30     
| Files with missing lines | Coverage Δ |
|---|---|
| src/modules/content-files/content-files.c | 73.91% <ø> (ø) |
| src/modules/content-sqlite/content-sqlite.c | 73.88% <ø> (ø) |
| src/modules/content/cache.c | 85.43% <ø> (-0.03%) ⬇️ |
| src/modules/content/checkpoint.c | 77.86% <100.00%> (-1.16%) ⬇️ |

... and 9 files with indirect coverage changes

@chu11
Member Author

chu11 commented Dec 18, 2024

> I guess one caveat to this change is that, if running without a backing store, you could no longer reload the KVS module without losing all the data. Do we care about that?

I think it's ok?

The reason for this fix was simply the inconsistency. A content-flush + checkpoint-put is (will always be?) a combination that is done together. One of those can work without a backing store right now, but the other one can't. So it sort of doesn't make sense.

So presumably the alternative would be to try and support content-flush and checkpoint-put if a backing store isn't loaded?

2 participants