[BUG] Cannot clean up corrupted cache when pillar_cache_backend=disk #62527
Comments
Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey.
There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar.
@Ch3LL Hi! I was on the salt community call last week, and I promised to file the bug I was trying to describe. What I could also do is submit a patch which just wraps the msgpack reading thing with a
Looks like I'm able to replicate this. If you submit a PR, I will be more than willing to review and test it. I haven't gone into the code yet, but I will when you submit the PR and make sure it's the correct fix.
FYI it took a while because I was out on vacation, but I just put up the PR.
@rgeoghegan Do we know the reason this file is getting corrupted in the first place?
@dwoz Nothing is specifically corrupting the file, but I was playing with clearing the pillar cache file by just deleting it, and noticed a race condition in the code (along with this bug), and saw that if the file is corrupted, there is no way to recover other than manually deleting the disk cache file.
Closed by #62760
Description
If I use the pillar_cache_backend: "disk" config option, and the on-disk msgpack file for a minion gets corrupted, the pillar is now blank, and any attempt to run pillar.clear_pillar_cache crashes, even after restarting the salt-master.
Setup
I am using salt 3004.1 from the yum repo:
I set up a system with a master and a minion with one pillar file:
my_pillar.sls
Steps to Reproduce the behavior
I start with my pillar working as expected:
I add stuff to the pillar cache file to make it an invalid msgpack file:
Now my pillar is reported as empty:
And the master log has an exception:
Running clear_pillar_cache does not work:
And all this behaviour persists even if the salt-master is restarted.
If I delete the cache file, everything returns to normal:
Expected behavior
IMHO, an unreadable cache file should be treated as a missing cache, and just cause the pillar to be rebuilt.
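To make the expected behaviour concrete, here is a rough sketch of the idea, not Salt's actual code and not necessarily what the eventual fix does. The names load_cache_or_none and get_pillar are hypothetical, and json stands in for msgpack only so the snippet runs without extra packages; the point is just that a read or decode failure is handled the same way as a missing file.

```python
import json
import os


def to_bytes(data):
    # Stand-in serializer; the real disk cache uses msgpack (assumption: json here
    # only keeps the sketch dependency-free).
    return json.dumps(data).encode("utf-8")


def load_cache_or_none(path, deserialize=json.loads):
    """Return the cached data at `path`, or None if it is missing or unreadable."""
    try:
        with open(path, "rb") as fh:
            return deserialize(fh.read())
    except FileNotFoundError:
        # No cache yet: the caller should rebuild.
        return None
    except Exception:
        # Unreadable or corrupted cache: drop it and treat it as missing.
        try:
            os.remove(path)
        except OSError:
            pass
        return None


def get_pillar(path, rebuild, serialize=to_bytes, deserialize=json.loads):
    """Return cached pillar data, rebuilding and rewriting the cache when needed."""
    data = load_cache_or_none(path, deserialize)
    if data is None:
        data = rebuild()  # hypothetical callable that compiles the pillar
        with open(path, "wb") as fh:
            fh.write(serialize(data))
    return data
```

With something along these lines, a corrupted file would cost one rebuild instead of leaving the pillar blank until someone deletes the file by hand.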
Versions Report
salt --versions-report
(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)