Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Bug with caching #748

Closed
larsoner opened this issue Jun 27, 2023 · 2 comments · Fixed by #756
Closed

BUG: Bug with caching #748

larsoner opened this issue Jun 27, 2023 · 2 comments · Fixed by #756

Comments

@larsoner
Copy link
Member

I was just running some tests and:

  1. Ran without any derivatives (by rm -Rfing them)
  2. Changed a parameter.
  3. Ran again, and it recomputed stuff and wrote files (good).
  4. Changed the parameter back to its previous value.
  5. Ran again, and *it didn't rerun" because it said it had cached.

I think (5) is a bug -- somehow the file being there its mtime or hash or whatever didn't cause it to run again. This seems weird/bad to me. But I don't think it's too surprising since I think we only check the mtime of the input files and not the output files. Somehow we need to check to make sure that the output files haven't been overwritten (or modified) when determining if we need to rerun or not.

@hoechenberger
Copy link
Member

I remember we discussed this as a known Limitation back when we implemented caching. For now, we should probably at least document this behavior

@larsoner
Copy link
Member Author

It seems like we should be able to overcome it by:

  1. Storing and the mtime/hash of the out_files when actually running the step and returning {key: (filename, hash) ... } as the out_files dict of the step.
  2. When we do the joblib caching check, if they say the step is done/complete, we manually additionally check the hash/mtime of all the filenames and force a re-run if they've changed from what's expected.

I think it should work and not add much overhead. (Well, the minimum required amount of overhead I guess.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants