BUG: Bug with caching #748

larsoner · 2023-06-27T19:04:10Z

I was just running some tests and:

1. Ran without any derivatives (by rm -Rfing them).
2. Changed a parameter.
3. Ran again, and it recomputed stuff and wrote files (good).
4. Changed the parameter back to its previous value.
5. Ran again, and it *didn't rerun* because it said it was cached.

I think (5) is a bug -- somehow the output file being there, with whatever mtime or hash it now has, didn't cause it to run again. This seems weird/bad to me. But I don't think it's too surprising, since I think we only check the mtime of the input files and not the output files. Somehow we need to check that the output files haven't been overwritten (or modified) when determining whether we need to rerun.


hoechenberger · 2023-06-27T20:04:28Z

I remember we discussed this as a known limitation back when we implemented caching. For now, we should probably at least document this behavior.

larsoner · 2023-06-28T10:15:41Z

It seems like we should be able to overcome it by:

1. Storing the mtime/hash of the out_files when actually running the step, and returning {key: (filename, hash) ... } as the out_files dict of the step.
2. When we do the joblib caching check, if it says the step is done/complete, we additionally check the hash/mtime of all the filenames manually and force a re-run if they've changed from what's expected.

I think it should work and not add much overhead. (Well, the minimum required amount of overhead I guess.)
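
A minimal sketch of the two steps above, to make the idea concrete. It does not use joblib or the project's actual code: a plain in-memory dict stands in for the caching check, and the names `hash_file`, `OutputCheckedCache`, and `run` are hypothetical. It assumes each step returns a {name: filename} dict and that parameter values are hashable.

```python
import hashlib
from pathlib import Path


def hash_file(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


class OutputCheckedCache:
    """Toy cache keyed on step parameters that also verifies output files.

    Hypothetical stand-in for the real caching layer, for illustration only.
    """

    def __init__(self):
        # Maps a cache key to {name: (filename, hash)} recorded at run time.
        self._records = {}

    def run(self, step, **params):
        """Run ``step(**params)`` unless a still-valid cached record exists.

        Returns ``(record, ran)``, where ``ran`` says whether the step executed.
        """
        key = (step.__name__, tuple(sorted(params.items())))
        record = self._records.get(key)
        if record is not None and self._outputs_unchanged(record):
            # Cache hit, and the files on disk still match what was written.
            return record, False
        out_files = step(**params)  # assumed to return {name: filename}
        record = {
            name: (str(fname), hash_file(fname))
            for name, fname in out_files.items()
        }
        self._records[key] = record
        return record, True

    @staticmethod
    def _outputs_unchanged(record):
        """Check every recorded output file against its stored hash."""
        for fname, expected in record.values():
            path = Path(fname)
            if not path.exists() or hash_file(path) != expected:
                return False
        return True
```

With this check in place, the sequence from the original report behaves as hoped: running with a parameter, changing it, then changing it back triggers a rerun on the last call, because the output file was overwritten in between and its hash no longer matches the stored one. Using mtime instead of a hash would be cheaper but is only a proxy for the file contents.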

larsoner mentioned this issue Jul 5, 2023

BUG: Fix bug with cache invalidation #756

Merged

1 task

hoechenberger closed this as completed in #756 Jul 6, 2023
