This repository has been archived by the owner on Nov 24, 2023. It is now read-only.
load phase of `all` mode may lose data when worker frequently starts/stops task #1377
Bug Report
Please answer these questions before submitting your issue. Thanks!
What did you do? If possible, provide a recipe for reproducing the error.
Will reproduce the error later.
This bug is caused by two hidden bugs:
a) `cleanDumpFiles` is wrongly triggered because it checks `checkPoint.AllFinished()`, but this is an in-memory cache of the DB which may not be initialized.
b) The load unit doesn't return an error when SQL files that should exist according to the checkpoint are missing.
When the worker runs `Close` too soon because of scheduling or network problems, it may trigger a). In `all` mode, `cleanDumpFiles` will clean the SQL files (in `full` mode, `cleanDumpFiles` will clean the whole directory, including the dump metadata, which would cause an error). The next time the worker resumes the task, it continues on the load unit, because `IsFresh` looks into the database rather than the cache that `checkPoint.AllFinished()` uses. When the load unit starts, it triggers b): it finishes because there are no files to load and moves on to the sync unit.
What did you expect to see?
What did you see instead?
Versions of the cluster
DM version (run `dmctl -V` or `dm-worker -V` or `dm-master -V`):