This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

load phase of all mode may lost data when worker frequently starts/stops task #1377

Closed
Tracked by #1388
lance6716 opened this issue Jan 14, 2021 · 1 comment · Fixed by #1378
Labels
severity/major, type/bug (This issue is a bug report)

Comments

lance6716 (Collaborator) commented Jan 14, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.

    I will reproduce the error later.

    This bug is caused by two hidden bugs:

    a) cleanDumpFiles is wrongly triggered. It checks checkPoint.AllFinished(), but that reads an in-memory cache of the DB, which may not be initialized yet.
    b) The load unit doesn't return an error when SQL files that should exist according to the checkpoint are missing.

    When a worker is closed too quickly because of scheduling or network problems, it may trigger a). In all mode, cleanDumpFiles removes the SQL files. (In full mode, cleanDumpFiles removes the whole directory, including the dump metadata, and would therefore cause an error.) The next time the worker resumes the task, it continues from the load unit, because IsFresh looks into the database rather than the cache that checkPoint.AllFinished() uses. When the load unit starts, it triggers b): the load unit finishes because there are no files to load, and the task moves on to the sync unit. The data in the removed SQL files is never loaded.

  2. What did you expect to see?

  3. What did you see instead?

  4. Versions of the cluster

    • DM version (run dmctl -V or dm-worker -V or dm-master -V):

      (paste DM version here, and you must ensure versions of dmctl, DM-worker and DM-master are same)
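
To make the two hidden bugs in item 1 concrete, here is a minimal Go sketch of the guards that would prevent each one. It is not DM's actual code: the `checkpoint` type, its fields, and the helpers `shouldCleanDumpFiles` and `checkDumpFiles` are hypothetical names invented for illustration. It also assumes that an uninitialized cache looks like "nothing left to do", and that any file the checkpoint records as unfinished must still exist on disk.

```go
package main

import (
	"errors"
	"fmt"
)

// checkpoint is a hypothetical stand-in for the load unit's checkpoint.
// It is NOT DM's real type; field names are assumptions for this sketch.
type checkpoint struct {
	loaded   bool            // true once the in-memory cache was filled from the DB
	finished map[string]bool // dump file name -> fully loaded?
}

var errNotLoaded = errors.New("checkpoint cache not loaded from DB")

// allFinished refuses to answer before the cache is initialized, so an
// empty, never-loaded cache can no longer be mistaken for "all finished".
func (c *checkpoint) allFinished() (bool, error) {
	if !c.loaded {
		return false, errNotLoaded
	}
	for _, done := range c.finished {
		if !done {
			return false, nil
		}
	}
	return true, nil
}

// shouldCleanDumpFiles guards against a): clean only when the checkpoint
// is known (not merely assumed) to be complete.
func shouldCleanDumpFiles(c *checkpoint) bool {
	done, err := c.allFinished()
	return err == nil && done
}

// checkDumpFiles guards against b): every file the checkpoint records as
// unfinished must still be present, otherwise loading must fail loudly.
func checkDumpFiles(c *checkpoint, onDisk map[string]bool) error {
	for name, done := range c.finished {
		if !done && !onDisk[name] {
			return fmt.Errorf("dump file %s recorded in checkpoint is missing", name)
		}
	}
	return nil
}

func main() {
	// Worker closed before the cache was loaded: cleaning must be skipped.
	fresh := &checkpoint{}
	fmt.Println("clean on unloaded cache:", shouldCleanDumpFiles(fresh))

	// Resumed task whose SQL file has vanished: the load unit must error out
	// instead of finishing silently and handing over to the sync unit.
	cp := &checkpoint{loaded: true, finished: map[string]bool{"db.t1.sql": false}}
	fmt.Println("load check:", checkDumpFiles(cp, map[string]bool{}))
}
```

With either guard in place, the sequence described above stops early: the files are not removed on a premature Close, or the resumed load unit reports the missing files rather than moving on to the sync unit.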
      
@lance6716 lance6716 changed the title load phase may lost data load phase may lost data when worker is frquently start/stop task Jan 14, 2021
@lance6716 lance6716 changed the title load phase may lost data when worker is frquently start/stop task load phase may lost data when worker frquently starts/stops task Jan 14, 2021
@lance6716 lance6716 changed the title load phase may lost data when worker frquently starts/stops task load phase of full mode may lost data when worker frquently starts/stops task Jan 14, 2021
@lance6716 lance6716 changed the title load phase of full mode may lost data when worker frquently starts/stops task load phase of all mode may lost data when worker frquently starts/stops task Jan 14, 2021
@lance6716 lance6716 changed the title load phase of all mode may lost data when worker frquently starts/stops task load phase of all mode may lost data when worker frequently starts/stops task Jan 14, 2021
@lance6716 lance6716 changed the title load phase of all mode may lost data when worker frequently starts/stops task load phase of all mode may lost data when worker frequently starts/stops task Jan 14, 2021

glkappe commented Jan 14, 2021

asktug link: https://asktug.com/t/topic/67791
