This repository has been archived by the owner on Jan 3, 2023. It is now read-only.
Description:
Currently, baseSync fetches the file list and then inserts all diffs into the metastore before it starts processing file sync. This procedure works well when the number of files is small. But when there are billions of files in the src dir, the processing time before file sync starts may be unacceptable.
Basic solution/idea (process the file list asynchronously):
Add all files to memory (very fast).
Batch insert diffs into metastore (slow due to metastore and network).
Process file diffs as soon as they are in the metastore (fast).
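
The three steps above can be sketched as a small producer/consumer pipeline. This is only an illustration of the idea, not the project's implementation: `InMemoryMetastore`, `async_base_sync`, `process_diff`, and the batch size are all hypothetical names and values, and the sketch is written in Python for brevity.

```python
import queue
import threading


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


class InMemoryMetastore:
    """Hypothetical stand-in for the real metastore (assumption)."""

    def __init__(self):
        self.diffs = []

    def insert_batch(self, batch):
        # A real metastore insert would be slow (database + network).
        self.diffs.extend(batch)


def async_base_sync(files, metastore, process_diff, batch_size=1000):
    """Insert diffs in batches and process each batch once it is stored."""
    ready = queue.Queue(maxsize=8)  # bounded, so the producer cannot run far ahead
    done = object()                 # sentinel marking the end of the stream

    def producer():
        try:
            for batch in batched(files, batch_size):
                metastore.insert_batch(batch)  # step 2: batch insert
                ready.put(batch)               # hand the stored batch to step 3
        finally:
            ready.put(done)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    processed = 0
    while True:
        batch = ready.get()
        if batch is done:
            break
        for diff in batch:  # step 3: process diffs already in the metastore
            process_diff(diff)
            processed += 1
    worker.join()
    return processed
```

With this shape, file sync can begin after the first batch is inserted instead of waiting for the whole file list to reach the metastore. The bounded queue is a design choice that keeps memory use predictable if processing is slower than insertion.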