MySQL CDC stress testing #3969
@subodh1810 I haven't run a stress test yet, but I did some research into the errors happening in #4010.
He had run two syncs. From the first one I collected two ERRORs from Debezium/Airbyte. The first one:
Searching around, I found these messages (1 / 2) in the Debezium Gitter chat:
The second one is:
For this one, the recommendation is to increase the Debezium engine properties:
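For reference, these are plain engine/connector properties. A minimal sketch of how they could be raised (the specific values here are illustrative assumptions, not tested recommendations):

```java
import java.util.Properties;

final Properties props = new Properties();
// How often offsets are flushed to the offset store (Debezium default: 60000 ms).
props.setProperty("offset.flush.interval.ms", "60000");
// How long a flush may take before timing out (Debezium default: 5000 ms);
// raising this is the usual suggestion when flush-timeout errors appear.
props.setProperty("offset.flush.timeout.ms", "60000");
// Bounds on Debezium's internal change-event queue and per-poll batch size.
props.setProperty("max.queue.size", "16384");
props.setProperty("max.batch.size", "2048");
```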
I was able to reproduce this problem. There are mainly two problems:
Also, the following logs start to pop up towards the end, indicating that Debezium is not able to write the cursor information to the offset file:
Analysis: both problems are related. The decrease in speed is caused by garbage collection; take a look at the screenshot from the CPU profiling analysis. As you can see, we spent 99% of the time in garbage collection, which explains why the speed dropped substantially towards the end of the sync. Once we had processed tons of records and populated the queue very quickly, the heap size became huge, and because of that the GC had to pause the main execution thread often to do cleanup. Since the main thread was being paused for longer and longer durations, Debezium was also not able to flush the offset, which is why those errors started popping up.

I am attaching the CPU profiling report and the logs from the sync in this zip. The heap dump is 6GB in size, so it's not possible to attach.
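For what it's worth, a generic mitigation for this kind of unbounded heap growth is to bound the in-memory queue, so the producer blocks (back-pressure) instead of piling up records faster than the consumer can drain them. The sketch below is only an illustration of the idea, not the actual Airbyte/Debezium code; the class name and capacity are assumptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a bounded buffer between the change-event producer
// (e.g. the Debezium callback) and the record consumer. When the consumer
// falls behind, put() blocks the producer instead of letting the heap grow
// until GC dominates execution time.
public class BoundedRecordBuffer {

  // Capacity is an assumption; tune it against record size and available heap.
  private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);

  // Producer side: blocks when the buffer is full, applying back-pressure.
  public void add(String record) throws InterruptedException {
    queue.put(record);
  }

  // Consumer side: drains at its own pace.
  public String next() throws InterruptedException {
    return queue.take();
  }
}
```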
Very cool analysis. Nice work.
Fantastic to see this work done. I've been trying to replicate data between MySQL (Aurora) and Snowflake for the last couple of weeks to no avail, always running into the OOM problem. Most of our tables are small, but we have a handful of tables in the 10-30M range, and trying to sync them one by one did not help. It's one of the few things holding me back from using Airbyte more widely and as a replacement for our cloud-based platform. Looking forward to updates on this issue, and happy to provide what I can on our use case to help with development.
There has been a report of OOM for MySQL CDC. We should stress test the CDC connector to make sure it can handle millions of records. A user reported OOM while syncing a table with the following schema:
The table had 1M records. We should try to sync a similar table with a similar number of records, observe the memory consumption of our Docker instance, and find bottlenecks (see the heap-logging sketch below for one way to watch this).
Airbyte was running on a t3.xlarge instance (4 vCPUs and 16GB of memory).
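To watch memory during such a run (alongside `docker stats` on the host), a tiny helper could log JVM heap usage periodically. This is purely illustrative and not part of Airbyte; the class name and interval are assumptions:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Logs used/max heap every 10 seconds so heap growth (and GC pressure)
// becomes visible in the sync logs.
public class HeapUsageLogger {
  public static void start() {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    Runtime rt = Runtime.getRuntime();
    scheduler.scheduleAtFixedRate(() -> {
      long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
      long maxMb = rt.maxMemory() / (1024 * 1024);
      System.out.printf("heap used=%d MB, max=%d MB%n", usedMb, maxMb);
    }, 0, 10, TimeUnit.SECONDS);
  }
}
```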
There are a few things that might be causing the OOM (this is a guess; we need to validate it):