fix: optimize tables/schema operations #57
This improved the Squared CI tests enough for them to finish (https://github.com/meltano/squared/actions/runs/5260307866), but the GitHub streams took 27 and 33 minutes, whereas they took about 7 minutes with the transferwise variant. There are a lot of schema messages, as I mentioned in the description, so I wonder if the logic that diffs the schema against existing ones is accidentally treating every schema message as a new schema, so it's reinitializing a new sink object and draining the previous one, causing lots of wasted time. I'll need to test this theory out.
@pnadolny13 looks great 👍 Would be good to create issues for these caches in the SDK.
Closes #29
- The `prepare_schema` method was used with `IF NOT EXISTS` as a quick way to avoid these weird column casing and reserved word errors. It's slow to let it constantly retry the creation, so I removed it by fixing the `schema_exists` logic so it doesn't accidentally try to recreate the schema. This required schema names to be conformed before being passed to the `prepare_schema` method.
- Changed `get_sink` to remove `_sdc_` columns before comparing, because the existing sink schema has already been post-processed at that point and the new incoming schema has not. This should be pushed down to the SDK, but for now this worked to get my CI tests down to the original transferwise timing. cc @kgpayne
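
For readers following along, here is a minimal sketch of the comparison idea described above: drop `_sdc_`-prefixed properties from both schemas before diffing, so a post-processed sink schema and a raw incoming schema are not mistaken for a schema change. The helper names `strip_sdc_properties` and `schema_changed` and the sample schemas are hypothetical illustrations, not the code in this PR or an SDK API.

```python
from typing import Any, Dict

# Prefix of the metadata columns mentioned above.
SDC_PREFIX = "_sdc_"


def strip_sdc_properties(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of a JSON schema without `_sdc_*` properties.

    Hypothetical helper for illustration; the input is not mutated.
    """
    cleaned = dict(schema)
    cleaned["properties"] = {
        name: spec
        for name, spec in schema.get("properties", {}).items()
        if not name.startswith(SDC_PREFIX)
    }
    return cleaned


def schema_changed(existing: Dict[str, Any], incoming: Dict[str, Any]) -> bool:
    """Compare two schemas while ignoring `_sdc_*` columns on both sides."""
    return strip_sdc_properties(existing) != strip_sdc_properties(incoming)


# Incoming schema as it might arrive in a SCHEMA message (not post-processed).
incoming = {"properties": {"id": {"type": "integer"}}}

# Existing sink schema, assumed to already carry an added metadata column.
existing = {
    "properties": {
        "id": {"type": "integer"},
        "_sdc_extracted_at": {"type": "string", "format": "date-time"},
    }
}

# A naive comparison sees a difference; the stripped comparison does not.
naive_changed = existing != incoming
stripped_changed = schema_changed(existing, incoming)
```

With a check like this, a sink would only need to be drained and reinitialized when a non-metadata property actually changes.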