[ADAP-506] [Regression] post release 1.5.0 - models are failing with Runtime error - "Read operation timed out" #427
Comments
We are seeing the same issue when running our dbt code in Docker against Redshift Serverless. Going back to 1.4.6 fixed the issue for us. |
We had the same issue and had to roll back to a previous version. |
Thanks for alerting us about this, @Kamalakar4Pelago, @mp24-git, and @fabientra. Are all of you using Redshift Serverless? Are any of you using cross-database references or RA3 instances? I'm trying to home in on what you all have in common so that we can try reproducing the issue on our end. |
We are using Redshift Serverless without cross-database references. The issue happens mostly with large tables that we read from an external schema (which are ultimately S3 parquet files). Big tables fail after ~31 seconds, while smaller tables read from the same schema are fine. |
We are getting the same issue, but we are using a plain RA3 cluster. We have some cross-database references, but the failure happens on any table. The workaround was to bump connect_timeout to something obscenely high, which gives the tables time to run. |
Background context:
I'm wondering if the connection timeout is somehow coming into play? If so, that might explain the ~30 second threshold. Within your |
Yes, we found that connect_timeout is directly linked to the time it takes to fail. We had it at 20 seconds, then tested with 60, then 600, and finally 6000. |
We are on a Redshift dc2.large cluster. |
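A minimal sketch, using only the Python standard library, of how a single timeout value can end up bounding reads as well as the initial connection. It is a general illustration, not the adapter's code: whether the Redshift driver wires connect_timeout this way is an assumption that this thread does not confirm, and the names slow_server and read_with_timeout are hypothetical helpers made up for the demo.

```python
import socket
import threading
import time


def slow_server(listener: socket.socket, delay: float) -> None:
    """Accept one client, wait `delay` seconds, then send a reply."""
    conn, _ = listener.accept()
    with conn:
        time.sleep(delay)  # stands in for a long-running query
        try:
            conn.sendall(b"rows")
        except OSError:
            pass  # the client may already have given up and closed


def read_with_timeout(timeout: float, server_delay: float) -> str:
    """Connect with one socket timeout and report what the read does."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=slow_server, args=(listener, server_delay))
    worker.start()
    try:
        # The timeout passed here stays on the socket after connecting,
        # so it also bounds every later recv().
        with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
            try:
                return client.recv(16).decode()
            except socket.timeout:
                return "read timed out"
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    # Reply arrives after the timeout: the read fails even though the
    # connection itself was established immediately.
    print(read_with_timeout(timeout=0.2, server_delay=1.0))
    # Reply arrives within the timeout: the read succeeds.
    print(read_with_timeout(timeout=2.0, server_delay=0.1))
```

If the driver behaves like this, it would match what is reported above: the failure time tracks the configured timeout, and raising the value gives longer queries room to finish.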
We are using a plain ra3. |
Potentially related? Edit: I'm guessing not related. |
The issue still exists with release 1.5.1. We are using a small cluster of ra3.xlplus nodes (not serverless). Very short queries that read only a few hundred rows might work. Anything larger times out. |
Experiencing the same issue. |
Thanks for confirming @stisti and @elongl. The Redshift team is looking at this to determine a path forward.
Workaround
In the meantime, the workaround is to set an absurdly high timeout in your profile: connect_timeout: 999999 |
confirming that we're facing the same issue. Upgraded from 1.4 to 1.5 for |
I faced the same issue after upgrading to version 1.5.0. I updated connect_timeout to 999999 in profiles.yml as suggested by @dbeatty10, and the issue was fixed. |
We are also facing this issue after updating our ECS instances of DBT to 1.5.0. We are using Redshift Server with dc2.large nodes. |
Having the same issue here running our jobs deployed via dbt Cloud. We had to roll back the upgrade to dbt 1.5 in dbt Cloud. |
Same issue here when trying to upgrade to v1.5 in dbt Cloud. I see that it seems to have been fixed with dbt-redshift 1.5.2. @dbeatty10 How are the adapter versions updated in dbt Cloud? Is there a way to select the version of the Redshift adapter in dbt Cloud? |
@chodera sorry to hear that you're still seeing this issue. You're right that you shouldn't be seeing it any longer.
You're also highlighting a great UX improvement opportunity for dbt Cloud. Coincidentally, I opened an internal feature request ticket yesterday to have the adapter version displayed; everyone agrees it's a no-brainer. As for selecting which adapter version you're using, I think a core value proposition of dbt Cloud is that you shouldn't have to worry about which patch version you're using; it should just work. For the time being, can you please reach out to dbt Cloud support? If you are able to determine that you are indeed using |
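For environments where you control the Python installation (Docker, ECS, a local virtualenv) rather than dbt Cloud, a small sketch for checking whether the installed adapter is one of the versions reported as failing in this thread. The affected set (1.5.0 and 1.5.1) and the assumption that 1.5.2 is fixed come only from the comments above; the helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Optional

# Versions reported as failing in this thread; 1.5.2 is reported as fixed.
AFFECTED = {(1, 5, 0), (1, 5, 1)}


def release_tuple(version: str) -> tuple:
    """Reduce a version string such as '1.5.0' or '1.5.0rc1' to (1, 5, 0)."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def is_affected(version: str) -> bool:
    """True if the version is one this thread reports as timing out."""
    return release_tuple(version) in AFFECTED


def installed_adapter_version(package: str = "dbt-redshift") -> Optional[str]:
    """Return the installed version of the package, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_adapter_version()
    if found is None:
        print("dbt-redshift is not installed in this environment")
    elif is_affected(found):
        print(f"dbt-redshift {found}: reported as affected; upgrade or raise connect_timeout")
    else:
        print(f"dbt-redshift {found}: not in the affected set from this thread")
```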
Is this a new bug in dbt-redshift?
Current Behavior
As a workaround, we deployed a hotfix that pins version 1.4.6, which is the prior working version in our case.
Expected Behavior
All models that ran under version 1.4.6 should also run under 1.5.0.
Steps To Reproduce
Relevant log output
Environment
Additional Context
No response