feat(ingest/fivetran): support filtering on destination ids #11277
Walkthrough
The pull request introduces modifications to the Fivetran ingestion components within the metadata ingestion framework. A new field,
Changes
@matthew-coudert-cko how slow is the Fivetran connector for you? I strongly suspect that there's some low-hanging fruit in terms of optimizations there (e.g. we're probably doing a bunch of N+1 queries)
Not horribly slow, but around 40/45 minutes. My main worry is actually ingesting too many data process instances as it tends to slow down our GraphQL queries when we have lots and lots of them (we've had the same issue with Airflow since we have 1000s of tasks running every hour). We solve that by just going into the DB and deleting old ones and reindexing whenever it gets slow.
@matthew-coudert-cko yup makes sense. By the way, you might be interested in this #11102
@coderabbitai summary
Add the ability for users to filter ingestion to specific destination IDs. This would let us exclude non-prod environments in our PROD DataHub deployment and speed up the ingestion (it's pretty slow given the 100s of connectors we have).
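To illustrate the intent, here is a minimal sketch of allow/deny filtering on destination IDs. It is not the code from this PR: the `AllowDeny` class, the `filter_connectors` helper, and the connector and destination IDs are all hypothetical stand-ins, and the only assumption carried over is that `destination_patterns` is matched against each connector's destination ID.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AllowDeny:
    """Hypothetical stand-in for an allow/deny regex filter (not the PR's class)."""

    allow: List[str] = field(default_factory=lambda: [".*"])
    deny: List[str] = field(default_factory=list)

    def allowed(self, value: str) -> bool:
        # A deny match always wins; otherwise at least one allow pattern must match.
        if any(re.match(pattern, value) for pattern in self.deny):
            return False
        return any(re.match(pattern, value) for pattern in self.allow)


def filter_connectors(
    connectors: List[Dict[str, str]], destination_patterns: AllowDeny
) -> List[Dict[str, str]]:
    """Keep only connectors whose destination ID passes the filter."""
    return [
        connector
        for connector in connectors
        if destination_patterns.allowed(connector["destination_id"])
    ]


# Made-up connector records, for illustration only.
connectors = [
    {"connector_id": "conn_a", "destination_id": "prod_warehouse"},
    {"connector_id": "conn_b", "destination_id": "dev_warehouse"},
    {"connector_id": "conn_c", "destination_id": "prod_lake"},
]

patterns = AllowDeny(allow=[r"prod_.*"])
kept = filter_connectors(connectors, patterns)
print([connector["connector_id"] for connector in kept])
```

Filtering before any per-connector work is done is what would save ingestion time in a setup like ours, since skipped destinations never reach the per-connector queries.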
Summary by CodeRabbit
New Features
destination_patterns, for enhanced control over Fivetran source ingestion.
Bug Fixes
Tests
destination_patterns functionality.