This repository has been archived by the owner on Dec 10, 2021. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 273
chore: add extra field for Datasource control config #856
Closed
graceguo-supercat
wants to merge
1
commit into
apache-superset:master
from
graceguo-supercat:gg-AddFieldToControlConfig
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be an attribute of
datasource
? I'd imagine the same check may become available to the dataset API endpoint, too.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. i feel
datasource
contains data that stored in database, while health check message is sanity check on-the-fly, and this message won't persist in database.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a state associated with a specific datasource. It could be considered an extended read-only prop from the API's standpoint---similar to the various transformation we already did for
order_by
andverbose_map
indatasource.data
.The check may even become an instance method for the
BaseDatasource
model so it's easier to access.Imagine if we need to access the same info for multi datasources (e.g. display the warning icon in dataset CRUD list view), having the message attached to the datasource would make rendering much easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You suggest health_check_message could be part of datasource.data but is not stored in database?
Yes that will be good because we can access it whenever we call datasource.data. But I do have a little concern: what if this health check become heavy and slow? so I think we should keep the sanity check away from core data model, and only trigger it when needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, which is why it doesn't have to be inside the datasource.data property but could be an instance method and added to the API output in the view handlers. We can also change
datasource.data
to be a method instead of property so each call can choose whether to run these checks or not.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the new example, i understand your idea better. Is persisting to db really necessary? If so, then we need to update db record:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked airbnb's airflow job: it will update
extra
periodically, which means if you want to re-use the same field, airflow jobs and sanity check may step on each other's toe.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not super necessary, but as you mentioned earlier, the health check could get expensive. In case we want to cache the results somewhere, this would be the way to do it.
When datasource is updated, we can simply add a one liner
datasource.check_health(commit=False, force=True)
to the update command. Then we updatecheck_health
to allow a force health check option that skips the version check.I think the Airflow job can be updated pretty easily. In fact, since you don't know what other future features may put things into
extra
, any updates on theextra
field should be amerge
operation instead of a simple override.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though
the health check could get expensive.
, it's not the concern right now. If most of the check is expensive, i will choose to send health check async in the explore view, instead of adding check during the explore request.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will go direction that save health_check results into datasource
extra
field. So there is no need to change datasource control config. close this PR.