-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SIP-15] Adding database level python-date-format #8474
[SIP-15] Adding database level python-date-format #8474
Conversation
52131dd
to
892164f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of grammar nits and a non-blocking question about non-inclusive start times. Thank you for adding the documentation!
@@ -1306,7 +1306,17 @@ SIP-15 | |||
|
|||
`SIP-15 <https://github.com/apache/incubator-superset/issues/6360>`_ aims to ensure that time intervals are handled in a consistent and transparent manner for both the Druid and SQLAlchemy connectors. | |||
|
|||
Prior to SIP-15 SQLAlchemy used inclusive endpoints however these may behave like exclusive depending on the time column (refer to the SIP for details) and thus the endpoint behavior could be unknown. To aid with transparency the current endpoint behavior is explicitly called out in the chart time range (post SIP-15 this will be [start, end) for all connectors and databases). One can override the defaults on a per database level via the ``extra`` | |||
Prior to SIP-15 SQLAlchemy used inclusive endpoints however these may behave like exclusive for string columns (due to lexicographical ordering) if no formatting was defined and the column formatting did not conform to an ISO 8601 date-time (refer to the SIP for details). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: it would be a convenience for the reader to add a link to the SIP issue on Github.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@willbarrett there's a link to the GitHub issue on line #1307.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whup, I missed the context. Thanks!
superset/connectors/sqla/views.py
Outdated
@@ -112,6 +112,9 @@ class TableColumnInlineView(CompactCRUDMixin, SupersetModelView): | |||
"define an expression and type for transforming the string into a " | |||
"date or timestamp. Note currently time zones are not supported. " | |||
"If time is stored in epoch format, put `epoch_s` or `epoch_ms`." | |||
"If no pattern is specified we fallback to using the optional defaults " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: fallback is a noun, fall back is the verb form.
@@ -113,7 +113,10 @@ function ColumnCollectionTable({ | |||
you will need to define an expression and type for | |||
transforming the string into a date or timestamp. Note | |||
currently time zones are not supported. If time is stored | |||
in epoch format, put \`epoch_s\` or \`epoch_ms\`.`)} | |||
in epoch format, put \`epoch_s\` or \`epoch_ms\`. If no pattern | |||
is specified we fallback to using the optional defaults on a per |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: fallback is a noun, fall back is the verb form.
@@ -169,15 +169,19 @@ def get_time_filter( | |||
col = self.get_sqla_col(label="__time") | |||
l = [] | |||
if start_dttm: | |||
l.append(col >= text(self.dttm_sql_literal(start_dttm))) | |||
l.append( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is now configuration for inclusive/exclusive start points, should we have another if
condition in this block supporting col > <start time literal statement>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@willbarrett per SIP-15 the end state is [start, end)
i.e., this should not be configurable. The reason for the interim configuration is to allow users to preview their charts using the [start, end)
time range endpoints before saving and thus persisting this information in the chart form data.
The time_range_endpoints
form-data is merely there as a mechanism to provide both the current and future states simultaneously for a period which enabling an institution to manually fix charts as necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for clarifying!
2550f04
to
5fb124e
Compare
@willbarrett thanks for the review. I addressed your |
5fb124e
to
04b55d2
Compare
Codecov Report
@@ Coverage Diff @@
## master #8474 +/- ##
==========================================
+ Coverage 66.52% 71.46% +4.93%
==========================================
Files 449 117 -332
Lines 22595 12334 -10261
Branches 2367 0 -2367
==========================================
- Hits 15032 8814 -6218
+ Misses 7425 3520 -3905
+ Partials 138 0 -138 Continue to review full report at Codecov.
|
(cherry picked from commit 554a6d8)
CATEGORY
Choose one
SUMMARY
This PR addresses the SIP-15 issue where for string column types the start endpoint may behave like
(start
rather than[start
due to lexicographical ordering if the RHS is not encoded to mimic the formatting of the LHS.For example at Airbnb we have over 20,000 tables in Superset which have a
ds
column (datestamp) which adheres to theYYYY-MM-DD
ISO 8601 format however no python-date-format is specified for these columns and thus the filter endpoints behave like,i.e.,
(start
rather than[start
. This problem could be remedied by setting the python-date-format for every table but that seems overkill and doesn't ensure newly added tables will define the formatting. This PR creates a default column/python-date-format mapping on a per engine basis where one can specify a fallback (default) format if not specified.Note #8464 switched the order of evaluation for obtaining the
datetime
SQL literal ensuring known per engine data types conversions, i.e.,DATE
,TIMESTAMP
, etc. are applied prior to any python-date-format. This helps to ensure that the mapping isn't heavy handed, i.e., applying the format in an unnecessary or problematic manner.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before
After
TEST PLAN
ADDITIONAL INFORMATION
REVIEWERS
to: @betodealmeida @etr2460 @michellethomas @mistercrunch @villebro @willbarrett