-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(explore): Postgres datatype conversion #13294
Merged
villebro
merged 36 commits into
apache:master
from
nikolagigic:postgres_type_conversion
Mar 12, 2021
Merged
Changes from all commits
Commits
Show all changes
36 commits
Select commit
Hold shift + click to select a range
70572c8
test
nikolagigic 8e330db
unnecessary import
nikolagigic 83f3996
fix lint
nikolagigic d6c4c1c
changes
nikolagigic 35fa495
fix lint
nikolagigic c00e4c3
changes
nikolagigic 488a840
changes
nikolagigic c894b90
changes
nikolagigic 2fb975b
Merge branch 'master' into postgres_type_conversion
nikolagigic 82a8c9d
changes
nikolagigic 4b8d0ec
answering comments & changes
nikolagigic ddcc14a
answering comments
nikolagigic fcb5edc
answering comments
nikolagigic 010e50e
changes
nikolagigic 32f58a8
changes
nikolagigic ebcbb53
changes
nikolagigic 1bdebbf
fix tests
nikolagigic 2f341b9
fix tests
nikolagigic 39cc3b3
fix tests
nikolagigic d9afba7
fix tests
nikolagigic 1050974
fix tests
nikolagigic 1a799df
Merge branch 'master' into postgres_type_conversion
nikolagigic b92e2ac
fix tests
nikolagigic e630335
fix tests
nikolagigic c9b5e56
fix tests
nikolagigic 276c820
fix tests
nikolagigic 8430ef3
fix tests
nikolagigic 0189072
fix tests
nikolagigic bfdc994
fix tests
nikolagigic 8e1d813
Merge branch 'master' into postgres_type_conversion
nikolagigic f089e9a
fix tests
nikolagigic 7292e37
fix tests
nikolagigic 7e0d4d1
fix tests
nikolagigic b5f6244
fix tests
nikolagigic 1e06266
fix tests
nikolagigic 4336ae5
fix tests
nikolagigic File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -41,7 +41,7 @@ | |
import sqlparse | ||
from flask import g | ||
from flask_babel import gettext as __, lazy_gettext as _ | ||
from sqlalchemy import column, DateTime, select | ||
from sqlalchemy import column, DateTime, select, types | ||
from sqlalchemy.engine.base import Engine | ||
from sqlalchemy.engine.interfaces import Compiled, Dialect | ||
from sqlalchemy.engine.reflection import Inspector | ||
|
@@ -50,13 +50,14 @@ | |
from sqlalchemy.orm import Session | ||
from sqlalchemy.sql import quoted_name, text | ||
from sqlalchemy.sql.expression import ColumnClause, ColumnElement, Select, TextAsFrom | ||
from sqlalchemy.types import TypeEngine | ||
from sqlalchemy.types import String, TypeEngine, UnicodeText | ||
|
||
from superset import app, security_manager, sql_parse | ||
from superset.errors import ErrorLevel, SupersetError, SupersetErrorType | ||
from superset.models.sql_lab import Query | ||
from superset.sql_parse import ParsedQuery, Table | ||
from superset.utils import core as utils | ||
from superset.utils.core import ColumnSpec, GenericDataType | ||
|
||
if TYPE_CHECKING: | ||
# prevent circular imports | ||
|
@@ -145,8 +146,87 @@ class BaseEngineSpec: # pylint: disable=too-many-public-methods | |
_date_trunc_functions: Dict[str, str] = {} | ||
_time_grain_expressions: Dict[Optional[str], str] = {} | ||
column_type_mappings: Tuple[ | ||
Tuple[Pattern[str], Union[TypeEngine, Callable[[Match[str]], TypeEngine]]], ..., | ||
] = () | ||
Tuple[ | ||
Pattern[str], | ||
Union[TypeEngine, Callable[[Match[str]], TypeEngine]], | ||
GenericDataType, | ||
], | ||
..., | ||
] = ( | ||
( | ||
re.compile(r"^smallint", re.IGNORECASE), | ||
types.SmallInteger(), | ||
GenericDataType.NUMERIC, | ||
), | ||
( | ||
re.compile(r"^integer", re.IGNORECASE), | ||
types.Integer(), | ||
GenericDataType.NUMERIC, | ||
), | ||
( | ||
re.compile(r"^bigint", re.IGNORECASE), | ||
types.BigInteger(), | ||
GenericDataType.NUMERIC, | ||
), | ||
( | ||
re.compile(r"^decimal", re.IGNORECASE), | ||
types.Numeric(), | ||
GenericDataType.NUMERIC, | ||
), | ||
( | ||
re.compile(r"^numeric", re.IGNORECASE), | ||
types.Numeric(), | ||
GenericDataType.NUMERIC, | ||
), | ||
(re.compile(r"^real", re.IGNORECASE), types.REAL, GenericDataType.NUMERIC,), | ||
( | ||
re.compile(r"^smallserial", re.IGNORECASE), | ||
types.SmallInteger(), | ||
GenericDataType.NUMERIC, | ||
), | ||
( | ||
re.compile(r"^serial", re.IGNORECASE), | ||
types.Integer(), | ||
GenericDataType.NUMERIC, | ||
), | ||
( | ||
re.compile(r"^bigserial", re.IGNORECASE), | ||
types.BigInteger(), | ||
GenericDataType.NUMERIC, | ||
), | ||
( | ||
re.compile(r"^string", re.IGNORECASE), | ||
types.String(), | ||
utils.GenericDataType.STRING, | ||
), | ||
( | ||
re.compile(r"^N((VAR)?CHAR|TEXT)", re.IGNORECASE), | ||
UnicodeText(), | ||
utils.GenericDataType.STRING, | ||
), | ||
( | ||
re.compile(r"^((VAR)?CHAR|TEXT|STRING)", re.IGNORECASE), | ||
String(), | ||
utils.GenericDataType.STRING, | ||
), | ||
(re.compile(r"^date", re.IGNORECASE), types.Date(), GenericDataType.TEMPORAL,), | ||
( | ||
re.compile(r"^timestamp", re.IGNORECASE), | ||
types.TIMESTAMP(), | ||
GenericDataType.TEMPORAL, | ||
), | ||
( | ||
re.compile(r"^interval", re.IGNORECASE), | ||
types.Interval(), | ||
GenericDataType.TEMPORAL, | ||
), | ||
(re.compile(r"^time", re.IGNORECASE), types.Time(), GenericDataType.TEMPORAL,), | ||
( | ||
re.compile(r"^boolean", re.IGNORECASE), | ||
types.Boolean(), | ||
GenericDataType.BOOLEAN, | ||
), | ||
) | ||
time_groupby_inline = False | ||
limit_method = LimitMethod.FORCE_LIMIT | ||
time_secondary_columns = False | ||
|
@@ -160,25 +240,6 @@ class BaseEngineSpec: # pylint: disable=too-many-public-methods | |
try_remove_schema_from_table_name = True # pylint: disable=invalid-name | ||
run_multiple_statements_as_one = False | ||
|
||
# default matching patterns to convert database specific column types to | ||
# more generic types | ||
db_column_types: Dict[utils.GenericDataType, Tuple[Pattern[str], ...]] = { | ||
utils.GenericDataType.NUMERIC: ( | ||
re.compile(r"BIT", re.IGNORECASE), | ||
re.compile( | ||
r".*(DOUBLE|FLOAT|INT|NUMBER|REAL|NUMERIC|DECIMAL|MONEY).*", | ||
re.IGNORECASE, | ||
), | ||
re.compile(r".*LONG$", re.IGNORECASE), | ||
), | ||
utils.GenericDataType.STRING: ( | ||
re.compile(r".*(CHAR|STRING|TEXT).*", re.IGNORECASE), | ||
), | ||
utils.GenericDataType.TEMPORAL: ( | ||
re.compile(r".*(DATE|TIME).*", re.IGNORECASE), | ||
), | ||
} | ||
|
||
@classmethod | ||
def get_dbapi_exception_mapping(cls) -> Dict[Type[Exception], Type[Exception]]: | ||
""" | ||
|
@@ -208,25 +269,6 @@ def get_dbapi_mapped_exception(cls, exception: Exception) -> Exception: | |
return exception | ||
return new_exception(str(exception)) | ||
|
||
@classmethod | ||
def is_db_column_type_match( | ||
cls, db_column_type: Optional[str], target_column_type: utils.GenericDataType | ||
) -> bool: | ||
""" | ||
Check if a column type satisfies a pattern in a collection of regexes found in | ||
`db_column_types`. For example, if `db_column_type == "NVARCHAR"`, | ||
it would be a match for "STRING" due to being a match for the regex ".*CHAR.*". | ||
|
||
:param db_column_type: Column type to evaluate | ||
:param target_column_type: The target type to evaluate for | ||
:return: `True` if a `db_column_type` matches any pattern corresponding to | ||
`target_column_type` | ||
""" | ||
if not db_column_type: | ||
return False | ||
patterns = cls.db_column_types[target_column_type] | ||
return any(pattern.match(db_column_type) for pattern in patterns) | ||
|
||
@classmethod | ||
def get_allow_cost_estimate(cls, extra: Dict[str, Any]) -> bool: | ||
return False | ||
|
@@ -967,24 +1009,35 @@ def make_label_compatible(cls, label: str) -> Union[str, quoted_name]: | |
return label_mutated | ||
|
||
@classmethod | ||
def get_sqla_column_type(cls, type_: Optional[str]) -> Optional[TypeEngine]: | ||
def get_sqla_column_type( | ||
cls, | ||
column_type: Optional[str], | ||
column_type_mappings: Tuple[ | ||
Tuple[ | ||
Pattern[str], | ||
Union[TypeEngine, Callable[[Match[str]], TypeEngine]], | ||
GenericDataType, | ||
], | ||
..., | ||
] = column_type_mappings, | ||
Comment on lines
+1015
to
+1022
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Instead of passing the mapping to the method, we can probably just call the |
||
) -> Union[Tuple[TypeEngine, GenericDataType], None]: | ||
""" | ||
Return a sqlalchemy native column type that corresponds to the column type | ||
defined in the data source (return None to use default type inferred by | ||
SQLAlchemy). Override `column_type_mappings` for specific needs | ||
(see MSSQL for example of NCHAR/NVARCHAR handling). | ||
|
||
:param type_: Column type returned by inspector | ||
:param column_type: Column type returned by inspector | ||
:return: SqlAlchemy column type | ||
""" | ||
if not type_: | ||
if not column_type: | ||
return None | ||
for regex, sqla_type in cls.column_type_mappings: | ||
match = regex.match(type_) | ||
for regex, sqla_type, generic_type in column_type_mappings: | ||
match = regex.match(column_type) | ||
if match: | ||
if callable(sqla_type): | ||
return sqla_type(match) | ||
return sqla_type | ||
return sqla_type(match), generic_type | ||
villebro marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return sqla_type, generic_type | ||
return None | ||
|
||
@staticmethod | ||
|
@@ -1101,3 +1154,43 @@ def is_readonly_query(cls, parsed_query: ParsedQuery) -> bool: | |
or parsed_query.is_explain() | ||
or parsed_query.is_show() | ||
) | ||
|
||
@classmethod | ||
def get_column_spec( | ||
cls, | ||
native_type: Optional[str], | ||
source: utils.ColumnTypeSource = utils.ColumnTypeSource.GET_TABLE, | ||
column_type_mappings: Tuple[ | ||
Tuple[ | ||
Pattern[str], | ||
Union[TypeEngine, Callable[[Match[str]], TypeEngine]], | ||
GenericDataType, | ||
], | ||
..., | ||
] = column_type_mappings, | ||
) -> Union[ColumnSpec, None]: | ||
""" | ||
Converts native database type to sqlalchemy column type. | ||
:param native_type: Native database typee | ||
:param source: Type coming from the database table or cursor description | ||
:return: ColumnSpec object | ||
""" | ||
column_type = None | ||
|
||
if ( | ||
cls.get_sqla_column_type( | ||
native_type, column_type_mappings=column_type_mappings | ||
) | ||
is not None | ||
): | ||
column_type, generic_type = cls.get_sqla_column_type( # type: ignore | ||
native_type, column_type_mappings=column_type_mappings | ||
) | ||
is_dttm = generic_type == GenericDataType.TEMPORAL | ||
|
||
if column_type: | ||
return ColumnSpec( | ||
sqla_type=column_type, generic_type=generic_type, is_dttm=is_dttm | ||
) | ||
|
||
return None |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure we match Mysql
INT
type, we could just change this to^INT
to match bothINT
andINTEGER
, unless there are any known incompatible types that could cause a collision.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an
INTEGER
type inmysql
dialect.