Authentication: Enable user impersonation for Superset to HiveServer2 using hive.server2.proxy.user (a.fernandez) #3652
Need to fix unit tests
@@ -184,6 +186,28 @@ def select_star(cls, my_db, table_name, schema=None, limit=100,
        sql = sqlparse.format(sql, reindent=True)
        return sql

    @classmethod
    def modify_url_for_impersonation(cls, url, impersonate_user, username):
Base class has methods for how to modify a URI and URL object for impersonation
superset/db_engine_specs.py
Outdated
            url.query["hive_server2_proxy_user"] = username

    @classmethod
    def get_uri_for_impersonation(cls, uri, impersonate_user, username):
Python Engine Spec overrides the methods for how to modify a URI and URL object for impersonation
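A minimal sketch of the override pattern these two comments describe, not the code in this PR. `FakeUrl` is a hypothetical stand-in for a SQLAlchemy URL object, the class names are illustrative, and the base-class behavior (swap in the logged-in user as the connection user) is an assumption drawn from the PR description of Presto. The `hive_server2_proxy_user` key and the `auth` check come from the diff fragments in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FakeUrl:
    """Hypothetical stand-in for a URL object with a mutable username and query."""
    backend: str
    username: Optional[str] = None
    query: Dict[str, str] = field(default_factory=dict)


class BaseSpecSketch:
    @classmethod
    def modify_url_for_impersonation(cls, url, impersonate_user, username):
        # Assumed default: connect directly as the effective user.
        if impersonate_user and username:
            url.username = username


class HiveSpecSketch(BaseSpecSketch):
    @classmethod
    def modify_url_for_impersonation(cls, url, impersonate_user, username):
        # Keep the service account as the connection user and only add the
        # proxy-user key, and only when an auth mode is present in the query.
        if impersonate_user and username and "auth" in url.query:
            url.query["hive_server2_proxy_user"] = username


presto = FakeUrl("presto", username="superset")
BaseSpecSketch.modify_url_for_impersonation(presto, True, "alice")

hive = FakeUrl("hive", username="superset", query={"auth": "LDAP"})
HiveSpecSketch.modify_url_for_impersonation(hive, True, "alice")
```

In this sketch the Presto-style URL ends up with the logged-in user as its username, while the Hive URL keeps the service account and carries the logged-in user only in the query.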
        self.sqlalchemy_uri = str(conn)  # hides the password

    def get_effective_user(self, url, user_name=None):
Moved this to its own method
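A rough sketch of what such a helper might look like, written as a free function so it runs on its own. Only the `elif hasattr(g, 'user') and g.user.username` branch is visible in the diff; the rest of the precedence order (explicit argument first, URL username as fallback, nothing when impersonation is off) is an assumption. The `g` parameter is a hypothetical stand-in for Flask's request-global `g`.

```python
from types import SimpleNamespace


def get_effective_user_sketch(impersonate_user, url_username, user_name=None, g=None):
    """Pick the username to impersonate, or None when impersonation is off."""
    if not impersonate_user:
        return None
    effective_username = url_username  # assumed fallback: the user in the URL
    if user_name:
        # An explicitly passed user name takes priority.
        effective_username = user_name
    elif g is not None and hasattr(g, "user") and getattr(g.user, "username", None):
        # Otherwise use the logged-in user, if the request context has one.
        effective_username = g.user.username
    return effective_username


# Example call with a fake request context holding a logged-in user.
fake_g = SimpleNamespace(user=SimpleNamespace(username="alice"))
chosen = get_effective_user_sketch(True, "superset", g=fake_g)
```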
        elif hasattr(g, 'user') and g.user.username:
            effective_username = g.user.username
        return effective_username

    def get_sqla_engine(self, schema=None, nullpool=False, user_name=None):
Renamed uri to url since it was of type SQLAlchemy.URL
        self.db_engine_spec.modify_url_for_impersonation(url, self.impersonate_user, effective_username)

        masked_url = self.get_password_masked_url(url)
        logging.info("Database.get_sqla_engine(). Masked URL: {0}".format(masked_url))
Important to mask the URL while logging since it may contain a password
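To illustrate the point, here is a standard-library sketch of masking a password before a URI is logged. It is not the `get_password_masked_url` method from this PR; `mask_password` and the mask string are hypothetical names chosen for the example.

```python
import logging
from urllib.parse import urlsplit, urlunsplit


def mask_password(uri, mask="XXXXXXXXXX"):
    """Return uri with any password in the userinfo part replaced by mask."""
    parts = urlsplit(uri)
    if parts.password is None:
        return uri  # nothing to hide
    # Split on the last "@" so an "@" inside the password is handled too.
    userinfo, _, hostinfo = parts.netloc.rpartition("@")
    user = userinfo.split(":", 1)[0]
    netloc = "{0}:{1}@{2}".format(user, mask, hostinfo)
    return urlunsplit(parts._replace(netloc=netloc))


masked = mask_password("hive://superset:s3cret@host:10000/default?auth=LDAP")
logging.info("Masked URL: %s", masked)
```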
@@ -168,6 +168,7 @@ def handle_error(msg):
    session.merge(query)
    session.commit()
    logging.info("Set query to 'running'")
    conn = None
In some code paths, conn was not defined
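A small sketch of why initializing the name up front matters. The try/except/finally layout below is an assumption for illustration (the diff only shows the added `conn = None`), and `FakeConn` and `run_query` are hypothetical names.

```python
class FakeConn:
    """Hypothetical connection that counts how often close() is called."""
    closed_count = 0

    def close(self):
        FakeConn.closed_count += 1


def run_query(connect, fail_before_connect=False):
    conn = None  # defined up front so the cleanup code can always read it
    try:
        if fail_before_connect:
            raise RuntimeError("failed before a connection was opened")
        conn = connect()
        return "ok"
    except RuntimeError as exc:
        return "error: {0}".format(exc)
    finally:
        # Without the initial assignment, this line would raise
        # UnboundLocalError whenever the failure happens before connect().
        if conn is not None:
            conn.close()
```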
superset/views/core.py
Outdated
            uri = db_engine.get_uri_for_impersonation(uri, impersonate_user, username)
            masked_url = database.get_password_masked_url_from_uri(uri)

            logging.info("Superset.testconn(). Masked URL: {0}".format(masked_url))
Important to mask the URL while logging since it may contain a password
@@ -282,7 +282,8 @@ def test_testconn(self):
         # validate that the endpoint works with the password-masked sqlalchemy uri
         data = json.dumps({
             'uri': database.safe_sqlalchemy_uri(),
-            'name': 'main'
+            'name': 'main',
+            'impersonate_user': False
Unit test is failing since g.user.username is missing. Will fix soon.
Initial annotations
eee12f1 to 4e9263e
Unit tests passed.
Coverage increased (+0.08%) to 70.191% when pulling 4e9263e804d4297a94edf81aaeaf1b1f39cfd04d on afernandez:afernandez_impersonate into 52a9f27 on apache:master.
Highlighted a few minor things but LGTM otherwise. Please lint according to our .pylintrc. We used to have automation with Landscape.io but switched it off when we moved the repo to Apache...
superset/db_engine_specs.py
Outdated
        :param impersonate_user: Bool indicating if impersonation is enabled
        :param username: Effective username
        """
        if impersonate_user and username is not None:
NIT: I'd go without the is not None, as None evaluates to False in a boolean context.
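A quick sketch of the difference the nit is about; both function names are hypothetical. The two checks agree for `None` and only diverge for other falsy values such as an empty string.

```python
def should_impersonate(impersonate_user, username):
    # Truthiness check: None and "" both count as "no username".
    return bool(impersonate_user and username)


def should_impersonate_strict(impersonate_user, username):
    # Explicit None check: an empty string still counts as a username.
    return bool(impersonate_user and username is not None)


same_for_none = (
    should_impersonate(True, None) == should_impersonate_strict(True, None)
)
```

Dropping `is not None` therefore also skips impersonation for an empty username, which is usually the desired behavior here.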
Fixed
superset/db_engine_specs.py
Outdated
        backend_name = url.get_backend_name()

        # Must be Hive connection, enable impersonation, and set param auth=LDAP|KERBEROS
        if backend_name == "hive" and "auth" in url.query.keys() and impersonate_user is True and username is not None:
Some very long lines here. PEP8 says 80, our pylint says 90. If you want to lint only your PR, you can run git diff master... | flake8 --diff, though that's flake8, not pylint. There's also git-lint, which can lint your diff as well.
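One way to bring the flagged condition under the line limit, as a sketch: wrap it in parentheses with one clause per line. The helper name is hypothetical, `query` stands in for `url.query`, and the plain truthiness checks follow the reviewer's other nit rather than the original `is True` / `is not None` comparisons, so behavior differs slightly for values like an empty string.

```python
def hive_impersonation_applies(backend_name, query, impersonate_user, username):
    """Same intent as the long one-line condition, split across short lines."""
    return bool(
        backend_name == "hive"
        and "auth" in query  # membership test works on the mapping directly
        and impersonate_user
        and username
    )


applies = hive_impersonation_applies("hive", {"auth": "KERBEROS"}, True, "alice")
```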
Fixed
superset/db_engine_specs.py
Outdated
        :param impersonate_user: Bool indicating if impersonation is enabled
        :param username: Effective username
        """
        if impersonate_user is True and "auth" in url.query.keys() and username is not None:
NIT: -is not None
Fixed
…veServer2 using hive.server2.proxy.user (a.fernandez)
4e9263e to f310817
@mistercrunch would you be able to re-review these changes? Does the decrease in coveralls mean I need to add unit tests?
Thanks for reviewing. I figured out how to do it without needing changes to PyHive. Will submit another PR soon.
…veServer2 using hive.server2.proxy.user (a.fernandez) (apache#3652)
Superset today has a config for impersonation when creating/editing a datasource.
When used with Presto, it creates the connection on behalf of the logged-in user.
For Hive, we instead want to connect as the superuser (superset service account) but use the hive.server2.proxy.user property in the URI to enable impersonation.
Unit Tests passed.
cc @timifasubaa @mistercrunch