Conversation
```python
@pytest.fixture
def example_datasets() -> List[Dict]:
    example_datasets = []
    example_filenames = [
        "data/dataset/postgres_example_test_dataset.yml",
        "data/dataset/mongo_example_test_dataset.yml",
        "data/dataset/snowflake_example_test_dataset.yml",
```
My plan from here out is to avoid adding any subsequent datasets to this fixture, preferring to have them loaded in by a dedicated fixture for each one.
```
@@ -129,7 +131,7 @@ def test_create_and_process_access_request(
    policy.delete(db=db)
    pr.delete(db=db)
    db.expunge_all()
```
This line was causing downstream failures in the test teardown where FKs also needed to be cleaned up (specifically the `ConnectionConfig` -> `DatasetConfig` relation). After talking to @pattisdr I discovered this line was here as a forcing function to ensure that `pr` is no longer in the database session when the execution logs are checked for their `.privacy_request_id` attribute, a step to ensure that value is not cleared when `pr` is deleted.

I've opted to manually check whether `pr` is in the session in the line below, as this performs the same function.
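A minimal SQLAlchemy sketch of the "is `pr` still in the session?" check described above, using a hypothetical stand-in model rather than the real fidesops one:

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class PrivacyRequest(Base):  # stand-in model for illustration only
    __tablename__ = "privacy_request"
    id = Column(Integer, primary_key=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

pr = PrivacyRequest(id=1)
session.add(pr)
session.commit()

session.delete(pr)
session.commit()

# Once the delete is committed, the instance is detached from the session,
# so a membership check replaces the blanket `db.expunge_all()`.
assert pr not in session
```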
```python
from fidesops.util import logger
from fidesops.util.logger import NotPii, MASKED


def test_logger_masks_pii() -> None:
```

```python
@pytest.fixture(scope="function")
def toggle_testing_envvar() -> None:
```
This is useful to explicitly check areas where we use `os.environ["TESTING"]` in production code.
tests/conftest.py
```diff
@@ -38,8 +41,7 @@ def migrate_test_db() -> None:
 def db() -> Generator:
     """Return a connection to the test DB"""
     # Create the test DB engine
-    ## This asserts that TESTING==True
-    assert os.getenv("TESTING", False)
+    assert os.getenv("TESTING") == "True"
```
cc @stevenbenjamin — I've updated these checks to look like this now, as that's the correct convention according to pylint. Environment vars can only be strings, so returning a `bool` as the default arg is dangerous. Hopefully this syntax is clearer too.
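To illustrate the pitfall (a minimal standalone example, not code from this PR):

```python
import os

# Environment variables are always strings, so the string "False" is truthy:
os.environ["TESTING"] = "False"
assert os.getenv("TESTING", False)  # passes, misleadingly

# Comparing against the literal string is unambiguous:
assert (os.getenv("TESTING") == "True") is False
```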
```python
    operator: str,
) -> str:
    """Returns field names in clauses surrounded by quotation marks as required by Snowflake syntax."""
    return f'"{field_name}" {operator} (:{field_name})'
```
Adding the `()` around `:{field_name}` is an interesting difference from how the more vanilla SQL connectors work. It could be that currently in our Postgres and MySQL connectors `IN` clauses are accepted without parentheses; if so, we should add them in, because with-parentheses is the standard SQL syntax.
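For illustration, a standalone version of the clause builder shown above (the function name and full signature are assumed here, not taken from the PR):

```python
def format_clause(field_name: str, operator: str) -> str:
    """Surround field names with double quotes, as Snowflake syntax requires,
    and wrap the bind-parameter placeholder in parentheses."""
    return f'"{field_name}" {operator} (:{field_name})'

# Renders the quoted, parenthesized placeholder form discussed in these comments:
print(format_clause("email", "IN"))  # "email" IN (:email)
```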
I don't think this is a Snowflake thing, but an idiosyncrasy of the way the dialect is realized in SQLAlchemy. The other dialects we've used seem to add the `()` around tuple `IN` values automatically.
This looks good @seanpreston; not merging yet in case there's anything lingering you want to address. The code was organized and readable, with clear notes about why certain decisions were made. I completed local access request testing outside of our regular unit tests; this behaved as expected, and it was nice to see the queries being executed in testing mode too.
```
@@ -256,6 +262,45 @@ def snowflake_connection_config(db: Session) -> Generator:
    connection_config.delete(db)


@pytest.fixture(scope="function")
def snowflake_connection_config(
```
The naming convention is confusing now: `snowflake_connection_config` is based off of `safe_snowflake_connection_config` but has write permissions, and then a separate `snowflake_read_connection_config` has read access. I was confused at first because I incorrectly assumed `safe_snowflake_connection_config` meant read-only.
Good spot, I'll tidy this up in some way. Maybe `unsafe_snowflake_connection_config`, or just `snowflake_connection_config` that always has secrets if available. The read-only config isn't being used anywhere, since that property isn't enforced in the traversal.
@seanpreston I added PR #49 to not do erasures if a node's connection config doesn't have write access.
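The guard added in PR #49 presumably amounts to something like the following (a hypothetical sketch with invented names, not the actual implementation):

```python
from enum import Enum

class AccessLevel(Enum):
    read = "read"
    write = "write"

def should_run_erasure(access_level: AccessLevel) -> bool:
    # Only nodes whose connection config grants write access perform erasures;
    # read-only nodes are skipped rather than failing mid-traversal.
    return access_level == AccessLevel.write

assert should_run_erasure(AccessLevel.write) is True
assert should_run_erasure(AccessLevel.read) is False
```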
```
@@ -0,0 +1,229 @@
dataset:
  - fides_key: snowflake_example_test_dataset
    name: Snowflake Example Test Dataset
```
As we're thinking about test fixtures for these different datasources, it would be nice to create some interdependency between them, like how the mongo example test dataset relies on a field in the postgres example test dataset. I don't think our datasets should all be completely independent datastores.
It would be nice to have the Postman collection be able to make requests against our fidesops Snowflake DB, but the way I understand it, we wipe and repopulate it from these unit tests?
> we wipe and repopulate it from these unit tests?

Not quite — we're only wiping and repopulating the data we use in erasures. Everything else is part of a copy of the dataset Atlas uses for Snowflake. There are some frustrating minor differences here: for example, in Snowflake the collection is `order` and in Postgres it's `orders`, plural.

I'll factor this into the testing design for sure! Thanks
looks ready to me @seanpreston
Purpose

This PR contains query execution for the Snowflake external datastore.

Changes

- Extends the `SQLQueryConfig` class with a `SnowflakeQueryConfig`
- Updates the `SQLQueryConfig` class to format variables that eventually end up in the query string
- `SnowflakeQueryConfig` supports `data_type` and `length` meta annotations

Left to do

- `Variant` type and erasure handling
- Add a `type` attribute into the Snowflake dataset

Ticket

Closes #73