Skip to content
This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

[#1393] Update Fidesops config with sane defaults where necessary #1395

Merged
merged 14 commits into from
Sep 29, 2022

Conversation

seanpreston
Copy link
Contributor

@seanpreston seanpreston commented Sep 27, 2022

Purpose

This PR addresses some shortcomings in the Fidesops config highlighted by the recent removal of the fidesops.toml from the Docker build. I've not tried to set defaults for any config vars within the [database], [redis], or [security] subsections, since they're either handled separately, or setting a default is an anti-pattern.

This Fideslib PR also aims to clarify which part of config. loading any error has occurred in.

Changes

  • Only populate analytics_id if analytics_opt_out is either None or False — this still preserves the behaviour of an explicit opt-out.
  • Sets port to 8080
  • Sets task_retry_ vars to 0 — this means by default nodes in the Fidesops identity graph traversal will not attempt to retry if they fail.

Checklist

  • Update CHANGELOG.md file
    • Merge in main so the most recent CHANGELOG.md file is being appended to
    • Add description within the Unreleased section in an appropriate category. Add a new category from the list at the top of the file if the needed one isn't already there.
    • Add a link to this PR at the end of the description with the PR number as the text. example: #1
  • Applicable documentation updated (guides, quickstart, postman collections, tutorial, fidesdemo, database diagram.
  • If docs updated (select one):
    • documentation complete, or draft/outline provided (tag docs-team to complete/review on this branch)
    • documentation issue created (tag docs-team to complete issue separately)
  • Good unit test/integration test coverage
  • This PR contains a DB migration. If checked, the reviewer should confirm with the author that the down_revision correctly references the previous migration before merging
  • The Run Unsafe PR Checks label has been applied, and checks have passed, if this PR touches any external services

Ticket

Fixes #1393

@seanpreston seanpreston added the run unsafe ci checks Triggers running of unsafe CI checks label Sep 27, 2022
Copy link
Contributor

@NevilleS NevilleS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - few notes from my end

task_retry_count: int
task_retry_delay: int # In seconds
task_retry_backoff: int
# By default Fidesops will not retry graph nodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? Our previous defaults for this (set in the TOML) did have nonzero retries, and I think that's helpful behaviour!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not retrying at all is a sane default given the context of:

  • this var. is also used in the CI tests (ultimately we'll overload this) and causes runtimes to increase significantly
  • the reporting of what's happening during the retries is opaque (again, we will fix this in a subsequent release)
  • it's blanket applied across all datastores, regardless of any specifics — a more sane implementation would allow for datastore level overrides

It can always be overloaded

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll trust you to roll this into future improvements so that our default retry logic is reasonable. I'd argue that we should disable it in CI if we want (we can override this ourselves!) but understand if the retries are surprising right now. We can improve that in our error reporting work

src/fidesops/ops/core/config.py Show resolved Hide resolved
FidesopsNotificationSettings
] = FidesopsNotificationSettings()

port: int = 8080 # Run the webserver on port 8080 by default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable can't be set via an ENV option right now, since there isn't a env var prefix set for the top-level config object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot!

notifications: FidesopsNotificationSettings

port: int
execution: Optional[ExecutionSettings] = ExecutionSettings()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a (required!) config variable missing from this schema: fidesops.security.drp_jwt_secret. Can you add it to the schema and ensure it's optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in the Fideslib config already, so we shouldn't specify it twice

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see! I'd like it to have a default though, can't we make that optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot, have made this update to Fideslib

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this is optional jwt_key: str = config.security.drp_jwt_secret in src/fidesops/ops/api/v1/endpoints/drp_endpoints.py has the wrong type. Interestingly it already handles the None case.

Copy link
Contributor

@NevilleS NevilleS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more suggestion: can you bump the fideslib requirement in this PR and ensure the related PR lands? So that this all comes together into the ops codebase and closing the related issue.

@@ -5,7 +5,6 @@ celery[pytest]==5.2.7
click==8.1.3
cryptography~=3.4.8
dask==2022.8.0
emails
Copy link
Contributor Author

@seanpreston seanpreston Sep 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unused, so I'm removing it here as part of this housekeeping.

return v or cls.generate_and_store_client_id()
if not v and not values.get("analytics_opt_out"):
v = cls.generate_and_store_client_id()
return v
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @eastandwestwind — this should hopefully fix the client_id generation issue once and for all

FidesopsNotificationSettings
] = FidesopsNotificationSettings()

port: int = int(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be open to other ways of doing this than casting to int(...) though it seemed practical for now.

My main gripe is that this could be confusing for users as the env vars are expected to be strings, where port must then be an integer. It's technically possible to run Fidesops with FIDESOPS__PORT="iamastring" and the error message would be along the lines of ValueError: invalid literal for int() with base 10: 'iamastring'.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here being the user gets a more informative error message?

src/fidesops/ops/core/config.py Show resolved Hide resolved
notifications: FidesopsNotificationSettings

port: int
execution: Optional[ExecutionSettings] = ExecutionSettings()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this is optional jwt_key: str = config.security.drp_jwt_secret in src/fidesops/ops/api/v1/endpoints/drp_endpoints.py has the wrong type. Interestingly it already handles the None case.

FidesopsNotificationSettings
] = FidesopsNotificationSettings()

port: int = int(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here being the user gets a more informative error message?

@sanders41
Copy link
Contributor

@seanpreston I re-ran my isolated container and everything worked well.

After these updates jwt_key: str = config.security.drp_jwt_secret in src/fidesops/ops/api/v1/endpoints/drp_endpoints.py has the wrong type now. It needs to either be updated to jwt_key: Optional[str] or the type could be removed and use type inference, either way would work. Other than that everything looks good now.

@seanpreston
Copy link
Contributor Author

After these updates jwt_key: str = config.security.drp_jwt_secret in src/fidesops/ops/api/v1/endpoints/drp_endpoints.py has the wrong type now

I updated that already, you might just need to refresh the diff. Will merge away.

@sanders41 sanders41 merged commit effa4cb into main Sep 29, 2022
@sanders41 sanders41 deleted the sp-1393-config branch September 29, 2022 13:30
task_retry_backoff: int
# By default Fidesops will not retry graph nodes
task_retry_count: int = 0
task_retry_delay: int = 0 # In seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a nit here, but # In seconds doesn't help end-users. If we want to clarify it we should update the variable name to contain the unit, i.e. task_retry_delay_seconds

Copy link
Contributor

@sanders41 sanders41 Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't think all users are going to read the source code first 😄 ?

Making this change would be a breaking change, task_retry_delay was in previous releases it just gets a default value now. So the question now is it worth it to make a breaking change or could we add something to the docs and that be enough?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
run unsafe ci checks Triggers running of unsafe CI checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fidesops requires a minimal config TOML file to start and several config variables are missing safe defaults
4 participants