-
Notifications
You must be signed in to change notification settings - Fork 16
Conversation
@@ -27,6 +27,8 @@ The types of changes are: | |||
* Parallelize CI safe checks to reduce run time [#717](https://github.com/ethyca/fidesops/pull/717) | |||
* Add dependabot to keep dependencies up to date [#718](https://github.com/ethyca/fidesops/pull/718) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have left a blank space here in anticipation of a conflict with this PR
@@ -172,6 +174,10 @@ def start_webserver() -> None: | |||
) | |||
) | |||
|
|||
if not config.execution.USE_DEDICATED_WORKER: | |||
logger.info("Starting worker...") | |||
subprocess.Popen(["fidesops", "worker"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may well be a smarter way to run this on the same container. For instance this wouldn't be any more performant than Fidesops prior to Celery if this was run outside of a high availability environment, as the same server is doing the work albeit in a slightly different structure. The main benefits of this are:
- giving users the option to run everything with as small container footprint as possible
- saving developers time and memory building + operating the worker container to test the privacy request flow locally
I'd be interested to test whether running the privacy request blocks the uvicorn server receiving requests in a deployed setting.
@@ -98,6 +101,10 @@ TASK_RETRY_DELAY=20 | |||
TASK_RETRY_BACKOFF=2 | |||
REQUIRE_MANUAL_REQUEST_APPROVAL=true | |||
MASKING_STRICT=true | |||
CELERY_BROKER_URL="redis://:testpassword@redis:6379/1" | |||
CELERY_RESULT_BACKEND="redis://:testpassword@redis:6379/1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've a separate task out to make these defaults more sane
@@ -58,6 +58,9 @@ The `fidesops.toml` file should specify the following variables: | |||
|`TASK_RETRY_BACKOFF` | `FIDESOPS__EXECUTION__TASK_RETRY_BACKOFF` | int | 2 | 1 | The backoff factor for retries, to space out repeated retries. | |||
|`REQUIRE_MANUAL_REQUEST_APPROVAL` | `FIDESOPS__EXECUTION__REQUIRE_MANUAL_REQUEST_APPROVAL` | bool | False | False | Whether privacy requests require explicit approval to execute | |||
|`MASKING_STRICT` | `FIDESOPS__EXECUTION__MASKING_STRICT` | bool | True | True | If MASKING_STRICT is True, we only use "update" requests to mask data. (For third-party integrations, you should define an `update` endpoint to use.) If MASKING_STRICT is False, you are allowing fidesops to use any defined DELETE or GDPR DELETE endpoints to remove PII. In this case, you should define `delete` or `data_protection_request` endpoints for your third-party integrations. Note that setting MASKING_STRICT to False means that data may be deleted beyond the specific data categories that you've configured in your Policy. | |||
|`CELERY_BROKER_URL` | `FIDESOPS__EXECUTION__CELERY_BROKER_URL` | str | redis://:testpassword@redis:6379/1 | N/A | The datastore to maintain ordered queues of tasks. | |||
|`CELERY_RESULT_BACKEND` | `FIDESOPS__EXECUTION__RESULT_BACKEND` | str | redis://:testpassword@redis:6379/1 | N/A | The datastore to put results from asynchronously processed tasks. | |||
|`WORKER_ENABLED` | `FIDESOPS__EXECUTION__WORKER_ENABLED` | bool | True | True | Whether Fidesops is running with a dedicated worker to process privacy requests asynchronously. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These descriptions will undoubtedly need work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be fine for right now - they can be expanded on with #755
@@ -41,6 +41,7 @@ class ExecutionSettings(FidesSettings): | |||
MASKING_STRICT: bool = True | |||
CELERY_BROKER_URL: str = "redis://:testpassword@redis:6379/1" | |||
CELERY_RESULT_BACKEND: str = "redis://:testpassword@redis:6379/1" | |||
WORKER_ENABLED: bool = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason why we default our users to True
but use false
for ourselves? - https://github.com/ethyca/fidesops/pull/770/files#diff-b9a882b5033bf73236083b237701d90db30d7405950cbba9cfee37df20bdb191R37
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're generally deploying Fidesops in production, so should be encouraged to use a robust setup. We're using Fidesops in a development context and might not need to focus on the execution of privacy requests always, and it's mainly important when execution is happening.
Doing some thinking about workflow here, a user would need to both set This is fine for now, but I'd like to be able to detect the We talked briefly about passing the env var around in |
I think they're two separate concerns. It's good for us to be able to spin up workers as and when we need to work with them locally, but it doesn't need to be part of the default setup as long as we can guarantee the same code is being run and it's only the paradigm that changes. Ideally I'd like to see if this does get in our way first and then address via the current change management avenues (retros etc). |
@seanpreston sounds good to me, let's make sure we at least explain steps needed in the docs issue- #755 |
* default fidesops to running the worker and webserver on same container * default to using a worker, add docker config for worker * update changelog * USE_DEDICATED_WORKER -> WORKER_ENABLED * add basic descriptions for celery vars to docs * remove unused import * add Make command for a server + worker
Purpose
We've recently added Celery as a project dependency, and it's currently compulsory to run a worker in order to execute privacy requests. This PR makes that optional, by providing an option to run the worker on the same node as the webserver as a subprocess.
Changes
worker
definition intodocker-compose.worker.yml
USE_DEDICATED_WORKER
config varUSE_DEDICATED_WORKER
isFalse
-> runfidesops worker
as a subprocess on webserver startupEffect
docker compose up
will bring up the webserver, and runfidesops worker
as a subprocess on that containerdocker compose -f docker-compose.yml -f docker-compose.worker.yml
will forceUSE_DEDICATED_WORKER
toTrue
and bring up the worker container separatelyChecklist
CHANGELOG.md
fileCHANGELOG.md
file is being appended toUnreleased
section in an appropriate category. Add a new category from the list at the top of the file if the needed one isn't already there.Run Unsafe PR Checks
label has been applied, and checks have passed, if this PR touches any external servicesTicket
Fixes #769