Skip to content
This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Make worker node optional #770

Merged
merged 8 commits into from
Jul 6, 2022
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,8 @@ The types of changes are:
* Parallelize CI safe checks to reduce run time [#717](https://github.com/ethyca/fidesops/pull/717)
* Add dependabot to keep dependencies up to date [#718](https://github.com/ethyca/fidesops/pull/718)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have left a blank space here in anticipation of a conflict with this PR

* Make running a worker node optional [#770](https://github.com/ethyca/fidesops/pull/770)

### Changed
* Base64 encode passwords on frontend [#749](https://github.com/ethyca/fidesops/pull/749)

Expand Down
22 changes: 22 additions & 0 deletions docker-compose.worker.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
services:
webserver:
depends_on:
- worker
environment:
- FIDESOPS__EXECUTION__WORKER_ENABLED=True

worker:
build:
context: .
dockerfile: Dockerfile.worker
command: fidesops worker
depends_on:
redis:
condition: service_started
restart: always
volumes:
- type: bind
source: ./
target: /fidesops
read_only: False
- /fidesops/src/fidesops.egg-info
18 changes: 1 addition & 17 deletions docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ services:
context: .
dockerfile: Dockerfile.app
depends_on:
- worker
- db
- redis
expose:
Expand All @@ -29,6 +28,7 @@ services:
- FIDESOPS__LOG_PII=${FIDESOPS__LOG_PII}
- FIDESOPS__HOT_RELOAD=${FIDESOPS__HOT_RELOAD}
- FIDESOPS__ROOT_USER__ANALYTICS_ID=${FIDESOPS__ROOT_USER__ANALYTICS_ID}
- FIDESOPS__EXECUTION__WORKER_ENABLED=False

db:
image: postgres:12
Expand Down Expand Up @@ -59,22 +59,6 @@ services:
ports:
- "0.0.0.0:6379:6379"

worker:
build:
context: .
dockerfile: Dockerfile.worker
command: fidesops worker
depends_on:
redis:
condition: service_started
restart: always
volumes:
- type: bind
source: ./
target: /fidesops
read_only: False
- /fidesops/src/fidesops.egg-info

docs:
build:
context: docs/fidesops/
Expand Down
7 changes: 7 additions & 0 deletions docs/fidesops/docs/guides/configuration_reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,9 @@ The `fidesops.toml` file should specify the following variables:
|`TASK_RETRY_BACKOFF` | `FIDESOPS__EXECUTION__TASK_RETRY_BACKOFF` | int | 2 | 1 | The backoff factor for retries, to space out repeated retries.
|`REQUIRE_MANUAL_REQUEST_APPROVAL` | `FIDESOPS__EXECUTION__REQUIRE_MANUAL_REQUEST_APPROVAL` | bool | False | False | Whether privacy requests require explicit approval to execute
|`MASKING_STRICT` | `FIDESOPS__EXECUTION__MASKING_STRICT` | bool | True | True | If MASKING_STRICT is True, we only use "update" requests to mask data. (For third-party integrations, you should define an `update` endpoint to use.) If MASKING_STRICT is False, you are allowing fidesops to use any defined DELETE or GDPR DELETE endpoints to remove PII. In this case, you should define `delete` or `data_protection_request` endpoints for your third-party integrations. Note that setting MASKING_STRICT to False means that data may be deleted beyond the specific data categories that you've configured in your Policy.
|`CELERY_BROKER_URL` | `FIDESOPS__EXECUTION__CELERY_BROKER_URL` | str | redis://:testpassword@redis:6379/1 | N/A | The datastore to maintain ordered queues of tasks.
|`CELERY_RESULT_BACKEND` | `FIDESOPS__EXECUTION__RESULT_BACKEND` | str | redis://:testpassword@redis:6379/1 | N/A | The datastore to put results from asynchronously processed tasks.
|`WORKER_ENABLED` | `FIDESOPS__EXECUTION__WORKER_ENABLED` | bool | True | True | Whether Fidesops is running with a dedicated worker to process privacy requests asynchronously.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These descriptions will undoubtedly need work

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fine for right now - they can be expanded on with #755

|---|---|---|---|---|---|
|`ANALYTICS_OPT_OUT` | `FIDESOPS__USER__ANALYTICS_OPT_OUT` | bool | False | False | Opt out of sending anonymous usage data to Ethyca to improve the product experience
| Admin UI Variables|---|---|---|---|---|
Expand Down Expand Up @@ -98,6 +101,10 @@ TASK_RETRY_DELAY=20
TASK_RETRY_BACKOFF=2
REQUIRE_MANUAL_REQUEST_APPROVAL=true
MASKING_STRICT=true
CELERY_BROKER_URL="redis://:testpassword@redis:6379/1"
CELERY_RESULT_BACKEND="redis://:testpassword@redis:6379/1"
Copy link
Contributor Author

@seanpreston seanpreston Jun 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've a separate task out to make these defaults more sane

WORKER_ENABLED=true


[root_user]
ANALYTICS_OPT_OUT=false
Expand Down
1 change: 1 addition & 0 deletions fidesops.toml
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ REQUIRE_MANUAL_REQUEST_APPROVAL = false
TASK_RETRY_COUNT = 0
TASK_RETRY_DELAY = 1
TASK_RETRY_BACKOFF = 1
WORKER_ENABLED = false

[root_user]
ANALYTICS_OPT_OUT = false
Expand Down
1 change: 1 addition & 0 deletions src/fidesops/core/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ class ExecutionSettings(FidesSettings):
MASKING_STRICT: bool = True
CELERY_BROKER_URL: str = "redis://:testpassword@redis:6379/1"
CELERY_RESULT_BACKEND: str = "redis://:testpassword@redis:6379/1"
WORKER_ENABLED: bool = True
Copy link
Contributor

@eastandwestwind eastandwestwind Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason why we default our users to True but use false for ourselves? - https://github.com/ethyca/fidesops/pull/770/files#diff-b9a882b5033bf73236083b237701d90db30d7405950cbba9cfee37df20bdb191R37

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're generally deploying Fidesops in production, so should be encouraged to use a robust setup. We're using Fidesops in a development context and might not need to focus on the execution of privacy requests always, and it's mainly important when execution is happening.


class Config:
env_prefix = "FIDESOPS__EXECUTION__"
Expand Down
6 changes: 6 additions & 0 deletions src/fidesops/main.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
import logging
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Optional
Expand All @@ -23,6 +24,7 @@
from fidesops.core.config import config
from fidesops.db.database import init_db
from fidesops.schemas.analytics import Event, ExtraData
from fidesops.tasks import start_worker
from fidesops.tasks.scheduled.scheduler import scheduler
from fidesops.tasks.scheduled.tasks import initiate_scheduled_request_intake
from fidesops.util.cache import get_cache
Expand Down Expand Up @@ -172,6 +174,10 @@ def start_webserver() -> None:
)
)

if not config.execution.WORKER_ENABLED:
logger.info("Starting worker...")
subprocess.Popen(["fidesops", "worker"])
Copy link
Contributor Author

@seanpreston seanpreston Jun 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may well be a smarter way to run this on the same container. For instance this wouldn't be any more performant than Fidesops prior to Celery if this was run outside of a high availability environment, as the same server is doing the work albeit in a slightly different structure. The main benefits of this are:

  • giving users the option to run everything with as small container footprint as possible
  • saving developers time and memory building + operating the worker container to test the privacy request flow locally

I'd be interested to test whether running the privacy request blocks the uvicorn server receiving requests in a deployed setting.


logger.info("Starting web server...")
uvicorn.run(
"fidesops.main:app",
Expand Down