refactor!: Delete BigQuery and CSV EDP simulator variants
SanjayVas committed Oct 28, 2024
1 parent f388774 commit 881bf95
Showing 13 changed files with 53 additions and 979 deletions.
36 changes: 10 additions & 26 deletions docs/gke/correctness-test.md
@@ -67,37 +67,21 @@ kubectl apply -k src/main/k8s/dev/kingdom
## Deploy EDP simulators

See the [simulator deployment guide](simulator-deployment.md). The test assumes
that there are valid events in the range `[2021-03-15, 2021-03-17]`. The
synthetic generator variant assumes that the event message type is
`wfa.measurement.api.v2alpha.event_templates.testing.TestEvent`, and the
BigQuery variant assumes the event message type is `halo_cmm.uk.pilot.Event`.
that there are valid events in the range `[2021-03-15, 2021-03-17]`. The test
assumes that the event message type is
`wfa.measurement.api.v2alpha.event_templates.testing.TestEvent`.

## Run the correctness test

Run the following, substituting your own values:

* Synthetic generator

```shell
bazel test //src/test/kotlin/org/wfanet/measurement/integration/k8s:SyntheticGeneratorCorrectnessTest
--test_output=streamed \
--define=kingdom_public_api_target=v2alpha.kingdom.dev.halo-cmm.org:8443 \
--define=mc_name=measurementConsumers/Rcn7fKd25C8 \
--define=mc_api_key=W9q4zad246g
```

* BigQuery

```shell
bazel test //src/test/kotlin/org/wfanet/measurement/integration/k8s:BigQueryCorrectnessTest
--test_output=streamed \
--define=kingdom_public_api_target=v2alpha.kingdom.dev.halo-cmm.org:8443 \
--define=mc_name=measurementConsumers/Rcn7fKd25C8 \
--define=mc_api_key=W9q4zad246g \
--define=google_cloud_project=halo-cmm-demo \
--define=bigquery_dataset=demo \
--define=bigquery_table=labelled_events
```
```shell
bazel test //src/test/kotlin/org/wfanet/measurement/integration/k8s:SyntheticGeneratorCorrectnessTest \
--test_output=streamed \
--define=kingdom_public_api_target=v2alpha.kingdom.dev.halo-cmm.org:8443 \
--define=mc_name=measurementConsumers/Rcn7fKd25C8 \
--define=mc_api_key=W9q4zad246g
```

The time the test takes depends on the size of the data set. With the default
synthetic generator configuration, this is about an hour. Eventually, you should
182 changes: 43 additions & 139 deletions docs/gke/simulator-deployment.md
@@ -15,60 +15,13 @@ See [Machine Setup](machine-setup.md).

## Configure event data source

There are two data sources that can be used:

1. Synthetic generator

Events are generated according to
[simulator synthetic data specifications](../../src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto),
consisting of a single `SyntheticPopulationSpec` and a
`SyntheticEventGroupSpec` for each `EventGroup`. There are default
specifications included, but you can replace these with your own after
before you apply the K8s Kustomization.

This data source supports any event message type.

2. BigQuery table

Events are read from a Google Cloud BigQuery table. See the section below on
how to populate the table.

This data source currently only supports the `halo_cmm.uk.pilot.Event`
message type.

### Populate BigQuery table

The BigQuery table schema has the following columns:

* `date`
* Type: `DATE`
* `publisher_id`
* Type: `INTEGER`
* `vid`
* Type: `INTEGER`
* `digital_video_completion_status`
* Type: `STRING`
* Values:
* `0% - 25%`
* `25% - 50%`
* `50% - 75%`
* `75% - 100%`
* `100%`
* `viewability`
* Type: `STRING`
* Values:
* `viewable_0_percent_to_50_percent`
* `viewable_50_percent_to_100_percent`
* `viewable_100_percent`

The `dev` configuration expects a table named `labelled_events` in a dataset
named `demo` in the `us-central1` region. The table can be created in the
[Google Cloud Console](https://console.cloud.google.com/bigquery), specifying a
CSV file with automatic schema detection.

The
[`uk-pilot-synthetic-data-gen` script](https://github.com/world-federation-of-advertisers/uk-pilot-synthetic-data-gen)
may be helpful in generating a CSV file with test events.
Events are generated according to
[simulator synthetic data specifications](../../src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto),
consisting of a single `SyntheticPopulationSpec` and a `SyntheticEventGroupSpec`
for each `EventGroup`. There are default specifications included, but you can
replace these with your own after before you apply the K8s Kustomization.

This data source supports any event message type.

## Provision Google Cloud Project infrastructure

@@ -83,7 +36,7 @@ Applying the Terraform configuration will create a new cluster. You can use the
gcloud container clusters get-credentials simulators
```

## Build and push container image (optional)
## Build and push container image (not recommended)

If you aren't using pre-built release images, you can build the image yourself
from source and push them to a container registry. For example, if you're using
@@ -95,99 +48,50 @@ The build target to use depends on the event data source. Assuming a project
named `halo-cmm-demo` and an image tag `build-0001`, run the following to build
and push the image:

* Synthetic generator

```shell
bazel run -c opt //src/main/docker:push_synthetic_generator_edp_simulator_runner_image \
--define container_registry=gcr.io \
--define image_repo_prefix=halo-cmm-demo --define image_tag=build-0001
```

* BigQuery

```shell
bazel run -c opt //src/main/docker:push_bigquery_edp_simulator_runner_image \
--define container_registry=gcr.io \
--define image_repo_prefix=halo-cmm-demo --define image_tag=build-0001
```
```shell
bazel run -c opt //src/main/docker:push_synthetic_generator_edp_simulator_runner_image \
--define container_registry=gcr.io \
--define image_repo_prefix=halo-cmm-demo --define image_tag=build-0001
```

## Generate K8s Kustomization

Run the following, substituting your own values:

* Synthetic generator

```shell
bazel build //src/main/k8s/dev:synthetic_generator_edp_simulators.tar \
--define=kingdom_public_api_target=v2alpha.kingdom.dev.halo-cmm.org:8443 \
--define=worker1_id=worker1
--define=worker1_public_api_target=public.worker1.dev.halo-cmm.org:8443 \
--define=worker2_id=worker2
--define=worker2_public_api_target=public.worker2.dev.halo-cmm.org:8443 \
--define=mc_name=measurementConsumers/TGWOaWehLQ8 \
--define=edp1_name=dataProviders/HRL1wWehTSM \
--define=edp1_cert_name=dataProviders/HRL1wWehTSM/certificates/HRL1wWehTSM \
--define=edp2_name=dataProviders/djQdz2ehSSE \
--define=edp2_cert_name=dataProviders/djQdz2ehSSE/certificates/djQdz2ehSSE \
--define=edp3_name=dataProviders/SQ99TmehSA8 \
--define=edp3_cert_name=dataProviders/SQ99TmehSA8/certificates/SQ99TmehSA8 \
--define=edp4_name=dataProviders/TBZkB5heuL0 \
--define=edp4_cert_name=dataProviders/TBZkB5heuL0/certificates/TBZkB5heuL0 \
--define=edp5_name=dataProviders/HOCBxZheuS8 \
--define=edp5_cert_name=dataProviders/HOCBxZheuS8/certificates/HOCBxZheuS8 \
--define=edp6_name=dataProviders/VGExFmehRhY \
--define=edp6_cert_name=dataProviders/VGExFmehRhY/certificates/VGExFmehRhY \
--define container_registry=gcr.io \
--define image_repo_prefix=halo-cmm-demo --define image_tag=build-0001
```

The resulting archive will contain `SyntheticEventGroupSpec` messages in
text format under `src/main/k8s/dev/synthetic_generator_config_files/`.
These can be replaced in order to customize the synthetic generator.

* BigQuery

```shell
bazel build //src/main/k8s/dev:bigquery_edp_simulators.tar \
--define=kingdom_public_api_target=v2alpha.kingdom.dev.halo-cmm.org:8443 \
--define=worker1_id=worker1
--define=worker1_public_api_target=public.worker1.dev.halo-cmm.org:8443 \
--define=worker2_id=worker2
--define=worker2_public_api_target=public.worker2.dev.halo-cmm.org:8443 \
--define=mc_name=measurementConsumers/TGWOaWehLQ8 \
--define=edp1_name=dataProviders/HRL1wWehTSM \
--define=edp1_cert_name=dataProviders/HRL1wWehTSM/certificates/HRL1wWehTSM \
--define=edp2_name=dataProviders/djQdz2ehSSE \
--define=edp2_cert_name=dataProviders/djQdz2ehSSE/certificates/djQdz2ehSSE \
--define=edp3_name=dataProviders/SQ99TmehSA8 \
--define=edp3_cert_name=dataProviders/SQ99TmehSA8/certificates/SQ99TmehSA8 \
--define=edp4_name=dataProviders/TBZkB5heuL0 \
--define=edp4_cert_name=dataProviders/TBZkB5heuL0/certificates/TBZkB5heuL0 \
--define=edp5_name=dataProviders/HOCBxZheuS8 \
--define=edp5_cert_name=dataProviders/HOCBxZheuS8/certificates/HOCBxZheuS8 \
--define=edp6_name=dataProviders/VGExFmehRhY \
--define=edp6_cert_name=dataProviders/VGExFmehRhY/certificates/VGExFmehRhY \
--define container_registry=gcr.io \
--define=google_cloud_project=halo-cmm-demo \
--define=bigquery_dataset=demo \
--define=bigquery_table=labelled_events \
--define image_repo_prefix=halo-cmm-demo --define image_tag=build-0001
```
```shell
bazel build //src/main/k8s/dev:synthetic_generator_edp_simulators.tar \
--define=kingdom_public_api_target=v2alpha.kingdom.dev.halo-cmm.org:8443 \
--define=worker1_id=worker1
--define=worker1_public_api_target=public.worker1.dev.halo-cmm.org:8443 \
--define=worker2_id=worker2
--define=worker2_public_api_target=public.worker2.dev.halo-cmm.org:8443 \
--define=mc_name=measurementConsumers/TGWOaWehLQ8 \
--define=edp1_name=dataProviders/HRL1wWehTSM \
--define=edp1_cert_name=dataProviders/HRL1wWehTSM/certificates/HRL1wWehTSM \
--define=edp2_name=dataProviders/djQdz2ehSSE \
--define=edp2_cert_name=dataProviders/djQdz2ehSSE/certificates/djQdz2ehSSE \
--define=edp3_name=dataProviders/SQ99TmehSA8 \
--define=edp3_cert_name=dataProviders/SQ99TmehSA8/certificates/SQ99TmehSA8 \
--define=edp4_name=dataProviders/TBZkB5heuL0 \
--define=edp4_cert_name=dataProviders/TBZkB5heuL0/certificates/TBZkB5heuL0 \
--define=edp5_name=dataProviders/HOCBxZheuS8 \
--define=edp5_cert_name=dataProviders/HOCBxZheuS8/certificates/HOCBxZheuS8 \
--define=edp6_name=dataProviders/VGExFmehRhY \
--define=edp6_cert_name=dataProviders/VGExFmehRhY/certificates/VGExFmehRhY \
--define container_registry=gcr.io \
--define image_repo_prefix=halo-cmm-demo --define image_tag=build-0001
```

The resulting archive will contain `SyntheticEventGroupSpec` messages in text
format under `src/main/k8s/dev/synthetic_generator_config_files/`. These can be
replaced in order to customize the synthetic generator.

Extract the generated archive to some directory.

## Apply K8s Kustomization

From the Kustomization directory, run

* Synthetic generator

```shell
kubectl apply -k src/main/k8s/dev/synthetic_generator_edp_simulators
```

* BigQuery

```shell
kubectl apply -k src/main/k8s/dev/bigquery_edp_simulators
```
```shell
kubectl apply -k src/main/k8s/dev/synthetic_generator_edp_simulators
```
10 changes: 0 additions & 10 deletions src/main/docker/images.bzl
@@ -91,11 +91,6 @@ COMMON_IMAGES = [
image = "//src/main/kotlin/org/wfanet/measurement/loadtest/panelmatchresourcesetup:panel_match_resource_setup_runner_image",
repository = _PREFIX + "/loadtest/panel-match-resource-setup",
),
struct(
name = "csv_edp_simulator_runner_image",
image = "//src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider:csv_edp_simulator_runner_image",
repository = _PREFIX + "/simulator/csv-edp",
),
struct(
name = "synthetic_generator_edp_simulator_runner_image",
image = "//src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider:synthetic_generator_edp_simulator_runner_image",
@@ -141,11 +136,6 @@ GKE_IMAGES = [
image = "//src/main/kotlin/org/wfanet/measurement/duchy/deploy/gcloud/job/mill/shareshuffle:gcs_honest_majority_share_shuffle_mill_job_image",
repository = _PREFIX + "/duchy/honest-majority-share-shuffle-mill",
),
struct(
name = "bigquery_edp_simulator_runner_image",
image = "//src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider:bigquery_edp_simulator_runner_image",
repository = _PREFIX + "/simulator/bigquery-edp",
),
struct(
name = "duchy_gcloud_postgres_update_schema_image",
image = "//src/main/kotlin/org/wfanet/measurement/duchy/deploy/gcloud/postgres/tools:update_schema_image",
25 changes: 0 additions & 25 deletions src/main/k8s/dev/BUILD.bazel
@@ -383,31 +383,6 @@ EDP_SIMULATOR_TAGS = {
"google_cloud_project": GCLOUD_SETTINGS.project,
}

cue_dump(
name = "bigquery_edp_simulator_gke",
srcs = ["bigquery_edp_simulator_gke.cue"],
cue_tags = dict(EDP_SIMULATOR_TAGS.items() + {
"bigquery_dataset": SIMULATOR_K8S_SETTINGS.bigquery_dataset,
"bigquery_table": SIMULATOR_K8S_SETTINGS.bigquery_table,
}.items()),
tags = ["manual"],
deps = [":edp_simulator_gke"],
)

kustomization_dir(
name = "bigquery_edp_simulators",
testonly = True,
srcs = [
"resource_requirements.yaml",
":bigquery_edp_simulator_gke",
],
generate_kustomization = True,
tags = ["manual"],
deps = [
"//src/main/k8s/testing/secretfiles:kustomization",
],
)

cue_dump(
name = "synthetic_generator_edp_simulator_gke",
srcs = ["synthetic_generator_edp_simulator_gke.cue"],
46 changes: 0 additions & 46 deletions src/main/k8s/dev/bigquery_edp_simulator_gke.cue

This file was deleted.