From 5026be5d1f86f8148846781a93a9b64f14dc9077 Mon Sep 17 00:00:00 2001
From: Alejandro Leal
Date: Tue, 16 May 2023 01:58:20 -0400
Subject: [PATCH] Update to multiple README.md

- blueprints/data-solutions/data-platform-foundations/README.md
- blueprints/factories/project-factory/README.md
- modules/net-ilb-l7/README.md
- modules/project/README.md
---
 .../data-solutions/data-platform-foundations/README.md | 6 +++---
 blueprints/factories/project-factory/README.md         | 2 +-
 modules/net-ilb-l7/README.md                           | 4 ++--
 modules/project/README.md                              | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/blueprints/data-solutions/data-platform-foundations/README.md b/blueprints/data-solutions/data-platform-foundations/README.md
index e4ff871c09..8bb9c2caaf 100644
--- a/blueprints/data-solutions/data-platform-foundations/README.md
+++ b/blueprints/data-solutions/data-platform-foundations/README.md
@@ -2,7 +2,7 @@
 
 This module implements an opinionated Data Platform Architecture that creates and setup projects and related resources that compose an end-to-end data environment.
 
-For a minimal Data Platform, plese refer to the [Minimal Data Platform](../data-platform-minimal/) blueprint.
+For a minimal Data Platform, please refer to the [Minimal Data Platform](../data-platform-minimal/) blueprint.
 
 The code is intentionally simple, as it's intended to provide a generic initial setup and then allow easy customizations to complete the implementation of the intended design.
 
@@ -41,13 +41,13 @@ This separation into projects allows adhering to the least-privilege principle b
 The script will create the following projects:
 
 - **Drop off** Used to store temporary data. Data is pushed to Cloud Storage, BigQuery, or Cloud PubSub. Resources are configured with a customizable lifecycle policy.
-- **Load** Used to load data from the drop off zone to the data warehouse. The load is made with minimal to zero transformation logic (mainly `cast`). Anonymization or tokenization of Personally Identifiable Information (PII) can be implemented here or in the transformation stage, depending on your requirements. The use of [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates) is recommended. When you need to handle workloads from different teams, if strong role separation is needed between them, we suggest to customize the scirpt and have separate `Load` projects.
+- **Load** Used to load data from the drop off zone to the data warehouse. The load is made with minimal to zero transformation logic (mainly `cast`). Anonymization or tokenization of Personally Identifiable Information (PII) can be implemented here or in the transformation stage, depending on your requirements. The use of [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates) is recommended. When you need to handle workloads from different teams and strong role separation is needed between them, we suggest customizing the script and creating separate `Load` projects.
 - **Data Warehouse** Several projects distributed across 3 separate layers, to host progressively processed and refined data:
   - **Landing - Raw data** Structured Data, stored in relevant formats: structured data stored in BigQuery, unstructured data stored on Cloud Storage with additional metadata stored in BigQuery (for example pictures stored in Cloud Storage and analysis of the images for Cloud Vision API stored in BigQuery).
   - **Curated - Cleansed, aggregated and curated data**
   - **Confidential - Curated and unencrypted layer**
 - **Orchestration** Used to host Cloud Composer, which orchestrates all tasks that move data across layers.
-- **Transformation** Used to move data between Data Warehouse layers. We strongly suggest relying on BigQuery Engine to perform the transformations. If BigQuery doesn't have the features needed to perform your transformations, you can use Cloud Dataflow with [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates). This stage can also optionally anonymize or tokenize PII. When you need to handle workloads from different teams, if strong role separation is needed between them, we suggest to customize the scirpt and have separate `Tranformation` projects.
+- **Transformation** Used to move data between Data Warehouse layers. We strongly suggest relying on BigQuery Engine to perform the transformations. If BigQuery doesn't have the features needed to perform your transformations, you can use Cloud Dataflow with [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates). This stage can also optionally anonymize or tokenize PII. When you need to handle workloads from different teams and strong role separation is needed between them, we suggest customizing the script and creating separate `Transformation` projects.
 - **Exposure** Used to host resources that share processed data with external systems. Depending on the access pattern, data can be presented via Cloud SQL, BigQuery, or Bigtable. For BigQuery data, we strongly suggest relying on [Authorized views](https://cloud.google.com/bigquery/docs/authorized-views).
 
 ### Roles
diff --git a/blueprints/factories/project-factory/README.md b/blueprints/factories/project-factory/README.md
index a86e708e30..d374dceb0a 100644
--- a/blueprints/factories/project-factory/README.md
+++ b/blueprints/factories/project-factory/README.md
@@ -83,7 +83,7 @@ module "projects" {
 
 ```yaml
 # ./data/defaults.yaml
-# The following applies as overrideable defaults for all projects
+# The following applies as overridable defaults for all projects
 # All attributes are required
 
 billing_account_id: 012345-67890A-BCDEF0
diff --git a/modules/net-ilb-l7/README.md b/modules/net-ilb-l7/README.md
index 5ae4ac9477..4250285bd0 100644
--- a/modules/net-ilb-l7/README.md
+++ b/modules/net-ilb-l7/README.md
@@ -2,7 +2,7 @@
 
 This module allows managing Internal HTTP/HTTPS Load Balancers (L7 ILBs). It's designed to expose the full configuration of the underlying resources, and to facilitate common usage patterns by providing sensible defaults, and optionally managing prerequisite resources like health checks, instance groups, etc.
 
-Due to the complexity of the underlying resources, changes to the configuration that involve recreation of resources are best applied in stages, starting by disabling the configuration in the urlmap that references the resources that neeed recreation, then doing the same for the backend service, etc.
+Due to the complexity of the underlying resources, changes to the configuration that involve recreation of resources are best applied in stages, starting by disabling the configuration in the urlmap that references the resources that need recreation, then doing the same for the backend service, etc.
 
 ## Examples
 
@@ -109,7 +109,7 @@ You can leverage externally defined health checks for backend services, or have
 
 Health check configuration is controlled via the `health_check_configs` variable, which behaves in a similar way to other LB modules in this repository.
 
-Defining different health checks fromt he default is very easy. You can for example replace the default HTTP health check with a TCP one and reference it in you backend service:
+Defining different health checks from the default is very easy. You can, for example, replace the default HTTP health check with a TCP one and reference it in your backend service:
 
 ```hcl
 module "ilb-l7" {
diff --git a/modules/project/README.md b/modules/project/README.md
index b13f887f18..2df16ce3cf 100644
--- a/modules/project/README.md
+++ b/modules/project/README.md
@@ -294,7 +294,7 @@ module "project" {
 
 Organization policies can be loaded from a directory containing YAML files where each file defines one or more constraints. The structure of the YAML files is exactly the same as the `org_policies` variable.
 
-Note that contraints defined via `org_policies` take precedence over those in `org_policies_data_path`. In other words, if you specify the same contraint in a YAML file *and* in the `org_policies` variable, the latter will take priority.
+Note that constraints defined via `org_policies` take precedence over those in `org_policies_data_path`. In other words, if you specify the same constraint in a YAML file *and* in the `org_policies` variable, the latter will take priority.
 
 The example below deploys a few organization policies split between two YAML files.