Add support for GCP IAM impersonation #448

Merged
cyrilgdn merged 1 commit into cyrilgdn:main from the mlzc/gcp-iam-impersonation branch
Sep 3, 2024

Conversation

michaellzc
Contributor

@michaellzc michaellzc commented Jun 14, 2024

Add support for GCP IAM service account impersonation

Use cases

The company has a centralized Google service account (GSA) that is used for Terraform automation. However, that GSA should not be used to access databases directly, because each database has its own IAM DB users.

This PR adds an option to impersonate the database IAM user via the centralized GSA. As long as the centralized GSA has sufficient permissions to impersonate the database IAM user, it can be used to perform database automation in Terraform.

Testing

resource "google_sql_database_instance" "self" {}
resource "google_sql_user" "admin" {}
resource "google_service_account" "db_iam_admin" {}
resource "google_sql_user" "iam_admin" {
  name     = trimsuffix(google_service_account.db_iam_admin.email, ".gserviceaccount.com")
  instance = google_sql_database_instance.self.name
  type     = "CLOUD_IAM_SERVICE_ACCOUNT"
}
resource "google_project_iam_member" "iam_admin_project_iam_members" {
  for_each = toset(["roles/cloudsql.client", "roles/cloudsql.instanceUser"])
  member   = google_service_account.db_iam_admin.member
  role     = each.key
}

provider "postgresql" {
  scheme                              = "gcppostgres"
  host                                = google_sql_database_instance.self.connection_name
  username                            = trimsuffix(google_service_account.db_iam_admin.email, ".gserviceaccount.com")
  gcp_iam_impersonate_service_account = google_service_account.db_iam_admin.email
  port                                = 5432
  superuser                           = false
  alias                               = "iamAdmin"
}

# This should work: resources are applied using the IAM DB user
resource "postgresql_*" "*" {
  provider = postgresql.iamAdmin

  // *
}
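
The test configuration above only works if the identity running Terraform is allowed to impersonate `db_iam_admin`. A minimal sketch of that grant follows. It assumes the centralized GSA's email is supplied through a hypothetical variable `terraform_gsa_email` (not part of this PR) and that `roles/iam.serviceAccountTokenCreator` is the role used to permit impersonation:

```hcl
# Hypothetical input: email of the centralized Terraform automation GSA.
variable "terraform_gsa_email" {
  type        = string
  description = "Email of the centralized GSA that runs Terraform (assumed name)."
}

# Let the centralized GSA mint short-lived tokens for the DB IAM service
# account, which is what impersonation requires.
resource "google_service_account_iam_member" "terraform_impersonates_db_iam_admin" {
  service_account_id = google_service_account.db_iam_admin.name
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = "serviceAccount:${var.terraform_gsa_email}"
}
```

Scoping the binding to the single service account, rather than granting the role at the project level, keeps the centralized GSA from impersonating unrelated accounts.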

bobheadxi added a commit to sourcegraph/managed-services-platform-cdktf that referenced this pull request Jun 17, 2024
bobheadxi added a commit to sourcegraph/managed-services-platform-cdktf that referenced this pull request Jun 17, 2024
bobheadxi added a commit to sourcegraph/sourcegraph-public-snapshot that referenced this pull request Jul 5, 2024
…ream (#63092)

Adds a new `postgreSQL.logicalReplication` configuration to allow MSP to
generate prerequisite setup for integration with Datastream:
https://cloud.google.com/datastream/docs/sources-postgresql. Integration
with Datastream allows the Data Analytics team to self-serve data
enrichment needs for the Telemetry V2 pipeline.

Enabling this feature entails downtime (Cloud SQL instance restart), so
enabling the logical replication feature at the Cloud SQL level
(`cloudsql.logical_decoding`) is gated behind
`postgreSQL.logicalReplication: {}`.
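
For illustration, gating the Cloud SQL flag could look roughly like this in plain Terraform. This is a sketch under assumptions, not the configuration MSP actually generates: the variable `logical_replication_enabled`, the instance name, region, database version, and tier are all placeholders.

```hcl
# Hypothetical opt-in switch standing in for `postgreSQL.logicalReplication: {}`.
variable "logical_replication_enabled" {
  type    = bool
  default = false
}

resource "google_sql_database_instance" "example" {
  name             = "example-instance" # placeholder
  region           = "us-central1"      # placeholder
  database_version = "POSTGRES_16"      # placeholder

  settings {
    tier = "db-custom-1-3840" # placeholder

    # Emit the flag only when opted in, since changing it restarts the instance.
    dynamic "database_flags" {
      for_each = var.logical_replication_enabled ? [1] : []
      content {
        name  = "cloudsql.logical_decoding"
        value = "on"
      }
    }
  }
}
```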

Setting up the required stuff in Postgres is a bit complicated,
requiring 3 Postgres provider instances:

1. The default admin one, authenticated with our admin user
2. New: a workload identity provider, using
cyrilgdn/terraform-provider-postgresql#448 /
sourcegraph/managed-services-platform-cdktf#11.
This is required for creating a publication on selected tables, which
requires being the owner of said tables. Because tables are created by
the application using e.g. auto-migrate, the workload identity is always
the table owner, so we need to impersonate the IAM user.
3. New: a "replication user" which is created with the replication
permission. Replication does not seem to be a propagated permission, so
we need a role/user that has replication enabled.
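
The three provider instances listed above might be wired up as follows in plain HCL. This is only a sketch: every variable name, alias, and resource name here is hypothetical, and the real change generates its configuration elsewhere. Only `gcp_iam_impersonate_service_account` and the `gcppostgres` scheme come from the referenced provider PR.

```hcl
# Hypothetical inputs.
variable "instance_connection_name" { type = string }
variable "admin_username" { type = string }
variable "admin_password" { type = string }
variable "workload_gsa_email" { type = string }
variable "replication_username" { type = string }
variable "replication_password" { type = string }

# 1. Default admin provider, authenticated with the admin user.
provider "postgresql" {
  scheme    = "gcppostgres"
  host      = var.instance_connection_name
  username  = var.admin_username
  password  = var.admin_password
  superuser = false
}

# 2. Workload identity provider: impersonates the IAM user that owns the tables.
provider "postgresql" {
  alias                               = "workload"
  scheme                              = "gcppostgres"
  host                                = var.instance_connection_name
  username                            = trimsuffix(var.workload_gsa_email, ".gserviceaccount.com")
  gcp_iam_impersonate_service_account = var.workload_gsa_email
  superuser                           = false
}

# 3. Provider for the replication user (the role itself is created below).
provider "postgresql" {
  alias     = "replication"
  scheme    = "gcppostgres"
  host      = var.instance_connection_name
  username  = var.replication_username
  password  = var.replication_password
  superuser = false
}

# Created through the default admin provider.
resource "postgresql_role" "replication" {
  name        = var.replication_username
  password    = var.replication_password
  login       = true
  replication = true
}

# Created as the table owner via the impersonating provider.
resource "postgresql_publication" "testing" {
  provider = postgresql.workload
  name     = "testing"
  database = "primary"
  tables   = ["public.users"]
}
```

Resources that need the replication attribute would then set `provider = postgresql.replication`.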

A bit more context scattered here and there in the docstrings.

Beyond the Postgres configuration we also introduce some additional
resources to enable easy Datastream configuration:

1. Datastream Private Connection, which peers to the service private
network
2. Cloud SQL Proxy VM, which only allows connections to `:5432` from the
range specified in 1, allowing a connection to the Cloud SQL instance
3. Datastream Connection Profile attached to 1
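
As a rough illustration of the resources listed above, the Google provider pieces could be sketched like this. All names, variables, and the network tag are assumptions made for the example; the proxy VM itself is omitted and represented only by its firewall rule and a hypothetical `proxy_ip`.

```hcl
# Hypothetical inputs.
variable "region" { type = string }
variable "vpc_id" { type = string }
variable "datastream_cidr" { type = string } # range reserved for Datastream peering
variable "proxy_ip" { type = string }        # internal IP of the Cloud SQL Proxy VM
variable "replication_username" { type = string }
variable "replication_password" { type = string }

# 1. Private connection peered with the service's private network.
resource "google_datastream_private_connection" "this" {
  display_name          = "example-datastream"
  location              = var.region
  private_connection_id = "example-datastream"

  vpc_peering_config {
    vpc    = var.vpc_id
    subnet = var.datastream_cidr
  }
}

# 2. Proxy VM ingress: only :5432 from the Datastream range reaches the proxy.
resource "google_compute_firewall" "datastream_to_proxy" {
  name          = "example-datastream-to-proxy"
  network       = var.vpc_id
  direction     = "INGRESS"
  source_ranges = [var.datastream_cidr]
  target_tags   = ["cloudsql-proxy"] # assumed tag on the proxy VM

  allow {
    protocol = "tcp"
    ports    = ["5432"]
  }
}

# 3. Connection profile that reaches Postgres over the private connection.
resource "google_datastream_connection_profile" "this" {
  display_name          = "example-postgres"
  location              = var.region
  connection_profile_id = "example-postgres"

  postgresql_profile {
    hostname = var.proxy_ip
    port     = 5432
    username = var.replication_username
    password = var.replication_password
    database = "primary"
  }

  private_connectivity {
    private_connection = google_datastream_private_connection.this.id
  }
}
```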

From there, the data team can click-ops or manage the Datastream Stream
and BigQuery destination on their own.

Closes CORE-165
Closes CORE-212

Sample config:

```yaml
  resources:
    postgreSQL:
      databases:
        - "primary"
      logicalReplication:
        publications:
          - name: testing
            database: primary
            tables:
              - users
```

## Test plan

sourcegraph/managed-services#1569

## Changelog

- MSP services can now configure `postgreSQL.logicalReplication` to
enable the Data Analytics team to replicate selected database tables
into BigQuery.
Chickensoupwithrice pushed a commit to sourcegraph/sourcegraph-public-snapshot that referenced this pull request Jul 10, 2024
Owner

@cyrilgdn cyrilgdn left a comment

Hi,

I'll trust you on the code as I don't really know how it works 😅

Could you simply apply the 2 comments or merge the main branch so it will retrigger the tests 🙏 ?

Review comment on postgresql/config.go (outdated, resolved)
@cyrilgdn cyrilgdn added the waiting-response Further information is requested label Aug 29, 2024
@michaellzc michaellzc force-pushed the mlzc/gcp-iam-impersonation branch 2 times, most recently from 317e036 to 82c61be Compare August 29, 2024 20:39
@michaellzc
Contributor Author

michaellzc commented Aug 29, 2024

> Hi,
>
> I'll trust you on the code as I don't really know how it works 😅
>
> Could you simply apply the 2 comments or merge the main branch so it will retrigger the tests 🙏 ?

done :)

trust me it works haha. we've been using the fork for a while now.

@github-actions github-actions bot removed the waiting-response Further information is requested label Aug 29, 2024
@michaellzc michaellzc force-pushed the mlzc/gcp-iam-impersonation branch 3 times, most recently from 9743058 to 8d8fbcc Compare August 29, 2024 20:43
@michaellzc
Contributor Author

@cyrilgdn alright, all good now. would you take another look? Thanks.

@michaellzc michaellzc requested review from aaklilu and cyrilgdn August 29, 2024 23:05
Review comment on website/docs/index.html.markdown (outdated, resolved)
@michaellzc michaellzc force-pushed the mlzc/gcp-iam-impersonation branch from 8d8fbcc to 4393a7b Compare August 30, 2024 19:19
Review comment on postgresql/config.go (outdated, resolved)
@michaellzc michaellzc force-pushed the mlzc/gcp-iam-impersonation branch from 4393a7b to 95952dc Compare September 3, 2024 00:13
@michaellzc michaellzc requested a review from cyrilgdn September 3, 2024 01:22
@michaellzc michaellzc force-pushed the mlzc/gcp-iam-impersonation branch from 95952dc to 287347e Compare September 3, 2024 01:37
Owner

@cyrilgdn cyrilgdn left a comment

Thanks a lot 👍

@cyrilgdn cyrilgdn merged commit c3f634b into cyrilgdn:main Sep 3, 2024
6 checks passed
@cyrilgdn
Owner

cyrilgdn commented Sep 8, 2024

@michaellzc This has just been released in v1.23.0
