infra/gcp: clean up main project #1974
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: spiffxp. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Force-pushed from 877bbd1 to 6140ead.
Force-pushed from 6140ead to 7e8c9a3.
`gsutil iam ch` specifically doesn't support convenience bindings (e.g. a role binding with "projectEditors:project-foo" as a member, which expands to all members with `roles/editor` on project-foo), because they go against the principle of least privilege. Annoyingly, this applies not just to new bindings, but also to attempts to remove existing bindings. Instead, we must follow a read-modify-write pattern with `gsutil iam get` and `gsutil iam set`, as sketched below. This was really only needed for ensure_removed_gcs_binding, but for parity I redid ensure_gcs_binding to follow the same pattern. If this turns out to be wildly slower than before, we should revert for the common case of adding new bindings.
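A minimal sketch of that read-modify-write pattern, assuming `jq` is available; the bucket, member, and role names below are hypothetical placeholders, not the PR's actual helpers:

```bash
#!/usr/bin/env bash
# Sketch only: remove one member from one role binding on a bucket's IAM policy.
bucket="gs://k8s-infra-tf-example"                  # hypothetical bucket
member="projectEditor:kubernetes-public"            # convenience member to drop
role="roles/storage.legacyBucketOwner"

policy="$(mktemp)"
updated="$(mktemp)"

# Read the full policy, edit it locally, then write the whole thing back.
gsutil iam get "${bucket}" > "${policy}"
jq --arg role "${role}" --arg member "${member}" '
  .bindings |= map(if .role == $role then .members -= [$member] else . end)
  | .bindings |= map(select((.members // []) | length > 0))
' "${policy}" > "${updated}"
gsutil iam set "${updated}" "${bucket}"

rm -f "${policy}" "${updated}"
```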
roles/storage.legacyBucketOwner contains storage.buckets.setIamPolicy, but not storage.objects.get. This means someone with roles/editor on the project could grant themselves access to read bucket contents that they aren't supposed to be able to read. Given that roles/editor has no *.setIamPolicy permissions for other service resources, this seems like a security gap that should be closed. Ideally we would do this in _ensure_gcs_bucket. However, removing this role means removing other (possibly needed) permissions that may be used by GCP service agent service accounts (e.g. App Engine, GCR, GCE):

- storage.buckets.get
- storage.multipartUploads.(abort|create|list|listParts)
- storage.objects.(create|delete|list)

So until we have time to research which buckets those service agents specifically need access to, we'll leave them alone and constrain this policy to "private" GCS buckets that are currently only used by humans to store terraform state containing potentially sensitive info.
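A quick, read-only way to double-check which permissions a predefined role actually carries (nothing here is project-specific; the roles are the ones named above):

```bash
# List the permissions bundled into roles/storage.legacyBucketOwner
gcloud iam roles describe roles/storage.legacyBucketOwner \
  --format='value(includedPermissions)' | tr ';' '\n'

# Confirm whether roles/editor includes any *.setIamPolicy permissions
gcloud iam roles describe roles/editor \
  --format='value(includedPermissions)' | tr ';' '\n' | grep setIamPolicy || true
```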
ensure_gcs_role_binding prepends "roles/storage." to roles given to it, just like `gsutil iam ch` effectively does when its results are viewed through `gsutil iam get`. So pass "admin" to end up binding "roles/storage.admin".
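For illustration, the mapping looks like this (a sketch of the behavior described above, not the actual function body):

```bash
# Hypothetical call: ensure_gcs_role_binding "gs://some-bucket" "group:example@kubernetes.io" "admin"
short_role="admin"
role="roles/storage.${short_role}"   # what actually gets bound: roles/storage.admin
```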
Basically full circle back to linusa's original fix to color. The "$*" syntax was hiding args from echo, e.g. `color 6 -n "foo"` came out as `echo "-n foo"`, which was messing up some of the output from ensure-main-project.sh. The original "${@}$(_nocolor)" was failing shellcheck SC2145, which cautions that "${@}$x" can lead to unspecified behavior. Since the intent here is to ensure "$(_nocolor)" is the last thing echo prints, pass it explicitly as another argument: "${@}" "$(_nocolor)".
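A minimal sketch of the shape of the fix, assuming simple tput-based helpers standing in for the script's own _color/_nocolor (which may differ):

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the script's color helpers.
function _color()   { tput setaf "$1" 2>/dev/null || true; }
function _nocolor() { tput sgr0 2>/dev/null || true; }

function color() {
  local c="$1"; shift
  _color "$c"
  # "${@}" keeps the arguments separate, so flags like -n still reach echo;
  # passing "$(_nocolor)" as its own final argument avoids shellcheck SC2145.
  echo "${@}" "$(_nocolor)"
}

color 6 -n "foo"   # echo now sees -n as a flag instead of the literal string "-n foo"
```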
Either ensure_serviceaccount_role_binding never worked, or something has changed recently. For whatever reason, when running without `--project foo`, it is impossible to use a consistent resource identifier for subcommands of `gcloud iam service-accounts`:

- add-iam-policy-binding: expects 'projects/p/serviceAccounts/email'
- get-iam-policy: rejects the above, expects 'numeric id' or 'email'

Apparently, for add-iam-policy-binding, the project hosting the service account is _not_ parsed out of its email. Instead, the project will come from:

- the fully specified name `projects/foo/serviceAccounts/email`
- `--project foo` if given
- `gcloud config get-value core/project` otherwise

ref: https://cloud.google.com/sdk/gcloud/reference/iam/service-accounts/add-iam-policy-binding#SERVICE_ACCOUNT

So, to allow use of a consistent identifier (email) while reading or updating service-account IAM policy, update _ensure_resource_role_binding to optionally accept a project argument, then parse out and pass the project for service accounts only (since they are thus far the only resource to require this). Ditto all of the above for remove-iam-policy-binding and _ensure_removed_resource_role_binding.

I ran into this because I intentionally leave `core/project` set to a personal project, to ensure I have to be explicit about making changes to shared projects. When I ran ensure-main-project.sh, it couldn't find the kubernetes-public service accounts in my personal project when trying to modify workload identity bindings on those service accounts.
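For concreteness, the mismatch and the workaround look roughly like this; the service-account email and workload-identity member below are hypothetical examples, not the script's actual values:

```bash
project="kubernetes-public"
email="k8s-infra-gcp-auditor@${project}.iam.gserviceaccount.com"   # hypothetical

# get-iam-policy wants the email (or numeric id), not the full resource name:
gcloud iam service-accounts get-iam-policy "${email}" --project "${project}"

# add-iam-policy-binding resolves the project from the full resource name,
# from --project, or from core/project -- never from the email itself:
gcloud iam service-accounts add-iam-policy-binding "${email}" \
  --project "${project}" \
  --role "roles/iam.workloadIdentityUser" \
  --member "serviceAccount:${project}.svc.id.goog[some-namespace/some-ksa]"
```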
This is an alternative to empower_ksa_to_svcacct that reduces boilerplate on the caller. These functions may belong better in lib_iam, and perhaps a better name would be empower_serviceaccount_for_gke, but this is good enough for now. We can revisit whenever we feel like replacing uses of empower_ksa_to_svcacct.
This was motivated by ensuring that serviceaccounts are bound to their needed roles directly instead of via group membership:

- refactor blocks into functions
- use an array of principal:role bindings (see the sketch after this list)
- add support for binding serviceaccounts to roles, but skipping if the serviceaccount doesn't exist (to avoid dependency cycles)
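A sketch of the principal:role array idea, assuming the role is whatever follows the last colon; the bindings and the echo placeholder are made up, and the real script calls its own ensure_* helpers instead:

```bash
# Hypothetical bindings; format is "<principal>:<role>"
bindings=(
  "group:k8s-infra-example@kubernetes.io:roles/viewer"
  "serviceAccount:example@kubernetes-public.iam.gserviceaccount.com:roles/monitoring.viewer"
)

for binding in "${bindings[@]}"; do
  principal="${binding%:*}"   # strip the trailing ":<role>"
  role="${binding##*:}"       # keep only the "<role>" after the last colon

  # Skip serviceaccounts that don't exist yet, to avoid dependency cycles.
  if [[ "${principal}" == serviceAccount:* ]]; then
    email="${principal#serviceAccount:}"
    gcloud iam service-accounts describe "${email}" >/dev/null 2>&1 || continue
  fi

  echo "would bind ${principal} to ${role}"   # placeholder for the real ensure_* call
done
```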
Force-pushed from 7e8c9a3 to d288449.
Force-pushed from d288449 to faed441.
Specifically:

- refactored by:
  - grouping blocks of code into functions
  - parameterizing those functions on PROJECT
  - adding ensure_workload_identity_serviceaccount to handle the sundry single-role service accounts intended for use by GKE clusters
- prune main_project_services down to just what we directly use (see the sketch after this list):
  - drop compute: direct dep of container
  - drop oslogin: direct dep of compute
  - swap bigquery-json for bigquery: the former doesn't exist
  - add comments explaining each of the services we do use
- fix a bunch of shellcheck nits
- add some TODO(spiffxp) lines for role or binding removal as followup
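A quick way to sanity-check the pruning above, assuming the kubernetes-public project; the commit relies on the fact that enabling a service also enables its dependencies, so listing only container should still bring compute along:

```bash
# Enabling container.googleapis.com also enables its dependencies (e.g. compute):
gcloud services enable container.googleapis.com --project kubernetes-public

# Verify what actually ended up enabled:
gcloud services list --enabled --project kubernetes-public \
  | grep -E 'container|compute|oslogin'
```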
Force-pushed from faed441 to 306fd98.
`terraform apply` and `./infra/gcp/ensure-main-project.sh` were getting in a fight over authoritative IAM bindings for the kubernetes-public project. IMO terraform shouldn't be setting authoritative bindings for a project unless it's also managing the project, whereas in this case the module in question is just for managing a GKE cluster within the project. The infra/gcp/ensure-main-project.sh script is the source of truth for the configuration of the kubernetes-public project. I opted to move the bindings over to ensure-main-project.sh instead of updating the terraform module to non-authoritatively add IAM members, since they were related to a group that is managed outside of terraform, as opposed to service accounts that were created by terraform. As part of this commit I preemptively ran `terraform state rm` for the two resources that were removed, so that terraform won't try to destroy them.
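The state surgery mentioned above looks roughly like this; the resource addresses are hypothetical placeholders, not the actual two resources removed in the PR:

```bash
# Drop the resources from terraform state without destroying the underlying bindings,
# so the next `terraform apply` won't try to delete them.
terraform state rm 'module.example_gke_cluster.google_project_iam_binding.example'
terraform state rm 'module.example_gke_cluster.google_project_iam_member.example'

# Sanity check: they should no longer appear in state.
terraform state list | grep -i iam || true
```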
The most recent commit should address #2000
If you want to see the outcomes of these changes, please take a look at my comments on the latest audit PR #2001 (review)
/cc @ameukam
/lgtm (since "I've run these scripts a few times before while working on this PR" :) )
# permissions because:
# * The full list is large and has stuff that is inherited listed in it
# * All of our other IAM binding logic calls are additive
function ensure_terraform_state_buckets() {
Non-blocking comment: IMHO we should empower the owners of each bucket with `roles/viewer` so they can easily run `terraform plan`.
Thank you once again for the good catch. Apparently `roles/viewer` is necessary to get storage.buckets.list. I didn't catch this when testing with my less-privileged account, because it has audit.viewer at the org level and thus `roles/viewer` on everything implicitly.

The only other off-the-shelf roles that have this are:

- roles/storage.admin
- roles/viewer
- roles/editor
- roles/owner

I'm tempted to suggest storage.admin, let me look real quick.
So it looks like we can't set any of the primitive roles just on the bucket:

$ gsutil iam ch "user:[email protected]:roles/viewer" gs://k8s-infra-clusters-terraform
BadRequestException: 400 Role (roles/viewer) does not exist in the resource's hierarchy.

Which leaves us with `roles/viewer` on the project, or `roles/storage.admin` on the bucket, as the quick fix here. I'll add to the project for now.
Opened #2005 to address this.
Upgrade terraform provider google to 3.1.0 from 2.14. Upgrade terraform version to 0.13.6. We accidentally converted the terraform to a 0.13.x version when kubernetes#1974 was deployed. Signed-off-by: Arnaud Meukam <[email protected]>
Related to #516
Briefly:

- remove `projectEditor:kubernetes-public` bindings from `k8s-infra-tf-*` buckets
- fix `ensure_removed_gcs_role_binding` to use `gsutil iam set` instead of `gsutil iam ch`
- add `empower_gke_for_serviceaccount` functions to lib
- ensure the `k8s-infra-gcp-auditor` serviceaccount is bound directly to the roles it needs
- `container` uses `compute`, which uses `oslogin`; drop the latter two since we only need to specify the former
- `bigquery-json` is not a service, use `bigquery` instead
- fix `ensure_serviceaccount_role_binding`, which was either never working or hitting a recent API change

See individual commits for details
Fixes #2000