-
Notifications
You must be signed in to change notification settings - Fork 827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure gcs bucket for k8s-infra prow instance #1909
Conversation
/retitle Ensure gcs bucket for k8s-infra prow instance |
/hold |
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ameukam, cpanato The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
infra/gcp/ensure-main-project.sh
Outdated
@@ -57,6 +57,11 @@ CLUSTER_USERS_GROUP="[email protected]" | |||
# The DNS admins group. | |||
DNS_GROUP="[email protected]" | |||
|
|||
# Buckets for the logs of prow | |||
PROW_BUCKETS=( | |||
k8s-prow-infra-logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this is supposed to be the community equivalent of kubernetes-jenkins
?
k8s-prow-infra-logs | |
k8s-infra-prow-results |
Naming is hard when I overthink things.
- First thought:
k8s-infra-prow-logs
to get the consistentk8s-infra
prefix - But...
- "prow logs" sounds like logs of the prow components (plank, sinker, crier, etc)
- there are more than "logs" that end up
kubernetes-jenkins
, there are also "artifacts" like junit.xml files, coverage files, metadata, etc.
- OK, so
k8s-infra-prow-artifacts
for "artifacts" produced by prow, including job results - But...
- "artifacts" clashes with the sig-release meaning of "binaries and container images that represent a given release of a given subproject"
- makes me thing it's going to be the binaries/images for prow itself, a.k.a. the equivalent to
gcr.io/k8s-prow
- OK, so how about just
k8s-infra-prow
and be done with it - But...
- If we're doing things over, one of the problems with
kubernetes-jenkins
is that it allows per-object ACLs. They allowed that single bucket to be used for multiple purposes, including occasionally storing secrets or credentials. - We don't want per-object ACLs in kubernetes.io, as they are painful and laborious to maintain, audit and enforce. We are planning on setting an org policy to explicitly require per-bucket IAM only (aka uniform bucket level access).
- Thus we should name this bucket for the main purpose everyone expects of
kubernetes-jenkins
, public read access to things about and created by prow jobs for the kubernetes project. If other purposes arise, they should result in other buckets named/IAM'ed as such.
- If we're doing things over, one of the problems with
- OK, so how about
k8s-infra-prow-results
, as in the "results" of prow job runs- I looked around in prow's codebase and docs and couldn't find a consistent term for "things that are uploaded to GCS or S3 by the pod-utilities
sidecar
component andcrier
". Results isn't just "test results" but from the concept of running a prow job and then looking at the "resulting" files in GCS/S3. - I considered "data" but that sounds like "training data" or other metadata that is loaded in to help run a job, where I believe this bucket's intended purpose is storage of things about and created by prow jobs
- I looked around in prow's codebase and docs and couldn't find a consistent term for "things that are uploaded to GCS or S3 by the pod-utilities
WDYT about k8s-infra-prow-results
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"artifacts" clashes with the sig-release meaning of "binaries and container images that represent a given release of a given subproject"
The buckets are prefixed (k8s-infra-prow
) with a reference. I don't think we should have confusion with other projects.
Let's go with k8s-infra-prow-results
, it's more aligned with the needs of this instance.
infra/gcp/ensure-main-project.sh
Outdated
@@ -79,6 +84,40 @@ ensure_private_gcs_bucket \ | |||
"${PROJECT}" \ | |||
"gs://${CLUSTER_TERRAFORM_BUCKET}" | |||
|
|||
|
|||
color 6 "Ensuring all the prow buckets exist" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/hold
The rest of this I would like to wait to merge until I land a refactor of this file (#1974)
I am happy to take up redoing this part to fit in with the refactor as followup, or leaving it to you to ensure someone other than me understands the structure I tried to put in place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ameukam: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
8723612
to
0b82197
Compare
/skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure provisioning new service accounts is required here. For the existing prow build clusters their service accounts are setup via terraform
local project="${1}" | ||
|
||
for bucket in "${PROW_BUCKETS[@]}"; do | ||
local svc_acct_name="${bucket}-sa" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are going to want to reuse the existing service account(s) that prow jobs run as instead of creating a new service account
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this would be happening via workload identity but I think we're still living in a world where pod-utils are ultimately what write to GCS, and they still get their credentials from a kubernetes secret named "service-account", according prow's config.yaml
: https://github.com/kubernetes/test-infra/blob/c1d43437727eab8c84c5ac989b278feefd556a7d/config/prow/config.yaml#L23
The way I did this for k8s-infra-prow-build-trusted/prow-build-trusted was to setup a service account named k8s-infra-prow-build-trusted
bindable via workload identity using terraform. Then manually create a key and store in a kubernetes secret named service-account
in the test-pods
namespace, to match the name in prow's config.yaml
.
Looking at it now, I think I should have set it up so the WI binding was [test-pods/default]
instead of [test-pods/k8s-infra-prow-build-trusted]
, because I doubt anything is actually using it. And we can also use external secrets instead of manually creating the secret
So that's build clusters.
Then there's the prow control plane. For example, I know that at least crier
will write to the same GCS bucket.
Currently, prow.k8s.io uses a control-plane
GCP service account, and then kubernetes service accounts that can bind to it for each component, e.g. https://github.com/kubernetes/test-infra/blob/c1d43437727eab8c84c5ac989b278feefd556a7d/config/prow/cluster/crier_rbac.yaml#L16-L22
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So tl;dr I would create a k8s-infra-prow@kubernetes-public
service account for the prow control plane, since it may require more than just write access to this one bucket.
Since you're doing this for the cluster in kubernetes-public, you could do all this in https://github.com/kubernetes/k8s.io/blob/main/infra/gcp/clusters/projects/kubernetes-public/aaa
But no objections to iterating in bash if you need to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spiffxp I'll add a second commit with a HCL version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
local project="${1}" | ||
|
||
for bucket in "${PROW_BUCKETS[@]}"; do | ||
local svc_acct_name="${bucket}-sa" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this would be happening via workload identity but I think we're still living in a world where pod-utils are ultimately what write to GCS, and they still get their credentials from a kubernetes secret named "service-account", according prow's config.yaml
: https://github.com/kubernetes/test-infra/blob/c1d43437727eab8c84c5ac989b278feefd556a7d/config/prow/config.yaml#L23
The way I did this for k8s-infra-prow-build-trusted/prow-build-trusted was to setup a service account named k8s-infra-prow-build-trusted
bindable via workload identity using terraform. Then manually create a key and store in a kubernetes secret named service-account
in the test-pods
namespace, to match the name in prow's config.yaml
.
Looking at it now, I think I should have set it up so the WI binding was [test-pods/default]
instead of [test-pods/k8s-infra-prow-build-trusted]
, because I doubt anything is actually using it. And we can also use external secrets instead of manually creating the secret
So that's build clusters.
Then there's the prow control plane. For example, I know that at least crier
will write to the same GCS bucket.
Currently, prow.k8s.io uses a control-plane
GCP service account, and then kubernetes service accounts that can bind to it for each component, e.g. https://github.com/kubernetes/test-infra/blob/c1d43437727eab8c84c5ac989b278feefd556a7d/config/prow/cluster/crier_rbac.yaml#L16-L22
local project="${1}" | ||
|
||
for bucket in "${PROW_BUCKETS[@]}"; do | ||
local svc_acct_name="${bucket}-sa" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So tl;dr I would create a k8s-infra-prow@kubernetes-public
service account for the prow control plane, since it may require more than just write access to this one bucket.
Since you're doing this for the cluster in kubernetes-public, you could do all this in https://github.com/kubernetes/k8s.io/blob/main/infra/gcp/clusters/projects/kubernetes-public/aaa
But no objections to iterating in bash if you need to.
color 6 "Creating service account: ${svc_acct_name}" | ||
ensure_service_account \ | ||
"${project}" \ | ||
"${svc_acct_name}" \ | ||
"${svc_acct_name}" | ||
|
||
color 6 "Empowering service account: ${svc_acct_name}" | ||
empower_svcacct_to_write_gcs_bucket "${svc_acct_email}" "gs://${bucket}" | ||
|
||
color 6 "Ensure secret ${SECRET_ID} exists in project ${PROJECT}" | ||
ensure_secret "${project}" "${SECRET_ID}" | ||
|
||
color "Ensure ${SECRET_ID} contains secret key for ${svc_acct_name}" | ||
ensure_serviceaccount_key_secret "${project}" "${SECRET_ID}" "${svc_acct_email}" | ||
|
||
color 6 "Empowering [email protected] to read secret ${SECRET_ID}" | ||
ensure_secrets_role_binding \ | ||
"projects/${project}/secrets/${SECRET_ID}" \ | ||
"group:[email protected]" \ | ||
"roles/secretmanager.secretAccessor" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically
k8s.io/infra/gcp/ensure-conformance-storage.sh
Lines 104 to 131 in 4d98e1c
# ensure_conformance_serviceaccount ensures that: | |
# - a serviceaccount of the given name exists in PROJECT | |
# - it can write to the given bucket | |
# - it has a private key stored in a secret in PROJECT accessible to the given group | |
function ensure_conformance_serviceaccount() { | |
local name="${1}" | |
local bucket="${2}" | |
local secret_accessors="${3}" | |
local email="$(svc_acct_email "${PROJECT}" "${name}")" | |
local secret="${name}-key" | |
local private_key_file="${TMPDIR}/key.json" | |
color 6 "Ensuring service account exists: ${email}" | |
ensure_service_account "${PROJECT}" "${name}" "Grants write access to ${bucket}" | |
color 6 "Ensuring ${PROJECT} contains secret ${secret} with private key for ${email}" | |
ensure_serviceaccount_key_secret "${PROJECT}" "${secret}" "${email}" | |
color 6 "Empowering ${secret_accessors} to access secret: ${secret}" | |
ensure_secret_role_binding \ | |
"projects/${PROJECT}/secrets/${secret}" \ | |
"group:${secret_accessors}" \ | |
"roles/secretmanager.secretAccessor" | |
color 6 "Empowering ${email} to write to ${bucket}" | |
empower_svcacct_to_write_gcs_bucket "${email}" "${bucket}" | |
} |
You could parameterize it on project and pull into lib_iam.sh for re-use. Can be done as followup though
a35359e
to
5eb49b0
Compare
Following prow [documentation](https://github.com/kubernetes/test-infra/blob/master/prow/getting_started_deploy.md#configure-a-gcs-buckethttps://github.com/kubernetes/test-infra/blob/master/prow/getting_started_deploy.md#configure-a-gcs-bucket) guidance : Create a GCS bucket for tide history and build logs. Create a service account and grant admin access to the bucket. Create a service account key and add the generated key to Secret Manager. Signed-off-by: Arnaud Meukam <[email protected]>
5eb49b0
to
951a4e3
Compare
<<<<<<< HEAD | ||
if [ ! $# -eq 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then | ||
echo "${FUNCNAME[0]}(project, secret, serviceaccountt) requires 3 arguments" >&2 | ||
======= | ||
if [ ! $# -eq 3 -o -z "$1" -o -z "$2" -o -z "$3" ]; then | ||
echo "ensure_serviceaccount_key_secret(project, secret, serviceaccount) requires 3 arguments" >&2 | ||
>>>>>>> 0b821977 (Add gcs public bucket for k8s-infra-prow logs.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge conflict, you want to keep HEAD
/close |
@spiffxp: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Following prow documentation guidance :
Create a GCS bucket for tide history and build logs.
Create a service account and grant admin access to the bucket.
Create a service account key and add the generated key to Secret
Manager.
Signed-off-by: Arnaud Meukam [email protected]