Import automation: Setup script updates #966

Merged: 8 commits, Jan 22, 2024
Changes from 7 commits
5 changes: 4 additions & 1 deletion import-automation/executor/Dockerfile
@@ -16,7 +16,10 @@

FROM python:3.11.4

- RUN apt upgrade
+ RUN apt-get update \
+     && apt-get -y upgrade \
+     && apt-get -y autoremove \
+     && rm -rf /var/lib/apt/lists/*
WORKDIR /workspace

ADD requirements.txt /workspace/requirements.txt
2 changes: 1 addition & 1 deletion import-automation/executor/app/configs.py
@@ -34,7 +34,7 @@ class ExecutorConfig:

# ID of the Google Cloud project that hosts the executor. The project
# needs to enable App Engine and Cloud Scheduler.
- gcp_project_id: str = 'google.com:datcom-data'
+ gcp_project_id: str = 'datcom-import-automation'
# ID of the Google Cloud project that stores generated CSVs and MCFs. The
# project needs to enable Cloud Storage and gives the service account the
# executor uses sufficient permissions to read and write the bucket below.
3 changes: 2 additions & 1 deletion import-automation/executor/app/executor/cloud_scheduler.py
@@ -26,7 +26,8 @@
from google.protobuf import json_format
from google.api_core.exceptions import AlreadyExists, NotFound

- GKE_SERVICE_DOMAIN = os.getenv('GKE_SERVICE_DOMAIN', 'import.datacommons.dev')
+ GKE_SERVICE_DOMAIN = os.getenv('GKE_SERVICE_DOMAIN',
+                                'importautomation.datacommons.org')
GKE_CALLER_SERVICE_ACCOUNT = os.getenv('CLOUD_SCHEDULER_CALLER_SA')
GKE_OAUTH_AUDIENCE = os.getenv('CLOUD_SCHEDULER_CALLER_OAUTH_AUDIENCE')

10 changes: 8 additions & 2 deletions import-automation/executor/gke/README.md
@@ -26,8 +26,14 @@ Follow

## (One Time) Setup GKE

- 1. Update OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET in "gke/configure_gke.sh".
- 2. Run `./gke/configure_gke.sh`.
+ 1. Set the PROJECT_ID, OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET environment variables in "gke/configure_gke.sh", e.g.
+ ```
+ export PROJECT_ID=<GCP_PROJECT>
+ export OAUTH_CLIENT_ID=<OAUTH_CLIENT_ID>
+ export OAUTH_CLIENT_SECRET=<OAUTH_CLIENT_SECRET>
+ ```
+
+ 2. Run `./gke/configure_gke.sh`. The script will error out if the environment variables in (1) are not set.
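
The pre-flight check described in step 2 can be sketched in Python as follows. This is an illustration only, assuming the same three variable names; the function `missing_vars` is hypothetical, and unlike the shell script (which exits at the first missing variable) it reports every missing name at once.

```python
import os

REQUIRED_VARS = ('PROJECT_ID', 'OAUTH_CLIENT_ID', 'OAUTH_CLIENT_SECRET')


def missing_vars(environ=os.environ, required=REQUIRED_VARS):
    """Return the required names that are unset or empty, like `[ -z "$VAR" ]`."""
    return [name for name in required if not environ.get(name)]


# Example with an explicit mapping: one variable is empty, one is absent.
example_env = {'PROJECT_ID': 'my-project', 'OAUTH_CLIENT_ID': ''}
print(missing_vars(example_env))  # ['OAUTH_CLIENT_ID', 'OAUTH_CLIENT_SECRET']
```

A caller could abort with a non-zero exit status whenever the returned list is non-empty.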

## Deployment

30 changes: 21 additions & 9 deletions import-automation/executor/gke/configure_gke.sh
@@ -14,7 +14,24 @@
# See the License for the specific language governing permissions and
# limitations under the License.

- PROJECT_ID=datcom-data
+ # Verify that the required environment variables are set.
+ if [ -z "$PROJECT_ID" ]
+ then
+   echo "\$PROJECT_ID must be set and cannot be empty."
+   exit 1
+ fi
+ if [ -z "$OAUTH_CLIENT_ID" ]
+ then
+   echo "\$OAUTH_CLIENT_ID must be set and cannot be empty."
+   exit 1
+ fi
+ if [ -z "$OAUTH_CLIENT_SECRET" ]
+ then
+   echo "\$OAUTH_CLIENT_SECRET must be set and cannot be empty."
+   exit 1
+ fi
+
+ PROJECT_ID=datcom-import-automation

gcloud config set project $PROJECT_ID

@@ -40,21 +57,16 @@ kubectl create namespace import-automation \
kubectl create serviceaccount --namespace import-automation import-automation-ksa \
--dry-run=client -o yaml | kubectl apply -f -

- gcloud iam service-accounts add-iam-policy-binding \
-   --project $PROJECT_ID \
-   --role roles/iam.workloadIdentityUser \
-   --member "serviceAccount:$PROJECT_ID.svc.id.goog[import-automation/import-automation-ksa]" \
-   [email protected]
+ gcloud projects add-iam-policy-binding $PROJECT_ID \
+   --role=roles/iam.workloadIdentityUser \
+   --member="serviceAccount:$PROJECT_ID.svc.id.goog[import-automation/import-automation-ksa]"

kubectl annotate serviceaccount \
--namespace import-automation \
--overwrite \
import-automation-ksa \
iam.gke.io/[email protected]

- # Set the oauth env vars before running the script
- # export OAUTH_CLIENT_ID=<fill>
- # export OAUTH_CLIENT_SECRET=<fill>
kubectl -n import-automation create secret generic import-automation-iap-secret \
--from-literal=client_id=$OAUTH_CLIENT_ID \
--from-literal=client_secret=$OAUTH_CLIENT_SECRET
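
The `--member` value used in the IAM binding in this script follows the Workload Identity pattern `serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]`. A small Python sketch of how that string is assembled; the function name is hypothetical, and the project, namespace, and service account names are the ones that appear in the diff.

```python
def workload_identity_member(project_id, namespace, ksa_name):
    """Format the IAM member that identifies a Kubernetes service account
    under GKE Workload Identity: serviceAccount:PROJECT.svc.id.goog[NS/KSA]."""
    return f'serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]'


member = workload_identity_member(
    'datcom-import-automation',  # value of $PROJECT_ID in the script
    'import-automation',         # Kubernetes namespace
    'import-automation-ksa',     # Kubernetes service account
)
print(member)
```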
1 change: 1 addition & 0 deletions import-automation/executor/requirements.txt
@@ -9,3 +9,4 @@ flask
gunicorn
pytz
absl-py
+ libwebp-dev>=1.3.2 # To fix a zero-day vulnerability (CVE-2023-4863): https://snyk.io/blog/find-and-fix-webp-vulnerability-cve-2023-4863/
2 changes: 1 addition & 1 deletion import-automation/executor/test/cloud_scheduler_test.py
@@ -84,7 +84,7 @@ def test_http_job_request(self):
'seconds': 60 * 30
},
'http_target': {
- 'uri': 'https://import.datacommons.dev/update',
+ 'uri': 'https://importautomation.datacommons.org/update',
'http_method': 'POST',
'headers': {
'Content-Type': 'application/json',
9 changes: 6 additions & 3 deletions import-automation/executor/test/integration_test.py
@@ -24,10 +24,13 @@
NUM_LINES_TO_CHECK = 50

CONFIGS = {
- 'github_repo_owner_username': os.environ['GITHUB_AUTH_USERNAME'],
+ # The GitHub params belong to the public Data Commons gmail account.
+ # Auth tokens, user name and other details can be found in the inbox
+ # and in the inbox of teammates.
+ 'github_repo_owner_username': os.environ['_GITHUB_REPO_OWNER_USERNAME'],
  'github_repo_name': 'data-demo',
- 'github_auth_username': 'intrepiditee',
- 'github_auth_access_token': os.environ['GITHUB_AUTH_ACCESS_TOKEN']
+ 'github_auth_username': os.environ['_GITHUB_AUTH_USERNAME'],
+ 'github_auth_access_token': os.environ['_GITHUB_AUTH_ACCESS_TOKEN']
}
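
Because `CONFIGS` indexes `os.environ` directly, importing the test module raises `KeyError` when any of the three variables is unset. One possible alternative, sketched here under the assumption that skipping is preferable to failing at import time, is to defer the lookup and skip the test class. The names `load_configs`, `MISSING_ENV`, and `ConfigsSmokeTest` are hypothetical and not part of this PR.

```python
import os
import unittest

REQUIRED_ENV = (
    '_GITHUB_REPO_OWNER_USERNAME',
    '_GITHUB_AUTH_USERNAME',
    '_GITHUB_AUTH_ACCESS_TOKEN',
)
# Names that are absent from the current environment.
MISSING_ENV = [name for name in REQUIRED_ENV if name not in os.environ]


def load_configs(environ=os.environ):
    """Build the configs dict; raises KeyError if a variable is missing."""
    return {
        'github_repo_owner_username': environ['_GITHUB_REPO_OWNER_USERNAME'],
        'github_repo_name': 'data-demo',
        'github_auth_username': environ['_GITHUB_AUTH_USERNAME'],
        'github_auth_access_token': environ['_GITHUB_AUTH_ACCESS_TOKEN'],
    }


@unittest.skipIf(MISSING_ENV, f'missing environment variables: {MISSING_ENV}')
class ConfigsSmokeTest(unittest.TestCase):
    """Hypothetical test class; it runs only when all variables are present."""

    def test_configs_are_complete(self):
        configs = load_configs()
        self.assertTrue(all(configs.values()))
```

With this arrangement, a machine without the GitHub credentials reports the tests as skipped, with the missing names in the skip reason.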

