Run some example in Kubernetes execution mode in CI #1127

Merged Aug 15, 2024 (64 commits)
Commits
998683e
Run some example in Kubernetes execution mode in CI
pankajastro Jul 22, 2024
8c1899a
Fix static check
pankajastro Jul 29, 2024
8f333f2
Update job name
pankajastro Jul 29, 2024
954beed
Update job name
pankajastro Jul 29, 2024
a800d40
Ignore kubernetes tests in other tests
pankajastro Jul 30, 2024
2f847f2
Update hash
pankajastro Jul 30, 2024
bff6f10
Ignore kubernetes dag
pankajastro Jul 30, 2024
7bb3a70
Add custom liveness param
pankajastro Jul 31, 2024
21a3f71
Add custom liveness param
pankajastro Jul 31, 2024
a8828a5
Adjust port-forwarding
pankajastro Jul 31, 2024
6f53c99
Adjust host
pankajastro Jul 31, 2024
eb681a7
Add sleep
pankajastro Jul 31, 2024
7c21be6
Fix livenessProbe
pankajastro Jul 31, 2024
e0dd8c1
Fix values.yaml
pankajastro Aug 1, 2024
6ae63ff
Update values.yaml
pankajastro Aug 1, 2024
d4be671
More experiment
pankajastro Aug 9, 2024
ae242e7
Fix
pankajastro Aug 9, 2024
2a59f34
Fix
pankajastro Aug 9, 2024
b9be16e
Fix
pankajastro Aug 9, 2024
1c31bbe
Upgrade postgres adapter
pankajastro Aug 12, 2024
385d747
Refactor startup script
pankajastro Aug 12, 2024
ce98d26
Fix workflow yml
pankajastro Aug 12, 2024
e90831b
disable custom config
pankajastro Aug 12, 2024
dfc55cb
Add sleep before debug log
pankajastro Aug 12, 2024
fe9da13
Add sleep before port forward
pankajastro Aug 12, 2024
d5d6665
Add postgres service
pankajastro Aug 12, 2024
72d3b1e
disable postgresqlExtendedConf.huge_pages
pankajastro Aug 12, 2024
a07b661
Add postgres deployment yml
pankajastro Aug 12, 2024
6839998
Fix postgres-deployment.yaml
pankajastro Aug 12, 2024
ab5a5cf
Fix postgres port
pankajastro Aug 12, 2024
d04b997
backup
pankajastro Aug 12, 2024
6751282
add port-forward
pankajastro Aug 12, 2024
9a65742
local working version
pankajastro Aug 14, 2024
7a55a17
cleanup
pankajastro Aug 14, 2024
0601a1d
cleanup
pankajastro Aug 14, 2024
d0300a4
Fix tests
pankajastro Aug 14, 2024
4483f1c
Fix pre-commit check
pankajastro Aug 14, 2024
123591b
Fix pre-commit check
pankajastro Aug 14, 2024
5e42afb
More env and string issue fix
pankajastro Aug 14, 2024
b75d29a
Disable some task and add docs
pankajastro Aug 14, 2024
a71a9f9
Update docs
pankajastro Aug 14, 2024
1238cc7
Update docs
pankajastro Aug 14, 2024
36f0044
try different project path
pankajastro Aug 15, 2024
0875d2d
Add executable path
pankajastro Aug 15, 2024
4070458
Add debug stmt
pankajastro Aug 15, 2024
c1b1403
add more debgu stmpt
pankajastro Aug 15, 2024
6984652
add more debgu stmpt
pankajastro Aug 15, 2024
9fc1259
Run tests from dev dir
pankajastro Aug 15, 2024
d16b67d
Fix pre-commit
pankajastro Aug 15, 2024
54991f8
Update hash
pankajastro Aug 15, 2024
035d7db
Fix command
pankajastro Aug 15, 2024
19773af
Ignore tests
pankajastro Aug 15, 2024
338da1f
Ignore test
pankajastro Aug 15, 2024
0a032c9
try virtual env
pankajastro Aug 15, 2024
6401857
Update airflow version
pankajastro Aug 15, 2024
5927f3f
install tests
pankajastro Aug 15, 2024
96f1510
install apache-airflow-providers-cncf-kubernetes
pankajastro Aug 15, 2024
048acc7
Install dbt
pankajastro Aug 15, 2024
151e9dc
Remove ignore
pankajastro Aug 15, 2024
b5d43aa
Extend ignore
pankajastro Aug 15, 2024
7defc5d
Ignore More
pankajastro Aug 15, 2024
82b8345
Ignore more :(
pankajastro Aug 15, 2024
0ad6dcf
Update .github/workflows/test.yml
pankajastro Aug 15, 2024
05cedca
Update .github/workflows/test.yml
pankajastro Aug 15, 2024
73 changes: 70 additions & 3 deletions .github/workflows/test.yml
@@ -163,7 +163,6 @@ jobs:
POSTGRES_DB: postgres
POSTGRES_SCHEMA: public
POSTGRES_PORT: 5432
SOURCE_RENDERING_BEHAVIOR: all

- name: Upload coverage to Github
uses: actions/upload-artifact@v2
@@ -235,7 +234,6 @@ jobs:
POSTGRES_DB: postgres
POSTGRES_SCHEMA: public
POSTGRES_PORT: 5432
SOURCE_RENDERING_BEHAVIOR: all

- name: Upload coverage to Github
uses: actions/upload-artifact@v2
@@ -379,7 +377,6 @@ jobs:
POSTGRES_DB: postgres
POSTGRES_SCHEMA: public
POSTGRES_PORT: 5432
SOURCE_RENDERING_BEHAVIOR: all

- name: Upload coverage to Github
uses: actions/upload-artifact@v2
@@ -461,12 +458,82 @@ jobs:
AIRFLOW_CONN_EXAMPLE_CONN: postgres://postgres:[email protected]:5432/postgres
PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH

  Run-Kubernetes-Tests:
    needs: Authorize
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [ "3.11" ]
        airflow-version: [ "2.9" ]
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.sha || github.ref }}
      - uses: actions/cache@v3
        with:
          path: |
            ~/.cache/pip
            .local/share/hatch/
          key: coverage-integration-kubernetes-test-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.airflow-version }}-${{ hashFiles('pyproject.toml') }}-${{ hashFiles('cosmos/__init__.py') }}

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Create KinD cluster
        uses: container-tools/kind-action@v1

      - name: Install packages and dependencies
        run: |
          python -m venv venv
          source venv/bin/activate
          pip install --upgrade pip
          pip install -e ".[tests]"
          pip install apache-airflow-providers-cncf-kubernetes
          pip install dbt-postgres==1.8.2 psycopg2==2.9.3 pytz
          pip install apache-airflow==${{ matrix.airflow-version }}
          # hatch -e tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }} run pip freeze

      - name: Run kubernetes tests
        run: |
          source venv/bin/activate
          sh ./scripts/test/kubernetes-setup.sh
          cd dev && sh ../scripts/test/integration-kubernetes.sh
          # hatch run tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }}:test-kubernetes
        env:
          AIRFLOW_HOME: /home/runner/work/astronomer-cosmos/astronomer-cosmos/
          AIRFLOW_CONN_EXAMPLE_CONN: postgres://postgres:[email protected]:5432/postgres
          AIRFLOW_CONN_AWS_S3_CONN: ${{ secrets.AIRFLOW_CONN_AWS_S3_CONN }}
          AIRFLOW_CONN_GCP_GS_CONN: ${{ secrets.AIRFLOW_CONN_GCP_GS_CONN }}
          AIRFLOW_CONN_AZURE_ABFS_CONN: ${{ secrets.AIRFLOW_CONN_AZURE_ABFS_CONN }}
          AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT: 90.0
          PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH
          COSMOS_CONN_POSTGRES_PASSWORD: ${{ secrets.COSMOS_CONN_POSTGRES_PASSWORD }}
          DATABRICKS_CLUSTER_ID: mock
          DATABRICKS_HOST: mock
          DATABRICKS_WAREHOUSE_ID: mock
          DATABRICKS_TOKEN: mock
          POSTGRES_HOST: localhost
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: postgres
          POSTGRES_SCHEMA: public
          POSTGRES_PORT: 5432

      - name: Upload coverage to Github
        uses: actions/upload-artifact@v2
        with:
          name: coverage-integration-kubernetes-test-${{ matrix.python-version }}-${{ matrix.airflow-version }}
          path: .coverage

  Code-Coverage:
    if: github.event.action != 'labeled'
    needs:
      - Run-Unit-Tests
      - Run-Integration-Tests
      - Run-Integration-Tests-Expensive
      - Run-Kubernetes-Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
18 changes: 18 additions & 0 deletions dev/Dockerfile.postgres_profile_docker_k8s
@@ -0,0 +1,18 @@
FROM python:3.11

RUN pip install dbt-postgres==1.8.2 psycopg2==2.9.3 pytz

ENV POSTGRES_DATABASE=postgres
ENV POSTGRES_DB=postgres
ENV POSTGRES_HOST=postgres.default.svc.cluster.local
ENV POSTGRES_PASSWORD=postgres
ENV POSTGRES_PORT=5432
ENV POSTGRES_SCHEMA=public
ENV POSTGRES_USER=postgres

RUN mkdir /root/.dbt
COPY dags/dbt/jaffle_shop/profiles.yml /root/.dbt/profiles.yml

RUN mkdir dags
COPY dags dags
RUN rm dags/dbt/jaffle_shop/packages.yml
12 changes: 12 additions & 0 deletions dev/dags/dbt/jaffle_shop/profiles.yml
@@ -10,3 +10,15 @@ default:
      dbname: "{{ env_var('POSTGRES_DB') }}"
      schema: "{{ env_var('POSTGRES_SCHEMA') }}"
      threads: 4

postgres_profile:
  target: dev
  outputs:
    dev:
      type: postgres
      dbname: "{{ env_var('POSTGRES_DATABASE') }}"
      host: "{{ env_var('POSTGRES_HOST') }}"
      pass: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: 5432 # "{{ env_var('POSTGRES_PORT') | as_number }}"
      schema: "{{ env_var('POSTGRES_SCHEMA') }}"
      user: "{{ env_var('POSTGRES_USER') }}"
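The `postgres_profile` target above reads most of its fields through `env_var`, so the image built from `dev/Dockerfile.postgres_profile_docker_k8s` has to provide each of those variables (the `port` is hard-coded to 5432). A minimal Python sketch of that mapping follows. It is an illustration only, not dbt's rendering code, and the function name `build_postgres_profile` is hypothetical.

```python
from typing import Dict, Mapping

# Variables the `postgres_profile` target reads via env_var() in profiles.yml.
REQUIRED_VARS = (
    "POSTGRES_DATABASE",
    "POSTGRES_HOST",
    "POSTGRES_PASSWORD",
    "POSTGRES_SCHEMA",
    "POSTGRES_USER",
)


def build_postgres_profile(env: Mapping[str, str]) -> Dict[str, object]:
    """Hypothetical helper: resolve the `dev` output from an environment mapping."""
    missing = [name for name in REQUIRED_VARS if name not in env]
    if missing:
        # The profile gives no defaults, so an unset variable is treated as an error here.
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "type": "postgres",
        "dbname": env["POSTGRES_DATABASE"],
        "host": env["POSTGRES_HOST"],
        "pass": env["POSTGRES_PASSWORD"],
        "port": 5432,  # hard-coded in the profile; POSTGRES_PORT is not read
        "schema": env["POSTGRES_SCHEMA"],
        "user": env["POSTGRES_USER"],
    }
```

Calling `build_postgres_profile(os.environ)` inside the container would show the values the profile resolves to. In the DAG, `POSTGRES_HOST` and `POSTGRES_PASSWORD` are supplied to the pod from the `postgres-secrets` Kubernetes secret.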
98 changes: 98 additions & 0 deletions dev/dags/jaffle_shop_kubernetes.py
@@ -0,0 +1,98 @@
"""
## Jaffle Shop DAG
[Jaffle Shop](https://github.com/dbt-labs/jaffle_shop) is a fictional eCommerce store. This dbt project originates from
dbt labs as an example project with dummy data to demonstrate a working dbt core project. This DAG uses the cosmos dbt
parser to generate an Airflow TaskGroup from the dbt project folder.


The step-by-step instructions to run this DAG are described in:
https://astronomer.github.io/astronomer-cosmos/getting_started/kubernetes.html#kubernetes

"""

from airflow import DAG
from airflow.providers.cncf.kubernetes.secret import Secret
from pendulum import datetime

from cosmos import (
    DbtSeedKubernetesOperator,
    DbtTaskGroup,
    ExecutionConfig,
    ExecutionMode,
    ProfileConfig,
    ProjectConfig,
)
from cosmos.profiles import PostgresUserPasswordProfileMapping

DBT_IMAGE = "dbt-jaffle-shop:1.0.0"

project_seeds = [{"project": "jaffle_shop", "seeds": ["raw_customers", "raw_payments", "raw_orders"]}]

postgres_password_secret = Secret(
    deploy_type="env",
    deploy_target="POSTGRES_PASSWORD",
    secret="postgres-secrets",
    key="password",
)

postgres_host_secret = Secret(
    deploy_type="env",
    deploy_target="POSTGRES_HOST",
    secret="postgres-secrets",
    key="host",
)

with DAG(
    dag_id="jaffle_shop_kubernetes",
    start_date=datetime(2022, 11, 27),
    doc_md=__doc__,
    catchup=False,
) as dag:
    # [START kubernetes_seed_example]
    load_seeds = DbtSeedKubernetesOperator(
        task_id="load_seeds",
        project_dir="dags/dbt/jaffle_shop",
        get_logs=True,
        schema="public",
        image=DBT_IMAGE,
        is_delete_operator_pod=False,
        secrets=[postgres_password_secret, postgres_host_secret],
        profile_config=ProfileConfig(
            profile_name="postgres_profile",
            target_name="dev",
            profile_mapping=PostgresUserPasswordProfileMapping(
                conn_id="postgres_default",
                profile_args={
                    "schema": "public",
                },
            ),
        ),
    )
    # [END kubernetes_seed_example]

    # [START kubernetes_tg_example]
    run_models = DbtTaskGroup(
        profile_config=ProfileConfig(
            profile_name="postgres_profile",
            target_name="dev",
            profile_mapping=PostgresUserPasswordProfileMapping(
                conn_id="postgres_default",
                profile_args={
                    "schema": "public",
                },
            ),
        ),
        project_config=ProjectConfig(dbt_project_path="dags/dbt/jaffle_shop"),
        execution_config=ExecutionConfig(
            execution_mode=ExecutionMode.KUBERNETES,
        ),
        operator_args={
            "image": DBT_IMAGE,
            "get_logs": True,
            "is_delete_operator_pod": False,
            "secrets": [postgres_password_secret, postgres_host_secret],
        },
    )
    # [END kubernetes_tg_example]

    load_seeds >> run_models
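The two `Secret` objects in this DAG read the keys `password` and `host` from a Kubernetes secret named `postgres-secrets`, which `scripts/test/kubernetes-setup.sh` creates with `kubectl create secret generic`. As a rough sketch of what that secret looks like as a manifest, the snippet below builds the equivalent `Opaque` Secret, whose `data` values Kubernetes stores base64-encoded. The helper name `secret_manifest` is hypothetical and is not part of this PR.

```python
import base64
import json
from typing import Dict, Mapping


def secret_manifest(name: str, literals: Mapping[str, str]) -> Dict[str, object]:
    """Hypothetical helper: build an Opaque Secret manifest from plain-text literals."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        # Values under `data` must be base64-encoded strings.
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in literals.items()
        },
    }


manifest = secret_manifest(
    "postgres-secrets",
    {
        "host": "postgres-postgresql.default.svc.cluster.local",
        "password": "postgres",
    },
)
print(json.dumps(manifest, indent=2))
```

The `key` argument of each `Secret` in the DAG has to match a key under `data`, and `deploy_target` names the environment variable the value is exposed as inside the pod.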
24 changes: 4 additions & 20 deletions docs/getting_started/execution-modes.rst
@@ -144,27 +144,11 @@ Check the step-by-step guide on using the ``kubernetes`` execution mode at :ref:

Example DAG:

.. code-block:: python

postgres_password_secret = Secret(
deploy_type="env",
deploy_target="POSTGRES_PASSWORD",
secret="postgres-secrets",
key="password",
)
.. literalinclude:: ../../dev/dags/jaffle_shop_kubernetes.py
    :language: python
    :start-after: [START kubernetes_seed_example]
    :end-before: [END kubernetes_seed_example]

docker_cosmos_dag = DbtDag(
# ...
execution_config=ExecutionConfig(
execution_mode=ExecutionMode.KUBERNETES,
),
operator_args={
"image": "dbt-jaffle-shop:1.0.0",
"get_logs": True,
"is_delete_operator_pod": False,
"secrets": [postgres_password_secret],
},
)
AWS_EKS
----------

28 changes: 4 additions & 24 deletions docs/getting_started/kubernetes.rst
@@ -28,30 +28,10 @@ Additional KubernetesPodOperator parameters can be added on the operator_args pa

For instance,

.. code-block:: python

run_models = DbtTaskGroup(
profile_config=ProfileConfig(
profile_name="postgres_profile",
target_name="dev",
profile_mapping=PostgresUserPasswordProfileMapping(
conn_id="postgres_default",
profile_args={
"schema": "public",
},
),
),
project_config=ProjectConfig(PROJECT_DIR),
execution_config=ExecutionConfig(
execution_mode=ExecutionMode.KUBERNETES,
),
operator_args={
"image": DBT_IMAGE,
"get_logs": True,
"is_delete_operator_pod": False,
"secrets": [postgres_password_secret, postgres_host_secret],
},
)
.. literalinclude:: ../../dev/dags/jaffle_shop_kubernetes.py
    :language: python
    :start-after: [START kubernetes_tg_example]
    :end-before: [END kubernetes_tg_example]

Step-by-step instructions
+++++++++++++++++++++++++
1 change: 1 addition & 0 deletions pyproject.toml
@@ -168,6 +168,7 @@ freeze = "pip freeze"
test = 'sh scripts/test/unit.sh'
test-cov = 'sh scripts/test/unit-cov.sh'
test-integration = 'sh scripts/test/integration.sh'
test-kubernetes = "sh scripts/test/integration-kubernetes.sh"
test-integration-dbt-1-5-4 = 'sh scripts/test/integration-dbt-1-5-4.sh'
test-integration-expensive = 'sh scripts/test/integration-expensive.sh'
test-integration-setup = 'sh scripts/test/integration-setup.sh'
1 change: 1 addition & 0 deletions scripts/test/integration-dbt-1-5-4.sh
@@ -10,4 +10,5 @@ pytest -vv \
--durations=0 \
-m integration \
--ignore=tests/perf \
--ignore=tests/test_example_k8s_dags.py \
-k 'basic_cosmos_task_group'
1 change: 1 addition & 0 deletions scripts/test/integration-expensive.sh
@@ -6,4 +6,5 @@ pytest -vv \
--durations=0 \
-m integration \
--ignore=tests/perf \
--ignore=tests/test_example_k8s_dags.py \
-k 'example_cosmos_python_models or example_virtualenv'
16 changes: 16 additions & 0 deletions scripts/test/integration-kubernetes.sh
@@ -0,0 +1,16 @@
#!/bin/bash

set -x
set -e

# Reset the Airflow database to its initial state
airflow db reset -y

# Run tests using pytest
pytest -vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m integration \
../tests/test_example_k8s_dags.py
1 change: 1 addition & 0 deletions scripts/test/integration-sqlite.sh
@@ -5,4 +5,5 @@ pytest -vv \
--durations=0 \
-m integration \
--ignore=tests/perf \
--ignore=tests/test_example_k8s_dags.py \
-k 'example_cosmos_sources or sqlite'
3 changes: 2 additions & 1 deletion scripts/test/integration.sh
@@ -20,4 +20,5 @@ pytest -vv \
--durations=0 \
-m integration \
--ignore=tests/perf \
-k 'not (sqlite or example_cosmos_sources or example_cosmos_python_models or example_virtualenv)'
--ignore=tests/test_example_k8s_dags.py \
-k 'not (sqlite or example_cosmos_sources or example_cosmos_python_models or example_virtualenv or jaffle_shop_kubernetes)'
34 changes: 34 additions & 0 deletions scripts/test/kubernetes-setup.sh
@@ -0,0 +1,34 @@
#!/bin/bash

# Print each command before executing it (for debugging purposes)
# Exit the script immediately if any command exits with a non-zero status
set -x
set -e

# Create a Kubernetes secret named 'postgres-secrets' with the specified literals for host and password
kubectl create secret generic postgres-secrets \
--from-literal=host=postgres-postgresql.default.svc.cluster.local \
--from-literal=password=postgres

# Apply the PostgreSQL deployment configuration from the specified YAML file
kubectl apply -f scripts/test/postgres-deployment.yaml

# Build the Docker image with tag 'dbt-jaffle-shop:1.0.0' using the specified Dockerfile
cd dev && docker build --progress=plain --no-cache -t dbt-jaffle-shop:1.0.0 -f Dockerfile.postgres_profile_docker_k8s .

# Load the Docker image into the local KIND cluster
kind load docker-image dbt-jaffle-shop:1.0.0

# Retrieve the name of the PostgreSQL pod using the label selector 'app=postgres'
# The output is filtered to get the first pod's name
POD_NAME=$(kubectl get pods -n default -l app=postgres -o jsonpath='{.items[0].metadata.name}')

# Print the name of the PostgreSQL pod
echo "$POD_NAME"

# Forward port 5432 from the PostgreSQL pod to the local machine's port 5432
# This allows local access to the PostgreSQL instance running in the pod
kubectl port-forward --namespace default "$POD_NAME" 5432:5432 &

# List all pods in the default namespace to verify the status of pods
kubectl get pod
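`kubectl port-forward` is started in the background here, so the script can return before the tunnel is accepting connections; several commits in this PR add sleeps around this step. One possible alternative to a fixed sleep is to poll the port until a TCP connection succeeds. The sketch below is an assumption about how that could be done and is not part of this PR; `wait_for_port` is a hypothetical helper.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Hypothetical helper: return True once host:port accepts a TCP connection.

    Retries every `interval` seconds and gives up after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; wait and try again.
            time.sleep(interval)
    return False
```

A test runner could call `wait_for_port("localhost", 5432)` before invoking pytest and fail fast when it returns `False`. This only confirms that the port accepts connections, not that PostgreSQL is ready to serve queries.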
3 changes: 2 additions & 1 deletion scripts/test/performance.sh
@@ -2,4 +2,5 @@ pytest -vv \
-s \
-m 'perf' \
--ignore=tests/test_example_dags.py \
--ignore=tests/test_example_dags_no_connections.py
--ignore=tests/test_example_dags_no_connections.py \
--ignore=tests/test_example_k8s_dags.py