Merge pull request #7 from navikt/new_beginning
New beginning
Kyrremann authored Oct 13, 2023
2 parents 900973a + 2fbd5aa commit bfbc79f
Showing 24 changed files with 5,044 additions and 109 deletions.
34 changes: 22 additions & 12 deletions .github/workflows/release.yml
@@ -5,20 +5,30 @@ on:
    types: [published]

jobs:
  publish-pypi:
  tests:
    runs-on: ubuntu-latest
    container:
      image: python:3.8
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: 3.11
      - run: pip3 install poetry
      - run: poetry install --with test
      - run: poetry run pytest

  publish-pypi:
    runs-on: ubuntu-latest
    needs: tests
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: 3.11
      - name: Set version
        run: sed -i "s~version=.*~version='$(echo ${{ github.ref }} | sed "s/refs\/tags\///"),~g" setup.py

      - run: pip3 install twine --user

      - name: Build release
        run: python3 setup.py sdist bdist_wheel

      - name: Publish to PyPi
        run: python -m twine upload -u ${{ secrets.PYPI_USER }} -p ${{ secrets.PYPI_PASSWORD }} dist/*
        run: sed -i "s~version =.*~version = '$(echo ${{ github.ref }} | sed "s/refs\/tags\///"),~g" pyproject.toml
      - run: pip3 install --user poetry
      - run: poetry build
      - run: poetry publish
        env:
          POETRY_HTTP_BASIC_PYPI_USERNAME: ${{ secrets.PYPI_USER }}
          POETRY_HTTP_BASIC_PYPI_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
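
Reviewer note: the `Set version` step strips the `refs/tags/` prefix from `github.ref` and rewrites the `version = ...` line in `pyproject.toml`. As written here, the sed replacement appears to open a single quote and end with a comma (a shape carried over from `setup.py`), which would probably not leave valid TOML. The sketch below shows the same transformation in Python under stated assumptions: `tag_from_ref` and `set_version` are hypothetical helpers that are not part of this commit, and the sample `pyproject.toml` text is invented for illustration.

```python
import re


def tag_from_ref(ref: str) -> str:
    """Strip the 'refs/tags/' prefix found in github.ref for tag events."""
    prefix = "refs/tags/"
    return ref[len(prefix):] if ref.startswith(prefix) else ref


def set_version(pyproject_text: str, version: str) -> str:
    """Rewrite the first line that starts with `version =` using a quoted value."""
    # Anchoring at the start of the line leaves keys such as
    # `target-version` untouched, unlike an unanchored `version =.*` pattern.
    pattern = re.compile(r"^version\s*=.*$", flags=re.MULTILINE)
    # A callable replacement avoids backslash interpretation in the version string.
    return pattern.sub(lambda _match: f'version = "{version}"', pyproject_text, count=1)


if __name__ == "__main__":
    # Hypothetical file content, used only to demonstrate the rewrite.
    sample = '[tool.poetry]\nname = "example"\nversion = "0.0.0"\n'
    print(set_version(sample, tag_from_ref("refs/tags/1.2.3")))
```

Poetry's own `poetry version <tag>` command may be a simpler alternative to editing the file by hand, since it writes the version field itself.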
15 changes: 15 additions & 0 deletions .github/workflows/tests.yaml
@@ -0,0 +1,15 @@
name: Tests

on: push

jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: 3.11
      - run: pip3 install poetry
      - run: poetry install --with test
      - run: poetry run pytest
161 changes: 160 additions & 1 deletion .gitignore
@@ -1 +1,160 @@
.idea
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
124 changes: 81 additions & 43 deletions README.md
@@ -1,49 +1,87 @@
# Dataverk airflow
A simple wrapper library around [KubernetesPodOperator](https://airflow.apache.org/docs/stable/kubernetes.html) that creates
Airflow tasks that run in separate Kubernetes pods.

## Knada pod notebook operator
Creates a Kubernetes pod operator that runs a Jupyter notebook. Handles cloning of the desired repo and failure
notifications by email and/or Slack.
A simple wrapper library around [KubernetesPodOperator](https://airflow.apache.org/docs/stable/kubernetes.html) that creates Airflow tasks that run in a Kubernetes pod.

### Usage example
````python
## Our operators

All of our operators let you clone a repo beforehand; just add it with `repo="navikt/<repo>"`.
We also support installing Python packages when the Airflow task starts; point to your `requirements.txt` file with `requirements_path="/path/to/requirements.txt"`.

### Quarto operator

This one runs Quarto render for you.

```python
from airflow import DAG
from airflow.utils.dates import days_ago
from dataverk_airflow import quarto_operator


with DAG('dag-name', start_date=days_ago(1), schedule_interval="*/10 * * * *") as dag:
    t1 = quarto_operator(dag=dag,
                         name="<task-name>",
                         repo="navikt/<repo>",
                         quarto={
                             "path": "/path/to/index.qmd",
                             "env": "dev/prod",
                             "id": "uuid",
                             "token": "quarto-token"
                         },
                         slack_channel="<#slack-alert-channel>")
```

### Notebook operator

This one lets you run a Jupyter notebook.

```python
from airflow import DAG
from airflow.utils.dates import days_ago
from dataverk_airflow import notebook_operator


with DAG('dag-name', start_date=days_ago(1), schedule_interval="*/10 * * * *") as dag:
    t1 = notebook_operator(dag=dag,
                           name="<task-name>",
                           repo="navikt/<repo>",
                           nb_path="/path/to/notebook.ipynb",
                           slack_channel="<#slack-alert-channel>")
```

### Python operator

This one lets you run arbitrary Python scripts.

```python
from airflow import DAG
from datetime import datetime
from dataverk_airflow.knada_operators import create_knada_nb_pod_operator


with DAG('pod-dag-name', start_date=datetime(2020, 10, 28), schedule_interval="*/10 * * * *") as dag:
    t1 = create_knada_nb_pod_operator(dag=dag,
                                      email="<[email protected]>",
                                      slack_channel="<#slack-alert-channel>",
                                      name="<task-name>",
                                      repo="navikt/<repo>",
                                      nb_path="<path-to-notebook-in-repo>",
                                      namespace="<kubernetes-namespace>",
                                      branch="<branch-in-repo>",
                                      log_output=False)
````

## Knada python pod operator
Creates a Kubernetes pod operator that runs a Python script. Handles cloning of the desired repo and failure
notifications by email and/or Slack.

````python

from airflow.utils.dates import days_ago
from dataverk_airflow import python_operator


with DAG('dag-name', start_date=days_ago(1), schedule_interval="*/10 * * * *") as dag:
    t1 = python_operator(dag=dag,
                         name="<task-name>",
                         repo="navikt/<repo>",
                         script_path="/path/to/script.py",
                         slack_channel="<#slack-alert-channel>")
```

## Kubernetes operator

We also offer our own Kubernetes operator, which clones a chosen repo into the container.

```python
from airflow import DAG
from datetime import datetime
from dataverk_airflow.knada_operators import create_knada_python_pod_operator


with DAG('pod-dag-name', start_date=datetime(2020, 10, 28), schedule_interval="*/10 * * * *") as dag:
    t1 = create_knada_python_pod_operator(dag=dag,
                                          email="<[email protected]>",
                                          slack_channel="<#slack-alert-channel>",
                                          name="<task-name>",
                                          repo="navikt/<repo>",
                                          script_path="<path-to-notebook-in-repo>",
                                          namespace="<kubernetes-namespace>",
                                          branch="<branch-in-repo>")
````
from airflow.utils.dates import days_ago
from dataverk_airflow import kubernetes_operator


with DAG('dag-name', start_date=days_ago(1), schedule_interval="*/10 * * * *") as dag:
    t1 = kubernetes_operator(dag=dag,
                             name="<task-name>",
                             repo="navikt/<repo>",
                             cmds=["/path/to/bin/", "script-name.sh", "argument1", "argument2"],
                             image="europe-north1-docker.pkg.dev/nais-management-233d/ditt-team/ditt-image:din-tag",
                             slack_channel="<#slack-alert-channel>")
```
5 changes: 5 additions & 0 deletions dataverk_airflow/__init__.py
@@ -0,0 +1,5 @@
from dataverk_airflow.git_clone import git_clone
from dataverk_airflow.kubernetes_operator import kubernetes_operator, MissingValueException
from dataverk_airflow.python_operator import python_operator
from dataverk_airflow.notebook_operator import notebook_operator
from dataverk_airflow.quarto_operator import quarto_operator
28 changes: 28 additions & 0 deletions dataverk_airflow/git_clone.py
@@ -0,0 +1,28 @@
import os

import kubernetes.client as k8s


def git_clone(
    repo: str,
    branch: str,
    mount_path: str
):
    return k8s.V1Container(
        name="clone-repo",
        image=os.getenv("CLONE_REPO_IMAGE"),
        volume_mounts=[
            k8s.V1VolumeMount(
                name="dags-data",
                mount_path=mount_path,
                sub_path=None,
                read_only=False
            ),
            k8s.V1VolumeMount(
                name="airflow-git-secret",
                mount_path="/keys",
            ),
        ],
        command=["/bin/sh", "-c"],
        args=[f"/git-clone.sh {repo} {branch} {mount_path}"]
    )
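
Reviewer note: `git_clone` passes `repo`, `branch` and `mount_path` to `/git-clone.sh` as one interpolated `sh -c` string, so a value containing whitespace or shell metacharacters would be split or interpreted by the shell. A minimal sketch of one possible guard follows, assuming quoting is wanted at all; `build_clone_args` is a hypothetical helper that is not part of this commit, and it relies only on the standard-library `shlex.quote`.

```python
import shlex
from typing import List


def build_clone_args(repo: str, branch: str, mount_path: str) -> List[str]:
    """Build the single `sh -c` argument that invokes /git-clone.sh."""
    # shlex.quote leaves simple values untouched and single-quotes anything
    # that contains whitespace or shell metacharacters.
    quoted = " ".join(shlex.quote(value) for value in (repo, branch, mount_path))
    return [f"/git-clone.sh {quoted}"]


if __name__ == "__main__":
    # Hypothetical values, used only to show the resulting command string.
    print(build_clone_args("navikt/some-repo", "feature branch", "/dags"))
```

The return value has the same shape as the `args` list in `git_clone`, so it could be dropped in there if this behaviour is desired.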
@@ -1,4 +1,5 @@
import os

import kubernetes.client as k8s

