Data versioning #3437

Merged 24 commits on Dec 18, 2024
Changes from 19 commits
Commits
8e121c1
dataset versioning
ShakutaiGit Sep 19, 2024
b70a26b
docs for tag added
ShakutaiGit Sep 19, 2024
6757548
generating the readme and github workflow using the script.
ShakutaiGit Oct 27, 2024
ccd84ff
update the config variables to the common vars
ShakutaiGit Oct 29, 2024
f04043c
Merge remote-tracking branch 'upstream/main' into dataset-versioning
ShakutaiGit Oct 29, 2024
03ac57c
Merge remote-tracking branch 'upstream/main' into dataset-versioning
ShakutaiGit Nov 4, 2024
1224ce8
black
ShakutaiGit Nov 4, 2024
cf39fb3
running readme.py
ShakutaiGit Nov 4, 2024
ad0410a
disabling blob upload
ShakutaiGit Nov 4, 2024
e9565d7
disabling azure blob
ShakutaiGit Nov 4, 2024
97f81c4
update path
ShakutaiGit Nov 4, 2024
3d16dcb
black formatting
ShakutaiGit Nov 4, 2024
aa41300
creating the workflow again
ShakutaiGit Nov 4, 2024
8ac12d6
remove from readme
ShakutaiGit Nov 4, 2024
7cf8c1c
adding workflow.
ShakutaiGit Nov 4, 2024
b32a59b
re-constructing
ShakutaiGit Nov 14, 2024
31e4dbb
added workflow
ShakutaiGit Nov 14, 2024
8567512
fix workflow
ShakutaiGit Nov 14, 2024
777f981
Merge branch 'main' of https://github.com/Azure/azureml-examples into…
ShakutaiGit Nov 14, 2024
d543ed7
consist variables names
ShakutaiGit Nov 25, 2024
80397af
Merge branch 'main' of https://github.com/Azure/azureml-examples into…
ShakutaiGit Nov 25, 2024
1aa4d4e
Merge branch 'main' of https://github.com/Azure/azureml-examples into…
ShakutaiGit Dec 18, 2024
6f9fce8
fix words ident
ShakutaiGit Dec 18, 2024
9dff727
fix cron time
ShakutaiGit Dec 18, 2024

1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -22,6 +22,7 @@ sdk/python/foundation-models/cohere/command_faiss_langchain.ipynb @stewart-co @k
sdk/python/foundation-models/cohere/command_tools-langchain.ipynb @stewart-co @kseniia-cohere
/sdk/python/foundation-models/nixtla/ @AzulGarza
/sdk/python/foundation-models/healthcare-ai/ @jmerkow @ivantarapov
/sdk/python/assets/data/versioning.ipynb @ShakutaiGit

#### files referenced in docs (DO NOT EDIT, except for Docs team!!!) #############################################################################################
/cli/assets/component/train.yml @sdgilley @msakande @Blackmist @ssalgadodev @lgayhardt @fbsolo-ms1
94 changes: 94 additions & 0 deletions .github/workflows/sdk-assets-data-versioning.yml
@@ -0,0 +1,94 @@
# This code is autogenerated.
# Code is generated by running custom script: python3 readme.py
# Any manual changes to this file may cause incorrect behavior.
# Any manual changes will be overwritten if the code is regenerated.

name: sdk-assets-data-versioning
# This file is created by sdk/python/readme.py.
# Please do not edit directly.
on:
  workflow_dispatch:
  schedule:
    - cron: "48 11/12 * * *"
  pull_request:
    branches:
      - main
    paths:
      - sdk/python/assets/data/**
      - .github/workflows/sdk-assets-data-versioning.yml
      - sdk/python/dev-requirements.txt
      - infra/bootstrapping/**
      - sdk/python/setup.sh

permissions:
  id-token: write
concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: check out repo
      uses: actions/checkout@v2
    - name: setup python
      uses: actions/setup-python@v2
      with:
        python-version: "3.10"
    - name: pip install notebook reqs
      run: pip install --no-cache-dir -r sdk/python/dev-requirements.txt
    - name: azure login
      uses: azure/login@v1
      with:
        client-id: ${{ secrets.OIDC_AZURE_CLIENT_ID }}
        tenant-id: ${{ secrets.OIDC_AZURE_TENANT_ID }}
        subscription-id: ${{ secrets.OIDC_AZURE_SUBSCRIPTION_ID }}
    - name: bootstrap resources
      run: |
          echo '${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}';
          bash bootstrap.sh
      working-directory: infra/bootstrapping
      continue-on-error: false
    - name: setup SDK
      run: |
          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
          bash setup.sh
      working-directory: sdk/python
      continue-on-error: true
    - name: validate readme
      run: |
          python check-readme.py "${{ github.workspace }}/sdk/python/assets/data"
      working-directory: infra/bootstrapping
      continue-on-error: false
    - name: setup-cli
      run: |
          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
          bash setup.sh
      working-directory: cli
      continue-on-error: true
    - name: Eagerly cache access tokens for required scopes
      run: |
          # Workaround for azure-cli's lack of support for ID token refresh
          # Taken from: https://github.com/Azure/login/issues/372#issuecomment-2056289617

          # Management
          az account get-access-token --scope https://management.azure.com/.default --output none
          # ML
          az account get-access-token --scope https://ml.azure.com/.default --output none
    - name: run assets/data/versioning.ipynb
      run: |
          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
          bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" generate_workspace_config "../../.azureml/config.json";
          bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" replace_template_values "versioning.ipynb";
          [ -f "../../.azureml/config" ] && cat "../../.azureml/config";
          papermill -k python versioning.ipynb versioning.output.ipynb
      working-directory: sdk/python/assets/data
    - name: upload notebook's working folder as an artifact
      if: ${{ always() }}
      uses: ./.github/actions/upload-artifact
      with:
        name: versioning
        path: sdk/python/assets/data
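
Reviewer note on the schedule in the workflow above: the cron expression "48 11/12 * * *" is easy to misread. The sketch below is a rough illustration, not part of the PR. It expands a single cron field on the assumption that GitHub Actions treats "start/step" as "begin at start and repeat every step up to the end of the field's range", and that schedules are evaluated in UTC. The helper name expand_cron_field is hypothetical.

```python
def expand_cron_field(field: str, low: int, high: int) -> list[int]:
    """Expand one cron field such as "*", "5", "0-10", "*/15" or "11/12".

    Assumption: "start/step" means start, start+step, ... up to `high`.
    """
    values: set[int] = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # With a step, run to the top of the range; a bare number is one value.
            end = high if step_text else start
        values.update(range(start, end + 1, step))
    return sorted(values)


hours = expand_cron_field("11/12", 0, 23)
minutes = expand_cron_field("48", 0, 59)
print([f"{h:02d}:{m:02d} UTC" for h in hours for m in minutes])
# prints ['11:48 UTC', '23:48 UTC']
```

Under that reading, the workflow runs twice a day, at 11:48 and 23:48 UTC.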
1 change: 1 addition & 0 deletions sdk/python/README.md
@@ -39,6 +39,7 @@ Test Status is for branch - **_main_**
|assets|assets-in-registry|[share-models-components-environments](assets/assets-in-registry/share-models-components-environments.ipynb)|*no description*|[![share-models-components-environments](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-assets-in-registry-share-models-components-environments.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-assets-in-registry-share-models-components-environments.yml)|
|assets|component|[component](assets/component/component.ipynb)|Create a component asset|[![component](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-component-component.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-component-component.yml)|
|assets|data|[data](assets/data/data.ipynb)|Read, write and register a data asset|[![data](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-data.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-data.yml)|
|assets|data|[versioning](assets/data/versioning.ipynb)|Compute and check dataset hash in Azure ML; register if unique for efficient versioning.|[![versioning](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-versioning.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-versioning.yml)|
|assets|data|[working_with_mltable](assets/data/working_with_mltable.ipynb)|Read, write and register a data asset|[![working_with_mltable](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-working_with_mltable.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-working_with_mltable.yml)|
|assets|environment|[environment](assets/environment/environment.ipynb)|Create custom environments from docker and/or conda YAML|[![environment](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-environment-environment.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-environment-environment.yml)|
|assets|model|[model](assets/model/model.ipynb)|Create model from local files, cloud files, Runs|[![model](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-model-model.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-model-model.yml)|
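
Reviewer note on the new README row: the versioning notebook is described as "Compute and check dataset hash in Azure ML; register if unique for efficient versioning." The notebook itself is not shown in this diff, so the following is only a minimal sketch of that idea under stated assumptions. It hashes a local folder with SHA-256 and uses a plain dictionary as a hypothetical stand-in for the data asset versions in a workspace. The names dataset_hash, register_if_unique and registry are illustrative and are not taken from the notebook or from the Azure ML SDK.

```python
import hashlib
import tempfile
from pathlib import Path


def dataset_hash(root: Path) -> str:
    """Return a SHA-256 digest over relative file paths and file contents."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Include the relative path so renames change the hash too.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        with path.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        digest.update(b"\0")
    return digest.hexdigest()


def register_if_unique(root: Path, registry: dict[str, str]) -> tuple[str, bool]:
    """Register a new version only if no existing version has the same hash.

    `registry` maps version -> content hash; it is a hypothetical in-memory
    substitute for the versions (and hash tags) stored in a workspace.
    """
    content_hash = dataset_hash(root)
    for version, known_hash in registry.items():
        if known_hash == content_hash:
            return version, False  # identical data is already registered
    new_version = str(len(registry) + 1)
    registry[new_version] = content_hash
    return new_version, True


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "train.csv").write_text("a,b\n1,2\n")
    versions: dict[str, str] = {}
    print(register_if_unique(data_dir, versions))  # ('1', True)
    print(register_if_unique(data_dir, versions))  # ('1', False)
    (data_dir / "train.csv").write_text("a,b\n1,3\n")
    print(register_if_unique(data_dir, versions))  # ('2', True)
```

In a real workspace the dictionary lookup would presumably be replaced by listing the existing versions of the data asset and comparing a stored hash tag, which avoids uploading and registering content that has not changed.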