Dciborow/linux template testing #1008
Conversation
Check out this pull request on ReviewNB. You'll be able to see Jupyter notebook diffs and discuss changes. Powered by ReviewNB.
testResultsFiles: '**/test-*.xml'
failTaskOnFailedTests: true
condition: succeededOrFailed()
- template: steps/conda_pytest_linux.yml
I'm trying to understand the workflow: this file will call tests/ci/azure_pipeline_test/steps/conda_pytest_linux.yml, right?
Two questions:
- what does coalesce do?
- are PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON always set, even if they are not in the spark environment?
- It fills in the first string that is not empty. I am using it to minimize duplicated inputs. In particular, the notebooks have a strange mix of params, where in most cases I need to fill in "unit", but in other cases "notebook".
- Yes, as they are just env vars (and I'm still working on my DevOps jujitsu to do conditional addition of params), I decided to just set them.
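For context, coalesce is an Azure Pipelines template expression that returns the first of its arguments that is not null or empty. A minimal sketch of how it can cut down duplicated inputs (the parameter names here are illustrative, not the exact ones in this PR):

parameters:
  notebook_marker: ''        # illustrative name; only filled in by the notebook pipelines
  default_marker: 'unit'     # illustrative name

steps:
- script: >
    pytest tests/${{ coalesce(parameters.notebook_marker, parameters.default_marker) }}
    --junitxml=reports/test-unit.xml
  displayName: 'Run pytest against the first non-empty marker'

With something like this, a caller only has to pass notebook_marker for the notebook runs and can omit it everywhere else.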
Foot in mouth: I figured out how to only insert the params for Spark.
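For reference, the conditional insertion ends up looking roughly like this (the parameter name and values are illustrative, not the exact ones in the template):

parameters:
  test_environment: 'cpu'    # illustrative: 'cpu', 'gpu' or 'spark'

steps:
- script: pytest tests/unit --junitxml=reports/test-unit.xml
  displayName: 'Run unit tests'
  env:
    ${{ if eq(parameters.test_environment, 'spark') }}:
      PYSPARK_PYTHON: python
      PYSPARK_DRIVER_PYTHON: python

The ${{ if }} block only inserts the two env vars when the Spark condition is met, so the other runs never see them.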
@@ -0,0 +1,42 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
where is this used?
https://dev.azure.com/AZGlobal/Azure%20Global%20CAT%20Engineering/_build?definitionId=134&_a=summary
We can create a pipeline off of it in the other ADO as well, but I didn't want to do too much there without everyone on the team understanding it first (I only did the minimum I needed to parameterize the agents).
@@ -0,0 +1,41 @@
parameters:
this file is similar but not exactly the same as tests/ci/azure_pipeline_test/stages/linux_test_stages.yml.
linux_test_stage is the generic template for one stage, while linux_test_stages specifically calls that template three different ways, for the CPU, GPU, and Spark tests.
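As a rough sketch of that relationship (parameter names are assumptions, not the exact contents of the files in this PR), linux_test_stages.yml boils down to something like:

stages:
- template: linux_test_stage.yml
  parameters:
    test_environment: cpu                                    # illustrative parameter names
    test_markers: 'not notebooks and not spark and not gpu'
- template: linux_test_stage.yml
  parameters:
    test_environment: gpu
    test_markers: 'not notebooks and not spark and gpu'
- template: linux_test_stage.yml
  parameters:
    test_environment: spark
    test_markers: 'not notebooks and spark and not gpu'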
So, first question: we have dsvm_trigger_linux_multistage.yml that calls stages/linux_test_stages.yml, which calls stage/linux_test_stage.yml three times, for the CPU, GPU, and Spark tests, and each of those then calls steps/conda_pytest_linux.yml; this is for the unit tests. Then we have dsvm_nightly_linux_multistage.yml that does the same thing but for the nightly tests?
Second question: who is calling conda_pytest_linux_steps.yml?
Third question: the original ymls are still there; are they still needed if we have (1st Q)?
Fourth question: if we don't need the yamls in (3rd Q), is GitHub going to identify the 12 original pipelines independently so they will appear when someone does a PR?
As a general comment, I'm honestly making an effort to understand the value of all this parametrization, but to me it is increasing the complexity. I'm trying to understand how we are making things easier for everyone, but so far I can't.
So, I'm slowly trying to expand the test harness that we created for the HPs to the Best Practice repos. I'm looking to minimize the configuration differences, as well as the duplicated code, across all of the build yamls that I need to bring together to do this. I included the additional templates in this repo, for this PR, to try to make it clearer than my previous PR how the templates fit together. Then, as I look to the next BP repo, I'll start pulling common pieces into the central microsoft/ai location.
for your questions...
1 - Let me try to create a diagram for this. It's funny, I think AML studio would be a great palette to show this on, but I can create something in PowerPoint that will also show how Nightly and Trigger both use the same stages (a rough sketch is included at the end of this comment).
Which of these particular levels do you think is necessary, and does any layer feel like too much?
2 - I had a typo in linux_test_stage; it should have been calling conda_pytest_linux_steps. Updated it.
3 - I did not want to touch these, because they are connected to your CI/CD system. We could do a separate PR to remove them, but only after the multistage versions are committed and a new ADO pipeline has been created, if you are interested in that.
For my expanding test harness, I did not particularly need separate pipelines for each test, so I preferred having a single pipeline connected to the repo.
I want to leave the decision about what you want to do with the individual pipelines completely up to the reco team, and just leave what I am using next to it.
4 - If someone does a PR and you use the multistage pipeline, GitHub will roll this into a single check, but if the check fails, the user can check the logs to see which stage failed. "Stages" in the ADO pipeline is a new feature that they haven't included in any type of badge yet.
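As a rough stand-in for the diagram mentioned in point 1, a sketch of how both entry pipelines include the same stages template (the trigger and schedule details are illustrative, not the exact files):

# dsvm_trigger_linux_multistage.yml (sketch)
trigger:
- master
stages:
- template: stages/linux_test_stages.yml

# dsvm_nightly_linux_multistage.yml (sketch)
schedules:
- cron: '0 0 * * *'        # illustrative schedule
  displayName: 'Nightly run'
  branches:
    include:
    - master
stages:
- template: stages/linux_test_stages.yml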
I see.
I think ideally we would like to have the same 12 pipelines for unit tests + 6 for nightly builds, but make them simpler.
An idea that came from your development would be this: there will be some components in this repo and some in the common repo (microsoft/ai). The common components will be:
- schedule.yml -> https://github.com/microsoft/recommenders/blob/master/tests/ci/azure_pipeline_test/dsvm_nightly_linux_gpu.yml#L5
- conda remove linux https://github.com/microsoft/recommenders/blob/master/tests/ci/azure_pipeline_test/dsvm_nightly_linux_gpu.yml#L54
- conda remove windows
- publish tests
these components can be reused in other repos.
In this repo we will have:
- template for creating the conda env https://github.com/microsoft/recommenders/blob/master/tests/ci/azure_pipeline_test/dsvm_nightly_linux_gpu.yml#L29
- template for creating the unit tests https://github.com/microsoft/recommenders/blob/master/tests/ci/azure_pipeline_test/dsvm_unit_linux_cpu.yml#L37
- template for creating the nightly tests https://github.com/microsoft/recommenders/blob/master/tests/ci/azure_pipeline_test/dsvm_nightly_linux_gpu.yml#L36
These components are specific to this repo and are the core controllers of the tests; if we want to change something, we don't need to go to an external repo to change it.
These are the bricks; then for each of the unit and nightly tests we will have a yaml file, and that file will be connected to GitHub.
A final thing: I think we will save a lot of lines of code if we don't use true and false flags for each component but group them instead. An example in this file: https://github.com/microsoft/recommenders/blob/master/tests/ci/azure_pipeline_test/dsvm_unit_linux_gpu.yml, this code:
- bash: |
    python scripts/generate_conda_file.py --gpu
    conda env update -n reco_gpu -f reco_gpu.yaml
  displayName: 'Creating Conda Environment with dependencies'
- script: |
    . /anaconda/etc/profile.d/conda.sh && \
    conda activate reco_gpu && \
    pytest tests/unit -m "not notebooks and not spark and gpu" --junitxml=reports/test-unit.xml
could be parametrized with conda_params (can be empty, --gpu or --pyspark), env_name and test_string (that would be "not notebooks and not spark and gpu").
Would this make sense?
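To make the suggestion concrete, a sketch of such a template (the parameter names follow the comment above; the file layout and defaults are assumptions):

parameters:
  conda_params: ''           # '', '--gpu' or '--pyspark'
  env_name: 'reco_base'
  test_string: 'not notebooks and not spark and not gpu'

steps:
- bash: |
    python scripts/generate_conda_file.py ${{ parameters.conda_params }}
    conda env update -n ${{ parameters.env_name }} -f ${{ parameters.env_name }}.yaml
  displayName: 'Creating Conda Environment with dependencies'
- script: |
    . /anaconda/etc/profile.d/conda.sh && \
    conda activate ${{ parameters.env_name }} && \
    pytest tests/unit -m "${{ parameters.test_string }}" --junitxml=reports/test-unit.xml
  displayName: 'Run unit tests'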
@anargyri proposed the following idea: we would have only 2 yaml files, one for unit tests and another for nightly builds. Similarly to the previous comment, we would parametrize all the content like conda_params, conda_env, test_instruction, etc.
Now we have two possibilities:
- we can inject the correct parameters from ADO into the template, and GitHub is able to link each test
- if 1) doesn't work, we will need to create a yaml file for each pipeline with parameters and template.
If this idea makes sense to people, the next step would be to explore whether 1) is feasible.
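To make option 2) concrete, each per-pipeline yaml could stay very small, roughly like this (the pool name, template path, and values are all illustrative):

trigger:
- master

pool:
  name: 'recolinux'          # illustrative self-hosted agent pool

steps:
- template: templates/unit_tests_template.yml   # hypothetical shared template
  parameters:
    conda_params: '--gpu'
    conda_env: 'reco_gpu'
    test_instruction: 'not notebooks and not spark and gpu'

Each such file would show up in GitHub as its own check, while all the actual logic stays in the shared template.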
I would also add that if (1) fails, you could still use two templates and create the yml files with a script (and check them in, so that GitHub can find them).
A similar approach to what I am suggesting is used frequently in other contexts, especially when there are many parameters to select. For example, Jun did something similar in this repo, creating yml files programmatically and passing them to Kubeflow:
https://github.com/microsoft/HyperTune/blob/master/src/kubeflow/manifest/hypertune.template
https://github.com/microsoft/HyperTune/blob/a40e8e07b308f878013c812d069bbcececb948b7/src/kubeflow/utils.py#L187
@dciborow Dan, I asked you a couple of questions. You mentioned earlier that this was work in progress, so I'm not sure if the PR is finished or if it is for discussion.
@miguel-ferreira, it is ready for review.
@miguel-ferreira, would it be a better PR if I backed out all of the templates I am adding, and only added the two pipeline files I need for the test harness? I can move all the templates into my central repo.
One thing I really wanted to learn from going through all the reco build pipelines was what I could extract to begin creating a standard template for any future BP repos. This has given me a good sense of it, and I would just like to understand what makes the most sense for you.
Basically, the reco repo is my test case, to make sure that my test harness interface for BP repos works.
You are mentioning the wrong Miguel.
@miguel-ferreira we would love to get your contributions :-)
Closing this as it was very old and we are reactivating the test machines.