-
Notifications
You must be signed in to change notification settings - Fork 6.8k
For mxnet-validation pipeline, require sanity build to complete successfully before running other build pipelines. #17999
Conversation
Hey @josephevans , Thanks for submitting the PR
CI supported jobs: [clang, windows-cpu, sanity, windows-gpu, website, miscellaneous, centos-cpu, unix-cpu, unix-gpu, centos-gpu, edge] Note: |
Looks like the other builds have already started running before sanity build completed @josephevans |
@mxnet-bot run ci [unix-cpu] |
Jenkins CI successfully triggered : [unix-cpu] |
@mxnet-bot run ci [unix-cpu] |
Jenkins CI successfully triggered : [unix-cpu] |
Yeah, there's a few script calls I have to approve in Jenkins for this to work. That's why it skipped the sanity check delays initially (it's fail open.) It'd be nice to use a trusted library in Jenkins to do this to (a) protect against potential security vulnerabilities and malicious abuse and (b) not have to approve these in the future. Looking into that. |
Makes sense. Thanks for the clarification |
…sanity check first, then starts all other builds.
89e0beb
to
ae4cb1b
Compare
@mxnet-bot run ci [unix-gpu] |
Jenkins CI successfully triggered : [unix-gpu] |
ci/jenkins/Jenkinsfile_full
Outdated
stage("full-build") { | ||
def pipelineName = JOB_NAME.substring(0, JOB_NAME.indexOf('/')) | ||
build job: pipelineName + "/sanity/" + BRANCH_NAME, wait: true | ||
['centos-cpu', 'centos-gpu', 'clang', 'edge', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some way to build this list automatically?
Have you tested this setup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has all been tested on mxnet-ci-dev.
We can get a list from Jenkins, but it would require scriptApprovals, which I'd like to avoid. I could move the build list to a variable at the top of the file so it's more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which function exactly would have to be whitelisted? Maybe we can make an exception if it does not expose anything we're worried about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would have to whitelist the Jenkins.instance static method, which would open a huge hole that anyone who submitted a PR could exploit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay agree.
ci/jenkins/Jenkinsfile_full
Outdated
max_time = 180 | ||
|
||
stage("full-build") { | ||
def pipelineName = JOB_NAME.substring(0, JOB_NAME.indexOf('/')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure that this way is resilient? Jobs might have multiple slashes etc. This doesn't seem to be fully reliable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will be resilient. It just extracts the parent folder of the job pipeline (ie 'mxnet-validation'.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well but this will break with nested folders (a/b/c/sanity will result in a/sanity) or if there is no top-level folder at all. @szha @sandeep-krishnamurthy can you get somebody to assist on this matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution @josephevans and review @marcoabreu . How can I assist here? Looks like a regular dev discussion as with any other PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to get somebody else involved who's familiar with Jenkins. I personally don't have enough time to do coaching
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcoabreu Thanks for the suggestions, I've update the PR to preserve the full path of the build job, so the staggered build pipelines will still work if we reorganize our Jenkins pipelines into subfolders.
…b path in case we use nested folders in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks a lot for this work on staggered builds!
Lets get this merged
@sandeep-krishnamurthy @leezu plz help review/merge
@mxnet-label-bot [pr-awaiting-merge] |
Please show that this is tested and working before merging. |
We will still need to make some changes to Jenkins in order for the staggered builds to start working, so there is no risk in merging now. I will send an email out to the community when I setup the new build pipeline and migrate to it. Here is a test run we did on ci-dev. As you can see, we first run the sanity build. Once it finishes, we spawn off all the other build jobs to run in the background. |
…ssfully before running other build pipelines. (apache#17999) * Refactor staggered builds - create new full build pipeline that runs sanity check first, then starts all other builds. * Move list of build jobs to top of file for clarity. Preserve whole job path in case we use nested folders in the future. Co-authored-by: Joe Evans <[email protected]>
…ssfully before running other build pipelines. (apache#17999) * Refactor staggered builds - create new full build pipeline that runs sanity check first, then starts all other builds. * Move list of build jobs to top of file for clarity. Preserve whole job path in case we use nested folders in the future. Co-authored-by: Joe Evans <[email protected]>
…ssfully before running other build pipelines. (apache#17999) * Refactor staggered builds - create new full build pipeline that runs sanity check first, then starts all other builds. * Move list of build jobs to top of file for clarity. Preserve whole job path in case we use nested folders in the future. Co-authored-by: Joe Evans <[email protected]>
* For mxnet-validation pipeline, require sanity build to complete successfully before running other build pipelines. (#17999) * Refactor staggered builds - create new full build pipeline that runs sanity check first, then starts all other builds. * Move list of build jobs to top of file for clarity. Preserve whole job path in case we use nested folders in the future. Co-authored-by: Joe Evans <[email protected]> * If sanity build is not found, wait until Jenkins recognizes it. (#18119) * If sanity build is not found, wait until Jenkins recognizes it. * Also add a timeout of 30m for sanity build to run and complete, so we don't get stuck in a loop. Co-authored-by: Joe Evans <[email protected]> Co-authored-by: Joe Evans <[email protected]>
* For mxnet-validation pipeline, require sanity build to complete successfully before running other build pipelines. (#17999) * Refactor staggered builds - create new full build pipeline that runs sanity check first, then starts all other builds. * Move list of build jobs to top of file for clarity. Preserve whole job path in case we use nested folders in the future. Co-authored-by: Joe Evans <[email protected]> * If sanity build is not found, wait until Jenkins recognizes it. (#18119) * If sanity build is not found, wait until Jenkins recognizes it. * Also add a timeout of 30m for sanity build to run and complete, so we don't get stuck in a loop. Co-authored-by: Joe Evans <[email protected]> Co-authored-by: Joe Evans <[email protected]>
* For mxnet-validation pipeline, require sanity build to complete successfully before running other build pipelines. (#17999) * Refactor staggered builds - create new full build pipeline that runs sanity check first, then starts all other builds. * Move list of build jobs to top of file for clarity. Preserve whole job path in case we use nested folders in the future. Co-authored-by: Joe Evans <[email protected]> * If sanity build is not found, wait until Jenkins recognizes it. (#18119) * If sanity build is not found, wait until Jenkins recognizes it. * Also add a timeout of 30m for sanity build to run and complete, so we don't get stuck in a loop. Co-authored-by: Joe Evans <[email protected]> Co-authored-by: Joe Evans <[email protected]>
…ssfully before running other build pipelines. (apache#17999) * Refactor staggered builds - create new full build pipeline that runs sanity check first, then starts all other builds. * Move list of build jobs to top of file for clarity. Preserve whole job path in case we use nested folders in the future. Co-authored-by: Joe Evans <[email protected]>
Description
Add new build pipeline that will run sanity build, and if it completes successfully, run the rest of the mxnet-validation pipelines.
#17802