Fail fast when image build fails on tests #11102

Closed
hbelmiro opened this issue Aug 15, 2024 · 2 comments · Fixed by #11115

hbelmiro commented Aug 15, 2024

/area test

Currently, the workflows don't fail when an image build fails. We need to change that so the workflow fails fast.

Several tests depend on creating a KFP cluster. For example:

uses: ./.github/actions/kfp-cluster

The kfp-cluster action does not fail when an image build fails. Instead, it fails later, when it tries to use the image that was never built. The workflow should fail as soon as the image build fails, so we don't have to wait long to find out about the failure.
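As an illustration only (not a diagnosis of the actual action scripts), one common way a build failure gets masked is that a shell function or script that does not check each command returns the status of its last command. In the sketch below, `false` stands in for a failing image build, both function names are hypothetical, and `errexit` is assumed to be off.

```bash
#!/usr/bin/env bash
# Illustration only: `false` stands in for a failing image build.

# Without error checking (and with errexit off), the function's exit
# status is that of its last command, so the earlier failure is lost.
build_then_deploy_masked() {
  false                      # simulated failed image build
  echo "deploying cluster"   # still runs; its status is what gets returned
}

# Checking the build's status stops the run before the deploy step.
build_then_deploy_checked() {
  false || return 1          # simulated failed image build, propagated
  echo "deploying cluster"   # never reached
}
```

Calling the first function reports success even though the "build" failed; the second returns a non-zero status immediately.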

Other tests depend on a different action:

uses: ./.github/actions/kfp-tekton-cluster

But they all should be using the same script.
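A minimal sketch of what a shared, fail-fast build script could look like, assuming images are built with `docker build`. The names `build_one` and `build_images` and the `NAME=CONTEXT` argument format are hypothetical and are not taken from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical shared build script: abort at the first failed image build.
set -euo pipefail

# Build a single image. Assumption: images are built with `docker build`.
build_one() {
  local name="$1" context="$2"
  docker build -t "${name}" "${context}"
}

# Build every NAME=CONTEXT pair in order. Return non-zero on the first
# failure so the calling workflow step fails immediately.
build_images() {
  local spec name context
  for spec in "$@"; do
    name="${spec%%=*}"      # text before the first '='
    context="${spec#*=}"    # text after the first '='
    echo "Building ${name} from ${context}"
    if ! build_one "${name}" "${context}"; then
      echo "ERROR: building ${name} failed, aborting" >&2
      return 1
    fi
  done
}
```

Both actions could then source this one file and call `build_images` with their own list of images, so a failed build stops either workflow at the same point.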


Love this idea? Give it a 👍.

@hbelmiro: The label(s) area/test cannot be applied, because the repository doesn't have them.

In response to this:

/area test

Currently, the workflow doesn't fail when an image build fails. We need to change that so the workflow fails fast.


Love this idea? Give it a 👍.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

hbelmiro added a commit to hbelmiro/issues-i-can-help-you-with that referenced this issue Aug 15, 2024
@ElayAharoni

/assign

ElayAharoni pushed a commit to ElayAharoni/pipelines that referenced this issue Aug 19, 2024
ElayAharoni pushed a commit to ElayAharoni/pipelines that referenced this issue Aug 20, 2024
ElayAharoni pushed a commit to ElayAharoni/pipelines that referenced this issue Aug 20, 2024
…-on-tests-#11102' into Fail-fast-when-image-build-fails-on-tests-kubeflow#11102
google-oss-prow bot pushed a commit that referenced this issue Aug 20, 2024
* Fail fast when image build fails on tests #11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

* Fail fast when image build fails on tests #11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

---------

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Co-authored-by: Elay Aharoni (EXT-Nokia) <[email protected]>
KevinGrantLee pushed a commit that referenced this issue Sep 17, 2024
* Fail fast when image build fails on tests #11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

* Fail fast when image build fails on tests #11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

---------

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Co-authored-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>
R3hankhan123 pushed a commit to R3hankhan123/pipelines that referenced this issue Sep 20, 2024
…low#11115)

* Fail fast when image build fails on tests kubeflow#11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

* Fail fast when image build fails on tests kubeflow#11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

---------

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Co-authored-by: Elay Aharoni (EXT-Nokia) <[email protected]>
google-oss-prow bot pushed a commit that referenced this issue Sep 20, 2024
…etters (#11097)

* temp title: change title

Signed-off-by: KevinGrantLee <[email protected]>

* add release notes

Signed-off-by: KevinGrantLee <[email protected]>

* formatting

Signed-off-by: KevinGrantLee <[email protected]>

* feat(backend): move comp logic to workflow params (#10979)

* feat(backend): move comp logic to workflow params

Signed-off-by: zazulam <[email protected]>
Co-authored-by: droctothorpe <[email protected]>
Co-authored-by: andreafehrman <[email protected]>
Co-authored-by: MonicaZhang1 <[email protected]>
Co-authored-by: kylekaminky <[email protected]>
Co-authored-by: CarterFendley <[email protected]>
Signed-off-by: zazulam <[email protected]>

* address pr comments

Signed-off-by: zazulam <[email protected]>

* Use function name instead of base name and address edge cases

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: zazulam <[email protected]>

* Improve logic and update tests

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: zazulam <[email protected]>

* POC hashing command and args

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: zazulam <[email protected]>

* Add comments to clarify the logic

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: zazulam <[email protected]>

* Hash entire PipelineContainerSpec

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: zazulam <[email protected]>

---------

Signed-off-by: zazulam <[email protected]>
Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: droctothorpe <[email protected]>
Co-authored-by: andreafehrman <[email protected]>
Co-authored-by: MonicaZhang1 <[email protected]>
Co-authored-by: kylekaminky <[email protected]>
Co-authored-by: CarterFendley <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* feat(component): internal

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 660985413
Signed-off-by: KevinGrantLee <[email protected]>

* feat(components): internal

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 661332120
Signed-off-by: KevinGrantLee <[email protected]>

* fix(components): Fix to model batch explanation component for Structured Data pipelines

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 661475667
Signed-off-by: KevinGrantLee <[email protected]>

* feat(components): Support dynamic values for boot_disk_type, boot_disk_size in preview.custom_job.utils.create_custom_training_job_from_component

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 662242688
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Upgrade Argo to v3.4.17 (#10978)

Signed-off-by: Giulio Frasca <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* test: Moved kubeflow-pipelines-manifests to GitHub Actions (#11066)

Signed-off-by: vmudadla <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* fix: re-enable exit hanler test. (#11100)

Signed-off-by: Liav Weiss (EXT-Nokia) <[email protected]>
Co-authored-by: Liav Weiss (EXT-Nokia) <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* fix(frontend): retrieve archived logs from correct location (#11010)

* fix(frontend): retrieve archived logs from correct location

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: andreafehrman <[email protected]>
Co-authored-by: owmasch <[email protected]>

* Add namespace tag handling and validation

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: andreafehrman <[email protected]>
Co-authored-by: owmasch <[email protected]>

* Remove whitespace from keyFormat

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: andreafehrman <[email protected]>
Co-authored-by: owmasch <[email protected]>

* Update frontend unit tests

Signed-off-by: droctothorpe <[email protected]>

* Remove superfluous log statements

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: quinnovator <[email protected]>

* Add link to keyFormat in manifests

Signed-off-by: droctothorpe <[email protected]>

* Fix workflow parsing for log artifact

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: quinnovator <[email protected]>

* Fix unit test

Signed-off-by: droctothorpe <[email protected]>

---------

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: andreafehrman <[email protected]>
Co-authored-by: owmasch <[email protected]>
Co-authored-by: quinnovator <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* feat(component): internal

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 663774557
Signed-off-by: KevinGrantLee <[email protected]>

* feat(component): internal

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 663872006
Signed-off-by: KevinGrantLee <[email protected]>

* chore(components): GCPC 2.16.1 Release

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 663883139
Signed-off-by: KevinGrantLee <[email protected]>

* test: Fail fast when image build fails on tests #11102 (#11115)

* Fail fast when image build fails on tests #11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

* Fail fast when image build fails on tests #11102

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

---------

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Co-authored-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* fix(components): Use instance.target_field_name format for text-bison models only, use target_field_name for gemini models

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 665638487
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Renamed GitHub workflows from *.yaml to *.yml for consistency (#11126)

Signed-off-by: hbelmiro <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* Fix view edit cluster roles (#11067)

* Fixing incorrect typing in loop_parallism example

Signed-off-by: Oswaldo Gomez <[email protected]>

* Fixing samples/core/loop_parameter example

Signed-off-by: Oswaldo Gomez <[email protected]>

* Fixing aggregate-to-kubeflow-pipelines-edit

Signed-off-by: Oswaldo Gomez <[email protected]>

* keeping MRs separate.

Signed-off-by: Oswaldo Gomez <[email protected]>

* Adding blank line

Signed-off-by: Oswaldo Gomez <[email protected]>

---------

Signed-off-by: Oswaldo Gomez <[email protected]>
Co-authored-by: Oswaldo Gomez <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* fix(components): Pass moddel name to eval_runner to process batch prediction's output as per the output schema of model used

Signed-off-by: Googler <[email protected]>
PiperOrigin-RevId: 665977093
Signed-off-by: KevinGrantLee <[email protected]>

* feat(components): release LLM Model Evaluation image version v0.7

Signed-off-by: Jason Dai <[email protected]>
PiperOrigin-RevId: 666102687
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Adding @DharmitD to SDK reviewers (#11131)

Signed-off-by: ddalvi <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* test: Kubeflow Pipelines V2 integration Tests (#11125)

Signed-off-by: Diego Lovison <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Add make targets for building driver and launcher images (#11103)

Signed-off-by: Giulio Frasca <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* feat(Backend + SDK): Update kfp backend and kubernetes sdk to support EmptyDir (#10913)

Update kfp backend and kubernetes sdk to support mounting EmptyDir
volumes to task pods.

Inspired by #10427

Fixes: #10656

Signed-off-by: Greg Sheremeta <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* docs:fixing broken links in readme (#11108)

Signed-off-by: Fiona Waters <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(deps): bump micromatch from 4.0.5 to 4.0.8 in /test/frontend-integration-test (#11132)

Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.5 to 4.0.8.
- [Release notes](https://github.com/micromatch/micromatch/releases)
- [Changelog](https://github.com/micromatch/micromatch/blob/4.0.8/CHANGELOG.md)
- [Commits](micromatch/micromatch@4.0.5...4.0.8)

---
updated-dependencies:
- dependency-name: micromatch
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: KevinGrantLee <[email protected]>

* Fix: Basic sample tests - sequential is flaky (#11138)

Signed-off-by: Diego Lovison <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Wrapped "Failed GetContextByTypeAndName" error for better troubleshooting (#11098)

Signed-off-by: hbelmiro <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(components): Update AutoSxS and RLHF image tags

Signed-off-by: Michael Hu <[email protected]>
PiperOrigin-RevId: 668536503
Signed-off-by: KevinGrantLee <[email protected]>

* test: Improvements to wait_for_pods function (#11162)

Signed-off-by: hbelmiro <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* fix(frontend): fixes filter pipeline text box shows error when typing anything in it. Fixes #10241 (#11096)

* Filter pipeline text box shows error when typing anything in it #10241

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

* Filter pipeline text box shows error when typing anything in it #10241

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>

---------

Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Co-authored-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* correct artifact preview behavior in UI (#11059)

This change allows the KFP UI to fall back to the UI host namespace when no
namespace is provided when referencing the artifact object store
provider secret. In default Kubeflow deployments this namespace is
simply "kubeflow"; however, the user can customize this behavior by
providing the environment variable "SERVER_NAMESPACE" to the KFP UI
deployment.

Furthermore, this change addresses a bug that caused URL
parsing to fail for endpoints without a protocol. It adds
support for endpoints of the form <ip>:<port> for object store endpoints,
as is the case in the default KFP deployment manifests.

Signed-off-by: Humair Khan <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Added DCO link to PR template (#11176)

Signed-off-by: Helber Belmiro <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(backend): Update driver and launcher licenses (#11177)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(backend): update driver and launcher default images (#11178)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Add instructions for releasing driver and launcher images (#11179)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* test: Fixed `kfp-runtime-tests` to run on master branch (#11158)

Signed-off-by: hbelmiro <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* (fix): reduce executor logs (#11169)

* remove driver logs from executor

These logs congest the executor runtime logs making it difficult for the
user to differentiate between logs. The driver logs are unnecessary here
and can be removed to reduce this clutter.

Signed-off-by: Humair Khan <[email protected]>

* remove duplicate emissary call in executor

As per the initial inline dev comment, argo podspecpatch did not add the
emissary call, and it had to be manually added. This was fixed a couple of
argo versions back. However, as a result the executor pod makes double calls
to the executor, which as a consequence also results in superfluous logs.

This change removes the additional call to emissary to resolve this.

Signed-off-by: Humair Khan <[email protected]>

---------

Signed-off-by: Humair Khan <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: add PaulinaPacyna and ouadakarim as reviewers (#11180)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* test: Move run-all-gcpc-modules to GitHub Actions  (#11157)

* add gcpc modules tests to gha

Signed-off-by: Amanpreet Singh Bedi <[email protected]>

* remove run-all-gcpc-modules test driver script

Signed-off-by: Amanpreet Singh Bedi <[email protected]>

* fix path under gcpc modules tests github action

Signed-off-by: Amanpreet Singh Bedi <[email protected]>

* upgrade ubuntu base image

Signed-off-by: Amanpreet Singh Bedi <[email protected]>

* upgrade python version to 3.9

Signed-off-by: Amanpreet Singh Bedi <[email protected]>

---------

Signed-off-by: Amanpreet Singh Bedi <[email protected]>
Signed-off-by: Amanpreet Singh Bedi <[email protected]>
Co-authored-by: Amanpreet Singh Bedi <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* fix(sdk): Kfp support for pip trusted host (#11151)

Signed-off-by: Diego Lovison <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(sdk): Loosening kubernetes dependency constraint (#11079)

* Loosening kubernetes dependency constraint

Signed-off-by: egeucak <[email protected]>

* added setuptools in test script

Signed-off-by: egeucak <[email protected]>

---------

Signed-off-by: egeucak <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Remove unwanted Frontend test files (#10973)

Signed-off-by: ddalvi <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* fix(ui): fixes empty string value in pipeline parameters (#11175)

Signed-off-by: Jan Staněk <[email protected]>
Co-authored-by: Jan Staněk <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(backend): update driver and launcher default images (#11182)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(release): bumped version to 2.3.0

Signed-off-by: KevinGrantLee <[email protected]>

* chore: Update RELEASE.md to remove obsolete instructions (#11183)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: Release kfp-pipeline-spec 0.4.0 (#11189)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: release kfp-kubernetes 1.3.0 (#11190)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore: update kfp-kubernetes release scripts and instructions (#11191)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* feat(sdk)!: Pin kfp-pipeline-spec==0.4.0, kfp-server-api>=2.1.0,<2.4.0 (#11192)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* chore(sdk): release KFP SDK 2.9.0 (#11193)

Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: KevinGrantLee <[email protected]>

* Delete test pipelines as they are duplicate with
pipeline_with_resource_spec

Signed-off-by: KevinGrantLee <[email protected]>

---------

Signed-off-by: KevinGrantLee <[email protected]>
Signed-off-by: zazulam <[email protected]>
Signed-off-by: droctothorpe <[email protected]>
Signed-off-by: Googler <[email protected]>
Signed-off-by: Giulio Frasca <[email protected]>
Signed-off-by: vmudadla <[email protected]>
Signed-off-by: Liav Weiss (EXT-Nokia) <[email protected]>
Signed-off-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Signed-off-by: hbelmiro <[email protected]>
Signed-off-by: Oswaldo Gomez <[email protected]>
Signed-off-by: Jason Dai <[email protected]>
Signed-off-by: ddalvi <[email protected]>
Signed-off-by: Diego Lovison <[email protected]>
Signed-off-by: Greg Sheremeta <[email protected]>
Signed-off-by: Fiona Waters <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Michael Hu <[email protected]>
Signed-off-by: Humair Khan <[email protected]>
Signed-off-by: Helber Belmiro <[email protected]>
Signed-off-by: Chen Sun <[email protected]>
Signed-off-by: Amanpreet Singh Bedi <[email protected]>
Signed-off-by: Amanpreet Singh Bedi <[email protected]>
Signed-off-by: egeucak <[email protected]>
Signed-off-by: Jan Staněk <[email protected]>
Co-authored-by: Michael <[email protected]>
Co-authored-by: droctothorpe <[email protected]>
Co-authored-by: andreafehrman <[email protected]>
Co-authored-by: MonicaZhang1 <[email protected]>
Co-authored-by: kylekaminky <[email protected]>
Co-authored-by: CarterFendley <[email protected]>
Co-authored-by: Googler <[email protected]>
Co-authored-by: Giulio Frasca <[email protected]>
Co-authored-by: Vani Haripriya Mudadla <[email protected]>
Co-authored-by: Liav Weiss <[email protected]>
Co-authored-by: Liav Weiss (EXT-Nokia) <[email protected]>
Co-authored-by: owmasch <[email protected]>
Co-authored-by: quinnovator <[email protected]>
Co-authored-by: ElayAharoni <[email protected]>
Co-authored-by: Elay Aharoni (EXT-Nokia) <[email protected]>
Co-authored-by: Helber Belmiro <[email protected]>
Co-authored-by: Oswaldo Gomez <[email protected]>
Co-authored-by: Oswaldo Gomez <[email protected]>
Co-authored-by: Jason Dai <[email protected]>
Co-authored-by: Dharmit Dalvi <[email protected]>
Co-authored-by: Diego Lovison <[email protected]>
Co-authored-by: Greg Sheremeta <[email protected]>
Co-authored-by: Fiona Waters <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Michael Hu <[email protected]>
Co-authored-by: Humair Khan <[email protected]>
Co-authored-by: Chen Sun <[email protected]>
Co-authored-by: aman23bedi <[email protected]>
Co-authored-by: Amanpreet Singh Bedi <[email protected]>
Co-authored-by: ege uçak <[email protected]>
Co-authored-by: Jan Staněk <[email protected]>
Co-authored-by: Jan Staněk <[email protected]>