feat(pdb): Add support for managing PDBs #2515

Merged: 1 commit, Oct 15, 2020

Conversation

Contributor

@groszewn groszewn commented Oct 2, 2020

Add support for managing PDBs through the SeldonSpec.

Contributes to #2508

Signed-off-by: Nick Groszewski [email protected]

What this PR does / why we need it:

Adds support for specifying and managing PodDisruptionBudgets (PDBs) as part of a SeldonDeployment so that disruptions can be properly budgeted.
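As a rough illustration of what a per-predictor budget could look like, here is a minimal Go sketch. The `pdbSpec` JSON key comes from the diff in this PR; the `minAvailable` and `maxUnavailable` field names are borrowed from the Kubernetes PodDisruptionBudget spec and are an assumption about the new type, as are the stand-in types `pdbSpec` and `podSpec` and the helper `validatePdb`. Plain integers are used here for brevity, whereas Kubernetes accepts either an integer or a percentage string.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// pdbSpec is a hypothetical stand-in for the SeldonPdbSpec type added by this PR.
// The field names mirror the Kubernetes PodDisruptionBudget spec (assumption).
type pdbSpec struct {
	MinAvailable   *int `json:"minAvailable,omitempty"`
	MaxUnavailable *int `json:"maxUnavailable,omitempty"`
}

// podSpec is a trimmed, hypothetical stand-in for SeldonPodSpec.
type podSpec struct {
	Replicas *int     `json:"replicas,omitempty"`
	PdbSpec  *pdbSpec `json:"pdbSpec,omitempty"`
}

// validatePdb rejects a budget that sets both limits, because a Kubernetes
// PodDisruptionBudget treats minAvailable and maxUnavailable as mutually exclusive.
func validatePdb(p *pdbSpec) error {
	if p == nil {
		return nil // no budget requested
	}
	if p.MinAvailable != nil && p.MaxUnavailable != nil {
		return errors.New("minAvailable and maxUnavailable cannot both be set")
	}
	return nil
}

// intPtr returns a pointer to i, which keeps the literals below short.
func intPtr(i int) *int { return &i }

func main() {
	// Three replicas, of which at least two should stay up during a disruption.
	spec := podSpec{Replicas: intPtr(3), PdbSpec: &pdbSpec{MinAvailable: intPtr(2)}}
	if err := validatePdb(spec.PdbSpec); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	out, err := json.Marshal(spec)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out)) // {"replicas":3,"pdbSpec":{"minAvailable":2}}
}
```

The printed JSON shows where the budget would sit relative to `replicas`; the real CRD schema in the merged change is the authority on the exact fields.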

Which issue(s) this PR fixes:

Fixes #2508

Special notes for your reviewer:

These changes were tested on a local kind cluster, but I'd appreciate it if we could double check that all necessary generated files actually exist.

I haven't updated the Helm chart roles/CRDs. Is there a make target that copies the roles and CRDs from the operator folder to the chart, or is that done manually?

Does this PR introduce a user-facing change?:

Support disruption budgeting in SeldonDeployment

@seldondev
Collaborator

Hi @groszewn. Thanks for your PR.

I'm waiting for a SeldonIO member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.

@@ -288,7 +289,8 @@ type SeldonPodSpec struct {
Metadata metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
Spec v1.PodSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`
HpaSpec *SeldonHpaSpec `json:"hpaSpec,omitempty" protobuf:"bytes,3,opt,name=hpaSpec"`
Replicas *int32 `json:"replicas,omitempty" protobuf:"bytes,4,opt,name=replicas"`
PdbSpec *SeldonPdbSpec `json:"pdbSpec,omitempty" protobuf:"bytes,4,opt,name=pdbSpec"`
Contributor

I think we need to keep the order of elements to keep backward compatibility for gRPC protobufs.

Contributor Author

Fixed

@groszewn force-pushed the pdb branch 2 times, most recently from df82a53 to 4ff9a07, October 4, 2020 15:59
@groszewn
Contributor Author

groszewn commented Oct 5, 2020

@cliveseldon I reordered the elements to keep backward compatibility and generated the Helm chart updates to the CRD.
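For readers following the compatibility point in this thread: protobuf identifies fields on the wire by their field number, so existing numbers have to stay stable and a new field needs a number of its own. The diff excerpt above shows two fields tagged with number 4. The sketch below is a hypothetical check, not code from this PR: the types `podSpecBefore` and `podSpecAfter`, the helper `duplicateFieldNumbers`, and the choice of 5 as the next free number are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// podSpecBefore mimics the tags in the diff excerpt: two fields share number 4.
type podSpecBefore struct {
	Replicas *int32  `protobuf:"bytes,4,opt,name=replicas"`
	PdbSpec  *string `protobuf:"bytes,4,opt,name=pdbSpec"`
}

// podSpecAfter keeps the existing number and gives the new field its own.
// The number 5 is assumed here; the merged code may use a different one.
type podSpecAfter struct {
	Replicas *int32  `protobuf:"bytes,4,opt,name=replicas"`
	PdbSpec  *string `protobuf:"bytes,5,opt,name=pdbSpec"`
}

// duplicateFieldNumbers returns every protobuf field number that appears on
// more than one field of the struct value v. It expects a struct, not a pointer.
func duplicateFieldNumbers(v interface{}) []string {
	seen := map[string]bool{}
	var dups []string
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		tag, ok := t.Field(i).Tag.Lookup("protobuf")
		if !ok {
			continue // field carries no protobuf tag
		}
		parts := strings.Split(tag, ",")
		if len(parts) < 2 {
			continue // tag has no field number component
		}
		num := parts[1] // layout: wire type, field number, options...
		if seen[num] {
			dups = append(dups, num)
		}
		seen[num] = true
	}
	return dups
}

func main() {
	fmt.Println(duplicateFieldNumbers(podSpecBefore{})) // [4]
	fmt.Println(duplicateFieldNumbers(podSpecAfter{}))  // []
}
```

A check like this could run as a unit test over the API types, so that a colliding or reused number is caught before review.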

@ukclivecox
Contributor

/ok-to-test

@seldondev
Collaborator

Mon Oct 5 12:50:42 UTC 2020
The logs for [pr-build] [1] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/1.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=1

@seldondev
Collaborator

Mon Oct 5 12:50:58 UTC 2020
The logs for [lint] [2] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/2.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=2

@groszewn
Contributor Author

groszewn commented Oct 6, 2020

@cliveseldon do the tests need to be rekicked?

@ukclivecox
Contributor

/test integration

@seldondev
Collaborator

Tue Oct 6 13:00:50 UTC 2020
The logs for [integration] [3] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/3.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=3

@ukclivecox
Contributor

@groszewn is there any way to add an end-to-end test, for example via a notebook example that is then called from https://github.com/SeldonIO/seldon-core/blob/master/testing/scripts/test_notebooks.py?

@groszewn
Contributor Author

groszewn commented Oct 7, 2020

@cliveseldon e2e test added and documentation updated to link the notebook.

@seldondev
Collaborator

Wed Oct 7 02:33:37 UTC 2020
The logs for [pr-build] [4] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/4.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=4

@seldondev
Collaborator

Wed Oct 7 02:33:42 UTC 2020
The logs for [lint] [5] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/5.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=5

@seldondev
Collaborator

Wed Oct 7 02:52:28 UTC 2020
The logs for [pr-build] [6] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/6.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=6

@seldondev
Collaborator

Wed Oct 7 02:52:37 UTC 2020
The logs for [lint] [7] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/7.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=7

@seldondev
Collaborator

Wed Oct 7 12:27:53 UTC 2020
The logs for [lint] [9] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/9.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=9

@seldondev
Collaborator

Wed Oct 7 12:28:03 UTC 2020
The logs for [pr-build] [8] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/8.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=8

@axsaucedo
Contributor

/test notebooks

@seldondev
Collaborator

Wed Oct 7 14:23:37 UTC 2020
The logs for [notebooks] [10] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/10.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=10

@ukclivecox
Contributor

@groszewn the 1 failed notebook test is a known issue that will be fixed in #2530

@groszewn
Contributor Author

groszewn commented Oct 8, 2020

@cliveseldon rebased off of master to get those fixes

@groszewn
Contributor Author

/retest

@seldondev
Collaborator

Sat Oct 10 22:17:37 UTC 2020
The logs for [notebooks] [16] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/16.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=16

@groszewn
Contributor Author

/retest

@seldondev
Collaborator

Tue Oct 13 16:28:28 UTC 2020
The logs for [notebooks] [17] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/17.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=17

@axsaucedo
Contributor

axsaucedo commented Oct 14, 2020

@groszewn test_custom_metrics is currently known to have issues (see #2541), and test_explainer was fixed in master. Did you rebase recently?

@groszewn
Contributor Author

@axsaucedo Haven't rebased since the fix was merged to master, will do so now.

@axsaucedo
Contributor

Ok sounds perfect, it seems all should be passing now (except the known issues with #2541)

@axsaucedo
Contributor

/test notebooks

@seldondev
Collaborator

Wed Oct 14 12:49:59 UTC 2020
The logs for [notebooks] [20] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/20.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=20

@seldondev
Collaborator

Wed Oct 14 12:49:57 UTC 2020
The logs for [pr-build] [18] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/18.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=18

@seldondev
Collaborator

Wed Oct 14 12:50:10 UTC 2020
The logs for [lint] [19] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/19.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=19

@axsaucedo
Contributor

It seems the explainer test is failing, along with another that seems to be flaky. Re-testing to confirm, and we'll investigate.

/test notebooks

@RafalSkolasinski
Contributor

Hmm... @axsaucedo #2541 is about the custom_metrics notebook, but the last notebook test run failed on:

  • TestNotebooks.test_explainer_examples
  • TestNotebooks.test_protocol_examples

@seldondev
Collaborator

Wed Oct 14 15:39:27 UTC 2020
The logs for [notebooks] [21] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/21.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=21

@groszewn
Contributor Author

/retest

@seldondev
Collaborator

Wed Oct 14 19:09:09 UTC 2020
The logs for [notebooks] [22] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/22.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=22

@axsaucedo
Contributor

We have confirmed that the tests failing in the notebooks job are flaky, so we're looking to merge this PR. It seems there are some merge conflicts; if you can resolve them, we can rerun and merge @groszewn

@groszewn
Contributor Author

@axsaucedo sounds good to me, rerunning some validations locally but I'll have the PR updated shortly.

Add support for managing PDBs through the SeldonSpec.

Contributes to SeldonIO#2508

Signed-off-by: Nick Groszewski <[email protected]>
@seldondev
Collaborator

Thu Oct 15 12:52:51 UTC 2020
The logs for [pr-build] [23] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/23.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=23

@seldondev
Collaborator

Thu Oct 15 12:53:14 UTC 2020
The logs for [lint] [24] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/24.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=24

@groszewn
Contributor Author

/test integration

@seldondev
Collaborator

Thu Oct 15 13:13:03 UTC 2020
The logs for [integration] [25] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2515/25.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2515 --build=25

@seldondev
Collaborator

@groszewn: The following tests failed, say /retest to rerun them all:

Test name    Commit   Details  Rerun command
notebooks    7814d94  link     /test notebooks
integration  6fe3a96  link     /test integration

@axsaucedo
Contributor

Thanks @groszewn - the current integration tests pass; the only failure is a flaky one
/approve

@seldondev
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: axsaucedo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@axsaucedo merged commit f275586 into SeldonIO:master on Oct 15, 2020
Development

Successfully merging this pull request may close these issues.

Support PDB specifications for SeldonDeployments
5 participants