Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python migration #435

Closed
wants to merge 69 commits into from
Closed

Python migration #435

wants to merge 69 commits into from

Conversation

jdplatt
Copy link
Contributor

@jdplatt jdplatt commented Mar 16, 2019

This change is Reviewable

@jdplatt
Copy link
Contributor Author

jdplatt commented Apr 15, 2019

/retest

@johnugeorge
Copy link
Member

Can you update the description with the details of restructuring done in this PR ?

@@ -0,0 +1,19 @@
apiVersion: apiextensions.k8s.io/v1beta1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

studyjobcontroller is not needed in v1alpha2. Why are these files added?

@jdplatt
Copy link
Contributor Author

jdplatt commented Apr 15, 2019

Sure! Along with my changes to the python code to generate suggestions I added v1alpha2 folders for several components. For the following folders I copied my version into a new folder called v1alpha2 and the copied the v1alpha1 folder over from master:

  1. manifests
  2. scripts
  3. pkg/suggestion
  4. test/e2e
  5. test/scripts

For the cmd/suggestion and test/workflow folders I left the current folders in the same place to avoid creating merge conflicts for other people and added a v1alpha2 folder containing my version. I figured we can clean up the cmd/suggestion folder when we merge the suggestion service into a single image, and the test/workflow folder when we do the integrations tests for v1alpha2.

Because my v1alpha2 is just the current state of the manifests, there will be need to be more changes as we add the v1alpha2 versions of the different components. For example, if the studyjobcontroller is not needed in v1alpha2 you can delete it in the PR that implements the replacement. This may not be the ideal approach, but this PR was already getting huge and I wanted to get it merged and not have to roll changes for even more components into it.

@johnugeorge
Copy link
Member

Do you really need to copy all the files to v1alpha2 folder in this PR? Is there any reason? I feel that it will be better to add v1alpha2 only for relevant components related to the changes in this PR . eg: manifests for suggestion

For other components, we can add whenever corresponding v1alpha2 code is added.

@jdplatt
Copy link
Contributor Author

jdplatt commented Apr 15, 2019

The code for v1alpha1 and v1alpha2 is very similar so it seemed natural to take a copy of v1alpha1 as the starting point of v1alpha2 and make the required alterations, rather than build up v1alpha2 from scratch. I guess I could have created the v1alpha2 folders and cherry-picked only the files I changed to move over, but it seemed easier to move everything at once to make sure nothing was missed.

@richardsliu
Copy link
Contributor

@jdplatt We've changed the API implementation quite a bit from v1alpha1 to v1alpha2, so taking a copy of all the files would make it harder for others to determine which files were copied from v1alpha1 and which ones are under development. It also unfortunately makes this PR harder to review. Can we include only the relevant changed files? Or alternatively mark the copied files with a note/comment so we know which files should be removed later?

@jlewi
Copy link
Contributor

jlewi commented Apr 22, 2019

@jdplatt @richardsliu How's this PR coming? Do we have a goal in mind for getting it merged?

@richardsliu
Copy link
Contributor

@jdplatt We should try to get this PR merged soon, otherwise we will be dealing with more conflicts later on. Let me know if you need help with this.

@richardsliu
Copy link
Contributor

/retest

@k8s-ci-robot
Copy link

@jdplatt: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
kubeflow-katib-presubmit bbde559 link /test kubeflow-katib-presubmit

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

- start the service

```
python suggestion/bayesian/main.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the correct path?

@@ -0,0 +1,88 @@
package suggestion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed?

@@ -0,0 +1,28 @@
apiVersion: extensions/v1beta1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name of the directory should be bayestian-optimization?

from .bayesian_optimization_algorithm import BOAlgorithm


ALGORITHM_REGISTER = {"random_search": RandomSearch,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would someone distinguish between an algorithm that is still in development and one that is stable? What if users want to enable or disable a specific algorithm?

Also if a user wants to add a new algorithm, how would they achieve this without changing the shared Docker image?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make it configurable instead of hard-code

@@ -66,6 +66,11 @@ echo "REPO_NAME ${REPO_NAME}"
echo "VERSION ${VERSION}"

sed -i -e "s@image: katib\/katib-controller@image: ${REGISTRY}\/${REPO_NAME}\/katib-controller:${VERSION}@" manifests/v1alpha2/katib-controller/katib-controller.yaml
sed -i -e "s@image: katib\/suggestion@image: ${REGISTRY}\/${REPO_NAME}\/suggestion:${VERSION}@" manifests/vizier/suggestion/random/deployment.yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem to be missing the v1alpha2 path.

@@ -0,0 +1,28 @@
apiVersion: extensions/v1beta1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "vizier" necessary here? I think we can just use "suggestion" here. Also going forward we should avoid using the name vizier since it can confuse users.

@@ -0,0 +1,10 @@
FROM golang:alpine AS build-env
# The GOPATH in the image is /go.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we had cmd/suggestion/hyperband/v1alpha1 already. can you move it in cmd/suggestion/hyperband/v1alpha2? do dose others

# The GOPATH in the image is /go.
ADD . /go/src/github.com/kubeflow/katib
WORKDIR /go/src/github.com/kubeflow/katib/cmd/suggestion/hyperband
RUN go build -o hyperband
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can go build -o hyperband work well?

@@ -0,0 +1,17 @@
package suggestion
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'd better rename this file's name

import grpc

from pkg.api.python import api_pb2_grpc
from pkg.suggestion.katib_suggestion.rpc_service import SuggestionService
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong python path

@@ -0,0 +1,27 @@
package main
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you plan to migrate hyperband to python one in another PR? if yes, please move it in this PR

@@ -0,0 +1,26 @@
apiVersion: "kubeflow.org/v1alpha1"
kind: StudyJob
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need sync up most part in test/ for v1alpha2

from .bayesian_optimization_algorithm import BOAlgorithm


ALGORITHM_REGISTER = {"random_search": RandomSearch,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make it configurable instead of hard-code

self.manager_addr = MANAGER_ADDRESS
self.manager_port = MANAGER_PORT
self.logger = logger if (logger is not None) else get_logger()
self.search_algorithm = search_algorithm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if search_algorithm is not in ALGORITHM_REGISTER, we should exit with right error message in init instead of exception occurring in line 30 when GetSuggestions.
That is, the corresponding deployment should not be Running if search_algorithm is not supported

def __init__(self, parameter_config, suggestion_config, logger=None):
self.parameter_config = parameter_config
self.suggestion_config = suggestion_config
self.logger = logger
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.logger unused

def GetSuggestions(self, request, context):
suggestion_config = self._parse_suggestion_parameters(request.param_id)
suggestion_config = {param.name: param.value for param in suggestion_config}
study_conf = self._get_study_config(request.study_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

study_id field is dropped in v1alpha2

@hougangliu
Copy link
Member

@jdplatt GetSuggestionsRequest of v1alpha2 changes a lot, I think you need sync up suggestion services of v1alpha2 with it.
Thanks!

@richardsliu
Copy link
Contributor

Discussed offline with @jdplatt and @johnugeorge - we (myself and Johnu) will patch this PR and make the necessary fixes based on recent changes to the v1alpha2 API. Thanks John for your contributions.

@@ -0,0 +1,8 @@
FROM python:3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In nasrl/Dockerfile it is pinned it to python:3.6 instead of having it floating at major version 3. Opinions differ, but I definitely think it is a good idea to have transitions between minor versions of python (like 3.6 -> 3.7) be a concious choice, it is not rare that stuff breaks (even though they should not), and then you need to pin the minor version.


self.registered_studies = dict()

if logger == None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why pass the logger object around? It is a bit confusing since you can then have a logger named package1.module1.class1 write logs for package1.module1.class2. If you instead just always do self.logger = logging.getLogger(__name__) then the loggers form a hierarchy reflecting the python module hierarchy, and you can configure the top logger and all the others will inherit that configuration

@richardsliu
Copy link
Contributor

/close

@k8s-ci-robot
Copy link

@richardsliu: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants