Reject asset upload with invalid id
#273
Signed-off-by: JAulet <[email protected]>
model_service_controller_impl.py. Signed-off-by: JAulet <[email protected]>
Thanks @JAulet -- the original problem was not that the `id` we generate was invalid, but that the `id` field inside the YAML could still be invalid. And since that YAML itself was used by another component to get the `id`, the DataShim logic failed. So this PR should add some code to verify that the `id`s in the YAML adhere to the k8s identifier format.
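As a rough illustration of the check requested here, a validator could look like the sketch below. It assumes that "k8s identifier format" means an RFC 1123 DNS label (lowercase alphanumerics and `-`, starting and ending with an alphanumeric character, at most 63 characters), and that `validate_id` returns an `(errors, status)` pair, as the calls in this PR suggest. The regular expression, the error message, the handling of a missing id, and the status codes 200 and 422 are all assumptions, not the code merged in this PR.

```python
import re
from typing import Optional, Tuple

# Assumption: ids must be valid RFC 1123 DNS labels, the format Kubernetes
# requires for many resource names.
_K8S_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
_MAX_LABEL_LENGTH = 63


def validate_id(id: Optional[str]) -> Tuple[Optional[str], int]:
    """Return (errors, status); errors is None when the id is acceptable."""
    if id is None:
        # Hypothetical choice: a missing id is not treated as an error,
        # on the assumption that the server can generate one instead.
        return None, 200
    if (
        not isinstance(id, str)
        or len(id) > _MAX_LABEL_LENGTH
        or _K8S_LABEL.fullmatch(id) is None
    ):
        message = (
            f"Invalid id '{id}': expected lowercase alphanumeric characters "
            f"or '-', starting and ending with an alphanumeric character, "
            f"at most {_MAX_LABEL_LENGTH} characters."
        )
        return message, 422
    return None, 200
```

A caller would then return early when `errors` is set, for example `errors, status = validate_id(yaml_dict.get("id"))` followed by `if errors: return errors, status`.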
@@ -396,6 +396,11 @@ def _upload_notebook_yaml(yaml_file_content: AnyStr, name=None, access_token=Non
    requirements = yaml_dict["implementation"]["github"].get("requirements")
    filter_categories = yaml_dict.get("filter_categories") or dict()

    errors, status = validate_id(notebook_id)
Let's move this check above line 391, where we generate the `notebook_id`, and validate the `yaml_dict.get("id")`:
errors, status = validate_id(yaml_dict.get("id"))
@@ -397,6 +397,11 @@ def _upload_model_yaml(yaml_file_content: AnyStr, name=None, existing_id=None):
    if type(api_model.servable_tested_platforms) == str:
        api_model.servable_tested_platforms = api_model.servable_tested_platforms.replace(", ", ",").split(",")

    errors, status = validate_id(api_model.id)
Let's move this check to line 375, after we load the YAML and before we create the `ApiModel` object, and validate the `model_def.get("model_identifier")`:
errors, status = validate_id(model_def.get("model_identifier"))
@@ -371,6 +371,11 @@ def _upload_dataset_yaml(yaml_file_content: AnyStr, name=None, existing_id=None)
    # if yaml_dict.get("id") != dataset_id:
    #     raise ValueError(f"Dataset.id contains non k8s character: {yaml_dict.get('id')}")

    errors, status = validate_id(dataset_id)
Let's move this check above line 366, after we load the YAML and before we generate the `dataset_id`, and validate the `yaml_dict.get("id")`:
errors, status = validate_id(yaml_dict.get("id"))
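The three review comments above ask for the same reordering: validate the raw value from the YAML before an id is derived from it. The sketch below shows why the order matters. `generate_id` is a hypothetical stand-in that normalizes underscores to dashes (the behavior described in issue #209), `is_k8s_label` is an assumed RFC 1123 label check, and `yaml_dict` is a plain dictionary standing in for the parsed YAML. The value `codenet_langclass` is inferred from the issue excerpt, and none of these helper names are taken from the MLX code base.

```python
import re
from typing import Tuple

# Assumption: a valid id is an RFC 1123 DNS label of at most 63 characters.
K8S_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_k8s_label(value: str) -> bool:
    return len(value) <= 63 and K8S_LABEL.fullmatch(value) is not None


def generate_id(name: str) -> str:
    # Hypothetical normalizer: lowercases and replaces runs of other
    # characters (including '_') with a single dash.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def upload_checking_generated_id(yaml_dict: dict) -> Tuple[str, int]:
    """Order criticized in review: the id is normalized before it is checked."""
    dataset_id = generate_id(yaml_dict["id"])
    if not is_k8s_label(dataset_id):
        return f"Invalid id: {dataset_id}", 422
    return dataset_id, 200


def upload_checking_yaml_id(yaml_dict: dict) -> Tuple[str, int]:
    """Order requested in review: the raw YAML value is checked first."""
    raw_id = yaml_dict["id"]
    if not is_k8s_label(raw_id):
        return f"Invalid id in YAML: {raw_id}", 422
    return generate_id(raw_id), 200


yaml_dict = {"id": "codenet_langclass"}
print(upload_checking_generated_id(yaml_dict))  # ('codenet-langclass', 200)
print(upload_checking_yaml_id(yaml_dict))  # ('Invalid id in YAML: codenet_langclass', 422)
```

In the first variant the upload succeeds even though the YAML still carries the underscore, so any component that reads the `id` from the YAML sees a different value than the one MLX stored. The second variant rejects the upload instead.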
call to check the id of the yaml file itself rather than generated id. Signed-off-by: JAulet <[email protected]>
@ckadner Please review.
/lgtm
Thanks @JAulet
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ckadner, JAulet

The full list of commands accepted by this bot can be found here. The pull request process is described here.
* Added validate-id to __init__.py
* Added check to notebook_service_controller_impl.py
* Added validate-id check to dataset_service_controller_impl.py and model_service_controller_impl.py.
* Moved validate_id check to before id is generated and changed function call to check the id of the yaml file itself rather than generated id.

Resolves machine-learning-exchange#209

Signed-off-by: JAulet <[email protected]>
Signed-off-by: Krishna Kumar <[email protected]>
Excerpt from issue #209:

The actual `id` of the dataset asset in MLX was `codenet-langclass`. Notice the dash `-` which correctly replaced the underscore `_` character. However, the component which generated the DataShim metadata used the value of the `id` field from the YAML file, with the `id` containing the underscore character.

Proposed fix:

Any `_upload_XXX` method in `api/server/swagger_server/controllers_impl/XXX_service_controller_impl.py` which takes an `id` (or similar field like `model_identifier`) from the uploaded YAML file to generate the asset `id` should call a new `validate_id()` method similar to the `swagger_server.controllers_impl.validate_parameters` function in `api/server/swagger_server/controllers_impl/__init__.py`:

`api/server/swagger_server/controllers_impl/`:
* `__init__.py`, add `validate_id(id: str)` method
* `dataset_service_controller_impl.py`, line 372: `def _upload_dataset_yaml(...)` ... use `validate_id(id: str)`
* `model_service_controller_impl.py`, line 381: `def _upload_model_yaml(...)` ... use `validate_id(id: str)`
* `notebook_service_controller_impl.py`, line 392: `def _upload_notebook_yaml(...)` ... use `validate_id(id: str)`
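The list above touches three upload functions that read the id from differently named YAML fields. One way to picture the shared step is a small lookup from asset type to field name, sketched below. The field names `id` and `model_identifier` come from the review comments in this PR; the `ID_FIELDS` table, the `yaml_id_for` helper, and the sample value `my-model` are hypothetical and not part of the MLX code base.

```python
from typing import Optional

# Which YAML field carries the user-supplied id for each asset type.
# Field names follow the review comments; the table itself is hypothetical.
ID_FIELDS = {
    "dataset": "id",
    "model": "model_identifier",
    "notebook": "id",
}


def yaml_id_for(asset_type: str, yaml_dict: dict) -> Optional[str]:
    """Return the raw id an uploader would pass to validate_id, or None if absent."""
    return yaml_dict.get(ID_FIELDS[asset_type])


print(yaml_id_for("model", {"model_identifier": "my-model"}))  # my-model
print(yaml_id_for("dataset", {"name": "no id given"}))  # None
```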
Resolves #209