Error running multimodel endpoints in sagemaker #1911

Najib-Haq · 2024-05-15T13:21:15Z

Description

Multi-model endpoint deployment on SageMaker through DJL Serving is supposed to be supported, according to the related AWS page and its associated tutorial. I ran the demo code unmodified. The endpoint is created successfully, but invoking it fails with the error shown under "Error Message".

Expected Behavior

Invoking the multi-model endpoint with a target model name should succeed and return a prediction from that model.

Error Message

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
Cell In[14], line 1
----> 1 print(predictor.predict( {"prompt": "Large model inference is"}, target_model="opt-350m.tar.gz"))
      2 print(predictor.predict({"prompt": "Large model inference is"}, target_model="bloomz-560m.tar.gz"))
      3 print(predictor.predict({"prompt": "Large model inference is"}, target_model="gpt-neo-125m.tar.gz"))

File /opt/conda/lib/python3.10/site-packages/sagemaker/base_predictor.py:212, in Predictor.predict(self, data, initial_args, target_model, target_variant, inference_id, custom_attributes, component_name)
    209 if inference_component_name:
    210     request_args["InferenceComponentName"] = inference_component_name
--> 212 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    213 return self._handle_response(response)

File /opt/conda/lib/python3.10/site-packages/botocore/client.py:565, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
    561     raise TypeError(
    562         f"{py_operation_name}() only accepts keyword arguments."
    563     )
    564 # The "self" in this scope is referring to the BaseClient.
--> 565 return self._make_api_call(operation_name, kwargs)

File /opt/conda/lib/python3.10/site-packages/botocore/client.py:1021, in BaseClient._make_api_call(self, operation_name, api_params)
   1017     error_code = error_info.get("QueryErrorCode") or error_info.get(
   1018         "Code"
   1019     )
   1020     error_class = self.exceptions.from_code(error_code)
-> 1021     raise error_class(parsed_response, operation_name)
   1022 else:
   1023     return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (404) from primary with message "{
  "code": 404,
  "type": "ModelNotFoundException",
  "message": "Failed to detect engine of the model: /opt/ml/models/5ce62ebae83b18f9141573e424631f2e/model/temp"
}
". See https://eu-central-1.console.aws.amazon.com/cloudwatch/home?region=eu-central-1#logEventViewer:group=/aws/sagemaker/Endpoints/lmi-model-2024-05-15-10-10-58-392 in account 662744937784 for more information.

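For triage, the JSON payload that the container returned can be pulled out of the ModelError text. The following is a minimal sketch, assuming the message keeps the `with message "{...}"` shape shown in the traceback above; `extract_container_error` is a hypothetical helper name, not part of the SageMaker SDK or botocore.

```python
import json
from typing import Optional


def extract_container_error(message: str) -> Optional[dict]:
    """Return the JSON object embedded in a ModelError message, or None.

    Assumes the payload follows the literal text 'with message "' as in the
    traceback quoted in this issue; other message shapes yield None.
    """
    marker = 'with message "'
    start = message.find(marker)
    if start == -1:
        return None
    # Locate the opening brace of the embedded JSON object.
    brace = message.find("{", start + len(marker))
    if brace == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores the trailing text.
        payload, _end = json.JSONDecoder().raw_decode(message, brace)
    except json.JSONDecodeError:
        return None
    return payload if isinstance(payload, dict) else None
```

Passing `str(exc)` from the caught exception to this helper would give a dictionary with the `code`, `type`, and `message` fields, which is easier to log or compare than the full text.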
How to Reproduce?

Run the demo notebook linked under "Steps to reproduce".

Steps to reproduce

Run the code at https://github.com/deepjavalibrary/djl-demo/blob/master/aws/sagemaker/Multi-Model-Inference-Demo.ipynb.

What have you tried to solve it?

I tried using models stored in S3 buckets instead, which produced the same error. Each of these models deploys successfully to its own endpoint through DJL Serving, but not in the multi-model scenario.
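One further check that might help narrow this down is to inspect the layout of each model archive. The sketch below assumes (this is not confirmed as the cause of the error) that the `serving.properties` file used by DJL Serving is expected at the top level of each `.tar.gz`. The function names are hypothetical.

```python
import tarfile
from typing import List


def top_level_entries(archive_path: str) -> List[str]:
    """List the distinct top-level names inside a model .tar.gz archive."""
    names = set()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Treat "./foo/bar" and "foo/bar" alike: keep the first component.
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if parts:
                names.add(parts[0])
    return sorted(names)


def has_root_serving_properties(archive_path: str) -> bool:
    """True if serving.properties sits at the archive root (assumed layout)."""
    return "serving.properties" in top_level_entries(archive_path)
```

Running `top_level_entries` on a local copy of an archive such as `opt-350m.tar.gz` shows whether the files are at the root or wrapped in an extra directory.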

sindhuvahinis · 2024-09-03T23:43:57Z

@Najib-Haq Which version of the DLC image did you use?
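If the image URI is no longer at hand, it can usually be read back from the endpoint itself. The following is a minimal sketch, assuming a boto3 SageMaker client and that the `DescribeEndpoint` response lists `DeployedImages` for each production variant; `deployed_images` is a hypothetical helper name.

```python
from typing import Any, List


def deployed_images(sm_client: Any, endpoint_name: str) -> List[str]:
    """Return the image URIs specified for each production variant.

    sm_client is expected to behave like boto3.client("sagemaker").
    Variants without a DeployedImages entry are skipped.
    """
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    images = []
    for variant in endpoint.get("ProductionVariants", []):
        for image in variant.get("DeployedImages", []):
            uri = image.get("SpecifiedImage")
            if uri:
                images.append(uri)
    return images


# Usage (requires boto3 and AWS credentials; the endpoint name is the one
# that appears in the CloudWatch log group of the error message):
#   import boto3
#   client = boto3.client("sagemaker", region_name="eu-central-1")
#   print(deployed_images(client, "lmi-model-2024-05-15-10-10-58-392"))
```

The tag at the end of each returned URI identifies the DLC image version.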

Najib-Haq added the "bug" label on May 15, 2024.
