Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Meta data field not modifiable by subsequent components #829

Closed
shirishph opened this issue Aug 29, 2019 · 8 comments
Closed

Meta data field not modifiable by subsequent components #829

shirishph opened this issue Aug 29, 2019 · 8 comments
Assignees
Milestone

Comments

@shirishph
Copy link

I'm not able to modify the value of a dictionary key which is passed as meta data between Seldon components

I have an inference graph like so:

{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "katecheo"
    },
    "spec": {
        "annotations": {
            "project_name": "Katecheo",
            "deployment_version": "v0.1.0",
            "seldon.io/rest-read-timeout": "100000",
            "seldon.io/rest-connection-timeout": "100000",
            "seldon.io/grpc-read-timeout": "100000"
        },
        "name": "katecheo",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [{
            "componentSpecs": [{
                "spec": {
                    "imagePullSecrets": [{
                        "name": "***"
                    }, {
                        "name": "***"
                    }, {
                        "name": "***"
                    }],
                    "containers": [{
                            "name": "alpha",
                            "image": "***/alpha:delme",
                            "imagePullPolicy": "Always"
                        },
                        {
                            "name": "beta",
                            "image": "***/beta:delme",
                            "imagePullPolicy": "Always"
                        },
                        {
                            "name": "gamma",
                            "image": "***/gamma:delme",
                            "imagePullPolicy": "Always"
                        }
                    ],
                    "terminationGracePeriodSeconds": 20
                }
            }],
            "graph": {
                "children": [{
                    "name": "beta",
                    "endpoint": {
                        "type": "REST"
                    },
                    "type": "MODEL",
                    "children": [{
                        "name": "gamma",
                        "endpoint": {
                            "type": "REST"
                        },
                        "type": "MODEL",
                        "children": []
                    }]
                }],
                "name": "alpha",
                "endpoint": {
                    "type": "REST"
                },
                "type": "MODEL"
            },
            "name": "qna",
            "replicas": 1,
            "annotations": {
                "predictor_version": "v1"
            }
        }]
    }
}

The components are like so:

Alpha Component

class Alpha(object):

    result = {}

    def __init__(self):
        print("init")

    def predict(self, X, features_names, meta):
        print('predict')
        if meta and "tags" in meta:
            self.result = meta["tags"]

        self.result["proceed"] = 'Alpha'
        self.result["alpha"] = "Yes"

        return X

    def tags(self):
        return self.result

Beta Component

class Beta(object):

    result = {}

    def __init__(self):
        print("init")

    def predict(self, X, features_names, meta):
        print('predict')
        if meta and "tags" in meta:
            self.result = meta["tags"]

        self.result["proceed"] = 'Beta'
        self.result["beta"] = "Yes"

        return X

    def tags(self):
        return self.result

Gamma Component

class Gamma(object):

    result = {}

    def __init__(self):
        print("init")

    def predict(self, X, features_names, meta):
        print('predict')
        if meta and "tags" in meta:
            self.result = meta["tags"]

        self.result["proceed"] = 'Gamma'
        self.result["gamma"] = "Yes"

        return X

    def tags(self):
        return self.result

As you can see when the inference graph is run, we would expect the value of proceed to be Gamma, but what we get is Alpha

$ curl -X POST -H 'Content-Type: application/json' -d '{"data": {"names": ["message"],"ndarray": ["<SAMPLE INPUT>"]}}' http://XXX.XXX.XXX.XXX/seldon/default/katecheo/api/v0.1/predictions
{
  "meta": {
    "puid": "2c7dt9mids4iq7j2kdqakcgvs8",
    "tags": {
      "alpha": "Yes",
      "beta": "Yes",
      "gamma": "Yes",
      "proceed": "Alpha"  <--------------- Should this not be "Gamma?"
    },
    "routing": {
      "alpha": -1,
      "beta": -1
    },
    "requestPath": {
      "alpha": "cvdigital/alpha:delme",
      "beta": "cvdigital/beta:delme",
      "gamma": "cvdigital/gamma:delme"
    },
    "metrics": []
  },
  "data": {
    "names": [],
    "ndarray": ["<SAMPLE INPUT>"]
  }
}   

Is this expected? Can we not modify the value of proceed in say Beta or Gamma components?

@ukclivecox ukclivecox self-assigned this Aug 29, 2019
@ukclivecox ukclivecox added this to the 1.0.x milestone Aug 29, 2019
@gmattar
Copy link

gmattar commented Aug 29, 2019

Hi @shirishph, do you think saving the result in a class property self.result is ok?

The model class is instantiated only once when flask starts so In think that "result" property might be unexpectedly changed when you get multiple requests at the same time

I have a similar problem and this might be a cleaner solution than changing predict_raw if it works

@shirishph
Copy link
Author

shirishph commented Aug 30, 2019

Hey @gmattar. Thanks for your very valid points. The only way I could think of to save self.result is in a class property and that's because def tags(self): needs to return proceed flag to caller and the flag is set in predict. How else can predict() share the flag info with tags()?

Actually, I did not know we can have flask inside a Seldon inference graph. My actual code implementation serves ML models and there's no flask. I will not be having that instance-per-http-request issue confusion which flask brings.

Could you please elaborate on what you meant by "this might be a cleaner solution than changing predict_raw if it works"?

I could get away with setting a unique flag in each component, but then the graph would be brittle. Ideally the proceed flag allows flexibility in ordering components. Don't know if I'm making sense to you 😕 😃

@axsaucedo
Copy link
Contributor

@shirishph we had a look at this today, it's was expected behaviour but upon discussion it's clear that it makes sense to have latter steps to overwrite previous ones. This will be fixed once #829 is landed.

In regards to the points raised by @gmattar, you shouldn't experience sporadic changes of the values as the server deals with requests synchronously. You would have to take this into consideration if you use our Gunicorn implementation (which is activated if you add extra workers). Otherwise you just have to make sure you don't send the previous metadata when returning the latest request.

@gmattar
Copy link

gmattar commented Aug 30, 2019

predict_raw receives the original request, before seldon breaking it into parts like feature_names, X, meta. This way, my idea was intercept the requet, include the metadata I needed and then call the regular predict:

import seldon_core

def _seldon_predict(user_model, request):
    (features, meta, datadef, data_type) = seldon_core.utils.extract_request_parts_json(request)
    client_response = seldon_core.user_model.client_predict(user_model, features, datadef.names, meta=meta)
    return seldon_core.utils.construct_response_json(user_model, False, request, client_response)

class Base:
    def predict_raw(self, request):
        is_proto = isinstance(request, seldon_core.proto.prediction_pb2.SeldonMessage)
        json_request = seldon_message_to_json(request) if is_proto else request
        meta_results = json_request.get("meta", {}).get('results', [])
        json_response = {}
        prediction = None

        json_response = _seldon_predict(self, json_request)

        if "jsonData" in json_response:
            prediction = json_response["jsonData"]
        elif "binData" in json_response:
            prediction = json_response["binData"]
        elif "strData" in json_response:
            prediction = json_response["strData"]
        else:
            prediction = json_response["data"]

        json_response["meta"]["results"] = meta_results
        json_response["meta"]["results"].append({self.name: prediction})

        return json_response

class ModelA(Base):
    def __init__(self):
        self.name = "model-a"

    def predict(self, X, features_names):
        return X

this worked for single model deployments but failed to inference graph because meta was overwritten.
I'm gonna try something similar to your solution and think about how to make it work with gunicorn in the future. @axsaucedo do you have any insight about it?

@shirishph
Copy link
Author

Thx @axsaucedo. I will use a workaround for now

@shirishph
Copy link
Author

@gmattar You may reopen the issue if you'd like to keep it open

@axsaucedo
Copy link
Contributor

@shirishph this PR fixes your issue, you can try it by running it through that branch #833. I'll reopen the issue so we can close it once that PR gets merged, as it fully addresses the bug you found.

@axsaucedo axsaucedo reopened this Aug 31, 2019
@ukclivecox
Copy link
Contributor

Closing as PR #833 merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants