
Duplicate ports defined in seldon-container-engine container #1799

Closed
xaniasd opened this issue May 6, 2020 · 18 comments

@xaniasd
Contributor

xaniasd commented May 6, 2020

port 8000 is defined twice in the seldon-container-engine spec, as in the following example:

      name: seldon-container-engine
      ports:
      - containerPort: 8000
        protocol: TCP
      - containerPort: 8000
        name: metrics
        protocol: TCP
      - containerPort: 5001
        protocol: TCP

Using seldon-core v1.1.0.

xaniasd added the bug and triage (Needs to be triaged and prioritised accordingly) labels on May 6, 2020
@xaniasd
Contributor Author

xaniasd commented May 6, 2020

I believe this is the culprit.

@RafalSkolasinski
Contributor

This was because this port serves more than just metrics in the executor / engine.
Do you know of any side effects or downsides to having it duplicated?

@xaniasd
Contributor Author

xaniasd commented May 6, 2020

That was my guess as well. As for side effects, I haven't noticed any. Istio picks up two application ports instead of one, but with no real effect. Still, it isn't clean and it is confusing.

@RafalSkolasinski
Contributor

RafalSkolasinski commented May 6, 2020

My understanding was that this section is mostly informative and used by other systems configured to read it.
I did check at the time that a port entry may appear twice with two different names.

This came up when I was looking at the model nodes, before I decided to serve their metrics on a separate port. For models we have

    ports:
    - containerPort: 6000
      name: metrics
      protocol: TCP
    - containerPort: 9000
      name: http
      protocol: TCP

but initially I was thinking of serving metrics on the same port as http, and I checked that the following would work

    ports:
    - containerPort: 6000
      name: metrics
      protocol: TCP
    - containerPort: 6000
      name: http
      protocol: TCP

Maybe for the executor / engine, instead of removing the duplicate, we should just add a name to it, going from

    ports:
    - containerPort: 8000
      protocol: TCP
    - containerPort: 8000
      name: metrics
      protocol: TCP

to

    ports:
    - containerPort: 8000
      name: http
      protocol: TCP
    - containerPort: 8000
      name: metrics
      protocol: TCP

in order to make it clear at a glance what purpose these ports serve.

I need to check the Kubernetes documentation for potential downsides and/or side effects.

@cliveseldon any thoughts on this?

@ukclivecox
Contributor

The name of the http port is expected, I think, so we can't get rid of that. So yes, we have some ports that serve multiple purposes: http and metrics.

I'm also not sure of any k8s requirements on this, so it would need some research.

@xaniasd
Contributor Author

xaniasd commented May 6, 2020

I think #1798 and this issue are somewhat related, in a contradictory manner.

If we use separate ports for metrics (as is the case now), then the second port declaration can be leveraged (with a more suitable name) for efficient Prometheus scraping. I don't see another reason to declare the port twice. But if we want to keep it like this, then naming every port is definitely an improvement.

The greater issue (which motivated all this on my part) is that there are multiple ports declared in the pod (for metrics and regular traffic). Seldon adds Prometheus annotations for scraping (prometheus.io/scrape: true) but does not specify a scraping port. This is probably because only a single port can be specified with the prometheus.io/port annotation. Prometheus service discovery then creates a target for every port in the pod, which leads to unwanted network traffic and logs.
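
For reference, this is roughly the shape of the annotation block in question (only prometheus.io/scrape is mentioned above as being set by Seldon):

    metadata:
      annotations:
        prometheus.io/scrape: "true"
        # no prometheus.io/port annotation, so discovery ends up targeting every declared port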

At this point, I can think of 2 options:

  • Create specific Prometheus jobs for scraping Seldon deployments that target only the metrics ports. Better names for metric ports of SeldonDeployment containers #1798 would help in this case, as Seldon-specific regexes can be defined.
  • Expose all application metrics through a single endpoint. The service orchestrator would collect metrics from all nodes and expose them. In that case prometheus.io/port can be used.

cc: @axsaucedo

@RafalSkolasinski
Contributor

This is probably because only a single port can be specified with the prometheus.io/port annotation.

That's exactly the reason for naming the port. There is this issue prometheus/prometheus#3756 in Prometheus that will hopefully lift the limitation.

Pod-wise we will for sure have multiple ports that are named metrics: one in executor/engine and one in each model node.

@xaniasd
Contributor Author

xaniasd commented May 6, 2020

The name of the http port is expected, I think, so we can't get rid of that. So yes, we have some ports that serve multiple purposes: http and metrics.

@cliveseldon actually the name (http) is missing (see my original post)

@RafalSkolasinski
Contributor

It seems that the http name is currently present in the model containers and missing from the orchestrator one (executor/engine).

@ukclivecox
Contributor

We should probably think about naming it. At present, though, it could be http or grpc depending on the protocol, and that is actually why the port number is duplicated only for http: for grpc the executor has one port for the gRPC API traffic and another running an HTTP server for metrics and liveness probes. So there are a few things to take into account.

This will be made simpler when we multiplex grpc and http traffic on the same port. #1762

@xaniasd
Contributor Author

xaniasd commented May 6, 2020

This is probably because only a single port can be specified with the prometheus.io/port annotation.

That's exactly the reason for naming the port. There is this issue prometheus/prometheus#3756 in Prometheus that will hopefully lift the limitation.

Pod-wise we will for sure have multiple ports that are named metrics: one in executor/engine and one in each model node.

In the second option I outlined, a call to pod:8000 could subsequently call localhost:6000 (localhost:6001, etc.) and send all sets of metrics to Prometheus. Here's an example implementation: exporter-merger
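
As a rough sketch of that idea (the key names below are purely illustrative and not exporter-merger's actual schema), the aggregating endpoint would simply be configured with the list of upstream metrics endpoints to merge:

    # hypothetical aggregator configuration; keys are illustrative only
    listen: :8000
    upstreams:
    - http://localhost:6000/metrics
    - http://localhost:6001/metrics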

ukclivecox removed the triage (Needs to be triaged and prioritised accordingly) label on May 7, 2020
@xaniasd
Contributor Author

xaniasd commented May 14, 2020

@cliveseldon @axsaucedo now that one option is possible, I'm willing to give the alternative a try. I believe exposing a single metrics endpoint is the better option, so I can create a PR for evaluation if you think it's worthwhile.

@ukclivecox
Contributor

Can you explain further what you mean by a single metrics endpoint? At present both the executor and each individual model in the inference graph expose metrics on a port named metrics. If you could outline your plan, we could review it.

@xaniasd
Contributor Author

xaniasd commented May 18, 2020

There are indeed multiple ports exposed: for regular traffic (9000 and 5001), for both metrics and traffic (8000), and exclusively for metrics (6000, 6001, etc.). Using pod annotations, we can only configure a single port per pod for metrics scraping, so with the current setup we cannot use this mechanism.

The only viable solution is therefore to use the discovery feature of Prometheus (i.e. probe every defined port). This ends up as noise on the other ports, as Prometheus tries to scrape all service ports. We can restrict this using a regex on port names (e.g. metrics); the downside is that it requires Seldon-specific configuration on the Prometheus side.

Another way would be to expose a single central endpoint for metrics collection (port 8000, /metrics). We can then define the pod annotation for that port and Prometheus will leave the other ones alone. We could do this either for each individual pod or for the entire SeldonDeployment. For this we need to build the necessary logic into the executor; it would be configured to collect metrics from each model (or other nodes in the inference graph) and export them to Prometheus. This way we eliminate unwanted noise on the other ports and make Seldon-specific configuration on the Prometheus server unnecessary.
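
A sketch of the pod annotations this option would enable, using the port and path mentioned above:

    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
        prometheus.io/path: "/metrics"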

In short, increased complexity in the Seldon controller in exchange for a simpler experience for the user; that's my proposal :)

I give some more details here and here.

@ukclivecox
Contributor

I'd prefer not to increase coupling between the executor and the graph components if at all possible. It sounds like you are proposing that the executor work like Prometheus, scraping the graph components and then exposing the metrics in one place.

The only extra config for prometheus is the following:

    - source_labels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: metrics(-.*)?
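
For context, a sketch of how that rule could sit inside a kubernetes_sd scrape job (the job name and other settings here are assumptions, not Seldon's shipped configuration):

    - job_name: seldon-metrics          # hypothetical job name
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_container_port_name]
        action: keep
        regex: metrics(-.*)?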

@xaniasd
Contributor Author

xaniasd commented May 18, 2020

The only extra config for prometheus is the following:

    - source_labels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: metrics(-.*)?

Indeed, this makes assumptions specific to the Seldon setup and naming scheme. If we want to discover other pods that don't comply with this, we need another job. Additionally, using the annotation mechanism we could simply use the default Prometheus instance deployed by Istio (which the one in seldon-core-analytics is intended to replace, if I'm not mistaken).

I'd prefer not to increase coupling between the executor and the graph components if at all possible. It sounds like you are proposing the executor works like prometheus and scrapes the graph components and then exposing it in 1 place.

Yes, this is the shorter version of my proposal :) If it doesn't fit the design philosophy we can drop it.

@axsaucedo
Contributor

Closing via #1806

@mts-dyt

mts-dyt commented Jun 6, 2023

Hello,
With Kubernetes 1.26 it seems that this duplication has side effects.
I cannot correctly delete the deployed models:

failed to create manager for existing fields: failed to convert new object (seldon/aaa-may-test-model-0-seldon-wrapper-7bd58b99fd-n9c9m; /v1, Kind=Pod) to smd typed: .spec.containers[name="seldon-container-engine"].ports: duplicate entries for key [containerPort=8000,protocol="TCP"]

more info: kubernetes/kubernetes#118261
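
For reference, a ports block without the duplicate [containerPort, protocol] key, along the lines discussed earlier in the thread, would look something like this (only a sketch of the deduplicated shape, not necessarily what #1806 ended up doing):

    ports:
    - containerPort: 8000
      name: metrics          # single entry; the same port serves http traffic and metrics, as discussed above
      protocol: TCP
    - containerPort: 5001
      protocol: TCP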
