Update POC Readme #219

Open
danehans opened this issue Jan 23, 2025 · 2 comments

Comments

danehans commented Jan 23, 2025

The POC readme causes the following error when trying to install the example InferencePool and InferenceModel custom resources (../examples/poc/manifests/inferencepool-with-model.yaml):

Error from server (NotFound): error when creating "examples/poc/manifests/inferencepool-with-model.yaml": the server could not find the requested resource (post inferencepools.inference.networking.x-k8s.io)
Error from server (NotFound): error when creating "examples/poc/manifests/inferencepool-with-model.yaml": the server could not find the requested resource (post inferencemodels.inference.networking.x-k8s.io)

The CRDs must be installed before creating instances of each custom resource. Additionally, the doc should state that commands are to be run from the repo root, and the ../ prefix should be removed from the manifest path.
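
As a rough sketch of the ordering requirement above, the check below reports which CRDs the API server does not yet serve, so the example manifest is only applied once both are present. The CRD names are taken from the error messages; the `missing_crds` helper and the `KUBECTL` override are hypothetical names introduced for illustration, not part of the repo.

```shell
# KUBECTL can be overridden, e.g. to point at a stub when dry-running the check.
KUBECTL="${KUBECTL:-kubectl}"

# Print every CRD named in the arguments that the API server does not serve yet.
missing_crds() {
  for crd in "$@"; do
    if ! "$KUBECTL" get crd "$crd" >/dev/null 2>&1; then
      echo "$crd"
    fi
  done
}

# Usage, from the repo root (not executed here):
#   missing=$(missing_crds \
#     inferencepools.inference.networking.x-k8s.io \
#     inferencemodels.inference.networking.x-k8s.io)
#   [ -z "$missing" ] && kubectl apply -f examples/poc/manifests/inferencepool-with-model.yaml
```

If the helper prints anything, the CRDs need to be installed first.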

The example gateway manifest fails to install:

$ kubectl apply -f manifests/gateway.yaml
error: the path "manifests/gateway.yaml" does not exist

The path needs to be updated:

$ kubectl apply -f pkg/manifests/gateway.yaml
gateway.gateway.networking.k8s.io/inference-gateway created
gatewayclass.gateway.networking.k8s.io/inference-gateway created
backend.gateway.envoyproxy.io/backend-dummy created
httproute.gateway.networking.k8s.io/llm-route created

The same goes for the ext_proc.yaml and patch_policy.yaml:

$ kubectl apply -f ./manifests/ext_proc.yaml
$ kubectl apply -f ./manifests/patch_policy.yaml
error: the path "./manifests/ext_proc.yaml" does not exist
error: the path "./manifests/patch_policy.yaml" does not exist

The path needs to be updated:

$ kubectl apply -f ./pkg/manifests/ext_proc.yaml
$ kubectl apply -f ./pkg/manifests/patch_policy.yaml
clusterrole.rbac.authorization.k8s.io/pod-read created
clusterrolebinding.rbac.authorization.k8s.io/pod-read-binding created
deployment.apps/inference-gateway-ext-proc created
service/inference-gateway-ext-proc created
envoyextensionpolicy.gateway.envoyproxy.io/ext-proc-policy created
envoypatchpolicy.gateway.envoyproxy.io/custom-response-patch-policy created
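
The path errors above could be caught before calling kubectl at all. A minimal sketch, assuming the commands are run from the repo root: `missing_manifests` is a hypothetical helper that lists the paths that do not exist relative to the current directory.

```shell
# Print every path argument that is not a regular file relative to the
# current directory, so wrong paths surface before any kubectl call.
missing_manifests() {
  for f in "$@"; do
    [ -f "$f" ] || echo "$f"
  done
}

# Usage, from the repo root (not executed here):
#   missing_manifests pkg/manifests/gateway.yaml \
#     pkg/manifests/ext_proc.yaml pkg/manifests/patch_policy.yaml
```

Empty output means every listed manifest was found.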

The gateway never reaches a "Ready" state; PROGRAMMED stays False and no address is assigned:

$ kubectl get gateway/inference-gateway
NAME                CLASS               ADDRESS   PROGRAMMED   AGE
inference-gateway   inference-gateway             False        6m2s

Note that I am using a kind cluster to run the POC.
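
For scripting the readiness check rather than eyeballing the table, something like the following could work. It is a sketch that assumes the Gateway reports a status condition of type `Programmed`, as the PROGRAMMED column suggests; `gateway_programmed` and the `KUBECTL` override are hypothetical names.

```shell
# KUBECTL can be overridden, e.g. to point at a stub.
KUBECTL="${KUBECTL:-kubectl}"

# Succeed only when the named Gateway's Programmed condition has status "True".
gateway_programmed() {
  status=$("$KUBECTL" get gateway "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}')
  [ "$status" = "True" ]
}

# Usage (not executed here):
#   gateway_programmed inference-gateway || echo "gateway is not programmed yet"
```

`kubectl wait --for=condition=Programmed gateway/inference-gateway` with a timeout is another option when blocking until the condition is met is acceptable.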

danehans changed the title from "Update POC Readme to Install CRDs" to "Update POC Readme" on Jan 23, 2025

kfswain commented Jan 23, 2025

Interesting, I just ran through this myself last week and came up with some of the things you mentioned:

  • add make install to the list of commands
  • the directory changes are a little tedious, and can be simplified

But otherwise I had everything come up nicely. Do you know why your GW was unable to configure correctly?

danehans commented

> add make install to the list of commands

Note that we have an install Make target, but it only installs the CRDs. Even with the install target updated, a script that can be curl'ed or a single kubectl apply would be ideal for a quickstart guide, so users don't get hung up on the install.
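
To make the idea concrete, here is a rough sketch of what such a script might look like. It only chains the commands already mentioned in this issue, in the order they appear here. The `run` and `quickstart` helpers and the `DRY_RUN` switch are hypothetical, and the real README may need additional steps that are not covered.

```shell
# Hypothetical quickstart wrapper; run from the repo root.
# With DRY_RUN=1 (the default here) it only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Install the CRDs first, then apply the manifests; stop at the first failure.
quickstart() {
  run make install || return 1
  for manifest in \
    examples/poc/manifests/inferencepool-with-model.yaml \
    pkg/manifests/gateway.yaml \
    pkg/manifests/ext_proc.yaml \
    pkg/manifests/patch_policy.yaml
  do
    run kubectl apply -f "$manifest" || return 1
  done
}
```

Calling `quickstart` with `DRY_RUN=0` would execute the commands instead of printing them.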

> Do you know why your GW was unable to configure correctly?

The gateway never gets an external IP:

$ kubectl apply -f ./pkg/manifests/gateway.yaml
gateway.gateway.networking.k8s.io/inference-gateway created
gatewayclass.gateway.networking.k8s.io/inference-gateway created
backend.gateway.envoyproxy.io/backend-dummy created
httproute.gateway.networking.k8s.io/llm-route created

$ kubectl get gtw
NAME                CLASS               ADDRESS   PROGRAMMED   AGE
inference-gateway   inference-gateway             False        5s

$ kubectl get svc -n envoy-gateway-system
NAME                                       TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                                   AGE
envoy-default-inference-gateway-6454a873   LoadBalancer   10.96.197.219   <pending>     8080:32455/TCP,8081:31502/TCP             27s
envoy-gateway                              ClusterIP      10.96.96.73     <none>        18000/TCP,18001/TCP,18002/TCP,19001/TCP   5h34m

kind does not ship a load balancer implementation, so the LoadBalancer Service stays <pending>. To resolve this issue, either MetalLB or cloud-provider-kind becomes a dependency.
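
A small sketch for spotting this condition from the service listing: `pending_loadbalancers` is a hypothetical filter that reads the default `kubectl get svc` table on stdin and prints the LoadBalancer services whose EXTERNAL-IP is still `<pending>`. It assumes the default column order shown above.

```shell
# Read `kubectl get svc` table output on stdin and print the names of
# LoadBalancer services whose EXTERNAL-IP column is still <pending>.
pending_loadbalancers() {
  awk 'NR > 1 && $2 == "LoadBalancer" && $4 == "<pending>" { print $1 }'
}

# Usage (not executed here):
#   kubectl get svc -n envoy-gateway-system | pending_loadbalancers
```

Any name printed here points at a service that is waiting on a load balancer provider.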
