add readiness probe on TGIS container (caikit+tgis) #156
All we should need is the model loaded in TGIS; caikit makes its API available more or less regardless of whether the model is loaded (since it attempts to lazy-load the connections, etc.), so ensuring that at least the foundation model is loaded in TGIS before declaring the container ready seems to be the best bet here.

This was observed in perfscale, where there is a large amount of contention on the GPU, so it may take a long time for models to reserve space on the GPU. This could potentially also be a problem for a very large foundation model.
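Kubernetes grpc probes speak the standard `grpc.health.v1.Health` protocol, so this approach assumes TGIS reports `SERVING` only once the model is loaded. As a sketch (not from this thread), the same endpoint a kubelet grpc probe would hit can be checked manually, assuming `grpcurl` is installed and TGIS serves gRPC on port 8033 as in the probes below:

```shell
# forward the TGIS gRPC port from the running pod (pod name is hypothetical)
kubectl port-forward pod/<tgis-pod> 8033:8033 &

# call the same health service a kubelet grpc probe would use
grpcurl -plaintext localhost:8033 grpc.health.v1.Health/Check
# expected once the model is loaded: { "status": "SERVING" }
```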
For the TGIS container, something like:

```yaml
livenessProbe:
  grpc:
    port: 8033
  initialDelaySeconds: 10
readinessProbe:
  grpc:
    port: 8033
  initialDelaySeconds: 10
```

A liveness/readiness probe could also be added for caikit:

```yaml
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 10
livenessProbe:
  httpGet:
    path: /health
    port: 8080
```
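For context, here is a sketch of where these probes could sit in a caikit+tgis ServingRuntime. The container names and images are illustrative assumptions, not taken from the actual runtime definition:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: caikit-tgis-runtime              # illustrative name
spec:
  supportedModelFormats:
    - name: caikit                       # illustrative
  containers:
    - name: kserve-container             # caikit frontend (name is an assumption)
      image: caikit-tgis-serving:latest  # placeholder image
      readinessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 10
    - name: transformer-container        # TGIS (name is an assumption)
      image: text-generation-inference:latest  # placeholder image
      readinessProbe:
        grpc:
          port: 8033
        initialDelaySeconds: 10
```

Whether Knative actually honours probes on more than one container is exactly what the rest of this thread runs into.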
I am going to replicate this issue in my env and will try adding liveness/readiness probes to see if that solves the problem.
It seems there's an issue with how kserve handles the ServingRuntime when a grpc ReadinessProbe is added to it. The KsvcReconciler logs the following entry (the escaped `diff` string is elided here and reformatted below):

```json
{
  "level": "info",
  "ts": "2023-11-29T12:27:27Z",
  "logger": "KsvcReconciler",
  "msg": "knative service configuration diff (-desired, +observed):",
  "diff": "..."
}
```

Which, when formatted in a sane manner, becomes:

```diff
v1.ConfigurationSpec{
  Template: v1.RevisionTemplateSpec{
    ObjectMeta: {Labels: {"component": "predictor", "serving.kserve.io/inferenceservice": "caikit-example-isvc"}, Annotations: {"autoscaling.knative.dev/class": "kpa.autoscaling.knative.dev", "autoscaling.knative.dev/min-scale": "1", "internal.serving.kserve.io/storage-initializer-sourceuri": "pvc://caikit-claim/flan-t5-small-caikit/", "sidecar.istio.io/inject": "true", ...}},
    Spec: v1.RevisionSpec{
      PodSpec: v1.PodSpec{
        Volumes: {{Name: "config-volume", VolumeSource: {ConfigMap: &{LocalObjectReference: {Name: "caikit-tgis-config"}}}}},
        InitContainers: nil,
        Containers: []v1.Container{
          {
            ... // 6 identical fields
            EnvFrom: nil,
            Env: {{Name: "TRANSFORMERS_CACHE", Value: "/tmp/transformers_cache"}},
            Resources: v1.ResourceRequirements{
-             Limits:   v1.ResourceList{},
+             Limits:   nil,
-             Requests: v1.ResourceList{},
+             Requests: nil,
              Claims: nil,
            },
            VolumeMounts: nil,
            VolumeDevices: nil,
            LivenessProbe: nil,
            ReadinessProbe: &v1.Probe{
              ProbeHandler: v1.ProbeHandler{
                Exec: nil,
                HTTPGet: nil,
                TCPSocket: nil,
-               GRPC: s"&GRPCAction{Port:8083,Service:nil,}",
+               GRPC: nil,
              },
              InitialDelaySeconds: 0,
              TimeoutSeconds: 0,
              ... // 4 identical fields
            },
            StartupProbe: nil,
            Lifecycle: nil,
            ... // 7 identical fields
          },
          {
            ... // 6 identical fields
            EnvFrom: nil,
            Env: nil,
            Resources: v1.ResourceRequirements{
-             Limits:   v1.ResourceList{},
+             Limits:   nil,
-             Requests: v1.ResourceList{},
+             Requests: nil,
              Claims: nil,
            },
            VolumeMounts: {{Name: "config-volume", ReadOnly: true, MountPath: "/caikit/config/"}},
            VolumeDevices: nil,
            LivenessProbe: nil,
            ReadinessProbe: &v1.Probe{
              ProbeHandler: v1.ProbeHandler{
                Exec: nil,
                HTTPGet: nil,
                TCPSocket: &{},
-               GRPC: s"&GRPCAction{Port:8085,Service:nil,}",
+               GRPC: nil,
              },
              InitialDelaySeconds: 0,
              TimeoutSeconds: 0,
              ... // 4 identical fields
            },
            StartupProbe: nil,
            Lifecycle: nil,
            ... // 7 identical fields
          },
        },
        EphemeralContainers: nil,
        RestartPolicy: "",
        ... // 34 identical fields
      },
      ContainerConcurrency: &0,
      TimeoutSeconds: &300,
      ... // 2 identical fields
    },
  },
}
```

Note that the desired configuration carries the grpc probes (`GRPCAction{Port:8083}` and `GRPCAction{Port:8085}`), while the observed one has `GRPC: nil`: the probes are being dropped.
related: #170
It seems we are currently blocked by knative/serving#8949 (comment).
@dtrifiro hi, do you define probes in multiple containers (just trying to confirm)?
@skonto Just one probe in one container, but with multiple containers. I now understand what's going on, though. I was initially confused because the behaviour for the grpc probes was different (see #156 (comment) and the grpc section below). Here are minimal knative serving examples:

**http probes**

```yaml
# probes-testing.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: probes-testing
  namespace: probes-testing
spec:
  template:
    spec:
      containers:
        - image: python:3.11-slim
          name: container1
          command: ["python", "-m", "http.server", "8001"]
          # ports:
          #   - containerPort: 8001
          #     protocol: TCP
          readinessProbe:
            httpGet:
              path: /
              port: 8001
        - image: python:3.11-slim
          name: container2
          command: ["python", "-m", "http.server"]
          ports:
            - containerPort: 8000
              protocol: TCP
          # readinessProbe:
          #   httpGet:
          #     path: /
          #     port: 8000
```

Trying to send a request:
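A plausible way to exercise it (a sketch; the URL is whatever `kubectl get ksvc probes-testing` reports in your cluster, hypothetical here):

```shell
# hit the Knative route; the hostname is illustrative
curl -v http://probes-testing.probes-testing.example.com
```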
The request does not go through, which I guess is not what I'd like, but it makes sense.

**grpc probes**

A setup similar to the above, but with grpc probes:

```yaml
# grpc-probes-testing.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: probes-testing
  namespace: probes-testing
spec:
  template:
    spec:
      containers:
        - name: agnhost
          image: registry.k8s.io/e2e-test-images/agnhost:2.35
          command: ["/agnhost", "grpc-health-checking"]
          ports:
            # - containerPort: 5000
            - containerPort: 8080
          readinessProbe:
            grpc:
              port: 5000
        - name: agnhost1
          image: registry.k8s.io/e2e-test-images/agnhost:2.35
          command: ["/agnhost", "grpc-health-checking", "--port", "5001"]
          # ports:
          #   - containerPort: 5001
          #   - containerPort: 8080
          readinessProbe:
            grpc:
              port: 5001
```

This manifest applies with no errors. Getting all resources:

```shell
$ kubectl get all
NAME                                               LATESTCREATED          LATESTREADY   READY     REASON
configuration.serving.knative.dev/probes-testing   probes-testing-00001                 Unknown

NAME                                                CONFIG NAME      K8S SERVICE NAME   GENERATION   READY     REASON      ACTUAL REPLICAS   DESIRED REPLICAS
revision.serving.knative.dev/probes-testing-00001   probes-testing                      1            Unknown   Deploying

NAME                                       URL                                             READY     REASON
route.serving.knative.dev/probes-testing   http://probes-testing.kserve-test.example.com   Unknown   RevisionMissing

NAME                                         URL                                             LATESTCREATED          LATESTREADY   READY     REASON
service.serving.knative.dev/probes-testing   http://probes-testing.kserve-test.example.com   probes-testing-00001                 Unknown   RevisionMissing
```

No pod is scheduled. If I look at the event log, I see the following:

```
InternalError   Revision/probes-testing-00001   failed to create deployment "probes-testing-00001-deployment": Deployment.apps "probes-testing-00001-deployment" is invalid: spec.template.spec.containers[1].readinessProbe: Required value: must specify a handler type
```
This is the confusing behaviour I was experiencing in the first place.
@dtrifiro fyi: WIP in Knative Serving: knative/serving#14853
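If that lands, multi-container probing will presumably be opt-in via a feature flag in Knative's `config-features` ConfigMap. A sketch of what enabling it might look like (the flag name is inferred from the linked PR and may change before release):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-features
  namespace: knative-serving
data:
  # flag name assumed from knative/serving#14853; verify against released docs
  multi-container-probing: "enabled"
```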
@Xaenalt