[🐛 Bug]: invalid session id with selenium grid #2153
@amardeep2006, thank you for creating this issue. We will troubleshoot it as soon as we can.

Info for maintainers: triage this issue by using labels.
If information is missing, add a helpful comment and then the corresponding label.
If the issue is a question, add the corresponding label.
If the issue is valid but there is no time to troubleshoot it, consider adding the corresponding label.
If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable label.
After troubleshooting the issue, please add the corresponding label.
Thank you!
Can you also check the logs of the keda-operator at that time, to see if any deployment was scaled down by the scaler? Looking at your logs, this seems similar to an issue still open for discussion around autoscaling: #2129
Thanks @VietND96 for the response. I looked at the issue you mentioned and it seems very close to mine. I tried disabling autoscaling and all tests passed, so the issue is definitely related to autoscaling.

I have another question: looking at the chrome-node logs, what do you think? Assuming the KEDA operator instructed the downscaling, why was the preStop script not able to hold the DRAINING node until the tests were complete? Could there be a bug in the preStop script logic?

edit: I just saw one more issue that looks similar in nature.
I will share further details tomorrow, as I feel it could be Kubernetes killing the pods for any of several reasons.
@VietND96 I applied the chromeNode.terminationGracePeriodSeconds=3600 setting and the issue has disappeared. I see the following issues that may need a fix in the helm chart:
firefoxNode:
  enabled: true
  imagePullPolicy: Always
  # /dev/shm volume
  dshmVolumeSizeLimit: "2Gi"
  # Resources for firefox-node container
  resources:
    requests:
      memory: "1Gi"
      cpu: "1"
    limits:
      memory: "2Gi"
      cpu: "2"
  extraEnvironmentVariables:
    # - name: "SE_VNC_NO_PASSWORD"
    #   value: "1"
    - name: "SE_VNC_VIEW_ONLY"
      value: "1"
autoscaling:
  scaledOptions:
    minReplicaCount: 0
    maxReplicaCount: 3
  terminationGracePeriodSeconds: 3600

edit: The downside is that a few pods may live for up to 3600 seconds in the Terminating state and still do processing. I can live with that for now.
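For reference, the chrome side of the workaround mentioned above would look like this in values form (a minimal sketch assuming the same chart keys; only the relevant fields are shown):

chromeNode:
  enabled: true
  # Give a DRAINING node up to an hour to finish in-flight sessions
  # before Kubernetes force-kills the pod.
  terminationGracePeriodSeconds: 3600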
OK, let me check whether any regression broke this, since in the README I updated that default value.
It was a defect, actually: the logic is handled, but the named template is not called in the Node spec YAML, so the value is not applied.
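To illustrate what the fix needs to produce, the grace period has to land at the pod-spec level of the rendered Node Deployment, roughly like this (a hand-written sketch, not actual chart output; the name, labels, and image tag are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: selenium-chrome-node            # placeholder name
spec:
  selector:
    matchLabels:
      app: selenium-chrome-node
  template:
    metadata:
      labels:
        app: selenium-chrome-node
    spec:
      # Kubernetes defaults this to 30 seconds, so unless the value is
      # rendered here a DRAINING node is killed long before a 10-35 minute
      # test can finish.
      terminationGracePeriodSeconds: 3600
      containers:
        - name: selenium-chrome-node
          image: selenium/node-chrome:4.18.1   # placeholder tag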
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
What happened?
I upgraded Selenium Grid from 4.14 to 4.18.1 this week and am observing that some test scripts throw errors like:

invalid session id: Unable to find session with ID: 2e5824bb7eb7f83dc8d83774e8c7c539
Deployed on: Kubernetes
Autoscaling: enabled (Deployment)
terminationGracePeriodSeconds: 3600 in autoscaling (as sketched below)
Do we know the reasons why the chrome node goes into DRAINING mode?
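For reference, the autoscaling setup above corresponds roughly to values like the following (a sketch based on this description; the key names, in particular scalingType, are assumptions about the chart's autoscaling section and are not copied from the actual values file):

autoscaling:
  enabled: true
  # Scaling browser nodes as a Deployment, per the description above (assumed key/value)
  scalingType: deployment
  # Grace period set in the autoscaling section, as stated above
  terminationGracePeriodSeconds: 3600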
Additional Info:
I run around 30 tests in parallel.
I run 1 browser per pod.
The test duration ranges between 10 minutes to 35 minutes.
The KEDA version was previously 2.12.0; it is now 2.13.1 with this upgrade.
Command used to start Selenium Grid with Docker (or Kubernetes)
Relevant log output
Operating System
Kubernetes 1.23.14
Docker Selenium version (image tag)
4.18.1
Selenium Grid chart version (chart version)
0.28.1