graphd crash during DML when node being restarted #5041
Comments
It looks like storaged is involved in this crash. @pengweisong
Does graphd have any coredump? This is a basic chaos scenario that was verified long ago, but I haven't checked it recently. @HarrisChu @kikimo, do we have this case?
No coredump in this case.
@goranc, regarding "When I stop services on one node of the cluster": which services are we talking about? Only storaged, or others as well?
I restarted all services on one node by invoking 'nebuladb.service stop all'.
Hi @goranc, crash issues are very important to us. Thanks for sharing. We are still having trouble reproducing this problem. Would you please share the overall topology of your cluster? How many nodes does the cluster have? How many services does each node have? Does each node run all types of services (metad, graphd, storaged)? More information may help us reproduce the problem.
I don't quite recall.
Do the hosts here mean storage? @wey-gu @xtcyclist
According to this detailed description, I think all services on the node were restarted, which very likely includes some storaged services. @kikimo
Are there any fatal logs or stderr logs? @wey-gu
We don't have more info on this for now.
Hi @goranc, would you please help us confirm whether any services crashed? My colleague tried to reproduce your case but only found that the connection that keeps inserting data gets disconnected, with no services crashing.
Hi, when restarting services on a specific node, all services are restarted.
Closing this for now; we will reopen it if the problem reappears.
Please check the FAQ documentation before raising an issue
Describe the bug (required)
Your Environments (required)
3.3.0
How To Reproduce (required)
Steps to reproduce the behavior:
Expected behavior
This looks like a chaos use case. No crash should occur, though some write failures would be acceptable.
Additional context
@goranc can help provide more information when needed.