Adding new m3db namespace causes partial cluster OOM #2155
Comments
Just to confirm, you saw no logs in the OOMed nodes pre-OOM and post namespace update? For example no logs like:
Trying to get an idea of where the namespace update gets stuck.
Yes, nothing from dbnode process after namespace update and before OOM:
This has been happening with both v0.14.2 and v0.15.0-rc0.
Most recent example:
The cluster has 18 db nodes with RF=3. It has a few namespaces set up and is taking about 200k metrics/s. As soon as a new namespace is added via the namespace API, 14 of the 18 nodes bootstrap the new namespace without an issue. The other 4 nodes see the new namespace added but do not start bootstrapping it. At the same time, memory usage and goroutine count start to rise sharply on these 4 nodes until all of them OOM and start a full bootstrap. There is nothing in the log files of these nodes between the point where they see the new namespace and the point where the process is killed.
m3dbnode-config.yml.zip
ns.json.zip
placement.json.zip
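For anyone trying to catch the affected nodes before they OOM, here is a rough sketch of how the "goroutine count rises sharply" symptom could be flagged from periodic samples. It is not part of M3: the function name `find_runaway_nodes`, the node IDs, the sample values, and the 2x growth threshold are all hypothetical, and how the goroutine counts are collected (metrics, pprof, etc.) is left open.

```python
from typing import Dict, List


def find_runaway_nodes(
    samples: Dict[str, List[int]], growth_factor: float = 2.0
) -> List[str]:
    """Return IDs of nodes whose latest goroutine count is at least
    growth_factor times their first sampled count."""
    runaway = []
    for node, counts in samples.items():
        # Need at least two samples and a positive baseline to compare.
        if len(counts) < 2 or counts[0] <= 0:
            continue
        if counts[-1] >= growth_factor * counts[0]:
            runaway.append(node)
    return sorted(runaway)


# Hypothetical goroutine counts sampled after the namespace update.
samples = {
    "node-01": [1200, 1250, 1230],  # stable, bootstraps normally
    "node-02": [1180, 4100, 9800],  # climbing, likely to OOM
}
print(find_runaway_nodes(samples))
```

Nodes flagged this way would be the ones to grab a goroutine dump from while the process is still alive, since the logs are silent in that window.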
Was able to reproduce on a cluster with more than a few m3db nodes: