UCP: handle a case of a null cm on the worker - v1.10.x #6765
@alinask can you pls add a NEWS entry?
NEWS
@@ -17,6 +17,7 @@
* Fixes in RPM dependency on libibverbs
* Fixes in ABI backward compatibility for active message protocol
* Add support for DC full-handshake mode (off by default).
* Fixes for handling a NULL cm on the ucp worker.
Fixes in TCP connection establishment (issue #6755)
Why TCP? The null cm can be any cm in the array (most likely rdmacm, though), and the fix is in the UCP layer.
How can we describe the error at a higher level, in terms of how it affects a user?
Fixes for handling a missing sockaddr transport on a host?
Fixes for connection establishment protocol (tcpcm, rdmacm, etc.)?
How about: Fixes for segmentation fault while listening for connections
55df280 to 45400db
failure on |
bot:pipe:retest
@alinask can you pls squash?
may happen if the list of components that support CM is longer than the available cms on the host (worker->cms).
8a69295 to 3a307d4
What
Handle a case of a null cm on the worker.
Why ?
This may happen when the list of components that support CM is longer than the
list of cms actually available on the host (worker->cms), leaving some entries NULL.
Needed to fix #6755
Backport of #6759
How ?
Skip a null cm on the worker.
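
For illustration only, the skip described above can be sketched roughly as follows. This is not the code from this PR: the `sketch_*` types and the `sketch_count_usable_cms` function are hypothetical stand-ins for the worker's cm array, assuming each entry holds a cm pointer that stays NULL when the component has no cm on this host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT the real UCX types. */
typedef struct sketch_cm {
    const char *name;
} sketch_cm_t;

typedef struct sketch_worker_cm {
    sketch_cm_t *cm; /* assumed to stay NULL when no cm is available */
} sketch_worker_cm_t;

typedef struct sketch_worker {
    sketch_worker_cm_t *cms;     /* one entry per component that supports CM */
    size_t              num_cms; /* number of entries in cms */
} sketch_worker_t;

/* Walk the worker's cm array and count the entries that can be used,
 * skipping NULL cms instead of dereferencing them. */
static size_t sketch_count_usable_cms(const sketch_worker_t *worker)
{
    size_t i;
    size_t usable = 0;

    assert(worker != NULL);

    for (i = 0; i < worker->num_cms; ++i) {
        if (worker->cms[i].cm == NULL) {
            continue; /* no cm for this component on this host: skip it */
        }

        /* A real caller would use worker->cms[i].cm here,
         * for example to start listening for connections. */
        ++usable;
    }

    return usable;
}
```

With an array whose first entry has a NULL cm and whose second entry points to a valid cm, the loop visits both entries but only uses the second one, so the function returns 1 rather than faulting on the first.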