-
Notifications
You must be signed in to change notification settings - Fork 434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCT/CM/RDMACM: share dummy CQ per device #6016
UCT/CM/RDMACM: share dummy CQ per device #6016
Conversation
@evgeny-leksikov was it clean CP? |
no, there were 2 conflicts around dev_name parameter |
bot:pipe:retest |
src/uct/ib/rdmacm/rdmacm_cm.c
Outdated
@@ -624,6 +674,8 @@ UCS_CLASS_CLEANUP_FUNC(uct_rdmacm_cm_t) | |||
|
|||
ucs_trace("destroying event_channel %p on cm %p", self->ev_ch, self); | |||
rdma_destroy_event_channel(self->ev_ch); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove space line
src/uct/ib/rdmacm/rdmacm_cm.h
Outdated
#include <uct/base/uct_cm.h> | ||
#include <ucs/datastruct/khash.h> | ||
|
||
#include <infiniband/verbs.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's not needed
src/uct/ib/rdmacm/rdmacm_cm.c
Outdated
khiter_t iter; | ||
int ret; | ||
|
||
iter = kh_put(uct_rdmacm_cm_cqs, &cm->cqs, device_name, &ret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does it work?? looks like the hash table key can point to memory on the stack (device_name->dev_name)
IMO better use rdma_cm_id::pd pointer as the key
Any chance it's using stack garbage as the key and actually creating new CQ every time?
2309b48
to
7d38a90
Compare
re-cherry-picked squashed yosefe#105 + yosefe#113 to avoid conflicts which were appeared due to wrong approach with |
|
||
/* create a dummy qp in order to get a unique qp_num to provide to librdmacm */ | ||
status = uct_rdmacm_cm_create_dummy_cq_qp(cep->id, &cq, &qp); | ||
status = uct_rdmacm_cm_create_dummy_qp(cep->id, cq, &cep->qp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe call uct_rdmacm_cm_get_cq from uct_rdmacm_cm_create_dummy_qp()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to avoid hidden side effect of such function, let's keep things simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
What
share dummy CQ per device
Why ?
to reduce resources consumption
How ?
ported from yosefe#105