log: Use RTLD_NOLOAD when checking symbols #310
On FreeBSD 11, calling dlopen on a shared library causes the constructors to run again. As we're just getting symbols we don't need this to happen. Actually we don't WANT it to happen because it can cause qb_log_init to be called twice (recursively) and the dlnames list gets corrupted. This causes corosync (at least) to crash at startup. Signed-off-by: Christine Caulfield <[email protected]>
I'm not very familiar with the dl*() functions. Doesn't the library need to be loaded to use dlsym()? It looks like RTLD_NOLOAD is a glibc thing, I guess we don't support anything else?
I think by 'load' it means activate ready for execution, which is the usual use for dlopen of course. It certainly means the constructors are called (which is the problem here). With NOLOAD I suspect it just loads it into memory so you can get the symbols from it. But because we are already walking the tree of loaded libraries we don't need to 'LOAD' them again, the symbols are already there (I have checked this). It's certainly supported by FreeBSD (which is where I see the problem) and Linux, but if you think it might be a problem I can set it at ./configure time.
According to dlopen(3), Beyond this, the algorithm seems racy: what if a shared object is replaced after the application initially loads it but before this
We're not looking for the whole symbol table, just for the logging segments inside each loaded object, so I think dlsym() is fine. Also, adding NOLOAD removes the potential for the race, as it will only look inside the objects that are already loaded in memory and in use by the program, which is what we want. If the shared object is replaced we most certainly do NOT want the new symbols as they might not be correct for what is in RAM.
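To illustrate the behaviour described above, here is a minimal sketch of a symbol lookup that only consults objects already resident in the process. The helper name `lookup_if_loaded` is hypothetical and is not taken from the libqb source; it only assumes the standard `dlopen`, `dlsym` and `dlclose` calls and the `RTLD_NOLOAD` flag as documented for glibc and FreeBSD.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper (not libqb code): look up `symbol` in the object at
 * `path`, but only if that object is already mapped into this process.
 */
static void *lookup_if_loaded(const char *path, const char *symbol)
{
    /*
     * RTLD_NOLOAD asks for a handle only if the object is already
     * resident. Nothing new is loaded, so a path that was never loaded
     * yields NULL instead of mapping a fresh copy from disk.
     */
    void *handle = dlopen(path, RTLD_LAZY | RTLD_NOLOAD);
    if (handle == NULL) {
        return NULL;
    }

    /* Resolve the symbol inside the object that is already in memory. */
    void *addr = dlsym(handle, symbol);

    /*
     * Drop the reference taken by dlopen(). The object stays mapped
     * because the running program still holds its own reference.
     */
    dlclose(handle);
    return addr;
}
```

A caller would pass each library name found while walking the loaded objects and treat a NULL return as "skip this object". On older glibc versions the program may need to be linked with `-ldl`.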
Sure, but with
TBH that's probably preferable to getting the wrong symbol, which would be a guaranteed segfault :)
Probably so, though I'm not sure what happens on symbol lookup failure.
It's a messy area I agree, thanks for your input. I'll see if anyone else has any comments before I commit it. Just to be on the safe side :)
To be explicit, I support merging this PR. We're probably better off with it, even though the code might not be perfect still. I'm unsure because the quite insightful #266 (comment) and its followups do not mention this supposed race.
Thanks Feri. I've only seen the race on FreeBSD but it could potentially be more common.