mac shlib dependencies fixes #1891
Only one target (opentelemetry_exporter_otlp_grpc_client) can link gRPC::grpc++, or there may be a symbol mixing/overlapping problem.
exporters/otlp/CMakeLists.txt
  target_link_libraries(opentelemetry_exporter_otlp_grpc_client
-   PRIVATE gRPC::grpc++)
+   PUBLIC gRPC::grpc++)
Why change this to a PUBLIC link here? This should be a PRIVATE link, or all targets that depend on opentelemetry_exporter_otlp_grpc_client will link gRPC::grpc++, which will cause the problem in #1603.
Sorry, struggling to reproduce #1603...
Before spending much time on it, let me guess: is this about linking against static gRPC libs, by any chance?
> Sorry, struggling to reproduce #1603... Before spending much time on it, let me guess: Is this about linking against static gRPC libs by a chance?
This bug can be reproduced when gRPC is compiled as a static library and otel-cpp is compiled as a dynamic library on Linux (with the ELF ABI).
In this situation, the global variables share the same address across the dynamic libraries, but the initialization guards are private to each library, so these global variables are constructed more than once.
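The mechanism described above can be imitated inside a single program. The following is only a sketch under stated assumptions, not gRPC or otel-cpp code: `Global`, `Module`, `g_storage` and `count_constructions` are hypothetical names. One shared buffer stands in for the single address the dynamic linker picks for the exported symbol, and each `Module` stands in for one shared object carrying its own private guard.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for a global object inside a statically linked library.
struct Global {
  static int constructions;
  int value;
  Global() : value(127) { ++constructions; }
};
int Global::constructions = 0;

// One piece of storage: models the single address that every module
// resolves the exported global symbol to.
alignas(Global) static unsigned char g_storage[sizeof(Global)];

// Models one shared object (or the executable) that embeds the static library.
// Its guard is private, so it cannot see that another module already ran the
// constructor at the shared address.
struct Module {
  Global* obj = nullptr;  // private initialization guard

  Global* init() {
    if (obj == nullptr) {
      obj = new (g_storage) Global();  // constructs at the shared address
    }
    return obj;
  }
};

// Returns how many times the one shared global was constructed.
int count_constructions() {
  Global::constructions = 0;
  Module exe;
  Module lib;
  Global* a = exe.init();
  Global* b = lib.init();
  assert(a == b);  // same address seen by both modules
  return Global::constructions;
}
```

With two modules, `count_constructions()` reports two constructions at one address, which mirrors the repeated construction described in this comment. The real behavior depends on the linker and symbol visibility, which this simulation does not model.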
@ays7 - Do you have further comments on the above discussion?
Yes, I will try to look into static gRPC linking soon -- marked PR as WIP for now
…solve for the shlib to successfully link]
38d0fb3 to 9665120
@owent I was able to reproduce #1603 with static gRPC libraries; however, the fact that multiple opentelemetry libraries were linked against gRPC turns out to be a red herring. The real culprit is that OTEL's grpc tracer harness was defined in both
That was later resolved with the move of common code into
That said, you are right, gRPC library link[s] had better be kept private.
-   PUBLIC opentelemetry_otlp_recordable
-          opentelemetry_exporter_otlp_grpc_client)
+   PUBLIC opentelemetry_otlp_recordable opentelemetry_exporter_otlp_grpc_client
+   PRIVATE gRPC::grpc++)
Why do we link gRPC::grpc++ into opentelemetry_exporter_otlp_grpc, opentelemetry_exporter_otlp_grpc_log and opentelemetry_exporter_otlp_grpc_metrics? If we build otel-cpp as static libraries, they will link it just by linking opentelemetry_exporter_otlp_grpc_client. If we build otel-cpp as dynamic libraries, I think the global variables of gRPC end up in all of opentelemetry_exporter_otlp_grpc_client, opentelemetry_exporter_otlp_grpc, opentelemetry_exporter_otlp_grpc_log and opentelemetry_exporter_otlp_grpc_metrics, so the symbol overlap problem still exists.
@owent gRPC::grpc++ must be linked, either directly or via a shared dependency, into all libs that make direct gRPC API calls in order to satisfy the Mac shared library linker. Mac shared libraries are linked like executables on Linux: all referenced symbols must be resolved to somewhere.
One way to achieve this is by exposing gRPC as a public dependency of opentelemetry_exporter_otlp_grpc_client, which is a common dependency of the 3 libs you mentioned. That was my original suggestion.
For the static linking, here is my understanding: when linking shared libs on Linux, the linker first attempts to resolve referenced symbols to one of the linked-in shared libraries and, if that fails, to one of the linked-in static libraries. In the latter case, code from the static library is copied into the output.
With that in mind, in the weird scenario of a static gRPC/dynamic OTEL library mix, what happens is that the bulk of the gRPC library code is embedded into opentelemetry_exporter_otlp_grpc_client, and all other otlp libs [which depend on _client] reference gRPC code from the _client .so and not from the explicitly linked static gRPC library.
There is a potential issue if, say, both opentelemetry_exporter_otlp_grpc_log and opentelemetry_exporter_otlp_grpc_metrics call a certain function that is not called inside opentelemetry_exporter_otlp_grpc_client. In that case both _log and _metrics may get a copy of some gRPC lib code. Fortunately there are no gRPC symbols like that at this point, but maybe it is indeed safer to declare the gRPC lib as a common PUBLIC dependency, as I originally suggested?
We cannot link static gRPC into more than one dynamic library or executable. There will be more than one copy of some global variables in gRPC; some of them will be initialized more than once and some will never be initialized. So the solution is to link gRPC into just one dynamic library, that is opentelemetry_exporter_otlp_grpc_client. Other libraries should only use the API from opentelemetry_exporter_otlp_grpc_client and cannot use gRPC APIs directly.
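One way to picture "other libraries only use the client's API" is an opaque facade. This is a minimal sketch under assumptions, not the otel-cpp interface: `ExportClient`, `ExportLogs`, `ExportMetrics` and `demo_shared_client` are hypothetical names, and a plain vector stands in for the gRPC channel that would live only inside the client library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Public surface of the hypothetical client library: no gRPC types appear
// here, so dependents never reference gRPC symbols themselves.
class ExportClient {
 public:
  ExportClient();
  ~ExportClient();
  bool Export(const std::string& payload);
  std::size_t SentCount() const;

 private:
  struct Impl;  // defined only inside the client library
  std::unique_ptr<Impl> impl_;
};

// Inside the client library only: the one place that would touch gRPC.
struct ExportClient::Impl {
  std::vector<std::string> sent;  // stand-in for a gRPC stub/channel
};

ExportClient::ExportClient() : impl_(new Impl) {}
ExportClient::~ExportClient() = default;

bool ExportClient::Export(const std::string& payload) {
  if (payload.empty()) return false;
  impl_->sent.push_back(payload);
  return true;
}

std::size_t ExportClient::SentCount() const { return impl_->sent.size(); }

// Code that would live in the other exporter libraries: it goes through the
// facade and never calls the transport directly.
bool ExportLogs(ExportClient& client, const std::string& record) {
  return client.Export("log:" + record);
}

bool ExportMetrics(ExportClient& client, const std::string& point) {
  return client.Export("metric:" + point);
}

// Small driver: both exporters share one client instance.
std::size_t demo_shared_client() {
  ExportClient client;
  bool ok_logs = ExportLogs(client, "hello");
  bool ok_metrics = ExportMetrics(client, "42");
  assert(ok_logs && ok_metrics);
  return client.SentCount();
}
```

Because the facade header exposes no transport types, only the library that defines `ExportClient::Impl` would need to link the transport. That is the property the comment above asks for.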
Here is a simple sample to reproduce the problem in gRPC.
a.h
struct foo {
  int bar;
  foo();
  ~foo();
  void print(const char*);
  static foo _;  // global with external linkage
};
void print_static_global(const char*);
a.cpp
#include "a.h"
#include <iostream>
#include <thread>
#include <chrono>
#include <memory>
struct bar {
  foo* ptr = nullptr;
};
foo foo::_;
// Internal linkage: every module that embeds a.cpp gets its own copy of s_.
static bar s_;
foo::foo(): bar(127) {
  s_.ptr = this;
  std::cout<< "construct "<< this<< std::endl;
}
foo::~foo() {
  std::cout<< "destroy "<< this<< std::endl;
}
void foo::print(const char* prefix) {
  std::cout<< prefix<< "-foo: "<< this<< ": "<< bar<< std::endl;
}
void print_static_global(const char* prefix) {
  foo::_.print(prefix);
  std::cout<< prefix<< "-piblic API bar: "<< s_.ptr<< std::endl;
}
// Background thread that reads s_ one second after start-up;
// the custom deleter joins the thread before deleting it.
std::shared_ptr<std::thread> g_t(new std::thread([]() {
  std::this_thread::sleep_for(std::chrono::seconds{1});
  std::cout<< "internal API bar: "<< s_.ptr<< std::endl;
}), [](std::thread* thd) {
  thd->join();
  delete thd;
});
b.cpp
#include "a.h"
void dll_func_b() {
  print_static_global("b");
}
c.cpp
#include "a.h"
void dll_func_b();
int main() {
  print_static_global("c");
  dll_func_b();
  return 0;
}
Compile and run
[owent@VM-144-59-centos test]$ clang++ a.cpp -o libtest_a.a -c -fPIC -pthread
[owent@VM-144-59-centos test]$ clang++ b.cpp -o libtest_b.so -shared -fPIC -L$PWD -ltest_a -pthread
[owent@VM-144-59-centos test]$ clang++ c.cpp -fPIC -L$PWD -ltest_b -ltest_a '-Wl,-rpath=$ORIGIN' -pthread
[owent@VM-144-59-centos test]$ ./a.out
[owent@VM-144-59-centos test]$ ./a.out
construct 0x55af97755338
construct 0x55af97755338
c-foo: 0x55af97755338: 127
c-piblic API bar: 0x55af97755338
b-foo: 0x55af97755338: 127
b-piblic API bar: 0x55af97755338
internal API bar: (nil)
internal API bar: 0x55af97755338
destroy 0x55af97755338
destroy 0x55af97755338
[owent@VM-144-59-centos test]$
$ORIGIN may be replaced by @loader_path on macOS.
As you can see, the global variable at 0x55af97755338 is constructed twice and destroyed twice, and the internal API uses a different static bar s_;. I didn't test it on macOS, but I guess it has the same problem. It's more complicated in gRPC, and it will stop gRPC from working in some dynamic libraries.
Actually, if we set gRPC as a private dependency of opentelemetry_exporter_otlp_grpc_client and build opentelemetry_exporter_otlp_grpc_client as a static library, all the targets will link gRPC automatically just by linking opentelemetry_exporter_otlp_grpc_client.
@owent Thanks for the test - I will play with it soon.
In the meantime, here is another idea: is there anything particularly wrong with combining all _otlp_grpc_ libraries into a single opentelemetry_exporter_otlp_grpc library?
And one more thing: if a static gRPC library can only be linked into one DLL, then any application that uses gRPC [either directly or indirectly via, say, a third-party dependency] will be in conflict with the otlp_grpc libraries. Which logically means that using static gRPC libraries is a bad idea. Right?
> @owent Thanks for the test - I will play with it soon. In a mean time here is another idea: anything particularly wrong with combining all _otlp_grpc_ libraries into a single opentelemetry_exporter_otlp_grpc library?
Personally, I prefer to split modules one by one. The build system may build all libraries, but most users will use only part of them.
It will cost more memory and CPU if we pack all the code into one library. We have a project that needs about 8+ GB of memory to link each executable, and it easily runs out of memory when we link them concurrently. We have done a lot of work to split dependencies into small libraries to solve this problem.
> And one more thing: if static gRPC library can only be linked into one DLL then any application that use gRPC [either directly or indirectly via, say, third party dependency] will be in conflict with otlp_grpc libraries. Which logically means that use of static gRPC libraries is a bad idea. Right?
Yes, in my understanding, it's safer to use either static or dynamic libraries for all dependencies, not a mix. There were issues reporting this problem when only otlp-cpp is used, so I think the only thing we can do is keep it correct when users only use and link otlp-cpp’s SDK.
@ays7 - Do you have any questions on the above comments?
Closing as discussed above.
cmake tweaks to get shlibs to link on Mac with OTLP ON