Conversation
Something that's come up in the envoy review: envoyproxy/data-plane-api#386 (comment) Should we standardize on a tracer's configuration format being in json/yaml? Doing so would allow applications to specify the configuration inline.
@rnburn my 2 cents: "standardizing" is a difficult thing because each tracer's configuration is unique. If envoy needs a place in its own config that can be opaquely passed to the tracer, then it's envoy's business. E.g. it can say the "tracerConfig" property is what contains the config, and stop "standardizing" at that point. json/yaml again sounds like the concern of the containing application, not the tracer.

In Jaeger clients we expose a Configuration object that can be initialized from a config file, but doesn't have to be. In fact, in Python/JS that Configuration simply takes a dict / object as a constructor argument, which is the idiomatic way of passing nested configuration. It's harder with compiled langs like Java/Go, where we expose an object/struct that contains, say, yaml annotations. But it would be better if there was some neutral configuration API where you can just say cfg.getX(...).

Finally, where standardizing does really make sense (aside from that cfg.getX API) is in passing that configuration to the tracer factory, which is necessary for dynamic loading. So in summary ...
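A minimal C++ sketch of the kind of neutral configuration API being described here; the interface and method names are hypothetical illustrations, not anything defined by OpenTracing:

```cpp
#include <string>

// Hypothetical abstraction: the host application implements this over
// whatever config source it uses (YAML file, JSON, config store, ...),
// and the tracer only ever sees this interface.
class TracerConfigSource {
 public:
  virtual ~TracerConfigSource() = default;

  // Look up a (possibly nested) property, e.g. "sampler.type".
  // Returns the default value when the property is absent.
  virtual std::string GetString(const std::string& key,
                                const std::string& default_value) const = 0;
};

// A tracer could then configure itself without any parsing code, e.g.:
//   auto sampler_type = config.GetString("sampler.type", "const");
```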
By standardize, I only meant to specify that the configuration is in json or yaml, not to say anything about what's in the configuration. Currently, the factory class for the tracer takes a string representing the configuration. What form the string takes is the concern of both the application and the tracer, since they have to agree on the format. Ideally, in envoy's configuration you'd write something like

    ...
    tracerConfig:
      opt1: val11
      opt2:
        - val21
        - val22
    ...

and envoy would call the factory method with the string

    opt1: val11
    opt2:
      - val21
      - val22

By agreeing that the tracer's configuration is in something like yaml, it can be easily embedded into the rest of envoy's configuration and at least somewhat validated (i.e. checked that it's valid yaml) before passing it to the tracer.
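For illustration, a host-side sketch of that handoff, assuming the application uses yaml-cpp for its own configuration; the factory call at the end is a hypothetical placeholder for whatever factory API is agreed on:

```cpp
#include <yaml-cpp/yaml.h>
#include <string>

// The host application parses its own configuration with whatever library it
// already uses (yaml-cpp here), then re-serializes only the tracer's sub-tree
// and hands it to the tracer factory as an opaque YAML string.
std::string ExtractTracerConfig(const std::string& config_path) {
  YAML::Node root = YAML::LoadFile(config_path);    // full application config
  YAML::Node tracer_config = root["tracerConfig"];  // tracer's opaque sub-tree
  return YAML::Dump(tracer_config);                 // back to a YAML string
}

// Usage with a hypothetical factory:
//   auto tracer = tracer_factory->MakeTracer(ExtractTracerConfig("envoy.yaml"));
```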
I would split the concern of the config format from the config values. At minimum the Tracer factory needs to have an API for a programmatically accessible config, such as a Get method that can look up (possibly nested) properties, e.g. Get("sampler.type").
Couldn't the parsing of the configuration be handled by a library (e.g. yaml-cpp) linked in by the tracer? That would allow an accessible interface like what you're describing.
My point is that I would make the parsing of the config separate from providing the config to the tracer. The latter is best done via some API. The parsing will typically be done by the enclosing application anyway, so why make it a tracer responsibility (with all side effects like extra dependencies, different formats, etc.)?
But that would complicate the API, since tracers would need to be constructed from some sort of object that allows them to query the configuration instead of a simple string. Also, applications wouldn't have to parse the tracer's configuration: if it's specified by a file, it can be read into a string and passed to the tracer library as a string.
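A minimal sketch of the file-based case mentioned above: the application slurps the tracer's config file into a string and hands it to a hypothetical factory without parsing it.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Read the tracer's configuration file verbatim into a string; the host does
// no parsing and treats the contents as opaque.
std::string ReadTracerConfigFile(const std::string& path) {
  std::ifstream in(path);
  std::ostringstream contents;
  contents << in.rdbuf();
  return contents.str();
}

// Usage with a hypothetical factory:
//   tracer_factory->MakeTracer(ReadTracerConfigFile("/etc/tracer.yaml"));
```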
Yes, but in all scenarios that I've seen they do. Config files are typically managed by some infrastructure, which may include merging multiple files, overriding parts of them, doing variable substitution, etc. In some cases configurations are not even files, but come from some config store, who knows. The tracer doesn't need to know anything about that. The special case of a tracer having its own config file is easily included in this approach: the file name just becomes a property passed to the tracer via the overall config mechanism.

That's the whole point. It's an abstraction that allows for a lot of flexibility. The config object's API can be very simple, a single Get method that takes a nested property name (e.g. as done in spf13/viper). If you instead have a simple string instead of a config object, then each tracer needs to compile with some parsing library (likely more than one), increasing dependency hell and limiting portability if the enclosing application happens to use a different config file format or dynamic source. Also, if your API is just an opaque string, potentially read from another file, then what standardization do we even need? The string could be base64 encoded for all we care; only the tracer needs to parse it.

NB: the implementations of the config API for different config file formats could be a shared util in OpenTracing, if there is a demand for it, so that envoy and nginx, for example, would not need to re-implement them (as long as they use supported file formats).
What about nested configurations or configurations with values other than strings? Once you account for that, the API gets a lot bigger than the simple key-value lookup you're proposing. And I don't see linking in a json parsing library as being that burdensome -- it's likely something that tracers would want to do anyway (Jaeger's already linking to yaml-cpp, btw).
I am proposing that the Get function handles nested properties, e.g. Get("sampler.type"). Extending this API to return different data types is debatable. Linking parsing libraries is a problem imo. One of the important design principles for tracing libraries is to bring in as few dependencies as possible. Unless the dependencies are vendored (which sucks to maintain), more dependencies create more dependency hell (diamond deps). I wasn't following jaeger-cpp that closely, or I would have argued against linking to the yaml lib. No other Jaeger lib in other languages includes parsing; it's simply not the responsibility of the tracer to parse all possible formats of config files.
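As a rough illustration of this proposal (names are illustrative, not an existing OpenTracing API), a host-side implementation of such a dotted-key Get over an already-parsed yaml-cpp tree might look like this; the tracer would only ever call Get and never link against the parser:

```cpp
#include <yaml-cpp/yaml.h>
#include <sstream>
#include <string>

// Host-side helper: resolve a dotted property name such as "sampler.type"
// against an already-parsed YAML tree.
class YamlConfig {
 public:
  explicit YamlConfig(const YAML::Node& root) : root_(root) {}

  std::string Get(const std::string& dotted_key,
                  const std::string& default_value = "") const {
    std::istringstream keys{dotted_key};
    return Lookup(root_, keys, default_value);
  }

 private:
  static std::string Lookup(const YAML::Node& node, std::istringstream& keys,
                            const std::string& default_value) {
    std::string key;
    if (!std::getline(keys, key, '.')) {
      // Ran out of path components: return the scalar we landed on.
      return node.IsScalar() ? node.as<std::string>() : default_value;
    }
    const YAML::Node child = node[key];
    return child ? Lookup(child, keys, default_value) : default_value;
  }

  YAML::Node root_;
};

// YamlConfig cfg{YAML::Load("sampler: {type: probabilistic}")};
// cfg.Get("sampler.type");  // -> "probabilistic"
```

An equivalent adapter could be written over JSON, viper-style stores, and so on, which is what would make it a candidate for a shared util.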
@yurishkuro I'd prefer a string serialized into a standard data format for this API. A couple of reasons:

- This API is separate from the tracer config API. The latter is tracer-specific, and it makes sense for that to be a non-serialized object which does not require JSON or other parsing. But this is a generalized transport that connects an application with any tracer, and any kind of option any tracer may want.
- Any non-standard API will create difficulty for application developers, as they will have to manually implement the interface and translate all of the tracer options from their format into it. That sounds like a lot of work that can't easily be wrapped up in a shared helper library.
- Limiting the API to less than JSON - only supporting non-nested key/value pairs, for example - would rule out configurations that some tracers need.

Regarding the issue of tracers bundling in a JSON parser: the parser is only needed when the tracer is included as a shared object, not compiled in directly, so it should be possible to ship a version of your tracer that does not include a parsing library, and wrap it in a version that handles parsing and exposes a TracerFactory. If it turns out we can standardize on something more restrictive than JSON, we could write a small parsing library that tracers could include instead of a generalized JSON parser, and avoid bundling a larger dependency. Export maps can be used on linux to only expose the Tracer API, and ensure there are no dependency conflicts.
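A structural sketch of the wrapping approach described above, with hypothetical names: the core tracer exposes a typed options struct and carries no parsing dependency, and only the shared-object wrapper bundles a parser and exposes a factory.

```cpp
#include <memory>
#include <string>

// Core library: typed options, no parsing dependency at all.
struct MyTracerOptions {
  std::string service_name;
  std::string sampler_type = "const";
};

class MyTracer { /* ... */ };

inline std::unique_ptr<MyTracer> MakeMyTracer(const MyTracerOptions& /*options*/) {
  // Construct the concrete tracer from the typed options.
  return std::make_unique<MyTracer>();
}

// Plugin wrapper (built only for the shared-object distribution): bundles a
// parser, turns the opaque configuration string into MyTracerOptions, and
// exposes the factory that the host application loads.
class MyTracerFactory {
 public:
  std::unique_ptr<MyTracer> MakeTracer(const std::string& /*configuration*/) {
    MyTracerOptions options;
    // Parse the configuration string (JSON/YAML) into `options` with whatever
    // parsing library the wrapper bundles; the core library never sees it.
    return MakeMyTracer(options);
  }
};
```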
@isaachier -- does this answer everything for you?
LGTM
This PR adds support for dynamically loading tracers into Envoy. It will allow authors of OpenTracing-compliant tracers to support Envoy without requiring Envoy to bake their tracers into its build. Tracing libraries can be enabled by adding a section like this to Envoy's configuration:

    tracing:
      http:
        name: envoy.dynamic.ot
        config:
          library: /path/to/tracing/library
          config: { .... }

With this change, Envoy will be able to use Jaeger's new C++ tracing library. Some features this will allow that aren't available when using Jaeger as a collector for Zipkin:

- Propagation will work correctly with other services using Jaeger without requiring them to make configuration changes.
- Jaeger's different sampling strategies can be used.
- More efficient transports are available.

While it might also make sense sometime in the future to offer a version of Jaeger or other tracers built into Envoy, decoupled plug-in support will allow for quicker on-boarding of tracers and more experimentation in the area without making Envoy's external dependencies become unmanageable.

Notes: One challenge with using dynamically loaded libraries in Envoy is that the standard C++ library is statically compiled in with the option -static-libstdc++. If plugin libraries are linked to libstdc++ in the standard way, this creates problems: two versions of the standard library will be loaded, calls from the plugin will be intermingled between them, and many of the standard library functions will break. The approach I took with Jaeger was to have it also use -static-libstdc++ and then use an export map to ensure that its version of the standard library is used without any intermingling (see dsohowto for background on the approach). It does mean that there will be multiple copies of libstdc++ running, but from what I've heard this can be made to work. Another approach could be to add a config option to link libstdc++ dynamically.

This depends on opentracing/opentracing-cpp#45 being merged into opentracing-cpp, but I wanted to get a PR in early to get feedback on the approach in case we need to make changes to that PR.

Risk Level: Medium
Testing: I have a small example that demos this functionality using docker and a branch of jaeger. I also plan to add unit test coverage using a mock tracer library that will be added to opentracing-cpp.
Docs Changes: envoyproxy/data-plane-api#386
Release Notes: Included in PR.

Signed-off-by: Ryan Burn <[email protected]>
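For illustration only, a simplified host-side sketch of the dynamic-loading step using plain dlopen/dlsym. A zero-argument entry point returning an opaque pointer is assumed here purely for illustration; the real opentracing-cpp declaration and Envoy's actual driver code may differ.

```cpp
#include <dlfcn.h>
#include <iostream>

// Assumed, simplified signature for the tracer library's entry point.
using MakeTracerFactoryFn = void* (*)();

void* LoadTracerFactory(const char* library_path) {
  // Open the tracer library named in the host's configuration.
  void* handle = dlopen(library_path, RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) {
    std::cerr << "failed to load tracing library: " << dlerror() << "\n";
    return nullptr;
  }
  // Resolve the factory entry point exported by the library.
  auto make_factory = reinterpret_cast<MakeTracerFactoryFn>(
      dlsym(handle, "opentracing_make_tracer_factory"));
  if (make_factory == nullptr) {
    std::cerr << "missing opentracing_make_tracer_factory: " << dlerror() << "\n";
    return nullptr;
  }
  return make_factory();  // opaque TracerFactory*, owned per the library's contract
}
```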
This PR adds support for dynamic loading, allowing C++ applications instrumented with OpenTracing to load tracing libraries at runtime instead of requiring them to be linked in at compile time.

The basic idea is that tracing libraries will implement the function `opentracing_make_tracer_factory`, which constructs and returns a pointer to a `TracerFactory` that can be used to construct the library's tracers from configuration strings. To dynamically load a tracing library, an application calls the function `dynamically_load_tracing_library`, which takes in the name of the library, calls the library's `opentracing_make_tracer_factory` function, and returns the `TracerFactory`. Returning a factory like this instead of an actual tracer will allow users to potentially reconfigure tracers by constructing a replacement tracer from a different configuration string.

The `opentracing_make_tracer_factory` function is declared to have weak linkage so that multiple tracing libraries can be linked in without getting a multiple-definition error. I know there was some concern about the portability of weak linkage, but it's broadly supported by compilers on unix platforms, and something similar can be done on windows with `__declspec(selectany)`. Still, I'm open to other approaches if anyone thinks there's a better way to solve this.

I currently have an example of dynamic loading working with a branch of envoy and jaeger, which can be demoed with this docker application. I also plan on adding a mock tracer library to opentracing-cpp which will support dynamic loading and can be used to add test coverage for this new code.
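A library-side counterpart to the loader sketch above, again using the same simplified, assumed entry-point signature rather than the real opentracing-cpp declaration: a tracer library defines the symbol, and the host resolves it and calls it to obtain the factory.

```cpp
#include <string>

// Hypothetical factory for a concrete tracer.
class MyTracerFactory {
 public:
  // Construct a tracer from an opaque configuration string (e.g. YAML/JSON).
  void* MakeTracer(const std::string& /*configuration*/) {
    // Parse the configuration and construct the concrete tracer here.
    return nullptr;
  }
};

// Entry point looked up by the host via dlsym (simplified signature).
extern "C" void* opentracing_make_tracer_factory() {
  // The returned pointer stays valid for the lifetime of the loaded library.
  static MyTracerFactory factory;
  return &factory;
}
```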
Fixes #28