feat: add opentelemetry tracing and metrics #202
tracer: Arc::new(BoxedTracer::new(Box::new(NoopTracer::new()))),
tracer_context: Arc::new(Context::new()),
process_context: Context::new(),
meter_provider: GlobalMeterProvider::new(NoopMeterProvider::new()),
logger: Arc::new(
    env_logger::Builder::new()
        .filter_level(log::LevelFilter::Off)
        .build(),
),
I haven't fully understood new_dist_state, so I've made all logging for it a no-op.
Is there any advice on what I should do here? Should logging to the terminal print on the control server?
So dist state only has references to control client and "node". Node implements communication with other nodes (for example, it receives a spawn command from another node). Control client talks to the control server.
We can leave it out for now.
@@ -127,6 +130,7 @@ anyhow = "1.0"
bincode = "1.3"
dashmap = "5.4"
log = "0.4"
opentelemetry = { version = "0.19", git = "https://github.com/tqwewe/opentelemetry-rust", branch = "cow", features = ["metrics"] }
Unfortunately, this PR uses some changes in opentelemetry-rust which are not yet published.
The two PRs are:
open-telemetry/opentelemetry-rust#1009
open-telemetry/opentelemetry-rust#1018
I saw the PRs got merged. Do they plan to release a newer version soon?
Looking at their previous releases, it seems they don't push releases very frequently :(
I've just opened a discussion about it; hopefully we can get more insight there:
open-telemetry/opentelemetry-rust#1031
Overall it looks good! All metrics include some attributes like node id, environment id, and process id. They are not implicitly set, right? We have to set them on each call? We also want to set them in the VM, not the guest, so that they can be trusted.
Actually, the process_id and environment_id are not attached to every trace/metric, only to the parent spans. I'll work on injecting this data into every span/log/metric.
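To make the "set in the VM, not the guest" idea concrete, here is a minimal std-only sketch of host-side attribute injection. It does not use the opentelemetry crate and is not the code in this PR; `ProcessIdentity`, `DataPoint`, `record`, and the attribute key names are all hypothetical, chosen only for illustration. The point it shows is the ordering: guest attributes are collected first and the host-owned values are written last, so a guest cannot override them.

```rust
use std::collections::BTreeMap;

/// Identity that only the host (VM) knows for certain.
/// Hypothetical type, not part of lunatic or opentelemetry.
#[derive(Clone, Debug)]
pub struct ProcessIdentity {
    pub node_id: u64,
    pub environment_id: u64,
    pub process_id: u64,
}

/// One recorded measurement together with its final attribute set.
#[derive(Debug, PartialEq)]
pub struct DataPoint {
    pub name: String,
    pub value: u64,
    pub attributes: BTreeMap<String, String>,
}

/// Build a data point from guest-supplied attributes plus host identity.
/// Host attributes are inserted after the guest's, so any guest attempt to
/// set `node_id`, `environment_id`, or `process_id` is overwritten.
pub fn record(
    identity: &ProcessIdentity,
    name: &str,
    value: u64,
    guest_attributes: &[(&str, &str)],
) -> DataPoint {
    // Start from whatever the guest passed in (untrusted).
    let mut attributes: BTreeMap<String, String> = guest_attributes
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();

    // Overwrite with trusted values known to the host.
    attributes.insert("node_id".to_string(), identity.node_id.to_string());
    attributes.insert(
        "environment_id".to_string(),
        identity.environment_id.to_string(),
    );
    attributes.insert("process_id".to_string(), identity.process_id.to_string());

    DataPoint {
        name: name.to_string(),
        value,
        attributes,
    }
}
```

Calling `record` with a guest attribute named `process_id` yields a data point whose `process_id` is still the host's value, while unrelated guest attributes pass through unchanged.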
Todo:
target_info. (Turns out it's not possible)
Add push and take functions for resource sharing. Though, this should not "move" the resource; it would probably be better to allow multiple processes to share metric resources. I don't think this is needed in this PR for now. Spans cannot be shared across processes, as they are in a tree structure, and sharing them would make it possible to drop a parent span before its child, which wouldn't make sense.
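The span-ordering constraint mentioned above can be illustrated with a toy model. This is only a sketch under stated assumptions: `SpanTree`, `EndError`, `start`, and `end` are hypothetical names, and this is not how lunatic or opentelemetry store spans. It tracks open spans by id and refuses to end a span while any of its children is still open, which is the invariant that sharing spans between processes would put at risk.

```rust
use std::collections::HashMap;

/// Why a span could not be ended. Hypothetical error type.
#[derive(Debug, PartialEq)]
pub enum EndError {
    /// The id is not an open span (never started, or already ended).
    UnknownSpan,
    /// At least one child of this span is still open.
    HasOpenChildren,
}

/// Toy registry of open spans: span id -> parent id (None for a root span).
#[derive(Default)]
pub struct SpanTree {
    next_id: u64,
    open: HashMap<u64, Option<u64>>,
}

impl SpanTree {
    /// Open a new span under `parent` and return its id.
    pub fn start(&mut self, parent: Option<u64>) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.open.insert(id, parent);
        id
    }

    /// End a span. Fails if the span is unknown or still has open children,
    /// so a parent can never be dropped before its child.
    pub fn end(&mut self, id: u64) -> Result<(), EndError> {
        if !self.open.contains_key(&id) {
            return Err(EndError::UnknownSpan);
        }
        if self.open.values().any(|parent| *parent == Some(id)) {
            return Err(EndError::HasOpenChildren);
        }
        self.open.remove(&id);
        Ok(())
    }
}
```

With a parent and a child open, ending the parent first returns `Err(EndError::HasOpenChildren)`; ending the child and then the parent succeeds. If two processes could each hold and drop the same spans independently, nothing would enforce that order.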
Running the spawn process benchmark, this PR does not seem to affect performance of spawning processes.
Related PRs:
open-telemetry/opentelemetry-rust#1009
open-telemetry/opentelemetry-rust#1018
Screenshots for examples/metrics.rs
https://github.com/lunatic-solutions/lunatic-rs/blob/4681561eb78d1164bc1b2eef7c436bcab36622ab/examples/metrics.rs#L21-L78
Terminal
Jaeger
Prometheus