[core.metrics] Add support for multiple processes in ops metrics & stats API; deprecate process field #104031
Comments
Pinging @elastic/kibana-core (Team:Core) |
@joshdover @mshustov @pgayvallet Since we were discussing this topic in the RFC thread, WDYT about the proposed design here? Is |
It'd probably make sense for the metricbeat implementation to simply ingest separate documents for each process and tie them together with
Array design seems like the general direction we want to go. For differentiating the process types, maybe we add a |
I would have used a map instead of an array here, but maybe it doesn't make sense? I admit I don't have a precise idea of how the consumers are ingesting this data (and in terms of mappings, an array does make way more sense).
"processes": {
  "coordinator": { ... },
  "worker-1": { ... },
  "worker-2": { ... },
},
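For comparison, the map and array shapes carry the same information as long as each array entry includes its own name. A minimal sketch of converting between them follows; the type and function names are hypothetical and are not part of Kibana's API, and the metric fields are taken from the example payloads in this thread.

```typescript
// Hypothetical shape of one process's metrics; field names follow the
// examples in this thread, not a confirmed Kibana type.
interface ProcessMetrics {
  pid: number;
  event_loop_delay: number;
  uptime_ms: number;
}

// Map design: process name -> metrics.
type ProcessMap = Record<string, ProcessMetrics>;

// Array design: each entry carries its own name.
type NamedProcessMetrics = ProcessMetrics & { name: string };

// Flatten the map form into the array form, preserving key insertion order.
function mapToArray(processes: ProcessMap): NamedProcessMetrics[] {
  return Object.entries(processes).map(([name, metrics]) => ({ name, ...metrics }));
}

// Rebuild the map form from the array form, dropping the `name` property
// from each value because it becomes the key.
function arrayToMap(processes: NamedProcessMetrics[]): ProcessMap {
  const result: ProcessMap = {};
  for (const { name, ...metrics } of processes) {
    result[name] = metrics;
  }
  return result;
}
```

Since the conversion is lossless in both directions, the choice mostly comes down to which form is easier to map and ingest downstream.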
Yeah exactly, I guess my point was that while it would be ideal for our response to be as close to ECS as possible, there just isn't an ECS-compatible way to do what we need to do here. So my vote would be to get it as close to ECS as we can, and leave it to Metricbeat to handle ingestion.
A map would make sense from a config standpoint -- it would be more consistent with how we are approaching clustering configs -- however I tend to lean toward Josh's suggestion above of keeping a single
This has a few benefits:
So overall I think I'm +1 on Josh's proposal, however since
It might also be nice if these types/names were consistent with the ones provided in the clustering config. e.g. given this config:
node:
  enabled: true
  workers:
    foo:
      count: 2
      max_old_space_size: 1gb
    bar:
      count: 1
      max_old_space_size: 512mb
we could get these stats:
{
// ...,
"processes": [
{
"name": "coordinator",
"pid": 52646,
"memory": {...},
"event_loop_delay": 0.22967800498008728,
"uptime_ms": 1706021.930404,
},
{
"name": "worker-foo-1",
"pid": 52647,
"memory": {...},
"event_loop_delay": 0.22967800498008728,
"uptime_ms": 1706021.930404,
},
{
"name": "worker-foo-2",
"pid": 52648,
"memory": {...},
"event_loop_delay": 0.22967800498008728,
"uptime_ms": 1706021.930404,
},
{
"name": "worker-bar-1",
"pid": 52649,
"memory": {...},
"event_loop_delay": 0.22967800498008728,
"uptime_ms": 1706021.930404,
},
],
// ...,
} |
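The naming scheme in that example (coordinator, then worker-<type>-<n>) can be derived mechanically from the clustering config. A minimal sketch, assuming the workers config has the shape shown above; `WorkerTypeConfig` and `processNames` are hypothetical names used only for illustration.

```typescript
// Hypothetical mirror of one entry under `node.workers` in the config above.
interface WorkerTypeConfig {
  count: number;
  max_old_space_size: string;
}

// Derive one name per process: the coordinator first, then
// `worker-<type>-<n>` for each configured worker, numbered from 1.
function processNames(workers: Record<string, WorkerTypeConfig>): string[] {
  const names = ['coordinator'];
  for (const [type, { count }] of Object.entries(workers)) {
    for (let i = 1; i <= count; i++) {
      names.push(`worker-${type}-${i}`);
    }
  }
  return names;
}

const exampleNames = processNames({
  foo: { count: 2, max_old_space_size: '1gb' },
  bar: { count: 1, max_old_space_size: '512mb' },
});
// exampleNames: ['coordinator', 'worker-foo-1', 'worker-foo-2', 'worker-bar-1']
```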
Also, as @Bamieh pointed out, maybe while we are introducing a new field, we should consider moving to the new histogram-style This probably isn't something the metrics service should need to concern itself with, however it might make sense for the |
Part of #68626
Summary
In order to support a multi-process Kibana (RFC: #94057), we need to update the /stats REST API to support metrics for more than one process.
This endpoint, registered from the usage_collection plugin, is getting these stats from Core's metrics service (getOpsMetrics$), which is also used in the monitoring plugin for stats collection.
The /stats responses are problematic in that they contain a handful of process metrics which will differ from worker to worker. As each request could be routed to a different worker, different results may come back each time.
Ultimately we'll need to extend the API to provide per-worker stats, but in the interim, we need to at least:
- Mark the process field as deprecated
One idea
The simplest approach we could take here would be to add a processes field which is simply an array of objects in the same format as the existing process field.
We could also consider adding some other human-readable identifier (so we don't just have pid), which could be used in the future for a worker.id.
Important to note here that I checked with the ECS team, and while there are patterns for nesting child process objects which point back to a parent (process.parent), there isn't a top-down pattern for doing something like process.children[]. So whatever we end up doing here would be a custom field from an ECS perspective, though we can still maintain the same ECS-compliant body for each individual process object.
The idea would be to then make the breaking change in 8.0, which would be handled as a follow-up task (#104124).
Scope
- Deprecate process field in ops metrics & stats API
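The interim shape proposed under "One idea" could look roughly like the sketch below: the deprecated process field is kept next to the new processes array until the breaking change in 8.0. The interface and function names are hypothetical, the metric fields come from the example payloads in the comments, and the choice of which entry the legacy field mirrors is an assumption; the issue does not specify it.

```typescript
// Per-process metrics; field names follow the example payloads in this thread.
interface OpsProcessMetrics {
  pid: number;
  memory: Record<string, unknown>; // contents are elided in the thread
  event_loop_delay: number;
  uptime_ms: number;
}

// Hypothetical interim response body.
interface OpsMetricsSketch {
  /** @deprecated Kept for backwards compatibility; use `processes` instead. */
  process: OpsProcessMetrics;
  processes: OpsProcessMetrics[];
}

// Assumption: the legacy `process` field mirrors the first collected entry.
// Which process it should reflect is not specified in the issue.
function toOpsMetrics(collected: OpsProcessMetrics[]): OpsMetricsSketch {
  const [first] = collected;
  if (first === undefined) {
    throw new Error('expected metrics for at least one process');
  }
  return { process: first, processes: collected };
}
```

Existing consumers that read process keep working against this shape, while new consumers can iterate over processes.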