-
Notifications
You must be signed in to change notification settings - Fork 447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MongoDB Atlas] Process data stream #9552
[MongoDB Atlas] Process data stream #9552
Conversation
multi: false | ||
required: false | ||
show_user: false | ||
description: The request tracer logs requests and responses to the agent's local file-system for debugging configurations. Enabling this request tracing compromises security and should only be used for debugging. See [documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-httpjson.html#_request_tracer_filename) for details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put references to the agent one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We dont have any documentation for request_tracer
in fleet, I Checked with other integations and it seems that link of configuration of Individual input/module is used here. Here is one referance.
required: false | ||
show_user: false | ||
description: >- | ||
Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. This executes in the agent before the logs are parsed. See [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put references to the agent one
- name: period | ||
type: text | ||
title: Period | ||
description: Period of fetching metrics.Value of Granularity and Period will be same. Supported units for this parameter are h/m/s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats metrics.Value.
Might not be clear for first time user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate more on metrics? Period of fetching metrics
This description is same across the integrations for period, let me know what I can change here.
For value of Granularity and Period I will add details in readme and put reference here.
- name: groupId | ||
type: text | ||
title: GroupId | ||
description: Identifier string that identifies your project. Groups and projects are synonymous terms. Your group id is the same as your project id. Ex. 32b6e34b3d91647abb20e7b8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: Identifier string that identifies your project. Groups and projects are synonymous terms. Your group id is the same as your project id. Ex. 32b6e34b3d91647abb20e7b8 | |
description: Identifier string that identifies your project. Groups and projects are synonymous terms. Group id is identical to project id. Ex. 32b6e34b3d91647abb20e7b8 |
multi: false | ||
required: true | ||
show_user: true | ||
secret: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since its a private key, should we make as secret:true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Milan has one open PR where secret is true for private key but it seems that CI is falling because of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason of failure doesn't look to be this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry it was my bad, the system test is falling when secret is kept true. @milan-elastic can elaborate more here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any update on this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need to keep secret false as of now! If we'll make it true it will lead to system test failure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
System Test Failure by making it true in the manifest ?
"CACHE_BYTES_READ_INTO": 0, | ||
"CACHE_USED_BYTES": 0, | ||
"CACHE_BYTES_WRITTEN_FROM": 0, | ||
"CONNECTIONS": 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets have some values populated
? | ||
state | ||
: | ||
state.with(request("GET", state.url + "/api/atlas/v2/groups/" + state.groupId + "/processes?pageNum=" + string(state.page_num) + "&itemsPerPage=100").with({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@efd6 : Any suggestions for improvements in this program here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a cleanup (correct indentation and whitespace makes CEL code much easier to debug). It is untested.
state:
group_id: {{groupId}}
want_more: false
page_num: 1
query: /measurements?granularity=PT{{period}}&period=PT{{period}}
redact:
fields: ~
program: |
(
has(state.hostlist) && size(state.hostlist) > 0 ?
state
:
state.with(request("GET", state.url + "/api/atlas/v2/groups/" + state.group_id + "/processes?pageNum=" + string(state.page_num) + "&itemsPerPage=100").with({
"Header": {
"Accept": ["application/vnd.atlas." + string(now.getFullYear()) + "-01-01+json"]
}
}).do_request().as(resp, bytes(resp.Body).decode_json().as(body, {
"hostlist": body.results.map(e, state.url + "/api/atlas/v2/groups/" + state.group_id + "/processes/" + e.id + state.query),
"next": 0,
"page_num": body.links.exists_one(res, res.rel == "next") ? int(state.page_num)+1 : 1
})))
).as(state, state.next >= size(state.hostlist) ? {} :
request("GET", string(state.hostlist[state.next])).with({
"Header": {
"Accept": ["application/vnd.atlas." + string(timestamp(now).getFullYear()) + "-01-01+json"]
}
}).do_request().as(res, {
"events": bytes(res.Body).decode_json().as(f, f.with({"response": zip(
f.measurements.map(m, m.name),
f.measurements.map(m, m.dataPoints.map(d, d.value).as(v, size(v) == 0 ? 0 : v[0]))
)}).drop(["measurements", "links"])),
?"hostlist": (int(state.next)+1) < size(state.hostlist) ? state.hostlist : optional.none(),
"next": (int(state.next)+1) < size(state.hostlist) ? (int(state.next)+1) : 0,
"want_more": (int(state.next)+1) < size(state.hostlist) || state.page_num != 1,
"page_num": state.page_num,
"group_id": state.group_id,
"query": state.query,
})
)
- I've used optional types to replace the use of
drop_empty
; I assume that that is the only reason that it is being used. - I've aggregated the two
drop
calls into one to avoid multiple cross-language calls. - The aggregation for
events
probably wants a comment explaining what it is doing; I take it that it's assigning a vector to each of the names in measurements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@niraj-elastic : Please address this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@efd6 I have updated code according to your suggestions, I have removed drop_empty
since code works fine even without it. let me know if any other changes are required.
processors: | ||
- set: | ||
field: ecs.version | ||
value: 8.11.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required since we are now using ecs@mappings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will need to check this in detail, ecs.version got set to 8.0.0 automatically when I removed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just a tag?
This value being autoset, can create confusion in users mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our datasets are in ECS. Setting the version is a nice indicator, that the data users are looking at is fully ECS and we also use this sometimes in telemetry.
For datasets / integrations: Can it be assumed that each time an integration is released, it is on the most recent version that this version of elatsic-package supports until set otherwise. Or can we go even further, can it be assumed that the ECS version used by the integration can be set to the version that is currently shipped via Fleet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO ECS version used by Integration should be the one shipped via Fleet.
There is now an issue created for it.
type: group | ||
fields: | ||
- name: msg | ||
description: The average rate of message asserts per second over the selected sample period. These internal server errors have a well-defined text string. Atlas logs stack traces for these. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"These internal server errors"
We are using asserts first and then errors later. Lets use the same term.
metric_type: gauge | ||
unit: byte | ||
- name: write.bytes | ||
description: The maximum Disk read Latency value over the time period specified by the metric granularity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: The maximum Disk read Latency value over the time period specified by the metric granularity. | |
description: The maximum disk read latency value over the period specified by the metric granularity. |
🚀 Benchmarks reportTo see the full report comment with |
- Set up alerts to minimize Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) by quickly referencing relevant logs during troubleshooting. | ||
|
||
## Data streams | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope when we merge both logs and metrics, we will get merge the documentation appropriately as this just states anut metrics currently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, once any PR is merged we will sync the readme.
? | ||
state | ||
: | ||
state.with(request("GET", state.url + "/api/atlas/v2/groups/" + state.groupId + "/processes?pageNum=" + string(state.page_num) + "&itemsPerPage=100").with({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@niraj-elastic : Please address this
|
||
Data streams: | ||
|
||
- `process` : This data stream Collects host Metrics per process for all the hosts of the specified group. Metrics like Measurements for the host, such as CPU usage, number of I/O operations and memory are available on this data stream. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- `process` : This data stream Collects host Metrics per process for all the hosts of the specified group. Metrics like Measurements for the host, such as CPU usage, number of I/O operations and memory are available on this data stream. | |
- `process` : This data stream collects host metrics per process for all the hosts of the specified group. Metrics like measurements for the host, such as CPU usage, number of I/O operations and memory are available on this data stream. |
type: group | ||
fields: | ||
- name: max.pct | ||
description: MAX Children User CPU usage scaled to a range of 0% to 100% by dividing by the number of CPU cores. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: MAX Children User CPU usage scaled to a range of 0% to 100% by dividing by the number of CPU cores. | |
description: MAX Children user CPU usage scaled to a range of 0% to 100% by dividing by the number of CPU cores. |
Can we update the description on this section by dividing by the number of CPU cores.
to make it more readable?
This is applicable wherever we add this statement.
type: group | ||
fields: | ||
- name: deleted | ||
description: Displays the documents per second deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: Displays the documents per second deleted. | |
description: Displays the documents deleted per second. |
Applies to all the below document related metrics.
type: group | ||
fields: | ||
- name: kernel.pct | ||
description: The amount of CPU time spent by the Full-Text Search process in kernel space. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: The amount of CPU time spent by the Full-Text Search process in kernel space. | |
description: The amount of CPU time spent by the Full-Text search process in kernel space. |
Applicable wherever we are using this description.
description: Amount of COMPUTED process memory, in mebibytes (MiB). | ||
type: double | ||
metric_type: gauge | ||
- name: mapped.mb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While looking into the mongodb metrics document, I am just curious to understand why the computed memory is represented in mib
and the rest of them are in mb
. Can you please verify the differences in the type here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likely it is MB too, i have updated this field.
description: Timestamp of the latest operation entry in the oplog on the primary node. | ||
type: double | ||
metric_type: gauge | ||
unit: s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the value is timestamp as per the description, then the unit: s
has to be removed and the name of the field
should be changed or if the value is not timestamp
then we need to update the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the description.
type: group | ||
fields: | ||
- name: kb | ||
description: Physical memory used in kilobytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we have consistency over the descriptions. Please remove the extra comma added to the other field descriptions.
type: group | ||
fields: | ||
- name: max.pct | ||
description: MAX User CPU usage of processes on the host scaled to a range of 0 to 100% by dividing by the number of CPU cores. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: MAX User CPU usage of processes on the host scaled to a range of 0 to 100% by dividing by the number of CPU cores. | |
description: MAX user CPU usage of processes on the host scaled to a range of 0 to 100% by dividing by the number of CPU cores. |
packages/mongodb_atlas/manifest.yml
Outdated
version: 0.0.1 | ||
source: | ||
license: "Elastic-2.0" | ||
description: This Elastic integration collects metrics from MongoDB Atlas integration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: This Elastic integration collects metrics from MongoDB Atlas integration. | |
description: This Elastic integration collects metrics from MongoDB Atlas database. |
packages/mongodb_atlas/manifest.yml
Outdated
policy_templates: | ||
- name: mongodb_atlas | ||
title: MongoDB Atlas metrics | ||
description: Collect MongoDB Atlas and metrics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please check the description here?
@muthu-mps : Do you have any other comments? |
@niraj-elastic - While checking the process metrics, the |
"group_id": state.group_id, | ||
"query": state.query, | ||
}) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a final new line.
f.measurements.map(m, m.name), | ||
f.measurements.map(m, m.dataPoints.map(d, d.value).as(v, size(v) == 0 ? 0 : v[0])) | ||
)}).drop(["measurements", "links"])), | ||
"hostlist": (int(state.next)+1) < size(state.hostlist) ? state.hostlist : [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you not using the optional field here? Is this to avoid complication on the want_more iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed ofline, this change will bring more complication to our existing CEL script. So I am keeping it as it is for now. If we decide to change this I will raise one more PR to change this in future.
).as(state, state.next >= size(state.hostlist) ? {} : | ||
request("GET", string(state.hostlist[state.next])).with({ | ||
"Header": { | ||
"Accept": ["application/vnd.atlas." + string(timestamp(now).getFullYear()) + "-01-01+json"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed this
"Accept": ["application/vnd.atlas." + string(timestamp(now).getFullYear()) + "-01-01+json"] | |
"Accept": ["application/vnd.atlas." + string(now.getFullYear()) + "-01-01+json"] |
now
is already a timestamp, so no conversion is required.
@muthu-mps I checked the API documentation for all the metrics but I can not find reference to |
Fine, lets figure out this metric later. |
- name: input.type | ||
type: keyword | ||
description: Type of Filebeat input. | ||
- name: tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we leverage the ECS field for tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tags field falles under base fields group. hance we have kept it under base-fields.yml file. let me know if we have to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Those are not the current specifications we follow for base-fields. Then the data stream fields are not part of the base fields group. Are we removing those fields from the file? No. One of the requirement when the dynamic mapping was enabled is to retain the base-fields mentioned here.
- I would recommend to remove the
tags
field from thebase-fields.yml
.
Dashboard looks good to me |
💚 Build Succeeded
History
|
Quality Gate passedKudos, no new issues were introduced! 0 New issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Package mongodb_atlas - 0.0.1 containing this change is available at https://epr.elastic.co/search?package=mongodb_atlas |
What does this PR do?
Checklist
changelog.yml
file.How to test this PR locally
Screenshots
Related issue