-
Notifications
You must be signed in to change notification settings - Fork 462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove lightweight modules in packages #3188
Comments
++ on moving to a fully package based approach. There is another benefit of this that the data collection part on the edge becomes less complex and needs less changes. This might make it easier that also newer version of Elastic Agent can ship data to older versions of Elasticsearch as the data format shipped stays the same. What matters is only the package that is installed. On side effect of 2 is that we will use the rename processor a lot more (see https://github.com/elastic/beats/blob/main/x-pack/metricbeat/module/aws/rds/manifest.yml#L79 as an example). Lets make sure to involve the es-perf team ( @DJRickyB ) in this. I'm not too concerned on the metrics front about this change as my understanding is the number of events is not very high compared to logs and we do much heavier processing on the logs side but it is worth investigating. It might be useful to have a bulk rename processor that could also make it easier to write these renames. |
One problem showed up is: before moving configuration from Beats to Integrations, EC2 metrics data stream generates events with field:
With moving the configuration into integrations, EC2 metrics data stream generates events with field:
This is caused by Should we add a rename processor to rename |
The rename processor is the low hanging fruit here. More fundamental, I wonder if we still need the metricset.name field in the first place. Are we using it for any queries? In general it should be replaced by queries on |
@ruflin Agreed. |
Should we add a removal step in the ingest pipeline to remove |
@endorama Good question! This field does tell us which metricset is it using from Metricbeat. This way, user knows we are using the cloudwatch metricset in the background instead of the ec2 metricset for example. Not sure how useful this is but since this field exist across all integrations, maybe we can keep it for now? |
@kaiyan-sheng @endorama @zmoog i think everything is done here - shall we close this issue once and for all? |
@tommyers-elastic I just went though our GCP package and found these data streams are added but not with lightweight module removed. I will move these into the issue description.
|
@kaiyan-sheng shall we communicate this effort with the wider audience of integrations' teams? This would help to advertise the recommended way when it comes to the usage (or non usage :)) of leighweight modules. We discovered this while implementing/reviewing at #4253 (comment). Also I guess the long term plan would be to use input packages (when they are ready). Relevant content: elastic/package-spec#319 (comment) and elastic/package-spec#404 |
++, lets make sure it is document in a guide around how to build integrations. |
@ruflin +1! I also suggested working on documentation for how to build integrations (especially with lightweight module) for our team for 8.6.0 planning. |
@tommyers-elastic I think this meta issue is ready to be closed now. |
Background
Before Elastic Agent and packages existed we introduced lightweight modules in Metricbeat to simplify the development of new metricsets and packages. The goal was to move to pure config defined modules as we have in Filebeat. Over time we have been making more and more use of it, for example in the AWS package. This is great as it reduces the number of code changes we have to make.
With the introduction of elastic packages, it has become even more important to have packages purely based on configurations because that is what is delivered as part of the package. Unfortunately many modules use lightweight modules but the packages we build also depend on the lightweight modules to be there, meaning the code/config files have to be contributed to the beats repository to work. This makes it easier to keep packages and modules in sync but it also means we keep adding to the beats repository. This issue is caused by running the migration script from Metricbeat to integrations.
Actions
rds
: we should move the configuration part from Beats into integrations.Impacted Integrations (wip)
gcp.billing
[GCP] Remove lightweight module ingcp.billing
data stream #3720gcp.compute
[GCP] Remove lightweight module ingcp.compute
data stream #3721gcp.firestore
[GCP] Remove lightweight module ingcp.firestore
data stream #3722gcp.storage
[GCP] Remove lightweight module in gcp.storage data stream #4262gcp.pubsub
[GCP] Remove lightweight module in gcp.pubsub data stream #4263gcp.loadbalancing
[GCP] Remove lightweight module in gcp.loadbalancing data stream #4264gcp.dataproc
[GCP] Remove lightweight module in gcp.dataproc data stream #4265The text was updated successfully, but these errors were encountered: