Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MongoDB Atlas] Add support of mongodb database datastream #9539

Merged
merged 18 commits into from
Apr 18, 2024

Conversation

milan-elastic
Copy link
Contributor

@milan-elastic milan-elastic commented Apr 8, 2024

Create the package for Mongodb Atlas Integration and Provide the support of Mongod database logs.

Integration release checklist

This checklist is intended for integrations maintainers to ensure consistency
when creating or updating a Package, Module or Dataset for an Integration.

Along with the mongod_database PR, to keep the code changes(review comments) consistent with other log datastreams, I have made the same changes for mongod_audit datastream.

All changes

  • Change follows the contributing guidelines
  • Supported versions of the monitoring target are documented
  • Supported operating systems are documented (if applicable)
  • Integration or System tests exist
  • Documentation exists, useful guidelines to follow
  • Fields follow ECS and naming conventions
  • At least a manual test with ES / Kibana / Agent has been performed.
  • Required Kibana version set to: 8.12.0
  • TSDB enablement for the data stream

Data dataset changes

This entry is currently only recommended. It will be mandatory once we provide better support for it.

  • Sample event (sample_event.json) exists

Related Issue

Screenshots

mongodb_atlas-mongod-database-dashboard

Known Issues:

Currently Following challenges are still open to solve for this PR:

@milan-elastic milan-elastic marked this pull request as ready for review April 10, 2024 05:57
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
packages/mongodb_atlas/_dev/build/docs/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@muthu-mps muthu-mps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add different log samples in the pipeline test.

@elasticmachine
Copy link

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

@milan-elastic
Copy link
Contributor Author

Please add different log samples in the pipeline test.

Done, added more events for pipeline that we are managed to generate!

@SubhrataK
Copy link

Please move the line graphs together.

@milan-elastic milan-elastic requested a review from a team as a code owner April 16, 2024 16:50
@milan-elastic
Copy link
Contributor Author

Please move the line graphs together.

Done!

@shmsr
Copy link
Member

shmsr commented Apr 17, 2024

Review comment:

diff --git a/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml b/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
index 5b34df3d8..e6ad6f0f1 100644
--- a/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
+++ b/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
@@ -46,25 +46,27 @@ processors:
             value: '{{ _ingest.on_failure_message }}'
   - script:
       lang: painless
-      description: Provides descriptive value to the field.
+      description: Maps log severity levels to descriptive values
       tag: informative_log_level
       on_failure:
         - append:
             field: error.message
             value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
+      params:
+        severity_levels:
+          F: fatal
+          E: error
+          W: warning
+          I: informational
+          D1: debug 1
+          D2: debug 2
+          D3: debug 3
+          D4: debug 4
+          D5: debug 5
       source: |
-        Map m = new HashMap();
-        m.put("F", "fatal");
-        m.put("E", "error");
-        m.put("W", "warning");
-        m.put("I", "informational");
-        m.put("D1", "debug 1");
-        m.put("D2", "debug 2");
-        m.put("D3", "debug 3");
-        m.put("D4", "debug 4");
-        m.put("D5", "debug 5");
-        if (m.get(ctx.json?.s) != null) {
-          ctx.severity = m.get(ctx.json.s);
+        String severity = ctx.json?.s;
+        if (severity != null) {
+          ctx.log.level = params.severity_levels.getOrDefault(severity, null);
         }
   - rename:
       field: severity
@@ -101,20 +103,27 @@ processors:
   - script:
       lang: painless
       source: |-
-        boolean drop(Object o) {
-          if (o == null || o == '') {
-            return true;
+        /**
+         * Recursively drops null and empty values from the Elasticsearch doc context.
+         *
+         * @param o The object to process (can be a Map, List, or any other type).
+         */
+        void drop(Object o) {
+          if (o instanceof List) {
+            ((List) o).removeIf(item -> {
+              drop(item);
+              return item == null;
+            });
           } else if (o instanceof Map) {
-            ((Map) o).values().removeIf(v -> drop(v));
-            return (((Map) o).size() == 0);
-          } else if (o instanceof List) {
-            ((List) o).removeIf(v -> drop(v));
-            return (((List) o).length == 0);
+            ((Map) o).values().removeIf(item -> {
+              drop(item);
+              return item == null;
+            });
           }
-          return false;
         }
+
         drop(ctx);
-      description: Drops null/empty values recursively.
+      description: Drops null and empty values recursively from the Elasticsearch document context.
   - remove:
       field:
         - event.original
@@ -125,6 +134,7 @@ processors:
       field:
         - json
       ignore_missing: true
+      description: Removes temporary fields.
   - set:
       field: event.kind
       value: pipeline_error
@@ -136,4 +146,4 @@ on_failure:
   - append:
       field: event.kind
       value: pipeline_error
-      allow_duplicates: false
+      allow_duplicates: false
\ No newline at end of file

Improvements:

  1. Use params. Not Hashmaps. (See: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-using.html#prefer-params)
  2. No return value is required by drop as there's no one using it. Changed it to void. Also, made some minor changes.

@shmsr
Copy link
Member

shmsr commented Apr 17, 2024

I think there's scope to make this CEL program better. I might have a few suggestions.

@shmsr shmsr mentioned this pull request Apr 17, 2024
10 tasks
@shmsr
Copy link
Member

shmsr commented Apr 18, 2024

@milan-elastic Need some clarity on this.

While trying to improve the CEL code, I was reading MongoDB's doc as well. Here's what I saw:

  • API that we are using is deprecated. Did we see this? Support is until June of 2025. We should invest some time if there's a new API and if the new API is better. See this.
  • Also, for logName parameter in the API, there are four options. Why did you specifically go for mongodb and not mongodb-audit-log? See this.
  • Also, have we mentioned that auditing logs are not there in M0, M2, M5, or the Serverless instances?
  • Read this full. This too.
  • We can take relevant parts of this doc to improve our doc.

@milan-elastic
Copy link
Contributor Author

milan-elastic commented Apr 18, 2024

While trying to improve the CEL code, I was reading MongoDB's doc as well. Here's what I saw:

  • API that we are using is deprecated. Did we see this? Support is until June of 2025. We should invest some time if there's a new API and if the new API is better. See this.

The API itself is not deprecated; only the resource version we specified in the header during API calls will be deprecated. You can find more information about this version schema and the header here. To accommodate these changes in the version header, we have implemented a dynamic header that is based on the current year. The code for these changes is specified below.

"Header": {
              "Accept": ["application/vnd.atlas." + string(now.getFullYear()) + "-01-01+json"]
            }

  • Also, for logName parameter in the API, there are four options. Why did you specifically go for mongodb and not mongodb-audit-log? See this.

Yes, that is correct! Currently, we are collecting two types of logs from MongoDB Atlas: mongod_database logs and mongod_audit logs. To collect mongod_database logs, we use the log name mongodb, and for mongod_audit logs, we use mongodb-audit-log file. So both the logs are getting collected but under different datastreams. More bifurcation is provided here.

  • Also, have we mentioned that auditing logs are not there in M0, M2, M5, or the Serverless instances?

Sure, I will add that in Readme.

  • Read this full. This too.
  • We can take relevant parts of this doc to improve our doc.

Yes we have already visited this document and captured the information that found useful.

@milan-elastic
Copy link
Contributor Author

milan-elastic commented Apr 18, 2024

Review comment:

diff --git a/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml b/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
index 5b34df3d8..e6ad6f0f1 100644
--- a/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
+++ b/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
@@ -46,25 +46,27 @@ processors:
             value: '{{ _ingest.on_failure_message }}'
   - script:
       lang: painless
-      description: Provides descriptive value to the field.
+      description: Maps log severity levels to descriptive values
       tag: informative_log_level
       on_failure:
         - append:
             field: error.message
             value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
+      params:
+        severity_levels:
+          F: fatal
+          E: error
+          W: warning
+          I: informational
+          D1: debug 1
+          D2: debug 2
+          D3: debug 3
+          D4: debug 4
+          D5: debug 5
       source: |
-        Map m = new HashMap();
-        m.put("F", "fatal");
-        m.put("E", "error");
-        m.put("W", "warning");
-        m.put("I", "informational");
-        m.put("D1", "debug 1");
-        m.put("D2", "debug 2");
-        m.put("D3", "debug 3");
-        m.put("D4", "debug 4");
-        m.put("D5", "debug 5");
-        if (m.get(ctx.json?.s) != null) {
-          ctx.severity = m.get(ctx.json.s);
+        String severity = ctx.json?.s;
+        if (severity != null) {
+          ctx.log.level = params.severity_levels.getOrDefault(severity, null);
         }
   - rename:
       field: severity
@@ -101,20 +103,27 @@ processors:
   - script:
       lang: painless
       source: |-
-        boolean drop(Object o) {
-          if (o == null || o == '') {
-            return true;
+        /**
+         * Recursively drops null and empty values from the Elasticsearch doc context.
+         *
+         * @param o The object to process (can be a Map, List, or any other type).
+         */
+        void drop(Object o) {
+          if (o instanceof List) {
+            ((List) o).removeIf(item -> {
+              drop(item);
+              return item == null;
+            });
           } else if (o instanceof Map) {
-            ((Map) o).values().removeIf(v -> drop(v));
-            return (((Map) o).size() == 0);
-          } else if (o instanceof List) {
-            ((List) o).removeIf(v -> drop(v));
-            return (((List) o).length == 0);
+            ((Map) o).values().removeIf(item -> {
+              drop(item);
+              return item == null;
+            });
           }
-          return false;
         }
+
         drop(ctx);
-      description: Drops null/empty values recursively.
+      description: Drops null and empty values recursively from the Elasticsearch document context.
   - remove:
       field:
         - event.original
@@ -125,6 +134,7 @@ processors:
       field:
         - json
       ignore_missing: true
+      description: Removes temporary fields.
   - set:
       field: event.kind
       value: pipeline_error
@@ -136,4 +146,4 @@ on_failure:
   - append:
       field: event.kind
       value: pipeline_error
-      allow_duplicates: false
+      allow_duplicates: false
\ No newline at end of file

Improvements:

  1. Use params. Not Hashmaps. (See: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-using.html#prefer-params)
  2. No return value is required by drop as there's no one using it. Changed it to void. Also, made some minor changes.
+        /**
+         * Recursively drops null and empty values from the Elasticsearch doc context.
+         *
+         * @param o The object to process (can be a Map, List, or any other type).
+         */
        void drop(Object o) {
         if (o instanceof List) {
            ((List) o).removeIf(item -> {
              drop(item);
              return item == null;
            });
             ((Map) o).values().removeIf(item -> {
              drop(item);
              return item == null;
            });
           }
         }

Above function will not work when we have field with empty value. Like field1 = “”. Then this function will not remove this field from the event. It’s only work if field have null value. While the current script is handling mentioned case as well.

@shmsr
Copy link
Member

shmsr commented Apr 18, 2024

While trying to improve the CEL code, I was reading MongoDB's doc as well. Here's what I saw:

  • API that we are using is deprecated. Did we see this? Support is until June of 2025. We should invest some time if there's a new API and if the new API is better. See this.

The API itself is not deprecated; only the resource version we specified in the header during API calls will be deprecated. You can find more information about this version schema and the header here. To accommodate these changes in the version header, we have implemented a dynamic header that is based on the current year. The code for these changes is specified below.

"Header": {
              "Accept": ["application/vnd.atlas." + string(now.getFullYear()) + "-01-01+json"]
            }
  • Also, for logName parameter in the API, there are four options. Why did you specifically go for mongodb and not mongodb-audit-log? See this.

Yes, that is correct! Currently, we are collecting two types of logs from MongoDB Atlas: mongod_database logs and mongod_audit logs. To collect mongod_database logs, we use the log name mongodb, and for mongod_audit logs, we use mongodb-audit-log file. So both the logs are getting collected but under different datastreams. More bifurcation is provided here.

  • Also, have we mentioned that auditing logs are not there in M0, M2, M5, or the Serverless instances?

Sure, I will add that in Readme.

  • Read this full. This too.
  • We can take relevant parts of this doc to improve our doc.

Yes we have already visited this document and captured the information that found useful.

Thanks for clarifying the points ++

@shmsr
Copy link
Member

shmsr commented Apr 18, 2024

Review comment:

diff --git a/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml b/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
index 5b34df3d8..e6ad6f0f1 100644
--- a/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
+++ b/packages/mongodb_atlas/data_stream/mongod_database/elasticsearch/ingest_pipeline/default.yml
@@ -46,25 +46,27 @@ processors:
             value: '{{ _ingest.on_failure_message }}'
   - script:
       lang: painless
-      description: Provides descriptive value to the field.
+      description: Maps log severity levels to descriptive values
       tag: informative_log_level
       on_failure:
         - append:
             field: error.message
             value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
+      params:
+        severity_levels:
+          F: fatal
+          E: error
+          W: warning
+          I: informational
+          D1: debug 1
+          D2: debug 2
+          D3: debug 3
+          D4: debug 4
+          D5: debug 5
       source: |
-        Map m = new HashMap();
-        m.put("F", "fatal");
-        m.put("E", "error");
-        m.put("W", "warning");
-        m.put("I", "informational");
-        m.put("D1", "debug 1");
-        m.put("D2", "debug 2");
-        m.put("D3", "debug 3");
-        m.put("D4", "debug 4");
-        m.put("D5", "debug 5");
-        if (m.get(ctx.json?.s) != null) {
-          ctx.severity = m.get(ctx.json.s);
+        String severity = ctx.json?.s;
+        if (severity != null) {
+          ctx.log.level = params.severity_levels.getOrDefault(severity, null);
         }
   - rename:
       field: severity
@@ -101,20 +103,27 @@ processors:
   - script:
       lang: painless
       source: |-
-        boolean drop(Object o) {
-          if (o == null || o == '') {
-            return true;
+        /**
+         * Recursively drops null and empty values from the Elasticsearch doc context.
+         *
+         * @param o The object to process (can be a Map, List, or any other type).
+         */
+        void drop(Object o) {
+          if (o instanceof List) {
+            ((List) o).removeIf(item -> {
+              drop(item);
+              return item == null;
+            });
           } else if (o instanceof Map) {
-            ((Map) o).values().removeIf(v -> drop(v));
-            return (((Map) o).size() == 0);
-          } else if (o instanceof List) {
-            ((List) o).removeIf(v -> drop(v));
-            return (((List) o).length == 0);
+            ((Map) o).values().removeIf(item -> {
+              drop(item);
+              return item == null;
+            });
           }
-          return false;
         }
+
         drop(ctx);
-      description: Drops null/empty values recursively.
+      description: Drops null and empty values recursively from the Elasticsearch document context.
   - remove:
       field:
         - event.original
@@ -125,6 +134,7 @@ processors:
       field:
         - json
       ignore_missing: true
+      description: Removes temporary fields.
   - set:
       field: event.kind
       value: pipeline_error
@@ -136,4 +146,4 @@ on_failure:
   - append:
       field: event.kind
       value: pipeline_error
-      allow_duplicates: false
+      allow_duplicates: false
\ No newline at end of file

Improvements:

  1. Use params. Not Hashmaps. (See: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-using.html#prefer-params)
  2. No return value is required by drop as there's no one using it. Changed it to void. Also, made some minor changes.
+        /**
+         * Recursively drops null and empty values from the Elasticsearch doc context.
+         *
+         * @param o The object to process (can be a Map, List, or any other type).
+         */
        void drop(Object o) {
         if (o instanceof List) {
            ((List) o).removeIf(item -> {
              drop(item);
              return item == null;
            });
             ((Map) o).values().removeIf(item -> {
              drop(item);
              return item == null;
            });
           }
         }

Above function will not work when we have field with empty value. Like field1 = “”. Then this function will not remove this field from the event. It’s only work if field have null value. While the current script is handling mentioned case as well.

Okay, can we modify that program to add that condition? Also, this should also work:

void drop(Map map) {
  for (String key : map.keySet()) {
    if (map[key] instanceof Map) {
      drop(map[key]);
    } else if (map[key] == null || map[key].equals("")) {
      map.remove(key);
    }
  }
}

drop(ctx);

I just want to keep it simple and avoid returning a bool.

@milan-elastic
Copy link
Contributor Author

void drop(Map map) {
for (String key : map.keySet()) {
if (map[key] instanceof Map) {
drop(map[key]);
} else if (map[key] == null || map[key].equals("")) {
map.remove(key);
}
}
}

drop(ctx);

It seems like this script might not handle few cases, I need to test it thoroughly. Current script is working fine without any issue as of now so I'm keeping it as it is for the time being. But once I test it thoroughly, I'll update this script in other datastreams as well in separate PR.

Copy link
Contributor

@harnish-elastic harnish-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🚀

Copy link
Contributor

@niraj-elastic niraj-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@ishleenk17 ishleenk17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please recheck if there are any nit comments left.
Else Looks good!

@elasticmachine
Copy link

💚 Build Succeeded

History

@milan-elastic milan-elastic merged commit 4558683 into elastic:main Apr 18, 2024
5 checks passed
@elasticmachine
Copy link

Package mongodb_atlas - 0.0.3 containing this change is available at https://epr.elastic.co/search?package=mongodb_atlas

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants