Graylog 5.0 fails to index input messages with Opensearch 2.x #14236

Closed
Ahmad-Faizan opened this issue Dec 16, 2022 · 7 comments

Ahmad-Faizan commented Dec 16, 2022

We installed Graylog using the Helm chart from the KongZ repo. Because the chart has not yet been released with support for the latest Graylog, we manually changed the image tag to 5.0.0 and deployed it alongside MongoDB 5.0.0 and OpenSearch (tested with versions 2.0.1 and 2.3.0) on Kubernetes 1.24.6 on Azure.

The deployment comes up and the Elasticsearch cluster reports a green state. There are no error logs from the Graylog, MongoDB, or OpenSearch pods.
The Graylog support matrix: https://go2docs.graylog.org/5-0/planning_your_deployment/planning_your_upgrade_to_opensearch.htm

Expected Behavior

If we add an input to Graylog, for example GELF TCP, and send a message using echo and netcat, the message should show up in the search dashboard. Instead of a GELF TCP input, we can also test with the Random Message Generator under System > Inputs > Select New Input. It should generate random messages that we can then view and search from the homepage.
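
For reference, the echo-and-netcat test can also be sketched in Python. This is a minimal sketch, not the exact command used in this report: it assumes the GELF TCP input listens on port 12201 and uses the null-byte frame delimiter, and the host name in the usage comment is a placeholder.

```python
import json
import socket


def build_gelf_frame(short_message: str, host: str = "test-client") -> bytes:
    """Encode a minimal GELF 1.1 message as a null-terminated TCP frame."""
    payload = {
        "version": "1.1",          # GELF spec version
        "host": host,              # name of the sending host
        "short_message": short_message,
    }
    # GELF over TCP separates messages with a null byte.
    return json.dumps(payload).encode("utf-8") + b"\x00"


def send_gelf_tcp(frame: bytes, server: str, port: int = 12201,
                  timeout: float = 5.0) -> None:
    """Open a TCP connection to the input and send one frame."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(frame)


# Usage (placeholder host name, adjust to your input's address):
# send_gelf_tcp(build_gelf_frame("hello from python"), "graylog-input.example.internal")
```

A successful send only shows that the input accepted the bytes. Whether the message was indexed still has to be checked in the Search tab or the Graylog logs.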

Current Behavior

If we add an input to Graylog, for example GELF TCP, and send a message using echo and netcat, the message fails to index through the bulk API.
The widget on the homepage shows this error message:

While retrieving data for this widget, the following error(s) occurred:
Elasticsearch exception [type=illegal_argument_exception, reason=key [types] is not supported in the metadata section].

The pod logs have this message:

Caught exception during bulk indexing: ElasticsearchException{message=ElasticsearchException[An error occurred: ]; nested: IOException[Unable to parse response body for Response{requestLine=POST /_bulk?timeout=1m HTTP/1.1, host=http://opensearch-cluster-master:9200, response=HTTP/1.1 200 OK}]; nested: NullPointerException;, errorDetails=[]}, retrying ( attempt #23). - {}

Since OpenSearch versions 2.0.1 and 2.3.0 are supported by Graylog 5 according to the support matrix, these indexing errors should not occur.

Possible Solution

The same issue also occurs with OpenSearch version 2.3.0.
Looking at https://opensearch.org/docs/2.3/breaking-changes/ , we think this is due to API changes affecting the Java High Level REST client used inside Graylog:
opensearch-project/OpenSearch#1940
opensearch-project/OpenSearch#2215
opensearch-project/OpenSearch#4643

Steps to Reproduce (for bugs)

  1. Install MongoDB version 5.0.0
  2. Install OpenSearch version 2.0.1 from this Helm chart
  3. Install Graylog from this Helm chart
  4. Change image tag to 5.0.0 in Graylog chart
  5. Log in to Graylog and check whether the cluster state is green under System > Overview
  6. Add an input (GELF TCP or Random Message Generator)
  7. Check received messages under the Search tab
  8. The messages are not displayed and the widget shows an error.
  9. Check the logs of the Graylog pods; they contain a bulk indexing error.

Context

We want to run Graylog with FluentBit to import logs into our cluster. FluentBit would generate and stream the logs to our https://graylog-input.mydomain.com so that we can store and index them on the latest versions of Graylog, OpenSearch and MongoDB.

Your Environment

  • Graylog Version: 5.0.0+37301e5
  • Java Version: JRE: Eclipse Adoptium 17.0.5 on Linux 5.4.0-1094-azure
  • Elasticsearch Version: OpenSearch:2.0.1
  • MongoDB Version: MongoDB 5.0.0
  • Operating System: Ubuntu 22.04.1 LTS (jammy)
  • Architecture: amd64
  • Deployment: docker
  • Cluster: Kubernetes 1.24 on Azure AKS
  • Browser version: Chrome 108.0

@dennisoelkers
Member

Hey @Ahmad-Faizan,

thanks for reporting this. We have been using Graylog 5.0 with OpenSearch 2.x for quite a while now, so it is surprising that it does not work for you. Are you sure there are no connectivity issues? The NullPointerException in the error message is particularly strange. Are you sure that Graylog is really talking to OpenSearch? Is something proxying between Graylog and OpenSearch? Is there a mismatch in your SSL/TLS config?

@Ahmad-Faizan
Author

Ahmad-Faizan commented Dec 16, 2022

Thanks for responding @dennisoelkers

I see this log line in the Graylog pod when it starts up, so I assume Graylog is able to communicate with OpenSearch.
Is there another way I should check?

[SearchDbPreflightCheck] - Connected to (Elastic/Open)Search version <OpenSearch:2.0.1> - {} 

Here is a curl request from the Graylog pod to the OpenSearch service (both are in the same namespace):

graylog@graylog-0:~$ curl http://user:password@opensearch-cluster-master:9200
{
  "name" : "opensearch-cluster-master-1",
  "cluster_name" : "opensearch-cluster",
  "cluster_uuid" : "<some-uuid>",
  "version" : {
    "distribution" : "opensearch",
    "number" : "2.0.1",
    "build_type" : "tar",
    "build_hash" : "6462a546240f6d7a158519499729bce12dc1058b",
    "build_date" : "2022-06-15T08:47:42.243126494Z",
    "build_snapshot" : false,
    "lucene_version" : "9.1.0",
    "minimum_wire_compatibility_version" : "7.10.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}
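
The same check can be scripted. The sketch below parses the body returned by the root endpoint and reports the distribution and major version. The function name is made up for illustration, and it assumes that a response without a distribution field comes from Elasticsearch.

```python
import json


def describe_backend(root_response: str) -> tuple[str, int]:
    """Return (distribution, major version) from a root endpoint response body."""
    info = json.loads(root_response)
    version = info["version"]
    # Assumption: only OpenSearch reports a "distribution" field here.
    distribution = version.get("distribution", "elasticsearch")
    major = int(version["number"].split(".")[0])
    return distribution, major
```

Feeding it the response above yields the distribution "opensearch" and major version 2, which is what Graylog also reports in its preflight log line.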

@janheise
Contributor

janheise commented Dec 16, 2022

@Ahmad-Faizan I am no expert on the Helm charts, but 'values.yaml' contains the following:

elasticsearch:
    ## Major version of the Elasticsearch version used.
    ## It is required by Graylog 4. See https://docs.graylog.org/en/4.0/pages/configuration/elasticsearch.html#available-elasticsearch-configuration-tunables
    version: "7"

Did you change it? Try removing version: "7" and see what happens.

@janheise
Contributor

janheise commented Dec 16, 2022

Elasticsearch exception [type=illegal_argument_exception, reason=key [types] is not supported in the metadata section].
is the error we should follow up on. It is probably caused by a version mismatch between the OpenSearch client library and the server.

@Ahmad-Faizan
Author

Thanks for pointing that out, @janheise; that was the root cause.
Removing version: 7 from the Helm chart as well as from values.yaml brought the Graylog cluster up in a healthy state. The logs are now also indexed properly in the OpenSearch cluster.

It was silly of me to miss this detail.
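
For anyone hitting the same thing, here is a minimal sketch of a check for a leftover version pin in Helm values that have already been loaded into a dict. The key paths graylog.elasticsearch.version and graylog.opensearch.version are assumptions inferred from the snippets in this thread, and the function name is hypothetical; verify both against the chart version you deploy.

```python
from typing import Any


def find_version_pins(values: dict[str, Any]) -> dict[str, Any]:
    """Return the search backend version pins found in the Helm values."""
    graylog = values.get("graylog") or {}
    pins: dict[str, Any] = {}
    # Assumed key paths: graylog.elasticsearch.version, graylog.opensearch.version
    for backend in ("elasticsearch", "opensearch"):
        section = graylog.get(backend) or {}
        if not isinstance(section, dict):
            continue
        pinned = section.get("version")
        # None, an empty string, or {} are treated as "no pin".
        if pinned not in (None, "", {}):
            pins[f"graylog.{backend}.version"] = pinned
    return pins
```

An empty result means no pin was found under these keys. A result such as graylog.elasticsearch.version set to "7" next to an OpenSearch 2.x server matches the misconfiguration described in this issue.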

@Cobesz

Cobesz commented Jul 28, 2023

> Thanks for pointing that out @janheise , it was the root cause. Removing the version: 7 from the Helm chart as well as the values.yaml brought the graylog cluster up in healthy state. The logs are also getting indexed properly in the OpenSearch cluster.
>
> It was silly of me to miss this detail.

How did you end up fixing it? As far as I know, you can remove the versioning by adding {} to the line in values.yaml, correct?
That does not fix the problem for me. This is my configuration:

tags:
  install-opensearch: false

graylog:
  image:
    tag: 5.1.2

  opensearch:
    version: {}
  input:
    udp:
      service:
        name: graylog-udp
        type: ClusterIP
      ports:
        - name: wazuh
          port: 5555

  config: |
    elasticsearch_index_prefix = graylog

I do get incoming messages:
[image]

This is my dashboard:
[image]

@Ahmad-Faizan
Author

@Cobesz
See https://github.com/KongZ/charts/blob/main/charts/graylog/values.yaml#L406
That setting is especially relevant if you are using Graylog 4.x with OpenSearch 1.x, but given your values.yaml configuration, you can just remove these lines and it should work:

  opensearch:
    version: {}
