Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Type removal] Ignore _type field in bulk request #3505

Merged
merged 2 commits into from
Jun 6, 2022

Conversation

dreamer-89
Copy link
Member

@dreamer-89 dreamer-89 commented Jun 3, 2022

Description

  1. Ignore _type field in bulk request. The change include creating bulk request parser with errorOnType set to false to accommodate _type params. It also adds back some of test cases from [Remove] type support from Bulk API #2215
  2. Revert errorOnType removal from bulk request parser done as part of 9f403ae

Related #3504
#3484

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dreamer-89 dreamer-89 requested review from a team and reta as code owners June 3, 2022 23:02
@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 72773de73f84102681cf9ac86066785cad078246
Log 5765

Reports 5765

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 69d1886
Log 5766

Reports 5766

@dreamer-89
Copy link
Member Author

dreamer-89 commented Jun 4, 2022

Gradle check job #5766 has three test failures but still reported success. Opened #3506

Tests with failures:
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=bulk/11_with_deprecated_types/empty action}
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=bulk/11_with_deprecated_types/List of strings}
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=bulk/11_with_deprecated_types/Array of objects}

1185 tests completed, 6 failed, 33 skipped
There were failing tests. See the report at: file:///var/CITOOL/workflow/OpenSearch_CI/PR_Checks/Gradle_Check/search/qa/mixed-cluster/build/reports/tests/v2.0.0%23mixedClusterTest/index.html

> Task :qa:mixed-cluster:v2.0.0#bwcTest
> Task :qa:mixed-cluster:bwcTestSnapshots
> Task :qa:mixed-cluster:check
build complete, generating: /var/CITOOL/workflow/OpenSearch_CI/PR_Checks/Gradle_Check/search/build/5766.tar.bz2

BUILD SUCCESSFUL in 39m
2594 actionable tasks: 2594 executed

All failures happens on 11_with_deprecated_types test file added back with type info.

  1. List of strings
   1> [2022-06-03T19:44:27,592][INFO ][o.o.b.MixedClusterClientYamlTestSuiteIT] [test] [p0=bulk/11_with_deprecated_types/List of strings] after test
  2> REPRODUCE WITH: ./gradlew ':qa:mixed-cluster:v2.0.0#mixedClusterTest' --tests "org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=bulk/11_with_deprecated_types/List of strings}" -Dtests.seed=E5838AF664ABA55C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=vi-VN -Dtests.timezone=America/St_Kitts -Druntime.java=17
  2> java.lang.AssertionError: Failure at [bulk/11_with_deprecated_types:124]: expected [2xx] status code but api [bulk] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]","stack_trace":"[Action/metadata line [1] contains an unknown parameter [_type]]; nested: IllegalArgumentException[Action/metadata line [1] contains an unknown parameter [_type]];\n\tat org.opensearch.OpenSearchException.guessRootCauses(OpenSearchException.java:681)\n\tat org.opensearch.OpenSearchException.generateFailureXContent(OpenSearchException.java:609)\n\tat org.opensearch.rest.BytesRestResponse.build(BytesRestResponse.java:169)\n\tat org.opensearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:130)\n\tat org.opensearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:110)\n\tat org.opensearch.rest.RestController.dispatchRequest(RestController.java:313)\n\tat org.opensearch.rest.RestController.tryAllHandlers(RestController.java:397)\n\tat org.opensearch.rest.RestController.dispatchRequest(RestController.java:240)\n\tat 
...
  1. Array of objects
org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT > test {p0=bulk/11_with_deprecated_types/Array of objects} FAILED
    java.lang.AssertionError: Failure at [bulk/11_with_deprecated_types:3]: expected [2xx] status code but api [bulk] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]","stack_trace":"[Action/metadata line [1] contains an unknown parameter [_type]]; nested: IllegalArgumentException[Action/metadata line [1] contains an unknown parameter [_type]];\n\tat org.opensearch.OpenSearchException.guessRootCauses(OpenSearchException.java:681)\n\tat org.opensearch.OpenSearchException.generateFailureXContent(OpenSearchException.java:609)\n\tat org.opensearch.rest.BytesRestResponse.build(BytesRestResponse.java:169)\n\tat org.opensearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:130)\n\tat org.opensearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:110)\n\tat org.opensearch.rest.RestController.dispatchRequest(RestController.java:313)\n\tat org.opensearch.rest.RestController.tryAllHandlers(RestController.java:397)\n\tat org.opensearch.rest.RestController.dispatchRequest(RestController.java:240)\n\tat org.opensearch.http.AbstractHttpServerTransport.dispatchRequest(AbstractHttpServerTransport.java:366)\n\tat 
...

  1. Empty action
REPRODUCE WITH: ./gradlew ':qa:mixed-cluster:v2.0.0#mixedClusterTest' --tests "org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=bulk/11_with_deprecated_types/empty action}" -Dtests.seed=E5838AF664ABA55C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=vi-VN -Dtests.timezone=America/St_Kitts -Druntime.java=17
  2> java.lang.AssertionError: Failure at [bulk/11_with_deprecated_types:112]: the error message was expected to match the provided regex but didn't
    Expected: Malformed action\/metadata line \[3\], expected FIELD_NAME but found \[END_OBJECT\]
         but: was "{root_cause=[{type=illegal_argument_exception, reason=Action/metadata line [1] contains an unknown parameter [_type], stack_trace=[Action/metadata line [1] contains an unknown parameter [_type]]; nested: IllegalArgumentException[Action/metadata line [1] contains an unknown parameter [_type]];\n\tat org.opensearch.OpenSearchException.guessRootCauses(OpenSearchException.java:681)\n\tat org.opensearch.OpenSearchException.generateFailureXContent(OpenSearchException.java:609)\n\tat org.opensearch.rest.BytesRestResponse.build(BytesRestResponse.java:169)\n\tat org.opensearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:130)\n\tat org.opensearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:110)\n\tat org.opensearch.rest.RestController.dispatchRequest(RestController.java:313)\n\tat org.opensearch.rest.RestController.tryAllHandlers(RestController.java:397)\n\tat org.opensearch.rest.RestController.dispatchRequest(RestController.java:240)\n\tat 
...

@dreamer-89
Copy link
Member Author

start gradle check

@dreamer-89
Copy link
Member Author

Failure is not reproducible when specific test/task is run. Refiring

./gradlew ':qa:mixed-cluster:v1.3.3#mixedClusterTest'

./gradlew ':qa:mixed-cluster:v1.3.3#mixedClusterTest' --tests "org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=bulk/11_with_deprecated_types/List of strings}" -Dtests.seed=96820F928959DD6A -Dtests.security.manager=true

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 69d1886
Log 5767

Reports 5767

@reta
Copy link
Collaborator

reta commented Jun 4, 2022

@dreamer-89 I thought _type has been deprecated long time ago and should not be used, even for bulk requests. I am not sure it is sustainable to deal with such ad-hoc exceptions, @nknize @dblock @andrross what do you think guys?

@andrross
Copy link
Member

andrross commented Jun 4, 2022

@reta I think REST API Versioning will give us the ability to have compatibility like this without making ad-hoc exceptions in the long term. For this particular case logstash/beats is important enough that we're inclined to make this exception. @CEHENKLE do you have anything to add to that?

@CEHENKLE
Copy link
Member

CEHENKLE commented Jun 4, 2022

@reta I think REST API Versioning will give us the ability to have compatibility like this without making ad-hoc exceptions in the long term. For this particular case logstash/beats is important enough that we're inclined to make this exception. @CEHENKLE do you have anything to add to that?

Yeah, it's causing enough pain for folks that I think it's worth it to put in this bit of no-opery until we can get to a better medium and long term solution. It doesn't mean we'll take our foot off the gas API version, but I don't want people to be broken on 2.x until we do.

Thanks,
/C

@dreamer-89
Copy link
Member Author

start gradle check

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 69d1886
Log 5775

Reports 5775

@dreamer-89
Copy link
Member Author

Tests are essentially flaky and passes on retries. Verified this on local run (report snapshot and zip attached below). Ran gradle check with scan option to get more detailed output, which shows that tests are marked flaky and succeeded on retry.

https://scans.gradle.com/s/t5ufqhxrrkyl6/tests/overview?task-path=:qa:mixed-cluster:v1.3.3%23mixedClusterTest

Screen Shot 2022-06-04 at 12 33 46 PM

Complete report:
mixedClusterTest.zip

@dreamer-89
Copy link
Member Author

dreamer-89 commented Jun 5, 2022

On first run with gradle check scan, type related tests are failing with illegal_argument_exception - Expected type [_doc] but received [type]; which is thrown while setting mapping definition for the index on node running < 2.0 version. Re-running the test to verify if this is consistent behavior.

  "stash" : {	
    "body" : {	
      "took" : 121,	
      "errors" : true,	
      "items" : [	
        {	
          "index" : {	
            "_index" : "test",	
            "_type" : "type",	
            "_id" : "yfckMIEBsPpbjF2eAHyi",	
            "status" : 400,	
            "error" : {	
              "type" : "illegal_argument_exception",	
              "reason" : "Expected type [_doc] but received [type]"	
            }	
          }	
...
  }	

@dreamer-89
Copy link
Member Author

dreamer-89 commented Jun 5, 2022

  1. https://scans.gradle.com/s/t5ufqhxrrkyl6/tests/overview?outcome=flaky --> failing with Expected type [_doc] but received [type]
  2. https://scans.gradle.com/s/ietqu6zf4ha3m --> No test failures
  3. https://scans.gradle.com/s/nglfka3vq4kha/tests/overview?outcome=flaky --> failing with unavailable_shards_exception
  | "type" : "unavailable_shards_exception", |  
  | "reason" : "[test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [2] requests and a refresh]"
  1. https://scans.gradle.com/s/jgobeogerwkig --> No test failures

Ran more scans and had similar pattern as above.

@dreamer-89
Copy link
Member Author

dreamer-89 commented Jun 5, 2022

Regarding other usages of _type in libbeat (common go package containing client and api for search engine integration); a quick search on beats shows that _type is still used in Bulk, Index, Ingest, Delete, Search and Count end-points .

@nknize
Copy link
Collaborator

nknize commented Jun 6, 2022

Regarding other usages of _type in libbeat (common go package containing client and api for search engine integration); a quick search on beats shows that _type is still used in Bulk, Index, Ingest, Delete, Search and Count end-points .

Like the Bulk endpoint, SearchURIWithBody also clears docType on version check < 8. I don't see that check in the other endpoints, though; if someone wants to double check me. This might mean we'll need to temporarily add back the type endpoint for Index, Ingest, Delete, and Count until we can get REST API Version checks merged.

Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for verifying this. Hopefully we can get REST Version API in by 2.2 to make this easier for the rest of the 2.x line.

Let's get the endpoints taken care of in a separate PR.

@nknize nknize merged commit a93ff13 into opensearch-project:2.x Jun 6, 2022
@dreamer-89
Copy link
Member Author

A quick update, I was able to reproduce the issue with filebeats (system module enabled) and OpenSearch 2.0. Verified that with this fix, filebeats process wasn't throwing any exceptions and was able to log events on OpenSearch engine.

Before fix

2022-06-06T19:03:05.859Z	INFO	[esclientleg]	eslegclient/connection.go:314	Attempting to connect to Elasticsearch version 2.0.0-SNAPSHOT
2022-06-06T19:03:05.864Z	INFO	[publisher_pipeline_output]	pipeline/output.go:151	Connection to backoff(elasticsearch(http://localhost:9200)) established
2022-06-06T19:03:05.868Z	ERROR	[elasticsearch]	elasticsearch/client.go:224	failed to perform any bulk index operations: 400 Bad Request: {“error”:{“root_cause”:[{“type”:“illegal_argument_exception”,“reason”:“Action/metadata line [1] contains an unknown parameter [_type]“}],“type”:“illegal_argument_exception”,“reason”:“Action/metadata line [1] contains an unknown parameter [_type]“},“status”:400}
2022-06-06T19:03:05.868Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer

After fix


2022-06-06T19:17:02.245Z	INFO	log/harvester.go:302	Harvester started for file: /var/log/syslog
2022-06-06T19:17:03.239Z	ERROR	[elasticsearch]	elasticsearch/client.go:224	failed to perform any bulk index operations: Post "http://localhost:9200/_bulk": dial tcp 127.0.0.1:9200: connect: connection refused
2022-06-06T19:17:03.239Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
2022-06-06T19:17:03.239Z	INFO	[publisher]	pipeline/retry.go:223	  done
2022-06-06T19:17:04.530Z	ERROR	[publisher_pipeline_output]	pipeline/output.go:180	failed to publish events: Post "http://localhost:9200/_bulk": dial tcp 127.0.0.1:9200: connect: connection refused
2022-06-06T19:17:04.530Z	INFO	[publisher_pipeline_output]	pipeline/output.go:143	Connecting to backoff(elasticsearch(http://localhost:9200))
2022-06-06T19:17:04.530Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
2022-06-06T19:17:04.530Z	INFO	[publisher]	pipeline/retry.go:223	  done
2022-06-06T19:17:08.523Z	ERROR	[publisher_pipeline_output]	pipeline/output.go:154	Failed to connect to backoff(elasticsearch(http://localhost:9200)): Get "http://localhost:9200": dial tcp 127.0.0.1:9200: connect: connection refused
2022-06-06T19:17:08.523Z	INFO	[publisher_pipeline_output]	pipeline/output.go:145	Attempting to reconnect to backoff(elasticsearch(http://localhost:9200)) with 1 reconnect attempt(s)
2022-06-06T19:17:08.523Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
2022-06-06T19:17:08.523Z	INFO	[publisher]	pipeline/retry.go:223	  done
2022-06-06T19:17:11.836Z	INFO	[monitoring]	log/log.go:144	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpuacct":{"total":{"ns":68002374694}},"memory":{"mem":{"usage":{"bytes":-200704}}}},"cpu":{"system":{"ticks":180,"time":{"ms":5}},"total":{"ticks":500,"time":{"ms":22},"value":500},"user":{"ticks":320,"time":{"ms":17}}},"handles":{"limit":{"hard":1048576,"soft":1024},"open":12},"info":{"ephemeral_id":"208aa84d-ade5-44ff-b985-d768f79620e1","uptime":{"ms":390035}},"memstats":{"gc_next":11085872,"memory_alloc":8760176,"memory_total":31727280,"rss":52125696},"runtime":{"goroutines":42}},"filebeat":{"events":{"active":4,"added":4},"harvester":{"open_files":2,"running":2,"started":1}},"libbeat":{"config":{"module":{"running":1}},"output":{"events":{"active":1,"batches":1,"total":1}},"pipeline":{"clients":2,"events":{"active":3,"filtered":1,"published":3,"retry":2,"total":4}}},"registrar":{"states":{"current":4}},"system":{"load":{"1":0.94,"15":0.45,"5":0.43,"norm":{"1":0.0131,"15":0.0063,"5":0.006}}}}}}
2022-06-06T19:17:15.543Z	ERROR	[publisher_pipeline_output]	pipeline/output.go:154	Failed to connect to backoff(elasticsearch(http://localhost:9200)): Get "http://localhost:9200": dial tcp 127.0.0.1:9200: connect: connection refused
2022-06-06T19:17:15.543Z	INFO	[publisher_pipeline_output]	pipeline/output.go:145	Attempting to reconnect to backoff(elasticsearch(http://localhost:9200)) with 2 reconnect attempt(s)
2022-06-06T19:17:15.543Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
2022-06-06T19:17:15.543Z	INFO	[publisher]	pipeline/retry.go:223	  done
2022-06-06T19:17:28.191Z	ERROR	[publisher_pipeline_output]	pipeline/output.go:154	Failed to connect to backoff(elasticsearch(http://localhost:9200)): Get "http://localhost:9200": dial tcp 127.0.0.1:9200: connect: connection refused
2022-06-06T19:17:28.191Z	INFO	[publisher_pipeline_output]	pipeline/output.go:145	Attempting to reconnect to backoff(elasticsearch(http://localhost:9200)) with 3 reconnect attempt(s)
2022-06-06T19:17:28.191Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
2022-06-06T19:17:28.191Z	INFO	[publisher]	pipeline/retry.go:223	  done
2022-06-06T19:17:41.835Z	INFO	[monitoring]	log/log.go:144	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpuacct":{"total":{"ns":105948552151}},"memory":{"mem":{"usage":{"bytes":73728}}}},"cpu":{"system":{"ticks":190,"time":{"ms":12}},"total":{"ticks":520,"time":{"ms":24},"value":520},"user":{"ticks":330,"time":{"ms":12}}},"handles":{"limit":{"hard":1048576,"soft":1024},"open":12},"info":{"ephemeral_id":"208aa84d-ade5-44ff-b985-d768f79620e1","uptime":{"ms":420035}},"memstats":{"gc_next":11748432,"memory_alloc":6157968,"memory_total":32468824,"rss":52125696},"runtime":{"goroutines":42}},"filebeat":{"harvester":{"open_files":2,"running":2}},"libbeat":{"config":{"module":{"running":1}},"output":{"events":{"active":1}},"pipeline":{"clients":2,"events":{"active":3,"retry":2}}},"registrar":{"states":{"current":4}},"system":{"load":{"1":1.15,"15":0.48,"5":0.54,"norm":{"1":0.016,"15":0.0067,"5":0.0075}}}}}}
2022-06-06T19:17:50.338Z	ERROR	[publisher_pipeline_output]	pipeline/output.go:154	Failed to connect to backoff(elasticsearch(http://localhost:9200)): Get "http://localhost:9200": dial tcp 127.0.0.1:9200: connect: connection refused
2022-06-06T19:17:50.338Z	INFO	[publisher_pipeline_output]	pipeline/output.go:145	Attempting to reconnect to backoff(elasticsearch(http://localhost:9200)) with 4 reconnect attempt(s)
2022-06-06T19:17:50.338Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
2022-06-06T19:17:50.338Z	INFO	[publisher]	pipeline/retry.go:223	  done
2022-06-06T19:17:50.407Z	INFO	[esclientleg]	eslegclient/connection.go:314	Attempting to connect to Elasticsearch version 2.0.0-SNAPSHOT
2022-06-06T19:17:50.416Z	INFO	[publisher_pipeline_output]	pipeline/output.go:151	Connection to backoff(elasticsearch(http://localhost:9200)) established -->

nknize pushed a commit that referenced this pull request Jun 10, 2022
* Revert "Revert removal of typed end-points for bulk, search, index APIs (#3524)"

This reverts commit f48043b.

* Revert "[Type removal] Ignore _type field in bulk request (#3505)"

This reverts commit a93ff13.

Signed-off-by: Suraj Singh <[email protected]>
Signed-off-by: Nicholas Walter Knize <[email protected]>
@opoplawski
Copy link

Has this made it into a released version yet? I'm still seeing issues with 2.1.0.

@dreamer-89
Copy link
Member Author

dreamer-89 commented Jul 18, 2022

Hi @opoplawski,

Thanks for checking in. The actual issue was version incompatibility b/w OpenSearch 2.0 & external clients (filebeat etc, please check #3484) and was mitigated by fix. Clients need to override override_main_response_version setting as mentioned here. Please let me know if you are still seeing any issues.

@opoplawski
Copy link

Ah, didn't realize that setting override_main_response_version was a needed part of the fix. With that in place, my winlogbeat 7.12 clients are happy. Logstach with opensearch output plugin reports:

[2022-07-18T09:01:45,701][WARN ][logstash.outputs.opensearch][main] Detected a node with a higher major version than previously observed, this could be the result of an OpenSearch cluster upgrade {:previous_major=>2, :new_major=>7, :node_url=>"XX/"}

but hopefully that won't cause any issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants