
IOOBE in TransportUpdateByQueryAction$MultipleIndexUpdateByQueryActionListener.onResponse() #22

Closed
ofavre opened this issue Aug 7, 2014 · 5 comments

@ofavre
Contributor

ofavre commented Aug 7, 2014

I got a strange error while running the tests multiple times with the same seed.

$ mvn test -Dtests.seed=385239340D701F6F
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building elasticsearch-action-updatebyquery 2.0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-enforcer-plugin:1.0:enforce (enforce-maven) @ elasticsearch-action-updatebyquery ---
[INFO] 
[INFO] --- maven-resources-plugin:2.3:resources (default-resources) @ elasticsearch-action-updatebyquery ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ elasticsearch-action-updatebyquery ---
[INFO] Compiling 1 source file to /data/Yakaz/elasticsearch-action-updatebyquery/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.3:testResources (default-testResources) @ elasticsearch-action-updatebyquery ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ elasticsearch-action-updatebyquery ---
[INFO] Compiling 1 source file to /data/Yakaz/elasticsearch-action-updatebyquery/target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ elasticsearch-action-updatebyquery ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- junit4-maven-plugin:2.0.15:junit4 (tests) @ elasticsearch-action-updatebyquery ---
[INFO] <JUnit4> says 你好! Master seed: 385239340D701F6F
Executing 1 suite with 1 JVM.

Started J0 PID(16324@TeKa-Laptop).
HEARTBEAT J0 PID(16324@TeKa-Laptop): 2014-08-07T11:20:57, stalled for 22.1s at: UpdateByQueryTests.testUpdateByQuery_multipleIndices
Suite: org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests
  1> [2014-08-07 11:20:35,254][INFO ][test                     ] Setup TestCluster [shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]] with seed [385239340D701F6F] using [2] nodes
  1> [2014-08-07 11:20:35,264][INFO ][test                     ] Test testUpdateByQuery_multipleIndices(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) started
  1> [2014-08-07 11:20:35,599][INFO ][node                     ] [node_0] version[1.1.0], pid[16324], build[2181e11/2014-03-25T15:59:51Z]
  1> [2014-08-07 11:20:35,600][INFO ][node                     ] [node_0] initializing ...
  1> [2014-08-07 11:20:35,611][INFO ][plugins                  ] [node_0] loaded [action-updatebyquery], sites []
  1> [2014-08-07 11:20:38,809][INFO ][node                     ] [node_0] initialized
  1> [2014-08-07 11:20:38,811][INFO ][node                     ] [node_0] starting ...
  1> [2014-08-07 11:20:38,937][INFO ][transport                ] [node_0] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.71:9300]}
  1> [2014-08-07 11:20:42,002][INFO ][cluster.service          ] [node_0] new_master [node_0][b7OAgitdSey5YEN3M6KRLg][TeKa-Laptop][inet[/192.168.1.71:9300]], reason: zen-disco-join (elected_as_master)
  1> [2014-08-07 11:20:42,109][INFO ][discovery                ] [node_0] shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/b7OAgitdSey5YEN3M6KRLg
  1> [2014-08-07 11:20:42,185][INFO ][http                     ] [node_0] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.71:9200]}
  1> [2014-08-07 11:20:42,192][INFO ][gateway                  ] [node_0] recovered [0] indices into cluster_state
  1> [2014-08-07 11:20:42,199][INFO ][node                     ] [node_0] started
  1> [2014-08-07 11:20:42,199][INFO ][test                     ] Start Shared Node [node_0] not shared
  1> [2014-08-07 11:20:42,221][INFO ][node                     ] [node_1] version[1.1.0], pid[16324], build[2181e11/2014-03-25T15:59:51Z]
  1> [2014-08-07 11:20:42,222][INFO ][node                     ] [node_1] initializing ...
  1> [2014-08-07 11:20:42,227][INFO ][plugins                  ] [node_1] loaded [action-updatebyquery], sites []
  1> [2014-08-07 11:20:42,952][INFO ][node                     ] [node_1] initialized
  1> [2014-08-07 11:20:42,955][INFO ][node                     ] [node_1] starting ...
  1> [2014-08-07 11:20:42,996][INFO ][transport                ] [node_1] bound_address {inet[/0:0:0:0:0:0:0:0:9301]}, publish_address {inet[/192.168.1.71:9301]}
  1> [2014-08-07 11:20:46,151][INFO ][cluster.service          ] [node_0] added {[node_1][n09X2sAeS7mdt0qzIQV2GA][TeKa-Laptop][inet[/192.168.1.71:9301]],}, reason: zen-disco-receive(join from node[[node_1][n09X2sAeS7mdt0qzIQV2GA][TeKa-Laptop][inet[/192.168.1.71:9301]]])
  1> [2014-08-07 11:20:46,171][INFO ][cluster.service          ] [node_1] detected_master [node_0][b7OAgitdSey5YEN3M6KRLg][TeKa-Laptop][inet[/192.168.1.71:9300]], added {[node_0][b7OAgitdSey5YEN3M6KRLg][TeKa-Laptop][inet[/192.168.1.71:9300]],}, reason: zen-disco-receive(from master [[node_0][b7OAgitdSey5YEN3M6KRLg][TeKa-Laptop][inet[/192.168.1.71:9300]]])
  1> [2014-08-07 11:20:46,261][INFO ][discovery                ] [node_1] shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/n09X2sAeS7mdt0qzIQV2GA
  1> [2014-08-07 11:20:46,283][INFO ][http                     ] [node_1] bound_address {inet[/0:0:0:0:0:0:0:0:9201]}, publish_address {inet[/192.168.1.71:9201]}
  1> [2014-08-07 11:20:46,284][INFO ][node                     ] [node_1] started
  1> [2014-08-07 11:20:46,285][INFO ][test                     ] Start Shared Node [node_1] not shared
  1> [2014-08-07 11:20:46,405][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_multipleIndices]: before test
  1> [2014-08-07 11:20:46,406][INFO ][test.integration.updatebyquery] --> creating index test1
  1> [2014-08-07 11:20:46,963][INFO ][cluster.metadata         ] [node_0] [test1] creating index, cause [api], shards [3]/[1], mappings [subtype1, type1]
  1> [2014-08-07 11:20:47,518][INFO ][test.integration.updatebyquery] --> creating index test2
  1> [2014-08-07 11:20:47,784][INFO ][cluster.metadata         ] [node_0] [test2] creating index, cause [api], shards [4]/[0], mappings [subtype1, type1]
  1> [2014-08-07 11:20:48,336][INFO ][cluster.metadata         ] [node_0] [test0] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:48,817][INFO ][cluster.metadata         ] [node_0] [test0] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:49,406][INFO ][cluster.metadata         ] [node_0] [test1] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:49,750][INFO ][cluster.metadata         ] [node_0] [test2] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:50,082][INFO ][cluster.metadata         ] [node_0] [test3] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:50,571][INFO ][cluster.metadata         ] [node_0] [test3] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:50,799][INFO ][cluster.metadata         ] [node_0] [test4] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:51,430][INFO ][cluster.metadata         ] [node_0] [test4] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:51,606][INFO ][cluster.metadata         ] [node_0] [test5] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:52,267][INFO ][cluster.metadata         ] [node_0] [test5] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:52,536][INFO ][cluster.metadata         ] [node_0] [test6] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:52,944][INFO ][cluster.metadata         ] [node_0] [test6] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:53,462][INFO ][cluster.metadata         ] [node_0] [test7] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:53,905][INFO ][cluster.metadata         ] [node_0] [test7] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:54,086][INFO ][cluster.metadata         ] [node_0] [test8] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:54,473][INFO ][cluster.metadata         ] [node_0] [test8] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:54,718][INFO ][cluster.metadata         ] [node_0] [test9] creating index, cause [auto(index api)], shards [10]/[0], mappings []
  1> [2014-08-07 11:20:55,627][INFO ][cluster.metadata         ] [node_0] [test9] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:20:56,856][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_multipleIndices]: cleaning up after test
  1> [2014-08-07 11:20:56,866][INFO ][cluster.metadata         ] [node_0] [test0] deleting index
  1> [2014-08-07 11:20:56,891][ERROR][action.updatebyquery     ] [node_1] [test1][0] error while executing update by query shard request
  1> org.elasticsearch.transport.RemoteTransportException: index 10
  1> Caused by: org.elasticsearch.transport.ResponseHandlerFailureTransportException: index 10
  1> Caused by: java.lang.IndexOutOfBoundsException: index 10
  1>    at java.util.concurrent.atomic.AtomicReferenceArray.checkedByteOffset(AtomicReferenceArray.java:78)
  1>    at java.util.concurrent.atomic.AtomicReferenceArray.set(AtomicReferenceArray.java:139)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$MultipleIndexUpdateByQueryActionListener.onResponse(TransportUpdateByQueryAction.java:117)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$MultipleIndexUpdateByQueryActionListener.onResponse(TransportUpdateByQueryAction.java:98)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$UpdateByQueryIndexOperationAction$ShardResponseListener.finalizeAction(TransportUpdateByQueryAction.java:395)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$UpdateByQueryIndexOperationAction$ShardResponseListener.handleResponse(TransportUpdateByQueryAction.java:373)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$UpdateByQueryIndexOperationAction$3.handleResponse(TransportUpdateByQueryAction.java:340)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$UpdateByQueryIndexOperationAction$3.handleResponse(TransportUpdateByQueryAction.java:333)
  1>    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:154)
  1>    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
  1>    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
  1>    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
  1>    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
  1>    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
  1>    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
  1>    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
  1>    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
  1>    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
  1>    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
  1>    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>    at java.lang.Thread.run(Thread.java:745)
  1> [2014-08-07 11:20:56,895][ERROR][transport.netty          ] [node_1] failed to handle exception response [org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$UpdateByQueryIndexOperationAction$3@27dbd56a]
  1> java.lang.IndexOutOfBoundsException: index 3
  1>    at java.util.concurrent.atomic.AtomicReferenceArray.checkedByteOffset(AtomicReferenceArray.java:78)
  1>    at java.util.concurrent.atomic.AtomicReferenceArray.set(AtomicReferenceArray.java:139)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$UpdateByQueryIndexOperationAction$ShardResponseListener.handleException(TransportUpdateByQueryAction.java:380)
  1>    at org.elasticsearch.action.updatebyquery.TransportUpdateByQueryAction$UpdateByQueryIndexOperationAction$3.handleException(TransportUpdateByQueryAction.java:344)
  1>    at org.elasticsearch.transport.netty.MessageChannelHandler.handleException(MessageChannelHandler.java:181)
  1>    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:159)
  1>    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
  1>    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
  1>    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
  1>    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
  1>    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
  1>    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
  1>    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  1>    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
  1>    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
  1>    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
  1>    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
  1>    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
  1>    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>    at java.lang.Thread.run(Thread.java:745)
  1> [2014-08-07 11:20:58,031][INFO ][cluster.metadata         ] [node_0] [test8] deleting index
  1> [2014-08-07 11:20:58,631][INFO ][cluster.metadata         ] [node_0] [test2] deleting index
  1> [2014-08-07 11:20:58,861][INFO ][cluster.metadata         ] [node_0] [test6] deleting index
  1> [2014-08-07 11:20:59,286][INFO ][cluster.metadata         ] [node_0] [test1] deleting index
  1> [2014-08-07 11:20:59,573][INFO ][cluster.metadata         ] [node_0] [test3] deleting index
  1> [2014-08-07 11:21:00,320][INFO ][cluster.metadata         ] [node_0] [test4] deleting index
  1> [2014-08-07 11:21:00,713][INFO ][cluster.metadata         ] [node_0] [test5] deleting index
  1> [2014-08-07 11:21:01,114][INFO ][cluster.metadata         ] [node_0] [test7] deleting index
  1> [2014-08-07 11:21:01,525][INFO ][cluster.metadata         ] [node_0] [test9] deleting index
  1> [2014-08-07 11:21:01,905][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_multipleIndices]: cleaned up after test
  1> [2014-08-07 11:21:01,905][INFO ][test                     ] Wipe data directory for all nodes locations: [data/d3/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/0, data/d1/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/0, data/d3/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/1, data/d1/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/1, data/d0/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/1, data/d0/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/0, data/d2/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/0, data/d2/shared-TeKa-Laptop-CHILD_VM=[0]-CLUSTER_SEED=[4058369109940772719]-HASH=[4DCCE58B3BA]/nodes/1]
  1> [2014-08-07 11:21:01,917][ERROR][test                     ] FAILURE  : testUpdateByQuery_multipleIndices(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests)
  1> REPRODUCE WITH  : mvn test -Dtests.seed=385239340D701F6F -Dtests.class=org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests -Dtests.method=testUpdateByQuery_multipleIndices -Dtests.prefix=tests -Dfile.encoding=UTF-8 -Duser.timezone=Europe/Paris -Des.logger.level=INFO
  1> Throwable:
  1> java.lang.AssertionError: 
  1> Expected: "test1"
  1>      got: "test2"
  1> 
  1>     __randomizedtesting.SeedInfo.seed([385239340D701F6F:2AF542E7CD9C08AE]:0)
  1>     [...org.junit.*]
  1>     org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests.testUpdateByQuery_multipleIndices(UpdateByQueryTests.java:222)
  1>     [...sun.*, com.carrotsearch.randomizedtesting.*, java.lang.reflect.*]
  1>     org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
  1>     org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
  1>     org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
  1>     [...com.carrotsearch.randomizedtesting.*]
  1>     org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
  1>     org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
  1>     org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  1>     [...com.carrotsearch.randomizedtesting.*]
  1>     org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
  1>     org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
  1>     [...com.carrotsearch.randomizedtesting.*]
  1>     org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
  1>     org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  1>     org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
  1>     org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
  1>     [...com.carrotsearch.randomizedtesting.*]
  1>     java.lang.Thread.run(Thread.java:745)
  1> 
  1> [2014-08-07 11:21:01,938][INFO ][test                     ] Test testUpdateByQuery_multipleIndices(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) finished
FAILURE 26.7s | UpdateByQueryTests.testUpdateByQuery_multipleIndices <<<
   > Throwable #1: java.lang.AssertionError: 
   > Expected: "test1"
   >      got: "test2"
   > 
   >    at __randomizedtesting.SeedInfo.seed([385239340D701F6F:2AF542E7CD9C08AE]:0)
   >    at org.junit.Assert.assertThat(Assert.java:780)
   >    at org.junit.Assert.assertThat(Assert.java:738)
   >    at org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests.testUpdateByQuery_multipleIndices(UpdateByQueryTests.java:222)
   >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >    at java.lang.reflect.Method.invoke(Method.java:606)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
   >    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
   >    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
   >    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
   >    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
   >    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
   >    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
   >    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   >    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
   >    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
   >    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
   >    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
   >    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
   >    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
   >    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
   >    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   >    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   >    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
   >    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   >    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
   >    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
   >    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
   >    at java.lang.Thread.run(Thread.java:745)
  1> [2014-08-07 11:21:01,940][INFO ][test                     ] Test testUpdateByQuery_usingAliases(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) started
  1> [2014-08-07 11:21:01,951][INFO ][plugins                  ] [transport_client_node_0] loaded [action-updatebyquery], sites []
  1> [2014-08-07 11:21:02,119][INFO ][plugins                  ] [transport_client_node_1] loaded [action-updatebyquery], sites []
  1> [2014-08-07 11:21:02,284][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_usingAliases]: before test
  1> [2014-08-07 11:21:02,318][INFO ][cluster.metadata         ] [node_0] [test] creating index, cause [api], shards [5]/[1], mappings []
  1> [2014-08-07 11:21:02,711][INFO ][cluster.metadata         ] [node_0] [test] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:21:02,963][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_usingAliases]: cleaning up after test
  1> [2014-08-07 11:21:02,967][INFO ][cluster.metadata         ] [node_0] [test] deleting index
  1> [2014-08-07 11:21:03,305][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_usingAliases]: cleaned up after test
  1> [2014-08-07 11:21:03,387][INFO ][test                     ] Test testUpdateByQuery_usingAliases(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) finished
  1> [2014-08-07 11:21:03,388][INFO ][test                     ] Test testUpdateByQuery_noMatches(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) started
  1> [2014-08-07 11:21:03,400][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_noMatches]: before test
  1> [2014-08-07 11:21:03,401][INFO ][test.integration.updatebyquery] --> creating index test
  1> [2014-08-07 11:21:03,439][INFO ][cluster.metadata         ] [node_0] [test] creating index, cause [api], shards [3]/[1], mappings [subtype1, type1]
  1> [2014-08-07 11:21:03,714][INFO ][cluster.metadata         ] [node_0] [test] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:21:03,768][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_noMatches]: cleaning up after test
  1> [2014-08-07 11:21:03,770][INFO ][cluster.metadata         ] [node_0] [test] deleting index
  1> [2014-08-07 11:21:03,923][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery_noMatches]: cleaned up after test
  1> [2014-08-07 11:21:03,926][INFO ][test                     ] Test testUpdateByQuery_noMatches(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) finished
  1> [2014-08-07 11:21:03,927][INFO ][test                     ] Test testUpdateByQuery(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) started
  1> [2014-08-07 11:21:03,934][INFO ][plugins                  ] [transport_client_node_1] loaded [action-updatebyquery], sites []
  1> [2014-08-07 11:21:04,066][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery]: before test
  1> [2014-08-07 11:21:04,067][INFO ][test.integration.updatebyquery] --> creating index test
  1> [2014-08-07 11:21:04,098][INFO ][cluster.metadata         ] [node_0] [test] creating index, cause [api], shards [2]/[0], mappings [subtype1, type1]
  1> [2014-08-07 11:21:04,423][INFO ][cluster.metadata         ] [node_0] [test] update_mapping [type1] (dynamic)
  1> [2014-08-07 11:21:04,587][INFO ][cluster.metadata         ] [node_0] [test] update_mapping [type2] (dynamic)
  1> [2014-08-07 11:21:04,817][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery]: cleaning up after test
  1> [2014-08-07 11:21:04,819][INFO ][cluster.metadata         ] [node_0] [test] deleting index
  1> [2014-08-07 11:21:04,946][INFO ][test.integration.updatebyquery] [UpdateByQueryTests#testUpdateByQuery]: cleaned up after test
  1> [2014-08-07 11:21:04,966][INFO ][test                     ] Test testUpdateByQuery(org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests) finished
Completed in 30.86s, 4 tests, 1 failure <<< FAILURES!


Tests with failures:
  - org.elasticsearch.test.integration.updatebyquery.UpdateByQueryTests.testUpdateByQuery_multipleIndices


[INFO] JVM J0:     0.73 ..    32.31 =    31.58s
[INFO] Execution time total: 32 seconds
[INFO] Tests summary: 1 suite, 4 tests, 1 failure
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 41.445s
[INFO] Finished at: Thu Aug 07 11:21:05 CEST 2014
[INFO] Final Memory: 30M/177M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.carrotsearch.randomizedtesting:junit4-maven-plugin:2.0.15:junit4 (tests) on project elasticsearch-action-updatebyquery: Execution tests of goal com.carrotsearch.randomizedtesting:junit4-maven-plugin:2.0.15:junit4 failed: /data/Yakaz/elasticsearch-action-updatebyquery/target/junit4-ant-6078308574069738634.xml:16: There were test failures: 1 suite, 4 tests, 1 failure -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException

I was unable to reproduce the failure using the command suggested in the output.

I was working on another bug caused by (or at least hinted at by) randomization, in which I had only modified the number of shards in the settings of a client().admin().indices().prepareCreate() call. That should not interfere anyway.
I was on commit 349d105.

ofavre commented Aug 7, 2014

In TransportUpdateByQueryAction.MultipleIndexUpdateByQueryActionListener.onResponse(), we can read:

        public void onResponse(IndexUpdateByQueryResponse indexUpdateByQueryResponse) {
            successFullIndexResponses.set(indexCounter.getAndIncrement(), indexUpdateByQueryResponse);
            if (indexCounter.get() == expectedNumberOfResponses) {
                finishHim();

            }
        }

        public void onFailure(Throwable e) {
            failedIndexResponses.set(indexCounter.getAndIncrement(), e);
            if (indexCounter.get() == expectedNumberOfResponses) {
                finishHim();
            }
        }

The IOOBE happened in the onResponse() function, called through ShardResponseListener.finalizeAction().

What happened is that indexCounter.getAndIncrement() somehow returned a value greater than or equal to expectedNumberOfResponses in onResponse().
That led AtomicReferenceArray.set() to throw an IOOBE, which made MessageChannelHandler.handleResponse() call onFailure() after the AtomicInteger indexCounter had already been incremented; ShardResponseListener.handleException() then called indexCounter.getAndIncrement() once more, overflowing in turn.

            void handleResponse(ShardUpdateByQueryResponse response) {
                shardResponses.set(indexCounter.getAndIncrement(), response);
                if (indexCounter.get() == numberOfExpectedShardResponses) {
                    finalizeAction();
                }
            }

            void handleException(Throwable e, ShardRouting shard) {
                logger.error("[{}][{}] error while executing update by query shard request", e, request.index(), shard.id());
                String failure = ExceptionsHelper.detailedMessage(e);
                shardResponses.set(indexCounter.getAndIncrement(), new ShardUpdateByQueryResponse(shard.id(), failure));
                if (indexCounter.get() == numberOfExpectedShardResponses) {
                    finalizeAction();
                }
            }
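Seen in isolation, the reported exception is exactly what AtomicReferenceArray produces when a counter overruns the backing array: set() throws an IndexOutOfBoundsException as soon as the index reaches the array length. A minimal standalone sketch (hypothetical class name, not plugin code):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical demo: a counter that has already overrun a length-2 response array.
class OverflowDemo {
    static boolean overflows() {
        AtomicReferenceArray<String> responses = new AtomicReferenceArray<>(2);
        AtomicInteger indexCounter = new AtomicInteger(2); // already past the last valid slot
        try {
            // Same call shape as onResponse(): set() at index 2 of a length-2 array
            responses.set(indexCounter.getAndIncrement(), "late response");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true; // the IOOBE reported in this issue
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows()); // prints "true"
    }
}
```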

ofavre commented Aug 7, 2014

There is a race condition in the following listener idiom:

final int numberOfExpectedResponses;
final AtomicInteger indexCounter = new AtomicInteger();
void noteResponse(Response response) {
    responses.set(indexCounter.getAndIncrement(), response);
    if (indexCounter.get() == numberOfExpectedResponses) {
        finalizeAction();
    }
}

Suppose the following interleaving of operations on two threads:

  • indexCounter = 0
  • numberOfExpectedResponses = 2
  • Thread A: getAndIncrement() → 0, indexCounter is now 1
  • Thread B: getAndIncrement() → 1, indexCounter is now 2
  • Thread A: set(0)
  • Thread A: get() → 2
  • Thread A: finalizeAction()
  • Thread A: return
  • Thread B: set(1)
  • Thread B: get() → 2
  • Thread B: finalizeAction()
  • Thread B: return

This exhibits two problems:

  • finalizeAction() is called by thread A before thread B's set(1) has executed, so the response array is read while still incomplete.
  • finalizeAction() is called twice.
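The interleaving above can be replayed deterministically on a single thread; this standalone sketch (hypothetical, not the plugin's actual code) performs the same operations in the listed order and exhibits both problems:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

class RaceReplay {
    static final int numberOfExpectedResponses = 2;
    static final AtomicInteger indexCounter = new AtomicInteger();
    static final AtomicReferenceArray<String> responses =
            new AtomicReferenceArray<>(numberOfExpectedResponses);
    static int finalizeCalls = 0;
    static boolean finalizedBeforeAllSet = false;

    static void finalizeAction() {
        finalizeCalls++;
        if (responses.get(1) == null) {
            finalizedBeforeAllSet = true; // problem 1: array read while incomplete
        }
    }

    public static void main(String[] args) {
        int slotA = indexCounter.getAndIncrement(); // Thread A draws slot 0
        int slotB = indexCounter.getAndIncrement(); // Thread B draws slot 1; counter is now 2
        responses.set(slotA, "A");
        if (indexCounter.get() == numberOfExpectedResponses) finalizeAction(); // A finalizes early
        responses.set(slotB, "B");
        if (indexCounter.get() == numberOfExpectedResponses) finalizeAction(); // problem 2: second call
        System.out.println(finalizeCalls);         // prints "2"
        System.out.println(finalizedBeforeAllSet); // prints "true"
    }
}
```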

A possible fix would be to use another AtomicInteger, incremented and compared to the number of expected responses only after the response is set in the array:

final int numberOfExpectedResponses;
final AtomicInteger indexCounter = new AtomicInteger();
final AtomicInteger writeCounter = new AtomicInteger();
void noteResponse(Response response) {
    responses.set(indexCounter.getAndIncrement(), response);
    if (writeCounter.incrementAndGet() == numberOfExpectedResponses) {
        finalizeAction();
    }
}

This way, every index handed out by indexCounter is still used by a single thread, as before, but now exactly one thread takes the responsibility to finalize, and it can only do so after all values have been written.
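The two-counter fix can be sketched as a runnable standalone class (hypothetical names, not the actual patch); under concurrent responses, finalizeAction() fires exactly once, and only after every slot has been written:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

class TwoCounterListener {
    final int numberOfExpectedResponses;
    final AtomicInteger indexCounter = new AtomicInteger(); // hands out unique slots
    final AtomicInteger writeCounter = new AtomicInteger(); // counts completed writes
    final AtomicReferenceArray<String> responses;
    final AtomicInteger finalizeCalls = new AtomicInteger();

    TwoCounterListener(int expected) {
        numberOfExpectedResponses = expected;
        responses = new AtomicReferenceArray<>(expected);
    }

    void noteResponse(String response) {
        responses.set(indexCounter.getAndIncrement(), response);
        // Incremented only after the write above, so the single thread that
        // observes the final count knows every slot has been written.
        if (writeCounter.incrementAndGet() == numberOfExpectedResponses) {
            finalizeAction();
        }
    }

    void finalizeAction() {
        finalizeCalls.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        int n = 64;
        TwoCounterListener listener = new TwoCounterListener(n);
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int id = i;
            threads[i] = new Thread(() -> listener.noteResponse("response-" + id));
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(listener.finalizeCalls.get()); // prints "1"
    }
}
```

Only one thread can be the one whose incrementAndGet() returns numberOfExpectedResponses, and the atomic increment orders it after that thread's own set(), so the finalizing thread never observes an incomplete array.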

ofavre commented Aug 7, 2014

@martijnvg Shouldn't TransportUpdateByQueryAction extend some more specialized pre-existent classes like TransportIndexReplicationOperationAction or TransportIndicesReplicationOperationAction?
They seem to implement the fix I proposed above, through their completionCounter variable.

The initial problem may not be directly related to this one, as I still can't explain why set() was called with an index causing the overflow.
It's likely that some error arose while calling finalizeAction().

ofavre commented Aug 7, 2014

Maybe it was simply a race condition in ShardResponseListener.handleResponse() that caused finalizeAction() to be called twice, making MultipleIndexUpdateByQueryActionListener.onResponse() write one response too many; that raised an IOOBE, which in turn cascaded into another IOOBE in ShardResponseListener.handleException().
This scenario would explain the encountered problem.

ofavre closed this as completed in 087a7fb on Aug 7, 2014
martijnvg (Contributor) commented:

@ofavre Those base classes are actually meant to perform a write and replicate it. The TransportUpdateByQueryAction is effectively a preprocessing step, where we just run a query on the primary shard and put the matching ids into a bulk request. So logically it shouldn't extend one of those base classes.
