[CI] RollupActionIT testRollupIndexAndSetNewRollupPolicy failing #68609

Closed
mark-vieira opened this issue Feb 5, 2021 · 15 comments · Fixed by #68878 or #87269
Assignees
Labels
:Data Management/ILM+SLM (Index and Snapshot lifecycle management)
:StorageEngine/Rollup (Turn fine-grained time-based data into coarser-grained data)
Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo))
Team:Data Management (Meta label for data/management team)
>test-failure (Triaged test failures from CI)

Comments

@mark-vieira
Contributor

mark-vieira commented Feb 5, 2021

This is a new test case which has failed a few times now since being added.

Build scan:
https://gradle-enterprise.elastic.co/s/nckptifuyp4v2/tests/:x-pack:plugin:ilm:qa:multi-node:javaRestTest/org.elasticsearch.xpack.ilm.actions.RollupActionIT/testRollupIndexAndSetNewRollupPolicy#1

Repro line:
./gradlew ':x-pack:plugin:ilm:qa:multi-node:javaRestTest' --tests "org.elasticsearch.xpack.ilm.actions.RollupActionIT.testRollupIndexAndSetNewRollupPolicy" -Dtests.seed=F0A4BF22F8036B7E -Dtests.security.manager=true -Dtests.locale=sk-SK -Dtests.timezone=Pacific/Truk -Druntime.java=11

Reproduces locally?:
Nope.

Applicable branches:
master and 7.x (error on the 7.x build looked a bit different though)

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?search.relativeStartTime=P7D&search.timeZoneId=America/Los_Angeles&tests.container=org.elasticsearch.xpack.ilm.actions.RollupActionIT&tests.sortField=FAILED&tests.unstableOnly=true

Failure excerpt:

org.elasticsearch.xpack.ilm.actions.RollupActionIT > testRollupIndexAndSetNewRollupPolicy FAILED
    java.lang.AssertionError
        at __randomizedtesting.SeedInfo.seed([F0A4BF22F8036B7E:7FF7996A3D79E9A]:0)
mark-vieira added the >test-failure and :Data Management/ILM+SLM labels Feb 5, 2021
elasticmachine added the Team:Data Management label Feb 5, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@mark-vieira
Contributor Author

Since this is a newly added test, I've muted it in master and 7.x.

@mark-vieira
Contributor Author

@andreidan, since you added this test, do you mind taking a look?

@dakrone
Member

dakrone commented Feb 5, 2021

Actually, I think @talevy might be better, since he recently did the work that added rollup to ILM. @talevy, can you take a look at this?

(Andrei is on the git blame because of moving the tests around)

@talevy
Contributor

talevy commented Feb 5, 2021

Yes, I'll investigate. Thanks for the ping and for muting the test!

talevy self-assigned this Feb 5, 2021
talevy added the :StorageEngine/Rollup label Feb 5, 2021
elasticmachine added the Team:Analytics label Feb 5, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@mark-vieira
Contributor Author

> (Andrei is on the git blame because of moving the tests around)

Ah, thanks for the clarification @dakrone. The "old" test has failed in a similar manner before as well: https://gradle-enterprise.elastic.co/s/5o6gpcxlp5uri/tests/:x-pack:plugin:ilm:qa:multi-node:javaRestTest/org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT/testRollupIndexAndSetNewRollupPolicy#1

@mark-vieira
Contributor Author

FYI, I've seen other builds with slightly different test failures as well, but the common thing I see is this error in some of the node logs:

» [2021-02-05T19:26:13,436][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [javaRestTest-0] fatal error in thread [elasticsearch[javaRestTest-0][rollup_indexing][T#1]], exiting
»  java.lang.AssertionError: null
»  	at org.elasticsearch.xpack.rollup.v2.RollupShardIndexer.checkCleanDirectory(RollupShardIndexer.java:207) ~[?:?]
»  	at org.elasticsearch.xpack.rollup.v2.RollupShardIndexer.execute(RollupShardIndexer.java:196) ~[?:?]
»  	at org.elasticsearch.xpack.rollup.v2.TransportRollupIndexerAction.shardOperation(TransportRollupIndexerAction.java:112) ~[?:?]
»  	at org.elasticsearch.xpack.rollup.v2.TransportRollupIndexerAction.shardOperation(TransportRollupIndexerAction.java:46) ~[?:?]
»  	at org.elasticsearch.action.support.broadcast.TransportBroadcastAction.lambda$asyncShardOperation$0(TransportBroadcastAction.java:291) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:728) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
»  	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
»  	at java.lang.Thread.run(Thread.java:834) [?:?]

I think these might all be related.

@mark-vieira
Contributor Author

Also, @talevy, we might want to improve some of the error messages in there, as per what Lee and I talked about in Slack.

https://elastic.slack.com/archives/C0D8ST60Y/p1612563629493100

@mark-vieira
Contributor Author

Yeah, pretty much every build failing on task :x-pack:plugin:ilm:qa:multi-node:javaRestTest is giving me this on at least one of the nodes:

» [2021-02-05T14:52:12,154][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [javaRestTest-3] fatal error in thread [elasticsearch[javaRestTest-3][rollup_indexing][T#1]], exiting
»  java.lang.AssertionError: null
»  	at org.elasticsearch.xpack.rollup.v2.RollupShardIndexer.checkCleanDirectory(RollupShardIndexer.java:207) ~[?:?]

The confusing bit is that, depending on when it happens and which node the tests happen to hit, you'll get different tests failing with different errors. I know very little about this code, but this assertion failure in RollupShardIndexer.checkCleanDirectory looks like the root cause here.

If we could prioritize a fix here that would be great since it's causing a whole host of build failures and it's not practical to mute them all unless we literally mute this entire project.
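
To illustrate why one failed assertion can show up as unrelated test failures: in plain Java, an AssertionError thrown inside a task passed to execute() escapes the worker thread and reaches that thread's uncaught exception handler. The node logs above show Elasticsearch's handler treating this as fatal and exiting. The sketch below uses only the JDK; the class name UncaughtAssertionDemo, the thread name, and the handler body are assumptions for illustration, not Elasticsearch code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; not part of Elasticsearch.
class UncaughtAssertionDemo {

    // Runs one failing task on a pool thread and returns whatever the thread's
    // uncaught exception handler received (null if nothing arrived in time).
    static Throwable runFailingTask() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch handled = new CountDownLatch(1);

        ThreadFactory factory = task -> {
            Thread thread = new Thread(task, "rollup_indexing-demo");
            // A real node might log and exit here; the demo only records the error.
            thread.setUncaughtExceptionHandler((t, error) -> {
                seen.set(error);
                handled.countDown();
            });
            return thread;
        };

        ExecutorService pool = Executors.newSingleThreadExecutor(factory);
        try {
            // execute() (unlike submit()) lets the Error escape the worker thread.
            pool.execute(() -> {
                throw new AssertionError("temporary directory is not clean");
            });
            handled.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("handler received: " + runFailingTask());
    }
}
```

If the handler terminates the process, as the log lines suggest, every test that happens to talk to that node afterwards fails in its own way.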

talevy added a commit to talevy/elasticsearch that referenced this issue Feb 11, 2021
This commit removes the assertion in RollupShardIndexer that verifies that
temporary files are deleted. Since it is the responsibility of the indexer
to instruct the OS to delete files, it may not do so in a timely manner. This
results in a potentially flaky assertion. Instead, a new unit test is introduced
that will introspect the indexer and assert that it had successfully called
for the files to be deleted.

Closes elastic#68609.
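
The approach described in this commit message, checking that deletion was requested instead of checking that the files are already gone, could look roughly like the sketch below. This is not the code from the commit: TempFileTracker, createTemp, close, and deleteRequestedForAll are hypothetical names, and the sketch relies only on java.nio.file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for an indexer that writes temporary files.
class TempFileTracker {
    private final Set<Path> created = new LinkedHashSet<>();
    private final Set<Path> deleteRequested = new LinkedHashSet<>();

    Path createTemp(Path dir) {
        try {
            Path file = Files.createTempFile(dir, "rollup-", ".tmp");
            created.add(file);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Records each delete request, then asks the filesystem to delete the file.
    void close() {
        for (Path file : created) {
            deleteRequested.add(file);
            try {
                Files.deleteIfExists(file);
            } catch (IOException e) {
                // Best effort: a test checks the recorded request, not the disk.
            }
        }
    }

    // What a unit test can assert on without racing the operating system.
    boolean deleteRequestedForAll() {
        return !created.isEmpty() && deleteRequested.containsAll(created);
    }

    static boolean demo() {
        Path dir;
        try {
            dir = Files.createTempDirectory("tracker-demo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        TempFileTracker tracker = new TempFileTracker();
        tracker.createTemp(dir);
        tracker.createTemp(dir);
        tracker.close();
        try {
            Files.deleteIfExists(dir);
        } catch (IOException e) {
            // Cleanup of the demo directory is best effort as well.
        }
        return tracker.deleteRequestedForAll();
    }

    public static void main(String[] args) {
        System.out.println("delete requested for all temp files: " + demo());
    }
}
```

The design choice is that the check depends only on what the code did, so it cannot fail merely because the filesystem was slow to remove a file.
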
talevy added a commit that referenced this issue Feb 11, 2021
talevy added a commit to talevy/elasticsearch that referenced this issue Feb 16, 2021
@mark-vieira
Contributor Author

This continues to fail on occasion, but now with a different error:

org.elasticsearch.xpack.ilm.actions.RollupActionIT > testRollupIndexAndSetNewRollupPolicy FAILED
    java.lang.RuntimeException: failed to delete policy: policy-rGJwI
        at __randomizedtesting.SeedInfo.seed([36BD55660610A9B6:C1E693D25DC45C52]:0)
        at org.elasticsearch.test.rest.ESRestTestCase.lambda$deleteAllILMPolicies$10(ESRestTestCase.java:868)

        Caused by:
        org.elasticsearch.client.ResponseException: method [DELETE], host [http://127.0.0.1:37291], URI [/_ilm/policy/policy-rGJwI], status line [HTTP/1.1 400 Bad Request]
        {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Cannot delete policy [policy-rGJwI]. It is in use by one or more indices: [index-wvjliyojez, rollup-index-wvjliyojez-fcxqwqicrziimbg_kkzlfa]"}],"type":"illegal_argument_exception","reason":"Cannot delete policy [policy-rGJwI]. It is in use by one or more indices: [index-wvjliyojez, rollup-index-wvjliyojez-fcxqwqicrziimbg_kkzlfa]"},"status":400}

https://gradle-enterprise.elastic.co/s/2s3bse2tloulo/tests/:x-pack:plugin:ilm:qa:multi-node:javaRestTest/org.elasticsearch.xpack.ilm.actions.RollupActionIT/testRollupIndexAndSetNewRollupPolicy#1

@talevy mind taking a look?
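
For readers unfamiliar with the error in the response body above: it says the policy cannot be deleted while any index still references it. A minimal in-memory model of that rule is sketched below. PolicyRegistry and all of its methods are hypothetical and only mirror the wording of the error message; this is not how ILM or the test cleanup is implemented.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical in-memory model of "a policy in use cannot be deleted".
class PolicyRegistry {
    private final Set<String> policies = new TreeSet<>();
    private final Map<String, String> indexToPolicy = new TreeMap<>();

    void putPolicy(String policy) {
        policies.add(policy);
    }

    void createIndex(String index, String policy) {
        indexToPolicy.put(index, policy);
    }

    void deleteIndex(String index) {
        indexToPolicy.remove(index);
    }

    boolean hasPolicy(String policy) {
        return policies.contains(policy);
    }

    // Refuses the delete while at least one index still points at the policy.
    void deletePolicy(String policy) {
        List<String> users = indexToPolicy.entrySet().stream()
            .filter(entry -> entry.getValue().equals(policy))
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
        if (!users.isEmpty()) {
            throw new IllegalArgumentException(
                "Cannot delete policy [" + policy + "]. It is in use by one or more indices: " + users);
        }
        policies.remove(policy);
    }

    public static void main(String[] args) {
        PolicyRegistry registry = new PolicyRegistry();
        registry.putPolicy("policy-demo");
        registry.createIndex("index-demo", "policy-demo");
        registry.createIndex("rollup-index-demo", "policy-demo");
        try {
            registry.deletePolicy("policy-demo");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // Once no index references the policy, the delete goes through.
        registry.deleteIndex("index-demo");
        registry.deleteIndex("rollup-index-demo");
        registry.deletePolicy("policy-demo");
        System.out.println("policy still present: " + registry.hasPolicy("policy-demo"));
    }
}
```

Under this model, any index that is still present when the policy delete runs, for example a rollup index that appeared late, would be enough to trigger the 400 response.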

mark-vieira reopened this Mar 1, 2021
@andreidan
Contributor

Drive-by comment. This seems to be a problem with the rollup operation itself?

[2021-03-01T17:46:45,444][ERROR][o.e.x.i.IndexLifecycleRunner] [javaRestTest-1] policy [policy-rGJwI] for index [rollup-index-wvjliyojez-fcxqwqicrziimbg_kkzlfa] failed on step [{"phase":"cold","action":"rollup","name":"rollup"}]. Moving to ERROR step
org.elasticsearch.ElasticsearchException: Unable to rollup index [rollup-index-wvjliyojez-fcxqwqicrziimbg_kkzlfa]
	at org.elasticsearch.xpack.rollup.v2.TransportRollupAction$2.onResponse(TransportRollupAction.java:281) [x-pack-rollup-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.xpack.rollup.v2.TransportRollupAction$2.onResponse(TransportRollupAction.java:275) [x-pack-rollup-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:31) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.client.node.NodeClient.lambda$executeLocally$0(NodeClient.java:100) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:170) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:164) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.ActionListener$DelegatingActionListener.onResponse(ActionListener.java:184) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.ActionListener$DelegatingActionListener.onResponse(ActionListener.java:184) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.AckedClusterStateUpdateTask.onAllNodesAcked(AckedClusterStateUpdateTask.java:56) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.service.MasterService$SafeAckedClusterStateTaskListener.onAllNodesAcked(MasterService.java:546) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.service.MasterService$AckCountDownListener.finish(MasterService.java:671) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.service.MasterService$AckCountDownListener.onNodeAck(MasterService.java:662) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.service.MasterService$DelegatingAckListener.onNodeAck(MasterService.java:594) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.coordination.Coordinator$CoordinatorPublication$4$1.onSuccess(Coordinator.java:1408) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.service.ClusterApplierService$SafeClusterApplyListener.onSuccess(ClusterApplierService.java:542) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:413) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:151) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:669) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: org.elasticsearch.ElasticsearchException: [rollup-index-wvjliyojez-fcxqwqicrziimbg_kkzlfa/wOT4gh97ThKEsCfFuVWsdg][[rollup-index-wvjliyojez-fcxqwqicrziimbg_kkzlfa][0]] org.elasticsearch.action.support.broadcast.BroadcastShardOperationFailedException:
	at org.elasticsearch.xpack.rollup.v2.TransportRollupIndexerAction.newResponse(TransportRollupIndexerAction.java:130) ~[?:?]
	at org.elasticsearch.xpack.rollup.v2.TransportRollupIndexerAction$Async.finishHim(TransportRollupIndexerAction.java:149) ~[?:?]
	at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction.onOperation(TransportBroadcastAction.java:226) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction$1.handleException(TransportBroadcastAction.java:182) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.TransportService$5.handleException(TransportService.java:620) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1164) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:317) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:165) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:315) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:307) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:126) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:84) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:678) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:129) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:104) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:69) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:63) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[?:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[?:?]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
	... 1 more
Caused by: org.elasticsearch.action.support.broadcast.BroadcastShardOperationFailedException: 
	at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction.setFailure(TransportBroadcastAction.java:250) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction.onOperation(TransportBroadcastAction.java:204) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction$1.handleException(TransportBroadcastAction.java:182) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.TransportService$5.handleException(TransportService.java:620) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1164) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:317) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:165) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:315) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:307) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:126) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:84) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:678) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:129) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:104) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:69) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:63) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[?:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[?:?]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
	... 1 more
Caused by: org.elasticsearch.transport.RemoteTransportException: [javaRestTest-0][127.0.0.1:46357][indices:admin/xpack/rollup_indexer[s]]
Caused by: org.elasticsearch.common.io.stream.NotSerializableExceptionWrapper: unsupported_operation_exception: String representation of doc values for [aggregate_metric_double] fields is not supported
	at org.elasticsearch.xpack.aggregatemetric.mapper.AggregateDoubleMetricFieldMapper$AggregateDoubleMetricFieldType$1$1.getBytesValues(AggregateDoubleMetricFieldMapper.java:387) ~[?:?]
	at org.elasticsearch.index.fielddata.LeafFieldData.getFormattedValues(LeafFieldData.java:36) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.xpack.rollup.v2.FieldValueFetcher.getLeaf(FieldValueFetcher.java:50) ~[?:?]
	at org.elasticsearch.xpack.rollup.v2.RollupShardIndexer$BucketCollector.leafFetchers(RollupShardIndexer.java:491) ~[?:?]
	at org.elasticsearch.xpack.rollup.v2.RollupShardIndexer$BucketCollector.getLeafCollector(RollupShardIndexer.java:436) ~[?:?]
	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:652) ~[lucene-core-8.8.0.jar:8.8.0 b10659f0fc18b58b90929cfdadde94544d202c4a - noble - 2021-01-25 19:07:45]
	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445) ~[lucene-core-8.8.0.jar:8.8.0 b10659f0fc18b58b90929cfdadde94544d202c4a - noble - 2021-01-25 19:07:45]
	at org.elasticsearch.xpack.rollup.v2.RollupShardIndexer.computeBucket(RollupShardIndexer.java:294) ~[?:?]
	at org.elasticsearch.xpack.rollup.v2.RollupShardIndexer.execute(RollupShardIndexer.java:202) ~[?:?]
	at org.elasticsearch.xpack.rollup.v2.TransportRollupIndexerAction.shardOperation(TransportRollupIndexerAction.java:112) ~[?:?]
	at org.elasticsearch.xpack.rollup.v2.TransportRollupIndexerAction.shardOperation(TransportRollupIndexerAction.java:46) ~[?:?]
	at org.elasticsearch.action.support.broadcast.TransportBroadcastAction.lambda$asyncShardOperation$0(TransportBroadcastAction.java:291) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:728) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:834) ~[?:?]
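
The trace above nests several "Caused by:" layers, and the innermost one (the unsupported_operation_exception for aggregate_metric_double doc values) carries the actual reason. As a small aid for reading such traces in code, the helper below walks a Throwable's cause chain to its end. RootCause is a hypothetical utility, and the exception messages in main are made up to mimic the shape of the trace.

```java
// Hypothetical helper; not part of Elasticsearch or its test framework.
class RootCause {

    // Follows getCause() until there is no further cause and returns the last link.
    static Throwable of(Throwable error) {
        Throwable current = error;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Three nested layers, loosely shaped like the trace in this comment.
        Throwable innermost = new UnsupportedOperationException(
            "String representation of doc values is not supported");
        Throwable middle = new IllegalStateException("shard operation failed", innermost);
        Throwable outer = new RuntimeException("unable to roll up index", middle);

        Throwable root = of(outer);
        System.out.println(root.getClass().getSimpleName() + ": " + root.getMessage());
    }
}
```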

@talevy
Contributor

talevy commented Mar 3, 2021

Thank you. Will take another look!

@williamrandolph
Contributor

We are still having "failed to delete policy" errors: https://gradle-enterprise.elastic.co/s/i2gfwo5lss5j4

This time testRollupIndex is failing.

java.lang.RuntimeException: failed to delete policy: policy-ekBlR	
at __randomizedtesting.SeedInfo.seed([A4822491ADE2DE00:42A655C7C4AED7CC]:0)	
at org.elasticsearch.test.rest.ESRestTestCase.lambda$deleteAllILMPolicies$17(ESRestTestCase.java:898)	
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)	
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)	
at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1603)	
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)	
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)	
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)	
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)	
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)	
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)	
at org.elasticsearch.test.rest.ESRestTestCase.deleteAllILMPolicies(ESRestTestCase.java:894)	
at org.elasticsearch.test.rest.ESRestTestCase.wipeCluster(ESRestTestCase.java:668)	
at org.elasticsearch.test.rest.ESRestTestCase.cleanUpCluster(ESRestTestCase.java:315)	
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)	
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)	
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)	
at java.lang.reflect.Method.invoke(Method.java:566)	
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)	
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)	
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)	
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)	
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)	
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)	
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)	
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)	
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)	
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)	
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)	
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)	
at java.lang.Thread.run(Thread.java:834)	

Caused by: org.elasticsearch.client.ResponseException: method [DELETE], host [http://127.0.0.1:44089], URI [/_ilm/policy/policy-ekBlR], status line [HTTP/1.1 400 Bad Request]	
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Cannot delete policy [policy-ekBlR]. It is in use by one or more indices: [index-whtqnvpnwz]"}],"type":"illegal_argument_exception","reason":"Cannot delete policy [policy-ekBlR]. It is in use by one or more indices: [index-whtqnvpnwz]"},"status":400}	
at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:330)	
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:296)	
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:270)	
at org.elasticsearch.test.rest.ESRestTestCase.lambda$deleteAllILMPolicies$17(ESRestTestCase.java:896)	
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)	
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)	
at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1603)	
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)	
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)	
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)	
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)	
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)	
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)	
at org.elasticsearch.test.rest.ESRestTestCase.deleteAllILMPolicies(ESRestTestCase.java:894)	
at org.elasticsearch.test.rest.ESRestTestCase.wipeCluster(ESRestTestCase.java:668)	
at org.elasticsearch.test.rest.ESRestTestCase.cleanUpCluster(ESRestTestCase.java:315)	
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)	
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)	
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)	
at java.lang.reflect.Method.invoke(Method.java:566)	
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)	
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)	
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)	
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)	
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)	
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)	
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)	
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)	
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)	
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)	
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)	
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)	
at java.lang.Thread.run(Thread.java:834)

@talevy talevy assigned csoulios and unassigned talevy Mar 18, 2021
@dimitris-athanasiou
Contributor

Got a few more failures on this. Details are in #70980, which I have now closed because I realised it's a duplicate of this one.

csoulios added a commit to csoulios/elasticsearch that referenced this issue Mar 30, 2021
csoulios added a commit that referenced this issue Mar 30, 2021
csoulios added a commit that referenced this issue Mar 30, 2021
csoulios added a commit that referenced this issue Mar 30, 2021
csoulios added a commit that referenced this issue Jul 26, 2022
This PR adds support for an ILM action that downsamples a time-series index
by invoking the _rollup endpoint (#85708)

A policy that includes the rollup action will look like the following:

PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "warm": {
        "actions": {
          "rollup": {
            "fixed_interval": "24h"
          }
        }
      }
    }
  }
}
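
For anyone who wants to send the policy above from Java rather than from the console, here is a minimal sketch. It assumes a cluster reachable at http://localhost:9200 without authentication, and it uses the JDK's built-in java.net.http types (Java 11+) instead of the Elasticsearch client. The class and method names (RollupPolicyExample, rollupPolicyBody, putPolicyRequest) are hypothetical and are not part of Elasticsearch. The action name follows the commit message above and may differ depending on the Elasticsearch version.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Elasticsearch: it only builds the request
// for the policy shown in the commit message above.
final class RollupPolicyExample {

    // Builds the JSON body for a policy whose warm phase runs the rollup action.
    // fixedInterval is inserted as-is, so it must not contain quotes.
    static String rollupPolicyBody(String fixedInterval) {
        return "{\"policy\":{\"phases\":{\"warm\":{\"actions\":{"
            + "\"rollup\":{\"fixed_interval\":\"" + fixedInterval + "\"}}}}}}";
    }

    // Builds (but does not send) the PUT _ilm/policy/<name> request.
    // baseUrl is an assumption, e.g. "http://localhost:9200".
    static HttpRequest putPolicyRequest(String baseUrl, String policyName, String fixedInterval) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_ilm/policy/" + policyName))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(rollupPolicyBody(fixedInterval)))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = putPolicyRequest("http://localhost:9200", "my_policy", "24h");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(rollupPolicyBody("24h"));
    }
}
```

The request is only constructed here. Sending it would need a java.net.http.HttpClient and a running cluster, which is why the sketch stops at printing the method, URI and body.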

Relates to #74660
Fixes #68609