
[BUG] Flaky test PredictionITTests #2436

Closed · Hailong-am opened this issue May 10, 2024 · 0 comments
Labels: bug Something isn't working

Hailong-am (Contributor) commented May 10, 2024

What is the bug?

PredictionITTests is flaky. Example failing CI run: https://github.com/opensearch-project/ml-commons/actions/runs/9025991264/job/24802601995?pr=2435


Do you have any additional context?

REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:test' --tests "org.opensearch.ml.action.prediction.PredictionITTests.testPredictionWithDataFrame_BatchRCF" -Dtests.seed=CC4F8C51F54E62C2 -Dtests.security.manager=false -Dtests.locale=fr-FR -Dtests.timezone=Pacific/Niue -Druntime.java=17
    java.lang.AssertionError: Shard [.plugins-ml-config][0] is still locked after 5 sec waiting
[2024-05-09T15:11:50,845][INFO ][o.o.m.a.p.PredictionITTests] [testPredictionWithDataFrame_BatchRCF] after test
  1> [2024-05-09T15:11:50,845][INFO ][o.o.t.OpenSearchTestClusterRule] [testPredictionWithDataFrame_BatchRCF] [PredictionITTests#testPredictionWithDataFrame_BatchRCF]: cleaning up after test
  1> [2024-05-09T15:11:50,885][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:50,894][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_s0] [.plugins-ml-model/JrJHF6J_TEKPCBru8sqBRA] deleting index
  1> [2024-05-09T15:11:50,894][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_s0] [iris_data_for_prediction_it/baRt5UxBTXCsYtA4d-bt_w] deleting index
  1> [2024-05-09T15:11:50,894][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:50,905][INFO ][o.o.t.s.M.Listener       ] [node_s1] [.plugins-ml-model][0] start check index
  1> [2024-05-09T15:11:50,913][INFO ][o.o.t.s.M.Listener       ] [node_s1] [.plugins-ml-model][0] end check index
  1> [2024-05-09T15:11:50,917][INFO ][o.o.t.s.M.Listener       ] [node_s1] [iris_data_for_prediction_it][0] start check index
...
  [iris_data_for_prediction_it][9] start check index
  1> [2024-05-09T15:11:50,974][INFO ][o.o.t.s.M.Listener       ] [node_s1] [iris_data_for_prediction_it][9] end check index
  1> [2024-05-09T15:11:50,979][INFO ][o.o.t.s.M.Listener       ] [node_s0] [.plugins-ml-model][0] start check index
  1> [2024-05-09T15:11:50,980][INFO ][o.o.t.s.M.Listener       ] [node_s0] [.plugins-ml-model][0] end check index
  1> [2024-05-09T15:11:50,982][INFO ][o.o.t.s.M.Listener       ] [node_s0] [iris_data_for_prediction_it][0] start check index
  ...
[iris_data_for_prediction_it][7] end check index
  1> [2024-05-09T15:11:51,013][INFO ][o.o.p.PluginsService     ] [node_s0] PluginService:onIndexModule index:[.plugins-ml-config/v-3SLo1xQceuXE0u-CxdKg]
  1> [2024-05-09T15:11:51,015][INFO ][o.o.c.m.MetadataCreateIndexService] [node_s0] [.plugins-ml-config] creating index, cause [api], templates [random_index_template], shards [1]/[1]
  1> [2024-05-09T15:11:51,016][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:51,026][INFO ][o.o.p.PluginsService     ] [node_s0] PluginService:onIndexModule index:[.plugins-ml-config/v-3SLo1xQceuXE0u-CxdKg]
  1> [2024-05-09T15:11:51,052][INFO ][o.o.c.m.MetadataIndexTemplateService] [node_s0] removing template [random_index_template]
  1> [2024-05-09T15:11:51,093][INFO ][o.o.m.e.i.MLIndicesHandler] [node_s0] create index:.plugins-ml-config
  1> [2024-05-09T15:11:51,095][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:51,103][INFO ][o.o.p.PluginsService     ] [node_s1] PluginService:onIndexModule index:[.plugins-ml-config/v-3SLo1xQceuXE0u-CxdKg]
  1> [2024-05-09T15:11:51,136][INFO ][o.o.i.r.RecoverySourceHandler] [node_s0] [.plugins-ml-config][0][recover to node_s1] finalizing recovery took [2.1ms]
  1> [2024-05-09T15:11:51,138][INFO ][o.o.c.r.a.AllocationService] [node_s0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.plugins-ml-config][0]]]).
  1> [2024-05-09T15:11:51,153][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:51,201][INFO ][o.o.m.c.MLSyncUpCron     ] [node_s0] ML configuration initialized successfully

From the log you can see that during test cleanup, MLSyncUpCron re-initializes the .plugins-ml-config index after all indexes have been deleted and before the after-test assertion runs, so the freshly created [.plugins-ml-config][0] shard is still locked when the assertion checks it. To fix this we need to disable MLSyncUpCron; the test does not depend on .plugins-ml-config, so it is safe to disable it.
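A minimal sketch of what the fix could look like, assuming the plugin exposes a plugins.ml_commons.sync_up_job_interval_in_seconds setting and that a value of 0 prevents MLSyncUpCron from being scheduled (verify both against MLCommonsSettings; the base class shown is the generic OpenSearch test case, whereas the real test may extend an ml-commons-specific one):

```java
import org.opensearch.common.settings.Settings;
import org.opensearch.test.OpenSearchIntegTestCase;

public class PredictionITTests extends OpenSearchIntegTestCase {

    @Override
    protected Settings nodeSettings(int nodeOrdinal) {
        return Settings.builder()
            .put(super.nodeSettings(nodeOrdinal))
            // Assumed setting and semantics: an interval of 0 disables the
            // sync-up cron, so it cannot recreate .plugins-ml-config while
            // the test framework is deleting indexes during cleanup.
            .put("plugins.ml_commons.sync_up_job_interval_in_seconds", 0)
            .build();
    }
}
```

With the cron disabled, cleanup deletes all indexes and the after-test shard-lock assertion no longer races against a background re-creation of .plugins-ml-config.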
