
[BUG] Flaky test PredictionITTests #2436

Closed · Hailong-am opened this issue May 10, 2024 · 0 comments
Labels: bug Something isn't working

Hailong-am (Contributor) commented May 10, 2024

What is the bug?

PredictionITTests is flaky. Example failing CI run: https://github.com/opensearch-project/ml-commons/actions/runs/9025991264/job/24802601995?pr=2435


Do you have any additional context?

REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:test' --tests "org.opensearch.ml.action.prediction.PredictionITTests.testPredictionWithDataFrame_BatchRCF" -Dtests.seed=CC4F8C51F54E62C2 -Dtests.security.manager=false -Dtests.locale=fr-FR -Dtests.timezone=Pacific/Niue -Druntime.java=17
    java.lang.AssertionError: Shard [.plugins-ml-config][0] is still locked after 5 sec waiting
[2024-05-09T15:11:50,845][INFO ][o.o.m.a.p.PredictionITTests] [testPredictionWithDataFrame_BatchRCF] after test
  1> [2024-05-09T15:11:50,845][INFO ][o.o.t.OpenSearchTestClusterRule] [testPredictionWithDataFrame_BatchRCF] [PredictionITTests#testPredictionWithDataFrame_BatchRCF]: cleaning up after test
  1> [2024-05-09T15:11:50,885][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:50,894][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_s0] [.plugins-ml-model/JrJHF6J_TEKPCBru8sqBRA] deleting index
  1> [2024-05-09T15:11:50,894][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_s0] [iris_data_for_prediction_it/baRt5UxBTXCsYtA4d-bt_w] deleting index
  1> [2024-05-09T15:11:50,894][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:50,905][INFO ][o.o.t.s.M.Listener       ] [node_s1] [.plugins-ml-model][0] start check index
  1> [2024-05-09T15:11:50,913][INFO ][o.o.t.s.M.Listener       ] [node_s1] [.plugins-ml-model][0] end check index
  1> [2024-05-09T15:11:50,917][INFO ][o.o.t.s.M.Listener       ] [node_s1] [iris_data_for_prediction_it][0] start check index
...
  [iris_data_for_prediction_it][9] start check index
  1> [2024-05-09T15:11:50,974][INFO ][o.o.t.s.M.Listener       ] [node_s1] [iris_data_for_prediction_it][9] end check index
  1> [2024-05-09T15:11:50,979][INFO ][o.o.t.s.M.Listener       ] [node_s0] [.plugins-ml-model][0] start check index
  1> [2024-05-09T15:11:50,980][INFO ][o.o.t.s.M.Listener       ] [node_s0] [.plugins-ml-model][0] end check index
  1> [2024-05-09T15:11:50,982][INFO ][o.o.t.s.M.Listener       ] [node_s0] [iris_data_for_prediction_it][0] start check index
  ...
[iris_data_for_prediction_it][7] end check index
  1> [2024-05-09T15:11:51,013][INFO ][o.o.p.PluginsService     ] [node_s0] PluginService:onIndexModule index:[.plugins-ml-config/v-3SLo1xQceuXE0u-CxdKg]
  1> [2024-05-09T15:11:51,015][INFO ][o.o.c.m.MetadataCreateIndexService] [node_s0] [.plugins-ml-config] creating index, cause [api], templates [random_index_template], shards [1]/[1]
  1> [2024-05-09T15:11:51,016][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:51,026][INFO ][o.o.p.PluginsService     ] [node_s0] PluginService:onIndexModule index:[.plugins-ml-config/v-3SLo1xQceuXE0u-CxdKg]
  1> [2024-05-09T15:11:51,052][INFO ][o.o.c.m.MetadataIndexTemplateService] [node_s0] removing template [random_index_template]
  1> [2024-05-09T15:11:51,093][INFO ][o.o.m.e.i.MLIndicesHandler] [node_s0] create index:.plugins-ml-config
  1> [2024-05-09T15:11:51,095][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:51,103][INFO ][o.o.p.PluginsService     ] [node_s1] PluginService:onIndexModule index:[.plugins-ml-config/v-3SLo1xQceuXE0u-CxdKg]
  1> [2024-05-09T15:11:51,136][INFO ][o.o.i.r.RecoverySourceHandler] [node_s0] [.plugins-ml-config][0][recover to node_s1] finalizing recovery took [2.1ms]
  1> [2024-05-09T15:11:51,138][INFO ][o.o.c.r.a.AllocationService] [node_s0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.plugins-ml-config][0]]]).
  1> [2024-05-09T15:11:51,153][WARN ][o.o.c.r.a.AllocationService] [node_s0] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
  1> [2024-05-09T15:11:51,201][INFO ][o.o.m.c.MLSyncUpCron     ] [node_s0] ML configuration initialized successfully

From the log you can see that during test cleanup, MLSyncUpCron re-initializes the .plugins-ml-config index after all indexes have been deleted and before the after-test assertion runs, so the freshly created [.plugins-ml-config][0] shard is still locked when the assertion checks it. To fix this we need to disable MLSyncUpCron; the test does not depend on .plugins-ml-config, so it is safe to disable it.
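A minimal sketch of what the fix could look like, assuming the plugin exposes a plugins.ml_commons.sync_up_job_interval_in_seconds setting and that a value of 0 prevents MLSyncUpCron from being scheduled (verify both against MLCommonsSettings; the base class shown is the generic OpenSearch test case, whereas the real test may extend an ml-commons-specific one):

```java
import org.opensearch.common.settings.Settings;
import org.opensearch.test.OpenSearchIntegTestCase;

public class PredictionITTests extends OpenSearchIntegTestCase {

    @Override
    protected Settings nodeSettings(int nodeOrdinal) {
        return Settings.builder()
            .put(super.nodeSettings(nodeOrdinal))
            // Assumed setting and semantics: an interval of 0 disables the
            // sync-up cron, so it cannot recreate .plugins-ml-config while
            // the test framework is deleting indexes during cleanup.
            .put("plugins.ml_commons.sync_up_job_interval_in_seconds", 0)
            .build();
    }
}
```

With the cron disabled, cleanup deletes all indexes and the after-test shard-lock assertion no longer races against a background re-creation of .plugins-ml-config.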
