testSingleNumericFeatureAndMixedTrainingAndNonTrainingRows test failures #53236

Closed
benwtrent opened this issue Mar 6, 2020 · 3 comments
Labels: :ml (Machine learning), >test-failure (Triaged test failures from CI), v8.0.0-alpha1

benwtrent commented Mar 6, 2020

Test failure can be seen here: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+matrix-java-periodic/ES_RUNTIME_JAVA=openjdk15,nodes=general-purpose/553/console

Reproduce line:

./gradlew ':x-pack:plugin:ml:qa:native-multi-node-tests:integTestRunner' --tests "org.elasticsearch.xpack.ml.integration.ClassificationIT.testSingleNumericFeatureAndMixedTrainingAndNonTrainingRows" -Dtests.seed=48D6323F088FCED4 -Dtests.security.manager=true -Dtests.locale=nb -Dtests.timezone=Europe/Kirov -Dcompiler.java=13

Not able to reliably reproduce locally.

The server logs tell an interesting story.

[2020-03-06T16:59:01,906][INFO ][o.e.x.m.d.DataFrameAnalyticsManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Started reindexing
[2020-03-06T16:59:02,613][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Started loading data
[2020-03-06T16:59:02,623][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Started analyzing
[2020-03-06T16:59:02,623][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Waiting for result processor to complete
[2020-03-06T16:59:04,067][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] [data_frame_analyzer/399059] [CBoostedTreeImpl.cc@1155] loss* = 0.00841236, regularization* = (depth penalty multiplier = 3.767144, soft depth limit = 5.683778, soft depth tolerance = 0.050000, tree size penalty multiplier = 1.033723, leaf weight penalty multiplier = 0.469302), downsample factor* = 1, eta* = 0.223607, eta growth rate per tree* = 1.33541, maximum number trees* = 7, feature bag fraction* = 0.2
[2020-03-06T16:59:04,067][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Result processor has completed
[2020-03-06T16:59:04,067][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Closing process
[2020-03-06T16:59:04,071][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] [data_frame_analyzer/399059] [CBoostedTreeImpl.cc@240] Training finished after 18 iterations. Time per iteration in ms mean: 106.211 std. dev:  224.044
[2020-03-06T16:59:04,176][ERROR][o.e.x.m.p.l.CppLogMessageHandler] [integTest-2] [controller/387903] [CDetachedProcessSpawner.cc@184] Child process with PID 399059 was terminated by signal 6
[2020-03-06T16:59:04,472][INFO ][o.e.x.m.p.AbstractNativeProcess] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] State output finished
[2020-03-06T16:59:04,473][ERROR][o.e.x.m.d.p.AnalyticsProcessManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Error closing data frame analyzer process
org.elasticsearch.ElasticsearchException: Fatal error: '7fb1ce761000-7fb1ce762000 rw-p 00000000 00:00 0 ', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: '7ffd33405000-7ffd33426000 rw-p 00000000 00:00 0                          [stack]', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: '7ffd3344c000-7ffd3344e000 r-xp 00000000 00:00 0                          [vdso]', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: 'ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: 'si_signo 6, si_code: -6, si_errno: 0, address: 0x7fb1cc5ca337, library: /lib64/libc.so.6, base: 0x7fb1cc594000, normalized address: 0x36337', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
        at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:51) ~[x-pack-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.ml.process.AbstractNativeProcess.close(AbstractNativeProcess.java:184) ~[x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.ml.dataframe.process.AnalyticsProcessManager.closeProcess(AnalyticsProcessManager.java:320) [x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.ml.dataframe.process.AnalyticsProcessManager.processData(AnalyticsProcessManager.java:189) [x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.ml.dataframe.process.AnalyticsProcessManager.lambda$runJob$1(AnalyticsProcessManager.java:124) [x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]

Observe the line:

[2020-03-06T16:59:04,473][ERROR][o.e.x.m.d.p.AnalyticsProcessManager] [integTest-2] [classification_single_numeric_feature_and_mixed_data_set] Error closing data frame analyzer process
org.elasticsearch.ElasticsearchException: Fatal error: '7fb1ce761000-7fb1ce762000 rw-p 00000000 00:00 0 ', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: '7ffd33405000-7ffd33426000 rw-p 00000000 00:00 0                          [stack]', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: '7ffd3344c000-7ffd3344e000 r-xp 00000000 00:00 0                          [vdso]', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: 'ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)
Fatal error: 'si_signo 6, si_code: -6, si_errno: 0, address: 0x7fb1cc5ca337, library: /lib64/libc.so.6, base: 0x7fb1cc594000, normalized address: 0x36337', version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)

Signal 6 is SIGABRT, and the reported address is inside libc, which indicates the native process called abort().
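
For anyone reading the dump, here is a minimal sketch of how the last `Fatal error` line can be decoded. It is standalone Python and not part of the Elasticsearch or ml-cpp code. The regex, the `decode_fatal_line` helper and the two lookup tables are assumptions made for illustration, written against the single line quoted above; the signal and `si_code` numbers are the Linux x86-64 values.

```python
import re

# Linux x86-64 numbers; only the values needed for this illustration.
LINUX_SIGNALS = {4: "SIGILL", 6: "SIGABRT", 7: "SIGBUS", 8: "SIGFPE", 11: "SIGSEGV"}
LINUX_SI_CODES = {0: "SI_USER", -6: "SI_TKILL"}

# Hypothetical pattern, written against the one log line quoted in this issue.
FATAL_RE = re.compile(
    r"si_signo (?P<signo>-?\d+), si_code: (?P<code>-?\d+), si_errno: (?P<errno>-?\d+), "
    r"address: (?P<address>0x[0-9a-fA-F]+), library: (?P<library>[^,]+), "
    r"base: (?P<base>0x[0-9a-fA-F]+), normalized address: (?P<normalized>0x[0-9a-fA-F]+)"
)


def decode_fatal_line(line):
    """Return a dict describing the signal info in `line`, or None if it does not match."""
    match = FATAL_RE.search(line)
    if match is None:
        return None
    signo = int(match["signo"])
    code = int(match["code"])
    address = int(match["address"], 16)
    base = int(match["base"], 16)
    return {
        "signal": LINUX_SIGNALS.get(signo, "signal %d" % signo),
        "si_code": LINUX_SI_CODES.get(code, "si_code %d" % code),
        "library": match["library"],
        # Offset of the reported address from the library's load base.
        "offset": hex(address - base),
        # Cross-check against the "normalized address" printed in the log.
        "offset_matches_log": address - base == int(match["normalized"], 16),
    }


LINE = ("Fatal error: 'si_signo 6, si_code: -6, si_errno: 0, address: 0x7fb1cc5ca337, "
        "library: /lib64/libc.so.6, base: 0x7fb1cc594000, normalized address: 0x36337', "
        "version: 8.0.0-SNAPSHOT (build 5102588ca22c4b)")

if __name__ == "__main__":
    print(decode_fatal_line(LINE))
```

The "normalized address" in the log is the reported address minus the load base of libc, so it can be looked up in the library regardless of where it was mapped. As far as I know, an `si_code` of `SI_TKILL` means the signal was sent with a thread-directed kill, which is how glibc raises SIGABRT from `abort()`; that fits the reading above but does not say why abort() was called.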

MIGHT be related to elastic/ml-cpp#1040

benwtrent added the >test-failure, :ml, and v8.0.0 labels on Mar 6, 2020

elasticmachine commented

Pinging @elastic/ml-core (:ml)

mayya-sharipova commented

Tests muted on master and 7.x.

benwtrent self-assigned this on Mar 20, 2020

benwtrent commented

The underlying C++ issue (which might have been to blame) has been closed. The test is no longer muted, so closing this issue as fixed.
