[CI] GeoIpDownloaderStatsIT testStats failing #87035

Closed
hendrikmuhs opened this issue May 23, 2022 · 6 comments · Fixed by #91662
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team >test-failure Triaged test failures from CI

Comments

@hendrikmuhs

Build scan:
https://gradle-enterprise.elastic.co/s/ndzzouplmtgfe/tests/:modules:ingest-geoip:internalClusterTest/org.elasticsearch.ingest.geoip.GeoIpDownloaderStatsIT/testStats

Reproduction line:
./gradlew ':modules:ingest-geoip:internalClusterTest' --tests "org.elasticsearch.ingest.geoip.GeoIpDownloaderStatsIT.testStats" -Dtests.seed=FB06B6504F00972 -Dtests.locale=en-IN -Dtests.timezone=America/Halifax -Druntime.java=17

Applicable branches:
master

Reproduces locally?:
No

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.ingest.geoip.GeoIpDownloaderStatsIT&tests.test=testStats

Failure excerpt:

java.lang.AssertionError: 
Expected: <3>
     but: was <0>

  at __randomizedtesting.SeedInfo.seed([FB06B6504F00972:953132DA6C2CC751]:0)
  at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
  at org.junit.Assert.assertThat(Assert.java:956)
  at org.junit.Assert.assertThat(Assert.java:923)
  at org.elasticsearch.ingest.geoip.GeoIpDownloaderStatsIT.lambda$testStats$1(GeoIpDownloaderStatsIT.java:89)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1096)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1069)
  at org.elasticsearch.ingest.geoip.GeoIpDownloaderStatsIT.testStats(GeoIpDownloaderStatsIT.java:86)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:568)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)
  at java.lang.Thread.run(Thread.java:833)

@hendrikmuhs hendrikmuhs added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >test-failure Triaged test failures from CI labels May 23, 2022
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label May 23, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@joegallo joegallo self-assigned this May 26, 2022
@masseyke
Member

I've seen this one fail repeatedly on my local machine. Almost all of the time seems to be spent pulling the databases from https://storage.googleapis.com/. Normally this takes 5-6 seconds (fairly well within the 10-second assertBusy), but there are stretches where it consistently takes 12 seconds or so, and assertBusy fails. I'm not sure what accounts for that slowdown, but I've put enough logging in the code locally to convince myself that it's the remote request to a resource that we don't control, and not variability within our code. So I'm going to extend the timeout a little on the assertBusy.
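
To make the timing argument concrete, here is a minimal sketch of a retrying assertion with a fixed time budget. It is a hypothetical stand-in (the class `BusyAssert` is invented for illustration) and not the real `ESTestCase.assertBusy` implementation; it only shows why a 12-second download cannot pass inside a 10-second budget, however often the check is retried.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a retrying assertion helper.
// This is NOT the actual ESTestCase.assertBusy implementation.
class BusyAssert {
    /** Re-runs the check until it stops throwing AssertionError or the time budget is spent. */
    static void assertBusy(Runnable check, long maxWait, TimeUnit unit) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(maxWait);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return; // the assertion passed
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // budget exhausted: surface the last failure
                }
            }
            Thread.sleep(sleepMillis);
            sleepMillis = Math.min(sleepMillis * 2, 100); // back off, capped at 100 ms
        }
    }
}
```

With a helper of this shape, raising `maxWait` only hides the symptom when the slow step is a remote download; the check itself never changes.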

Sort of related -- we're wrapping the InputStream from the HttpURLConnection with a BufferedInputStream with an 8 KB buffer. But it never seems to pull back more than 1378 bytes at a time because HttpURLConnection::available always returns 0. That's not a huge deal because our code GeoIpDownloader::getChunk is buffering it into 1 MB arrays. But odd that we're only getting 1378 bytes at a time (max) from the remote resource. That seems to be the case whether it's running fast or slow though, so I'm not going to worry about it.
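
A rough sketch of why the short reads are harmless when the caller fills a fixed-size chunk in a loop. The class `ChunkReader` and both of its methods are hypothetical names for illustration, not the actual `GeoIpDownloader::getChunk` code; `shortReads` is a test double that caps every read at a given size, mimicking the 1378-byte behaviour described above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Illustrative only: hypothetical helper, not the actual GeoIpDownloader::getChunk.
class ChunkReader {
    /** Reads up to chunkSize bytes, looping because one read() may return fewer bytes than requested. */
    static byte[] readChunk(InputStream in, int chunkSize) throws IOException {
        byte[] buf = new byte[chunkSize];
        int filled = 0;
        while (filled < chunkSize) {
            int n = in.read(buf, filled, chunkSize - filled);
            if (n == -1) {
                break; // end of stream
            }
            filled += n;
        }
        // Trim the array when the stream ended before the chunk was full.
        return filled == chunkSize ? buf : Arrays.copyOf(buf, filled);
    }

    /** Test double: never returns more than maxPerRead bytes from a single read call. */
    static InputStream shortReads(byte[] data, int maxPerRead) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, maxPerRead));
            }
        };
    }
}
```

For example, `ChunkReader.readChunk(ChunkReader.shortReads(data, 1378), 1 << 20)` still returns a full chunk whenever `data` holds at least 1 MB; the cap only increases the number of loop iterations.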

@masseyke
Member

Actually I think that last comment might have been more relevant than I realized. This test is not supposed to be hitting https://storage.googleapis.com/ at all. It's supposed to pull smaller versions of the databases from localhost. I was looking into #90837 when I ran into this. That one fails when the "geoip_endpoint" property has not been set, so I had disabled it locally. That resulted in falling back to googleapis.com. I'm wondering if whatever caused that to happen for #90837 also caused this.

@masseyke
Member

Actually there's evidence in the logs associated with this ticket:

org.elasticsearch.ElasticsearchStatusException: error during downloading https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree

If the test fixture were set up correctly then it would pull from http://127.0.0.1:65463/?elastic_geoip_service_tos=agree instead. So I think that this one, #90837, and #90838 are all duplicates, and solving one would solve them all.
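
One way to picture the fallback, and a guard that would have made it fail fast instead of timing out: the sketch below is purely illustrative. `EndpointResolver`, `resolve`, `isLocal`, and `requireLocal` are hypothetical names, not Elasticsearch APIs; the default URL is the one from the log line above, and the assumption is that a missing fixture endpoint silently falls back to it.

```java
import java.net.URI;

// Hypothetical illustration; none of these names are Elasticsearch APIs.
class EndpointResolver {
    // Default taken from the log line quoted in this thread.
    static final String DEFAULT_ENDPOINT = "https://geoip.elastic.co/v1/database";

    /** Assumed behaviour: use the fixture endpoint when it is set, otherwise fall back to the default. */
    static String resolve(String fixtureEndpoint) {
        return (fixtureEndpoint == null || fixtureEndpoint.isBlank()) ? DEFAULT_ENDPOINT : fixtureEndpoint;
    }

    /** True when the endpoint points at the local machine. */
    static boolean isLocal(String endpoint) {
        String host = URI.create(endpoint).getHost();
        return "127.0.0.1".equals(host) || "localhost".equals(host);
    }

    /** A guard a test could call so that a remote fallback fails immediately with a clear message. */
    static String requireLocal(String endpoint) {
        if (!isLocal(endpoint)) {
            throw new IllegalStateException("test would download from remote endpoint: " + endpoint);
        }
        return endpoint;
    }
}
```

Under these assumptions, `requireLocal(resolve(null))` throws right away, whereas `requireLocal(resolve("http://127.0.0.1:65463/"))` returns the fixture URL unchanged.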

@craigtaverner craigtaverner reopened this Feb 1, 2023
@gmarouli
Contributor

gmarouli commented Feb 2, 2023

I took a look at the recent failure, and it seems the test timed out because it didn't try to download from localhost on the 7.17 branch:

[2023-02-01T15:17:14,090][INFO ][o.e.i.g.GeoIpDownloader  ] [node_t3] fetching geoip databases overview from [https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree]

Backport: #93459
