Firebase Android tests flapping #6546
Comments
Thank you for listing these, @friedbunny! I will keep 👀 on it. |
I looked at all of yesterday's builds and found one flaky build out of ~40. I will continue to monitor the job queue over the following days. |
We need to determine if the source of the behaviour is:
Short term actions:
What can we do if we can't resolve this issue? If the failing tests are perceived as more of a hindrance than a benefit, we should look into reducing the number of tests we run. IMO the most important thing these tests need to flag is a regression where core crashes on startup. Having one test that validates this, instead of the 150+ tests we currently have, could make the build more reliable. For example, out of 20 PRs you would find one with a flaky test. With 150 tests per PR, that is roughly one flaky failure per 3000 test case executions. If we executed one test case per PR, we could test roughly 3000 PRs before hitting a failure caused by flakiness. |
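The estimate above can be sketched as a small calculation. This is only a back-of-the-envelope model: it assumes every test flakes independently at the same rate (1 in 3000, derived from "one flaky PR in 20" times 150 tests), which real CI failures may not follow. The class and method names are made up for illustration.

```java
/** Back-of-the-envelope flakiness estimate; the rates used here are assumptions. */
final class FlakeEstimate {
    /** Probability that at least one of testsPerBuild independent tests flakes. */
    static double buildFailureProbability(double perTestFlakeRate, int testsPerBuild) {
        return 1.0 - Math.pow(1.0 - perTestFlakeRate, testsPerBuild);
    }

    public static void main(String[] args) {
        // Assumed rate: one flaky failure per 3000 test case executions.
        double perTestFlakeRate = 1.0 / 3000;
        System.out.printf("150 tests per build: %.4f%n",
                buildFailureProbability(perTestFlakeRate, 150));
        System.out.printf("1 test per build:    %.4f%n",
                buildFailureProbability(perTestFlakeRate, 1));
    }
}
```

Under these assumptions a 150-test build fails from flakiness a little under 5% of the time (about one build in 20), while a single-test build fails about 0.03% of the time.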
Just did. |
I've been running the tests locally and was occasionally able to reproduce it. I'm going to see if I can make the test harness more robust. I also suspect that changes to the OnMapReady callback made the tests less reliable. |
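One way a harness can be made less sensitive to callback timing is to wait explicitly for the ready callback, with a timeout, before asserting anything. The sketch below is plain Java and not the test app's actual code: `ReadyGate`, `markReady`, and `awaitReady` are hypothetical names, and it assumes the wait happens on the test thread rather than the UI thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: lets a test wait, with a timeout, for an async "ready" callback. */
final class ReadyGate {
    private final CountDownLatch ready = new CountDownLatch(1);

    /** Call this from the ready callback (for example, a map-ready listener). */
    void markReady() {
        ready.countDown();
    }

    /** Blocks the calling thread until ready or timeout; returns false on timeout or interrupt. */
    boolean awaitReady(long timeout, TimeUnit unit) {
        try {
            return ready.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
            return false;
        }
    }

    public static void main(String[] args) {
        ReadyGate gate = new ReadyGate();
        new Thread(gate::markReady).start(); // stands in for the async callback
        System.out.println("ready: " + gate.awaitReady(5, TimeUnit.SECONDS));
    }
}
```

A test that gets `false` back can fail immediately with a clear "map never became ready" message instead of timing out later inside an unrelated assertion.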
Found another failing build, this time it couldn't download a dependency. |
A whopping 30 failed tests: https://www.bitrise.io/build/006abf10f975dbbf |
In light of the above reports, and to unblock other contributors, I'm going to downscale our 150+ tests to just one. This should still catch the most important regressions that show up when a MapView is rendered on screen. I'm going to look into #6366 to run scheduled builds with the full set of tests on a daily basis, and run manual tests in the Firebase GUI to resolve our issues. |
Reopening, as I have been seeing some runtime style tests fail intermittently with:
android.support.test.espresso.AppNotIdleException: Looped for 6 iterations over 60 SECONDS. The following Idle Conditions failed ASYNC_TASKS_HAVE_IDLED.
at dalvik.system.VMStack.getThreadStackTrace(Native Method)
at java.lang.Thread.getStackTrace(Thread.java:580)
at android.support.test.espresso.base.DefaultFailureHandler.getUserFriendlyError(DefaultFailureHandler.java:92)
at android.support.test.espresso.base.DefaultFailureHandler.handle(DefaultFailureHandler.java:56)
at android.support.test.espresso.ViewInteraction.runSynchronouslyOnUiThread(ViewInteraction.java:184)
at android.support.test.espresso.ViewInteraction.check(ViewInteraction.java:158)
at com.mapbox.mapboxsdk.testapp.style.BaseStyleTest.checkViewIsDisplayed(BaseStyleTest.java:25)
at com.mapbox.mapboxsdk.testapp.style.BackgroundLayerTest.testBackgroundColorAsInt(BackgroundLayerTest.java:84)
at java.lang.reflect.Method.invoke(Native Method)
at java.lang.reflect.Method.invoke(Method.java:372)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at android.support.test.internal.statement.UiThreadStatement.evaluate(UiThreadStatement.java:55)
at android.support.test.rule.ActivityTestRule$ActivityStatement.evaluate(ActivityTestRule.java:270)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at android.support.test.internal.runner.TestExecutor.execute(TestExecutor.java:59)
at android.support.test.runner.AndroidJUnitRunner.onStart(AndroidJUnitRunner.java:262)
at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1853) |
Also seeing occurrences of native crashes:
I don't have a symbolicated stack trace for this due to #7358. |
Bump. I'm having to restart most Android builds at least once. |
@jfirebaugh thank you for the bump to prioritize this. I've been going through the latest 80 Bitrise builds and noticed that 21 of them failed. A couple of those failures were related to the PR itself, but most were not. Aside from two infrastructure-related failures (internet/timeout), the remaining failures are related to the following two issues:
cc @mapbox/android |
I can't get #7513 to pass at all (3 retries), although the same commit was passing yesterday.
|
I haven't seen that crash before. This has really spun out of control, since we are now seeing several different crashes. For now, with the upcoming holidays and limited bandwidth, I'm going to scale down the number of instrumentation tests run on CI to one. I will not scale them back up until we are sure the issues in this ticket are addressed. |
It may be unrelated, but we saw this failure via #7725, a simple macOS/iOS PR on the current release branch:
|
In the last week or so, Firebase tests have been flapping:
https://www.bitrise.io/build/74f7be21c7dedd19 — two firebase test failures
https://www.bitrise.io/build/a1ae533de44ce8bd — one firebase test failure
https://www.bitrise.io/build/fa9e4efdc1c76881 — took 45 minutes to timeout when downloading Java
https://www.bitrise.io/build/e3f62dfb41987c22 — firebase networking timeout
https://www.bitrise.io/build/73e363e750edb3d1 — five firebase test failures
https://www.bitrise.io/build/91f588750e7ed5af — device farm timed out
Even after digging through the (very long) logcat, it’s not obvious to me why these are failing. These builds all passed after restarting them. Some are likely because of Bitrise networking instability, but others could represent bugs in our code or tests — it’s hard to say.
Let’s investigate these failures and see if there’s anything we can do to improve reliability.
/cc @tobrun @zugaldia