Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with CI #132

Closed
wipfli opened this issue Apr 11, 2021 · 27 comments
Closed

Problems with CI #132

wipfli opened this issue Apr 11, 2021 · 27 comments

Comments

@wipfli
Copy link
Contributor

wipfli commented Apr 11, 2021

I submitted pull request #131 and in the first test run unit-test of the action CI failed with

test/unit/ui/handler/scroll_zoom.test.js .............. 1/3 31s
  ScrollZoomHandler > Zooms for single mouse wheel tick with non-magical deltaY
  not ok timeout!
    signal: SIGTERM
    expired: TAP
    stack: |
      emit (node_modules/signal-exit/index.js:77:11)
      process.listener (node_modules/signal-exit/index.js:91:7)
    test: Zooms for single mouse wheel tick with non-magical deltaY
  
  ScrollZoomHandler
  not ok timeout!

So this was https://github.com/maplibre/maplibre-gl-js/actions/runs/738748443.

@nyurik then asked me to add a line to the changelog, which I did and which then triggered another run of the tests which went through without any error. This second run which worked was https://github.com/maplibre/maplibre-gl-js/actions/runs/738794426.

Any idea why the test might be unreliable?

@wipfli
Copy link
Contributor Author

wipfli commented Apr 11, 2021

@msbarry

@msbarry
Copy link
Contributor

msbarry commented Apr 11, 2021

There's probably some flakiness in the tests. I've seen it more often in the browser tests, but could be in the unit tests as well. We might want to change the github action config so that the tests retry once in case of a failure?

@wipfli
Copy link
Contributor Author

wipfli commented Apr 11, 2021

I have mixed feelings about rerunning the test automatically. The problem is that the action cannot distinguish between a flaky test and one which fails because it has to fail. I suggest we trigger tests manually if we want to rerun them. Like this we can spot problems with the tests faster.

Maybe we can keep this issue to keep track flaky test runs.

@wipfli
Copy link
Contributor Author

wipfli commented Apr 13, 2021

Here is another false negative:

Run yarn install --frozen-lockfile
yarn install v1.22.5
[1/5] Validating package.json...
[2/5] Resolving packages...
[3/5] Fetching packages...
info There appears to be trouble with your network connection. Retrying...
info There appears to be trouble with your network connection. Retrying...
info There appears to be trouble with your network connection. Retrying...
info There appears to be trouble with your network connection. Retrying...
info There appears to be trouble with your network connection. Retrying...
error An unexpected error occurred: "https://registry.yarnpkg.com/rxjs/-/rxjs-6.5.4.tgz: ESOCKETTIMEDOUT".
info If you think this is a bug, please open a bug report with the information provided in "/Users/runner/work/maplibre-gl-js/maplibre-gl-js/yarn-error.log".
info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
Error: Process completed with exit code 1.

This is in CI -> macos-latest

@wipfli
Copy link
Contributor Author

wipfli commented Apr 14, 2021

And in https://github.com/maplibre/maplibre-gl-js/runs/2339962584?check_suite_focus=true

macos-latest:

test/unit/source/source_cache.test.js ............. 144/147
  SourceCache reloads expiring tiles > calls reloadTile when tile expires
  not ok test unfinished
    stack: |
      Test._ade‍.r.test (test/unit/source/source_cache.test.js:1584:7)
      Parser.emit (node_modules/minipass/index.js:414:25)
      Parser.emit (node_modules/minipass/index.js:414:25)
    test: calls reloadTile when tile expires
    at:
      line: 1584
      column: 7
      file: test/unit/source/source_cache.test.js
      function: Test._ade‍.r.test
    source: |
      t.test('calls reloadTile when tile expires', (t) => {
  
  not ok child test left in queue: t.test SourceCache sets max cache size correctly
  not ok test count !== plan
    --- wanted
    +++ found
    -13
    +14
    results:
      ok: false
      count: 14
      pass: 12
      fail: 2
      bailout: false
      todo: 0
      skip: 0
      plan:
        start: 1
        end: 13
        skipAll: false
        skipReason: ''
        comment: ''
      failures:
        - ok: false
          id: 13
          time: 99.598
          name: SourceCache reloads expiring tiles
        - ok: false
          id: 14
          name: >-
            child test left in queue: t.test SourceCache sets max cache size
            correctly
          parent: ''

and test-browser:

test/browser/drag.test.js
  start server

    ✓ Listening at 10.1.0.4:9968

  start browser

    ✓ platform: linux

    ✓ browser: chrome

    ✓ version: 89.0.4389.114

  dragging
    drag to the left

      1) timeout!
1..0 # no tests found
1..0 # no tests found

      2) test count !== plan
    test/browser/zoom.test.js
      start server

        ✓ Listening at 10.1.0.4:9968
      start browser

        ✓ platform: linux

        ✓ browser: chrome

        ✓ version: 89.0.4389.114
      zooming
        double click at the center

          ✓ zoomed in by 1 zoom level

  9 passing (34s)
  2 failing

  1) test/browser/drag.test.js dragging drag to the left timeout!:
     Error: timeout!


  2) test/browser/drag.test.js test count !== plan:

      test count !== plan
      + expected - actual

      -2
      +1
      
  

error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
Error: Process completed with exit code 1.

@wipfli
Copy link
Contributor Author

wipfli commented Apr 16, 2021

And in CI -> test-browser -> test-chrome

Run xvfb-run -s "-ac -screen 0 1280x1024x24" yarn run test-browser
yarn run v1.22.10
$ cross-env build/run-tap --reporter spec --no-coverage test/browser/**/*.test.js

test/browser/drag.test.js
  start server

    ✓ Listening at 10.1.0.4:9968

  start browser

    1) session not created: This version of ChromeDriver only supports Chrome version 90 Current browser version is 89.0.4389.114 with binary path /usr/bin/google-chrome

  1 passing (9s)
  1 failing

  1) test/browser/drag.test.js start browser session not created: This version of ChromeDriver only supports Chrome version 90 Current browser version is 89.0.4389.114 with binary path /usr/bin/google-chrome:
     Error: session not created: This version of ChromeDriver only supports Chrome version 90 Current browser version is 89.0.4389.114 with binary path /usr/bin/google-chrome
      at Test._334‍.r.tap.test (test/browser/util/browser.js:82:11)


    2) test count !== plan
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
Error: Process completed with exit code 1.

https://github.com/maplibre/maplibre-gl-js/pull/137/checks?check_run_id=2364438111

@wipfli
Copy link
Contributor Author

wipfli commented Apr 19, 2021

https://github.com/maplibre/maplibre-gl-js/pull/140/checks?check_run_id=2380509547

test/unit/ui/handler/scroll_zoom.test.js ............. 9/10 31s
  ScrollZoomHandler > Zooms for single mouse wheel tick with non-magical deltaY
  not ok timeout!
    signal: SIGTERM
    expired: TAP
    stack: |
      emit (node_modules/signal-exit/index.js:77:11)
      process.listener (node_modules/signal-exit/index.js:91:7)
    test: Zooms for single mouse wheel tick with non-magical deltaY

ci -> macos-latest -> unit-test

@wipfli
Copy link
Contributor Author

wipfli commented Apr 19, 2021

https://github.com/maplibre/maplibre-gl-js/pull/140/checks?check_run_id=2380738229

[5/5] Building fresh packages...
error D:\a\maplibre-gl-js\maplibre-gl-js\node_modules\node-plantuml: Command failed.
info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
Exit code: 1
Command: node scripts/get-vizjs.js
Arguments: 
Directory: D:\a\maplibre-gl-js\maplibre-gl-js\node_modules\node-plantuml
Output:
graphviz was not found on the system. Downloading vizjs instead. See http://plantuml.com/vizjs. This may take a few minutes.

ci -> windows-latest -> yarn install

@astridx
Copy link
Contributor

astridx commented Jun 29, 2021

https://github.com/maplibre/maplibre-gl-js/pull/140/checks?check_run_id=2380509547

test/unit/ui/handler/scroll_zoom.test.js ............. 9/10 31s
  ScrollZoomHandler > Zooms for single mouse wheel tick with non-magical deltaY
  not ok timeout!
    signal: SIGTERM
    expired: TAP
    stack: |
      emit (node_modules/signal-exit/index.js:77:11)
      process.listener (node_modules/signal-exit/index.js:91:7)
    test: Zooms for single mouse wheel tick with non-magical deltaY

ci -> macos-latest -> unit-test

When I run the tests, this is the only one error message I get, while running the tests.

test/unit/ui/handler/mouse_rotate.test.js ............. 7/7 4s
test/unit/ui/handler/scroll_zoom.test.js ............. 9/10 30s
  ScrollZoomHandler > Zooms for single mouse wheel tick with non-magical deltaY
  not ok timeout!
    signal: SIGTERM
    expired: TAP
    stack: |
      emit (node_modules/signal-exit/index.js:77:11)
      process.listener (node_modules/signal-exit/index.js:91:7)
    test: Zooms for single mouse wheel tick with non-magical deltaY

test/unit/ui/handler/touch_zoom_rotate.test.js ...... 64/64
test/unit/ui/map/isMoving.test.js ................... 17/17 5s
test/unit/ui/map/isRotating.test.js ................... 5/5 4s
test/unit/ui/map/isZooming.test.js .................... 7/7 4s
test/unit/ui/map/requestRenderFrame.test.js ........... 3/3 4s
total ......................................... 16627/16628
  

  16627 passing (6m)
  1 failing

npm ERR! code 1

This are my versions:

$ npm -v
7.6.1
$ git/maplibre/maplibre-gl-js$ node -v
v10.23.0

@HarelM
Copy link
Collaborator

HarelM commented Jun 29, 2021

I've seen this test fail too, seems like this specific zoom test is flaky...?

@HarelM
Copy link
Collaborator

HarelM commented Jun 29, 2021

Sorry, my bad, these are different tests, but related to zoom... I don't remember exactly which one failed in my case, but it was related to zoom...

@astridx
Copy link
Contributor

astridx commented Jun 29, 2021

@HarelM No, it has nothing to do with your code. The error occurs before.

@wipfli
Copy link
Contributor Author

wipfli commented Jun 30, 2021

After a merge to main test-unit failed:

test/unit/ui/camera.test.js ....................... 281/282
  camera > #easeTo > duration is 0 when prefers-reduced-motion: reduce is set
  not ok Camera transition time exceeded expected range( [0,10) ) :13
    at:
      line: 39
      column: 22
      file: test/unit/ui/camera.test.js
      type: Camera
      function: camera.on.on
    stack: |
      Camera.camera.on.on (test/unit/ui/camera.test.js:39:22)
      Camera.fire (src/util/evented.js:119:26)
      Camera._afterEase (src/ui/camera.js:938:14)
      _ease (src/ui/camera.js:876:18)
      Camera._ease (src/ui/camera.js:1199:13)
      Camera.easeTo (src/ui/camera.js:844:14)
      Test.t.test (test/unit/ui/camera.test.js:969:20)
      Parser.emit (node_modules/minipass/index.js:414:25)
      Camera.camera.on.on (test/unit/ui/camera.test.js:948:23)
      Camera.fire (src/util/evented.js:119:26)
      Camera._afterEase (src/ui/camera.js:938:14)
      Camera._ease (src/ui/camera.js:876:18)
      Camera._stop (src/ui/camera.js:1185:23)
      Camera.stop (src/ui/camera.js:1169:21)
      Camera._renderFrameCallback (src/ui/camera.js:1216:18)
      TaskQueue.run (src/util/task_queue.js:52:18)
      Camera.camera.simulateFrame (test/unit/ui/camera.test.js:14:44)
      Timeout.setTimeout (test/unit/ui/camera.test.js:959:28)
    source: >
      test.ok(timeDiff >= min && timeDiff < max, `Camera transition time exceeded
      expected range( [${min},${max}) ) :${timeDiff}`);

@wipfli
Copy link
Contributor Author

wipfli commented Jun 30, 2021

xvfb-run -s "-ac -screen 0 1280x1024x24" yarn run test-unit is the command we use to run these tests. Is there maybe some configuration option to have it more reliable?

@HarelM
Copy link
Collaborator

HarelM commented Jun 30, 2021

I think we should flag the tests that we see that sometimes fail and and some retry mechanism or something. Timing can changed based on machine load, so these tests should be more resilient...

@HarelM
Copy link
Collaborator

HarelM commented Jul 11, 2021

Here's another failure, no code changes were actually made (only changes in typing). the following is the test that failed:
I'm just writing it down for future reference.

test/unit/ui/handler/dblclick_zoom.test.js ........... 9/10 4s
  DoubleClickZoomHandler does not zoom on double tap if touchstart events are > 500ms apart
  not ok should be equal
    --- wanted
    +++ found
    -0
    +1
    compare: ===
    at:
      line: 79
      column: 11
      file: test/unit/ui/handler/dblclick_zoom.test.js
      function: simulateDoubleTap.then
    stack: |
      simulateDoubleTap.then (test/unit/ui/handler/dblclick_zoom.test.js:79:11)
    source: |
      t.equal(zoom.callCount, 0);

Also this test:

test/unit/ui/camera.test.js ....................... 281/282
  camera > #flyTo > resets duration to 0 if it exceeds maxDuration
  not ok should be equal to within 10
    --- wanted
    +++ found
    -0
    +10
    compare: ===
    at:
      line: 37
      column: 17
      file: test/util/index.js
      function: equalWithPrecision
    stack: |
      equalWithPrecision (test/util/index.js:37:17)
      Camera.camera.on.on (test/unit/ui/camera.test.js:1626:21)
      Camera.fire (src/util/evented.js:119:26)
      Camera._afterEase (src/ui/camera.js:938:14)
      _ease (src/ui/camera.js:1153:23)
      Camera._ease (src/ui/camera.js:1199:13)
      Camera.flyTo (src/ui/camera.js:1129:14)
      Test.t.test (test/unit/ui/camera.test.js:1630:20)
      loop (node_modules/function-loop/index.js:46:15)
      loop (node_modules/function-loop/index.js:46:15)
      loop (node_modules/function-loop/index.js:46:15)
      Parser.emit (node_modules/minipass/index.js:414:25)
      loop (node_modules/function-loop/index.js:46:15)
      loop (node_modules/function-loop/index.js:46:15)
      loop (node_modules/function-loop/index.js:46:15)
      Camera.camera.on (test/unit/ui/camera.test.js:1604:19)
      Camera.fire (src/util/evented.js:119:26)
      Camera._afterEase (src/ui/camera.js:938:14)
      Camera._ease (src/ui/camera.js:1153:23)
      Camera._stop (src/ui/camera.js:1185:23)
      Camera.stop (src/ui/camera.js:1169:21)
      Camera._renderFrameCallback (src/ui/camera.js:1216:18)
      TaskQueue.run (src/util/task_queue.js:52:18)
      Camera.camera.simulateFrame (test/unit/ui/camera.test.js:14:44)
      Timeout.setTimeout (test/unit/ui/camera.test.js:1613:24)
    source: |
      return test.equal(expectedRounded, actualRounded, message, extra);

@wipfli
Copy link
Contributor Author

wipfli commented Sep 5, 2021

Problem with the release workflow: https://github.com/maplibre/maplibre-gl-js/runs/3514866759?check_suite_focus=true

Error: No files matching the pattern "'src/css/maplibre-gl.css'" were found.
    at D:\a\maplibre-gl-js\maplibre-gl-js\node_modules\stylelint\lib\standalone.js:212:12
    at processTicksAndRejections (internal/process/task_queues.js:95:5)

@wipfli
Copy link
Contributor Author

wipfli commented Sep 5, 2021

Release still fails. @HarelM can you have a look at the error message. @msbarry can you remind me why release runs on windows?

@wipfli wipfli changed the title Timeout in unit test Problems with CI Sep 5, 2021
@HarelM
Copy link
Collaborator

HarelM commented Sep 5, 2021

The problem is with the ' around the file name in lint-css when running on Windows.
Removing it solves the issue, haven't tried it on mac or linux yet...

@msbarry
Copy link
Contributor

msbarry commented Sep 5, 2021

Looks like that was added by @petr-pokorny-1 in #95 "First jobs perform tests. It runs on Windows runner since Linux runners don't have GPU, which is required by many tests. Second job runs on Linux and creates builds minified library, creates github release, uploads release assets, updates version in npm package and submits package in the npm registry."

@wipfli
Copy link
Contributor Author

wipfli commented Sep 5, 2021

Thanks for the explanation @msbarry.

I think it is a real problem that we have different CI strategies for lint, build, test and for lint, build, test, release. Release should use the exact same steps as the normal CI does which we run against all pull requests and pushes to main. Let me know if there are constraints which made you choose a different approach in the past.

@msbarry
Copy link
Contributor

msbarry commented Sep 5, 2021

Definitely agree @wipfli - I set up the ci and analyze workflows not the release one. I seem to remember it was tricky to factor out common reused steps. Another option might be to chain them, so you can trigger a release workflow off of a successful build?

@HarelM
Copy link
Collaborator

HarelM commented Sep 5, 2021

I opened another issue on it - to split the steps to different yml files, after that you can define the triggers for each yml file I guess, maybe...

@github-actions
Copy link
Contributor

github-actions bot commented Nov 5, 2021

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Nov 5, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Dec 5, 2021

This issue was closed because it has been stalled for 30 days with no activity.

@github-actions github-actions bot closed this as completed Dec 5, 2021
@wipfli wipfli reopened this Dec 27, 2021
@wipfli
Copy link
Contributor Author

wipfli commented Dec 27, 2021

Poor GitHub seems to have a hard time will all our CI scripts.

@github-actions github-actions bot removed the stale label Dec 28, 2021
@HarelM
Copy link
Collaborator

HarelM commented Feb 11, 2022

I'm no longer experiencing this issues.
I'll close this one and we'll open it in case we need to.

@HarelM HarelM closed this as completed Feb 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants