Weird flicker in video for end to end test on updating to v10.10.0 #24377

Closed
lidiagc opened this issue Oct 25, 2022 · 24 comments · Fixed by #25898
@lidiagc

lidiagc commented Oct 25, 2022

Current behavior

We were previously using Cypress 9.7.0 and now want to update to 10.10.0. However, our end-to-end tests sometimes fail when running on GitLab CI.

As you can see from the video, a weird flicker happens during the test and it eventually fails. Does it seem that the test is jumping up and down the commands as it was running two instances of the test? The video is only showing the start of the test and is pixelated since this is a private project.

weird-video-e2e.mp4

We haven't been able to reproduce this behavior locally, and it doesn't happen consistently on the CI. It is also not specific to this test: other end-to-end tests occasionally fail too, and their videos show the same weird flicker. These tests didn't have this issue before we upgraded Cypress.

Desired behavior

We would like to be able to update cypress to version 10, but this issue is delaying the update since we cannot rely on the end-to-end tests for our CI.

Test code to reproduce

We can't provide the full test code since this is a private project.

  • cypress.config.ts
import { defineConfig } from "cypress";

export default defineConfig({
    chromeWebSecurity: false,
    defaultCommandTimeout: 15000,
    viewportWidth: 1920,
    viewportHeight: 1080,
    videoUploadOnPasses: false,

    projectId: "xxxxxx",

    e2e: {
        setupNodeEvents(on, _config) {},

        retries: {
            runMode: 0,
        },

        specPattern: "cypress/e2e/**/*.{ts,tsx}",
    },
});
  • cypress-dockerfile
FROM cypress/included:10.10.0

RUN npm install -g [email protected]

# other packages installed necessary to run our e2e tests
  • gitlab-ci.yml
itest:end-to-end:
    image: ${ITEST_IMAGE}
    stage: itest
    script:
      # backend necessary scripts
  
      - cd frontend
      - echo -e "\e[0Ksection_start:`date +%s`:frontend-deps[collapsed=true]\r\e[0KInstalling frontend deps"
      - CYPRESS_INSTALL_BINARY=0 pnpm install --no-verify-store-integrity --frozen-lockfile -r
      - echo -e "\e[0Ksection_end:`date +%s`:frontend-deps\r\e[0K"
      - cypress run
        --config baseUrl="http://localhost:8001/",defaultCommandTimeout=60000,watchForFileChanges=false,pageLoadTimeout=100000,numTestsKeptInMemory=0
        --browser chrome
        --reporter mocha-junit-reporter
        --reporter-options 'mochaFile=cypress/reports/junit/junit-[hash].xml'
        --env KC_HOSTNAME="localhost:8080"
        --spec clusters-active.ts
    artifacts:
      when: on_failure
      paths:
        - frontend/cypress/screenshots
        - frontend/cypress/videos
        - logs
      expire_in: 15 days
      reports:
        junit: frontend/cypress/reports/junit/*.xml
  • gitlab output
$ cypress run --config baseUrl="http://localhost:8001/",defaultCommandTimeout=60000,watchForFileChanges=false,pageLoadTimeout=100000,numTestsKeptInMemory=0 --browser chrome --reporter mocha-junit-reporter --reporter-options 'mochaFile=cypress/reports/junit/junit-[hash].xml' --env KC_HOSTNAME="localhost:8080" --spec clusters-active.ts
[10659:1017/234749.919499:ERROR:node_bindings.cc(279)] Most NODE_OPTIONs are not supported in packaged apps. See documentation for more details.
[10659:1017/234750.894445:ERROR:zygote_host_impl_linux.cc(263)] Failed to adjust OOM score of renderer with pid 10889: Permission denied (13)
libva error: vaGetDriverNameByIndex() failed with unknown libva error, driver_name = (null)
[10889:1017/234750.909020:ERROR:gpu_memory_buffer_support_x11.cc(44)] dri3 extension not supported.
====================================================================================================
  (Run Starting)
  ┌────────────────────────────────────────────────────────────────────────────────────────────────┐
  │ Cypress:        10.10.0                                                                        │
  │ Browser:        Chrome 100 (headless)                                                          │
  │ Node Version:   v16.14.2 (/usr/local/bin/node)                                                 │
  │ Specs:          1 found (clusters-active.ts)                                                   │
  │ Searched:       cypress/e2e/customer/clusters-active.ts                                        │
  └────────────────────────────────────────────────────────────────────────────────────────────────┘
────────────────────────────────────────────────────────────────────────────────────────────────────
                                                                                                    
  Running:  clusters-active.ts                                                              (1 of 1)
Timed out waiting for the browser to connect. Retrying...
  (Results)
  ┌────────────────────────────────────────────────────────────────────────────────────────────────┐
  │ Tests:        9                                                                                │
  │ Passing:      0                                                                                │
  │ Failing:      1                                                                                │
  │ Pending:      0                                                                                │
  │ Skipped:      8                                                                                │
  │ Screenshots:  2                                                                                │
  │ Video:        true                                                                             │
  │ Duration:     1 minute, 40 seconds                                                             │
  │ Spec Ran:     clusters-active.ts                                                               │
  └────────────────────────────────────────────────────────────────────────────────────────────────┘
  (Screenshots)
  -  /builds/.../frontend/cypress/screenshots/clusters-act     (1280x720)
     ive.ts/test -- before all hook (failed).png                        
  -  /builds/.../frontend/cypress/screenshots/clusters-act     (1280x720)
     ive.ts/test -- after all hook (failed).png                         
  (Video)
  -  Started processing:  Compressing to 32 CRF                                                     
  -  Finished processing: /builds/.../frontend/cypress/vid    (6 seconds)
                          eos/clusters-active.ts.mp4                                                
====================================================================================================
  (Run Finished)
       Spec                                              Tests  Passing  Failing  Pending  Skipped  
  ┌────────────────────────────────────────────────────────────────────────────────────────────────┐
  │ ✖  clusters-active.ts                       01:40        9        -        1        -        8 │
  └────────────────────────────────────────────────────────────────────────────────────────────────┘
    ✖  1 of 1 failed (100%)                     01:40        9        -        1        -        8  

Cypress Version

10.10.0

Node version

v16.14.2

Operating System

cypress/included:10.10.0

Debug Logs

No response

Other

Which debug logs would be more relevant for this? DEBUG=cypress:* has too many logs, so the GitLab job exceeds the maximum logs saved before the test finishes.

@mschile
Contributor

mschile commented Oct 31, 2022

Hi @lidiagc 👋 , thanks for logging this issue.

Does it seem that the test is jumping up and down the commands as it was running two instances of the test?

Yes, it does look like that though I can't think of anything that would cause this type of behavior.

Does this happen on all browsers or just Chrome? If you run just a single spec does it happen? Does it also happen in Cypress 10.0.3?

@lidiagc
Author

lidiagc commented Nov 3, 2022

Hi @mschile, thank you for replying!

We have in total 4 jobs that run our end-to-end tests:

  • (1/4): runs only one spec file (the one I mentioned in the issue)
  • (2/4): runs two spec files
  • (3/4): runs two spec files
  • (4/4): runs four spec files

First, I tested with version 10.10.0 and browser Electron:

  • (1/4)
    The same issue happened at least once

  • (2/4)
    The flicker happened on the first spec and the test failed
    The second spec passed

  • (3/4)
    Similarly, the flicker happened on the first spec and the test failed
    The second spec passed

  • (4/4)
    Again, the flicker happened on the first spec and the test failed
    The following three spec files passed

Our tests are not prepared to run with Firefox, so they all failed but not because of the flicker.

Then, I downgraded to version 10.0.3 and ran the tests a couple of times both with Chrome and Electron and the flicker issue never appeared. I'm guessing something after that version changed so the issue happens occasionally?

It's good to know that there is a version of Cypress 10 that doesn't have this issue; however, we wouldn't want to upgrade to this version because of the relative path issue that was fixed in version 10.9.0. Our component tests have a lot of similar names, and what differentiates them is their path. Upgrading to version 10.0.3 wouldn't be viable.

Let me know if there are other scenarios you want me to test!

@mschile
Contributor

mschile commented Nov 3, 2022

Thanks for the update @lidiagc! I would love to narrow down which Cypress version caused the regression. My initial guess is that Cypress 10.8.0 may have broken it. Would you be able to try 10.7.0 and 10.8.0?

@lidiagc
Author

lidiagc commented Nov 4, 2022

I tried versions 10.7.0 and 10.8.0 and the flicker never happened. Then, I upgraded to version 10.9.0 and it occurred. I suppose something in that version caused the regression?

@mschile
Contributor

mschile commented Nov 8, 2022

@lidiagc, thanks for determining which version caused the regression. Unfortunately, I haven't been able to reproduce the flickering. I know you aren't able to provide a link to your private repository, but are you able to recreate the issue in a public one or possibly using the cypress-test-tiny project?

@mschile mschile assigned ryanthemanuel and unassigned mschile Nov 8, 2022
@ryanthemanuel
Collaborator

Hi @lidiagc. We have a theory about things that might be causing video oddities. I created a 10.9 custom binary with some code tweaks to test out that theory. Can you try this binary to see if it works?

npm install https://cdn.cypress.io/beta/npm/10.9.1/linux-x64/10.9.0-minus-video-refactor-c2135e7e0e6b269a755e7f4309f90630af81d3b9/cypress.tgz

@lidiagc
Author

lidiagc commented Nov 18, 2022

Hi @ryanthemanuel, thank you for looking into this. I installed the custom binary and ran the end-to-end tests on GitLab. The flicker still happens and it seems way more frequent than before.

We have recently upgraded to version 11.0.1 and we have encountered the flicker a couple of times, but it seemed way less frequent than in version 10, hence our decision to upgrade.

@mjhenkes mjhenkes assigned mjhenkes and unassigned ryanthemanuel Nov 22, 2022
@mjhenkes
Member

@lidiagc, given that you're seeing this less frequently on 11.0.1 would you be ok with us closing this issue?

@mjhenkes mjhenkes closed this as not planned Nov 29, 2022
@chasemgray

chasemgray commented Jan 2, 2023

Hi @mjhenkes
I've been struggling to debug this issue for the last week or two. The Cypress debug logs show that Cypress launches one Chrome instance but fails to connect to it. It then quickly launches a second instance, and that one connects successfully. Both Chrome instances run in parallel and execute the same tests. This produces duplicate logs, duplicate commands, and multiple DOM snapshots, all of which get compiled into a single video. The video appears to flicker, but it is really oscillating between the two browsers running the same tests. This still occurs on Cypress versions up to 12.2.0 and started around version 10.10.0. It doesn't occur for us on version 10.0.0.
We have been able to reduce the occurrences of this issue by increasing CYPRESS_INTERNAL_BROWSER_CONNECT_TIMEOUT to a very high value so that Cypress eventually connects to Chrome. Do we know why newer versions of Cypress fail to connect within the timeout period? And when the connection fails, why isn't Cypress aware that an instance of Chrome is still running the tests in the background while it launches another one?
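
The failure mode described above can be sketched as a toy model. This is not Cypress source code: `simulateLaunch`, its parameters, and the startup times are hypothetical, and `connectTimeoutMs` only stands in for CYPRESS_INTERNAL_BROWSER_CONNECT_TIMEOUT. It assumes that a browser which misses the timeout stays alive unless it is explicitly closed.

```typescript
// Toy model of a connect-timeout-and-retry loop. NOT Cypress source code.
interface LaunchOutcome {
  attempts: number;       // how many browsers were launched
  connected: boolean;     // whether any attempt connected within the timeout
  runningTests: number[]; // 1-based ids of browsers that end up running the spec
}

function simulateLaunch(
  startupMs: number[],      // hypothetical time each launched browser needs to connect
  connectTimeoutMs: number, // stands in for CYPRESS_INTERNAL_BROWSER_CONNECT_TIMEOUT
  killOnTimeout: boolean,   // whether a timed-out browser is closed before retrying
): LaunchOutcome {
  const runningTests: number[] = [];
  let attempts = 0;
  let connected = false;

  for (const startup of startupMs) {
    attempts += 1;
    if (startup <= connectTimeoutMs) {
      // Connected in time: this is the browser the runner knows about.
      runningTests.push(attempts);
      connected = true;
      break;
    }
    // Timed out. If the browser is not closed, it connects late and
    // runs the same spec alongside the retry.
    if (!killOnTimeout) {
      runningTests.push(attempts);
    }
  }

  return { attempts, connected, runningTests };
}

// Slow first launch, short timeout, no cleanup: two browsers run the spec.
console.log(simulateLaunch([800, 200], 500, false));
// A generous timeout lets the first launch connect: one browser.
console.log(simulateLaunch([800, 200], 2000, false));
// Closing the timed-out browser before retrying also leaves one browser.
console.log(simulateLaunch([800, 200], 500, true));
```

In this model, raising the timeout only makes the overlap less likely; closing the abandoned browser removes it.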

@nagash77
Contributor

nagash77 commented Jan 3, 2023

Hi @chasemgray , can you please open a new issue with your specific problem and a reproducible example?

@lidiagc
Author

lidiagc commented Jan 3, 2023

Hi, just wanted to update here that we are still experiencing what @chasemgray described in Cypress 12.1.0. We occasionally get failed tests in the CI with the video showing the same flickering behavior first described in #24377 (comment).

@chasemgray

We have an open internal ticket with cypress support (10338) just in case the team wants to reference it there.

@chasemgray

This issue also appears to be related to this problem #22825
Multiple Chrome browsers are causing duplicate calls to .next() in cypress middleware which causes issues.

Looking back through our logs it appears this issue has existed for a very long time in Cypress if Cypress fails to launch or terminate a browser. We have old runs that I looked back at and found the same symptoms. It got much worse on some minor version of 10.X.X.

Here is a screenshot of where it fails to detect that chrome launched

Screen Shot 2023-01-03 at 12 05 38 PM

@chasemgray

One big change I can see here is the change from Bluebird.join to Bluebird.all
3c2fea2#diff-23312a21d720a74c51c81d569ca48200ab80885b2f4415df78e39cea5685788fL551

Bluebird.join's last argument is supposed to be a function to call, whereas Bluebird.all expects an array of promises. If this wasn't an intended change (which I'm assuming, since the pull request was converting code to TypeScript), it seems like it would significantly affect waiting properly for the browser launch.
http://bluebirdjs.com/docs/api/promise.join.html
http://bluebirdjs.com/docs/api/promise.all.html

Maybe I'm completely off. I was just looking at the code history to see what could have changed this to make it so much worse.

@chasemgray

@chrisbreiding @nagash77 any update on the issue?

@nagash77
Contributor

nagash77 commented Jan 5, 2023

Hi @chasemgray , no update just yet. @chrisbreiding is hard at work diving in.

@chrisbreiding
Contributor

@chasemgray Thanks for investigating this and I think you may be onto something, but unfortunately I don't think that exact change is what's causing the issue. The change from Bluebird.join to Bluebird.all in that commit is functionally equivalent.

From the Bluebird.join doc:

This behavior has been deprecated but is still supported partially - when the last argument is an immediate function value the new semantics will apply

Bluebird checks if the last argument is a function (as opposed to a promise) and uses the documented behavior. In the case of the code in question, both functions return a promise, so Bluebird uses the old semantics.

I do think it's the originally intended behavior that we wait on both promises (connecting to the socket and launching the browser) before moving on, but that's not to say there isn't a bug or race condition there. The browser launching process is fairly complex, so there's a lot of potential for something to go wrong. I think it's a good area to poke at to determine the cause for this issue, so I'm digging into it more and will hopefully come up with the root of the problem.
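
The point about the last argument can be illustrated with a small stand-in. This is a sketch, not Bluebird itself: `joinLike` is a hypothetical helper that mimics the documented dispatch on native promises, and `waitForSocket` and `launchBrowser` are made-up names, not the functions in the Cypress commit.

```typescript
// Hypothetical stand-in that mimics the documented dispatch of Bluebird.join
// using native promises. It is NOT Bluebird and not Cypress code.
async function joinLike(...args: unknown[]): Promise<unknown> {
  const last = args[args.length - 1];
  if (args.length > 1 && typeof last === "function") {
    // Last argument is an immediate function: resolve the rest, then call it.
    const values = await Promise.all(args.slice(0, -1));
    return (last as (...v: unknown[]) => unknown)(...values);
  }
  // Otherwise every argument is treated as a value or promise,
  // which is what Promise.all does with an array.
  return Promise.all(args);
}

// Made-up functions that each return a promise when called.
const waitForSocket = (): Promise<string> => Promise.resolve("socket connected");
const launchBrowser = (): Promise<string> => Promise.resolve("browser launched");

// Both arguments are promises (the functions have already been called),
// so the result is an array of both values, as with Promise.all.
joinLike(waitForSocket(), launchBrowser()).then((result) => console.log(result));

// Only when the last argument is itself a function does it act as a handler.
joinLike(waitForSocket(), launchBrowser(), (a: string, b: string) => `${a} + ${b}`)
  .then((result) => console.log(result));
```

Under these assumptions, replacing a two-promise join with an all over the same two promises waits for the same things and yields the same values.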

@chasemgray

chasemgray commented Jan 8, 2023

@chrisbreiding It seems like there are two issues:

  • Increased frequency of an unconnected chrome instance or one that fails to be terminated when it doesn't launch in time.
  • Cypress runner allows messages to come from multiple browsers. I would think that it would be able to detect, either via port or something else, if the messages come from the browser it believes launched successfully. Though this wouldn't prevent the rogue browser instance from causing side effects, etc.
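
The second point could look roughly like the sketch below. Everything here is hypothetical: the `RunnerMessage` shape, the `browserId` field, and the `SingleBrowserGate` class are invented for illustration, since the thread does not show how Cypress identifies which browser a message came from.

```typescript
// Hypothetical message shape; the real Cypress socket protocol is not shown in this thread.
interface RunnerMessage {
  browserId: string; // assumed identifier for the browser instance that sent the message
  event: string;
}

// Accepts messages only from the browser the runner believes it launched.
class SingleBrowserGate {
  private activeId: string | null = null;
  readonly rejected: RunnerMessage[] = [];

  setActive(browserId: string): void {
    this.activeId = browserId;
  }

  accept(msg: RunnerMessage): boolean {
    if (msg.browserId !== this.activeId) {
      // Message from a browser that is not the active one: record it and drop it.
      this.rejected.push(msg);
      return false;
    }
    return true;
  }
}

const gate = new SingleBrowserGate();
gate.setActive("attempt-2"); // the retry that connected within the timeout

console.log(gate.accept({ browserId: "attempt-2", event: "test:start" })); // true
console.log(gate.accept({ browserId: "attempt-1", event: "test:start" })); // false
console.log(gate.rejected.length); // 1
```

As noted in the comment above, a filter like this would only keep the rogue browser's messages out of the runner; it would not stop that browser from causing side effects in the application under test.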

@AtofStryker
Contributor

AtofStryker commented Jan 23, 2023

@chasemgray @lidiagc do we have a definite way to reproduce this issue or get into this state where we can see the flickering?

@chasemgray

Significantly slowing down the Chrome launch seems to be the most common trigger we see. Otherwise, Chrome might launch in time for the Cypress logic to detect it.

Isn't there a way for cypress to just fail when it detects messages coming in on multiple ports?

@chasemgray

Also, if you want to look at this over zoom I can walk you through a lot of the debug details that showed us this was due to chrome launching twice.

@mschile
Contributor

mschile commented Feb 3, 2023

@lidiagc and @chasemgray, I was able to reproduce the video flicker locally with cypress run by lowering the CYPRESS_INTERNAL_BROWSER_CONNECT_TIMEOUT to 500 ms. With this reproduction, we'll be able to investigate the root cause and come up with a solution. Thank you for your continued patience on this issue.

cypress_api.cy.js.mp4

Isn't there a way for cypress to just fail when it detects messages coming in on multiple ports?

Yes, that is probably possible though my guess is the root cause is higher up and we'll want to figure out why the browser is not closing as expected in the first place.

@chrisbreiding
Contributor

#25898 should fix this issue by preventing more than one browser from being connected at a time. It will be out with the next release, but if you'd like to check it out ahead of time, I'd recommend trying out the prerelease build for the latest commit on the develop branch.

Thanks again, @chasemgray, for digging into this. Recognizing that multiple browsers were being connected helped pinpoint the root cause of the issue.
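
One way to picture "no more than one browser connected at a time" is a single slot that closes whatever it held before tracking a new browser. This is only an illustration of the idea and is not the change made in #25898, whose diff is not shown in this thread. `BrowserHandle`, `BrowserSlot`, and `makeBrowser` are hypothetical names.

```typescript
// Hypothetical handle; a real launcher would wrap a browser process.
interface BrowserHandle {
  id: string;
  close(): void;
}

// Tracks at most one browser. Replacing it closes the previous one first.
class BrowserSlot {
  private current: BrowserHandle | null = null;

  replace(next: BrowserHandle): void {
    if (this.current !== null && this.current.id !== next.id) {
      this.current.close();
    }
    this.current = next;
  }

  activeId(): string | null {
    return this.current === null ? null : this.current.id;
  }
}

// Record which browsers were closed so the behavior is observable.
const closed: string[] = [];
const makeBrowser = (id: string): BrowserHandle => ({
  id,
  close: () => {
    closed.push(id);
  },
});

const slot = new BrowserSlot();
slot.replace(makeBrowser("attempt-1")); // first launch, misses the timeout
slot.replace(makeBrowser("attempt-2")); // retry: the first browser is closed here

console.log(slot.activeId()); // "attempt-2"
console.log(closed);          // ["attempt-1"]
```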

@cypress-bot
Contributor

cypress-bot bot commented Feb 25, 2023

Released in 12.7.0.

This comment thread has been locked. If you are still experiencing this issue after upgrading to
Cypress v12.7.0, please open a new issue.

@cypress-bot cypress-bot bot locked as resolved and limited conversation to collaborators Feb 25, 2023