Add E2E testing #917
Testing frameworks in node.js are fragmented. I think we should separate unit tests from integration (feature) tests, and use a different framework for each test suite. Currently, I think:
I don't have the expertise here to comment, but am happy to have them separated. In general, I think feature (e2e?) tests are needed. Over the years we've added amazing capabilities to auspice, and we have hundreds of datasets which make different use of these varied features. As such, manually testing a decent proportion of the datasets at each PR has become untenable, and this leads to bugs which often take a while to discover 😦 Creating a dev environment which can give us confidence in this area would be wonderful for nextstrain.
I'm happy to help out here. Can you write down the main test cases that you want coverage for? Especially the use cases you think are most important. For example:
- Rendering
  - Tree layout
  - Map layout
For integration testing of general frontend logic, I've had good experiences with Cypress (https://www.cypress.io/), especially in contrast to previous experiences with setting up Selenium/PhantomJS-based testing. It integrates well with CI, so it seems like a logical first step might be to add a Cypress test step to the existing Travis pipeline.

For visual regression testing, I've heard that others have had good experiences with percy.io, but I haven't personally used it. It doesn't sound like this is necessary, but I'm mentioning it just in case it's what you had in mind.

In order to test that the app works with these different datasets: would you be comfortable with having the fixture data files hardcoded into this repository? If so, it would be helpful to know which datasets form a representative set, with a good mixture of properties for a test suite to exercise.
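(For a concrete sense of what this would look like, a minimal sketch of a Cypress spec follows; the dev-server URL, route, and button label are assumptions for illustration, not auspice's actual setup.)

```js
// cypress/integration/zika_spec.js -- a hedged sketch, not real project code.
describe('Zika dataset', () => {
  it('loads the page and starts the animation', () => {
    cy.visit('http://localhost:4000/zika'); // assumed local dev-server URL
    cy.get('svg').should('exist');          // the tree renders as an SVG
    cy.contains('Play').click();            // find the animation button by its visible label
  });
});
```

Cypress runs inside the browser next to the app, which is a big part of why its setup is lighter than Selenium-based stacks.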
cypress.io looks good. It seems like the framework to get started with now.
Cypress's bottleneck is that test recording and parallelization are bundled together. So if we run out of test quota, CI/CD tests will stop working. Our team recently (in the last 3 months) evaluated Cypress against Puppeteer, and went with Puppeteer to avoid the hard bottleneck; it also paves the way for an easier upgrade path to Playwright for cross-browser testing in Chrome, Firefox, and Safari down the road!

For unit testing React, Jest + Enzyme will do. Integration tests can be Jest + Puppeteer. Under the hood, Cypress uses Mocha instead of Jasmine, whereas Jest uses Jasmine already, so that could streamline vendor library dependencies too.

Resources:
1. Puppeteer: https://github.com/puppeteer/puppeteer
2. jest-puppeteer: https://github.com/smooth-code/jest-puppeteer
3. Playwright: https://github.com/microsoft/playwright (the original Puppeteer team from Google moved on to Microsoft to create this)
4. Jest: https://github.com/facebook/jest
5. Enzyme: https://github.com/enzymejs/enzyme
@tihuan thanks for the info, very useful. I always find the node.js ecosystem a mess :). In these situations, I'd delegate the decision to whoever takes the lead. I'm happy to help out with writing some tests.
Thanks for the quick reply, @dnprock 😄 Yeah, not sure who's gonna lead the testing framework setup, but I can help prototype integration tests with jest + puppeteer if needed!
To clarify, the test recording limit is about sending the test results to the cypress.io dashboard hosted service, which is optional: the test runner itself can be used without the hosted dashboard and is still quite helpful. Additionally, they have an unlimited plan for qualifying open source projects (https://www.cypress.io/oss-plan). I don't think this necessarily means Cypress should be picked, but I don't think the quota concern would apply here.

The point about cross-browser testing is a good one to keep in mind, though I'm not sure whether it's needed (someone in the org can probably check via Google Analytics what the userbase looks like). In fairness to Cypress, they support Firefox and Edge as of last month, but testing Safari does seem unique to Playwright for now.
Hey @hydrosquall thanks for adding the extra info! Cypress ties the dashboard and parallel tests together, since it's pretty much their money maker from what I read here: cypress-io/cypress#2520. So running without the dashboard, or past the quota, blocks parallel tests; in our trial, the tests stopped working altogether once the quota ran out. There's a homegrown solution to self-host the dashboard, but it seems to still be in alpha: https://github.com/agoldis/sorry-cypress

The free OSS plan is promising! Not sure if its limit of 5 users would be enough for the project though. And thanks for bringing up that Cypress supports Firefox and Edge 👍 Puppeteer also has support for Firefox and Edge, given that Edge is now Chromium-based too (https://blogs.windows.com/msedgedev/2019/11/04/edge-chromium-release-candidate-get-ready).

Both Cypress and Puppeteer are easy to set up, so maybe it boils down to which approach is more maintainable in the long run for this project, in terms of contributor interest, experience, free-plan user limits, etc.?
Hey, quickly chiming in to mention that Datadog has an E2E testing product, called browser tests, and an OSS plan as well.
Ah yes! Thanks for mentioning Datadog's browser tests and flagging the OSS plan, @MargotLepizzera 🥇 My project uses Datadog to periodically (every 5 mins) ping our app to make sure it's functional, and the integration with Slack is super neat ✨

For writing integration tests, I'm thinking we might need a version-controlled solution, so the test code can be checked into this repo and run locally and on every PR too. A coworker just shared this article: https://medium.com/prodperfect/how-to-build-e2e-test-cases-b0cb3e3c97bf. The tl;dr is that integration tests should mostly cover critical user paths (a small subset of all the tests), with unit tests and component tests covering the larger area of the code base. This way we don't end up writing everything as integration tests and ending up with a test suite that takes hours to run!
I just created a puppeteer branch with a POC integration test that clicks the Play button on the Zika dataset: https://github.com/nextstrain/auspice/compare/master...tihuan:jest-test-setup?expand=1

Please let me know if you encounter any issues 😄 Thank you!
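(A sketch of what such a jest-puppeteer POC might look like; this is illustrative, not the exact code on the branch, and the dev-server URL and button text are assumptions.)

```js
// __tests__/zika.test.js -- hedged sketch; assumes jest-puppeteer exposes the global `page`
// and that a dev server serves the Zika dataset locally.
describe('Zika dataset', () => {
  beforeAll(async () => {
    await page.goto('http://localhost:4000/zika', { waitUntil: 'networkidle0' });
  });

  it('clicks the Play button without errors', async () => {
    // Find the button by its visible text via XPath (plain Puppeteer has no text selector).
    const [playButton] = await page.$x("//button[contains(., 'Play')]");
    expect(playButton).toBeTruthy();
    await playButton.click();
    await page.waitForSelector('svg'); // the tree SVG should still be present
  });
});
```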
Thanks all for your input. I'll defer to @colinmegill to guide which testing framework we go with, as I think this is a very important decision for the project, but I wanted to give a summary of the kind of tests we desperately need. Currently, our testing consists of me loading a number of datasets and trying to run all the functionality I believe the PR may have affected. Things fall through the cracks this way, especially as we have more developers and more features.

Example of where testing would have been super helpful this week

We had a PR last week which fixed an open issue where the labels in the legend would overflow & look bad. The PR limited the characters in the legend name, so they didn't overflow. Great! I tested it on a few examples, it worked, I merged it. But this broke legend display for numeric legend traits, which I didn't test on 😢 These are the kind of things I'd love testing to automate out of my life! (A sketch of what such a check might look like follows this comment.)

List of test actions it would be wonderful to automate

For (e.g.) 10 different datasets, with the acknowledgement that some params will be specific to each one. This list isn't exhaustive (but is equally too exhaustive to begin with).
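(As a hedged illustration of the legend regression described above, an automated check along these lines could have caught it; the query parameter, selectors, and URL are assumptions, not auspice's actual DOM.)

```js
// Hedged sketch of a jest-puppeteer regression test for the legend bug.
it('renders non-empty legend labels for a numeric colour-by', async () => {
  await page.goto('http://localhost:4000/zika?c=num_date'); // assumed numeric colour-by URL
  await page.waitForSelector('#legend');                    // hypothetical legend container id
  const labels = await page.$$eval('#legend text', (nodes) =>
    nodes.map((node) => node.textContent.trim())
  );
  expect(labels.length).toBeGreaterThan(0);              // legend entries exist
  labels.forEach((label) => expect(label).not.toBe('')); // and none are blank/broken
});
```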
I will +1 puppeteer! Looks great @tihuan :)
Sorry for getting back to this late, got hit with a workload burst this week 🚀 Thanks for checking out the PR, @colinmegill 🤩🙏 Love the list you provided too, @jameshadfield! Sounds like it will involve SVG testing, which I don't have much experience in and which will require some research to do right. @colinmegill, how do we do that in cellxgene? Are other contributors okay with Puppeteer? Does the PR look good to everyone? Thanks all!
@tihuan I see some changes in the formatting of the .eslintrc file. They may conflict with other devs already on the project. Otherwise, the test PR looks good. I'm happy to help out.
Update: Auspice now has jest / puppeteer support thanks to @tihuan's #943 🎉 @tihuan & @colinmegill, do you recommend closing this issue and creating new issues for each of #917 (comment), or is there more general e2e discussion to happen here first?
Yay, thanks everyone for the discussion and reviews! 😍🙏🎉 I think it makes sense to break down #917 (comment) into individual issues, so people can tackle them in parallel.

In general, I would recommend people use https://github.com/smooth-code/jest-puppeteer/blob/master/packages/expect-puppeteer/README.md#api where possible, instead of going straight to Puppeteer's own API when writing tests, unless expect-puppeteer doesn't cover the case you need.

Once we have individual issues for each test case, we could also tag them with priority, so people know which ones to choose first? Thanks all!
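(To make the recommendation concrete, a small hedged example of the expect-puppeteer helpers; selector and text values are illustrative.)

```js
// Inside a jest-puppeteer test, with expect-puppeteer's matchers registered.
it('exercises the UI via expect-puppeteer helpers', async () => {
  await expect(page).toClick('button', { text: 'Play' });    // find by visible text, then click
  await expect(page).toFill('input[name="search"]', 'zika'); // type into a (hypothetical) form field
  await expect(page).toMatch('Phylogeny');                   // assert text appears somewhere on the page
});
```

The matchers wait and retry internally before failing, which tends to make tests less flaky than hand-rolled selector polling.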
Agree with all the above, thanks so much @tihuan! And thanks for referencing those recommendations.
👍 @colinmegill / @tihuan if you have time, could you turn one of those example tests into an issue, as you're better placed to know how much detail is required (and please reference those recommendations in the docs so that the test implementations are consistent).
Thanks both! Yeah, I'll pick a test you described above, turn it into an issue, and see how people like it! ✍️
Hi all! Quick update on writing an issue for a test 😄 I'm working on one of the test cases listed above. Given that it's asserting SVG changes, I wanna try using https://github.com/americanexpress/jest-image-snapshot to diff images. Another option is https://percy.io/, but I'm not sure if they have a free OSS plan, thus testing with jest-image-snapshot first.

The expected outcome of the issue is to provide a list of assertions + image diffs we want the test to do, along with a suggested API. Ideally, I'd like to do a spike on the implementation first.

Extra discussion:
In my work project, our TravisCI setup runs a few different checks in parallel. Love to hear your thoughts on all three questions! Thanks all!
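(For context on the jest-image-snapshot idea, a minimal hedged sketch of how it slots into a Puppeteer test; the URL is an assumption and the thresholds are placeholders to tune, not recommendations.)

```js
const { toMatchImageSnapshot } = require('jest-image-snapshot');

// Register the matcher once, e.g. in a jest setup file.
expect.extend({ toMatchImageSnapshot });

it('tree view matches the stored baseline screenshot', async () => {
  await page.goto('http://localhost:4000/zika'); // assumed dev-server URL
  const image = await page.screenshot();
  expect(image).toMatchImageSnapshot({
    failureThreshold: 0.01,          // tolerate tiny pixel differences...
    failureThresholdType: 'percent', // ...to absorb anti-aliasing noise
  });
});
```

Baseline images are written into the repo on first run and regenerated with jest's usual -u / --updateSnapshot flag.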
Hi @tihuan -- thanks for commenting, it's exciting that this is progressing!
This, to me, is the big question. The datasets which auspice sources (via

What happens if, for instance, we change the font size of the axis text label? Presumably this breaks any test with the tree in view. Is there a way to programmatically regenerate the images needed for all the tests?

It's worth asking at this point whether there are other approaches than comparing image snapshots which we should consider, e.g. ensuring that a certain SVG element is in the DOM & has the correct colour after a colour-by change (I have no idea what's best practice at the moment).
Agreed, with the caveat that we should pick the right approach here, as (hopefully) many tests which use it will be forthcoming. Really like your thoughts on detailing a suggested API for others to follow ⭐️
Seems sensible. Recent work means we're now at 0 errors & warnings (perhaps the first time this has happened!). Enforcing this via CI would be ideal.
Morning @jameshadfield! Oh, thank YOU and your team for running the show and giving us such a great tool to use, especially right now! I'm just pitching in where I can and trying to be helpful :-)
Yes! The image diffing library uses the same flag as updating regular jest snapshots (jest -u). I'm thinking we can create a new command in package.json for it.
Great question! We also have unit testing we can use to snapshot the DOM structure, via jest snapshot testing and Airbnb's enzyme. That will do what you described: asserting that the SVG elements are rendered with the right attributes, inner text, etc., as expected.

Personally I think we should do both: unit testing at the component level, by feeding the component different combinations of props and snapshotting the expected DOM output; and UI testing with screenshots, to make sure the whole user flow works in the app as expected. Additionally, since UI/E2E tests are much slower than unit tests, they should only cover critical user flows / happy paths that block prod deployment when any of them breaks. So E2E testing is meant to spot-check things that should never break, while unit testing can cover way more edge cases to ensure any function/component works as intended given any input.

Quick definition of the unit, integration, and UI tests I'm referring to, since different people define tests differently 😆 E.g., referring to
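(A hedged sketch of the component-level snapshot idea; the component name, import path, and props are hypothetical, and it assumes the enzyme adapter is configured in jest's setup.)

```jsx
import React from 'react';
import { shallow } from 'enzyme';
import toJson from 'enzyme-to-json'; // serializes the wrapper for jest snapshots
import Legend from '../src/components/tree/legend'; // hypothetical import path

it('renders legend labels for a numeric trait', () => {
  // Props are illustrative; the real component's prop shape would need checking.
  const wrapper = shallow(<Legend colorBy="num_date" values={[2015.5, 2016.0]} />);
  expect(toJson(wrapper)).toMatchSnapshot(); // snapshot the rendered DOM structure
});
```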
That sounds great! Yeah, defining patterns that will be used at scale involves risk, but we can always adjust them as we go when team members discover new patterns that work better. I think the discussion should continue, so we get more ideas and can figure out best practices for testing data visualization. Hopefully we can get more people to chime in! In the meantime I'll work on a prototype and get a PR out for screenshot testing, so we have a concrete example we can use to discuss pros and cons.
HURRAY!! That's a super exciting win 🎉🤩👏! Yeah, ESLint comes with rules that can be autofixed, so as long as we run the lint check in CI we should be able to stay at zero. In addition to ESLint, for the code base to have a consistent coding style and format, we can use prettier, which has been wildly popular since its introduction 3 years ago. Definitely highly recommend it!
💯 -- thanks! |
Hi all! Sorry for the lack of updates in the past week, been swamped with work 🌊🏊♂️ Currently I've got screenshot diffing to work, but there are a few things I wanna add before submitting a PR:
It's super easy to set up, I've just been working nights to get a major project on track right now 😆
So I was thinking the helper function could take a screenshot periodically, say every 100ms, and compare the screenshots until they are identical (else retry), so we know the page is steady and ready for the real screenshot-diffing test. I'm hoping to get my work project on track this week, then get back to this testing stuff on Sunday. Hope that's okay! Thank you 😊 🙏
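(That polling idea might look something like this as a Puppeteer helper; a hedged sketch whose name and defaults are made up, with the 100ms interval taken from the description above.)

```js
// Hypothetical helper: poll until two consecutive screenshots are byte-identical,
// so the page has visually settled before the real screenshot-diffing assertion runs.
async function waitForStableScreenshot(page, { interval = 100, maxTries = 50 } = {}) {
  let previous = await page.screenshot();
  for (let i = 0; i < maxTries; i += 1) {
    await new Promise((resolve) => setTimeout(resolve, interval));
    const current = await page.screenshot();
    if (current.equals(previous)) return current; // Buffer.equals: nothing changed between polls
    previous = current;
  }
  throw new Error(`Page did not settle after ${maxTries} screenshots`);
}
```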
Just submitted a PR for adding screenshot testing! PTAL all 😄 🙏 ! Thank you!
CC: @nellyk @anescobar1991 (two of the maintainers of jest-image-snapshot)
First of all, thanks to all of you for working on a project like this one that is helping us in our fight against covid19 (and other malicious pathogens)! What I'd recommend is executing the tests in a Docker container, in order to ensure repeatability across operating systems. Otherwise you will likely have issues with snapshots looking one way on a developer's machine (say, running macOS) but slightly different on CI, causing failures. Many times things like fonts render differently across operating systems, causing these issues. This seems heavy-handed (and is), but over the years we have found that it really is the only way to have minimally brittle integration tests. That, and make sure that all data is mocked, to avoid dependencies on things you don't have control over. Other than that, the things mentioned in #917 (comment) make sense! Let me know if you face any issues!
Hi all, there is another linter (or rather, a static source analyzer) called DeepScan, with a free open source plan. Currently it shows 24 issues in auspice. My PRs have been fixes for DeepScan warnings, and DeepScan makes it easy by recommending a fix. Those 24 issues might be harmless, but there could be one among the 24 which actually breaks something.
Thoughts, implementations and advice welcome!