Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduction to E2E Playwright testing #6641

Open
1 of 12 tasks
BOHEUS opened this issue Aug 16, 2024 · 30 comments
Open
1 of 12 tasks

Introduction to E2E Playwright testing #6641

BOHEUS opened this issue Aug 16, 2024 · 30 comments
Assignees

Comments

@BOHEUS
Copy link
Contributor

BOHEUS commented Aug 16, 2024

This issue is a parent for all future PR and issues

Description

Since more and more issues/requests appear for E2E tests, here are some things which need to be addressed before progressing:

  • tests must consist of modules (since most of them will have overlaying path, e.g. each test will log in before accessing any data), this way it'll be easier to fix one thing in one file instead of multiple files
  • tests must write logs in such form which is easy to process, understand and clear to find where the possible bug is, on top of that, making screenshots so finding process will be much easier
  • tests must replicate as close as possible human behaviour, e.g. clicking on input before typing a word (so input will be focused)
  • any variables such as URLs, fixed login data (like when hosting locally or in demo) must be imported from separate file like .env
  • tests must be able to run in CI in order to ensure the quality of new features and the product (in CI tests must have the URL of existing instance like demo, not the localhost)
  • tests must be versatile so anyone can basically run them after providing proper variables, be used in different environments (locally and in production/demo) and both in normal as well as regression testing
  • tests must clear after finishing the test (basic assumption is every single test start from clean environment i.e. right after creating a new workspace) as some leftovers from previous tests may result in failed tests later

While this shouldn't prevent from creating issues/requests for tests, this must be addressed ASAP so technical debt will be minimised

Based on all of the things described above (some of them are taken from here where I described them earlier), here's checklist

TO DO

This checklist ^ may be updated in the future

Stages

  1. type: chore

Materials:

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 16, 2024

Please assign this task to me as I don't have enough permissions to do it by myself

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 17, 2024

Cleaner can be done in 2 ways:

  • by using commands responsible for restarting the database and seeding it with demo data
  • by deleting objects created during the tests

First option is time-consuming due to fact resetting database can take about 1-2 min, when run after each test, the amount of time spend on waiting until commands are finished might be higher than on the tests itself but the amount of code is insignificant. (OS shell driver required)

Second option is better as it allows to test additional cases like user deleting some objects within one test and the time spent on deleting objects created during the test is lower but the amount of code required to secure all cases is significantly higher compared to first option.

Both approaches should be used, first one in case there are scenarios requiring to delete objects created by default (e.g. after first login to workspace) or in case something bad happens, second by default.

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 19, 2024

In the future when integration tests with Google will be implemented, instead of using the UI for accessing the mails and calendar, tests should use the API Google provides, this way, it'll be easier to implement (in case the account will be secured using 2FA, it'll be easier to work with) and faster (compared to normal access with UI) with the same result (question is how email can be checked whether its' content [formatted HTML] is the same as in the mockups, using built-in assert in TS?)

Of course, if the app will support at some point in the future other mail and calendar providers such as Microsoft or Apple, the general rule of thumb is to use API for reasons stated above if possible, if not (like for Proton Mail, there is Python module maintained by the company but not JS/TS library https://github.com/ProtonMail/proton-python-client), then module for specific provider should be created tests should be mocked as creating a module for provider's calendar will take too much time.

API documentation reference:

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 19, 2024

Playwright offers a way to test REST API using built-in request class so API testing can be done within one framework. While the documentation doesn't explicitly say that it's possible to test GraphQL API, one search and blog post later it turns out to be possible without bigger problems (except for the query mashed into 1 line).

However, many people are using Postman for (automated) API testing which supports REST and GraphQL API without bigger problems (additionally, it can get GraphQL schema from the link easily) as well as additional features such as workflows, tests within API requests, exporting environments and more. While this option would be great, it's only nice to have as supporting both solutions will take too much time and effort unless better solution will be found in the future. For now, it's better to use only Playwright.

Documentation reference:

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 20, 2024

When using app, user has to log in to its account in order to perform any action. In tests, the same logic must be saved (logging in as the first step) but that brings up a problem which is logging before each test.
There are 2 solutions for this: either leave this as if and log in before every test and log out after every test (after cleaning all changes and reverting the workspace to state before the test) or use global setup (before all tests) and storage state (before every test).

The first solution mimics the user's behaviour but adds additional ~20 sec to time execution of each test, while the second solution saves that time.

The question is what about different users when role based access will be added. Storage states (second solution) seem like a good option.

Documentation:

@lucasbordeau
Copy link
Contributor

@BOHEUS E2E would be really useful for example when we refactor scrolling behavior like this issue : #6645

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 22, 2024

@lucasbordeau added item to Tasklist, by default Playwright scrolls by itself but if that's not enough, it can be done by simulating mouse wheel behaviour or scrolling enough to see specific element on screen. Either way, to check and simulate this, I need to add several objects as I'm running tests from fresh setup each time.

Documentation:

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 22, 2024

When interacting with page, users are interacting with it by clicking on elements, e.g. clicking on button to complete the transaction or on link to redirect to seemingly normal page only to be rickrolled. In tests, those elements can be located using several options such as .locator(), .getByPlaceholder(), .getByTitle() and more but it can be also done in another way. Since HTML is basically a XML-based node tree, XPath can be used to traverse through this tree and find specific element.

But XPath should be used if built-in locators into Playwright fail as last resort.

Documentation:

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 22, 2024

Since Twenty added webhooks (essentially allowing for integration with different, not officially supported apps with Zapier), these should be tested if are working. Playwright does not offer any way to test them (nor does any other framework), thus the third-party provider should be used for this. There are plenty of sites offering intercepting/receiving webhooks, notably one of them being https://webhook.site

TBD if verifying webhooks should be done within the E2E tests.

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 23, 2024

Found issue #5583 with interesting solution for Postman testing, if it works like described there, then should it be used together with Playwright? Writing Playwright tests for API is a good option considering fact there are no tests for API and when done correctly, they can be used inside tests, e.g. test checking if object created by API is visible in UI, has correct data, etc.

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Aug 23, 2024

When somewhere in the future, the need for load performance tests will appear, we should test the app under extreme conditions (one example is given in #4817). The problem appears with:

  1. generating such massive amount of data (as importing data is now very limited, one user reported problems when importing data with 4k rows to workspace Onboarding Difficulties #6647)
  2. testing on such huge workspace as by this point the basic assumption is more than 1 user has an account and is using app concurrently

Problem 1. can be solved by:

  • generating entries in API (but it may throttle and reach bottleneck if requests are sent too fast)
  • adding entries directly in database (which means that a special driver would be needed in order to access the database and add those entries, which means more problems as there would be at least 2 drivers, for database in Linux and database in Docker)
  • creating a special command which will seed workspace with lots of data (which sounds plausible but that again means lots of code as they're basically objects in key-value fashion, maybe for loop with fakerjs will solve this? https://github.com/faker-js/faker)
    UI is off-limits as it'll take way too much time (probably in hours depending on amount of data)

Problem 2. can be solved by creating realistic guidelines for such cases and modifying test to use x independent contexts (or pages if that's possible) simulating users (but here's catch, how to write test in such way that each "user" will perform on different objects so it won't lead to overwriting the same data over and over?)

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Sep 9, 2024

Something worth noting, normally when writing tests, it's normal to use the CSS selector to locate specific element IF the app doesn't change classes, in Twenty most components have 1 class which change depending on theme whether it's dark or light, essentially making them impossible to use in tests as locators. We should never resort to using CSS for this reason unless there's very specific reason (even then, we could opt out for XPath).

Also, while writing all necessary components for tests, it's possible to find a component by text inside component, but this will become a problem when localization tests will be introduced. All components in tests must be language- and theme-agnostic so there won't be a need to refactor a huge chunk of code just to allow writing localization tests and/or testing in different mode than light/dark mode.

@lucasbordeau
Copy link
Contributor

What do you think about :

  • Using data-test-id attributes (low maintenance but bloats the codebase)
  • Using XPath (low maintenance but can break if we reorder the component hierarchy)

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Sep 16, 2024

Using data-testid attributes isn't a bad idea, I'm using it only when I know I can't locate an element easily (like it was with workspace dropdown because it couldn't be hardcoded [who knows what workspace will be default after changing some things and forgetting to revert them]), XPath on one hand is known to be durable but on the other hand writing a XPath made out of 20 divs just to reach one element is something I wouldn't do unless I have to (maybe I'm overexaggerating but I hope you get the idea), also, it's not that hard to fix a XPath. I'll use them if anything else fails or there'll be need for automated localisation tests (locating by text/label won't work there so other methods must be used to locate specific elements).

About automated localisation tests, are you planning to use them in the future?

@lucasbordeau
Copy link
Contributor

Ok, I don't think localization will be implemented soon, so maybe a good starting compromise would be fetch by text + data-testid when more precision and stability is required ?

@lucasbordeau
Copy link
Contributor

That's quite what we already have for storybook integration tests.

@lucasbordeau
Copy link
Contributor

@BOHEUS Another important thing to keep in mind, let's also not test only happy paths but also check that we shouldn't be able to do what we can't do and that we shouldn't see what shouldn't be displayed (to be defined).

@charlesBochet
Copy link
Member

@BOHEUS one of our priority next week will be to integrate your end to end tests to our release flow. We will likely rework seed and have a clean environment to run them against

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Oct 13, 2024

Thanks for the info @charlesBochet, if you'll need my help, ping me

@charlesBochet
Copy link
Member

Will do!

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Oct 13, 2024

Based on all notes above as well as talk with @prastoin earlier this week, other issues I were mentioned in, changes made in Twenty repo adding new functionality, some research in the meantime and more, here's rough "plan" (order of elements may not apply to order in which specific steps will be done):

  1. add data-testid attribute to all elements which are not stable by the nature (like the dropdowns which show last selected value thus can't be searched using text) or can't be easily searched (like buttons which can't be searched by text obviously nor class [explained earlier with CSS locators problems]), rest of them will be located using text as localization isn't a high priority right now and changing that in the future won't be a big cost (although it's better to skip refactoring if it's known ahead in my opinion)
  2. prepare POMs (Page Object Models) which will reflect actual state of the app, while preparing it's important to include everything which may be used later in tests; while writing locators in POMs be aware that:
    2.1. some locators won't change during the tests (like buttons) so they need to be initialized in constructor and set to read-only as we won't change them during the test (additional pro: variable is initialized at the beginning = less computing cost)
    2.2. some locators may change during the tests (like rows) so they need to be in functions which will take some argument depending on the element and create new locator based on provided value (which creates variable every time function is called but there's no other way so be careful while doing this, calling variable won't be a big cost compared to initializing over and over)
    2.3. XPath is your friend, not your enemy, while Playwright offers lots of different and nice ways to locate elements, XPath must be used in places where normal Playwright locators fail like all data in record's table in e.g. Companies (creating new objects or modifying data of objects in table, not in the object's details are perfect examples of failing Playwright locators where they can't locate elements by placeholder because they're probably too deeply nested?)
  3. create a driver which will change the .env file in packages/twenty-server depending on scenario like testing if user can create new workspace or check if user is logged out after x minutes of inactivity (to be analyzed how to do it properly as .env contains some variables which shouldn't be deleted like PG_DATABASE_URL [appending to the file isn't a good idea so maybe prepare some payload with constant variables, append whatever is needed according to scenario and overwrite the file's content?])
    PROBLEM: server doesn't read .env file every x minutes (or I'm not aware how much time it is) so after changing .env file, server must be restarted in order to apply changes which forces to group scenarios based on similar factors
  4. analyze the areas which need E2E testing (relatively easy task)
  5. prepare scenarios after step 4 (not so easy task as we are in Wild West of E2E testing and shouldn't cover only happy paths)
  6. prepare utils which will help in testing like keyboard shortcuts, drag and drop actions and such (probably after 5 but not sure)
  7. prepare objects (either own or with zod, to be decided which one) for tests with API and methods to verify if expected data is returned by API (JSON parser is enough)
  8. prepare base scenario for API testing which will log in, generate new API key and get value of key (then how should this value be passed to tests? via .env?)
  9. add containers for mailing (smtp4dev) and webhook testing (webhook.site, if anything goes wrong, user won't be IP-banned)

More info just to be aware:

  • all PRs must refer to this "master" issue
  • right now CI/CD is out of the bounds as cost of running tests is unknown but it may change in the future (which means sharding tests will be necessary to speed up testing process)
  • it's possible to pararellize tests and we should go for it when combining scenarios, but once Twenty has implemented a way to fetch changes made by other members of workspace live and not after page refreshing, we have to be aware of possible unknown state of the app at runtime when tests are running in parallel
  • DOM isn't very friendly as now it has lots of div and it's slowly changing to be more friendly and accessible so don't be afraid to change the codebase to make life slightly easier and small refactors will happen. for sure
  • scenarios should be written in separate issues and tagged [E2E scenario] in title and with label to make it easier to search (to be decided) EDIT 12.11.24: scenarios must created in under 1 group issue based on testing area, each scenario is written in separate comment to make it easier to edit/fix scenario as well as linking said scenario in PR, 1 PR per scenario, each test must have linked scenario as a comment

@lucasbordeau
Copy link
Contributor

Excellent ! The POM is a good pattern to implement from the outset. Be careful not to fall into premature optimization. We'll see how we'll need utils, XPath, etc.

We should have a way to launch commands, to reset the database, create demo data, this way if we need to add a command for E2E tests we can create it in the server and the E2E test will just call it, what do you think ?

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Oct 17, 2024

@lucasbordeau I've already implemented it in #6717, it's shell_driver.ts in packages/twenty-e2e-testing/drivers, if you think something needs to be changed there, let me know

@lucasbordeau
Copy link
Contributor

lucasbordeau commented Oct 18, 2024

Ok, maybe we'll need some class / module that wraps this shell driver to execute specific functions : resetDatabaseAndSeed, seedForLoadTesting, seedForComplexRelations, migrateMetadata, rollbackMetadata, activateFeatureFlag, etc.

But let's see when we'll have the concrete need in front of us.

@lucasbordeau
Copy link
Contributor

We shouldn't optimize for the subscription pattern right now, it's too premature and we don't know what will be implemented and how.

Our knowledge of E2E testing and our tooling around it should be sufficiently evolved, should subscription be implemented, that refactoring for subscription will be doable in a reasonable amount of time. Plus we have already done major refactors and continue to do so, it's in our culture, so nothing to be afraid of here.

@lucasbordeau
Copy link
Contributor

Parallelization shouldn't be our main focus since E2E tests won't be run as frequently as other unit tests. Plus it's easy with Playwright to just re-run the failed ones by hand while we're in the process of manually preparing and testing everything before a release, which should be our main focus.

Our persona for this E2E testing is the person who will spend multiple days to manually release and review everything, test, migrate, etc. And E2E test should be a manual step for expanding the coverage of testing before a release.

@lucasbordeau
Copy link
Contributor

lucasbordeau commented Oct 18, 2024

Also I think we shouldn't prioritize testing complex E2E features right now like emails, remote table, stripe, etc. It would be better to just seed the database with a snapshot of real world emails data from a Twenty instance.

Because the main goal is to help the review process right now.

But we should certainly implement those E2E tests in a second phase, because they are very hard to test thoroughly at each release.

I think that the first of those cases of E2E test we want is around login, subscription, cancellation, etc. With a staging stripe account and fake credit cards.

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Oct 18, 2024

Parallelization will be main focus once there are tests covering different areas of the product and CI/CD pipelines for tests are available, it's more of a note for future myself or someone else rather than a goal for now.

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Oct 27, 2024

For testing things which require additional sites like emails, here are some possibilities:

  • outgoing emails like password reminder: smtp4dev, Mailosaur (paid solution) (of course, for integration with Google/O365, Google/O365 account should be used)
  • webhooks: webhook.site
  • SAML: mock-saml

@BOHEUS
Copy link
Contributor Author

BOHEUS commented Oct 27, 2024

If there's need to find in page's source code disappearing element which appears only on hover, here is StackOverflow thread with solutions (trick with JS debugger works fine)

lucasbordeau pushed a commit that referenced this issue Nov 7, 2024
lucasbordeau pushed a commit that referenced this issue Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🆕 New
Development

No branches or pull requests

3 participants