
Refactor integration tests to improve legibility #11

Closed
dlqqq wants to merge 3 commits from the improve-integ-tests branch

Conversation

@dlqqq (Collaborator) commented Feb 2, 2024

Summary

This PR refactors the test_pycrdt_yjs.py test file, along with the JS client test files, to be more legible.

Detailed description

  • removes redundant and unnecessary logic that was obscuring the intent of each test
  • renames the test helper class YTest => YClient for clarity (a rough sketch of the helper's shape follows this list)
    • renames clock_run() => inc_clock()
    • renames async run_clock() => async wait_for_reply()
  • prefers testing against a single root map type when possible, rather than spreading statements and assertions across multiple differently named root map types in the same test
  • applies prettier formatting to the JS test files
  • simplifies the implementation of the JS test clients (no local clock variable is needed to determine whether ymap.clock was incremented by the JS client itself or by the Python client)
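
A minimal sketch of the shape the renamed helper takes, for reviewers skimming the diff: the names YClient, inc_clock, and wait_for_reply come from this PR, but everything else below (the asyncio.Event wiring, the observe callback) is illustrative rather than a copy of the actual implementation.

import asyncio

from pycrdt import Doc, Map


class YClient:
    """Illustrative sketch of the renamed test helper, not the actual implementation."""

    def __init__(self, doc: Doc, name: str = "ytest"):
        self.ymap = Map()
        doc[name] = self.ymap  # shared root map used only for synchronization
        self.ymap["clock"] = 0
        self._sent_clock = 0
        self._replied = asyncio.Event()
        self.ymap.observe(self._on_change)

    def _on_change(self, event) -> None:
        # The other client replies by bumping the clock past the value we sent.
        if self.ymap["clock"] > self._sent_clock:
            self._replied.set()

    def inc_clock(self) -> None:
        """Ping the other client by incrementing the shared clock."""
        self._sent_clock = self.ymap["clock"] + 1
        self._replied.clear()
        self.ymap["clock"] = self._sent_clock

    async def wait_for_reply(self, timeout: float = 5.0) -> None:
        """Block until the other client has bumped the clock in response."""
        await asyncio.wait_for(self._replied.wait(), timeout)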


welcome bot commented Feb 2, 2024

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

dlqqq force-pushed the improve-integ-tests branch from e9f061b to f75c614 on February 2, 2024 at 19:59
@dlqqq (Collaborator, Author) commented Feb 6, 2024

@davidbrochart @Zsailer Quick ping for visibility on this PR and #12.

I'm working on writing more integration tests for this package, so I would like to see whether my "test style" is acceptable to the maintainers before proceeding further.

@davidbrochart (Collaborator) commented Feb 8, 2024

Sorry for the late reply @dlqqq.
I'm not sure whether we should improve the current tests or rewrite them altogether. At the time it seemed to me that using CRDTs (ytest, clock) for testing was the easiest, but it ended up being quite complicated anyway, and I think it led to flaky tests. Also, it feels weird to use CRDTs to test CRDTs :)
I can't think of a better solution right now, but what would you do if you were starting from a blank page?

@dlqqq (Collaborator, Author) commented Feb 8, 2024

@davidbrochart I don't think these tests will exhibit any flakiness. The strategy of pinging the other client and waiting for its reply via clock should guarantee that the Python and Node.js test clients are synced afterwards. I think the current integration tests closely mimic what is being done in RTC (pycrdt-websocket <=> y-websocket). These tests offer high confidence in the code and should be kept in my opinion.
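
As a concrete illustration of that strategy, a test syncs with the Node.js client roughly as follows; this is a hypothetical test body, not code from this branch, and the yclient fixture, the pytest marker, and the map keys are all made up for the example.

import pytest


@pytest.mark.asyncio
async def test_js_client_sees_python_edit(yclient):
    # Write something the Node.js test client is expected to react to.
    yclient.ymap["from-python"] = "hello"

    # Ping the Node.js client and wait for its reply via the clock,
    # so both documents are known to be synced before asserting.
    yclient.inc_clock()
    await yclient.wait_for_reply()

    # Hypothetical key written back by the Node.js test client.
    assert yclient.ymap["from-js"] == "hello back"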

I think the remaining work is to add some more complex tests that may catch the data loss issues users are experiencing, which I'm working on.

At the time it seemed to me that using CRDTs (ytest, clock) for testing was the easiest, but it ended up being quite complicated anyway, and I think it led to flaky tests. Also, it feels weird to use CRDTs to test CRDTs :)

I think with more development on the test fixtures (like the work done here in this PR), we can mitigate the complexity of using CRDTs in our tests. For example, the tests in this branch are quite readable now IMO. I'm not sure how one would test pycrdt-websocket without CRDTs. I think we should stick as close as possible to real-life use cases to get the greatest confidence from our tests.

@davidbrochart (Collaborator) commented:

What I mean is that we could use a different strategy for client-server synchronization than what I did in ytest. Basically, ytest exists to avoid waiting unnecessarily; it could be replaced with a sleep. But it is a bit overkill to use a CRDT just to send an event, and it bothers me that the tests rely on the very feature we want to test. And the tests are indeed flaky; I rerun them when they fail.

@dlqqq (Collaborator, Author) commented Feb 8, 2024

@davidbrochart

it could be replaced with a sleep

I believe this would make tests more flaky, no? A computer cannot guarantee that an interprocess call will take less than XXX milliseconds. It seems better to explicitly wait for the other client to reply.
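
To make the contrast concrete (a simplified sketch, not code from the test suite; the map keys are placeholders):

import asyncio


async def sleep_based_check(ymap) -> None:
    # Fragile: hope the other client has replied within a fixed delay.
    # Any slow interprocess round trip makes this flaky.
    await asyncio.sleep(0.5)
    assert ymap["reply"] == "pong"


async def reply_based_check(yclient) -> None:
    # Robust: explicitly wait for the other client's reply,
    # however long the round trip takes.
    yclient.inc_clock()
    await yclient.wait_for_reply()
    assert yclient.ymap["reply"] == "pong"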

And tests are indeed flaky, I rerun them when they fail.

I haven't been able to reproduce that, but I will do more testing.

What are the remaining steps needed to merge this PR? Is it just that these tests are flaky on your machine?

@davidbrochart (Collaborator) commented:

I believe this would make tests more flaky, no? A computer cannot guarantee that an interprocess call will take less than XXX milliseconds. It seems better to explicitly wait for the other client to reply.

Sure, it was just to show you what ytest is here for. My point is that I'm not sure we should use a CRDT to implement a synchronization mechanism, given that this CRDT is what the test is testing.

What are the remaining steps needed to merge this PR? Is it just that these tests are flaky on your machine?

I need to review it :) Tests are flaky in CI too.

@dlqqq (Collaborator, Author) commented Feb 8, 2024

@davidbrochart Thanks for clarifying! I feel like CRDT tests belong in the pycrdt repo, and that any tests here in pycrdt-websocket should focus on features specific to this package. Testing dependencies has not proven worthwhile in my experience.

I just pushed another commit that fixes a small issue I noticed in the test teardown:

=========================== short test summary info ===========================
ERROR tests/test_pycrdt_yjs.py::test_pycrdt_yjs_1[1] - websockets.exceptions.ConnectionClosedError: sent 1001 (going away); no close frame received
========================= 2 passed, 1 error in 1.11s ==========================

The server was not being closed in the test fixture during teardown (after the yield statement), which caused the test suite to report an error even though all tests passed. Commit 2bdd9cc fixes that with a one-liner, and the tests now pass cleanly locally.
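
For context, the shape of the fixture and of the fix is roughly the following; this is an illustrative sketch, and the import path, server API, and fixture details are assumptions rather than the exact code in commit 2bdd9cc.

import asyncio

import pytest_asyncio
from pycrdt_websocket import WebsocketServer  # import path assumed for this sketch


@pytest_asyncio.fixture
async def yws_server():
    server = WebsocketServer()
    task = asyncio.create_task(server.start())
    yield server
    # The gist of the one-line fix: stop the server during teardown so that
    # connected clients close cleanly instead of erroring with close code 1001.
    await server.stop()
    task.cancel()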

Now, running the tests 100 times does not produce a single failure:

% pip install pytest-repeat
% pytest tests/test_pycrdt_yjs.py --count 100
============================ test session starts ==============================
platform darwin -- Python 3.11.7, pytest-7.4.4, pluggy-1.4.0
rootdir: /Volumes/workplace/pycrdt-websocket
plugins: repeat-0.9.3, asyncio-0.23.4, anyio-4.2.0
asyncio: mode=Mode.STRICT
collected 200 items

tests/test_pycrdt_yjs.py ........................................................................................................................................................................................................                                        [100%]

====================== 200 passed in 106.43s (0:01:46) ========================

@davidbrochart (Collaborator) commented:

@davidbrochart Thanks for clarifying! I feel like CRDT tests belong in the pycrdt repo, and that any tests here in pycrdt-websocket should focus on features specific to this package. Testing dependencies has not proven worthwhile in my experience.

I don't think you get my point; I don't want to test dependencies.

@dlqqq (Collaborator, Author) commented Feb 8, 2024

@davidbrochart

I don't think you get my point; I don't want to test dependencies.

Yes, the tests require pycrdt to work as expected in order to pass, so in that sense each test also exercises every one of its dependencies. However, when I write tests for a package, I write them assuming that all of the dependencies behave as expected. Yes, that means a breaking change in a dependency can sometimes cause tests to fail, but I believe that is both expected and desirable. Some redundancy in integration tests across packages is OK; if further isolation is desired, unit tests can be added instead of integration tests.

davidbrochart mentioned this pull request on Feb 12, 2024
@davidbrochart (Collaborator) commented:

@dlqqq After simplifying the tests in #14, I don't think this PR makes sense anymore, as it was building on top of YTest, which I got rid of. One thing that came out of it is that the flakiness seems to have disappeared, so that's good. Feel free to add more tests; the suite is currently very basic.
