Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MST Test Suite integration #42

Merged
merged 4 commits into from
Dec 12, 2024
Merged

Conversation

DavidBuchanan314
Copy link
Contributor

@DavidBuchanan314 DavidBuchanan314 commented Dec 11, 2024

The tests take ~30 seconds to run on my M1 pro macbook (with failing asserts commented out) - not too bad, but I anticipate adding a lot more tests to the suite.

Due to the slowness, I deliberately named the test file such that python -m unittest discover won't find it. It can be manually invoked via python -m unittest arroba.tests.mst_test_suite.

The test suite itself is still in a state of flux, hence this is a draft PR for now.

Currently, it only tests basic MST diffing (detection of created, updated, deleted records, and new and deleted MST blocks)

Unfortunately the tests currently don't pass - Diff.new_cids is flakey (which you note in one of your existing diff tests). Everything else seems good though! (due to the symmetry of diffing, you could in theory compute the reverse-diff and take Diff.removed_cids to get a correct result - but presumably you want to fix it properly heh)

@DavidBuchanan314
Copy link
Contributor Author

Note to self, I should probably add something to the README about how to init the submodule

return car_header["roots"][0]

def test_diffs(self):
for testname, testcase in tqdm(self.diff_testcases.items()):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, and I used tqdm for a progress bar - not super necessary. I should either take it out or add it as an optional dependency

reference_cid_set = set(x[0] for x in reference_blocks) # just look at the cids from the car

self.assertEqual(root_b, reference_root, f"{testname} inverse: new root") # fails occasionally
self.assertEqual(diff.new_cids, reference_cid_set, f"{testname} inverse: new cid set") # basically always fails, I think I'm doing something wrong
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the gist of this test case is: take mst_a, apply the list of ops to it, and compare the result to mst_b. The list of CIDs I'm getting from null_diff(mst).new_cids looks completely different to what I expect, so I suspect I'm doing something wrong here (or maybe null_diff() is just very broken heh)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah. Both are possible I guess! arroba's diff code is used in production, but definitely not as broadly as it could be, and not all parts, so big bugs are very possible.

Copy link
Owner

@snarfed snarfed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! Really psyched for this test suite. Thank you so much for putting it together, and for this PR! Especially appreciate you seeing and using arroba-idiomatic parts like testutil.TestCase.

Minor nit, you'll probably want to use with self.subTest(...) for the individual test cases to get unittest to see and run and count them individually. Alternatively you can generate them programmatically, eg https://github.com/snarfed/granary/blob/c29dd723f8b82ca5874f2122de64b2e9cd2c1fd7/granary/tests/test_testdata.py#L144-L156 , but that's more complicated and less idiomatic.

I'll also mention making the test suite pip-installable just for the record here, even just from GitHub if not pypi. Submodule is ok, pip package is definitely nicer.

Regardless, this is exciting!

@snarfed
Copy link
Owner

snarfed commented Dec 11, 2024

Oh and I like separating it out from unittest discover, but I'd happily run it in CI, 30s is totally fine. Example past runs: https://app.circleci.com/pipelines/github/snarfed/arroba

@DavidBuchanan314
Copy link
Contributor Author

ah yes, subTest is what I was looking for. Thanks for the feedback!

Current stats:

Ran 2 tests in 79.933s

FAILED (failures=30596)

(some of these are plausibly bugs in the test cases themselves, since I generated them with my own code which has low test coverage itself!)

@snarfed
Copy link
Owner

snarfed commented Dec 12, 2024

Thanks! I'm happy to merge this whenever. Lots more to do, sure, but it's contained and harmless and a great start as is. Up to you though!

@snarfed
Copy link
Owner

snarfed commented Dec 12, 2024

@DavidBuchanan314
Copy link
Contributor Author

Cool, feel free to merge it now. When I add new tests it should just be a case of updating the submodule, and if I add new test types I can PR those in too.

I'll think about making it a pip-installable package in the future, but submodule definitely works for now.

I just bumped the submodule version, no changes to the tests themselves but it includes my visualisation script.

@DavidBuchanan314 DavidBuchanan314 marked this pull request as ready for review December 12, 2024 18:13
@snarfed snarfed merged commit 507956c into snarfed:main Dec 12, 2024
2 checks passed
@snarfed
Copy link
Owner

snarfed commented Dec 12, 2024

Thanks again, this is so great!

@snarfed
Copy link
Owner

snarfed commented Dec 12, 2024

image

[cracks knuckles] time to get to work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants