
(9/9) [PR COMMENTS] Split up P2PDataStoreTest #3587

Merged 40 commits on Nov 18, 2019

Conversation

@julianknutsen (Contributor) commented Nov 9, 2019

New commits start at eb2f8f3

Review of #3554 asked for some iteration on the number of different tests that all existed in one file. To that end, this PR splits up the tests into multiple files, cleans up the common helper functions, and adds Javadoc comments on all of the test fixtures to help future reviewers understand how the tests work.

I would prefer to just put this PR on the end of the rest since it is strictly limited to test code. That will save a lot of the hassle of having to merge the various iterations of tests throughout the refactor effort across the new files.

I think the end result is in a good place and there is a clean separation between the common test helpers and the various entry points.

Fix a bug where remove() was called in the addMailboxData()
failure path.

1. Senders can't remove mailbox entries. Only
   the receiver can remove them, so even if the previous add() failed and
   left partial state, the remove() can never succeed.

2. Even if the sender could remove, this path used remove() instead
   of removeMailboxData(), so it wouldn't have succeeded anyway.

This patch cleans up the failure path and adds a precondition
to the remove() function to ensure future callers don't use it for
ProtectedMailboxStorageEntries.
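The precondition described above might look roughly like this. This is a hypothetical sketch, not the real P2PDataStorage code: the class bodies are stubbed out and only the guard itself is shown.

```java
// Hypothetical sketch of the precondition described above: the generic
// remove() path must never be handed a mailbox entry, because mailbox
// entries have their own removal rules (only the receiver may remove them)
// and must go through removeMailboxData() instead.
class ProtectedStorageEntry { }

class ProtectedMailboxStorageEntry extends ProtectedStorageEntry { }

class P2PDataStorage {
    boolean remove(ProtectedStorageEntry entry) {
        if (entry instanceof ProtectedMailboxStorageEntry)
            throw new IllegalArgumentException(
                "remove() must not be called with a ProtectedMailboxStorageEntry; "
                + "use removeMailboxData() instead");
        // ... signature and sequence-number checks would follow here ...
        return true;
    }
}
```

A precondition like this turns a silent no-op failure into an immediate, debuggable error for future callers.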
Add comments and refactor the body in order to make it easier to
understand.
Refactor addProtectedStorageEntry for more readability and add comments
to help future readers.
Refactor for readability and add comments for future readers.
Refactor for readability and add comments to help future readers.
Refactor for readability and add comments for future readers.
Removed duplicate log messages that are already handled inside the various
helper methods, and print more verbose state useful for debugging.

Updated potentially misleading comments around hashing collisions
…icates

Now returns false on duplicate sequence numbers. This better matches
the expected behavior of an add() function when the element already exists.

The only callers are either P2PService users that always increment the
sequence number or the onMessage() handler which doesn't verify the return
so there will be no visible change other than the increased readability
of the code and deduplication of the code paths.
Now returns false if the sequence number of the refresh matches
the last operation seen for the specified hash. This is a more expected
return value when no state change occurs.

The only callers are either P2PService users that always increment the
sequence number or the onMessage() handler which doesn't verify the return
so there will be no visible change other than the increased readability
of the code and deduplication of the code paths.
…duplicate sequence #s

Remove operations are now only processed if the sequence number
is greater than the last operation seen for a specific payload.

The only creator of new remove entries is the P2PService layer, which always
increments the sequence number. So this is either left over from a time when
removes needed to work with non-incrementing sequence numbers, or just
a longstanding bug.

With the completion of this patch, all operations now require increasing
sequence numbers so it should be easier to reason about the behavior in
future debugging.
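The rule the commits above converge on can be sketched as a small, self-contained example. This is a hypothetical illustration (the class name and method are invented for clarity), not the actual P2PDataStorage sequence-number map:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the sequence-number rule described above: every
// operation (add, refresh, remove) is only applied if its sequence number is
// strictly greater than the last one seen for that payload's hash. Duplicate
// or stale sequence numbers cause no state change and return false.
class SequenceNumberGate {
    private final Map<String, Integer> lastSeen = new HashMap<>();

    boolean tryApply(String payloadHash, int sequenceNumber) {
        Integer prev = lastSeen.get(payloadHash);
        if (prev != null && sequenceNumber <= prev)
            return false;                    // duplicate or stale: reject
        lastSeen.put(payloadHash, sequenceNumber);
        return true;                         // strictly increasing: accept
    }
}
```

With one uniform "strictly greater" rule for all operations, there is a single invariant to reason about during future debugging.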
@chimp1984 (Contributor) previously approved these changes Nov 12, 2019

utACK
I did some smoke tests, but as this is a high-risk area we should wait for an ACK from @freimair as well.

Use the DI Clock object already available in P2PDataStore, instead
of calling System.currentTimeMillis() directly. These two functions
have the same behavior and switching over allows finer control
of time in the tests.
Deduplicate some code in the ProtectedStorageEntry constructors in
preparation for passing in a Clock parameter.
Switch from System.currentTimeMillis() to
Clock.millis() so dependency injection can
be used for tests that need finer control of time.

This involves attaching a Clock to the resolver
so all fromProto methods have one available when they
reconstruct a message. This uses the Injector for the APP
and a default Clock.systemDefaultZone is used in the manual
instantiations.

Work was already done in bisq-network#3037 to make this possible.

All tests still use the default system clock for now.
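The switch described above boils down to taking a java.time.Clock as a dependency instead of calling the static System.currentTimeMillis(). A minimal sketch (the ExpirableEntry class here is hypothetical, simplified from the real entry classes):

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneId;

// Hypothetical sketch: an entry that timestamps itself via an injected
// java.time.Clock. Clock.systemDefaultZone().millis() behaves like
// System.currentTimeMillis(), but a test can substitute a fixed clock.
class ExpirableEntry {
    private final long creationTimeStamp;

    ExpirableEntry(Clock clock) {
        this.creationTimeStamp = clock.millis();
    }

    boolean isExpired(Clock clock, long ttlMillis) {
        return clock.millis() - creationTimeStamp > ttlMillis;
    }
}
```

In production the injector provides Clock.systemDefaultZone(); a test can pass Clock.fixed(...) and control exactly how much time "elapses" between calls.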
Reduces non-deterministic failures of the refreshTTL tests that resulted
from the uncontrollable System.currentTimeMillis().

Now, all tests have extremely fine control over the elapsed time between
calls which makes the current and future tests much better.
Add tests for removing expired entries and optionally purging
the sequence number map. Now possible since these tests have
control over time with the ClockFake.
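A ClockFake along these lines makes the TTL tests deterministic: time only moves when the test says so. The implementation below is a hypothetical sketch of such a fake (the real one in the Bisq test code may differ in detail):

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical sketch of a ClockFake: a java.time.Clock whose current time
// advances only when the test explicitly increments it, eliminating the
// non-deterministic failures that System.currentTimeMillis() caused.
class ClockFake extends Clock {
    private long currentTimeMillis = 0;

    void increment(long millis) {
        currentTimeMillis += millis;
    }

    @Override public long millis() { return currentTimeMillis; }
    @Override public Instant instant() { return Instant.ofEpochMilli(currentTimeMillis); }
    @Override public ZoneId getZone() { return ZoneOffset.UTC; }
    @Override public Clock withZone(ZoneId zone) { return this; }
}
```

A refreshTTL test can then call increment() to simulate exactly the elapsed time it needs, instead of sleeping and hoping the wall clock cooperates.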

The remove validation needed to be improved since deletes through
the expire path don't signal HashMap listeners or write sequence numbers.
The original test would take over 5 seconds. Allow tests to set the number
of required entries before purge to a lower value so the tests
can run faster with the same confidence.
The custom code to verify the refreshTTLMessage's signature and update
an entry isn't necessary. Just have the code construct an updated
ProtectedStorageEntry from the existing and new data, verify it,
and add it to the map.

This also allows the removal of the ProtectedStorageEntry APIs
that modify internal state.
The code around validating MailboxStoragePayloads is subtle when
a MailboxStoragePayload is wrapped in a ProtectedStorageEntry. Add tests
to document the current behavior.
Method bodies are copied from P2PDataStore to separate refactoring
efforts and behavior changes.

Identified a bug where a ProtectedMailboxStorageEntry mailbox entry
could be added, but never removed.
Extract code from P2PDataStore, test it, and use new methods.
Now that the objects can answer questions about valid conditions
for add/remove, ask them directly.

This also pushes the logging down into the ProtectedStorageEntry and
ProtectedMailboxStorageEntry and cleans up the message.
Add toString() for ProtectedStorageEntry so log messages have useful
information and clean up the formatting.
Move the signature checks into the objects to clean up the calling code
and make it more testable.

The testing now has to take real hashes so some work was done in the fixtures
to create valid hashable objects.
Move the signature checks into the objects to clean up the calling code
and make it more testable.
This mailbox-only check can now exist inside the object for which it
belongs. This makes it easier to test and moves closer to allowing
the deduplication of the remove() methods.
The current check verifies that the stored Payload.ownerPubKey == stored Entry.ownerPubKey.

This is the same check that was done when the item was originally added
and there is no reason to do it again.
Let the objects compare their metadata instead of doing it for them. This
allows for actual unit testing and paves the way for deduplicating the
remove code paths.

This patch also removes an unnecessary check around comparing the hash
of the stored data to the new data's hash. That check can't fail since
the hash was a requirement for the map lookup in the first place.
Make the remove validation more robust by asserting that the
correct remove message is broadcast. This will provide a better
safety net when combining the remove functions.
Now that all the code is abstracted and tested, the remove()
and removeMailboxData() functions are identical. Combine them and update
callers appropriately.

Now, callers don't need to know the difference, and it removes the
sharp edge originally found in bisq-network#3556
Use a more compact version of string formatting
in log messages

Rename isMetadataEquals to matchesRelevantPubKey
which is more descriptive of the actual check
Now that the unit tests cover all of the per-Entry validation,
the tests that create specific configurations of ProtectedStorageEntry
and ProtectedMailboxStorageEntry objects can be removed in favor
of mockable Entry objects.

Using mocks prior to this patch was impossible due to the relationship
between the Entry objects and the P2PDataStorage helper functions. Now
that the objects are properly abstracted and tested, real unit tests
can be written for the P2PDataStore module.

This patch leaves the tests in place and adds an @Ignore so the reviewer can
see which unit test now supersedes each integration test.
Purge all the superseded tests and helper functions now that everything
can be unit tested.
Add JavaDocs for the various Stub and Fake objects that are used
in the P2PDataStore test so future developers can understand why they
exist.
One monolithic test was useful when it was under development to reduce
code churn, but now that the tests are complete it is easier to find
and run a specific test when separated into separate test files.

This also fixes a downside of Enclosed.class that didn't allow individual
tests to be run in IntelliJ.
All users pass in an instance of TestState, just make it an instance
method instead of static.
All test callers now just ask the TestState for a SavedTestState instead
of calling the SavedTestState constructor directly. This makes more sense
given the object relationship, since SavedTestState is only used internally
by TestState.
File name now reflects the actual tests run after everything
was split up.
@freimair (Contributor) left a comment

Ack
@ripcurlx (Contributor) left a comment

utACK

@ripcurlx (Contributor) commented

I won't merge this PR, as its commits will be part of #3609 anyway; I'll close it as soon as #3609 gets merged. In the future I might handle this differently than I do right now. If we have bigger changes that rely on each other and are pushed as separate PRs, it might be better to only merge the final PR (squash and merge) and use the other PRs for initial feedback, so we don't end up with in-progress commits in our git history. The title of the final PR would still need some change, as it should sum up the whole change, and in its description we should reference all the other closed PRs for documentation purposes.

@freimair @julianknutsen @sqrrm What do you think?

@julianknutsen (Contributor, Author) commented Nov 18, 2019

From the author's side of things, each PR could have been independently merged without affecting product stability. In cases like that, AND when PRs are handled within a day or so, this allows iterative development: smaller improvements land on master where other devs will pull them. You get some additional free "testing", although that should never be the sole reason to do it. The major benefit is that development on the same files or modules can be merged more easily, because you don't have 50 commits that may have merge conflicts.

If we expect longer cycles, I think just merging the last PR may make sense. This is similar to a feature branch merge, but you still have the benefit of going back to the second parent to see the individual commits in the event of bugs. With squash-merging, it all shows up as one huge changeset and it is hard to dig through.

After reading more about the different merge options available, I think I prefer merge-commits over squash-and-merge whenever possible. The ability to track a change back to the individual commits is useful for bisecting and understanding the original intention of a specific fix. If it is hard for devs to read git log I think git log --first-parent will give you what you want. It just removes the second parent from the merge commits so you just see the individual PRs.
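The --first-parent view mentioned above can be demonstrated with a tiny throwaway repository. This is a hypothetical sketch (it creates a temp repo and assumes git >= 2.28 for `git init -b`):

```shell
# Build a disposable repo: a feature branch with two commits, merged with
# --no-ff, then compare the full log against the --first-parent view.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q -b main
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "initial"
git checkout -q -b feature
git commit -q --allow-empty -m "feature work 1"
git commit -q --allow-empty -m "feature work 2"
git checkout -q main
git merge -q --no-ff feature -m "Merge branch 'feature'"

git log --oneline | wc -l                 # 4: every commit in history
git log --oneline --first-parent | wc -l  # 2: just "initial" and the merge
```

The full log shows every commit including the PR-internal ones, while --first-parent collapses each merge to a single entry, which is the "one line per merged PR" view described above.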

I'm interested to hear everyone else's thoughts on the benefits of squash-and-merge so I can see if that would make my workflow any better. There may be something I am missing since I don't have a ton of experience with Git.

@julianknutsen julianknutsen deleted the split-test-files branch November 18, 2019 18:11
4 participants