Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TempProposal objects transmit on every startup when already present on starting node #3629

Closed
julianknutsen opened this issue Nov 19, 2019 · 1 comment · Fixed by #3636
Closed

Comments

@julianknutsen
Copy link
Contributor

julianknutsen commented Nov 19, 2019

As part of investigating #3610, I found another bug in how ProtectedStorageEntry objects are persisted to disk. The result is that a node starting up does not properly format the GetData request to the seed node leading to retransmission of TempProposal payloads that it already knows about.

I am already working on this area as part of bisq-network/proposals#140 and it will likely show up as a failing test. Documenting this here in case people see behavior they don't understand.

Tech Info

ProtectedPayload objects implementing the PersistablePayload interface are saved to disk by the Sha256Ripemd160hash but are loaded from disk expecting the Sha256Hash. This results in the keys in the hashmap being 20-bytes for the items loaded from disk and 32-bytes for the items sent from the seed node.

So, the starting node asks the seednode to exclude the 20-byte key and it sends back the 32-byte key which both point to the exact same payload.

This is relevant for bisq-network/proposals#140 because we need to understand the behavior of non-upgraded nodes and their ability to propagate old TempProposals to the network.

More to come with how to address it.

julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Write a test that shows the incorrect behavior for bisq-network#3629, the hashmap
is rebuilt from disk using the 20-byte key instead of the 32-byte key.
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Addresses the first half of bisq-network#3629 by ensuring that the reconstructed
HashMap always has the 32-byte key for each payload.

I feel this fix is less risky than writing ondisk upgrade code throughout
the TempProposalStore and other disk-writing classes that are currently
untested and hard to test.

The end result is that the persistence code in P2PDataStorage still use
the 20-byte key, but the HashMap is reconstructed from disk using
the 32-byte key of the payload which minimizes the affected classes and
gives the same outward behavior.

Important to note that until all seednodes receive this update, nodes
will continue to have both the 20-byte and 32-byte keys in their HashMap.
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Addresses the second half of bisq-network#3629 by using the HashMap, not the
protectedDataStore to generate the known keys in the requestData path.

This won't have any bandwidth reduction until all seednodes have the
update and only have the 32-byte key in their HashMap.

fixes bisq-network#3629
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Write a test that shows the incorrect behavior for bisq-network#3629, the hashmap
is rebuilt from disk using the 20-byte key instead of the 32-byte key.
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Addresses the first half of bisq-network#3629 by ensuring that the reconstructed
HashMap always has the 32-byte key for each payload.

I feel this fix is less risky than writing ondisk upgrade code throughout
the TempProposalStore and other disk-writing classes that are currently
untested and hard to test.

The end result is that the persistence code in P2PDataStorage still use
the 20-byte key, but the HashMap is reconstructed from disk using
the 32-byte key of the payload which minimizes the affected classes and
gives the same outward behavior.

Important to note that until all seednodes receive this update, nodes
will continue to have both the 20-byte and 32-byte keys in their HashMap.
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Addresses the second half of bisq-network#3629 by using the HashMap, not the
protectedDataStore to generate the known keys in the requestData path.

This won't have any bandwidth reduction until all seednodes have the
update and only have the 32-byte key in their HashMap.

fixes bisq-network#3629
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Addresses the first half of bisq-network#3629 by ensuring that the reconstructed
HashMap always has the 32-byte key for each payload.

I feel this fix is less risky than writing ondisk upgrade code throughout
the TempProposalStore and other disk-writing classes that are currently
untested and hard to test.

The end result is that the persistence code in P2PDataStorage still use
the 20-byte key, but the HashMap is reconstructed from disk using
the 32-byte key of the payload which minimizes the affected classes and
gives the same outward behavior.

Important to note that until all seednodes receive this update, nodes
will continue to have both the 20-byte and 32-byte keys in their HashMap.
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 19, 2019
Addresses the second half of bisq-network#3629 by using the HashMap, not the
protectedDataStore to generate the known keys in the requestData path.

This won't have any bandwidth reduction until all seednodes have the
update and only have the 32-byte key in their HashMap.

fixes bisq-network#3629
julianknutsen added a commit to julianknutsen/bisq that referenced this issue Nov 21, 2019
Addresses the first half of bisq-network#3629 by ensuring that the reconstructed
HashMap always has the 32-byte key for each payload.

It turns out, the TempProposalStore persists the ProtectedStorageEntrys
on-disk as a List and doesn't persist the key at all. Then, on
reconstruction, it creates the 20-byte key for its internal map.

The fix is to update the TempProposalStore to use the 32-byte key instead.
This means that all writes, reads, and reconstrution of the TempProposalStore
uses the 32-byte key which matches perfectly with the in-memory map
of the P2PDataStorage that expects 32-byte keys.

Important to note that until all seednodes receive this update, nodes
will continue to have both the 20-byte and 32-byte keys in their HashMap.
@julianknutsen
Copy link
Contributor Author

After more investigation, the same bugfix commit fixes an additional issue where we were retransmitting every non-persistable ProtectedStorageEntry in the GetUpdatedDataRequest path. By using the in-memory map instead of the protectedDataStore, we now stop transmitting duplicates. Here was the breakdown of the duplicate transmissions before that patch:

#################################################################
Received 724 instances
RefundAgent: 1 (915 B)
Filter: 2 (5.7 KB)
TempProposalPayload: 56 (33.9 KB)
MailboxStoragePayload: 457 (2.0 MB)
Alert: 1 (826 B)
Mediator: 2 (1.8 KB)
OfferPayload: 205 (249.6 KB)
################################################################# 

ripcurlx added a commit that referenced this issue Nov 26, 2019
* [PR COMMENTS] Make maxSequenceNumberBeforePurge final

Instead of using a subclass that overwrites a value, utilize Guice
to inject the real value of 10000 in the app and let the tests overwrite
it with their own.

* [TESTS] Clean up 'Analyze Code' warnings

Remove unused imports and clean up some access modifiers now that
the final test structure is complete

* [REFACTOR] HashMapListener::onAdded/onRemoved

Previously, this interface was called each time an item was changed. This
required listeners to understand performance implications of multiple
adds or removes in a short time span.

Instead, give each listener the ability to process a list of added or
removed entrys which can help them avoid performance issues.

This patch is just a refactor. Each listener is called once for each
ProtectedStorageEntry. Future patches will change this.

* [REFACTOR] removeFromMapAndDataStore can operate on Collections

Minor performance overhead for constructing MapEntry and Collections
of one element, but keeps the code cleaner and all removes can still
use the same logic to remove from map, delete from data store, signal
listeners, etc.

The MapEntry type is used instead of Pair since it will require less
operations when this is eventually used in the removeExpiredEntries path.

* Change removeFromMapAndDataStore to signal listeners at the end in a batch

All current users still call this one-at-a-time. But, it gives the ability
for the expire code path to remove in a batch.

* Update removeExpiredEntries to remove all items in a batch

This will cause HashMapChangedListeners to receive just one onRemoved()
call for the expire work instead of multiple onRemoved() calls for each
item.

This required a bit of updating for the remove validation in tests so
that it correctly compares onRemoved with multiple items.

* ProposalService::onProtectedDataRemoved signals listeners once on batch removes

#3143 identified an issue that tempProposals listeners were being
signaled once for each item that was removed during the P2PDataStore
operation that expired old TempProposal objects. Some of the listeners
are very expensive (ProposalListPresentation::updateLists()) which results
in large UI performance issues.

Now that the infrastructure is in place to receive updates from the
P2PDataStore in a batch, the ProposalService can apply all of the removes
received from the P2PDataStore at once. This results in only 1 onChanged()
callback for each listener.

The end result is that updateLists() is only called once and the performance
problems are reduced.

This removes the need for #3148 and those interfaces will be removed in
the next patch.

* Remove HashmapChangedListener::onBatch operations

Now that the only user of this interface has been removed, go ahead
and delete it. This is a partial revert of
f5d75c4 that includes the code that was
added into ProposalService that subscribed to the P2PDataStore.

* [TESTS] Regression test for #3629

Write a test that shows the incorrect behavior for #3629, the hashmap
is rebuilt from disk using the 20-byte key instead of the 32-byte key.

* [BUGFIX] Reconstruct HashMap using 32-byte key

Addresses the first half of #3629 by ensuring that the reconstructed
HashMap always has the 32-byte key for each payload.

It turns out, the TempProposalStore persists the ProtectedStorageEntrys
on-disk as a List and doesn't persist the key at all. Then, on
reconstruction, it creates the 20-byte key for its internal map.

The fix is to update the TempProposalStore to use the 32-byte key instead.
This means that all writes, reads, and reconstrution of the TempProposalStore
uses the 32-byte key which matches perfectly with the in-memory map
of the P2PDataStorage that expects 32-byte keys.

Important to note that until all seednodes receive this update, nodes
will continue to have both the 20-byte and 32-byte keys in their HashMap.

* [BUGFIX] Use 32-byte key in requestData path

Addresses the second half of #3629 by using the HashMap, not the
protectedDataStore to generate the known keys in the requestData path.

This won't have any bandwidth reduction until all seednodes have the
update and only have the 32-byte key in their HashMap.

fixes #3629

* [DEAD CODE] Remove getProtectedDataStoreMap

The only user has been migrated to getMap(). Delete it so future
development doesn't have the same 20-byte vs 32-byte key issue.

* [TESTS] Allow tests to validate SequenceNumberMap write separately

In order to implement remove-before-add behavior, we need a way to
verify that the SequenceNumberMap was the only item updated.

* Implement remove-before-add message sequence behavior

It is possible to receive a RemoveData or RemoveMailboxData message
before the relevant AddData, but the current code does not handle
it.

This results in internal state updates and signal handler's being called
when an Add is received with a lower sequence number than a previously
seen Remove.

Minor test validation changes to allow tests to specify that only the
SequenceNumberMap should be written during an operation.

* [TESTS] Allow remove() verification to be more flexible

Now that we have introduced remove-before-add, we need a way
to validate that the SequenceNumberMap was written, but nothing
else. Add this feature to the validation path.

* Broadcast remove-before-add messages to P2P network

In order to aid in propagation of remove() messages, broadcast them
in the event the remove is seen before the add.

* [TESTS] Clean up remove verification helpers

Now that there are cases where the SequenceNumberMap and Broadcast
are called, but no other internal state is updated, the existing helper
functions conflate too many decisions. Remove them in favor of explicitly
defining each state change expected.

* [BUGFIX] Fix duplicate sequence number use case (startup)

Fix a bug introduced in d484617 that
did not properly handle a valid use case for duplicate sequence numbers.

For in-memory-only ProtectedStoragePayloads, the client nodes need a way
to reconstruct the Payloads after startup from peer and seed nodes. This
involves sending a ProtectedStorageEntry with a sequence number that
is equal to the last one the client had already seen.

This patch adds tests to confirm the bug and fix as well as the changes
necessary to allow adding of Payloads that were previously seen, but
removed during a restart.

* Clean up AtomicBoolean usage in FileManager

Although the code was correct, it was hard to understand the relationship
between the to-be-written object and the savePending flag.

Trade two dependent atomics for one and comment the code to make it more
clear for the next reader.

* [DEADCODE] Clean up FileManager.java

* [BUGFIX] Shorter delay values not taking precedence

Fix a bug in the FileManager where a saveLater called with a low delay
won't execute until the delay specified by a previous saveLater call.

The trade off here is the execution of a task that returns early vs.
losing the requested delay.

* [REFACTOR] Inline saveNowInternal

Only one caller after deadcode removal.
ripcurlx added a commit that referenced this issue Nov 26, 2019
… PersistablePayload (#3640)

* [PR COMMENTS] Make maxSequenceNumberBeforePurge final

Instead of using a subclass that overwrites a value, utilize Guice
to inject the real value of 10000 in the app and let the tests overwrite
it with their own.

* [TESTS] Clean up 'Analyze Code' warnings

Remove unused imports and clean up some access modifiers now that
the final test structure is complete

* [REFACTOR] HashMapListener::onAdded/onRemoved

Previously, this interface was called each time an item was changed. This
required listeners to understand performance implications of multiple
adds or removes in a short time span.

Instead, give each listener the ability to process a list of added or
removed entrys which can help them avoid performance issues.

This patch is just a refactor. Each listener is called once for each
ProtectedStorageEntry. Future patches will change this.

* [REFACTOR] removeFromMapAndDataStore can operate on Collections

Minor performance overhead for constructing MapEntry and Collections
of one element, but keeps the code cleaner and all removes can still
use the same logic to remove from map, delete from data store, signal
listeners, etc.

The MapEntry type is used instead of Pair since it will require less
operations when this is eventually used in the removeExpiredEntries path.

* Change removeFromMapAndDataStore to signal listeners at the end in a batch

All current users still call this one-at-a-time. But, it gives the ability
for the expire code path to remove in a batch.

* Update removeExpiredEntries to remove all items in a batch

This will cause HashMapChangedListeners to receive just one onRemoved()
call for the expire work instead of multiple onRemoved() calls for each
item.

This required a bit of updating for the remove validation in tests so
that it correctly compares onRemoved with multiple items.

* ProposalService::onProtectedDataRemoved signals listeners once on batch removes

#3143 identified an issue that tempProposals listeners were being
signaled once for each item that was removed during the P2PDataStore
operation that expired old TempProposal objects. Some of the listeners
are very expensive (ProposalListPresentation::updateLists()) which results
in large UI performance issues.

Now that the infrastructure is in place to receive updates from the
P2PDataStore in a batch, the ProposalService can apply all of the removes
received from the P2PDataStore at once. This results in only 1 onChanged()
callback for each listener.

The end result is that updateLists() is only called once and the performance
problems are reduced.

This removes the need for #3148 and those interfaces will be removed in
the next patch.

* Remove HashmapChangedListener::onBatch operations

Now that the only user of this interface has been removed, go ahead
and delete it. This is a partial revert of
f5d75c4 that includes the code that was
added into ProposalService that subscribed to the P2PDataStore.

* [TESTS] Regression test for #3629

Write a test that shows the incorrect behavior for #3629, the hashmap
is rebuilt from disk using the 20-byte key instead of the 32-byte key.

* [BUGFIX] Reconstruct HashMap using 32-byte key

Addresses the first half of #3629 by ensuring that the reconstructed
HashMap always has the 32-byte key for each payload.

It turns out, the TempProposalStore persists the ProtectedStorageEntrys
on-disk as a List and doesn't persist the key at all. Then, on
reconstruction, it creates the 20-byte key for its internal map.

The fix is to update the TempProposalStore to use the 32-byte key instead.
This means that all writes, reads, and reconstrution of the TempProposalStore
uses the 32-byte key which matches perfectly with the in-memory map
of the P2PDataStorage that expects 32-byte keys.

Important to note that until all seednodes receive this update, nodes
will continue to have both the 20-byte and 32-byte keys in their HashMap.

* [BUGFIX] Use 32-byte key in requestData path

Addresses the second half of #3629 by using the HashMap, not the
protectedDataStore to generate the known keys in the requestData path.

This won't have any bandwidth reduction until all seednodes have the
update and only have the 32-byte key in their HashMap.

fixes #3629

* [DEAD CODE] Remove getProtectedDataStoreMap

The only user has been migrated to getMap(). Delete it so future
development doesn't have the same 20-byte vs 32-byte key issue.

* [TESTS] Allow tests to validate SequenceNumberMap write separately

In order to implement remove-before-add behavior, we need a way to
verify that the SequenceNumberMap was the only item updated.

* Implement remove-before-add message sequence behavior

It is possible to receive a RemoveData or RemoveMailboxData message
before the relevant AddData, but the current code does not handle
it.

This results in internal state updates and signal handler's being called
when an Add is received with a lower sequence number than a previously
seen Remove.

Minor test validation changes to allow tests to specify that only the
SequenceNumberMap should be written during an operation.

* [TESTS] Allow remove() verification to be more flexible

Now that we have introduced remove-before-add, we need a way
to validate that the SequenceNumberMap was written, but nothing
else. Add this feature to the validation path.

* Broadcast remove-before-add messages to P2P network

In order to aid in propagation of remove() messages, broadcast them
in the event the remove is seen before the add.

* [TESTS] Clean up remove verification helpers

Now that there are cases where the SequenceNumberMap and Broadcast
are called, but no other internal state is updated, the existing helper
functions conflate too many decisions. Remove them in favor of explicitly
defining each state change expected.

* [BUGFIX] Fix duplicate sequence number use case (startup)

Fix a bug introduced in d484617 that
did not properly handle a valid use case for duplicate sequence numbers.

For in-memory-only ProtectedStoragePayloads, the client nodes need a way
to reconstruct the Payloads after startup from peer and seed nodes. This
involves sending a ProtectedStorageEntry with a sequence number that
is equal to the last one the client had already seen.

This patch adds tests to confirm the bug and fix as well as the changes
necessary to allow adding of Payloads that were previously seen, but
removed during a restart.

* Clean up AtomicBoolean usage in FileManager

Although the code was correct, it was hard to understand the relationship
between the to-be-written object and the savePending flag.

Trade two dependent atomics for one and comment the code to make it more
clear for the next reader.

* [DEADCODE] Clean up FileManager.java

* [BUGFIX] Shorter delay values not taking precedence

Fix a bug in the FileManager where a saveLater called with a low delay
won't execute until the delay specified by a previous saveLater call.

The trade off here is the execution of a task that returns early vs.
losing the requested delay.

* [REFACTOR] Inline saveNowInternal

Only one caller after deadcode removal.

* [TESTS] Introduce MapStoreServiceFake

Now that we want to make changes to the MapStoreService,
it isn't sufficient to have a Fake of the ProtectedDataStoreService.

Tests now use a REAL ProtectedDataStoreService and a FAKE MapStoreService
to exercise more of the production code and allow future testing of
changes to MapStoreService.

* Persist changes to ProtectedStorageEntrys

With the addition of ProtectedStorageEntrys, there are now persistable
maps that have different payloads and the same keys. In the
ProtectedDataStoreService case, the value is the ProtectedStorageEntry
which has a createdTimeStamp, sequenceNumber, and signature that can
all change, but still contain an identical payload.

Previously, the service was only updating the on-disk representation on
the first object and never again. So, when it was recreated from disk it
would not have any of the updated metadata. This was just copied from the
append-only implementation where the value was the Payload
which was immutable.

This hasn't caused any issues to this point, but it causes strange behavior
such as always receiving seqNr==1 items from seednodes on startup. It
is good practice to keep the in-memory objects and on-disk objects in
sync and removes an unexpected failure in future dev work that expects
the same behavior as the append-only on-disk objects.

* [DEADCODE] Remove protectedDataStoreListener

There were no users.

* [DEADCODE] Remove unused methods in ProtectedDataStoreService
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant