Add transient tx map to DaoState to speed up getTx queries #3773
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -35,18 +35,21 @@ | |
import bisq.core.dao.state.model.governance.Issuance; | ||
import bisq.core.dao.state.model.governance.IssuanceType; | ||
import bisq.core.dao.state.model.governance.ParamChange; | ||
import bisq.core.util.coin.BsqFormatter; | ||
import bisq.core.util.ParsingUtils; | ||
import bisq.core.util.coin.BsqFormatter; | ||
|
||
import org.bitcoinj.core.Coin; | ||
|
||
import javax.inject.Inject; | ||
|
||
import com.google.common.base.Preconditions; | ||
|
||
import java.util.ArrayList; | ||
import java.util.Comparator; | ||
import java.util.HashSet; | ||
import java.util.LinkedList; | ||
import java.util.List; | ||
import java.util.Map; | ||
import java.util.Optional; | ||
import java.util.Set; | ||
import java.util.TreeMap; | ||
|
@@ -115,6 +118,9 @@ public void applySnapshot(DaoState snapshot) { | |
|
||
daoState.setChainHeight(snapshot.getChainHeight()); | ||
|
||
daoState.getTxMap().clear(); | ||
daoState.getTxMap().putAll(snapshot.getTxMap()); | ||
The snapshot is created from the persisted state, but the transient map isn't saved to disk. Is the tx map always empty after applying a snapshot?
No, it shouldn't be, because the snapshot is just another DaoState instance constructed via DaoState.fromProto(), and that method recalculates and populates the transient field.
Indeed. The |
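To make the answer above concrete, here is a minimal sketch of a state object whose transient index is rebuilt from the persisted block list whenever an instance is constructed, so a snapshot built that way does not carry an empty map. `SnapshotSketch`, `SimpleTx`, `SimpleBlock` and `SimpleState` are hypothetical stand-ins, not the actual Bisq classes, and the constructor only approximates what `DaoState.fromProto()` is described as doing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins for Tx, Block and DaoState.
class SnapshotSketch {
    static final class SimpleTx {
        final String id;
        SimpleTx(String id) { this.id = id; }
    }

    static final class SimpleBlock {
        final List<SimpleTx> txs = new ArrayList<>();
    }

    static final class SimpleState {
        final List<SimpleBlock> blocks; // persisted data
        // Index only, never persisted; must mirror the block list.
        final transient Map<String, SimpleTx> txMap = new HashMap<>();

        // Assumption: every construction path (including deserialization)
        // ends up here, so the index is always rebuilt from the blocks.
        SimpleState(List<SimpleBlock> blocks) {
            this.blocks = new ArrayList<>(blocks);
            for (SimpleBlock block : this.blocks)
                for (SimpleTx tx : block.txs)
                    txMap.put(tx.id, tx);
        }

        // Same shape as applySnapshot in the diff: copy index and block list.
        void applySnapshot(SimpleState snapshot) {
            txMap.clear();
            txMap.putAll(snapshot.txMap);
            blocks.clear();
            blocks.addAll(snapshot.blocks);
        }
    }

    // Returns the index size of a live state after applying a two-tx snapshot.
    static int txCountAfterSnapshot() {
        SimpleBlock block = new SimpleBlock();
        block.txs.add(new SimpleTx("a"));
        block.txs.add(new SimpleTx("b"));
        SimpleState snapshot = new SimpleState(List.of(block)); // "deserialized" snapshot
        SimpleState live = new SimpleState(new ArrayList<>());
        live.applySnapshot(snapshot);
        return live.txMap.size();
    }

    public static void main(String[] args) {
        System.out.println(txCountAfterSnapshot());
    }
}
```

The guarantee only holds if no construction path bypasses the rebuild, which is why the sketch does it in the single constructor.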
||
|
||
daoState.getBlocks().clear(); | ||
daoState.getBlocks().addAll(snapshot.getBlocks()); | ||
|
||
|
@@ -226,7 +232,16 @@ public void onNewBlockWithEmptyTxs(Block block) { | |
} | ||
} | ||
|
||
// Third we get the onParseBlockComplete called after all rawTxs of blocks have been parsed | ||
// Third we add each successfully parsed BSQ tx to the last block | ||
public void onNewTxForLastBlock(Block block, Tx tx) { | ||
// At least one block must be present else no rawTx would have been recognised as a BSQ tx. | ||
We should add here the |
||
Preconditions.checkArgument(block == getLastBlock().orElseThrow()); | ||
This would change the original behaviour. We used the onNewBlockWithEmptyTxs to add a block, but in case we got into the EDIT:
I am not sure if that case is valid and can happen, but as the log suggests, there might be tricky edge cases in re-org scenarios where this was possible. Testing those edge cases is pretty tricky, and it may be that this was an issue during development which disappeared later and is no longer present. But I would prefer to stay very conservative/restrictive in the DAO domain, as a consensus bug can have severe consequences and the DAO has a very deep level of complexity. If we are not 100% sure that existing code is wrong, I prefer to stick with it, as this code base has been tested extensively and has been in production since April.
NACK for that Preconditions check, otherwise it looks good. Thanks for working on that. Could you provide a comparison of the performance gains from that PR? |
||
|
||
block.getTxs().add(tx); | ||
daoState.getTxMap().put(tx.getId(), tx); | ||
} | ||
|
||
// Fourth we get the onParseBlockComplete called after all rawTxs of blocks have been parsed | ||
public void onParseBlockComplete(Block block) { | ||
if (parseBlockChainComplete) | ||
log.info("Parse block completed: Block height {}, {} BSQ transactions.", block.getHeight(), block.getTxs().size()); | ||
|
@@ -348,16 +363,16 @@ public Stream<Tx> getTxStream() { | |
.flatMap(block -> block.getTxs().stream()); | ||
} | ||
|
||
public TreeMap<String, Tx> getTxMap() { | ||
return new TreeMap<>(getTxStream().collect(Collectors.toMap(Tx::getId, tx -> tx))); | ||
public Map<String, Tx> getTxMap() { | ||
It doesn't look like this function is used outside of this file. Probably worth inlining it appropriately.
Yes, I noticed it was unused originally. Perhaps it should just be inlined.
I'm always in favor of removing 1-line functions as well as public functions that are unused. Just one less way for people to do the wrong thing in the future. |
||
return daoState.getTxMap(); | ||
} | ||
|
||
public Set<Tx> getTxs() { | ||
return getTxStream().collect(Collectors.toSet()); | ||
return new HashSet<>(getTxMap().values()); | ||
What is the contract difference between getTxMap().values() and getTxStream()? Why have both? If they have the same data with different performance characteristics, I would be in favor of using composition here and having all users go through the cache. It is much less error-prone and easier to reason about when all users just need to deal with one object for txns.
I switched it from getTxStream() to getTxMap().values() for slightly better performance, as more indirection is required to iterate through all the block tx lists (and many of the blocks are empty). They're not interchangeable elsewhere, though, since getTxMap().values() is unordered. It doesn't matter in this case since (before and after) it's just collecting into an unordered set. Actually, the only place getTxs() is used is in one of the views to get the total number of transactions (using size()), so it doesn't need to collect into a new HashSet and the return type could just be a Collection.
These are the callers I found of
My point at a high level was that if we are going to build a cache for something, we should just use it for everything. Keeping an interface and users for the old slow way just creates additional performance work in the future that could be avoided by doing something better now. Adding javadocs as you change methods is also a great way to reduce the technical debt and help explain to future users the guarantees or any gotchas that you learned the hard way.
I looked at all usages as well:
Order is irrelevant for sum().
I guess the explorer will love to get some performance improvement from that change as well ;-). Writing to disk at each new block will probably be a major performance issue there (another "low hanging fruit" ;-) ). Also looked into all usages of All the other data objects are immutable, only with |
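The ordering point in this thread can be sketched as follows: walking the blocks yields txs in chain order, while `HashMap.values()` makes no ordering guarantee, yet order-insensitive operations such as `size()` or a sum give the same result on both. `OrderSketch`, `Item` and the amounts are hypothetical and only stand in for the real tx model.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderSketch {
    // Hypothetical tx stand-in with an id and an amount.
    static final class Item {
        final String id;
        final long amount;
        Item(String id, long amount) { this.id = id; this.amount = amount; }
    }

    // Txs grouped per block in chain order; some blocks are empty.
    static List<List<Item>> sampleBlocks() {
        return List.of(
                List.of(new Item("tx-c", 3)),
                List.<Item>of(),
                List.of(new Item("tx-a", 1), new Item("tx-b", 2)));
    }

    // Builds the id -> tx index from the per-block lists.
    static Map<String, Item> buildIndex(List<List<Item>> blocks) {
        Map<String, Item> index = new HashMap<>();
        blocks.forEach(block -> block.forEach(item -> index.put(item.id, item)));
        return index;
    }

    // Ordered traversal: block by block, tx by tx.
    static long sumViaBlocks(List<List<Item>> blocks) {
        return blocks.stream().flatMap(List::stream).mapToLong(item -> item.amount).sum();
    }

    // Unordered traversal: iteration order of HashMap.values() is unspecified.
    static long sumViaMap(Map<String, Item> index) {
        return index.values().stream().mapToLong(item -> item.amount).sum();
    }

    public static void main(String[] args) {
        List<List<Item>> blocks = sampleBlocks();
        Map<String, Item> index = buildIndex(blocks);
        System.out.println(sumViaBlocks(blocks) == sumViaMap(index));
        System.out.println(index.size());
    }
}
```

Any caller that relies on chain order (for example, taking the first or last tx) would still need the block traversal or an explicit sort.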
||
} | ||
|
||
public Optional<Tx> getTx(String txId) { | ||
return getTxStream().filter(tx -> tx.getId().equals(txId)).findAny(); | ||
return Optional.ofNullable(getTxMap().get(txId)); | ||
} | ||
|
||
public List<Tx> getInvalidTxs() { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,6 +19,7 @@ | |
|
||
import bisq.core.dao.state.model.blockchain.Block; | ||
import bisq.core.dao.state.model.blockchain.SpentInfo; | ||
import bisq.core.dao.state.model.blockchain.Tx; | ||
import bisq.core.dao.state.model.blockchain.TxOutput; | ||
import bisq.core.dao.state.model.blockchain.TxOutputKey; | ||
import bisq.core.dao.state.model.governance.Cycle; | ||
|
@@ -28,16 +29,19 @@ | |
import bisq.core.dao.state.model.governance.ParamChange; | ||
|
||
import bisq.common.proto.persistable.PersistablePayload; | ||
import bisq.common.util.JsonExclude; | ||
|
||
import com.google.protobuf.Message; | ||
|
||
import javax.inject.Inject; | ||
|
||
import java.util.ArrayList; | ||
import java.util.HashMap; | ||
import java.util.LinkedList; | ||
import java.util.List; | ||
import java.util.Map; | ||
import java.util.TreeMap; | ||
import java.util.function.Function; | ||
import java.util.stream.Collectors; | ||
|
||
import lombok.Getter; | ||
|
@@ -98,6 +102,11 @@ public static DaoState getClone(DaoState daoState) { | |
@Getter | ||
private final List<DecryptedBallotsWithMerits> decryptedBallotsWithMeritsList; | ||
|
||
// Transient data used only as an index - must be kept in sync with the block list | ||
@Getter | ||
@JsonExclude | ||
private transient final Map<String, Tx> txMap; // key is txId | ||
How does the memory footprint change with this cache and how is it expected to scale over time?
It's using a hash table rather than a tree set, so I don't think the additional memory will be a problem. There are only about 10,000 txs or so right now, so I don't think it will take up more than 100KB or so. (The fact that DaoState.blocks is a linked list instead of an array list is probably a more significant memory issue that could be easily fixed.) |
||
|
||
|
||
/////////////////////////////////////////////////////////////////////////////////////////// | ||
// Constructor | ||
|
@@ -145,6 +154,10 @@ private DaoState(int chainHeight, | |
this.paramChangeList = paramChangeList; | ||
this.evaluatedProposalList = evaluatedProposalList; | ||
this.decryptedBallotsWithMeritsList = decryptedBallotsWithMeritsList; | ||
|
||
txMap = blocks.stream() | ||
.flatMap(block -> block.getTxs().stream()) | ||
.collect(Collectors.toMap(Tx::getId, Function.identity(), (x, y) -> y, HashMap::new)); | ||
I assume the mergeFunction is only passed because you want to have the mapFactory. I'm not sure what the mergeFunction really should do, as conflicts are not expected and it's not clear how to handle them. Maybe throwing an exception would be more appropriate here? Or maybe just add a comment.
I wanted to guarantee that the output is a HashMap and it looks like the only
Yes, I expected that intention... I was just wondering how we can be sure not to change behaviour. The previous code used the flatMap.
Do you know how potential key conflicts would have been handled there? I assume your mergeFunction, which overwrites with the new value if it happens, is likely the standard behaviour if not otherwise defined. So your mergeFunction is likely better than throwing an exception if flatMap behaves the same. Anyway, a bit "esoteric", but the DAO might deserve a bit of extra paranoia ;-)
With the Also, for consistent merge behaviour,
in |
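For reference on the duplicate-key question: the two-argument `Collectors.toMap` throws `IllegalStateException` when two elements map to the same key, whereas the four-argument overload delegates the decision to the merge function and lets the caller pick the map type. The sketch below shows both, using a hypothetical `FakeTx` class rather than the real `Tx`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapSketch {
    // Hypothetical stand-in for Tx; two instances may share an id.
    static final class FakeTx {
        final String id;
        final String payload;
        FakeTx(String id, String payload) { this.id = id; this.payload = payload; }
        String getId() { return id; }
    }

    // Two entries with the same id, to force a key conflict.
    static List<FakeTx> duplicates() {
        return List.of(new FakeTx("id1", "first"), new FakeTx("id1", "second"));
    }

    // Two-argument toMap: a duplicate key raises IllegalStateException.
    static boolean twoArgThrowsOnDuplicate(List<FakeTx> txs) {
        try {
            txs.stream().collect(Collectors.toMap(FakeTx::getId, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Four-argument toMap as in the diff: the later value wins, result is a HashMap.
    static Map<String, FakeTx> lastWins(List<FakeTx> txs) {
        return txs.stream()
                .collect(Collectors.toMap(FakeTx::getId, Function.identity(), (x, y) -> y, HashMap::new));
    }

    public static void main(String[] args) {
        List<FakeTx> txs = duplicates();
        System.out.println(twoArgThrowsOnDuplicate(txs));
        System.out.println(lastWins(txs).get("id1").payload);
    }
}
```

So a merge function of `(x, y) -> y` silently keeps the later entry where the two-argument form would have failed; whether that is the desired behaviour for the DAO state is a judgment call for the reviewers.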
||
} | ||
|
||
@Override | ||
|
@@ -237,6 +250,7 @@ public String toString() { | |
",\n paramChangeList=" + paramChangeList + | ||
",\n evaluatedProposalList=" + evaluatedProposalList + | ||
",\n decryptedBallotsWithMeritsList=" + decryptedBallotsWithMeritsList + | ||
",\n txMap=" + txMap + | ||
"\n}"; | ||
} | ||
} |
How did caching at the DaoState level compare to caching at the Block level? Keeping objects in sync is complicated, and I'd be interested in understanding if the simpler block-level cache has most of the gain without any of the synchronization complication.
I'm not quite sure what you mean by caching at the block level. Do you mean adding a transient field of some kind to Block?
Yes. Instead of keeping a map of all txns that needs to be kept in sync in the DaoState object, the Block could just cache the list of transactions for itself, and the lookup functions would change from O(txns) to O(blocks). I only bring this up because it would require less complexity, and since there are no tests it may make it easier to guarantee correctness through just code review.
I fear that might not have much effect on performance, as usually there are very few txs in a block. On average we have 1 tx in 2 blocks.
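A rough sketch of the trade-off discussed here, under the stated assumption of about one tx per two blocks: a per-block cache still needs one probe per block until the tx is found, while a state-level map needs a single probe. `CacheLevelSketch`, `CachedBlock` and the string payloads are hypothetical, and counting probes is only a proxy for real timings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class CacheLevelSketch {
    // Hypothetical block that caches its own txs by id (block-level cache).
    static final class CachedBlock {
        final Map<String, String> txById = new HashMap<>(); // txId -> payload
    }

    // Builds a chain in which every second block holds exactly one tx.
    static List<CachedBlock> sampleChain(int blockCount) {
        List<CachedBlock> blocks = new ArrayList<>();
        for (int height = 0; height < blockCount; height++) {
            CachedBlock block = new CachedBlock();
            if (height % 2 == 1)
                block.txById.put("tx" + height, "payload" + height);
            blocks.add(block);
        }
        return blocks;
    }

    // Block-level lookup: one hash probe per block until found, i.e. O(blocks).
    // Returns how many blocks were visited (all of them if the tx is absent).
    static int blocksVisited(List<CachedBlock> blocks, String txId) {
        int visited = 0;
        for (CachedBlock block : blocks) {
            visited++;
            if (block.txById.containsKey(txId))
                break;
        }
        return visited;
    }

    // State-level index built once from all blocks.
    static Map<String, String> globalIndex(List<CachedBlock> blocks) {
        Map<String, String> txMap = new HashMap<>();
        blocks.forEach(block -> txMap.putAll(block.txById));
        return txMap;
    }

    // State-level lookup: a single hash probe regardless of chain length.
    static Optional<String> findViaGlobalMap(Map<String, String> txMap, String txId) {
        return Optional.ofNullable(txMap.get(txId));
    }

    public static void main(String[] args) {
        List<CachedBlock> blocks = sampleChain(1000);
        System.out.println("blocks visited: " + blocksVisited(blocks, "tx999"));
        System.out.println(findViaGlobalMap(globalIndex(blocks), "tx999").isPresent());
    }
}
```

With so few txs per block, the number of blocks visited is of the same order as the number of txs scanned by the original stream filter, which is consistent with the concern that a block-level cache would gain little.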