feat! content addressable transaction pool #935
Conversation
…n message passing (#875) This commit adds two metrics, AlreadySeenTxs and SuccessfulTxs, to both mempool implementations. A high ratio of AlreadySeenTxs to total txs indicates a high degree of duplication in the messages sent across the wire and validates the hypothesis that a content-addressable network would be advantageous in saving bandwidth (especially given the expected size of a tx). Co-authored-by: Rootul P <[email protected]>
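As a rough illustration of how these counters feed that ratio (the method and field names below are assumptions; only the metric names come from the commit):

```go
// Illustrative only: count a duplicate when an incoming tx is already known,
// otherwise count a successful addition. The AlreadySeenTxs / total-txs ratio
// can then be graphed to estimate duplication on the wire.
func (txmp *TxMempool) recordIncomingTx(alreadySeen bool) {
	if alreadySeen {
		txmp.metrics.AlreadySeenTxs.Add(1)
		return
	}
	txmp.metrics.SuccessfulTxs.Add(1)
}
```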
Wondering how much code here was copied from the existing implementations.
v2 started as a complete copy of v1, so there's a significant amount, although the protocol does make some substantial changes to information flow. For example, you no longer have a goroutine for every peer you connect with.
Reviewed 14/41 files but I'm pausing review, so I wanted to submit this first batch of comments.
Since the cat mempool code seems inspired by the priority mempool, I wonder if we need to address any of the performance issues Sergio alluded to in informalsystems/audit-celestia#36.
mempool/cat/cache.go
```go
// GetList returns the underlying linked-list that backs the LRU cache. Note,
// this should be used for testing purposes only!
```
[question] since `cache_test.go` lives inside the `cat` package, what are your thoughts on un-exporting this method? The question is motivated by an effort to enforce: "Note, this should be used for testing purposes only!"
Good point.
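A minimal sketch of the un-exported variant, reusing the receiver and field names quoted elsewhere in this review:

```go
// getList returns the underlying linked list that backs the LRU cache.
// Unexported so it is only reachable from tests inside the cat package.
func (c *LRUTxCache) getList() *list.List {
	return c.list
}
```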
mempool/cat/pool.go
```go
	// - We send multiple requests and the first peer eventually responds after the second peer has already
	if txmp.IsRejectedTx(key) {
		// The peer has sent us a transaction that we have previously marked as invalid. Since `CheckTx` can
		// be non-deterministic, we don't punish the peer but instead just ignore the msg
```
[nit] I think the mempool only deals with transactions and doesn't inspect the contents (i.e. messages)
Suggested change:

```diff
-		// be non-deterministic, we don't punish the peer but instead just ignore the msg
+		// be non-deterministic, we don't punish the peer but instead just ignore the tx
```
I still need to take a second look at the tests, but overall this is really really good. The spec and the ADR are excellently written. Really top shelf stuff.
```go
	if err != nil {
		// TODO: find a more polite way of handling this error
		panic(err)
	}
```
[optional] perhaps just a simple error log and an `os.Exit` call?
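A rough sketch of that alternative; the logger field and error message are assumptions, not the PR's code:

```go
	if err != nil {
		// Log and terminate instead of panicking; illustrative only.
		txmp.logger.Error("unrecoverable mempool error", "err", err)
		os.Exit(1)
	}
```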
```go
	// Add jitter to when the node broadcasts its seen txs to stagger when nodes
	// in the network broadcast their seenTx messages.
	time.Sleep(time.Duration(rand.Intn(10)*10) * time.Millisecond)
```
[question]
how critical is this to the protocol working? What about natural network latencies?
Non-critical. We could remove it. Staggering when we send these messages may reduce the amount we need to send, but it might mean in the worst case that a node becomes aware of a transaction's existence later than necessary.
```go
func (r *requestScheduler) ForTx(key types.TxKey) uint16 {
	r.mtx.Lock()
	defer r.mtx.Unlock()

	return r.requestsByTx[key]
}
```
[comment]
fwiw I found this name confusing, not sure I have a better suggestion tho.
The name of the method?
yeah, just the name of the method
```go
	}

	numExpired := txmp.store.purgeExpiredTxs(expirationHeight, expirationAge)
	txmp.metrics.EvictedTxs.Add(float64(numExpired))
```
[question]
are we adding to the evicted transaction metrics here but not adding them to the evicted cache?
Yeah, I remember thinking that there's a difference between eviction due to a TTL and eviction due to overflow but now I'm not sure if I agree with my earlier self.
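If TTL eviction should be treated the same as overflow eviction, a hedged sketch might look like the following; it assumes `purgeExpiredTxs` were changed to return the purged keys and that the evicted cache exposes a push-style method, neither of which is the PR's current API:

```go
	// Hypothetical variant: record TTL-purged txs in the evicted cache as
	// well as in the metric.
	purgedKeys := txmp.store.purgeExpiredTxs(expirationHeight, expirationAge)
	for _, key := range purgedKeys {
		txmp.evictedTxs.Push(key) // assumed evicted-cache method
	}
	txmp.metrics.EvictedTxs.Add(float64(len(purgedKeys)))
```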
We've been running the priority mempool in 100-node networks without any noticeable problems with performance. If there are any known problems then let me know and we can work them out. I'm also indifferent if we want to base CAT off the FIFO mempool. My understanding was that priority-based block production and eviction was an important design requirement.
```
@@ -184,6 +185,16 @@ func TestReactorWithEvidence(t *testing.T) {
		mempoolv1.WithPreCheck(sm.TxPreCheck(state)),
		mempoolv1.WithPostCheck(sm.TxPostCheck(state)),
	)
case cfg.MempoolV2:
```
[non-blocking] this switch statement for defining the mempool can probably be refactored into a helper. I think I've seen it 3x already.
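A hedged sketch of such a helper, reusing the constructor calls quoted above; the signature, the v2 (CAT) constructor call, and the default branch are assumptions:

```go
// makeMempool collects the repeated mempool-construction switch; illustrative only.
func makeMempool(
	conf *cfg.Config,
	logger log.Logger,
	appConn proxy.AppConnMempool,
	state sm.State,
) mempool.Mempool {
	switch conf.Mempool.Version {
	case cfg.MempoolV1:
		return mempoolv1.NewTxMempool(
			logger,
			conf.Mempool,
			appConn,
			state.LastBlockHeight,
			mempoolv1.WithPreCheck(sm.TxPreCheck(state)),
			mempoolv1.WithPostCheck(sm.TxPostCheck(state)),
		)
	case cfg.MempoolV2:
		// assumed CAT constructor; adjust to the real cat constructor signature
		return cat.NewTxPool(logger, conf.Mempool, appConn, state.LastBlockHeight)
	default:
		panic(fmt.Sprintf("unknown mempool version %q", conf.Mempool.Version))
	}
}
```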
```
@@ -0,0 +1,97 @@
package cat
```
[non-blocking] to properly test the thread-safe nature of the cache we should add a test that spins up a number of goroutines and goes through the various actions.
Something like: goroutine A is adding txs, goroutine B is removing txs, goroutine C is checking and evicting txs. They all have random sleeps in between actions, and then a common done channel that closes them and the test down after 10s or something.
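A rough sketch of such a test; `NewLRUTxCache` and `Push` are assumed names, while `Remove`, `Has`, and `Reset` are quoted elsewhere in this review. It relies on the race detector (`go test -race`) rather than assertions:

```go
func TestLRUTxCacheConcurrentAccess(t *testing.T) {
	cache := NewLRUTxCache(100) // assumed constructor name

	done := make(chan struct{})
	var wg sync.WaitGroup

	randomKey := func() types.TxKey {
		var key types.TxKey
		rand.Read(key[:])
		return key
	}

	// run spawns a worker that repeats op with a small random sleep until
	// the shared done channel is closed.
	run := func(op func()) {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-done:
					return
				default:
					op()
					time.Sleep(time.Duration(rand.Intn(5)) * time.Millisecond)
				}
			}
		}()
	}

	run(func() { cache.Push(randomKey()) })   // goroutine A: adding txs (Push is assumed)
	run(func() { cache.Remove(randomKey()) }) // goroutine B: removing txs
	run(func() { cache.Has(randomKey()) })    // goroutine C: checking txs
	run(func() { cache.Reset() })             // periodic wipe, standing in for eviction

	// Shut the workers and the test down after a fixed duration.
	time.Sleep(2 * time.Second)
	close(done)
	wg.Wait()
}
```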
Some initial thoughts; 12/42 files reviewed.
mempool/cat/cache.go
```go
	mtx      tmsync.Mutex
	size     int
	cacheMap map[types.TxKey]*list.Element
	list     *list.List
```
If I'm understanding this correctly, the size is static. I'm not sure if we have any standards around prefixes cc @evan-forbes but I'd recommend something like `static` so that it is clear that this field doesn't require acquiring the mutex in order to access it. Similarly the list should also have the `static` prefix as it manages its own mutex and doesn't require the `LRUTxCache` mutex for its operations to be threadsafe.
Suggested change:

```diff
-	mtx      tmsync.Mutex
-	size     int
-	cacheMap map[types.TxKey]*list.Element
-	list     *list.List
+	staticSize int
+	staticList *list.List
+	mtx        tmsync.Mutex
+	cacheMap   map[types.TxKey]*list.Element
```
Yeah you're right. I think the convention is to place the mutex only above the fields it guards. Here this was just copied from the v1 mempool implementation.
> Similarly the list should also have the static prefix as it manages its own mutex and doesn't require the LRUTxCache mutex for its operations to be threadsafe.

`list.List` does not have its own mutex and is managed by the LRUTxCache's mutex, so I will leave it as is.
I am currently going through some of the pull requests that are related to my upcoming tasks, to familiarize myself with the codebase. I'm not able to give a detailed review at this time, as I am still gathering more information and background. I'll make some non-critical suggestions and comments as I read through the code. :)
```go
	mtx tmsync.Mutex
```
Suggested change (the original and suggested lines appear to differ only in whitespace alignment):

```go
	mtx tmsync.Mutex
```
```go
	}
}

func (c *LRUTxCache) Reset() {
```
Some brief function descriptions for the LRUTxCache methods would be helpful.
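For instance, a hedged sketch of what a comment on `Reset` could say; the body shown follows the v1-style implementation and is an assumption, the point is just the doc comment:

```go
// Reset clears the cache: it drops every entry from the lookup map and
// re-initializes the backing linked list.
func (c *LRUTxCache) Reset() {
	c.mtx.Lock()
	defer c.mtx.Unlock()

	c.cacheMap = make(map[types.TxKey]*list.Element, c.size)
	c.list.Init()
}
```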
```go
	delete(c.cacheMap, txKey)

	if e != nil {
		c.list.Remove(e)
	}
```
Curious to know why the `delete` operation is not moved into the following `if` block? The current version does not error out, as `delete` is a no-op for invalid and non-existing keys, but logically it would make sense for both the `Remove` and `delete` operations to be conditioned on `e` not being nil, wdyt?
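Something along these lines; the lookup line is assumed from the surrounding Remove implementation:

```go
	// Only touch the map and the list when the key is actually present.
	e := c.cacheMap[txKey] // assumed to exist just above the quoted snippet
	if e != nil {
		delete(c.cacheMap, txKey)
		c.list.Remove(e)
	}
```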
```go
}

func (c *LRUTxCache) Remove(txKey types.TxKey) {
	c.mtx.Lock()
```
Following what has been implemented in the `Has` method (where it checks the cache size), a similar check can be done here too:
```diff
-	c.mtx.Lock()
+	if c.staticSize == 0 {
+		return
+	}
+	c.mtx.Lock()
```
Closing this as all the constituent parts have been completed.
Closes: #884