les: handle conn/disc/reg logic in the eventloop #16981
Ideally mention this in the title, e.g. with a [WIP] prefix.
a47417e to b434d76
Running connect/registered/disconnect in the event loop might be a cleaner solution so I'm not against it but I'd like to understand the issue first and I don't see how the described scenario could cause a data race. Both serverPool.registered (which calls setLatest) and the event pool handler of newly discovered peers are protected by the same pool.lock mutex. Probably I'm missing something because the data race detector did catch something but I don't see it yet.
b1be9ff to 6a5e4bb
@zsfelfoldi Sorry for my negligence, I missed the lock there. So I guess the data race might be caused by the
Besides, I cleaned up some unused code. PTAL
6a5e4bb to 3f8b3a5
les/serverpool.go
Outdated
}

// discReq represents a request for peer disconnection.
type discReq struct {
"disc" is easily associated with "discovery" too so I'd call it "disconn" instead.
les/serverpool.go
Outdated
p *peer
ip net.IP
port uint16
cont chan *poolEntry
What does "cont" stand for? I'd call this "done" or "result".
les/serverpool.go
Outdated
default:
}

if stopped {
Duplicating this logic here is weird and also unsafe. When pool.quit has been closed there is no guarantee that the event loop has actually stopped and no one will concurrently access the entry. What I'd do is
- put this logic into a function with stopped as a parameter
- disconnect should always create discReqs and send them to discCh (do not care about pool.quit here, always send it)
- add a for .. range loop in the main event loop's case <-pool.quit handler that reads from discCh until it is closed and calls the disconnect logic with stopped=true
- run pool.connWg.Wait() in a goroutine before that event loop and close discCh after Wait has returned
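A minimal, self-contained sketch of the shutdown sequence suggested in the list above. It is not the code from this PR: the `pool` type, `served`, `loopDone`, and `handleDisconnect` are hypothetical names, and it assumes every connect is matched by exactly one disconnect and that connect is never called after `quit` is closed.

```go
package main

import (
	"fmt"
	"sync"
)

// disconnReq is a hypothetical disconnection request served by the event loop.
type disconnReq struct {
	id   string
	done chan struct{}
}

type pool struct {
	disconnCh chan *disconnReq
	quit      chan struct{}
	loopDone  chan struct{}
	connWg    sync.WaitGroup
	served    []string // owned by the event loop
}

func newPool() *pool {
	p := &pool{
		disconnCh: make(chan *disconnReq),
		quit:      make(chan struct{}),
		loopDone:  make(chan struct{}),
	}
	go p.eventLoop()
	return p
}

// connect records one live connection. Assumption: it is never called after
// quit is closed, so Add cannot race with Wait.
func (p *pool) connect() { p.connWg.Add(1) }

// disconnect always sends the request, regardless of quit, and only then
// marks the connection as finished.
func (p *pool) disconnect(id string) {
	req := &disconnReq{id: id, done: make(chan struct{})}
	p.disconnCh <- req
	<-req.done
	p.connWg.Done()
}

// handleDisconnect is the single copy of the disconnect logic.
func (p *pool) handleDisconnect(req *disconnReq, stopped bool) {
	p.served = append(p.served, fmt.Sprintf("%s stopped=%v", req.id, stopped))
	close(req.done)
}

func (p *pool) eventLoop() {
	defer close(p.loopDone)
	for {
		select {
		case req := <-p.disconnCh:
			p.handleDisconnect(req, false)
		case <-p.quit:
			// Close disconnCh only after every connection has disconnected,
			// i.e. after the last sender is done with the channel.
			go func() {
				p.connWg.Wait()
				close(p.disconnCh)
			}()
			for req := range p.disconnCh {
				p.handleDisconnect(req, true)
			}
			return
		}
	}
}

func main() {
	p := newPool()
	p.connect()
	p.connect()
	p.disconnect("a") // served by the normal handler
	close(p.quit)
	p.disconnect("b") // may be served by either handler
	<-p.loopDone
	fmt.Println(p.served)
}
```

In this sketch the close is only safe because `Wait` cannot return before the last sender has finished with the channel; if a connection could be added after shutdown begins, that guarantee would no longer hold.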
@zsfelfoldi Thanks for your comments! For the |
Linter complains about an unnecessary return in line 207, otherwise fine.
I approved this PR too soon, it just threw a panic for me.
les/serverpool.go
Outdated
pool.knownQueue.setLatest(entry)
entry.shortRetry = shortRetryCnt
<-req.done
return
Linter complains about this, please remove.
les/serverpool.go
Outdated
close(req.done)
}

go func() {
This is not right, connWg is only Add-ed by connect so this Wait will pass instantly, closing disconnCh immediately. This also causes case req := <-pool.disconnCh to receive a nil req and panic. Put this goroutine in the case <-pool.quit handler instead, before running the final disconnect loop. This ensures that the normal disconn req handler will never receive nil and also that connWg will not be Add-ed after Wait.
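The two language behaviors behind this comment can be reproduced in isolation. The sketch below uses a hypothetical `disconnReq` type and helper names; it only shows that `Wait` on a zero-counter `sync.WaitGroup` returns at once and that receiving from a closed channel of pointers yields nil.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// disconnReq is a hypothetical stand-in for the request type in the PR.
type disconnReq struct{ id string }

// waitReturnsImmediately reports whether Wait on a WaitGroup whose counter
// is zero returns without blocking (checked with a generous timeout).
func waitReturnsImmediately() bool {
	var wg sync.WaitGroup
	returned := make(chan struct{})
	go func() {
		wg.Wait() // nothing was Add-ed, so this does not block
		close(returned)
	}()
	select {
	case <-returned:
		return true
	case <-time.After(time.Second):
		return false
	}
}

// recvFromClosed closes a channel of request pointers and receives from it.
// The receive does not block: it yields the zero value (nil) and ok == false.
func recvFromClosed() (*disconnReq, bool) {
	ch := make(chan *disconnReq)
	close(ch)
	req, ok := <-ch
	return req, ok
}

func main() {
	fmt.Println("Wait returned immediately:", waitReturnsImmediately())
	req, ok := recvFromClosed()
	fmt.Println("received nil:", req == nil, "ok:", ok)
}
```

A handler that dereferences the received request without checking `ok` (or nil) will panic once the channel has been closed.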
047d9b8 to 5cf0cfd
les/serverpool.go
Outdated
// Spawn a goroutine to close the disconnCh until all connections are disconnected.
go func() {
pool.connWg.Wait()
close(pool.disconnCh)
Why do you close this channel? In general closing channels that you are a reader of is very very dangerous and can almost always backfire. The general rule is that only the single-sender should ever close a channel. Otherwise it's very easy to end up in a panic.
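To illustrate why closing from the reader side is risky, here is a small sketch (the helper name `sendPanics` is hypothetical) showing that a send on a closed channel panics, while the same send on an open channel with free buffer space succeeds.

```go
package main

import "fmt"

// sendPanics attempts a send on ch and reports whether the send panicked.
// The channel must have free buffer space (or a receiver) when it is open,
// otherwise the send would block instead of returning.
func sendPanics(ch chan int) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	ch <- 1
	return false
}

func main() {
	open := make(chan int, 1)
	fmt.Println("send on open channel panicked:", sendPanics(open))

	closed := make(chan int, 1)
	close(closed) // closed by someone other than the sender
	fmt.Println("send on closed channel panicked:", sendPanics(closed))
}
```

If only the single sender ever closes the channel, it can never send after the close, which is why the rule avoids this panic by construction.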
les/serverpool.go
Outdated
pool.connWg.Done()
// Block until disconnection request be served.
pool.disconnCh <- req
<-req.done
This seems to be a racy construct to me:
- Check whether stopped; not yet, so continue to disconnect
- In the meantime the scheduler runs a different goroutine that stops the server pool
- Schedule this pool.disconnCh <- req, which will either crash (if closed) or hang (if open).
Actually, even if pool.quit is closed, this branch will trigger, causing a panic.
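One common way to avoid both outcomes, shown here only as a hedged sketch and not as what this PR ended up doing, is to never close the request channel and to guard every send with a `select` on the quit channel. The helper `trySend` and its parameters are hypothetical.

```go
package main

import "fmt"

// trySend delivers req to the event loop unless quit is closed first.
// Assumption: reqCh itself is never closed; shutdown is signaled only
// through quit, so the send can neither panic nor hang forever.
func trySend(reqCh chan<- string, quit <-chan struct{}, req string) bool {
	select {
	case reqCh <- req:
		return true
	case <-quit:
		return false
	}
}

func main() {
	reqCh := make(chan string) // unbuffered, and nobody is receiving
	quit := make(chan struct{})
	close(quit) // the pool has already been stopped

	// The send cannot proceed, so the quit case is taken.
	fmt.Println("delivered:", trySend(reqCh, quit, "peer-1"))
}
```

The check and the send happen in a single `select`, so there is no window between "check whether stopped" and "send" for another goroutine to stop the pool.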
Grammar nitpicks :P
les/serverpool.go
Outdated
close(pool.disconnCh)
}()

// Handle all remain disconnection requests before exit.
remain -> remaining
les/serverpool.go
Outdated
case <-pool.quit:
if pool.discSetPeriod != nil {
close(pool.discSetPeriod)
}
pool.connWg.Wait()

// Spawn a goroutine to close the disconnCh until all connections are disconnected.
until -> after
les/serverpool.go
Outdated
}
pool.setRetryDial(entry)
pool.connWg.Done()
// Block until disconnection request be served.
be -> is
@karalabe Updated
SGTM
* les: handle conn/disc/reg logic in the eventloop
* les: try to dial before start eventloop
* les: handle disconnect logic more safely
* les: grammar fix
This PR fixes #16610.
Almost all operations are now done in the event loop, which ensures concurrency safety. However, some operations exposed by serverpool (e.g. connect, disconnect, register) are handled in a different way, which can introduce concurrency problems.
E.g., as described in the issue, when a node is discovered the event loop tries to update newSelect to schedule a new dial. In the meantime, a peer's registration can happen while knownQueue is oversized, which leads to the least recent entry being deleted. So newSelect suffers from a concurrency problem.
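The general idea of routing exposed operations through the event loop can be sketched as follows. This is an illustrative toy, not the serverpool code: `pool`, `registerReq`, `registerCh`, and `entries` are hypothetical names, and the state is reduced to a single map.

```go
package main

import (
	"fmt"
	"sync"
)

// registerReq is a hypothetical request: it asks the event loop to record a
// peer and reports the resulting number of entries on done.
type registerReq struct {
	id   string
	done chan int
}

// pool keeps its state private to eventLoop, so no mutex is needed.
type pool struct {
	entries    map[string]bool
	registerCh chan *registerReq
	quit       chan struct{}
}

func newPool() *pool {
	p := &pool{
		entries:    make(map[string]bool),
		registerCh: make(chan *registerReq),
		quit:       make(chan struct{}),
	}
	go p.eventLoop()
	return p
}

// eventLoop serializes every state mutation in a single goroutine.
func (p *pool) eventLoop() {
	for {
		select {
		case req := <-p.registerCh:
			p.entries[req.id] = true
			req.done <- len(p.entries) // done is buffered, never blocks
		case <-p.quit:
			return
		}
	}
}

// register hands the request to the event loop and blocks until it is
// served. It returns -1 if the pool was stopped before the request was taken.
func (p *pool) register(id string) int {
	req := &registerReq{id: id, done: make(chan int, 1)}
	select {
	case p.registerCh <- req:
		return <-req.done
	case <-p.quit:
		return -1
	}
}

func main() {
	p := newPool()
	var wg sync.WaitGroup
	for _, id := range []string{"a", "b", "c", "a"} {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			p.register(id)
		}(id)
	}
	wg.Wait()
	fmt.Println("entries after concurrent registers:", p.register("a"))
	close(p.quit)
}
```

Callers on any goroutine see a blocking call, while all reads and writes of the shared state happen on one goroutine, so the discovery handler and the registration path can no longer interleave.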