Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewrite of provides to better select peers to send RPCs to #456

Merged
merged 8 commits into from
Jan 2, 2015

Conversation

whyrusleeping
Copy link
Member

The main change in this PR is the implementation of getClosestPeers that returns a channel of peers. progressively closer to the target key. This is used to put provider values to the best possible peers.

@whyrusleeping whyrusleeping added the status/in-progress In progress label Dec 16, 2014
@whyrusleeping
Copy link
Member Author

This isnt ready for CR yet, i just wanted to run the tests on jenkins.

@whyrusleeping
Copy link
Member Author

@jbenet @maybebtc im ready for a bit of CR on this one.

@@ -74,10 +75,10 @@ type PubKey interface {
type GenSharedKey func([]byte) ([]byte, error)

// Generates a keypair of the given type and bitsize
func GenerateKeyPair(typ, bits int) (PrivKey, PubKey, error) {
func GenerateKeyPair(typ, bits int, src io.Reader) (PrivKey, PubKey, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait why do you want to pass this in? it should always be crypto/rand.Reader. cannot trust any other randomness-- exposing it like this will likely create problems elsewhere. I suggest the following instead: inside of testutil add a GenerateTestKeyPair, which uses the time seeded rand. the interface to the rest of the codebase remains the same (no passing in rand), but tests can opt to use the faster one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that works too, I figured id just have our function mimic the interface of the RSA key generator and have it accept an io.Reader

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. Looking over it again looks like we dont generate keys many places -- that is, after the changes in #466 ... -- so not a big deal to introduce this Reader. might want to rebase on that sooner

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, thinking about it more, i think this is the right thing: https://github.com/jbenet/go-ipfs/pull/456/files#r22301316

@btc
Copy link
Contributor

btc commented Dec 23, 2014

@whyrusleeping you'll probably want to get looped in on #468

@whyrusleeping
Copy link
Member Author

Alright, got it rebased properly

@whyrusleeping
Copy link
Member Author

@jbenet @maybebtc Can you guys take a look at this? It improves the correctness of the DHT and makes the DHT tests more predictable. I could use another set of eyes over the changes ive done.

@btc
Copy link
Contributor

btc commented Dec 25, 2014

It improves the correctness of the DHT

Are we sure? Just ran the network tests locally. Not passing yet. https://build.protocol-dev.com/job/network-test/336/

@whyrusleeping
Copy link
Member Author

@maybebtc i discovered some fun new bugs after rebasing on master again... Ill have those fixes up soon. But theres one very weird one, where network.DialPeer is hanging...

@btc
Copy link
Contributor

btc commented Dec 25, 2014

@whyrusleeping we may want to merge #470 before anything else. it adds a go test that reproduces the docker network test failure we've been seeing.

@whyrusleeping
Copy link
Member Author

@maybebtc oh, that will be awesome. Is it RFCR?

@whyrusleeping
Copy link
Member Author

alright, so running go test ./routing/dht/ -v -run TestProvides reproduces the error im encountering. It looks to me like network.DialPeer is hanging inside of the ResponseHandler.

@jbenet could you take a look at this when you have time? Im very unfamiliar with that bit of the codebase ( @maybebtc if youre knowledgable in that area you should take a look too)

@jbenet
Copy link
Member

jbenet commented Dec 25, 2014

Oh the default network is going to try to run ProtocolIdentify. Some of the tests may not include a full network (though I think they do) and only have a dht handler. Maybe that's the issue you're seeing. If both networks are made with NewNetwork (or mocknet even) it should be fine.


Sent from Mailbox

On Wed, Dec 24, 2014 at 11:32 PM, Jeromy Johnson [email protected]
wrote:

alright, so running go test ./routing/dht/ -v -run TestProvides reproduces the error im encountering. It looks to me like network.DialPeer is hanging inside of the ResponseHandler.

@jbenet could you take a look at this when you have time? Im very unfamiliar with that bit of the codebase ( @maybebtc if youre knowledgable in that area you should take a look too)

Reply to this email directly or view it on GitHub:
#456 (comment)

@jbenet
Copy link
Member

jbenet commented Dec 28, 2014

@whyrusleeping sorry about that... there was indeed a bug in how streams were being handled. If you're curious: jbenet/go-peerstream@fc6b2a9 -- AFAICT, that hang is fixed. thanks for pushing me to find this and sorry we kept hammering our heads against the wall on this silly bug!

@jbenet
Copy link
Member

jbenet commented Dec 28, 2014

Though travis succeeded, re-running multiple times I sometimes get:

--- FAIL: TestLayeredGet (0.38 seconds)
    dht_test.go:501: interface was changed. GetValue should not use providers.
    dht_test.go:505: <nil>
    dht_test.go:508: should not get value.

@whyrusleeping
Copy link
Member Author

@jbenet i get that too... I dont really trust the test itself anymore.

@jbenet
Copy link
Member

jbenet commented Dec 28, 2014

@whyrusleeping fixed it in 0a24d92

I still see occasional:

--- FAIL: TestGetFailures (4.06 seconds)
    ext_test.go:50: Timeout test passed.
    ext_test.go:77: Expected ErrNotFound, got: context deadline exceeded

Which -- under proper functioning -- should not happen. ErrNotFound should be returned well before the ctx expires.

@jbenet
Copy link
Member

jbenet commented Dec 28, 2014

@whyrusleeping there's also a ton of extra logging as ERROR that shouldn't be there, or should be bumped down to DEBUG.

@@ -252,7 +253,7 @@ func identityConfig(nbits int) (config.Identity, error) {
}

fmt.Printf("generating key pair...")
sk, pk, err := ci.GenerateKeyPair(ci.RSA, nbits)
sk, pk, err := ci.GenerateKeyPair(ci.RSA, nbits, rand.Reader)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we break this out into:

ci.GenerateKeyPairWithReader(...) 
ci.GenerateKeyPair(...) // calls GenerateKeyPairWithReader with crypto/rand
testutil.NewTestKeyPair(...) // calls GenerateKeyPairWithReader with NewTimeSeededRand()

Tons of security bugs happen because someone somewhere called a function with the wrong params (something most compilers don't guard well against). I want to minimize the surface area. Particularly if people will be building the habit of using test keys with weak pseudorandomness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ci.GenerateKeyPairWithReader(...)

👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, makes sense

@@ -238,12 +237,12 @@ func (dht *IpfsDHT) Update(ctx context.Context, p peer.ID) {
}

// FindLocal looks for a peer with a given ID connected to this dht and returns the peer and the table it was found in.
func (dht *IpfsDHT) FindLocal(id peer.ID) (peer.PeerInfo, *kb.RoutingTable) {
func (dht *IpfsDHT) FindLocal(id peer.ID) peer.PeerInfo {
p := dht.routingTable.Find(id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

low priority but the latter would make more sense than the former

func (rt *RoutingTable) Find(id peer.ID) peer.ID
func (rt *RoutingTable) Exists(id peer.ID) bool

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thats leftover from when it returned a peer.Peer, fairly pointless after the refactor

@btc
Copy link
Contributor

btc commented Dec 29, 2014

A general observation...

Overall, would benefit from clearer interface specification. Each function performs some actions, but doesn't clearly address its spec/contract. Important considerations:

How do parameters affect the behavior of the function?
What are the possible return values? Does the function return any named errors? (impl concern: Are the right errors returned in each case?)
What guarantees are provided?
Are there any side-effects?

When designing a functional signature, if these things are not carefully considered and documented, when irregularities are discovered (eg. len(count) versus len(tablepeers)), readers are faced with the difficult task of guessing the designer's intent.

Over time, ill-specified interfaces cause callers to depend on undocumented behaviors and side-effects. This harms composition.

It's easy to pass tests and get away with unspecified interfaces in the short term. In the long term, causes numerous problems.

It slows down development by creating knowledge silos. Even after thorough inspection, only the original author is in a position to correctly interpret the intended behavior.

It leads to instability as callers write new code on a foundation of incorrect assumptions.

Certainly, much of these concerns have to do with old code that you didn't necessarily write. That's why it's important to consider the simple boy scout rule:

Leave things better than you found them.

If along the natural course of development, you examine a function, come to understand it, and the function is missing a specification/contract (in the form of a Godoc comment), leave a simple comment for others who follow.

If there's some vestigial code, take the opportunity to clean it up.

"There is a genuine danger of going down a rabbit hole here, as you fix one thing you spot another, and another, and before long you're deep in yak hair. Skillful opportunistic refactoring requires good judgement, where you decide when to call it a day. You want to leave the code better than you found it, but it can also wait for another visit to make it the way you'd really like to see it. If you always make things a little better, then repeated applications will make a big impact that's focused on the areas that are frequently visited - which are exactly the areas where clean code is most valuable. Like most aspects of programming this decision requires thoughtfulness."

@whyrusleeping
Copy link
Member Author

The sharness test is failing because the last echo is trying to print a newline '\n' but its actually printing '\n' to the screen.

edit: Or maybe it decided not to fail that time? Thats bizarre.

whyrusleeping and others added 7 commits January 2, 2015 07:42
refactor test peer creation to be deterministic and reliable

a bit of cleanup trying to figure out TestGetFailure

add test to verify deterministic peer creation

switch put RPC over to use getClosestPeers

rm 0xDEADC0DE

fix queries not searching peer if its not actually closer
The test was occasionally passing because:
- it called `putLocal(key, val)`
- GetValue calls `getLocal(key)` optimistically.

cc @whyrusleeping
Also remove older useless code.
@@ -33,9 +32,9 @@ func init() {
}
}

func setupDHT(ctx context.Context, t *testing.T, addr ma.Multiaddr) *IpfsDHT {
func setupDHT(ctx context.Context, t *testing.T, addr ma.Multiaddr, seed int64) *IpfsDHT {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's painful to put the seeds everywhere in the interface, the testutil package is there to have all the sane defaults for tests. Lets use a shared seed (the test shared time-seeded rand, or the current time itself), and move the bits # into testutil as a default.

jbenet added a commit that referenced this pull request Jan 2, 2015
rewrite of provides to better select peers to send RPCs to
@jbenet jbenet merged commit b86101b into master Jan 2, 2015
@jbenet jbenet deleted the provides-rewrite branch January 2, 2015 09:56
@jbenet jbenet removed the status/in-progress In progress label Jan 2, 2015
ariescodescream pushed a commit to ariescodescream/go-ipfs that referenced this pull request Oct 23, 2021
…com/ipfs/go-datastore-0.4.4

build(deps): bump github.com/ipfs/go-datastore from 0.4.2 to 0.4.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants