Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interop tests are failing against master #7390

Closed
achingbrain opened this issue May 29, 2020 · 18 comments · Fixed by #7395
Closed

Interop tests are failing against master #7390

achingbrain opened this issue May 29, 2020 · 18 comments · Fixed by #7395
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization

Comments

@achingbrain
Copy link
Member

achingbrain commented May 29, 2020

It looks like CI ignores failing interop tests:

https://app.circleci.com/pipelines/github/ipfs/go-ipfs/2908/workflows/0b7a5ebf-cb6c-4960-b19e-fd4c1b61983c/jobs/34278

This doesn't seem great because without passing tests there's no guarantee the code does what it claims to do and can then prevent the release of downstream projects that do not ignore failing interop tests.

@achingbrain achingbrain added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels May 29, 2020
@Stebalien
Copy link
Member

It looks like an issue with the mocha reporter we were using. I'm working on a fix in #7395.

@achingbrain
Copy link
Member Author

Reopening because #7395 doesn't solve the problem and the interop tests are still failing against master.

Following on from the discussion in #7394 running the tests with pubsub logging turned on shows this sort of thing (js is prefixed with libp2p:gossipsub, go with DEBUG and WARN):

Successful:

DEBUG	pubsub	[email protected]/gossipsub.go:304	JOIN pubsub-non-ascii
WARN	pubsub	[email protected]/log.go:180	bootstrap: error providing rendezvous for pubsub-non-ascii: failed to find any peer in table
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub JOIN [ 'pubsub-non-ascii' ]
libp2p:gossipsub JOIN: Add mesh link to QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-non-ascii
DEBUG	pubsub	[email protected]/gossipsub.go:216	GRAFT: Add mesh link from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb in pubsub-non-ascii
Found QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn on attempt 1 in 22ms
Found QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb on attempt 1 in 3ms
---> sending message
libp2p:gossipsub publish pubsub-non-ascii <Buffer e4 bd a0 e5 a5 bd e4 b8 96 e7 95 8c>

      ✓ should exchange non ascii data (41ms)

Failure:

DEBUG	pubsub	[email protected]/gossipsub.go:304	JOIN pubsub-ascii
WARN	pubsub	[email protected]/log.go:180	bootstrap: error providing rendezvous for pubsub-ascii: failed to find any peer in table
libp2p:gossipsub JOIN [ 'pubsub-ascii' ]
DEBUG	pubsub	[email protected]/gossipsub.go:84	PEERUP: Add new peer QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb using /meshsub/1.0.0
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub HEARTBEAT: Add mesh link to QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-ascii
WARN	pubsub	[email protected]/log.go:175	unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb
DEBUG	pubsub	[email protected]/gossipsub.go:459	HEARTBEAT: Add mesh link to QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb in pubsub-ascii
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
2020-06-01T21:31:39.729Z libp2p:gossipsub GRAFT: Add mesh link from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-ascii

Found QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn on attempt 2 in 1034ms
Found QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb on attempt 1 in 10ms
---> sending message
libp2p:gossipsub publish pubsub-ascii <Buffer 68 65 6c 6c 6f 20 77 6f 72 6c 64>
WARN	pubsub	[email protected]/log.go:175	unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb
libp2p:gossipsub connected QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn

The key here seems to be the unexpected message warning go prints after js has printed libp2p:gossipsub publish indicating it's publishing a message:

WARN	pubsub	[email protected]/log.go:175	unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb

The warning seems to come from here. From what I can see it's reading a protobuf message out of the stream, then printing the 'unexpected message' warning and ignoring the message.

Incoming messages are handled by the handleNewStream method, handlePeerEOF is called from handleNewPeer which does open a stream to the new peer - should it be calling handleNewStream internally with that stream instead?

@achingbrain achingbrain reopened this Jun 2, 2020
@Stebalien
Copy link
Member

See libp2p/go-libp2p-pubsub#133. go-libp2p-pubsub uses unidirectional streams. It looks like js-libp2p-pubsub isn't. Is that correct?

@Stebalien
Copy link
Member

cc @vyzo

@vyzo
Copy link
Contributor

vyzo commented Jun 2, 2020

Yes, go-libp2p-pubsub uses unidirectional streams.

@vasco-santos
Copy link
Member

See libp2p/go-libp2p-pubsub#133. go-libp2p-pubsub uses unidirectional streams. It looks like js-libp2p-pubsub isn't. Is that correct?

Yes, js-libp2p-pubsub is using the same stream to receive and send pubsub messages: https://github.com/libp2p/js-libp2p-pubsub/blob/master/src/index.js#L201-L203

@achingbrain
Copy link
Member Author

@vasco-santos Sounds like that might be incorrect? Can it be changed to implement the protocol properly and use unidirectional streams?

@vasco-santos
Copy link
Member

Taking into account that an implementation must have this into consideration, we should add a reference for this in the spec.

@vasco-santos Sounds like that might be incorrect? Can it be changed to implement the protocol properly and use unidirectional streams?

We have been using bidirectional streams, at least since the async await refactor. I am not sure on how this was implemented with pull streams.
I was not aware of go using unidirectional streams, but I get it after reading all the discussions referenced in libp2p/go-libp2p-pubsub#133.

I have created libp2p/js-libp2p-pubsub#45 to use unidirectional streams. This is keeping the outbound stream open for all the messages for now. It will need more changes, if we want to open one every time we want to push a message (which is what is being done in go as far as I understood).

This PR is on top of new breaking releases of the pubsub subsystem + routers, which need [email protected] in js-ipfs. I have tested it with both JS routers (needed to update the tests as they were expecting bidirectional streams), js-libp2p tests, and libp2p interop tests and this change seems to not have a negative impact anywhere.

I will now create a branch from the old versions of js-libp2p-pubsub and js-libp2p-gossipsub with this change, so that you can test it with ipfs interop tests. If this fixes the issue, I will be creating a release for the older versions so that this can be fixed without updating to the new libp2p.

@vasco-santos
Copy link
Member

@achingbrain can you retry installing the following branches in ipfs?

Let me know if this works, so that we can handle the reviews and releases

@achingbrain
Copy link
Member Author

@vasco-santos running interop with js-ipfs depending on these PRs makes the ipns-pubsub and pubsub tests pass. Ship it!

@achingbrain
Copy link
Member Author

@vasco-santos
Copy link
Member

[email protected] is shipped! You can go back to use the previous versions of libp2p-floodsub and libp2p-gossipsub. Their PRs just fix the tests

@vyzo
Copy link
Contributor

vyzo commented Jun 3, 2020

there is the larger issue that the latest version of js-pubsub is not interoperable with go-pubsub; what were you guys thinking?

@achingbrain
Copy link
Member Author

what were you guys thinking?

Interesting question. What I'm currently thinking is there's clearly no meaningful interop testing being done between go-libp2p and js-libp2p which is a gap that needs to be filled.

I can see there's an interop repo and it has an open issue about doing automated testing. Maybe this is something that go-libp2p and js-libp2p could add to their respective CI builds sooner rather than later.

Where automated testing is being done is at the ipfs level and the failing interop tests have so far prevented this PR being merged and js-ipfs claiming compatibility with go-ipfs 0.5.x.

The interop tests have been failing in go-ipfs master for months but for whatever reason they haven't been breaking builds which seems to just be a bug which has since been squashed.

we should add a reference for [unidirectional streams] to the spec.

Also interesting - stream setup seems pretty fundamental - is it documented anywhere outside of the go source code or GitHub comments? I can't see anything relevant in the spec.

@vasco-santos
Copy link
Member

I agree with @achingbrain

Just have a few thoughts to add:

  • I have been trying to keep up with manual interop tests in libp2p land for the time being
    • this has been difficult to achieve because go-libp2p release process does not include updating the daemon.
    • js-libp2p has a CI job for running the interop tests, but in go-libp2p is harder to get this, since we need to generate binaries for each environment beforehand, update the libp2p/npm-go-libp2p-dep, as a consequence of our current setup.
  • The full interop tests automation got deprioritized for now.
    • the main reason for this is the intention to use testground for the interop tests, meaning this would be duplicated work

I think that if we have go-libp2p releases coming with the go-libp2p-daemon updated would help a lot preventing these events. I currently watch the go-libp2p-daemon releases and once that happens, I manually get it to the interop. While this is not perfect and is currently a single point of failure (only me), I think this initial step would be helpful before we get the interop tests in testground.

@vyzo
Copy link
Contributor

vyzo commented Jun 4, 2020

We definitely need to add a reference to the spec about unidirectional streams.

@achingbrain
Copy link
Member Author

@vyzo in the absence of testground being useable for interop tests, it'd also be incredibly helpful if you could work with @vasco-santos to figure out how to run the existing suite in CI for go & js libp2p as it sounds like he has to jump through a lot of hoops to do this at the moment. That would reduce the likelihood of this sort of bug reappearing.

@achingbrain
Copy link
Member Author

Closing this issue as the interop tests are passing again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants