Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow receiving multiple FD over ControlMessage in unix sockets #473

Closed
wants to merge 2 commits into from

Conversation

roblabla
Copy link
Contributor

Fixes #464

@fiveop
Copy link
Contributor

fiveop commented Nov 19, 2016

Thanks, looks good to me. Could you add an entry to the CHANGELOG.md and write a test based on your failing example in #464?

@roblabla roblabla force-pushed the bugfix-multipleFdInCmsg branch from 93b6cfc to f2ffc05 Compare November 19, 2016 14:46
@roblabla
Copy link
Contributor Author

There's still a problem with the second test (the one that sends multiple cmsgs). I believe this is due to a mistake in the way space/len of CmsgSpace is implemented (although I'm uncertain).

@roblabla
Copy link
Contributor Author

Nevermind my previous comment. I think the problem comes from

let buf2 = &mut &mut buf[padlen..];
. This creates a new pointer, writes the FD in it, but the "old" buf pointer isn't incremented by the size of the FD.

This means that when writing the second control message, it will overwrite the data of the first control message. I am unsure of how to fix this though, the code is pretty hard to follow and I'm unsure of how it interacts with lifetimes and all.

Copy link
Contributor

@fiveop fiveop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented on the easy problems. I'll read through socket/mod.rs on thursday and hope to be able to help with the problem afterwards.


#[test]
fn test_scm_rights_multiple_fd() {
use std::os::unix::net::UnixDatagram;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import breaks ARM and 1.7.0 builds.

@fiveop
Copy link
Contributor

fiveop commented Dec 3, 2016

I now have an idea what the problem is. copy_bytes used by encode_into called in sendmsg updates dst so that it points after the bytes just copied into. This works for the cmsghdr. For the second copy_bytes, however, another slice buf2 is used (which probably gets updated), thus buf (in encode_into) or cmsg_buffer (in sendmsg) is not updated. When two ControlMessages are encoded, this is a problem.

@homu
Copy link
Contributor

homu commented Dec 16, 2016

☔ The latest upstream changes (presumably #478) made this pull request unmergeable. Please resolve the merge conflicts.

homu added a commit that referenced this pull request Jan 11, 2017
Fix ControlMessage::encode_into when encoding multiple messages

copy_bytes updates dst so that it points after the bytes that were just
copied into it. encode_into did not advance the buffer in the same way
when encoding the data.

See #473
@kamalmarhubi
Copy link
Member

@fiveop did your #489 fix this issue? If so, should we close this PR?

@fiveop
Copy link
Contributor

fiveop commented Feb 15, 2017

No. It was a step toward fixing it.

@Susurrus
Copy link
Contributor

Susurrus commented Jun 5, 2017

What's necessary to wrap this PR up and get everything merged?

@roblabla roblabla force-pushed the bugfix-multipleFdInCmsg branch 2 times, most recently from 63b9f00 to 18f705d Compare June 5, 2017 14:40
@roblabla
Copy link
Contributor Author

roblabla commented Jun 5, 2017

Well, I rebased the commits on the latest master, which makes this mergeable again. Worth noting that I ran the "Add some tests" commit locally (independently of the fix) to make sure the bugs were still present. Indeed they are.

After my fix, test_scm_rights_multiple_cmsg still fails though, and test_scm_rights_multiple_fd segfaults on armv7 and aarch64

@Susurrus
Copy link
Contributor

Susurrus commented Jun 6, 2017

@roblabla Thanks for updating this!

Could you add some more details to the "Add some tests" commit and also some comments to the tests summarizing basically the discussion here for some context.

The segfaults are likely QEMU bugs as those are run on that. You can just ignore those tests for those targets.

But why is the "fixed" test still failing? Is the test not correct or is your "fix" not actually the correct fix? You didn't offer any commentary on that here, would you like a second pair of eyes on this or do you know how to fix it?

@roblabla
Copy link
Contributor Author

roblabla commented Jun 7, 2017

My main use when I made the PR was to fix my use-case, which is multiple FD. But while testing, I discovered there really were two bugs (depending on how you achieve multiple FD, with one cmsg containing multiple FD, or multiple cmsg containing each one FD).

I fixed one bug (multiple FD, one CMSG), but never quite managed to figure out the source of the other (multiple CMSG).

So the test is right (at least as far as I can see), but there is still a bug in nix somewhere.

I'll add some details to the commit as soon as I'm on a work machine.

@lucab
Copy link
Contributor

lucab commented Aug 25, 2017

Bump. Is there any way to make progress and eventually land this?

@Susurrus
Copy link
Contributor

@lucab I think we're basically waiting to resolve the second bug, which has not been resolved if I'm following this correctly.. If you're willing to dive into this, your help would be greatly appreciated and we could get this merged. @roblabla Can you confirm this is the current state of things and maybe provide any more context that's missing if you aren't able to continue to push this forward?

@roblabla
Copy link
Contributor Author

Well, this PR basically fixes what I needed fixed (I needed the ability to send multiple FD over one message, which is fixed), and I don't have the time to figure out what the other bug is.

IIRC, there's a problem in the way we encode multiple Control Messages which causes the kernel to only consider the first Control Message sent. The specifics of where the bug is, I have no idea. I mean, what needs to be done is take the failing test and figure out what's wrong with it.

In the meantime, maybe I could set the second test as ignored so we can land the first fix, and make an issue about the second bug so we can track further progress there ?

@Susurrus
Copy link
Contributor

@roblabla Yes, can you cut out the second test into it's own issue (someone can then take it and make a PR to fix it). And then we can merge your single fix and test case for it.

@Susurrus
Copy link
Contributor

This could also use a rebase and the CHANGELOG entry needs to be moved up into the current Unreleased section.

@Susurrus
Copy link
Contributor

Susurrus commented Sep 1, 2017

@roblabla Just wanted to ping you on this as I'd love to get it merged!

roblabla and others added 2 commits September 1, 2017 12:53
We currently don't make sure sending multiple FD over a single Control
Message or sending multiple Control Message works. We add two tests, and
mark them as ignored since the behavior is currently broken.
@roblabla roblabla force-pushed the bugfix-multipleFdInCmsg branch from 664e4cb to 4debb45 Compare September 1, 2017 10:56
@roblabla
Copy link
Contributor Author

roblabla commented Sep 1, 2017

Thanks for reminding me ! It had slipped from my mind ^^'.

Should I still remove the UnixDatagram stuff from the test ? Those types were added in 1.10, AFAIK we support minimum of 1.13 ?

@Susurrus
Copy link
Contributor

Susurrus commented Sep 1, 2017

I think the UnixDatagram stuff is the bug we haven't figure out yet, yes? I assume at this point it isn't a wrong value in a constant. So can you create a separate issue for that and copy/paste the test code there and give as much context as you can towards the error?

@roblabla
Copy link
Contributor Author

roblabla commented Sep 1, 2017

I'm not sure we understood eachother ? I meant that I'm using std::net::unix::UnixDatagram in the tests, which fiveop told me would break ARM 1.7 builds way back when, so I was wondering if the comment was still valid - that is, if I should be using the nix functions to create a UnixDatagram (which I find less convenient).

#756 contains the details of what I know about the remaining bug.

I'm waiting on the travis results to ignore out the test on the platforms that use the broken qemu implementation :)

@Susurrus
Copy link
Contributor

Susurrus commented Sep 5, 2017

@roblabla We aim to support Rust 1.13, so if the tests pass on all supported platforms (which test the minimum supported Rust version), then we should be fine!

I saw #756, which does a good job detailing the other bug you found, so thanks for that.

And Travis finished but found a few failures, so those'll need to be addressed (I haven't looked at them).

@roblabla
Copy link
Contributor Author

roblabla commented Sep 5, 2017

So, it looks like it segfaults basically every single qemu instance. Besides, I also have failures on IOS and MacOS, but the output is empty 😕.

Should I just #cfg[] the tests out ?

No output :

List of qemu-related failures (qemu segfaults) :

  • aarch64-unknown-linux-gnu
  • arm-unknown-linux-gnueabi
  • armv7-unknown-linux-gnueabihf
  • mips-unknown-linux-gnu
  • mips64-unknown-linux-gnuabi64
  • mips64el-unknown-linux-gnuabi64
  • mipsel-unknown-linux-gnu
  • powerpc64-unknown-linux-gnu
  • powerpc64le-unknown-linux-gnu

@Susurrus
Copy link
Contributor

Susurrus commented Sep 6, 2017

I guess so. Add a clear comment and use the travis feature with #cfg to ignore them.

@Susurrus
Copy link
Contributor

Susurrus commented Sep 6, 2017

We would ideally file a bug with upstreaam and include that in the comment about why we exclude it.

@Susurrus
Copy link
Contributor

Susurrus commented Dec 4, 2017

It's been about 3 months since there's been any movement on this, so I'm going to go ahead and close this.

@Susurrus Susurrus closed this Dec 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants