Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-use MultiHeader<()> in subsequent recvmmsg() calls #2506

Closed
OliverNChalk opened this issue Sep 18, 2024 · 14 comments · Fixed by #2530
Closed

Re-use MultiHeader<()> in subsequent recvmmsg() calls #2506

OliverNChalk opened this issue Sep 18, 2024 · 14 comments · Fixed by #2530

Comments

@OliverNChalk
Copy link

Problem

Before call:

MultiHeaders { items: [mmsghdr { msg_hdr: msghdr { msg_name: 0x1, msg_namelen: 0, msg_iov: 0x5583b86edbe0, msg_iovlen: 1, msg_control: 0x5583b86edc30, msg_controllen: 32, msg_flags: 0 }, msg_len: 0 }], addresses: [core::mem::maybe_uninit::MaybeUninit<()>], _cmsg_buffers: Some([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), msg_controllen: 32 }

After call:

MultiHeaders { items: [mmsghdr { msg_hdr: msghdr { msg_name: 0x1, msg_namelen: 28, msg_iov: 0x5583b86edbe0, msg_iovlen: 1, msg_control: 0x5583b86edc30, msg_controllen: 32, msg_flags: 0 }, msg_len: 1200 }], addresses: [core::mem::maybe_uninit::MaybeUninit<()>], _cmsg_buffers: Some([32, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 35, 0, 0, 0, 30, 33, 234, 102, 0, 0, 0, 0, 139, 247, 225, 58, 0, 0, 0, 0]), msg_controllen: 32 }

The main thing to call out is that the kernel set the msg_namelen value to 28. This makes re-using MultiHeader for subsequent reads impossible because the kernel will complain with err=Bad address (os error 14) due to the fact we have a msg_name: NULL and msg_namelen: 28 in the input.

I'd like to re-use MultiHeader for subsequent reads and have confirmed it works if I instead use MultiHeaders<UnixAddr>, as we now have space of the address to be written.

Potential Solution

Letting the caller set the msglen would allow us to re-zero it so the kernel doesn't think we have space available for an address when we don't.

Side Question

Do you know why msg_name renders like this: msg_hdr.msg_name: 0x1, I would have thought 0x0 for NULL?

@SteveLauC
Copy link
Member

SteveLauC commented Sep 22, 2024

I'd like to re-use MultiHeader for subsequent reads

From git history, MultiHeaders was added to avoid the allocation in the loop within recvmmsg() in #1744, gentle ping on @pacak, could you please take a look at this issue?

@SteveLauC SteveLauC changed the title surprising behavior in MultiHeader<()> and recvmmsg() Re-use MultiHeader<()> in subsequent recvmmsg() calls Sep 22, 2024
@SteveLauC
Copy link
Member

Do you know why msg_name renders like this: msg_hdr.msg_name: 0x1, I would have thought 0x0 for NULL?

I guess this is an optimization done by Rust, MultiHeader::preallocate() uses MaybeUninit::<S> under the hood, MaybeUninit<()> does not need to be writable, so Rust does not need to allocate memory, and thus MaybeUninit<()>.as_ptr() could use something like std::ptr::dangling():

$ cat src/main.rs 
#![feature(strict_provenance)]

fn main() {
    println!("{:p}", std::ptr::dangling() as *const nix::libc::c_void);
}

$ cargo +nightly r -q
0x1

@pacak
Copy link
Contributor

pacak commented Sep 22, 2024

According to documentation for recvmsg, msg_name is optional and it is valid for it to be NULL:

The msg_name field points to a caller-allocated buffer that is used to return the source address if the socket is unconnected. The caller should set msg_namelen to the size of this buffer before this call; upon return from a successful call, msg_namelen will contain the length of the returned address. If the application does not need to know the source address, msg_name can be specified as NULL.

We've been using it in production in this way for quite a while now while reusing reallocated headers with no problems on Linux.

The fact that msg_name is not NULL and kernel manages to write 28 bytes there even though it was told that the buffer is 0 bytes long is a bit worrying.

@pacak
Copy link
Contributor

pacak commented Sep 22, 2024

I guess API can/should be extended to be able to set and reset buffer length? My main use case was working with multicast streams so knowing source address is not necessary.

@pacak
Copy link
Contributor

pacak commented Sep 22, 2024

A data point, I'm using fbdac70, about to upgrade. Both before and after I'm seeing msg_name: 0, msg_namelen: 0. Off to upgrade to more recent version.

@pacak
Copy link
Contributor

pacak commented Sep 22, 2024

0.25 works, 0.26 does not - producing msg_name: 1, going though the changelog.

@pacak
Copy link
Contributor

pacak commented Sep 22, 2024

#2041 - regression originates here

@pacak
Copy link
Contributor

pacak commented Sep 22, 2024

Reverting this "fixes" this problem by initializing address with a null pointer, not sure if things fixed by #2041 are still working after that.

-    (*p).msg_name = (*address).as_mut_ptr() as *mut c_void;
+    (*p).msg_name = address as *mut c_void;

I guess the right fix would be to figure out how to make it work as expected. From what I can tell this code existed before my changes to recvmmsg... Poking around a bit more.

@pacak
Copy link
Contributor

pacak commented Sep 22, 2024

core::mem::maybe_uninit::MaybeUninit<()> is a zero sized type, vec of them occupies zero space, iterating over each member gets mutable pointer with addr 1 and size 0. recvmmsg is not aware of such tricks. So the fix would be to do something like this

This fixes the problem:

(*p).msg_name = if S::size() == 0 { ptr::nul_mut() } else { (*address).as_mut_ptr() as *mut c_void }

Proper fix should include some tests for both sized and unsized addresses. There might be similar patterns in the code.

@pacak
Copy link
Contributor

pacak commented Sep 23, 2024

After #2091 this rust-analyzer is no longer useful in the default state. What's the trick?

@OliverNChalk
Copy link
Author

If you're using vscode you can:

    "rust-analyzer.cargo.features": "all",

in your global/workspace settings.json

@pacak
Copy link
Contributor

pacak commented Sep 23, 2024

I'm using neovim. This kind of setting doesn't belong to global editor settings - I'm working on other projects as well. I'll look into making a fix after I figure out the config - fairly low priority at the moment.

@OliverNChalk
Copy link
Author

global/workspace

@pacak
Copy link
Contributor

pacak commented Sep 24, 2024

global/workspace

I made this that fixes the problem with R-A: #2516

There will be one more to fix rustfmt and after that will make a fix for recvmmsg

pacak added a commit to pacak/nix that referenced this issue Oct 30, 2024
The msg_name field points to a caller-allocated buffer that is used to
return the source address if the socket is unconnected. The caller
should set msg_namelen to the size of this buffer before this call; upon
return from a successful call, msg_namelen will contain the length of
the returned address. If the application does not need to know the
source address, msg_name can be specified as NULL.

In case we use () msgname_len gets initialized with 0, but pointer to
the array with msg_name. This works for the first iteration somehow, but
after that kernel sets msgname_len to a non-zero and second invocation
with the same MultiHeader fails

Fixes nix-rust#2506
pacak added a commit to pacak/nix that referenced this issue Nov 2, 2024
The msg_name field points to a caller-allocated buffer that is used to
return the source address if the socket is unconnected. The caller
should set msg_namelen to the size of this buffer before this call; upon
return from a successful call, msg_namelen will contain the length of
the returned address. If the application does not need to know the
source address, msg_name can be specified as NULL.

In case we use () msgname_len gets initialized with 0, but a dangling
pointer to the array with msg_name. This works for the first iteration
somehow, but after that kernel sets msgname_len to a non-zero and second
invocation with the same MultiHeader fails

Fixes nix-rust#2506
pacak added a commit to pacak/nix that referenced this issue Nov 2, 2024
The msg_name field points to a caller-allocated buffer that is used to
return the source address if the socket is unconnected. The caller
should set msg_namelen to the size of this buffer before this call; upon
return from a successful call, msg_namelen will contain the length of
the returned address. If the application does not need to know the
source address, msg_name can be specified as NULL.

In case we use () msgname_len gets initialized with 0, but a dangling
pointer to the array with msg_name. This works for the first iteration
somehow, but after that kernel sets msgname_len to a non-zero and second
invocation with the same MultiHeader fails

Fixes nix-rust#2506
github-merge-queue bot pushed a commit that referenced this issue Nov 3, 2024
* Use as_mut_ptr() to initialize msg_name in pack_mhdr_to_receive

The msg_name field points to a caller-allocated buffer that is used to
return the source address if the socket is unconnected. The caller
should set msg_namelen to the size of this buffer before this call; upon
return from a successful call, msg_namelen will contain the length of
the returned address. If the application does not need to know the
source address, msg_name can be specified as NULL.

In case we use () msgname_len gets initialized with 0, but a dangling
pointer to the array with msg_name. This works for the first iteration
somehow, but after that kernel sets msgname_len to a non-zero and second
invocation with the same MultiHeader fails

Fixes #2506

* CI doesn't check for rustfmt but I'm tired of picking stuff
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants