Not so careful review comments #1
Comments
Thanks for the feedback!
This can certainly be done. I'll create a separate issue to track progress on this.
I hadn't considered the potential for information leaking here. This will get a separate issue too.
I completely missed the potential race here. This will also get a separate issue.
Indeed, stream mode is planned. It is my understanding that Qubes' libvchan exposes both of these interfaces (POSIX read/write semantics as well as getting the size). For completeness, it would probably be best to implement both of these, I suppose.
Great catch, this is a tricky one. It seems that
Yeah, this is definitely temporary. From my initial inspection,
That's embarrassing. I'll fix that ASAP.
This is correct. In fact, the library already supports this if a user were to use uio handles for both the client and server. This whole interface is a WIP, though, and the end goal is, of course, to emulate the existing libvchan interface and allow the vchans to be addressed by domain number and devno.

On this note, I have begun work on a daemon which runs on the host to facilitate this. The idea is to have a dedicated ivshmem device on each VM that's solely used for communication with the daemon. The daemon would provide facilities for resolving domain numbers to the appropriate shared memory handles, as well as allocation of memory within VM-VM ivshmem regions to allow creation of an arbitrary number of ringbuffers (within memory constraints, of course). Currently, the daemon uses libvirt to detect VM configuration and resolve domain numbers. I must admit I'm not too familiar with Qubes' existing VM management, but it seems it also uses libvirt? If that's the case, hopefully this architecture isn't too problematic.
This is also something that I'm investigating. The goal is to integrate this functionality into the aforementioned daemon, so that upon vchan creation, the library can transparently communicate with the daemon and have it assert that an appropriate ivshmem device exists between the desired domain ID and the caller, or create it otherwise. Thank you again for your comments.
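Purely as an illustration of the guest-to-daemon control channel described above, one could imagine a message layout along these lines - the command set, struct layout, and all names here are assumptions, not the daemon's actual protocol:

#include <stdint.h>

/* Hypothetical command set for the guest <-> daemon control channel. */
enum daemon_cmd {
    DAEMON_CMD_RESOLVE_DOMAIN, /* map a domain number to a shared-memory handle */
    DAEMON_CMD_ALLOC_RING,     /* carve a ringbuffer out of a VM-VM ivshmem region */
};

struct daemon_request {
    uint32_t cmd;         /* one of enum daemon_cmd */
    uint32_t peer_domain; /* domain number of the other end of the vchan */
    uint32_t port;        /* port, so multiple rings can share one region */
    uint32_t ring_size;   /* requested ring size for DAEMON_CMD_ALLOC_RING */
} __attribute__((packed));

struct daemon_response {
    int32_t  status;      /* 0 on success, negative errno-style value otherwise */
    uint32_t shm_offset;  /* offset of the allocated ring within the shared region */
} __attribute__((packed));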
Yes, but with an interface for getting sizes, it's trivial to emulate the POSIX one. With POSIX read/write alone, you can't easily get sizes.
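For example, a best-effort, read(2)-style read falls out naturally once a size query is available - a minimal sketch, with the names borrowed from Qubes' existing libvchan (libvchan_data_ready, libvchan_recv); the KVM-side equivalents are assumptions:

#include <stddef.h>
#include <libvchan.h> /* Qubes vchan API header (assumed available) */

/* Read at most count bytes, returning however many were actually available. */
int stream_read(libvchan_t *ctrl, void *buf, size_t count) {
    int avail = libvchan_data_ready(ctrl);  /* bytes readable without blocking */
    if (avail <= 0)
        return avail;                       /* nothing to read yet (or error) */
    if ((size_t)avail < count)
        count = (size_t)avail;              /* partial read, like read(2) */
    return libvchan_recv(ctrl, buf, count); /* now guaranteed not to block */
}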
That's also a possibility. But to be honest, I really like the idea of adjusting the libvchan API to require host coordination in setting up VM-VM connections. Right now (in the Xen case), it is theoretically possible for two cooperating VMs to establish a vchan connection without dom0 approval. Improving that would be a good thing.
Yes.
Interesting. How would you envision such an interface? Something along the lines of the following?:
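Purely as a hypothetical illustration of the two calls being discussed (these names are assumptions, not part of any actual proposal):

#include <stdbool.h>

/* Hypothetical, for illustration only. */
bool vchan_connection_exists(int domain); /* "the former": does a VM-VM link to *domain* already exist? */
int  vchan_connection_setup(int domain);  /* "the latter": establish one, with dom0's assistance/permission */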
Where the former could be used to determine whether or not a VM-VM connection between the caller and the given domain already exists, and the latter could be used to establish a VM-VM connection (with assistance/permission from dom0)?
/*
 * Call on the host to allow *domain_server* to call libvchan_server_init(domain_client, port, ...)
 * and *domain_client* to call libvchan_client_init(domain_server, port).
 */
int libvchan_setup_connection(int domain_server, int domain_client, int port);

/* Call on the host after the *domain_server* <-> *domain_client* connection on *port* is closed. */
int libvchan_cleanup_connection(int domain_server, int domain_client, int port);

I'm not sure about the return value - maybe some struct, then given as an argument for
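A sketch of how the host side might drive these two calls around the guests' init calls - the wrapper function and its error handling are assumptions, only the two declarations above come from the proposal:

/* Host-side flow, assuming the two declarations proposed above. */
int host_mediate_vchan(int domain_server, int domain_client, int port) {
    /* Grant the connection before either guest calls its *_init(). */
    if (libvchan_setup_connection(domain_server, domain_client, port) < 0)
        return -1;

    /* ... domain_server calls libvchan_server_init(domain_client, port, ...)
     *     domain_client calls libvchan_client_init(domain_server, port) ... */

    /* Once the vchan is torn down, release the host-side resources. */
    return libvchan_cleanup_connection(domain_server, domain_client, port);
}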
Oh, right. That wouldn't make much sense indeed.
Yeah, I think some opaque pointer/handle type would be fine.
That's an interesting question. My initial impression is that allowing

On the other hand, it is likely the case that multiple applications will need to open vchans to the same domain, so in order for an application to safely call cleanup, it would have to know of all the other applications that can open vchans and kill them/wait for them to close.

Another option I see is introducing some sort of reference counter that is incremented every time an application calls setup and decremented on cleanup. That way, when the counter reaches 0, the connection can be safely closed. This of course introduces a large potential for leaks, which may end up being an even worse design, though.

In the end, you're much more familiar with the Qubes ecosystem and how applications utilize vchans, so I'll defer to your judgement on this one.
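A rough, daemon-side sketch of the reference-counting option mentioned above - the structure and names are assumptions, meant only to illustrate the trade-off:

#include <stdint.h>

/* One entry per established VM-VM region (hypothetical daemon bookkeeping). */
struct conn_entry {
    int domain_server;
    int domain_client;
    uint32_t refcount; /* incremented on setup, decremented on cleanup */
};

/* setup: a 0 -> 1 transition actually creates the shared region;
 * cleanup: a 1 -> 0 transition tears it down.
 * An application that exits without ever calling cleanup leaks a reference,
 * which is exactly the drawback mentioned above. */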
That's what you have the port number for.
Oh alright, that makes sense then.
Could you elaborate on that? Do you mean it would be possible to remove the

If so, would you introduce another libvchan API call to return the first free port for a given domain pair?
Yes.
Or make
Gotcha. Though I must cast my vote against the specific in-out semantics in this case, since it seems a bit clumsy. Perhaps retain

Semantics aside, I think this is a reasonable set of API changes, and moving port allocation out of qrexec seems like a definite improvement. Is the Qubes project likely to implement them? If so, I can add them to the roadmap for this project too.
But then the idea of returning some opaque struct pointer, for
Yes, including adjusting qrexec and vchan for Xen, if that would make the KVM part more solid and perhaps even simpler.
Yeah, that seems reasonable to me.
Glad to hear it. The API changes certainly do match the semantics of KVM/ivshmem more closely, so you've got a +1 from me :)
As suggested in #1, private data is now stored in a separate struct which does not leave process memory. This has a lower overhead than checking all fields of the struct on each operation, and is less error-prone. It also removes all pointers from shared data structures, preventing information leaks. Closes #2, Closes #3.
I've briefly looked at the code and have some random comments:
Design, interface
If something isn't supposed to change, it should be enough to use a local copy - which is guaranteed not to change - instead of verifying it each time, which a) may be forgotten, and b) will reduce performance.
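A minimal sketch of the local-copy approach - the shared/private layout and field names below are assumptions, not the library's actual structures:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared header: only the ring indexes live here. */
struct ring_shared {
    uint32_t pos_start; /* advanced by the reader */
    uint32_t pos_end;   /* advanced by the writer */
    uint32_t size;      /* set once at creation; must never change afterwards */
};

/* Process-local copy of everything the other side must not be able to change. */
struct ring_priv {
    size_t size;   /* validated once, trusted from then on */
    uint8_t *data; /* derived from our own mapping, never read from shared memory */
};

static int ring_priv_init(struct ring_priv *priv,
                          const volatile struct ring_shared *shared,
                          uint8_t *data_area, size_t max_size) {
    uint32_t size = shared->size; /* read the untrusted value exactly once */
    if (size == 0 || size > max_size)
        return -1;                /* validate before trusting it */
    priv->size = size;            /* local copy used for all later bounds checks */
    priv->data = data_area;
    return 0;
}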
For reference, Xen headers:
Shared ring structure: https://github.com/xen-project/xen/blob/master/xen/include/public/io/libxenvchan.h
Library's own data: https://github.com/xen-project/xen/blob/master/tools/libvchan/libxenvchan.h
cons and prod are used from the shared ring (and bounds checked), but order and other metadata are stored locally, since the other end isn't supposed to change them. This probably means struct ringbuf should actually be two structures - one shared and one not. Especially the eventfd-related fields should be in the non-shared one (you already have a TODO comment about that).

I'd also encourage you to not keep any pointers in the shared structure - only offsets/indexes. Even if you validate pointers properly, it will leak some info about the memory layout of the other side.
Generally the libxenvchan protocol works lock-less only because each index is written by only one side of the connection; the other side only reads it. If you have anything that can be written by both sides, you need to be very careful to avoid all kinds of race conditions. In many cases it's impossible to do that without additional locks. To the point: I'd encourage you to drop the full flag, at the cost of 1 byte of ring space - i.e. pos_start == pos_end means the ring is empty, and pos_end == pos_start - 1 means the ring is full.
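A minimal sketch of that scheme, with a hypothetical RING_SIZE constant (the real ring size would come from the validated local copy of the header):

#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4096u /* hypothetical; one byte of it is deliberately left unused */

static bool ring_empty(uint32_t pos_start, uint32_t pos_end) {
    return pos_start == pos_end;
}

static bool ring_full(uint32_t pos_start, uint32_t pos_end) {
    /* pos_end == pos_start - 1 (mod RING_SIZE): no separate "full" flag needed */
    return (pos_end + 1) % RING_SIZE == pos_start;
}

static uint32_t ring_data_avail(uint32_t pos_start, uint32_t pos_end) {
    return (pos_end + RING_SIZE - pos_start) % RING_SIZE;
}

static uint32_t ring_space_avail(uint32_t pos_start, uint32_t pos_end) {
    return RING_SIZE - 1 - ring_data_avail(pos_start, pos_end);
}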
For the Qubes OS use case, it's necessary to support stream mode, not only packet mode. In short, it should be possible to either get the available data/space size (via a public interface) - preferred - or have read/write handle data on a best-effort basis: partial read/write if possible, returning how much data was handled. See read(2) and write(2) semantics.

Implementation
When dealing with shared memory, it's critical to validate data only when the other side can no longer change it. In many places you copy the whole structure (which is safe, but may be slow). But there are cases where you don't do that:
libkvmchan/ringbuf.c, lines 279 to 286 in 953fc36:
You read both indexes here multiple times. In ringbuf_read_sec and ringbuf_write_sec you call it directly on the shared buffer, so the other end may freely manipulate them in the meantime. At the very least it means that initially it may appear there is data/space in the buffer, but during the ringbuf_sec_validate_untrusted_* call the indexes may be modified so that no data/space is there. Then during ringbuf_read / ringbuf_write you'll hit a busy-loop waiting for data/space - and since it operates on a local buffer copy, it will never get it.

Using busy-wait is a bad idea. For the PoC phase it may be ok, but for production use it is not acceptable, regardless of the usleep() parameter. There must be some notification mechanism, without the need for constant polling.
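For instance, blocking on the existing eventfd with poll() instead of spinning with usleep() - a sketch only, since how the eventfd is plumbed through to this point is project-specific:

#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Sleep until the peer signals the eventfd (or the timeout expires). */
static int wait_for_peer(int efd, int timeout_ms) {
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms); /* no busy-looping: the kernel wakes us */
    if (ret <= 0)
        return -1;                       /* timeout or error */
    uint64_t count;
    if (read(efd, &count, sizeof count) != sizeof count) /* consume the event */
        return -1;
    return 0;
}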
That said, ringbuf_clear_eventfd lacks a pthread_join - it leaks threads.

I have not verified the safety of all the index and pointer handling at this stage.
VM-VM communication, connection setup
In the current shape, it works only host-guest. For proper qrexec support, it needs guest-guest connections. AFAIU it should be possible if both guests are given shared memory backed by the same host memory (the mem-path argument of the memory-backend-file object), right? Right now the Qubes OS abstraction over libvchan does not include an interface for host-side setup/cleanup of a guest-guest channel, but it shouldn't be a problem to add one.
Is it possible to attach/detach ivshmem devices at runtime (for example with the device_add / device_del QMP commands)? Otherwise, there would be a need to set up a predefined set of devices, which would limit the connection count (especially painful in the guest-guest case).