no_std support #579

Open
Demi-Marie opened this issue Dec 31, 2019 · 17 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@Demi-Marie
Contributor

Would it be possible for quinn_proto to support #![no_std]?

@Ralith added the enhancement and help wanted labels on Dec 31, 2019
@Ralith
Collaborator

Ralith commented Dec 31, 2019

Maybe. I like the idea and I think the quinn-proto code proper would be straightforward enough (after pulling in alloc and hashbrown). The meat of the problem would be dealing with dependencies, e.g. rustls/rustls#157.
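
For concreteness, feature-gated no_std support in a crate usually takes roughly the following shape; the `std` feature name and the types below are illustrative assumptions, not quinn-proto's actual layout:

```rust
// Sketch of a no_std-with-alloc setup (assumed feature name "std").
#![cfg_attr(not(feature = "std"), no_std)]

// `alloc` supplies Box, Vec, BTreeMap, etc. without the rest of std.
extern crate alloc;

use alloc::vec::Vec;
// hashbrown's HashMap works without std, standing in for std::collections::HashMap.
use hashbrown::HashMap;

// Hypothetical container for per-connection state, keyed by connection ID.
pub struct ConnectionTable {
    connections: HashMap<u64, Vec<u8>>,
}
```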

@Demi-Marie
Contributor Author

Would it be possible to avoid using alloc, as smoltcp does?

@Ralith
Collaborator

Ralith commented Dec 31, 2019

It would be difficult; there is a lot of difficult-to-predict dynamic allocation. Would be pretty cool to have, though. I'd be happy to mentor an attempt.

@Demi-Marie
Contributor Author

For the record, smoltcp is https://github.com/m-labs/smoltcp.

One source of unnecessary allocations is Transmit objects, which are boxed. We can replace them with buffers owned by the calling application. This would improve performance even if std is available.
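
A rough sketch of what a caller-owned-buffer interface could look like; the `poll_transmit` shape and the `TransmitMeta` name here are illustrative assumptions, not the current quinn-proto API:

```rust
// Illustrative sketch: encode the next datagram directly into a buffer owned
// by the calling application instead of allocating and boxing it internally.
pub struct Connection {
    /* connection state elided */
}

/// Metadata describing what was written into the caller's buffer.
pub struct TransmitMeta {
    pub len: usize, // number of bytes of `buf` that were filled
}

impl Connection {
    /// Encode the next outgoing datagram into `buf`, if one is ready.
    pub fn poll_transmit(&mut self, buf: &mut [u8]) -> Option<TransmitMeta> {
        // A real implementation would build packets in place here; with UDP
        // segmentation offload it could pack several datagrams into one call.
        let _ = buf;
        None
    }
}
```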

Systems that do not support dynamic allocation generally only support a fixed number of connections at any one time. We might be able to somehow use 'static buffers for that.

@Ralith
Collaborator

Ralith commented Dec 31, 2019

This is an area where TCP has a much easier time due to its comparative simplicity.

> One source of unnecessary allocations is Transmit objects, which are boxed. We can replace them with buffers owned by the calling application. This would improve performance even if std is available.

In particular, this would enable zero-copy I/O with UDP segmentation offload (#501), which would be very cool.

> Systems that do not support dynamic allocation generally only support a fixed number of connections at any one time. We might be able to somehow use 'static buffers for that.

The various config structs are in theory sufficient to determine a hard upper bound on memory use in that vein, so this is a reasonable direction. However, the relationship isn't trivial. For example, flow control might permit at most 1MiB of data to be transmitted by the peer, but the aggregate packets bearing that data might be much larger, and the current implementation never copies data out of packets (see also #431). std-capable applications would also benefit from the reduced attack surface that a no-std-friendly approach here would entail.
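
To make the scale of that gap concrete, here is a back-of-the-envelope worst case with assumed numbers (purely illustrative, not measured behavior):

```rust
// Assumptions: 1 MiB of stream flow-control credit, ~1200-byte datagrams, and
// (per #431) received packets retained whole rather than copied out. A peer
// sending its permitted data out of order, one useful byte per datagram, can
// force far more than 1 MiB of packets to stay buffered.
const FLOW_CONTROL_CREDIT: usize = 1 << 20; // 1 MiB of stream data allowed
const DATAGRAM_SIZE: usize = 1200;          // typical QUIC datagram size
const NEW_BYTES_PER_DATAGRAM: usize = 1;    // worst case: one byte of progress each

fn worst_case_buffered_bytes() -> usize {
    // 2^20 datagrams * 1200 bytes ≈ 1.26 GB retained for 1 MiB of credit.
    (FLOW_CONTROL_CREDIT / NEW_BYTES_PER_DATAGRAM) * DATAGRAM_SIZE
}
```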

@Demi-Marie
Contributor Author

> This is an area where TCP has a much easier time due to its comparative simplicity.

> > One source of unnecessary allocations is Transmit objects, which are boxed. We can replace them with buffers owned by the calling application. This would improve performance even if std is available.

> In particular, this would enable zero-copy I/O with UDP segmentation offload (#501), which would be very cool.

This would still be useful even if traits like AsyncRead and AsyncWrite required a copy, as we might be able to get the copy virtually for free during the crypto.

> > Systems that do not support dynamic allocation generally only support a fixed number of connections at any one time. We might be able to somehow use 'static buffers for that.

> The various config structs are in theory sufficient to determine a hard upper bound on memory use in that vein, so this is a reasonable direction. However, the relationship isn't trivial. For example, flow control might permit at most 1MiB of data to be transmitted by the peer, but the aggregate packets bearing that data might be much larger, and the current implementation never copies data out of packets (see also #431). std-capable applications would also benefit from the reduced attack surface that a no-std-friendly approach here would entail.

Why would this have less attack surface?

@Ralith
Collaborator

Ralith commented Dec 31, 2019

> This would still be useful even if traits like AsyncRead and AsyncWrite required a copy, as we might be able to get the copy virtually for free during the crypto.

Decryption occurs in-place (and happens at a much lower level besides) so there's no free lunch there, but I wouldn't worry too much about those traits. They're still in flux and there's no harm in supporting them as an ecosystem-friendly slow-path while having a custom interface with less overhead.

> Why would this have less attack surface?

The current implementation allows a malicious peer to cause disproportionate memory consumption. A no-std implementation would necessarily have to deal with this to be able to advertise useful flow control limits without over-committing a reasonable amount of preallocated storage.

@djc
Member

djc commented Jan 2, 2020

What are you trying to build that requires no_std support? What is the use case we're trying to serve here?

@Demi-Marie
Contributor Author

Nothing in particular, tbh. I just think that supporting no_std is a good idea whenever it is practical.

@djc
Member

djc commented Jan 2, 2020

Okay. I'd want to make sure the complexity trade-off is actually practical, which would probably involve doing some more design work before implementing this.

@Ralith
Collaborator

Ralith commented Jan 2, 2020

I agree that we should be careful of that. That said, I think we can pretty easily identify some prerequisites of no-std support which are clear wins regardless, including the discussed quinn-proto support for caller-owned I/O buffers and a fix for #431.

@burdges

burdges commented Jan 3, 2020

As alloc works now, I think the biggest single annoyance turns out to be std::io::Error and std::error::Error, which cause no end of trouble.

Related: briansmith/ring#869, rustls/rustls#283
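
For what it's worth, the usual workaround for the error traits is to implement `core::fmt::Display` unconditionally and gate the `std::error::Error` impl behind a feature. A minimal sketch (not quinn's or rustls's actual error types):

```rust
use core::fmt;

// Hypothetical library error type that compiles without std.
#[derive(Debug)]
pub enum TransportError {
    FlowControl,
    Crypto(u8),
}

impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TransportError::FlowControl => write!(f, "flow control error"),
            TransportError::Crypto(code) => write!(f, "crypto error {}", code),
        }
    }
}

// Only available when the (assumed) "std" feature is enabled.
#[cfg(feature = "std")]
impl std::error::Error for TransportError {}
```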

> The current implementation allows a malicious peer to cause disproportionate memory consumption.

Is there an issue discussing this?

@Ralith
Collaborator

Ralith commented Jan 3, 2020

Yes, #431.

@djc mentioned this issue on Dec 7, 2020
@vadixidav

This may be desirable for my use case (just looking at options), but I don't have anything to add in terms of design. What I can say is that my use case would involve bringing sensors into a larger industrial systems stack that has Ethernet at higher layers and serial comms at lower layers.

In theory, one could use basic serial communication to encapsulate QUIC. The I/O would be handled by the consumer of the API, and any packets of data that are received would be forwarded to the API. I don't know exactly how that would look, as the current API does still seem to assume the use of UDP, whereas what I would want is something more raw, but still with TLS security. UDP could potentially still be used on top of serial just fine.

I will have to look at how much program memory and main memory are consumed by an application using QUIC, as that is a limiting factor in embedded. A stack that takes less than 32 KiB of program memory on ARM would be ideal, and many MCUs have 8 KiB of SRAM or less. It might end up being just a bit over that, which would limit what chips it could be deployed to.

Again, this is not urgent or even a request. I figured that it would be worth enumerating a use case I am looking into to help in the designing process.

@Ralith
Collaborator

Ralith commented Sep 12, 2022

Extremely resource-constrained devices probably need a different implementation strategy, one that focuses on a narrow subset of the protocol and emphasizes simplicity over performance.

@vadixidav

Then a better approach could be to refactor some of the low-level QUIC primitives out into crates that can be consumed by alternative implementations for resource-constrained devices, and make that out of scope for quinn. The main reason I was looking at this is that quinn-proto works without I/O, so it does seem within the realm of possibility. We would need to test and see how much program memory it takes up. So far I have only used quinn on devices without such resource constraints.

@Matthias247
Contributor

Matthias247 commented Sep 14, 2022

It's pretty much impossible to factor them out without redesigning the library, and even if one goes that way I'm not convinced it is possible to implement QUIC for constrained devices in a meaningful way. My experience with those devices is that even TCP and TLS are a stretch: proper TLS support that doesn't fail randomly requires a 16 kB receive buffer to hold a full record, plus a send buffer, plus more than 5 kB of space for handshake data, and TCP requires buffers of its own on top of that.

The reason I don't think the parts can easily be reused for constrained devices is that the library makes heavy use of dynamic allocation - e.g. to manage a varying number of streams, to track and reorder incoming data, to manage transmission of outgoing data, to keep track of what peers have acknowledged, etc. Any dynamic allocation is a no-go for truly constrained devices, and TCP stacks (like lwip) partly get around it by letting users allocate pooled objects for nearly everything. Going that route would require a lot of effort, and that doesn't even touch the I/O questions that have been brought up.
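
For reference, a minimal sketch of the lwip-style pooled-object approach; the capacity, element type, and names are placeholders (crates like heapless provide similar fixed-capacity containers):

```rust
// A fixed-capacity pool supplied by the application up front, so the stack
// never touches a global allocator.
pub struct Pool<T, const N: usize> {
    slots: [Option<T>; N],
}

impl<T, const N: usize> Pool<T, N> {
    pub fn new() -> Self {
        Self { slots: core::array::from_fn(|_| None) }
    }

    /// Claim a free slot, returning its index, or None if the pool is exhausted.
    pub fn alloc(&mut self, value: T) -> Option<usize> {
        let idx = self.slots.iter().position(|s| s.is_none())?;
        self.slots[idx] = Some(value);
        Some(idx)
    }

    /// Release a slot and get its contents back.
    pub fn free(&mut self, idx: usize) -> Option<T> {
        self.slots.get_mut(idx)?.take()
    }
}
```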

I think the best chance for QUIC support on such devices would be to enforce strict limits on everything - e.g. just one stream per connection, with a fixed-size, preallocated send and receive buffer (like smoltcp) - and to fail connections quickly if those buffers get too fragmented due to how peers perform acknowledgements. While that might be doable, it is more like reimplementing the library than factoring some parts out, and might also hurt performance for users in normal environments.
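
To make that "strict limits" direction concrete, a sketch with illustrative names and sizes only:

```rust
// One stream per connection, with send/receive buffers sized at compile time,
// so the whole connection table can be preallocated (smoltcp-style).
pub struct TinyConnection<const SEND: usize, const RECV: usize> {
    send_buf: [u8; SEND],
    recv_buf: [u8; RECV],
    // minimal per-connection QUIC state (packet numbers, keys, ...) elided
}

// The application owns the storage, e.g. a fixed table of four connections
// wrapped in whatever synchronization the target platform provides.
pub type ConnectionTable = [Option<TinyConnection<4096, 4096>>; 4];
```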
