
add try fill #18059

Closed

wants to merge 1 commit into from

Conversation

@mahkoh
Contributor

mahkoh commented Oct 15, 2014

A more useful version of read_at_least.
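For context, here is a minimal sketch of the semantics being proposed, written against the modern std::io::Read trait for readability (the original patch targeted the pre-1.0 Reader trait, where EOF was reported as an error rather than as Ok(0)); the name try_fill comes from the patch, everything else is illustrative:

```rust
use std::io::{self, Read};

/// Keep reading until the buffer is full, a zero-length read occurs, or
/// EOF is reached; return the total number of bytes read. Any non-EOF
/// error aborts the fill and is returned instead.
fn try_fill<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut total = 0;
    while total < buf.len() {
        match reader.read(&mut buf[total..]) {
            // With the modern trait, Ok(0) covers both EOF and the
            // zero-length reads debated below; stop in either case.
            Ok(0) => break,
            Ok(n) => total += n,
            Err(e) => return Err(e),
        }
    }
    Ok(total)
}
```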

@sfackler
Member

Why does the behavior of this differ from read_at_least with respect to 0-byte reads?

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

It is not clear to me in which situations a Reader returns 0. This is the conservative choice that will always perform well.

@alexcrichton
Member

Can you provide some rationale for why this method should be added? "A more useful version of read_at_least" isn't very useful. I also agree with @sfackler that this is inconsistent with the way 0-byte reads are handled elsewhere throughout the Reader trait, which is worrisome. To land, this will also require some tests; can you add some?
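To illustrate the kind of test being asked for, here is a hypothetical pair of unit tests against the try_fill sketch above; OneByteReader and the test names are invented for this example:

```rust
use std::io::{self, Read};

// A deliberately awkward reader that yields at most one byte per call,
// exercising the short-read loop in `try_fill`.
struct OneByteReader<'a>(&'a [u8]);

impl<'a> Read for OneByteReader<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.0.is_empty() || buf.is_empty() {
            return Ok(0); // EOF (or an empty destination buffer)
        }
        buf[0] = self.0[0];
        self.0 = &self.0[1..];
        Ok(1)
    }
}

#[test]
fn try_fill_survives_short_reads() {
    let mut rdr = OneByteReader(b"hello world");
    let mut buf = [0u8; 5];
    assert_eq!(try_fill(&mut rdr, &mut buf).unwrap(), 5);
    assert_eq!(&buf, b"hello");
}

#[test]
fn try_fill_stops_at_eof() {
    let mut rdr = OneByteReader(b"hi");
    let mut buf = [0u8; 5];
    assert_eq!(try_fill(&mut rdr, &mut buf).unwrap(), 2);
}
```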

@sfackler
Member

I'm also not sure why this is better than just calling read. Either way you have to be looping over it.

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

Why do I have to loop?

@sfackler
Member

It seems like the point of this is intended to be to avoid dealing with short reads, but you can still end up with a short read if read ever returns 0. Why would one use this instead of rdr.read_at_least(buf, buf.len())?

@sfackler
Member

Another way of thinking about this is that the semantics of read_fill are identical to read, except that there's a higher probability of getting a full buffer.

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

read_at_least is useless for any value of min except 1 because it throws the buffer away if it can't fill it or encounters EOF. It is also useless if you want performant code that doesn't make a thousand unnecessary function calls when read returns 0.

The behavior of try_fill is clearly documented, which means you should not use it with Readers that can return 0; and no sensible Reader should ever return 0.

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

Another way of thinking about this is that the semantics of read_fill are identical to read, except that there's a higher probability of getting a full buffer.

There is a zero probability of getting a full buffer if you read from stdin.

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

cc @kballard

@lilyball
Contributor

There is a zero probability of getting a full buffer if you read from stdin.

Only true if stdin is an interactive terminal and is being driven by a human (as opposed to, say, expect).

In any case, the only utility here is "a higher chance of getting a full buffer". Can you provide any compelling reason why that's useful enough to be in the standard library? If a client needs a certain number of bytes in order to process, they can use read_at_least(). If a client doesn't need that many bytes, then what benefit does using try_fill() have over simply handling the output of read()?

As for being a "more useful version of read_at_least", that's not true at all. It doesn't serve the same purpose. read_at_least allows clients to avoid having to deal with 0-length reads, and it also does all the work of dealing with bad readers that return 0 indefinitely. try_fill() does neither.

@lilyball
Contributor

Additionally, the behavior of try_fill() around 0-length reads is not very good. Why terminate when receiving a 0-length read? 0-length reads should be very uncommon, but they're certainly not errors. As an example, in Go, zero-length reads were returned sometimes from the TLS implementation, as part of their workaround for BEAST. Where OpenSSL took the approach of sending a 1-byte packet followed by an n-1-byte packet, Go's TLS implementation sent a 0-byte packet followed by an n-byte packet. Both approaches prevent the BEAST attack, but Go's approach resulted in TLS sockets sometimes producing 0-length reads.

Basically, a 0-length read does not mean that clients should stop processing anything. If your goal is "fill a buffer as much as possible", then you should read until you encounter an error. A 0-length read is meaningless. However, you should also be doing this reading using read_at_least(1), to protect against badly-behaving readers that return 0 indefinitely.


However, all that ends up being irrelevant, because this approach is fundamentally flawed. std::io provides a synchronous I/O interface, so issuing a second read after the first one returns is highly likely to block. If a Reader has 10 bytes buffered, and you issue a read() with a 1024-byte buffer, it will give you 10 bytes. If you then reissue a read() with the remaining 1014-byte buffer, it will block until it has another byte to give you (or until various implementation details cause a 0-length read, such as the TLS example given above). Surely the client of try_fill() does not want any blocking if any bytes have already been read.

In light of that, a function like try_fill() only makes sense if there exists an API for doing a non-blocking read. As std::io has no such API, try_fill() ends up being an unsuitable API. And of course if we did have a non-blocking read API, we'd still be back at the original challenge, which is providing a reason why try_fill() is worth having in the standard library.
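To make the blocking concern concrete, here is a hypothetical illustration reusing the try_fill sketch above (fill_from_socket is invented for this example):

```rust
use std::io;
use std::net::TcpStream;

// Suppose the peer has sent only 10 bytes so far. The first inner `read`
// returns those 10 bytes immediately, but `try_fill` keeps going: the
// next `read` blocks until the peer sends more data or closes the
// connection, even though the caller already had bytes in hand.
fn fill_from_socket(stream: &mut TcpStream) -> io::Result<usize> {
    let mut buf = [0u8; 1024];
    try_fill(stream, &mut buf)
}
```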

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

There is no justification given for Readers ever returning 0. Such a Reader is broken and should be fixed.

If a client doesn't need that many bytes, then what benefit does using try_fill() have over simply handling the output of read()?

Algorithms that work on blocks of a certain size and pad only the last block will need a full buffer.

It doesn't serve the same purpose.

I'm not sure what purpose read_at_least serves at all. A Reader that returns 0 is broken and even if you want to handle such Readers, there is no need for the min argument.
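The block-oriented use case mentioned above might look roughly like this, again assuming the try_fill sketch from the top of the thread (process_blocks, handle_block, and the 16-byte block size are all invented for illustration):

```rust
use std::io::{self, Read};

fn handle_block(_block: &[u8; 16]) {
    // placeholder: feed the block to a block cipher, hash, etc.
}

// Consume the stream in fixed-size blocks, padding only the final short
// block with zeros.
fn process_blocks<R: Read>(mut reader: R) -> io::Result<()> {
    let mut block = [0u8; 16];
    loop {
        let n = try_fill(&mut reader, &mut block)?;
        if n == block.len() {
            handle_block(&block); // full block
        } else {
            if n > 0 {
                block[n..].fill(0); // pad the last, short block
                handle_block(&block);
            }
            return Ok(()); // EOF reached
        }
    }
}
```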

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

BEAST

I don't know anything about that, and I doubt it justifies an API design that requires working around zero-length reads most of the time. In your workaround, you do 1000 unnecessary function calls in the worst case. Where is the justification for that? POSIX seems to do fine without 0-sized reads.

Surely the client of try_fill() does not want any blocking if any bytes have already been read.

I'm sure the client does, because it is the purpose of try_fill to fill the complete buffer unless we're at EOF (ignoring 0-length reads for the moment).

In light of that, a function like try_fill() only makes sense if there exists an API for doing a non-blocking read.

Using try_fill for non-blocking reads would make no sense since try_fill would return an error as soon as there are no more bytes available. This is not the purpose of try_fill.

///
/// # Error
///
/// If an error different from EOF occurs at any point, that error is returned, and no
/// further bytes are read.

Member

What happens if EOF occurs? It's not returned, and further bytes are read??

Contributor Author

If read returns 0 or EOF, the total number of bytes read will be returned.

I guess I'll reformulate that a bit:

If read returns 0 or EOF, the total number of bytes read is returned immediately.

@mahkoh
Contributor Author

mahkoh commented Oct 15, 2014

In other words: just like read_at_least is supposed to shield clients from having to deal with 0-sized reads, try_fill is supposed to shield clients from Readers that don't have all of their data available immediately (stdin, network, ...). The difference is that there is no justification for Readers ever returning 0. I will open an issue regarding this shortly.

After reading your Go example again, I'm quite confused. You're talking about how Go created packets with a 0-sized payload, but this has nothing to do with Readers. It is very unlikely that a Rust implementation dealing with such packets would use a Reader to forward the payload to the user.

@nodakai
Contributor

nodakai commented Oct 17, 2014

read_at_least() is a weird API. br.read_at_least(n, vec.slice_to_mut(m)) is pretty much saying "Give me at least n bytes, and if you feel generous, you can give me more. But please don't be too generous, I'm not ready to receive more than m bytes." Given that BufferedReader is... well... a buffered reader, such a nondeterministic API design benefits nobody (it could have been meaningful if it were a real system call and someone wanted to reduce the number of kernel/user context switches). I'll always say "give me exactly m bytes, except at the (untimely?) end of the stream" with this pattern:

br.read_at_least(buf.len(), buf) // buf is a mut slice

I don't particularly support @mahkoh's patch as it is now, but I'll be glad to see a convenient wrapper of read_at_least() which saves me from typing buf.len() over and over again.
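For what it's worth, the wrapper being asked for here is essentially what later stabilized as Read::read_exact in modern std::io. A hand-rolled sketch under that assumption (read_full is an invented name):

```rust
use std::io::{self, Read};

// Fill `buf` completely or fail; equivalent in spirit to the later
// `Read::read_exact`.
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "stream ended before the buffer was filled",
                ));
            }
            Ok(n) => filled += n,
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```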

@mahkoh
Contributor Author

mahkoh commented Oct 17, 2014

There will be no wrapper of read_at_least because it does a completely different thing. I'm not saying that you cannot improve its API, but if you change it enough to replace this function, then it has become the same function. If it can be agreed that Readers should never return 0 (†), then read_at_least can be completely replaced by try_fill.

†: I've realized that, in the case where the buffer is empty, the return value should be unspecified.

@huonw
Member

huonw commented Oct 17, 2014

(BTW, #18059 (comment) hasn't been addressed... or even acknowledged.)

@mahkoh
Contributor Author

mahkoh commented Oct 17, 2014

@huonw: Why would I add tests before it is clear that this functionality will get merged?

@huonw
Member

huonw commented Oct 17, 2014

Because any sane patch will have tests, tests can help clarify the intended behaviour, and reviewing the tests is part of reviewing the patch. You could, at the very least, just inform us that you've seen the comment and will add tests when a decision is made (we can't read your mind).

@mahkoh
Contributor Author

mahkoh commented Oct 17, 2014

When I opened the PR it was not clear to me that such a small function would even need a test. I will add one if I get a preliminary r+.

@huonw
Member

huonw commented Oct 17, 2014

Tests are always necessary. Even a small function can have a typo (or a refactoring/renaming later can introduce a typo) and it is not acceptable for stdlib stuff to be untested (any places that are currently untested should be regarded as bugs).

Maybe it's fine in your own personal libraries, but this is functionality that may exist 'forever' and be used everywhere and anywhere, so it should be as bulletproof as possible. Of course bugs are hard to avoid, but having tests is a big part of the fight against them.

@mahkoh
Contributor Author

mahkoh commented Nov 28, 2014

Time to close this one. See #18289

mahkoh closed this Nov 28, 2014