Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify that buf can be shortened in readln #7125

Merged
merged 3 commits into from
Aug 6, 2019
Merged

Clarify that buf can be shortened in readln #7125

merged 3 commits into from
Aug 6, 2019

Conversation

mmtrebuchet
Copy link
Contributor

It wasn't clear on inspection if buf could be shortened, or if it was just lengthened. "As necessary" suggested that buf would only be resized if absolutely necessary, which would only be when it needs to be lengthened.

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @mmtrebuchet! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#7125"

@CyberShadow
Copy link
Member

So, if you want to use readln in such a way that it allocates only when it sees a line longer than any other line it has seen before (which I would assume is the intended use case for readln with a buffer), that makes the included example wrong, right? Passing a buf that was shortened by a previous call to readln will cause it to reallocate it to avoid clobbering the part of it that it doesn't see (past the array end), so you need to manually check and maintain the "longest buffer" in a separate variable.

@mmtrebuchet
Copy link
Contributor Author

@CyberShadow I think it just adjusts buf.length but not the actual buf.capacity, so that if your lines are of length 20, 10, 15, it'd allocate 20 characters, then just set buf.length to 10 and 15, without ever shortening the underlying storage. (I'm not sure if this is the intended behavior, but this PR is just to make the documentation reflect the behavior of the function.)

@CyberShadow
Copy link
Member

Here is what I mean:

https://run.dlang.io/is/UwJsaG

After shortening buf to 10, it is reallocated when reading the third, 15-character line.

@mmtrebuchet
Copy link
Contributor Author

Oh, I didn't notice that. Yeah, that's a bad thing, I think. Definitely not how I'd expect that function to behave.
I'm not sure why buf.capacity would ever be shortened, but if you change the code to have
f.readln(buf); writeln(buf.ptr); writeln(buf.capacity);
(I'm not sure how you got the link you posted; where's the button for that?)
then capacity is shortened after the second line is read. In fact, it's shortened to 0 capacity.
I'm stumped by (I've not studied stdio.d carefully) ReadlnAppender.data(), line 4980 in stdio.d:

    @property char[] data() @trusted
    {
        if (safeAppend)
            assumeSafeAppend(buf.ptr[0 .. pos]);
        return buf.ptr[0 .. pos];
    }

I've never tried to slice a pointer until today. Is this where the reallocation is occurring?

@CyberShadow
Copy link
Member

Yeah, that's a bad thing, I think.

I agree, it looks like the function interface was poorly designed. It should have been returning the slice, and only growing buf, never shrinking it.

(I'm not sure how you got the link you posted; where's the button for that?)

"Shorten" in the upper-left corner.

then capacity is shortened after the second line is read. In fact, it's shortened to 0 capacity.

The problem is that it's returning (and setting buf to) a slice of the input buffer. Slices cannot have a capacity, as appending to them will always reallocate.

Is this where the reallocation is occurring?

It happens in the third readln. Because the buf it receives is a slice, appending to it will reallocate. It wouldn't reallocate if it were to call assumeSafeAppend, but it doesn't know that the buf the user provided is one where the data past its end is safe to overwrite.

Copy link
Member

@CyberShadow CyberShadow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make these even more unambiguous, considering that the mistake in the function's design causes it to deviate from what would be the expected one.

that makes the included example wrong, right?

Upon review, the example is correct; it would have been much simpler if readln was better designed, though. The loop could have been replaced with just:

    while (!stdin.eof)
        words += stdin.readln(buf).split.length;

std/stdio.d Outdated Show resolved Hide resolved
std/stdio.d Outdated Show resolved Hide resolved
mmtrebuchet and others added 2 commits August 6, 2019 12:00
Co-Authored-By: Vladimir Panteleev <[email protected]>
Co-Authored-By: Vladimir Panteleev <[email protected]>
@mmtrebuchet
Copy link
Contributor Author

Urk. I guess we're stuck with this strange behavior. A shame that getline() hasn't made its way to core.stdc.stdio yet.

@dlang-bot dlang-bot merged commit 41f28f6 into dlang:master Aug 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants