Skip to content

Commit

Permalink
stdio: fix scanf bug
Browse files Browse the repository at this point in the history
Lewis Tian noted on the mailing list that an application reading with
fscanf() from a large file fails after passing one BUFSIZ of data.

It turns out that the old Musl code we were using was only correct assuming
a certain bizarre readv() bug workaround which OSv dropped in commit
0b65142. But the Musl developers have
too noticed they can't rely on this workaround (which doesn't even apply
to unbuffered streams), and came up with a fix in their commit

https://git.musl-libc.org/cgit/musl/commit/src/internal/shgetc.c?id=c20804500deebaabc56f383d48dd1ac77dce8349

This patch does the same fix in our copy of the Musl code.
By the way, after this change libc/internal/shgetc.c is now identical to
musl/src/internal/shgetc.c but we can't drop the former yet because of some
header file differences that still need to be cleared up.

Here is the original Musl commit message describing the change:

  "fix major scanf breakage with unbuffered streams, fmemopen, etc.
  the shgetc api, used internally in scanf and int/float scanning code
  to handle field width limiting and pushback, was designed assuming
  that pushback could be achieved via a simple decrement on the file
  buffer pointer. this only worked by chance for regular FILE streams,
  due to the linux readv bug workaround in __stdio_read which moves the
  last requested byte through the buffer rather than directly back to
  the caller. for unbuffered streams and streams not using __stdio_read
  but some other underlying read function, the first character read
  could be completely lost, and replaced by whatever junk happened to be
  in the unget buffer.

  to fix this, simply have shgetc, when it performs an underlying read
  operation on the stream, store the character read at the -1 offset
  from the read buffer pointer. this is valid even for unbuffered
  streams, as they have an unget buffer located just below the start of
  the zero-length buffer. the check to avoid storing the character when
  it is already there is to handle the possibility of read-only buffers.
  no application-exposed FILE types are allowed to use read-only
  buffers, but sscanf and strto* may use them internally when calling
  functions which use the shgetc api."

Signed-off-by: Nadav Har'El <[email protected]>
Message-Id: <[email protected]>
  • Loading branch information
nyh authored and wkozaczuk committed Jan 14, 2020
1 parent b6654d0 commit 5a1d4c2
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions libc/internal/shgetc.c
Original file line number Diff line number Diff line change
Expand Up @@ -22,5 +22,6 @@ int __shgetc(FILE *f)
else
f->shend = f->rend;
if (f->rend) f->shcnt += f->rend - f->rpos + 1;
if (f->rpos[-1] != c) f->rpos[-1] = c;
return c;
}

0 comments on commit 5a1d4c2

Please sign in to comment.