Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't fail in gz_load() if read would block #993

Open
wants to merge 21 commits into
base: develop
Choose a base branch
from
Open
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 10 additions & 2 deletions gzread.c
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ local int gz_load(gz_statep state, unsigned char *buf, unsigned len,
break;
*have += (unsigned)ret;
} while (*have < len);
if (ret < 0) {
if (ret < 0 && errno != EAGAIN && errno != EWOULDBLOCK) {
Copy link
Author

@drbitboy drbitboy Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the core of this PR, because all of the read-related wrappers here in gzread.c
end up calling this gz_load routine: when

  • BOTH state-fd (file descriptor) is non-blocking (fcntl(fd, SET_FL, fcntl(fd, GET_FL) | O_NONBLOCK))
  • AND there are no data available to read,

then the read(2) return value (ret above, assigned on line 22) will be -1, and errno will be either EWOULDBLOCK or EAGAIN (Resource not available).

The change here prevents that case from returning -1 from gz_load() here, which in turn
allows the calling routine* to continue working with data from the file descriptor.

It gets a little tricky, because no bytes are returned, and 0 as the return value from read(2)
indicates EOF (End-Of-File).

N.B. tt may make sense to clear errno to a value of 0 here in gz_load(), just before line 22 above.

* gz_avail() or gz_fetch() or gz_read(), with the latter being the ultimate caller of the first two.

gz_error(state, Z_ERRNO, zstrerror());
return -1;
}
Expand Down Expand Up @@ -268,6 +268,7 @@ local int gz_skip(gz_statep state, z_off64_t len) {
local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
z_size_t got;
unsigned n;
int load_errno;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable to keep track of errno from the read(2) in gz_load(), whether gz_load() is called from here (gz_read())

  • EITHER directly,
  • OR indirectly via gz_decomp() or gz_fetch().


/* if len is zero, avoid unnecessary operations */
if (len == 0)
Expand All @@ -283,6 +284,7 @@ local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
/* get len bytes to buf, or less than len if at the end */
got = 0;
do {
load_errno = 0;
/* set n to the maximum amount of len that fits in an unsigned int */
n = (unsigned)-1;
if (n > len)
Expand All @@ -307,25 +309,31 @@ local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
buffer */
else if (state->how == LOOK || n < (state->size << 1)) {
/* get more output, looking for header if required */
errno = 0;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prepare for errno to become EAGAIN/EWOULDBLOCK, indicating that the read(2) on the
non-blocking file descriptor would block, so the read(2) returns a value of -1.

if (gz_fetch(state) == -1)
return 0;
if (!state->x.have) { load_errno = errno; }
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to gz_fetch() (line 313may have resulted in a call to gz_load()/read(2) and updated the internal output buffer, so errno may have a value of EAGAIN/EWOULDBLOCK, indicating that read(2) would have blocked on a non-blocking file.

The next line, 316, will continue to the while at the end of this do-while loop.

The state->x internal output buffer is encapsulated in state->x.next and state->x.have (see lines 293-300 above).

So if there is any data in the internal output buffer (state->x.have is non-zero), then don't copy the value, possibly EAGAIN/EWOULDBLOCK, of errno from gz_fetch() here, because that would cause the while to exit the loop and the data in the internal input buffer would be lost.

If the the state->x internal output buffer is empty here (state->x.have is 0), then there was no data read in the gz_fetch() call (line 313), and it is save to copy the errno value to load_errno and continue, which, if errno is EAGAIN/EWOULDBLOCK here, will exit this do-while loop .

continue; /* no progress yet -- go back to copy above */
/* the copy above assures that we will leave with space in the
output buffer, allowing at least one gzungetc() to succeed */
}

/* large len -- read directly into user buffer */
else if (state->how == COPY) { /* read directly */
errno = 0;
if (gz_load(state, (unsigned char *)buf, n, &n) == -1)
return 0;
load_errno = errno;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gz_load() (line 324) read the data directly into the external output buffer (buf), and state->x was empty and not used here, so put the value of errno into load_errno, which if that value is EAGAIN/EWOULDBLOCK, will cause the do-while loop to exit (line 346).

}

/* large len -- decompress directly into user buffer */
else { /* state->how == GZIP */
state->strm.avail_out = n;
state->strm.next_out = (unsigned char *)buf;
errno = 0;
if (gz_decomp(state) == -1)
return 0;
load_errno = errno;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gz_decomp() (line 334) read the data directly into the external output buffer (buf; lines 331-332), and that buffer will be declared empty when state->x.have is cleared to 0 (line 338), so put the value of errno into load_errno, which if the value is ...

n = state->x.have;
state->x.have = 0;
}
Expand All @@ -335,7 +343,7 @@ local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
buf = (char *)buf + n;
got += n;
state->x.pos += n;
} while (len);
} while (len && load_errno != EAGAIN && load_errno != EWOULDBLOCK);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execute another pass on this do-while loop if

  • BOTH there is room left in the output buffer
  • AND we did not execute a blocking read on a non-blocking input on this pass


/* return number of bytes read into user buffer */
return got;
Expand Down