Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugs in file_descriptor and gzip-zlib-bzip2 filters #17

Merged
merged 8 commits into from
Dec 5, 2016

Conversation

yanikibo
Copy link
Contributor

zlib: recover Z_BUF_ERROR conditions
file_descriptor: openmode compatibility to C++ standard

zlib_error::check throws on Z_BUF_ERROR
zlib_error::check throws on Z_BUF_ERROR
Openmode compatibility to C++ standard
@yanikibo yanikibo changed the title Ticket #7810 bugs in file_descriptor and gzip-zlib-bzip2 filters Sep 29, 2015
yanikibo and others added 5 commits September 29, 2015 16:07
int buffer_size -> std::streamsize buffer_size
int buffer_size -> std::streamsize buffer_size
int buffer_size -> std::streamsize buffer_size
int buffer_size -> std::streamsize buffer_size
int buffer_size -> std::streamsize buffer_size
@@ -347,7 +347,7 @@ bool zlib_compressor_impl<Alloc>::filter
int result = xdeflate(flush ? zlib::finish : zlib::no_flush);
after(src_begin, dest_begin, true);
zlib_error::check BOOST_PREVENT_MACRO_SUBSTITUTION(result);
return result != zlib::stream_end;
return result == zlib::okay;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the zlib docs, if zlib::finish is passed in the call to deflate then zlib::stream_end is returned when successful, otherwise zlib::okay is returned when successful. I do not think your change reflects that reality.

Copy link
Contributor Author

@yanikibo yanikibo Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for handling Z_BUF_ERROR condition.
If you analyze the code carrefully, you can see that the return value of the filter function is used to flush the buffer, not to check the stream_end.
If we treat Z_BUF_ERROR as an error and throw an exception, then (result != zlib::stream_end) == (result == zlib::okay), because the result may only have two values (okay or stream_end). So in that case the patch is ok.
If we do not treat Z_BUF_ERROR as an error, then we should return false to flush the buffer.
In that case:
symmetric_filter::write flushes the buffer according to the return value of this function, and continue until the real end of stream is reached.
symmetric_filter::read returns by the size of the data read, not by EOF(-1). Thus the user will be able to call this function again by providing a buffer big enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the explanation for 'filter' in boost/iostreams/filter/symmetric.hpp it does not seem that a return of Z_BUF_ERROR from calling xdeflate can be handled properly. Since we return false in that case, rather than throwing an exception as previously, doesn't that mean that iostreams treats the operations as completing successfully where it really needs to be called again. At least throwing the exception alerts the end-user to the buffer problem whereas with your change any buffer problem is masked by your handling of the Z_BUF_ERROR condition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not always a user-buffer error, conversely it's an internal buffer error usually. You can get this error with a very big user buffer. It's meaningless to throw an exception for an internal recoverable error. I use this library intensively for io operations with high throughput, and this fix really works for Z_BUF_ERROR conditions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we return false in that case, rather than throwing an exception as previously, doesn't that mean that iostreams treats the operations as completing successfully where it really needs to be called again.
No it does not. Because the filter function does not guarantee to complete the job in a single call, instead the outer functions check the end of stream and continue to call this function if it is not reached. Also user should know that the read and write functions does not guarantee to complete the job in a single call due to buffer issues. User should always check the completeness. So that it's meaningless to throw an exception for a situation already have to be known.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The read and write functions of iostreams classes is designed to return the size of the data operated. It informs the user by an extra call returning -1 after the operation is completed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I agree with your analysis. I would still like to see, in place of result == zlib::okay the specific test for end-of-stream and buffer error in the form of result != zlib::stream_end && result != zlib::buf_error instead. The same goes for your second case below. I am being as conservative as I can be in this change so as to invoke only the particular consequences by which a Z_BUF_ERROR is now handled differently. In other words currently not reaching the end-of-stream returns 'true', and any other result returns 'false' while with your fix I want to have it explicitly changed so that not reaching the end-of-stream and not having a buffer error returns 'true', and any other result returns 'false'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The right way is to return (result == zlib::okay) because when we return true, the symmetric_filter continues filtering. However if there is an error (not throwing exception), it should stop filtering and return immediately. By this fix, there is only 3 values not throwing exception (Z_OK, Z_STREAM_END, Z_BUF_ERROR).
If we decide to add another value (e.g. Z_MEM_ERROR) to this noexception list, the filter function shall return false for that error code too, otherwise the symmetric_filter will enter an infinite loop.
So that returning (result == zlib::okay) is the right way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zlib::okay means that the operation is successfull and there is more data to be processed. This is exactly what symmetric_filter expects to continue filtering.

@@ -381,7 +381,8 @@ bool zlib_decompressor_impl<Alloc>::filter
int result = xinflate(zlib::sync_flush);
after(src_begin, dest_begin, false);
zlib_error::check BOOST_PREVENT_MACRO_SUBSTITUTION(result);
return !(eof_ = result == zlib::stream_end);
eof_ = result == zlib::stream_end;
return result == zlib::okay;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think your change here is correct as it follows the same logic as when xdeflate was called.

Copy link
Contributor Author

@yanikibo yanikibo Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value is used to flush the buffer, not to check the stream_end. Instead the eof_ variable is used to check the end of stream case.

@@ -76,7 +76,7 @@ void zlib_error::check BOOST_PREVENT_MACRO_SUBSTITUTION(int error)
switch (error) {
case Z_OK:
case Z_STREAM_END:
//case Z_BUF_ERROR:
case Z_BUF_ERROR:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain why you think this change should be made. You are eseentially saying that a buffer error should not throw an exception.

Copy link
Contributor Author

@yanikibo yanikibo Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Z_BUF_ERROR is not a fatal error, it's a recoverable error. If we treat it as an error and throw an exception, we can not recover it. To recover this error, we should flush the data, empty the buffer and retry.

5. deflate() or inflate() returns Z_BUF_ERROR:

Before making the call, make sure that avail_in and avail_out are not zero. When setting the parameter flush equal to Z_FINISH, also make sure that avail_out is big enough to allow processing all pending input. Note that a Z_BUF_ERROR is not fatal--another call to deflate() or inflate() can be made with more input or output space. A Z_BUF_ERROR may in fact be unavoidable depending on how the functions are used, since it is not possible to tell whether or not there is more output pending when strm.avail_out returns with zero.
Reference: http://www.zlib.net/zlib_faq.html#faq05

@eldiener eldiener merged commit b8e459a into boostorg:develop Dec 5, 2016
@eldiener
Copy link
Contributor

eldiener commented Dec 6, 2016

The file_descriptor change broke the file descriptor tests so i am going to revert the file descriptor changes you made until I can look at them more closely and sync them with the file descriptor tests.

@eldiener
Copy link
Contributor

eldiener commented Dec 6, 2016

Some other change broke the gzip test, so I have backed out all the changes. I will be testing them out individually and re-instating them individually with any necessary fixes. I do thank you for this PR, even if it did not work out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants