-
Notifications
You must be signed in to change notification settings - Fork 668
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IO-568: AutoCloseInputStream should disable mark/reset #55
base: master
Are you sure you want to change the base?
Conversation
* | ||
* @param n number of bytes read, or -1 if no more bytes are available | ||
* @throws IOException if the stream could not be closed | ||
* @since 2.0 | ||
*/ | ||
@Override | ||
protected void afterRead(final int n) throws IOException { | ||
if (n == EOF) { | ||
if (n == EOF && !marked) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the contract of the method and the whole class. The class Javadoc states "Proxy stream that closes and discards the underlying stream as soon as the end of input has been reached or when the stream is explicitly closed." If feels like you need a different class. Now, it is possible that the original author did not even think of the mark and reset use cases, but still, this has the potential of defeating the purpose of the class: which is to auto close no matter what. Maybe you need a class called "AutoCloseUnmarkedInputStream"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is AutoCloseInputStream itself does not respect the contract of mark/reset API (which makes Tika confuse) and I feel this use case was simply forgotten. Before anything, it's supposed to act as a valid InputStream (which happen to be automatically closed when you don't need it anymore). I assumed that if you mark it it means you will reset it at some point so there is no much risk IMO.
IMO either AutoCloseInputStream support mark/rest or it should force override mark/reset and return false for markSupported() (which is fine with me too since it fixes my use case, I just assumed a more subtle solution would be preferred :)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do see your point and let me rephrase it to make sure I understand: Either we should implement the mark/reset APIs properly or we should implement markSupported() to return false.
My concern remains for this use case, today:
- I open a stream
- I read
- I call mark() and read() here and there
- I read to the end of stream and the stream closes as it should
- I do not bother to call close()
- All is well
With this change, my random calls to mark() before I read to the end of the stream cause the stream to remain open. Clearly a leak. Granted, I should close a stream that I open somewhere or in a try-with-resources block but this class lives outside of this use case.
It seems to me like the simplest solution would be for markSupported() to return false. WDYT? Or create a new class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main point is that it cannot stay like it currently is.
So yes if you want to be sure mark() can be a random mistake then mark/reset should throw an exception and markSupported() should always return false.
I can change my pull request in that directly, no problem. Both works.
I am glad that you have some flexibility here and for the civil discourse as well :-) My main concern is for compatibility so throwing an exception in the place where code worked before is a bad thing. What is the exception you are seeing? |
Sorry, I actually meant reset() only since reset is supposed to throw an exception when markSupported() return false. For mark() I would just not call the target InpuStream mark() method. For the exception itself I was thinking about the standard |
// Behave as standard InputStream not supporting mark/reset | ||
return false; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tmortagne
Again, this completely changes the contract of the existing class and will break existing call sites. I think there are two kinds of solutions here:
(1) Write a new class as I suggested before.
(2) Implement mark/reset with its own buffer so that reset works even when the underlying stream is actually closed. I'm not even sure of the sanity of that.
Gary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't understand this "again". The whole point of the previous conversation was that it was simpler to make AutoCloseInputStream just cancel mark/reset support since it's too complex to really support it properly and you said yourself
It seems to me like the simplest solution would be for markSupported() to return false.
As I said AutoCloseInputStream does not make much sense in its current state since it break InputStream contract regarding mark/reset support which is not so great. Cancel mark/rest support is one way to prevent this inconsistency which is what I suggested several times and that you seemed to agree on.
If you are suggesting that a new class should be introduced and AutoCloseInputStream deprecated because broken but not fixable it was not very clear.
Here is a new version which does not completely abandon AutoCloseInputStream to its fate but still don't change its default behavior. |
No description provided.