EML Attachments are modified/have the wrong size #303

Closed
Faelean opened this issue Feb 8, 2021 · 2 comments

Faelean commented Feb 8, 2021

When attachments are read from an eml file, the size of the attachment is determined incorrectly. As a result, the extracted files differ from the original files.

I've attached a small program that compares the original files to the attachments from the email object. It contains one .eml file and one .msg file with four attachments each, plus the four original files that are attached to the mails. The program compares the bytes of both files and prints to the console the ones that are missing from (or differ in) one of the files.
That's the (shortened) output I get for one of the attachments:
EML:

------------ Documents.7z -------------
attachement unread byte: 0 at index 1462
[...]
attachement unread byte: 0 at index 1499
Unread bytes in attachment file: 38
Unread bytes in original file: 0

MSG:

------------ Documents.7z -------------
Unread bytes in attachment file: 0
Unread bytes in original file: 0

As you can see, the contents are identical when using the msg file, while there are differences when using the eml file.
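
For readers without the attached program, the kind of comparison described above could look roughly like the following. This is only a sketch, not the code from example.zip; the class and method names (`ByteCompareSketch`, `firstDifference`, `surplusBytes`) are made up for illustration.

```java
final class ByteCompareSketch {

    /** Returns the index of the first differing byte, or -1 if both arrays are identical. */
    static int firstDifference(byte[] original, byte[] attachment) {
        int common = Math.min(original.length, attachment.length);
        for (int i = 0; i < common; i++) {
            if (original[i] != attachment[i]) {
                return i;
            }
        }
        // Same prefix: the arrays only differ if one is longer than the other.
        return original.length == attachment.length ? -1 : common;
    }

    /** Returns how many bytes the attachment has beyond the end of the original. */
    static int surplusBytes(byte[] original, byte[] attachment) {
        return Math.max(0, attachment.length - original.length);
    }

    public static void main(String[] args) {
        byte[] original = {10, 20, 30};
        byte[] attachment = {10, 20, 30, 0, 0}; // two zero bytes appended
        System.out.println("first difference at index: " + firstDifference(original, attachment));
        System.out.println("surplus bytes in attachment: " + surplusBytes(original, attachment));
    }
}
```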

From what I've figured so far, the error is in the MiscUtil class:

public static byte[] readInputStreamToBytes(@NotNull final InputStream inputStream)
        throws IOException {
    try (InputStream is = inputStream) {
        byte[] targetArray = new byte[is.available()];
        //noinspection ResultOfMethodCallIgnored
        is.read(targetArray);
        return targetArray;
    }
}

The javadoc for InputStream.available states:

Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. [...]
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.

(https://docs.oracle.com/javase/9/docs/api/java/io/InputStream.html#available--)

In our case, the available method returns 1500 for Documents.7z, while the actual size is 1461. The surplus bytes of the array keep their default value of 0, which modifies the file. This causes a warning when unzipping the file with 7zip. For the attached txt file, it shows up as unreadable characters when opening with Notepad++ and as whitespace when opening with the Windows Editor.
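
The effect can be reproduced in isolation. The sketch below is not taken from the library's test suite; `OverestimatingStream` is a hypothetical stream, written for this example, whose available() reports more bytes than it holds, and `readUsingAvailable` repeats the logic of the method quoted above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class AvailableOverestimateDemo {

    /** Hypothetical stream whose available() reports more bytes than it actually holds. */
    static final class OverestimatingStream extends ByteArrayInputStream {
        private final int padding;

        OverestimatingStream(byte[] content, int padding) {
            super(content);
            this.padding = padding;
        }

        @Override
        public synchronized int available() {
            return super.available() + padding;
        }
    }

    /** Same logic as the quoted method: buffer sized by available(), then a single read(). */
    static byte[] readUsingAvailable(InputStream inputStream) throws IOException {
        try (InputStream is = inputStream) {
            byte[] targetArray = new byte[is.available()];
            is.read(targetArray); // fills only the bytes that really exist
            return targetArray;   // the rest of the array stays 0
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = {1, 2, 3, 4, 5};
        byte[] result = readUsingAvailable(new OverestimatingStream(original, 3));
        System.out.println("original length: " + original.length);
        System.out.println("result length:   " + result.length);
        System.out.println("last byte:       " + result[result.length - 1]);
    }
}
```

With five real bytes and an overestimate of three, the returned array is three bytes longer than the input and ends in zeros, which matches the pattern in the EML output above.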

I'd suggest replacing everything inside the try block with this (or something similar). It's a slightly modified version of the one Baeldung suggests for when Java 9 is not available.

ByteArrayOutputStream buffer = new ByteArrayOutputStream();
byte[] data = new byte[1024];
int read;
// read() returns the number of bytes actually read, or -1 at the end of the stream
while ((read = is.read(data, 0, data.length)) != -1) {
    buffer.write(data, 0, read);
}

buffer.flush();

byte[] targetArray = buffer.toByteArray();
return targetArray;
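
Put together as a complete method, the suggestion might look like the sketch below. It is not necessarily what ended up in the library; the class name `StreamReadSketch` and the test stream in main are assumptions made for this example, and the @NotNull annotation is left out so the snippet compiles without extra dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamReadSketch {

    /** Reads the stream until read() signals the end, never relying on available(). */
    static byte[] readInputStreamToBytes(InputStream inputStream) throws IOException {
        try (InputStream is = inputStream) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] data = new byte[1024];
            int read;
            while ((read = is.read(data, 0, data.length)) != -1) {
                buffer.write(data, 0, read); // copy only the bytes that were read
            }
            return buffer.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[1461];
        // Test stream that misreports its size the way the issue describes.
        InputStream source = new ByteArrayInputStream(original) {
            @Override
            public synchronized int available() {
                return 1500;
            }
        };
        System.out.println(readInputStreamToBytes(source).length);
    }
}
```

On Java 9 and later, the loop could be replaced by a call to InputStream.readAllBytes(), which also reads until the end of the stream.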

example.zip

@bbottema
Owner

Thanks so much for your research. I will incorporate your improvement asap.

@bbottema
Owner

bbottema commented Feb 13, 2021

Released in 6.4.5
