Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed minimumBytes #319

Merged
merged 1 commit into from
Feb 1, 2020
Merged

Removed minimumBytes #319

merged 1 commit into from
Feb 1, 2020

Conversation

Borewit
Copy link
Collaborator

@Borewit Borewit commented Jan 26, 2020

FileType.minimumBytes became deprecated, now we read beyond this 4k boundary. It is still possible to to use a sample of the first portion of the file.

To ensure the best file identification, the entire file should be provided. Therefor it is highly recommended to use the parseFile(), which efficiently reads those, and only those, parts of the files, required for the file identification.

Related: #248

@Borewit Borewit added the API change Major change, dependents may need to update their code label Jan 26, 2020
@Borewit Borewit self-assigned this Jan 26, 2020
@Borewit Borewit merged commit c4966b8 into master Feb 1, 2020
@Borewit Borewit deleted the deprecate-minimumBytes branch February 1, 2020 13:00
@jmac105
Copy link

jmac105 commented Feb 3, 2020

Just to check my own understanding of this: there is still a minimum limit of 4100 bytes, but it now may be higher than that as well? I've seen a few cases in our implementation of this package where file type cannot be detected in an image because it is too small, this will still be an issue right?

@Borewit
Copy link
Collaborator Author

Borewit commented Feb 3, 2020

Just to check my own understanding of this: there is still a minimum limit of 4100 bytes, but it now may be higher than that as well?

The minimum chunk size required to be able to detect a file type are 2 bytes. A significant number of file types will be detectable with the first 12 bytes. But some may require 4 MB. 4100 bytes became on arbitrary chunk size, therefor this PR removed it from the API.

Although not suitable for each use case, I highly recommend to use the method FileType.fromFile(filePath) to determine the file type. That methods will query those parts of the file which are relevant to determine the file type.

I've seen a few cases in our implementation of this package where file type cannot be detected in an image because it is too small, this will still be an issue right?

If you provide fewer bytes as required for the detection, the detection will fail or fall back to a more generic file type. This is not limited just to images.

@jmac105
Copy link

jmac105 commented Feb 3, 2020

Thanks. The files do not exist on disk as per current implementation but I'll have a look into that.

@Borewit
Copy link
Collaborator Author

Borewit commented Feb 3, 2020

If you have the entire file in memory that is fine, in combination with fromBuffer. For an image file that is reasonable solution. Your browser has to load an image as well to be able to display it (and even decompress it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change Major change, dependents may need to update their code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants