Misidentifies jpeg as "audio/vnd.dts.hd" #36

Closed
rwstauner opened this issue Apr 1, 2016 · 5 comments

Comments

@rwstauner

I have 10 photos.
file and identify (ImageMagick) both report all of them as JPEGs, but this gem classifies one of them incorrectly.

I used the "magic" found here:
https://github.com/minad/mimemagic/blob/v0.3.0/lib/mimemagic/tables.rb#L1487
to verify:

$ for i in *.jpg; { echo $i; file $i; identify $i; head -c 18725 $i | grep -Eo 'dX %'; }
1470445_01.jpg
1470445_01.jpg: JPEG image data, JFIF standard 1.02
1470445_01.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 85.6KB 0.000u 0:00.000
1470445_02.jpg
1470445_02.jpg: JPEG image data, JFIF standard 1.02
1470445_02.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 74.6KB 0.000u 0:00.000
1470445_03.jpg
1470445_03.jpg: JPEG image data, JFIF standard 1.02
1470445_03.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 73.4KB 0.000u 0:00.000
1470445_04.jpg
1470445_04.jpg: JPEG image data, JFIF standard 1.02
1470445_04.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 69.6KB 0.000u 0:00.000
1470445_05.jpg
1470445_05.jpg: JPEG image data, JFIF standard 1.02
1470445_05.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 82.5KB 0.000u 0:00.000
Binary file (standard input) matches
1470445_06.jpg
1470445_06.jpg: JPEG image data, JFIF standard 1.02
1470445_06.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 68.4KB 0.000u 0:00.000
1470445_07.jpg
1470445_07.jpg: JPEG image data, JFIF standard 1.02
1470445_07.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 49.6KB 0.000u 0:00.000
1470445_08.jpg
1470445_08.jpg: JPEG image data, JFIF standard 1.02
1470445_08.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 30KB 0.000u 0:00.000
1470445_09.jpg
1470445_09.jpg: JPEG image data, JFIF standard 1.02
1470445_09.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 35.9KB 0.000u 0:00.000
1470445_10.jpg
1470445_10.jpg: JPEG image data, JFIF standard 1.02
1470445_10.jpg JPEG 800x600 800x600+0+0 8-bit DirectClass 55.2KB 0.000u 0:00.000
@rwstauner
Author

FTR if I cut the file short (first 13,000 bytes) it works correctly:

💥  docker run --rm -it -v ~work/tmp/1470445_05.jpg:/file.jpg ruby:2.3 bash -c 'gem install mimemagic; dd if=/file.jpg of=/file2.jpg bs=1 count=13000; ruby -rmimemagic -e "%w[/file.jpg /file2.jpg].each { |f| puts %Q|#{f}:#{ MimeMagic.by_magic(File.open(f)) }| }"'
Fetching: mimemagic-0.3.1.gem (100%)
Successfully installed mimemagic-0.3.1
1 gem installed
13000+0 records in
13000+0 records out
13000 bytes (13 kB) copied, 0.663014 s, 19.6 kB/s
/file.jpg:audio/vnd.dts.hd
/file2.jpg:image/jpeg

So the gem is selecting a worse match (somewhere between bytes 13000 and 18725) instead of a better one (the first 2 or 3 bytes).

@natematykiewicz

data = File.binread('1470445_05.jpg')
MimeMagic::MAGIC.select {|type, matches| MimeMagic.send(:magic_match_str, data, matches) }
=> [["audio/vnd.dts.hd", [[0..18725, "dX %"]]], ["image/jpeg", [[0, "\xFF\xD8\xFF"], [0, "\xFF\xD8"]]]]

So this file does match the image/jpeg format, but find returns the first matching entry, which is audio/vnd.dts.hd.
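
For anyone without the original photo, here is a minimal sketch that reproduces the behaviour with synthetic data. It assumes a first-match lookup over a table shaped like the output above. TABLE, rule_match? and first_match are hypothetical names for illustration, not the gem's internals, and the "photo" is just a JPEG signature followed by filler with the bytes "dX %" planted at offset 15000.

```ruby
# Miniature stand-in for the magic table (hypothetical, not the gem's real data).
# Each entry is [type, [[offset_or_range, pattern], ...]].
TABLE = [
  ["audio/vnd.dts.hd", [[0..18725, "dX %".b]]],
  ["image/jpeg",       [[0, "\xFF\xD8\xFF".b]]]
].freeze

# True if any rule matches. An Integer offset must match exactly at that byte;
# a Range offset matches if the pattern starts anywhere inside the range.
def rule_match?(data, rules)
  rules.any? do |offset, pattern|
    if offset.is_a?(Range)
      idx = data.index(pattern, offset.begin)
      !idx.nil? && idx <= offset.end
    else
      data[offset, pattern.bytesize] == pattern
    end
  end
end

# First-match lookup: table order alone decides which type wins.
def first_match(data)
  entry = TABLE.find { |_type, rules| rule_match?(data, rules) }
  entry && entry.first
end

# Synthetic "photo": JPEG signature, zero filler, then "dX %" at offset 15000.
full = "\xFF\xD8\xFF\xE0".b + ("\x00".b * 14_996) + "dX %".b + ("\x00".b * 1_000)
truncated = full.byteslice(0, 13_000)

puts first_match(full)       # audio/vnd.dts.hd
puts first_match(truncated)  # image/jpeg
```

Both inputs start with the JPEG signature. Only the untruncated one also contains the stray "dX %" bytes, and because that entry comes first in the table it wins.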

@minad
Collaborator

minad commented Apr 8, 2016

Probably this is a flaw in the algorithm or in the priority order in the upstream freedesktop file. Maybe someone could take a look at how this could be done better, or at how other tools based on the freedesktop database operate?

@janko
Contributor

janko commented Jul 25, 2016

This issue should be solved by #40: image/jpeg is now at the top of the table, so it will be returned first.
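
To illustrate why table order matters for a first-match lookup, here is a rough sketch. It does not reproduce the actual change in #40. ORIGINAL, PREFERRED, REORDERED, rule_hits? and detect are made-up names, and the table is a two-entry stand-in for the real one.

```ruby
# Hypothetical two-entry table in the problematic order.
ORIGINAL = [
  ["audio/vnd.dts.hd", [[0..18725, "dX %".b]]],
  ["image/jpeg",       [[0, "\xFF\xD8\xFF".b]]]
].freeze

# Types that should be tried first (illustrative list, an assumption).
PREFERRED = ["image/jpeg"].freeze

# Move preferred entries to the top while keeping the relative order of the rest.
REORDERED = ORIGINAL.partition { |type, _rules| PREFERRED.include?(type) }
                    .flatten(1)
                    .freeze

# An Integer offset must match exactly at that byte; a Range offset matches
# if the pattern starts anywhere inside the range.
def rule_hits?(data, rules)
  rules.any? do |offset, pattern|
    if offset.is_a?(Range)
      idx = data.index(pattern, offset.begin)
      !idx.nil? && idx <= offset.end
    else
      data[offset, pattern.bytesize] == pattern
    end
  end
end

# First-match lookup over whichever table is passed in.
def detect(table, data)
  entry = table.find { |_type, rules| rule_hits?(data, rules) }
  entry && entry.first
end

# Synthetic data: JPEG signature plus a stray "dX %" at offset 15000.
data = "\xFF\xD8\xFF\xE0".b + ("\x00".b * 14_996) + "dX %".b

puts detect(ORIGINAL, data)   # audio/vnd.dts.hd
puts detect(REORDERED, data)  # image/jpeg
```

Reordering only changes which entry wins when several match. A file that matches only the DTS-HD rule would still be reported as audio/vnd.dts.hd.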

@minad
Collaborator

minad commented Jan 30, 2021

I assume this is fixed?

@minad minad closed this as completed Jan 30, 2021
@mimemagicrb mimemagicrb locked as resolved and limited conversation to collaborators Mar 26, 2021