Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure the DB file is seekable, use a StringIO wrapper if it's not #38

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions lib/mini_mime.rb
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,16 @@ def initialize(path, sort_order)
@file_length = File.size(@path)
@rows = @file_length / @row_length

# Make sure the DB file is seekable, use a StringIO wrapper if it's not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole dance of checking makes me uneasy... I would be comfortable with a "reaction" to an error JRuby raises when you do a seek but it gives us nothing. I prefer not to own this heuristic.

Perhaps a simple:

MiniMime.preload_dbs = true

Then

if MiniMime.preload_dbs
   @file = StringIO.new(File.read(@path))
end

With some readme data. Very uneasy with the hack of ... seek around file and check that the computer did what you asked it... as a general solution. @headius - it should be exceptioning out if you tell it to seek and it does not seek, this is hiding bugs in JRuby and making them "normal".

Copy link
Contributor Author

@ikaronen-relex ikaronen-relex Oct 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be exceptioning out if you tell it to seek and it does not seek

I completely agree. However, if we want this code to be compatible with current and past JRuby behavior, we'd still need this "hack". At least the way I wrote it, it should continue working even if JRuby behavior is later changed to raise a reasonable exception (i.e. any IOError or SystemCallError) in this situation.

I suppose a comment explaining why this "dance" is needed might be worth including.

Copy link

@headius headius Oct 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not do anything special for seek. It will either use JVM functions to do this (SeekableChannel.position) or a native lseek call.

It does appear to error for an IO.pipe, for example, but I'm unsure if it is the right error:

$ jruby -e 'IO.pipe[0].seek(0)'
Errno::EPIPE: Broken pipe - No message available
    seek at org/jruby/RubyIO.java:1817
  <main> at -e:1

Also for stdin, with a similar error:

$ jruby -e '$stdin.seek(0)'
Errno::ESPIPE: Illegal seek - <STDIN>
    seek at org/jruby/RubyIO.java:1817
  <main> at -e:1

I am a bit confused which resources are not seekable but not raising an error. Just classloader resources? Some other file type for the DB? I think we can extend this error raising to those other resources types if someone can clarify which ones are failing to raise an error here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a JRuby patch that will raise EPIPE for all streams that do not appear to be seekable, including classloader resources:

[] ~/work/jruby93 $ rvm ruby-3.1 do ruby -e 'IO.pipe[0].seek(0)'
-e:1:in `seek': Illegal seek (Errno::ESPIPE)
	from -e:1:in `<main>'

[] ~/work/jruby93 $ rvm ruby-3.1 do ruby -e '$stdin.seek(0)'
-e:1:in `seek': Illegal seek @ rb_io_seek - <STDIN> (Errno::ESPIPE)
	from -e:1:in `<main>'

[] ~/work/jruby93 $ jruby -e 'p File.open("uri:classloader:jruby/kernel/kernel.rb").seek(1)'
Errno::EPIPE: Broken pipe - uri:classloader:jruby/kernel/kernel.rb
    seek at org/jruby/RubyIO.java:1817
  <main> at -e:1

This is a pretty aggressive patch, but might be necessary to make this work properly. By the time we do this seek call, all we have in hand is a java.nio.Channel, and we use the presence of the SeekableChannel supertype to know if seeking is possible. We opted for quiet failure at some point in the past (no idea why) if a channel was not seekable but also not selectable.

I have pushed this as jruby/jruby#7415 but we'll need to evaluate how risky it is before merging and releasing.

cc @enebo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@headius

I am a bit confused which resources are not seekable but not raising an error. Just classloader resources?

Classloader resources are what I've encountered this issue with, since that's what the DB files will end up being when a JRuby application using MiniMime is bundled into a .jar.

We opted for quiet failure at some point in the past (no idea why) if a channel was not seekable but also not selectable.

The silent failure was apparently originally implemented in jruby/jruby@a4cc5ee as a workaround for jruby/jruby#3399. I don't personally think it was a good workaround, but if this behavior is changed, we should probably make sure there's no regression there. (What probably should be done, IMO, would be to fix Rack to properly handle non-seekable resources.)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented on my PR as well, but to keep this up-to-date...

I agree that classloader resources, accessible only as InputStream objects that don't support seeking, should raise an error when you attempt to seek on them. If we can be sure this won't affect rack, it is an easy change (and we could narrow my PR to just the resource types in question). Until we can be sure of that, however, we are potentially trading one "broken" behavior for another "broken" behavior.

The simplest fix would be to make all unseekable resources in JRuby raise an error, but there are many resource types in JRuby that would never be encountered in CRuby such as IO resources from in-memory resources that can only be read forward. It is an arguable grey area what is appropriate to do for those.

is_seekable = (@file.tell == @row_length)
begin
@file.seek(0)
rescue IOError, SystemCallError
is_seekable = false
end
is_seekable = false unless @file.tell == 0
@file = StringIO.new(File.read(@path)) unless is_seekable

@hit_cache = Cache.new(MAX_CACHED)
@miss_cache = Cache.new(MAX_CACHED)

Expand Down
24 changes: 10 additions & 14 deletions test/mini_mime_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -12,25 +12,25 @@ def test_that_it_has_a_version_number
end

def test_extension
assert_equal "application/zip", MiniMime.lookup_by_extension("zip").content_type
assert_equal "application/zip", MiniMime.lookup_by_extension("zip")&.content_type
end

def test_mixed_case

# irb(main):009:0> MIME::Types.type_for("TxT").first.to_s
# => "text/plain"

assert_equal "application/vnd.groove-tool-message", MiniMime.lookup_by_filename("a.GTM").content_type
assert_equal "application/zip", MiniMime.lookup_by_extension("ZiP").content_type
assert_equal "application/vnd.groove-tool-message", MiniMime.lookup_by_filename("a.GTM")&.content_type
assert_equal "application/zip", MiniMime.lookup_by_extension("ZiP")&.content_type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm I don't understand these changes ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just making sure that if the lookup fails and returns nil we get a clean failure from the assert_equals instead of a NoMethodError from calling .content_type on nil. I think I ran into that while testing my code.

end

def test_content_type
assert_equal "application/vnd.lotus-1-2-3", MiniMime.lookup_by_filename("a.123").content_type
assert_equal "application/x-compressed", MiniMime.lookup_by_filename("a.Z").content_type
assert_equal "application/vnd.groove-tool-message", MiniMime.lookup_by_filename("a.gtm").content_type
assert_equal "application/vnd.HandHeld-Entertainment+xml", MiniMime.lookup_by_filename("a.zmm").content_type
assert_equal "text/csv", MiniMime.lookup_by_filename("x.csv").content_type
assert_equal "application/x-msaccess", MiniMime.lookup_by_filename("x.mda").content_type
assert_equal "application/vnd.lotus-1-2-3", MiniMime.lookup_by_filename("a.123")&.content_type
assert_equal "application/x-compressed", MiniMime.lookup_by_filename("a.Z")&.content_type
assert_equal "application/vnd.groove-tool-message", MiniMime.lookup_by_filename("a.gtm")&.content_type
assert_equal "application/vnd.HandHeld-Entertainment+xml", MiniMime.lookup_by_filename("a.zmm")&.content_type
assert_equal "text/csv", MiniMime.lookup_by_filename("x.csv")&.content_type
assert_equal "application/x-msaccess", MiniMime.lookup_by_filename("x.mda")&.content_type

assert_nil MiniMime.lookup_by_filename("a.frog")
end
Expand Down Expand Up @@ -69,11 +69,7 @@ def test_full_parity_with_mime_types
type ||= types.detect(&:registered)
type ||= types.first

# if type.content_type != MiniMime.lookup_by_filename("a.#{ext}").content_type
# puts "#{ext} Expected #{type.content_type} Got #{MiniMime.lookup_by_filename("a.#{ext}").content_type}"
# end

assert_equal type.content_type, MiniMime.lookup_by_filename("a.#{ext}").content_type
assert_equal type.content_type, MiniMime.lookup_by_filename("a.#{ext}")&.content_type
end
end
end
Expand Down