Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seeking the DB file does not work in a bundled JRuby application, crashes randomly #37

Closed
ikaronen-relex opened this issue Aug 16, 2021 · 5 comments · Fixed by #50
Closed

Comments

@ikaronen-relex
Copy link
Contributor

ikaronen-relex commented Aug 16, 2021

We have a Rails-based JRuby application deployed as a bundled .jar file, in which we currently use ActionMailer (which uses the Mail gem, which use MiniMime) to send reports as e-mail attachments. We've had reports of weird random crashes causing these e-mails sometimes not to be sent, and we've managed to narrow down the source of these crashes to MiniMime, and specifically to the code in MiniMime::Db::RandomAccessDb.

Here's an example stack trace from a crash log:

job=399431962505 Sending csv report #2501 to REDACTED failed: undefined method `>' for nil:NilClass
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:135:in `lookup_uncached'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:113:in `block in lookup'
	org/jruby/RubyHash.java:1263:in `fetch'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:89:in `fetch'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:112:in `block in lookup'
	org/jruby/RubyHash.java:1263:in `fetch'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:89:in `fetch'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:111:in `lookup'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:159:in `lookup_by_extension'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:65:in `block in lookup_by_extension'
	org/jruby/ext/thread/Mutex.java:165:in `synchronize'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:63:in `lookup_by_extension'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:59:in `lookup_by_filename'
	uri:classloader:/gems/mini_mime-1.0.2/lib/mini_mime.rb:6:in `lookup_by_filename'
	uri:classloader:/gems/mail-2.7.1/lib/mail/attachments_list.rb:105:in `set_mime_type'
	uri:classloader:/gems/mail-2.7.1/lib/mail/attachments_list.rb:42:in `[]='
	uri:classloader:/app/mailers/application_mailer.rb:58:in `block in add_files'
	org/jruby/RubyArray.java:1792:in `each'
	uri:classloader:/app/mailers/application_mailer.rb:50:in `add_files'
	uri:classloader:/app/mailers/scheduled_job_mailer.rb:17:in `generate'
	uri:classloader:/app/mailers/application_mailer.rb:11:in `send_mail'

(Yes, I know we're a few versions behind, but the relevant code in v1.0.2 looks the same as in master and I'm pretty sure the issue exists there too.)

Anyway, the weird crashes seem to happen because MiniMime::Db::RandomAccessDb.resolve tries to seek the DB file. But in a bundled app the DB files end up being uri:classloader resources, which are not seekable. Worse, due to a questionable decision by the JRuby devs, attempting to seek them doesn't actually raise an exception but fails silently (and, AFAICT, also causes some internal buffers to be dumped so that the next readline ends up starting from the middle of a line).

Anyway, while the weird silent seek failure is arguably a JRuby bug, even if it's fixed those files will never be seekable (since the underlying Java streams aren't). So this would still need to be fixed on MiniMime's side to make it work in a bundled JRuby app.

Off the top of my head, I can see a couple of possible solutions:

  1. Scrap the current overcomplicated RandomAccessDb implementation entirely and just read the whole database into a hash on the first lookup. It's really not that big, and doing so would make the code much simpler and lookups likely faster.
  2. Test each DB file to see if it's seekable (by trying to seek it, and checking that no exception is raised and that the reported file position actually changes), and switch to a backup implementation (e.g. slurping the whole database into a hash) if it's not. More complicated than option 1, but preserves current behavior in cases where the files are seekable.
  3. Same as option 2, but just read any unseekable DB files into strings and wrap them in a StringIO wrapper. Probably the option involving the least changes to existing code.
@SamSaffron
Copy link
Member

I think at a minimum JRuby should provide some obvious signal that the file is not seekable (perhaps &.seekable? == false) if that becomes the case we can feature detect and switch implementation.

As what to do if JRuby is not going to change anything, I would recommend some sort of magic include that switches the implementation to a completely in-memory implementation if you opt for it (eg: require "mini_mime/in_memory")

So we leave RandomAccessDb as is and just implement a new InMemoryDb that follows the same interface, then you could opt upfront for InMemoryDb over RandomAccessDb.

cc @headius

@headius
Copy link

headius commented Aug 18, 2021

Worse, due to a questionable decision by the JRuby devs, attempting to seek them doesn't actually raise an exception but fails silently (and, AFAICT, also causes some internal buffers to be dumped so that the next readline ends up starting from the middle of a line).

We should fix the buffer dumping, probably. I would guess it was just an oversight... we flush buffers before seek and then don't actually seek for these unseekable resources, but we can't go back to where we were in the buffer.

I think at a minimum JRuby should provide some obvious signal that the file is not seekable (perhaps &.seekable? == false)

There is no IO#seekable? in Ruby though. I agree, it would be a good idea, but we don't unilaterally add APIs like this.

We are always happy to fix issues in JRuby in a logical way, if you can summarize what you think should be fixed. If it's just repairing the buffer flush, that should be pretty easy.

@ikaronen-relex
Copy link
Contributor Author

We are always happy to fix issues in JRuby in a logical way, if you can summarize what you think should be fixed. If it's just repairing the buffer flush, that should be pretty easy.

@headius I agree that the buffer flush should certainly be fixed. Beyond that, though, I personally also think that a non-trivial seek on a non-seekable file should always raise an exception — where non-trivial means anything that isn't provably a no-op, such as f.seek(0, IO::SEEK_CUR). Such a failed non-trivial seek means that the program has requested a state change that the runtime cannot provide, and assumes it to have succeeded (since IO#seek provides no non-exceptional way to indicate failure), thus placing the system in an unexpected and possibly inconsistent state. IMO that's something that pretty much always should result in an exception.

But that's all tangential to the issue at hand: even if my suggestion above was implemented, it would just make MiniMime fail sooner and in a more obvious way when run from a .jar bundle. I can file a separate issue for JRuby if you'd like, but I don't think further in-depth discussion about changing the JRuby IO#seek behavior belongs here.

@SamSaffron Meanwhile, I'll try to provide a PR for MiniMime soon, probably tomorrow. AFAICT, the method I suggested earlier for detecting seekability (seek, rescue, check pos) should work reliably with or without any changes to JRuby's behavior. (To be safe, we do need to reopen the file if seeking fails, but that's easy enough to do in this case.)

casperisfine pushed a commit to casperisfine/mini_mime that referenced this issue Aug 2, 2023
When forking, file descriptors are inherited and their state shared.

In the context of MiniMime this means that the offset of the file
opened by RandomAccessDb is shared across processes, so the
`seek + read` combo is subject to inter-process race conditions.

Of course that file is lazily opened, so assuming most applications
don't query MiniMime before fork, it's not a big problem.

However when reforking post boot (e.g. https://github.com/Shopify/pitchfork)
this becomes an issue.

Additionally, even if the file descriptor isn't shared across processes,
the file position is still process global requiring a Mutex.

By using `pread` instead of `seek + read` we can both make the library
fork safe and get rid of the need to synchronize accesses.

This also happens to fix an outstanding JRuby issue.

Fix: discourse#37
Fix: discourse#38
casperisfine pushed a commit to casperisfine/mini_mime that referenced this issue Aug 2, 2023
When forking, file descriptors are inherited and their state shared.

In the context of MiniMime this means that the offset of the file
opened by RandomAccessDb is shared across processes, so the
`seek + read` combo is subject to inter-process race conditions.

Of course that file is lazily opened, so assuming most applications
don't query MiniMime before fork, it's not a big problem.

However when reforking post boot (e.g. https://github.com/Shopify/pitchfork)
this becomes an issue.

Additionally, even if the file descriptor isn't shared across processes,
the file position is still process global requiring a Mutex.

By using `pread` instead of `seek + read` we can both make the library
fork safe and get rid of the need to synchronize accesses.

This also happens to fix an outstanding JRuby issue.

Fix: discourse#37
Fix: discourse#38
SamSaffron pushed a commit that referenced this issue Aug 3, 2023
When forking, file descriptors are inherited and their state shared.

In the context of MiniMime this means that the offset of the file
opened by RandomAccessDb is shared across processes, so the
`seek + read` combo is subject to inter-process race conditions.

Of course that file is lazily opened, so assuming most applications
don't query MiniMime before fork, it's not a big problem.

However when reforking post boot (e.g. https://github.com/Shopify/pitchfork)
this becomes an issue.

Additionally, even if the file descriptor isn't shared across processes,
the file position is still process global requiring a Mutex.

By using `pread` instead of `seek + read` we can both make the library
fork safe and get rid of the need to synchronize accesses.

This also happens to fix an outstanding JRuby issue.

Fix: #37
Fix: #38
@headius
Copy link

headius commented Aug 8, 2023

I can file a separate issue for JRuby if you'd like

I missed this when you first posted, @ikaronen-relex, but it's a good idea if you're still out there :-)

@ikaronen-relex
Copy link
Contributor Author

ikaronen-relex commented Aug 27, 2023

@headius, sure, I can do that. Thanks for the reminder! 😃

Ps. I have a suspicion that the change in #50 doesn't actually fix the behavior when running from a bundled .jar file, since the JRuby implementation of pread apparently silently ignores(!) the offset if fd.chFile and fd.chNative are both null (as they presumably are for streams opened from uri:classloader URIs). I haven't tested that yet, but I'll create an internal ticket to remind me about it.

(FWIW, I also think that behavior is contrary to the POSIX spec, which says that "An attempt to perform a pread() on a file that is incapable of seeking results in an error." Arguably it also violates the Ruby stdlib documentation, which just references "the pread system call" and says that it "Raises SystemCallError on error". Anyway, I can also submit a fix for that, either in the same PR or in a separate one.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants