Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception in client's Listener.onNewData can crash SerialInputOutputManager #601

Closed
neboskreb opened this issue Oct 19, 2024 · 3 comments
Closed
Labels

Comments

@neboskreb
Copy link

neboskreb commented Oct 19, 2024

Low-level exceptions in client's Listener.onNewData can crash SerialInputOutputManager.

Impact

After this fault, no further RX is accepted and delivered to the Listener. The TX works, however. No recovery possible, the comm stack needs to be re-created. As the whole thing happens silently, this situation is very difficult to detect and debug.

Reproducing

Steps

  • In your Listener.onNewData throw an Error or a custom exception extending Throwable

Expected result

  • Your Listener.onRunError is called
  • The comm stack continues working

Actual result

  • Listener.onRunError is NOT called.
  • The exception propagates up, uncaught, eventually ending the Manager's thread ungracefully.

Minimal reproducible example

private SerialInputOutputManager manager;

@Test
void fails() throws IOException {
    // GIVEN
    Throwable error = new NoSuchMethodError("Boo!");
    doThrow(error).when(mockListener).onNewData(ArgumentMatchers.any());

    // WHEN
    manager.run();

    // THEN
    Mockito.verify(mockListener).onRunError(exceptionCaptor.capture());
    Exception capturedException = exceptionCaptor.getValue();
    assertThat(capturedException).hasMessageContaining("Boo!");
}

Full compilable runnable code can be found in the attached test.zip

Cause

The issue is caused by a too narrow scope of the catch clause in SerialInputOutputManager.run which catches subclasses of Exception only. However, quite a number of lower level errors in Java do not extend Exception but rather Error - those will not be caught.

Also, custom exceptions are not limited to extend Exception, so such a custom exception extending Throwable will not be caught too.

Possible solutions

  1. In method step, wrap the invocation of Listener.onNewData with try-catch (Throwable)
    Wrapping Listener.onRunError in the same manner may also be a good idea
  2. In method run, expand the scope from catch (Exception e) to catch (Throwable e)
  3. In method run, add clause catch (Throwable e) "for bad cases"

I recommend solution 1.

Action points

Let's quickly discuss the pros/cons of solutions, and once the option is selected I shall draft the PR with fix.

@neboskreb neboskreb changed the title Exception in client's Listener.onNewData can crush SerialInputOutputManager Exception in client's Listener.onNewData can crash SerialInputOutputManager Oct 19, 2024
@kai-morich
Copy link
Collaborator

run catches the exception and invokes listener.onRunError with it. Changing the signature of onRunError from Exceptionto Throwable would be incompatible.

First ideas:

  • wrap Throwable in Exception when invoking onRunError
  • add new method to Listener interface: default void onRunError(Throwable e) { onRunError(new Exception(e)); };

@kai-morich kai-morich added the bug label Oct 24, 2024
@neboskreb
Copy link
Author

My first idea was the same like yours - wrapping it in Exception to avoid changing the Listener interface. So it would look like

try {
    listener.onNewData(data);
} catch (Exception e) {
    // Do not wrap normal Exceptions
    throw e; 
} catch (Throwable t) {
    throw new Exception("Client exception in Listener.onNewData", t);
}

so run will catch it properly and report it to Listener.onRunError.

Letting the "normal" exception fly freely is important for backwards compatibility - the existing clients might have handing in their onRunError which does not expect our new wrapping.

I ask myself is Exception too general for the wrapper? I.e. how the handling in Listener.onRunError would look like if it was something more specific, e.g. ClientCallbackException? Maybe we should introduce one if it helps making the handling more structured.

kai-morich added a commit that referenced this issue Oct 26, 2024
…601)

to avoid breaking Interface changes, Error from onNewData() is wrapped into Exception when calling onRunError()
kai-morich added a commit that referenced this issue Oct 26, 2024
…601)

to avoid breaking Interface changes, Error from onNewData() is wrapped into Exception when calling onRunError()
kai-morich added a commit that referenced this issue Oct 26, 2024
…601)

to avoid breaking Interface changes, Error from onNewData() is wrapped into Exception when calling onRunError()
kai-morich added a commit that referenced this issue Oct 28, 2024
…601) (#606)

to avoid breaking Interface changes, Error from onNewData() is wrapped into Exception when calling onRunError()
@neboskreb
Copy link
Author

neboskreb commented Oct 29, 2024

Wow, you already patched it. That was quick! I thought I'd have to do the job and you only review :) Nice!

But I see few problems with this approach. Not something critical, but maybe worth revisiting:

  1. Catching Throwables is not recommended in general case, because it might affect the stability of the whole system.

    • Example: an OutOfMemoryError will be caught and suppressed by this code, depriving the application its last chance to recover. I know, I know - handling this error is as rare as unicorn among the general-purpose apps but not such a rarity among the industrial firmware.
    • Another example is ThreadDeath exception which is, again, rare among common apps but happens in the industrial field where the responsiveness is monitored and stuck threads are killed. With the current fix, Manager's thread cannot be stopped.

    So I'd rather not go wild with catching Throwables but do it very locally, like wrapping a callback.

  2. The stack trace is populated at the moment of creating the exception. So creating new Exception(e) at line 216 will get its stack trace referring to line 216 - not the place where the problem actually happened. This might really obscure and complicate the debugging.

Like I said, it's not something critical critical but it would improve the quality even more. Do you think it's worth further discussion?

If we decide to invest more effort into this, I can prepare the PR for your review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants