Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash reporting heaven #23691

Closed
wants to merge 8 commits into from
Closed

Conversation

pvinis
Copy link
Contributor

@pvinis pvinis commented Feb 28, 2019

Summary

I have used RN for a long time, and for all this time, crash reporting has been less great than native development crash reporting. At some point, companies like sentry, bugsnag and a bunch of others started supporting sourcemaps for js crashes in RN, which helped a lot.
But native crashes were (and still are) much harder to diagnose.

..Until now :D

I have make a repo of a sample RN app, included this PR in it, and some code and screenshots to help.
The repo is here.

Explanation

I was trying to get good crash reports from native crashes in iOS for a looong time. I spoke with people in sentry, in bugsnag and more, and I could not get this solved. There was no clear way to get the native crashed to display correctly.
I made two repos here, one for sentry and one for bugsnag, demonstrating the correct js handling and the bad native handling.

After all this, and talks with their support, twitter etc, I investigated further, on why this was happening. I thought there must be some reason that native crashes look bad in all the tools, and in the same way. Maybe it's not their fault, or up to them to fix it, or maybe they didn't have the experience to fix it.

In a test project I created, I checked what's up with the RCTFatalException, and I found out that the React Native code is catching the NSExceptions that come from any native modules of a RN app and converting it into an string and sending it to RCTFatal that created an NSError out of that string. Then it checks if the app has set a fatal error handler and if not, goes ahead and throws that NSError.

The problem here is that NSException has a bunch more info that the resulting NSError is missing or is altering. Turning the callstack into a string renders crash reporting tools useless as they are missing the original place the exception was thrown, symbols, return addresses etc. In both repos above it can be seen that both tools were thinking that the error happened somewhere in the RCTFatal function, and it did, since we create it there, losing all the previous useful info of the original exception. That leaves us with just a very long name including a callstack, but very hard to actually map this to the code and dsym.

I added a fatal exception handler, that mirrors the fatal error handler, as the error handler is used around React Native internal code.

Then I stopped making a string out of the original NSException and calling RCTFatal, and I simply throw the exception. This way no info is lost!

Finally, I added some code examples of native and js crashes and added a part in the RNTester app, so people can see how a js and a native error look like while debugging, as well as try to compile the app in release mode and see how the crash report would look like if they connect it to bugsnag or sentry or their tool of choice.

I have attached some images at the bottom of this PR, and you can find some in the 3 repos I linked above.

Changelog

[iOS] [Fixed] - Changed the way iOS native module exceptions get handled. Instead of making them into an NSError and lose the context and callstack, we keep them as NSExceptions and propagate them.
[General] [Added] - Example code for native crashes in iOS and Android, with buttons on RNTester, so developers can see how these look when debugging, as well as the crash reports in release mode.

Test Plan

You can take the app here and add your sentry dsn etc, or comment out the sentry line in App.js and add bugsnag or instabug or fabric or any other tool you like, and test the result of a native crash. Maybe also compare it with the existing native reports produced by the current RN version.

(I have cancelled any sensitive info that is contained in these repos like api keys etc there, so no worries for that.)

Appendix

Good js crashes:
good-js-bugsnag
good-js-sentry
Bad native crashes (pre-PR):
bad-na-bugsnag
bad-na-sentry
Good native crash (post-PR):
good-na-sentry

@pvinis pvinis requested a review from shergin as a code owner February 28, 2019 11:23
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 28, 2019
Copy link

@analysis-bot analysis-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code analysis results:

  • eslint found some issues. Run yarn lint --fix to automatically fix problems.

RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
Copy link

@analysis-bot analysis-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code analysis results:

  • eslint found some issues. Run yarn lint --fix to automatically fix problems.

RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
Copy link

@analysis-bot analysis-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code analysis results:

  • eslint found some issues. Run yarn lint --fix to automatically fix problems.

RNTester/js/CrashExample.js Outdated Show resolved Hide resolved
@pvinis
Copy link
Contributor Author

pvinis commented Feb 28, 2019

Maybe @HazAT and @fractalwrench are interested in this too.

Copy link
Contributor

@cpojer cpojer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing. Thank you so so much for doing all this work. I think this is going to be huge for the community.

I'm wondering if (as a follow-up) there is any possibility of adding an automated test for this so we ensure this doesn't ever regress again. I realize it may be hard to do so I'm happy to hear ideas you may have.

I'm gonna import and try to land it, but I need to check if this is gonna cause any trouble at FB :)

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpojer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@pvinis
Copy link
Contributor Author

pvinis commented Mar 1, 2019

Hm I will take a look for adding tests. I might need some help/guidance for this, but I will look.

@pvinis
Copy link
Contributor Author

pvinis commented Mar 1, 2019

Would I test the native module in a try-catch and check the exception there, or would I test in js by calling the native module and do the try-catch dance there? 🤔

@pvinis
Copy link
Contributor Author

pvinis commented Mar 4, 2019

I am adding an extra commit, that keeps the old RCTFatal behavior on the case of RCT_DEBUG being true. That is because when developing and debugging, it's more useful to see the string-constructed callstack on the app.

  • If we debug with Xcode running, the exception breakpoint will stop where it should. (in RNTester, that is CrashyCrash.m:19).
  • If we debug without Xcode (expo, or just running the app in device or simulator but without Xcode), we should get some info on the redbox. (this is the current master behavior).
    screenshot 2019-03-04 at 14 30 04
  • If we are on release build type, then we want the original exception, without wrappers or string callstacks. We need the original actual exception.

Before the commit I'll do now, in this PR, the redbox while debugging was
screenshot 2019-03-04 at 14 24 47
because or removing the RCTFatal call. I added it back, but restrict it only in the debug mode. This gives us as much info as possible in debug, and the correct info in release.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpojer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@HazAT
Copy link

HazAT commented Mar 5, 2019

@pvinis Awesome work!

@pvinis
Copy link
Contributor Author

pvinis commented Mar 6, 2019

I just tried the merged stack on sentry and it works soo much better now. <3
screenshot 2019-03-06 at 14 22 16

@react-native-bot
Copy link
Collaborator

This pull request was successfully merged by @pvinis in 629708b.

When will my fix make it into a release? | Upcoming Releases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged This PR has been merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants