Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: deadlock pattern in dynamic sdk-name assignment #857

Closed
wants to merge 3 commits into from

Conversation

supervacuus
Copy link
Collaborator

We recently introduced a deadlock in sentry_init() when implementing dynamic SDK names. This was the reason for the uptick in build timeouts in the CI (which was almost exclusively due to TEST(concurrent_init) being stuck). Sorry, I missed that during the review.

I also increased the concurrency in the test so that there is a higher chance of triggering such cases. The current level should be high enough for CI. I need to up that considerably on my machines to catch the issue.

@codecov
Copy link

codecov bot commented Jun 27, 2023

Codecov Report

Merging #857 (1b8333a) into master (2aa321b) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #857      +/-   ##
==========================================
+ Coverage   82.57%   82.59%   +0.02%     
==========================================
  Files          53       53              
  Lines        7368     7372       +4     
  Branches     1186     1187       +1     
==========================================
+ Hits         6084     6089       +5     
+ Misses       1175     1174       -1     
  Partials      109      109              

Copy link
Member

@markushi markushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks for getting this fixed. Can you estimate how severe this issue is for mobile SDKs? If this has a bigger impact I'm happy to help you getting this shipped for all relevant mobile SDKs.

@supervacuus
Copy link
Collaborator Author

Nice catch, thanks for getting this fixed. Can you estimate how severe this issue is for mobile SDKs? If this has a bigger impact I'm happy to help you getting this shipped for all relevant mobile SDKs.

It is hard to say, but I don't think this will affect many.

The options lock is a recursive mutex so you can go pretty wild with nested locking solely on options. Also, the options lock usually is short-lived because it only guards the access to the ref-count to retrieve a read-only reference. The nesting of options -> scope -> options is generally a lousy pattern if one thread acquires the scope lock before another locks options at the start of sentry_init() 😓 .

How realistic this is on the Mobile SDKs depends on how they interact with the Native SDK (threads already running that could access scope-related functions before another thread calls sentry_init(), for instance... I think we talked about this recently). I don't know if the last ANR was a rare case or if there is an uptick, like in our CI.

In any case, we can release it immediately, but I am unsure if it is urgent.

The other aspect is that with the current change, you must ensure that the global scope hasn't been touched before sentry_init() or the initialization defaults to the build-time SDK name.

We can change this by not setting the dynamic SDK name in get_scope() at all but rather doing it in the SENTRY_WITH_SCOPE_MUT in sentry_init(). This way, we could eliminate the options path towards get_scope() and an optionally set sdk_name will always overwrite the scope in sentry_init(). WDYT?

@supervacuus
Copy link
Collaborator Author

We can change this by not setting the dynamic SDK name in get_scope() at all but rather doing it in the SENTRY_WITH_SCOPE_MUT in sentry_init(). This way, we could eliminate the options path towards get_scope() and an optionally set sdk_name will always overwrite the scope in sentry_init(). WDYT?

Just in case it is unclear: this would mean we would have to keep the client_sdk unfrozen in get_scope() and freeze it in sentry_init() together with setting the sdk_name. It boils down to these two options (they are two sides of a coin).

@bitsandfoxes
Copy link
Contributor

@supervacuus supervacuus deleted the fix/deadlock_sdk_name_scope branch September 12, 2023 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants