Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/215 async-signal-safe launcher shutdown #753

Merged
merged 7 commits into from
Jun 16, 2024

Conversation

PengZheng
Copy link
Contributor

@PengZheng PengZheng commented Jun 12, 2024

This PR fixes a long-standing issue #215 by implementing async-signal-safe launcher shutdown.
Without making our event loop more powerful (like libuv), I think setting a flag to be checked by a periodic scheduled event callback is the best we can do. So here it is.

This PR also makes celix_launcher_triggerStop safer. Previously a bundle may initiate a framework shutdown, which will cause celix_framework_waitForStop to return without noticing the launcher. In that case, celix_launcher_triggerStop might reference a to-be-destroyed framework instance to wreck havoc. Now we prevent it from happening with g_launcher.lock.

@PengZheng PengZheng linked an issue Jun 12, 2024 that may be closed by this pull request
@codecov-commenter
Copy link

codecov-commenter commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.31%. Comparing base (58860bb) to head (2378bac).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #753      +/-   ##
==========================================
+ Coverage   90.24%   90.31%   +0.06%     
==========================================
  Files         226      226              
  Lines       26284    26316      +32     
==========================================
+ Hits        23721    23767      +46     
+ Misses       2563     2549      -14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@PengZheng
Copy link
Contributor Author

PengZheng commented Jun 12, 2024

Here is an interesting program hang: https://github.com/apache/celix/actions/runs/9478654947/job/26115518903

Warning: 12T08:29:26] [warning] [celix_framework] [celix_scheduledEvent_waitForRemoved:237] Timeout while waiting for removal of scheduled event 'celix_shutdown_check' (id=0) for bundle id 0.
Warning: 12T08:29:28] [warning] [celix_framework] [celix_scheduledEvent_waitForRemoved:237] Timeout while waiting for removal of scheduled event 'celix_shutdown_check' (id=0) for bundle id 0.
Warning: 12T08:29:30] [warning] [celix_framework] [celix_scheduledEvent_waitForRemoved:237] Timeout while waiting for removal of scheduled event 'celix_shutdown_check' (id=0) for bundle id 0.
Warning: 12T08:29:32] [warning] [celix_framework] [celix_scheduledEvent_waitForRemoved:237] Timeout while waiting for removal of scheduled event 'celix_shutdown_check' (id=0) for bundle id 0.
Warning: 12T08:29:34] [warning] [celix_framework] [celix_scheduledEvent_waitForRemoved:237] Timeout while waiting for removal of scheduled event 'celix_shutdown_check' (id=0) for bundle id 0.

The issue should have been fixed by d384c86. Note that hello_bundle in test_package is a perfect example of framework shutdown initiated by a bundle, which happens to reveal a defect in our scheduled event implementation.

There are still room for further improvements: ideally we shall not permit events (neither generic or scheduled) to be posted to an inactive dispatcher. But I will leave it for a future PR.

@PengZheng PengZheng requested a review from pnoltes June 12, 2024 10:02
Copy link
Contributor

@pnoltes pnoltes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpick remarks, but LGTM.

libs/framework/src/celix_launcher.c Outdated Show resolved Hide resolved
libs/framework/src/celix_launcher.c Show resolved Hide resolved
libs/framework/src/celix_launcher.c Outdated Show resolved Hide resolved
@pnoltes
Copy link
Contributor

pnoltes commented Jun 15, 2024

Without making our event loop more powerful (like libuv), ...

libuv is very interesting and seems to provide a lot of the functionality we use and have implemented ourselves, including a single-thread event loop, shutdown procedure, as well as cross-platform polling, threading, locking, filesystem abstractions, etc. Since it uses an MIT license and does not have dependencies, it could be suitable for Apache Celix.

This could help in supporting more platforms and even be a step towards supporting Windows (we had some support for loading DLL libraries in the past).

In the short term, this may be a step too far, but it's at least worth a discussion.
@PengZheng, any thoughts on how feasible it would be to introduce libuv, starting with our file, time, threading, and locking usage/abstractions?

@PengZheng
Copy link
Contributor Author

PengZheng commented Jun 16, 2024

@PengZheng, any thoughts on how feasible it would be to introduce libuv, starting with our file, time, threading, and locking usage/abstractions?

libuv is the central piece of an event processing framework used heavily in my day job. I have very positive experience with it, and think it can be useful in the following regards:

  • Besides a single-thread event loop, it also provides a thread-pool nicely integrated with the event loop: we can offload inherently blocking tasks to the thread-pool, and get notified when they are done. Therefore we can easily keep the Celix event loop totally non-blocking.
  • Besides networking, it provides various useful constructions, such as uv_timer_t/uv_signal_t/uv_process_t/uv_tty_t.
  • As you already mentioned, it provides a general OS abstraction. Utilizing it we become more portable, and Windows support is not far away.

Specific to this PR:

  • uv_async_send is async-signal-safe. It can be used to notify framework shutdown from a signal handler.
  • Or we can use uv_signal_t to add signal handlers called from the event loop.

@PengZheng PengZheng merged commit 63683c0 into master Jun 16, 2024
32 checks passed
@PengZheng PengZheng deleted the feature/215-stop-launcher-by-signal branch June 16, 2024 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Signal-unsafe functions calls inside signal handler
3 participants