Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use-after-free of ScheduledEvents #647

Closed
PengZheng opened this issue Sep 20, 2023 · 2 comments · Fixed by #683
Closed

Use-after-free of ScheduledEvents #647

PengZheng opened this issue Sep 20, 2023 · 2 comments · Fixed by #683
Assignees
Labels
component/framework Categorizes an issue or PR relevant to the framework. kind/bug Categorizes issue or PR as related to a bug.

Comments

@PengZheng
Copy link
Contributor

PengZheng commented Sep 20, 2023

I inspected scheduled CI actions, and found the following failure.
Though this is very unlikely to be a real bug, which I'll analyze later, I record it for further observation.

https://github.com/apache/celix/actions/runs/6242193992/job/16945710698

[ RUN      ] ScheduledEventTestSuite.ManyScheduledEventTest
[2023-09-20T00:08:14] [  debug] [celix_framework] Framework started event received -> registering framework.ready condition service
[2023-09-20T00:08:14] [   info] [celix_framework] [framework_start:476] Celix framework started
[2023-09-20T00:08:14] [  trace] [celix_framework] [framework_start:480] Celix framework started with uuid BC664877-2CD1-4AC1-BD79-32D8B629A16F
[2023-09-20T00:08:14] [  debug] [celix_framework] [celix_framework_scheduleEvent:2540] Added scheduled event 'Scheduled event test' (id=0) for bundle 'celix_framework' (id=0).
[2023-09-20T00:08:14] [  trace] [celix_framework] [celix_scheduledEvent_process:190] Processing scheduled event Scheduled event test for bundle id 0
[2023-09-20T00:08:14] [  debug] [celix_framework] [celix_framework_processScheduledEvents:1400] Removing processed one-shot scheduled event 'Scheduled event test' (id=0) for bundle if 0.
=================================================================
==15597==ERROR: AddressSanitizer: heap-use-after-free on address 0x6110000eaa00 at pc 0x000102e26137 bp 0x7ff7bd493530 sp 0x7ff7bd493528
READ of size 8 at 0x6110000eaa00 thread T0
    #0 0x102e26136 in celix_scheduledEvent_getId+0x26 (test_framework_with_cxx14:x86_64+0x1003bc136)
    #1 0x102def15f in celix_bundleContext_scheduleEvent+0x10f (test_framework_with_cxx14:x86_64+0x10038515f)
    #2 0x102d78b13 in ScheduledEventTestSuite_ManyScheduledEventTest_Test::TestBody()+0x423 (test_framework_with_cxx14:x86_64+0x10030eb13)
    #3 0x102e61050 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x160 (test_framework_with_cxx14:x86_64+0x1003f7050)
    #4 0x102e60c61 in testing::Test::Run()+0x971 (test_framework_with_cxx14:x86_64+0x1003f6c61)
    #5 0x102e65b26 in testing::TestInfo::Run()+0x9f6 (test_framework_with_cxx14:x86_64+0x1003fbb26)
    #6 0x102e68ba6 in testing::TestSuite::Run()+0x346 (test_framework_with_cxx14:x86_64+0x1003feba6)
    #7 0x102eaae10 in testing::internal::UnitTestImpl::RunAllTests()+0x1420 (test_framework_with_cxx14:x86_64+0x100440e10)
    #8 0x102ea928c in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x15c (test_framework_with_cxx14:x86_64+0x10043f28c)
    #9 0x102ea8fe9 in testing::UnitTest::Run()+0x1a9 (test_framework_with_cxx14:x86_64+0x10043efe9)
    #10 0x102ef705f in main+0xef (test_framework_with_cxx14:x86_64+0x10048d05f)
    #11 0x112f5852d in start+0x1cd (dyld:x86_64+0x552d)

0x6110000eaa00 is located 0 bytes inside of 248-byte region [0x6110000eaa00,0x6110000eaaf8)
freed by thread T401 here:
    #0 0x1036a30a9 in wrap_free+0xa9 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x4a0a9)
    #1 0x102df3b58 in fw_eventDispatcher+0x498 (test_framework_with_cxx14:x86_64+0x100389b58)
    #2 0x7ff81b1ec4e0 in _pthread_start+0x7c (libsystem_pthread.dylib:x86_64+0x64e0)
    #3 0x7ff81b1e7f6a in thread_start+0xe (libsystem_pthread.dylib:x86_64+0x1f6a)

previously allocated by thread T0 here:
    #0 0x1036a2f60 in wrap_malloc+0xa0 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x49f60)
    #1 0x102e25bc9 in celix_scheduledEvent_create+0x39 (test_framework_with_cxx14:x86_64+0x1003bbbc9)
    #2 0x102e00c5b in celix_framework_scheduleEvent+0xcb (test_framework_with_cxx14:x86_64+0x100396c5b)
    #3 0x102def15f in celix_bundleContext_scheduleEvent+0x10f (test_framework_with_cxx14:x86_64+0x10038515f)
    #4 0x102d78b13 in ScheduledEventTestSuite_ManyScheduledEventTest_Test::TestBody()+0x423 (test_framework_with_cxx14:x86_64+0x10030eb13)
    #5 0x102e61050 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x160 (test_framework_with_cxx14:x86_64+0x1003f7050)
    #6 0x102e60c61 in testing::Test::Run()+0x971 (test_framework_with_cxx14:x86_64+0x1003f6c61)
    #7 0x102e65b26 in testing::TestInfo::Run()+0x9f6 (test_framework_with_cxx14:x86_64+0x1003fbb26)
    #8 0x102e68ba6 in testing::TestSuite::Run()+0x346 (test_framework_with_cxx14:x86_64+0x1003feba6)
    #9 0x102eaae10 in testing::internal::UnitTestImpl::RunAllTests()+0x1420 (test_framework_with_cxx14:x86_64+0x100440e10)
    #10 0x102ea928c in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x15c (test_framework_with_cxx14:x86_64+0x10043f28c)
    #11 0x102ea8fe9 in testing::UnitTest::Run()+0x1a9 (test_framework_with_cxx14:x86_64+0x10043efe9)
    #12 0x102ef705f in main+0xef (test_framework_with_cxx14:x86_64+0x10048d05f)
    #13 0x112f5852d in start+0x1cd (dyld:x86_64+0x552d)

Thread T401 created by T0 here:
    #0 0x10369ca3c in wrap_pthread_create+0x5c (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x43a3c)
    #1 0x10325d9a1 in celixThread_create+0x11 (libcelix_utils.2.dylib:x86_64+0x139a1)
    #2 0x102df30a5 in fw_init+0x195 (test_framework_with_cxx14:x86_64+0x1003890a5)
    #3 0x102df4372 in framework_start+0x132 (test_framework_with_cxx14:x86_64+0x10038a372)
    #4 0x102e17770 in celix_frameworkFactory_createFramework+0x120 (test_framework_with_cxx14:x86_64+0x1003ad770)
    #5 0x102a99d99 in celix::createFramework(celix::Properties const&)+0x49 (test_framework_with_cxx14:x86_64+0x10002fd99)
    #6 0x102d9efce in ScheduledEventTestSuite::ScheduledEventTestSuite()+0x3ae (test_framework_with_cxx14:x86_64+0x100334fce)
    #7 0x102da006b in testing::internal::TestFactoryImpl<ScheduledEventTestSuite_ManyScheduledEventTest_Test>::CreateTest()+0x1b (test_framework_with_cxx14:x86_64+0x10033606b)
    #8 0x102e6601c in testing::Test* testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test* (testing::internal::TestFactoryBase::*)(), char const*)+0x15c (test_framework_with_cxx14:x86_64+0x1003fc01c)
    #9 0x102e65320 in testing::TestInfo::Run()+0x1f0 (test_framework_with_cxx14:x86_64+0x1003fb320)
    #10 0x102e68ba6 in testing::TestSuite::Run()+0x346 (test_framework_with_cxx14:x86_64+0x1003feba6)
    #11 0x102eaae10 in testing::internal::UnitTestImpl::RunAllTests()+0x1420 (test_framework_with_cxx14:x86_64+0x100440e10)
    #12 0x102ea928c in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x15c (test_framework_with_cxx14:x86_64+0x10043f28c)
    #13 0x102ea8fe9 in testing::UnitTest::Run()+0x1a9 (test_framework_with_cxx14:x86_64+0x10043efe9)
    #14 0x102ef705f in main+0xef (test_framework_with_cxx14:x86_64+0x10048d05f)
    #15 0x112f5852d in start+0x1cd (dyld:x86_64+0x552d)

SUMMARY: AddressSanitizer: heap-use-after-free (test_framework_with_cxx14:x86_64+0x1003bc136) in celix_scheduledEvent_getId+0x26
Shadow bytes around the buggy address:
  0x1c220001d4f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c220001d500: fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa
  0x1c220001d510: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x1c220001d520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c220001d530: 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x1c220001d540:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c220001d550: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x1c220001d560: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x1c220001d570: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c220001d580: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c220001d590: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==15597==ABORTING

@PengZheng PengZheng added the kind/bug Categorizes issue or PR as related to a bug. label Sep 20, 2023
@PengZheng
Copy link
Contributor Author

PengZheng commented Sep 20, 2023

To be honest, it is really annoying to see any ASAN report on the framework (pubsub ASAN report is more acceptable), especially in the last minutes before a shiny new release.

After celix_scheduledEvent_create returns a newly created event, ASAN reports that we are trying to read its ID via celix_scheduledEvent_getId from a freed memory region. From its size, it is most probably a Schedule Event. From the logging, we can tell there is only one ScheduleEvent ever existing: the first one-shot event. I re-checked the event handling path, and found no double-free.

After the framework startup, there is no further dynamic memory allocation/deallocation except for ScheduleEvent, thus it is not the framework who deallocated a newly created event. The interesting part is ASAN does not report use-after-free within celix_scheduledEvent_create, which initialized the event:

  • When we write to it, ASAN is OK
  • When we read it latter, ASAN screams.

From what I saw, it does not seem like a Celix bug. But let's watch for a longer time if it reproduces.
I've encountered some weird memory corruption bugs when KSM is used on some embedded devices in the past.
Considering such memory saving techs are often used in cloud environments, my unreliable wild guess is that this issue is one of them.

@PengZheng PengZheng changed the title Use-after-free for ScheduledEvents Use-after-free of ScheduledEvents Sep 20, 2023
@PengZheng
Copy link
Contributor Author

Another use-after-free is reported:
https://github.com/apache/celix/actions/runs/6741052039/job/18325045856

[ RUN      ] ScheduledEventTestSuite.OnceShotEventTest
[2023-11-03T03:58:12] [  debug] [celix_framework] Framework started event received -> registering framework.ready condition service
[2023-11-03T03:58:12] [   info] [celix_framework] [framework_start:476] Celix framework started
[2023-11-03T03:58:12] [  trace] [celix_framework] [framework_start:480] Celix framework started with uuid C113B138-D3CF-43A6-BEF1-A44853FBCA1F
[2023-11-03T03:58:12] [  debug] [celix_framework] [celix_framework_scheduleEvent:2540] Added scheduled event 'unnamed' (id=0) for bundle 'celix_framework' (id=0).
[2023-11-03T03:58:12] [  trace] [celix_framework] [celix_scheduledEvent_process:181] Processing scheduled event unnamed for bundle id 0
[2023-11-03T03:58:12] [  debug] [celix_framework] [celix_framework_processScheduledEvents:1400] Removing processed one-shot scheduled event 'unnamed' (id=0) for bundle if 0.
=================================================================
==18763==ERROR: AddressSanitizer: heap-use-after-free on address 0x6110000c6d88 at pc 0x00010b353b8b bp 0x7ff7b4f77430 sp 0x7ff7b4f77428
READ of size 8 at 0x6110000c6d88 thread T0
    #0 0x10b353b8a in celix_scheduledEvent_getId+0x2a (test_framework:x86_64+0x1003cdb8a)
    #1 0x10b31ccef in celix_bundleContext_scheduleEvent+0x10f (test_framework:x86_64+0x100396cef)
    #2 0x10b2a2018 in ScheduledEventTestSuite_OnceShotEventTest_Test::TestBody()+0x2f8 (test_framework:x86_64+0x10031c018)
    #3 0x10b38eab0 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x160 (test_framework:x86_64+0x100408ab0)
    #4 0x10b38e6c1 in testing::Test::Run()+0x971 (test_framework:x86_64+0x1004086c1)
    #5 0x10b393586 in testing::TestInfo::Run()+0x9f6 (test_framework:x86_64+0x10040d586)
    #6 0x10b396606 in testing::TestSuite::Run()+0x346 (test_framework:x86_64+0x100410606)
    #7 0x10b3d8870 in testing::internal::UnitTestImpl::RunAllTests()+0x1420 (test_framework:x86_64+0x100452870)
    #8 0x10b3d6cec in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x15c (test_framework:x86_64+0x100450cec)
    #9 0x10b3d6a49 in testing::UnitTest::Run()+0x1a9 (test_framework:x86_64+0x100450a49)
    #10 0x10b424abf in main+0xef (test_framework:x86_64+0x10049eabf)
    #11 0x11716652d in start+0x1cd (dyld:x86_64+0x552d)

0x6110000c6d88 is located 8 bytes inside of 248-byte region [0x6110000c6d80,0x6110000c6e78)
freed by thread T401 here:
    #0 0x10bb39019 in wrap_free+0xa9 (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x4b019)
    #1 0x10b3216e8 in fw_eventDispatcher+0x498 (test_framework:x86_64+0x10039b6e8)
    #2 0x7ff8136794e0 in _pthread_start+0x7c (libsystem_pthread.dylib:x86_64+0x64e0)
    #3 0x7ff813674f6a in thread_start+0xe (libsystem_pthread.dylib:x86_64+0x1f6a)

previously allocated by thread T0 here:
    #0 0x10bb38ed0 in wrap_malloc+0xa0 (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x4aed0)
    #1 0x10b353619 in celix_scheduledEvent_create+0x39 (test_framework:x86_64+0x1003cd619)
    #2 0x10b32e7eb in celix_framework_scheduleEvent+0xcb (test_framework:x86_64+0x1003a87eb)
    #3 0x10b31ccef in celix_bundleContext_scheduleEvent+0x10f (test_framework:x86_64+0x100396cef)
    #4 0x10b2a2018 in ScheduledEventTestSuite_OnceShotEventTest_Test::TestBody()+0x2f8 (test_framework:x86_64+0x10031c018)
    #5 0x10b38eab0 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x160 (test_framework:x86_64+0x100408ab0)
    #6 0x10b38e6c1 in testing::Test::Run()+0x971 (test_framework:x86_64+0x1004086c1)
    #7 0x10b393586 in testing::TestInfo::Run()+0x9f6 (test_framework:x86_64+0x10040d586)
    #8 0x10b396606 in testing::TestSuite::Run()+0x346 (test_framework:x86_64+0x100410606)
    #9 0x10b3d8870 in testing::internal::UnitTestImpl::RunAllTests()+0x1420 (test_framework:x86_64+0x100452870)
    #10 0x10b3d6cec in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x15c (test_framework:x86_64+0x100450cec)
    #11 0x10b3d6a49 in testing::UnitTest::Run()+0x1a9 (test_framework:x86_64+0x100450a49)
    #12 0x10b424abf in main+0xef (test_framework:x86_64+0x10049eabf)
    #13 0x11716652d in start+0x1cd (dyld:x86_64+0x552d)

Thread T401 created by T0 here:
    #0 0x10bb3299c in wrap_pthread_create+0x5c (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x4499c)
    #1 0x10b790981 in celixThread_create+0x11 (libcelix_utils.3.dylib:x86_64+0x13981)
    #2 0x10b320c35 in fw_init+0x195 (test_framework:x86_64+0x10039ac35)
    #3 0x10b321f02 in framework_start+0x132 (test_framework:x86_64+0x10039bf02)
    #4 0x10b3452f0 in celix_frameworkFactory_createFramework+0x120 (test_framework:x86_64+0x1003bf2f0)
    #5 0x10afb5919 in celix::createFramework(celix::Properties const&)+0x49 (test_framework:x86_64+0x10002f919)
    #6 0x10b2cd06e in ScheduledEventTestSuite::ScheduledEventTestSuite()+0x3ae (test_framework:x86_64+0x10034706e)
    #7 0x10b2ccc6b in testing::internal::TestFactoryImpl<ScheduledEventTestSuite_OnceShotEventTest_Test>::CreateTest()+0x1b (test_framework:x86_64+0x100346c6b)
    #8 0x10b393a7c in testing::Test* testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test* (testing::internal::TestFactoryBase::*)(), char const*)+0x15c (test_framework:x86_64+0x10040da7c)
    #9 0x10b392d80 in testing::TestInfo::Run()+0x1f0 (test_framework:x86_64+0x10040cd80)
    #10 0x10b396606 in testing::TestSuite::Run()+0x346 (test_framework:x86_64+0x100410606)
    #11 0x10b3d8870 in testing::internal::UnitTestImpl::RunAllTests()+0x1420 (test_framework:x86_64+0x100452870)
    #12 0x10b3d6cec in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x15c (test_framework:x86_64+0x100450cec)
    #13 0x10b3d6a49 in testing::UnitTest::Run()+0x1a9 (test_framework:x86_64+0x100450a49)
    #14 0x10b424abf in main+0xef (test_framework:x86_64+0x10049eabf)
    #15 0x11716652d in start+0x1cd (dyld:x86_64+0x552d)

SUMMARY: AddressSanitizer: heap-use-after-free (test_framework:x86_64+0x1003cdb8a) in celix_scheduledEvent_getId+0x2a
Shadow bytes around the buggy address:
  0x1c2200018d60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c2200018d70: fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa
  0x1c2200018d80: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x1c2200018d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c2200018da0: 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x1c2200018db0: fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c2200018dc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x1c2200018dd0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x1c2200018de0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c2200018df0: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c2200018e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==18763==ABORTING

@PengZheng PengZheng self-assigned this Nov 3, 2023
@PengZheng PengZheng linked a pull request Nov 3, 2023 that will close this issue
@PengZheng PengZheng added component/framework Categorizes an issue or PR relevant to the framework. and removed stage:triaging labels Nov 3, 2023
PengZheng added a commit that referenced this issue Nov 4, 2023
#647: Fix use-after-free in celix_framework_scheduleEvent.
PengZheng added a commit that referenced this issue Nov 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/framework Categorizes an issue or PR relevant to the framework. kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant