Some tests failed on MSVC 2019 (x64 build) #478

Closed
phprus opened this issue Jul 4, 2021 · 49 comments

@phprus
Contributor

phprus commented Jul 4, 2021

Release build:

F:\tmp\tbb\oneTBB-2021.3.0\build\2019.64>cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=17 -G "NMake Makefiles" ..\..
-- The CXX compiler identification is MSVC 19.20.27519.0
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - not found
-- Found Threads: TRUE
-- HWLOC target HWLOC::hwloc_1_11 doesn't exist. The tbbbind target cannot be created
-- HWLOC target HWLOC::hwloc_2 doesn't exist. The tbbbind_2_0 target cannot be created
-- HWLOC target HWLOC::hwloc_2_4 doesn't exist. The tbbbind_2_4 target cannot be created
-- The C compiler identification is MSVC 19.20.27519.0
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: F:/tmp/tbb/oneTBB-2021.3.0/build/2019.64

Tests:

The following tests FAILED:
         11 - test_partitioner (Failed)
         12 - test_parallel_for (Failed)
         14 - test_parallel_reduce (Failed)
         21 - test_concurrent_vector (Failed)
         63 - test_task (Failed)
         80 - conformance_parallel_for (Failed)
         82 - conformance_parallel_reduce (Failed)

RelWithDebInfo build:

F:\tmp\tbb\oneTBB-2021.3.0\build\2019.64rd>cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_STANDARD=17 -G "NMake Makefiles" ..\..
-- The CXX compiler identification is MSVC 19.20.27519.0
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - not found
-- Found Threads: TRUE
-- HWLOC target HWLOC::hwloc_1_11 doesn't exist. The tbbbind target cannot be created
-- HWLOC target HWLOC::hwloc_2 doesn't exist. The tbbbind_2_0 target cannot be created
-- HWLOC target HWLOC::hwloc_2_4 doesn't exist. The tbbbind_2_4 target cannot be created
-- The C compiler identification is MSVC 19.20.27519.0
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.20.27508/bin/HostX64/x64/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: F:/tmp/tbb/oneTBB-2021.3.0/build/2019.64rd

F:\tmp\tbb\oneTBB-2021.3.0\build\2019.64rd>

Tests:

The following tests FAILED:
         11 - test_partitioner (Failed)
         12 - test_parallel_for (Failed)
         14 - test_parallel_reduce (Failed)
         21 - test_concurrent_vector (Failed)
         36 - test_eh_flow_graph (Failed)
         63 - test_task (Failed)
         80 - conformance_parallel_for (Failed)
         82 - conformance_parallel_reduce (Failed)

LOG:

test 11
        Start  11: test_partitioner

11: Test command: "test_partitioner" "--force-colors=1"
11: Test timeout computed to be: 10000000
11: Access violation
 11/132 Test  #11: test_partitioner .........................***Failed    0.04 sec
test 12
        Start  12: test_parallel_for

12: Test command: "test_parallel_for" "--force-colors=1"
12: Test timeout computed to be: 10000000
12: Access violation
 12/132 Test  #12: test_parallel_for ........................***Failed    3.68 sec

test 14
        Start  14: test_parallel_reduce

14: Test command: "test_parallel_reduce" "--force-colors=1"
14: Test timeout computed to be: 10000000
14: Access violation
 14/132 Test  #14: test_parallel_reduce .....................***Failed    0.20 sec


test 21
        Start  21: test_concurrent_vector

21: Test command: "test_concurrent_vector" "--force-colors=1"
21: Test timeout computed to be: 10000000
21: Access violation
 21/132 Test  #21: test_concurrent_vector ...................***Failed    0.88 sec


test 36
        Start  36: test_eh_flow_graph

36: Test command: "test_eh_flow_graph" "--force-colors=1"
36: Test timeout computed to be: 10000000
36: [doctest] doctest version is "2.3.5"
36: [doctest] run with "--help" for options
36: ===============================================================================
36: F:\tmp\tbb\oneTBB-2021.3.0\test\tbb\test_eh_flow_graph.cpp(2029):
36: TEST CASE:  Testing several threads
36:
36: F:\tmp\tbb\oneTBB-2021.3.0\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
36:   values: WARN( 1000000 <  1000000 )
36:   logged: input_node(1): Missed wakeup or machine is overloaded?
36:
.......
36: F:\tmp\tbb\oneTBB-2021.3.0\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
36:   values: WARN( 1000000 <  1000000 )
36:   logged: input_node(1): Missed wakeup or machine is overloaded?
36:
36: F:\tmp\tbb\oneTBB-2021.3.0\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout )
 is
 36/132 Test  #36: test_eh_flow_graph .......................***Failed   96.01 sec


test 63
        Start  63: test_task

63: Test command: "test_task" "--force-colors=1"
63: Test timeout computed to be: 10000000
63: [doctest] doctest version is "2.3.5"
63: [doctest] run with "--help" for options
63: Access violation
 63/132 Test  #63: test_task ................................***Failed    0.24 sec


test 80
        Start  80: conformance_parallel_for

80: Test command: "conformance_parallel_for" "--force-colors=1"
80: Test timeout computed to be: 10000000
80: Access violation
 80/132 Test  #80: conformance_parallel_for .................***Failed    0.06 sec


test 82
        Start  82: conformance_parallel_reduce

82: Test command: "conformance_parallel_reduce" "--force-colors=1"
82: Test timeout computed to be: 10000000
82: Access violation
 82/132 Test  #82: conformance_parallel_reduce ..............***Failed    0.07 sec
@phprus
Contributor Author

phprus commented Jul 5, 2021

conformance_parallel_for.exe call stack:

[Inline Frame] tbb12.dll!tbb::detail::d1::small_object_allocator::new_object(tbb::detail::d1::execution_data &) Line 55
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\detail\_small_object_pool.h(55)
tbb12.dll!tbb::detail::r1::spawn(tbb::detail::d1::task & t, tbb::detail::d1::task_group_context & ctx, unsigned short id) Line 57
    at F:\tmp\tbb\oneTBB-2021.3.0\src\tbb\task_dispatcher.cpp(57)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::spawn(tbb::detail::d1::task &) Line 192
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\detail\_task.h(192)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::linear_affinity_mode<tbb::detail::d1::static_partition_type>::spawn_task(tbb::detail::d1::task &) Line 386
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\partitioner.h(386)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const>::spawn_self(tbb::detail::d1::execution_data &) Line 146
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(146)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const>::offer_work_impl(tbb::detail::d1::execution_data &) Line 142
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(142)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const>::offer_work(tbb::detail::d0::proportional_split &) Line 124
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(124)
conformance_parallel_for.exe!tbb::detail::d1::partition_type_base<tbb::detail::d1::static_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const>,tbb::detail::d1::blocked_range<unsigned __int64>>(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const> & start, tbb::detail::d1::blocked_range<unsigned __int64> & range, tbb::detail::d1::execution_data & ed) Line 284
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\partitioner.h(284)
conformance_parallel_for.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const>::execute(tbb::detail::d1::execution_data & ed) Line 173
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(173)
tbb12.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all<0,tbb::detail::r1::external_waiter>(tbb::detail::d1::task * t, tbb::detail::r1::external_waiter & waiter) Line 321
    at F:\tmp\tbb\oneTBB-2021.3.0\src\tbb\task_dispatcher.h(321)
tbb12.dll!tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task * t, tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & w_ctx) Line 172
    at F:\tmp\tbb\oneTBB-2021.3.0\src\tbb\task_dispatcher.cpp(172)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::execute_and_wait(tbb::detail::d1::task &) Line 196
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\detail\_task.h(196)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const>::run(const tbb::detail::d1::blocked_range<unsigned __int64> &) Line 114
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(114)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>,tbb::detail::d1::static_partitioner const>::run(const tbb::detail::d1::blocked_range<unsigned __int64> &) Line 103
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(103)
conformance_parallel_for.exe!tbb::detail::d1::parallel_for<tbb::detail::d1::blocked_range<unsigned __int64>,tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64>>(const tbb::detail::d1::blocked_range<unsigned __int64> & range, const tbb::detail::d1::parallel_for_body_wrapper<void <lambda>(unsigned __int64),unsigned __int64> & body, const tbb::detail::d1::static_partitioner & partitioner) Line 255
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(255)
conformance_parallel_for.exe!tbb::detail::d1::parallel_for_impl<unsigned __int64,void <lambda>(unsigned __int64),tbb::detail::d1::static_partitioner const>(unsigned __int64 first, unsigned __int64 last, unsigned __int64 step, const _DOCTEST_ANON_FUNC_60::__l2::void <lambda>(unsigned __int64) & f, const tbb::detail::d1::static_partitioner & partitioner) Line 318
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(318)
conformance_parallel_for.exe!tbb::detail::d1::parallel_for<unsigned __int64,void <lambda>(unsigned __int64)>(unsigned __int64 first, unsigned __int64 last, const _DOCTEST_ANON_FUNC_60::__l2::void <lambda>(unsigned __int64) & f, const tbb::detail::d1::static_partitioner & partitioner) Line 374
    at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\parallel_for.h(374)
conformance_parallel_for.exe!_DOCTEST_ANON_FUNC_60() Line 272
    at F:\tmp\tbb\oneTBB-2021.3.0\test\conformance\conformance_parallel_for.cpp(272)
conformance_parallel_for.exe!doctest::Context::run() Line 5891
    at F:\tmp\tbb\oneTBB-2021.3.0\test\common\doctest.h(5891)
conformance_parallel_for.exe!main(int argc, char * * argv) Line 5971
    at F:\tmp\tbb\oneTBB-2021.3.0\test\common\doctest.h(5971)
[External Code]

@phprus
Contributor Author

phprus commented Jul 6, 2021

MSVC 19.21 - test_eh_flow_graph (Failed)
MSVC 19.22 - 100% tests passed

@Iliamish
Contributor

Iliamish commented Jul 7, 2021

Hello! Do you see these errors every time you run the tests, or do they occur sporadically?

@phprus
Contributor Author

phprus commented Jul 7, 2021

Windows 7 SP1 - every time,
Windows 10 - sporadically.

Pull request #480 does not solve this issue.

@Iliamish
Contributor

Iliamish commented Jul 8, 2021

Can you provide a more detailed description of your hardware?

@phprus
Contributor Author

phprus commented Jul 8, 2021

Windows 7 SP1:
Proxmox VE KVM Virtual machine,
2 Virtual Core CPU, 6 GB RAM.

Windows 10:
Bare metal,
16 GB RAM, 8 Core Intel CPU

@phprus
Contributor Author

phprus commented Jul 11, 2021

Commit b3fb839
MSVC 19.24 - test_eh_flow_graph (Failed)

@Iliamish
Contributor

We tried to reproduce the errors with tests built with the same parameters, but we did not get the result you describe. Could you please give a more detailed description of the hardware and software on which the tests are failing, as well as a complete set of steps for building and running the tests, so that we can reproduce the error as accurately as possible?

@phprus
Contributor Author

phprus commented Jul 13, 2021

Before testing, I completely reinstalled the system.

  1. Install new Windows 7 SP1 Professional x64 with all updates from Windows Update.
  2. Install Visual Studio 2019 Community Edition with all compiler versions.
  3. Run 19.20 x64 console tools (x64 Native Tools for VS 2019) (%comspec% /k "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat" -vcvars_ver=14.20).
  4. Check cl version:
cl /?
Microsoft (R) C/C++ Optimizing Compiler Version 19.20.27525 for x64
  5. Configure oneTBB-2021.3:
    cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=17 -G "NMake Makefiles" ..\..
    or
    cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_STANDARD=17 -G "NMake Makefiles" ..\..
  6. Build:
    nmake
  7. Run tests:
    ctest --verbose

On Windows 7 the tests fail every time.

Test hardware:
Proxmox Virtual Environment 5.4
CPU 4 x Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz

VM:
CPU: 2 Virtual Cores, 1 Socket.
RAM: 6 GB

@phprus
Contributor Author

phprus commented Jul 14, 2021

Windows 7 SP1 x64 without virtualization on CPU Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (4 core, 8 thread)

2021.3.0:

11 - test_partitioner (SEGFAULT)
12 - test_parallel_for (SEGFAULT)
14 - test_parallel_reduce (SEGFAULT)
21 - test_concurrent_vector (SEGFAULT)
63 - test_task (SEGFAULT)
80 - conformance_parallel_for (SEGFAULT)
82 - conformance_parallel_reduce (SEGFAULT)

commit 68e075c:

12 - test_partitioner (SEGFAULT)
13 - test_parallel_for (SEGFAULT)
15 - test_parallel_reduce (SEGFAULT)
22 - test_concurrent_vector (SEGFAULT)
25 - test_task_arena (SEGFAULT)
63 - test_task (SEGFAULT)
81 - conformance_parallel_for (SEGFAULT)
83 - conformance_parallel_reduce (SEGFAULT)

@Iliamish
Contributor

Hello, we tried to reproduce the errors with MSVC 19.21, the version available to us, but could not. Can you please update your MSVC compiler to the latest version and build the library in Debug mode? If the errors still reproduce, please send the most complete call stack you can.

@phprus
Contributor Author

phprus commented Jul 15, 2021

Hello!
Errors are not reproduced in Debug mode.
Please describe the environment in which you tried to reproduce the problem (Windows and MSVC versions).

@Iliamish
Contributor

Iliamish commented Jul 15, 2021

Windows 10, MSVC 19.21 and 19.29. We can't use Windows 7 because it is no longer supported and has outdated security protocols.

@Iliamish
Contributor

Have you tried the latest version of the compiler?

@phprus
Contributor Author

phprus commented Jul 15, 2021

Windows 7 screenshot:
The pointer to the virtual function table is NULL.

Removing alignas(task_alignment), or changing it to alignas(16), in the task definition:
https://github.com/oneapi-src/oneTBB/blob/v2021.3.0/include/oneapi/tbb/detail/_task.h#L219
fixes all tests in all my MSVC environments.

@alexey-katranov
Contributor

This seems weird: if alignment were the only issue, almost all the tests should fail, because every parallel construct uses tasks. What is the allocated Type?

auto p = new Type() is not exactly the same as new (allocated_object) Type(), because r1::allocate guarantees alignment to nfs_size (128 bytes), while a plain new uses the standard allocator, which should align to 16 bytes. Another thought: the failure is inside new, so maybe the vptr is not initialized yet and the failure occurs before the object is constructed.

MSVC 19.24 - test_eh_flow_graph (Failed)

Does this mean that the issue reproduces only with version 19.20 of the compiler and cannot be reproduced with other versions?

@phprus
Contributor Author

phprus commented Jul 16, 2021

I don't know why this helped. I have no idea. And I have not found similar errors on the Internet.

Type is task_proxy.

I know, but this change gave me the idea to check the alignments.

Before removing alignas, test_eh_flow_graph failed sporadically on MSVC 19.20, 19.21, and 19.24.
After removing it, all tests run without errors.
I'll check the rest of the MSVC versions a bit later.

The other tests (all except test_eh_flow_graph) failed every time on Windows 7 and occasionally on Windows 10: no more than 10 times per night of running all tests in a loop.

@alexey-katranov
Contributor

So it is somehow specific to task_proxy. Two guesses: either it loses its alignment, or it matters that it is a struct (not a class).
Can you please try to specify the alignment explicitly and change struct to class (does it really matter?) at mailbox:31, e.g.

-struct task_proxy : public d1::task {
+class alignas(d1::task_alignment) task_proxy : public d1::task {
+public:

@phprus
Contributor Author

phprus commented Jul 16, 2021

It does not work :(
No variant works as long as d1::task has alignas.

@phprus
Contributor Author

phprus commented Jul 16, 2021

On CPU Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (8 core)

Tests (msvc 19.20):
test_eh_algorithms
test_task_group
test_flow_graph_priorities
sporadically freeze.

In the virtualized environment with 2 virtual cores these tests pass.

@phprus
Contributor Author

phprus commented Jul 16, 2021

Commit 68e075c:

On CPU Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (8 core):

test_eh_flow_graph sporadically fails:

C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\build\1920.x64>msvc_19.20_cxx17_64_md_relwithdebinfo\test_eh_flow_graph.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\test\tbb\test_eh_flow_graph.cpp(2029):
TEST CASE:  Testing several threads

C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: input_node(1): Missed wakeup or machine is overloaded?

C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: input_node(1): Missed wakeup or machine is overloaded?
...

Tests (msvc 19.20):
test_task_group
test_flow_graph_priorities
sporadically freeze.

Test test_eh_algorithms sporadically freezes or prints:

C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\build\1920.x64>msvc_19.20_cxx17_64_md_relwithdebinfo\test_eh_algorithms.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\test\tbb\test_eh_algorithms.cpp(1577):
TEST CASE:  parallel_pipeline exception handling test #2

C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: Missed wakeup or machine is overloaded?

C:\tmp\oneTBB-68e075cbb96de2b92d1a95832754c24a07b31cc8\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: Missed wakeup or machine is overloaded?
...

I think this is a different problem, not related to alignas.

@alexey-katranov
Contributor

conformance_parallel_for.exe call stack:

[Inline Frame] tbb12.dll!tbb::detail::d1::small_object_allocator::new_object(tbb::detail::d1::execution_data &) Line 55
at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\detail\_small_object_pool.h(55)
tbb12.dll!tbb::detail::r1::spawn(tbb::detail::d1::task & t, tbb::detail::d1::task_group_context & ctx, unsigned short id) Line 57
at F:\tmp\tbb\oneTBB-2021.3.0\src\tbb\task_dispatcher.cpp(57)
[Inline Frame] conformance_parallel_for.exe!tbb::detail::d1::spawn(tbb::detail::d1::task &) Line 192
at F:\tmp\tbb\oneTBB-2021.3.0\include\oneapi\tbb\detail\_task.h(192)
...

It seems we are stuck. Would it be acceptable for you to create a dump file and share it with us? If so, can you share a dump of the failing conformance_parallel_for, built from the unmodified version of oneTBB in RelWithDebInfo mode? To create a dump file in Visual Studio: "Debug"->"Save Dump As...".

@phprus
Contributor Author

phprus commented Jul 16, 2021

Done.
Error: Unhandled exception at 0x000007FEEEA77CF4 (tbb12.dll) in conformance_parallel_for.exe: 0xC0000005: Access violation reading location 0x0000000000000000.
Dump: conformance_parallel_for.dmp.zip

@alexey-katranov
Contributor

Thank you for the dump.

At the failing address 0x000007FEEEA77CF4 I see an unaligned access with movaps, an instruction that requires a 16-byte aligned address. I do not know why the error message reports a null pointer access, but it is definitely an unaligned access:

000007FEEEA77CF4  movaps      xmmword ptr [r10+68h],xmm1
r10+68h : 0x0000000000693968 

Can you please share tbb12.pdb so that I can match the assembly to the source code that generates this instruction?

@phprus
Contributor Author

phprus commented Jul 17, 2021

tbb12.pdb.zip

@alexey-katranov
Contributor

The failure occurs during construction of the proxy task at task_dispatcher.cpp:L57.
I looked at the surrounding assembly, and it looks strange:

000007FEEEA77CD8  call        tbb::detail::r1::allocate (07FEEEA41145h)
000007FEEEA77CDD  mov         r10,rax                     // rax holds the allocated pointer; it is copied to r10
000007FEEEA77CE0  xorps       xmm0,xmm0                   // zero xmm0
000007FEEEA77CE3  lea         r9,[r10+48h]  
000007FEEEA77CE7  xorps       xmm1,xmm1                   // zero xmm1
000007FEEEA77CEA  movaps      xmmword ptr [rax+40h],xmm0  // why does it zero the beginning of the proxy task?
                                                          // that is not a problem in itself, but the compiler
                                                          // assumes rax+40h is 16-byte aligned
000007FEEEA77CEE  movaps      xmmword ptr [rax+50h],xmm0  // zero the next 16 bytes
000007FEEEA77CF2  xor         eax,eax  
000007FEEEA77CF4  movaps      xmmword ptr [r10+68h],xmm1  // it tries to zero 16 bytes at an offset of 18h from
                                                          // the previous access, so either rax+50h is not aligned
                                                          // or r10+68h is not aligned... the code is broken...
000007FEEEA77CF9  mov         qword ptr [r10+78h],rax  
000007FEEEA77CFD  lea         rax,[tbb::detail::r1::task_proxy::`vftable' (07FEEEA899F0h)]  
000007FEEEA77D04  mov         qword ptr [r10+8],rbx  
000007FEEEA77D08  movups      xmmword ptr [r10+10h],xmm0  // but here the compiler does not assume
                                                          // that the pointer is aligned...
000007FEEEA77D0D  movups      xmmword ptr [r10+20h],xmm0  
000007FEEEA77D12  movups      xmmword ptr [r10+30h],xmm0  
000007FEEEA77D17  mov         qword ptr [r10],rax 

I looked at the code generated with msvc 19.28: there the compiler does not assume any alignment:

00007FFE79C2BF71  call        tbb::detail::r1::allocate (07FFE79BF2DD8h)  
00007FFE79C2BF76  mov         r14,rax  
00007FFE79C2BF79  lea         rcx,[rsp+68h]  
00007FFE79C2BF7E  xorps       xmm0,xmm0  
00007FFE79C2BF81  lea         rsi,[r14+48h]  
00007FFE79C2BF85  xorps       xmm1,xmm1  
00007FFE79C2BF88  or          rbp,3  
00007FFE79C2BF8C  movups      xmmword ptr [rax+50h],xmm0  // unaligned access is used 
00007FFE79C2BF90  xor         eax,eax  
00007FFE79C2BF92  mov         qword ptr [rsp+68h],rbp  
00007FFE79C2BF97  movups      xmmword ptr [r14+68h],xmm1  // unaligned access is used 
00007FFE79C2BF9C  mov         qword ptr [r14+78h],rax  
00007FFE79C2BFA0  lea         rax,[tbb::detail::r1::task_proxy::`vftable' (07FFE79C3CEF8h)]  
00007FFE79C2BFA7  mov         qword ptr [r14+8],rbx  
00007FFE79C2BFAB  movups      xmmword ptr [r14+10h],xmm0  
00007FFE79C2BFB0  movups      xmmword ptr [r14+20h],xmm0  
00007FFE79C2BFB5  movups      xmmword ptr [r14+30h],xmm0  
00007FFE79C2BFBA  mov         qword ptr [r14],rax

This looks like a bug in msvc 19.20: it uses aligned accesses where it cannot guarantee the required alignment. Looking at the source code, I could not find any UB that could cause the compiler to misbehave.

@phprus, do you agree with the analysis that msvc 19.20 generates broken code and that it does not make sense to fix anything in that regard? (I am speaking only about the tests related to this issue; the hangs seem to be another story.)

@phprus
Contributor Author

phprus commented Jul 19, 2021

Very strange compiler behavior...
I agree, this is an msvc 19.20 bug. In my case, I will stop using this version of the compiler.

At the beginning of the week, I'll check the hanging tests again and try to find more information.

@phprus
Contributor Author

phprus commented Jul 20, 2021

Commit a080baf
Debug build
MSVC 19.21 x64
Test test_task_group.exe hangs:

Threads:

Not Flagged		10708	0	Worker Thread	ucrtbased.dll thread	tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P
Not Flagged	>	6960	0	Main Thread	Main Thread	tbb12_debug.dll!tbb::detail::r1::task_dispatcher::get_inbox_or_critical_task
Not Flagged		9912	0	Worker Thread	ucrtbased.dll thread	tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P
Not Flagged		10568	0	Worker Thread	ucrtbased.dll thread	tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P
Not Flagged		10412	0	Worker Thread	ucrtbased.dll thread	tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P
Not Flagged		10960	0	Worker Thread	ucrtbased.dll thread	tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P
Not Flagged		1884	0	Worker Thread	ucrtbased.dll thread	tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P
Not Flagged		9568	0	Worker Thread	ucrtbased.dll thread	tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P

Main Thread call stack:

tbb12_debug.dll!tbb::detail::r1::task_dispatcher::get_inbox_or_critical_task(tbb::detail::r1::execution_data_ext & ed, tbb::detail::r1::mail_inbox & inbox, __int64 isolation, bool critical_allowed) Line 126
	at ...\src\tbb\task_dispatcher.h(126)
tbb12_debug.dll!tbb::detail::r1::task_dispatcher::receive_or_steal_task<0,tbb::detail::r1::external_waiter>(tbb::detail::r1::thread_data & tls, tbb::detail::r1::execution_data_ext & ed, tbb::detail::r1::external_waiter & waiter, __int64 isolation, bool fifo_allowed, bool critical_allowed) Line 206
	at ...\src\tbb\task_dispatcher.h(206)
tbb12_debug.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all<0,tbb::detail::r1::external_waiter>(tbb::detail::d1::task * t, tbb::detail::r1::external_waiter & waiter) Line 349
	at ...\src\tbb\task_dispatcher.h(349)
tbb12_debug.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter>(tbb::detail::d1::task * t, tbb::detail::r1::external_waiter & waiter) Line 464
	at ...\src\tbb\task_dispatcher.h(464)
tbb12_debug.dll!tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task * t, tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & w_ctx) Line 168
	at ...\src\tbb\task_dispatcher.cpp(168)
tbb12_debug.dll!tbb::detail::r1::wait(tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & w_ctx) Line 127
	at ...\src\tbb\task_dispatcher.cpp(127)
test_task_group.exe!tbb::detail::d1::wait(tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & ctx) Line 198
	at ...\include\oneapi\tbb\detail\_task.h(198)
test_task_group.exe!tbb::detail::d1::task_group_base::wait::__l2::<lambda>() Line 584
	at ...\include\oneapi\tbb\task_group.h(584)
test_task_group.exe!tbb::detail::d0::try_call_proxy<void <lambda>(void)>::on_completion<void <lambda>(void)>(tbb::detail::d1::task_group_base::wait::__l2::void <lambda>(void) on_completion_body) Line 230
	at ...\include\oneapi\tbb\detail\_template_helpers.h(230)
test_task_group.exe!tbb::detail::d1::task_group_base::wait() Line 589
	at ...\include\oneapi\tbb\task_group.h(589)
test_task_group.exe!_DOCTEST_ANON_FUNC_122::__l2::<lambda>() Line 977
	at ...\test\tbb\test_task_group.cpp(977)
test_task_group.exe!tbb::detail::d1::task_arena_function<void <lambda>(void),void>::operator()() Line 68
	at ...\include\oneapi\tbb\task_arena.h(68)
tbb12_debug.dll!tbb::detail::r1::task_arena_impl::execute(tbb::detail::d1::task_arena_base & ta, tbb::detail::d1::delegate_base & d) Line 698
	at ...\src\tbb\arena.cpp(698)
tbb12_debug.dll!tbb::detail::r1::execute(tbb::detail::d1::task_arena_base & ta, tbb::detail::d1::delegate_base & d) Line 413
	at ...\src\tbb\arena.cpp(413)
test_task_group.exe!tbb::detail::d1::task_arena::execute_impl<void,void <lambda>(void)>(_DOCTEST_ANON_FUNC_122::__l2::void <lambda>(void) & f) Line 259
	at ...\include\oneapi\tbb\task_arena.h(259)
test_task_group.exe!tbb::detail::d1::task_arena::execute<void <lambda>(void)>(_DOCTEST_ANON_FUNC_122::__l2::void <lambda>(void) && f) Line 408
	at ...\include\oneapi\tbb\task_arena.h(408)
test_task_group.exe!_DOCTEST_ANON_FUNC_122() Line 978
	at ...\test\tbb\test_task_group.cpp(978)
test_task_group.exe!doctest::Context::run() Line 6587
	at ...\test\common\doctest.h(6587)
test_task_group.exe!main(int argc, char * * argv) Line 6671
	at ...\test\common\doctest.h(6671)
[External Code]

task_dispatcher::get_inbox_or_critical_task returns nullptr because the condition inbox.empty() is true.
...
and execution jumps to waiter.pause(slot); in the function task_dispatcher::receive_or_steal_task

Call stack of the other threads:

[External Code]
tbb12_debug.dll!tbb::detail::r1::binary_semaphore::P() Line 217
	at ...\src\tbb\semaphore.h(217)
tbb12_debug.dll!tbb::detail::r1::rml::internal::thread_monitor::commit_wait(tbb::detail::r1::rml::internal::thread_monitor::cookie & c) Line 242
	at ...\src\tbb\rml_thread_monitor.h(242)
tbb12_debug.dll!tbb::detail::r1::rml::private_worker::run() Line 274
	at ...\src\tbb\private_server.cpp(274)
tbb12_debug.dll!tbb::detail::r1::rml::private_worker::thread_routine(void * arg) Line 222
	at ...\src\tbb\private_server.cpp(222)
[External Code]

All MSVC versions up to 19.29 have the same issues.

@phprus
Contributor Author

phprus commented Jul 21, 2021

Test test_task_group.exe:

C:\tmp\oneTBB-a080baf9968482c3e90d3a337d35ee9221b1ab4d\build\1921x64>msvc_19.21_cxx17_64_md_relwithdebinfo\test_task_group.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
TBB Warning: The number of workers is currently limited to 7. The request for 8 workers is ignored. Further requests for more workers will be silently ignored until the limit changes.

===============================================================================
[doctest] test cases:    9 |    9 passed | 0 failed | 0 skipped
[doctest] assertions: 2661 | 2661 passed | 0 failed |
[doctest] Status: SUCCESS!

C:\tmp\oneTBB-a080baf9968482c3e90d3a337d35ee9221b1ab4d\build\1921x64>msvc_19.21_cxx17_64_md_relwithdebinfo\test_task_group.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options

If the test is successful, the warning "TBB Warning: The number of workers is currently limited to 7. The request for 8 workers is ignored. Further requests for more workers will be silently ignored until the limit changes." is displayed; if the test hangs, it is not.

Maybe there is a race condition in the initialization of the library?

@phprus
Contributor Author

phprus commented Jul 21, 2021

More examples of errors:

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\build\1929x64>msvc_19.29_cxx17_64_md_relwithdebinfo\test_eh_flow_graph.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\tbb\test_eh_flow_graph.cpp(2029):
TEST CASE:  Testing several threads

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: input_node(1): Missed wakeup or machine is overloaded?

...

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: input_node(1): Missed wakeup or machine is overloaded?
  
C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\build\1929x64>msvc_19.29_cxx17_64_md_relwithdebinfo\test_eh_algorithms.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\tbb\test_eh_algorithms.cpp(1583):
TEST CASE:  parallel_pipeline exception handling test #3

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: Missed wakeup or machine is overloaded?

===============================================================================
[doctest] test cases:    22 |    22 passed | 0 failed | 0 skipped
[doctest] assertions: 69219 | 69219 passed | 0 failed |
[doctest] Status: SUCCESS!

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\build\1929x64>msvc_19.29_cxx17_64_md_relwithdebinfo\test_eh_algorithms.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\tbb\test_eh_algorithms.cpp(462):
TEST CASE:  parallel_for and parallel_reduce exception handling test #4

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\tbb\test_eh_algorithms.cpp(374): FATAL ERROR: REQUIRE( g_CurExecuted <= minExecuted + (g_N
umThreads-1)*g_NumThreads/2 ) is NOT correct!
  values: REQUIRE( 1945 <= 1941 )
  logged: Too many tasks survived exception

===============================================================================
[doctest] test cases:    22 |    21 passed | 1 failed | 0 skipped
[doctest] assertions: 68650 | 68649 passed | 1 failed |
[doctest] Status: FAILURE!

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\build\1929x64>msvc_19.29_cxx17_64_md_relwithdebinfo\test_eh_algorithms.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\tbb\test_eh_algorithms.cpp(1583):
TEST CASE:  parallel_pipeline exception handling test #3

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: Missed wakeup or machine is overloaded?

...

C:\tmp\oneTBB-9cef57778c3cf01cad35b22bac7c90ed657a5442\test\common/exception_handling.h(241): WARNING: WARN( n < c_Timeout ) is NOT correct!
  values: WARN( 1000000 <  1000000 )
  logged: Missed wakeup or machine is overloaded?

@alexey-katranov
Contributor

Thank you for the logs. I will try to figure out what is going on.

@pavelkumbrasev
Contributor

@phprus can you please try to reproduce the hang in test_task_group with current master?

@phprus
Contributor Author

phprus commented Aug 2, 2021

Commit: 4df48f9

Hangs on a 2-core CPU:

  • test_collaborative_call_once.exe (>50% of runs)
  • test_flow_graph_priorities.exe (~0.5% of runs)
  • test_eh_flow_graph.exe (logged: Missed wakeup or machine is overloaded?)
  • test_eh_algorithms.exe (logged: Missed wakeup or machine is overloaded?)

test_task_group.exe works fine on a 2-core CPU (4000 runs), but hangs on an 8-core CPU (~10 times out of 50 runs).

Tomorrow I will run all tests on an 8-core CPU.

@phprus
Contributor Author

phprus commented Aug 3, 2021

Hangs on an 8-core CPU:

  • test_collaborative_call_once.exe (41 times out of 100 runs)
  • test_flow_graph_priorities.exe (43 / 100)
  • test_eh_flow_graph.exe (logged: Missed wakeup or machine is overloaded?)
  • test_eh_algorithms.exe (logged: Missed wakeup or machine is overloaded?)
  • test_task_group.exe (11 / 100)

@phprus
Contributor Author

phprus commented Aug 3, 2021

test_scheduler_mix.exe:

C:\tmp\oneTBB-4df48f98506570fc5a035d9b16366436c97825e9\build\1921x64>msvc_19.21_cxx17_64_md_relwithdebinfo\test_scheduler_mix.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
TBB Warning: The number of workers is currently limited to 7. The request for 8 workers is ignored. Further requests for more workers will be silently
 ignored until the limit changes.

TBB Warning: The number of workers is currently limited to 7. The request for 8 workers is ignored. Further requests for more workers will be silently
 ignored until the limit changes.

===============================================================================
[doctest] test cases: 1 | 1 passed | 0 failed | 0 skipped
[doctest] assertions: 0 | 0 passed | 0 failed |
[doctest] Status: SUCCESS!

C:\tmp\oneTBB-4df48f98506570fc5a035d9b16366436c97825e9\build\1921x64>msvc_19.21_cxx17_64_md_relwithdebinfo\test_scheduler_mix.exe
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
TBB Warning: The number of workers is currently limited to 7. The request for 8 workers is ignored. Further requests for more workers will be silently
 ignored until the limit changes.

===============================================================================
[doctest] test cases: 1 | 1 passed | 0 failed | 0 skipped
[doctest] assertions: 0 | 0 passed | 0 failed |
[doctest] Status: SUCCESS!

Some runs print two or more warnings, while others print only one.

Is this expected behavior?

@pavelkumbrasev
Contributor

Yes, there is a lot of randomness in this test.

@phprus
Contributor Author

phprus commented Aug 9, 2021

New infinite loop, now on Linux!

Commit: 1ecde27
Test: test_flow_graph_whitebox
OS: CentOS 7.9 or SLES 11sp3
2 core virtual CPU
Compiler: Clang 10 (https://releases.llvm.org/download.html#10.0.0, URL: https://github.com/llvm/llvm-project/releases/download/llvmorg-10.0.0/clang+llvm-10.0.0-x86_64-linux-sles11.3.tar.xz) with libc++.

Backtrace from SLES 11sp3 (Release build):

(gdb) info threads
  Id   Target Id         Frame
  3    Thread 0x2aaf0f305700 (LWP 28499) "test_flow_graph" 0x00002aaf0e454789 in syscall () from /lib64/libc.so.6
  2    Thread 0x2aaf0f506700 (LWP 28506) "test_flow_graph" 0x00002aaf0e440237 in sched_yield () from /lib64/libc.so.6
* 1    Thread 0x2aaf0e8ff5c0 (LWP 28498) "test_flow_graph" 0x00002aaf0e16efad in nanosleep () from /lib64/libpthread.so.0
(gdb) thread 3
[Switching to thread 3 (Thread 0x2aaf0f305700 (LWP 28499))]
#0  0x00002aaf0e454789 in syscall () from /lib64/libc.so.6
(gdb) bt
#0  0x00002aaf0e454789 in syscall () from /lib64/libc.so.6
#1  0x00002aaf0d4092e5 in tbb::detail::r1::rml::private_worker::thread_routine(void*) ()
   from /home/..../target_sles11sp3-clang10/onetbb_bin_release/libtbb.so.12
#2  0x00002aaf0e1677b6 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aaf0e457d6d in clone () from /lib64/libc.so.6
#4  0x0000000000000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 0x2aaf0f506700 (LWP 28506))]
#0  0x00002aaf0e440237 in sched_yield () from /lib64/libc.so.6
(gdb) bt
#0  0x00002aaf0e440237 in sched_yield () from /lib64/libc.so.6
#1  0x00002aaf0d410636 in tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task*, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&) ()
   from /home/..../target_sles11sp3-clang10/onetbb_bin_release/libtbb.so.12
#2  0x00000000002e3d05 in tbb::detail::d1::task_arena_function<tbb::detail::d1::graph::wait_for_all()::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const ()
#3  0x00002aaf0d3f8cd6 in tbb::detail::r1::execute(tbb::detail::d1::task_arena_base&, tbb::detail::d1::delegate_base&) ()
   from /home/..../target_sles11sp3-clang10/onetbb_bin_release/libtbb.so.12
#4  0x00000000002e3c95 in void tbb::detail::d0::try_call_proxy<tbb::detail::d1::graph::wait_for_all()::{lambda()#1}>::on_exception<tbb::detail::d1::graph::wait_for_all()::{lambda()#2}>(tbb::detail::d1::graph::wait_for_all()::{lambda()#2}) ()
#5  0x00000000002dd6f8 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, TestSequencerNode()::$_69> >(void*) ()
#6  0x00002aaf0e1677b6 in start_thread () from /lib64/libpthread.so.0
#7  0x00002aaf0e457d6d in clone () from /lib64/libc.so.6
#8  0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 0x2aaf0e8ff5c0 (LWP 28498))]
#0  0x00002aaf0e16efad in nanosleep () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00002aaf0e16efad in nanosleep () from /lib64/libpthread.so.0
#1  0x00002aaf0d85fd3b in std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) ()
   from /opt/llvm/clang+llvm-10.0.0-x86_64-linux-sles11.3/lib/libc++.so.1
#2  0x00000000002c2d04 in TestSequencerNode() ()
#3  0x00000000002b6536 in doctest::Context::run() ()
#4  0x00000000002b76cb in main ()
(gdb)

@phprus
Contributor Author

phprus commented Aug 11, 2021

Commit: 8584c45

cmake command ("-O3 -g"):

CC=clang CXX=clang++ CXXFLAGS="-g -stdlib=libc++" LDFLAGS="-Wl,--disable-new-dtags -fuse-ld=lld -Wl,--build-id=sha1" cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=17 ../..

Output before the hang:

40: Test command: /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/build/target_clang10-release/clang_10.0_cxx17_64_release/test_flow_graph_whitebox "--force-colors=1"
40: Test timeout computed to be: 10000000
40: TBB Warning: The number of workers is currently limited to 1. The request for 2 workers is ignored. Further requests for more workers will be silently ignored until the limit changes.
40:

Backtrace:

(gdb) info threads
  Id   Target Id         Frame
  3    Thread 0x2ba4b8c85700 (LWP 9306) "test_flow_graph" 0x00002ba4b7dd4789 in syscall () from /lib64/libc.so.6
  2    Thread 0x2ba4b8e86700 (LWP 9313) "test_flow_graph" 0x00002ba4b7dc0237 in sched_yield () from /lib64/libc.so.6
* 1    Thread 0x2ba4b827f5c0 (LWP 9305) "test_flow_graph" 0x00002ba4b7aeefad in nanosleep () from /lib64/libpthread.so.0
(gdb) thread 1
[Switching to thread 1 (Thread 0x2ba4b827f5c0 (LWP 9305))]
#0  0x00002ba4b7aeefad in nanosleep () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00002ba4b7aeefad in nanosleep () from /lib64/libpthread.so.0
#1  0x00002ba4b71dfd3b in std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) ()
   from /opt/llvm/clang+llvm-10.0.0-x86_64-linux-sles11.3/lib/libc++.so.1
#2  0x00000000002c2d94 in sleep_for<long long, std::__1::ratio<1, 1000000> > (__d=...) at /opt/llvm/clang+llvm-10.0.0-x86_64-linux-sles11.3/bin/../include/c++/v1/thread:385
#3  SpinWaitWhile<(lambda at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/common/spin_barrier.h:60:19)> (pred=...)
    at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/common/spin_barrier.h:49
#4  SpinWaitWhileCondition<int, (lambda at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/common/spin_barrier.h:67:38)> (location=..., comp=...)
    at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/common/spin_barrier.h:60
#5  SpinWaitWhileEq<int, int> (location=..., value=0) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/common/spin_barrier.h:67
#6  TestSequencerNode() () at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/tbb/test_flow_graph_whitebox.cpp:679
#7  0x00000000002b65c6 in doctest::Context::run() (this=<optimized out>) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/common/doctest.h:6586
#8  0x00000000002b775b in main (argc=<optimized out>, argv=<optimized out>) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/common/doctest.h:6671
(gdb) thread 2
[Switching to thread 2 (Thread 0x2ba4b8e86700 (LWP 9313))]
#0  0x00002ba4b7dc0237 in sched_yield () from /lib64/libc.so.6
(gdb) bt
#0  0x00002ba4b7dc0237 in sched_yield () from /lib64/libc.so.6
#1  0x00002ba4b6d90676 in tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task*, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&) (
    t=<optimized out>, wait_ctx=..., w_ctx=...) at /opt/llvm/clang+llvm-10.0.0-x86_64-linux-sles11.3/bin/../include/c++/v1/__threading_support:419
#2  0x00000000002e3d95 in wait (wait_ctx=..., ctx=...) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/oneapi/tbb/detail/_task.h:197
#3  operator() (this=0x0) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/tbb/../oneapi/tbb/detail/_flow_graph_impl.h:283
#4  tbb::detail::d1::task_arena_function<tbb::detail::d1::graph::wait_for_all()::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const (this=<optimized out>)
    at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/oneapi/tbb/detail/../task_arena.h:67
#5  0x00002ba4b6d78d06 in tbb::detail::r1::execute(tbb::detail::d1::task_arena_base&, tbb::detail::d1::delegate_base&) (ta=..., d=...)
    at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/arena.cpp:698
#6  0x00000000002e3d25 in execute_impl<void, (lambda at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/tbb/../oneapi/tbb/detail/_flow_graph_impl.h:282:36)> (this=0x341c70, f=...) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/oneapi/tbb/detail/../task_arena.h:258
#7  execute<(lambda at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/tbb/../oneapi/tbb/detail/_flow_graph_impl.h:282:36)> (this=0x341c70,
    f=<unknown type in /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/build/target_clang10-release/clang_10.0_cxx17_64_release/test_flow_graph_whitebox, CU 0x102, DIE 0xfda33>) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/oneapi/tbb/detail/../task_arena.h:407
#8  operator() (this=<optimized out>) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/tbb/../oneapi/tbb/detail/_flow_graph_impl.h:282
#9  tbb::detail::d0::try_call_proxy<tbb::detail::d1::graph::wait_for_all()::{lambda()#1}>::on_exception<tbb::detail::d1::graph::wait_for_all()::{lambda()#2}>(tbb::detail::d1::graph::wait_for_all()::{lambda()#2}) (this=<optimized out>, on_exception_body=...)
    at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/tbb/../oneapi/tbb/detail/_template_helpers.h:223
#10 0x00000000002dd788 in wait_for_all (this=0x7fff9ff673f8)
    at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/../../include/tbb/../oneapi/tbb/detail/_flow_graph_impl.h:286
#11 operator() (this=0x342b08) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/tbb/test_flow_graph_whitebox.cpp:676
#12 __invoke<(lambda at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/tbb/test_flow_graph_whitebox.cpp:674:19)> (
    __f=<unknown type in /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/build/target_clang10-release/clang_10.0_cxx17_64_release/test_flow_graph_whitebox, CU 0x102, DIE 0x809fc>) at /opt/llvm/clang+llvm-10.0.0-x86_64-linux-sles11.3/bin/../include/c++/v1/type_traits:3539
#13 __thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, (lambda at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/test/tbb/test_flow_graph_whitebox.cpp:674:19)> (__t=...) at /opt/llvm/clang+llvm-10.0.0-x86_64-linux-sles11.3/bin/../include/c++/v1/thread:273
#14 std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, TestSequencerNode()::$_69> >(void*) (
    __vp=0x342b00) at /opt/llvm/clang+llvm-10.0.0-x86_64-linux-sles11.3/bin/../include/c++/v1/thread:284
#15 0x00002ba4b7ae77b6 in start_thread () from /lib64/libpthread.so.0
#16 0x00002ba4b7dd7d6d in clone () from /lib64/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb) thread 3
[Switching to thread 3 (Thread 0x2ba4b8c85700 (LWP 9306))]
#0  0x00002ba4b7dd4789 in syscall () from /lib64/libc.so.6
(gdb) bt
#0  0x00002ba4b7dd4789 in syscall () from /lib64/libc.so.6
#1  0x00002ba4b6d89315 in futex_wait (futex=0x2ba4b848912c, comparand=2) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/semaphore.h:104
#2  P (this=0x2ba4b848912c) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/semaphore.h:290
#3  commit_wait (this=0x2ba4b8489120, c=...) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/rml_thread_monitor.h:242
#4  run (this=0x2ba4b8489100) at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/private_server.cpp:274
#5  tbb::detail::r1::rml::private_worker::thread_routine(void*) (arg=0x2ba4b8489100)
    at /home/phprus/tmp/oneTBB/oneTBB-8584c456346730a68ce763dced5a5e8d5f7c10a9/src/tbb/private_server.cpp:221
#6  0x00002ba4b7ae77b6 in start_thread () from /lib64/libpthread.so.0
#7  0x00002ba4b7dd7d6d in clone () from /lib64/libc.so.6
#8  0x0000000000000000 in ?? ()
(gdb)

@zheltovs
Contributor

zheltovs commented Aug 11, 2021

I have checked the test_flow_graph_whitebox test on the following configurations:

  • Compiler: clang 10.0 as you provided
  • CMake command: the same as you provided
  • OS:
    • CentOS 7.9 (gcc 4.8.5, Docker container)
    • RHEL 6.10 (gcc 4.4.7, 8 vCPU VM configuration only)
  • Hardware:
    • 10 vCPU VM (no CPU limitations for Docker container)
    • 8 vCPU VM (RHEL only)
    • 2 vCPU VM (no CPU limitations for Docker container)
    • 2 cores CPU limit
    • 2 cores CPU limit + 50% CPU time limit

I ran this test ~30-40 times for each configuration, and it always passes.

@alexey-katranov
Contributor

alexey-katranov commented Aug 12, 2021

@phprus, it seems we cannot reproduce the issue. Can you share a dump (and pdb files for the tbb library and the test application) of some of the hung tests? E.g. test_task_group in debug: #478 (comment)

@phprus
Contributor Author

phprus commented Aug 12, 2021

Unfortunately, I cannot give access to the servers where the error is reproduced :(
The test_flow_graph_whitebox error doesn't seem to reproduce under VMware ESXi virtualization on an idle server.

Dump and pdb for test_collaborative_call_once.exe (commit: de0109b) are attached: test_collaborative_call_once.zip

@phprus
Contributor Author

phprus commented Aug 22, 2021

Hello @alexey-katranov,

Commit: d2405e3
MSVC 19.21 x64
Test: test_task_group.exe

Dump and PDB files are attached:
test_task_group.zip

@alexey-katranov
Contributor

alexey-katranov commented Aug 27, 2021

I have a suspicion about the possible cause. Can you please try the following fix?

governor.cpp:135:

-#if USE_WINTHREAD
+#if __TBB_USE_WINAPI

misc.h:56:

-#if __TBB_WIN8UI_SUPPORT && (_WIN32_WINNT < 0x0A00)
+#if __TBB_USE_WINAPI

@phprus
Contributor Author

phprus commented Aug 28, 2021

src\tbb\governor.cpp(139): error C2440: 'return': cannot convert from 'PVOID' to 'uintptr_t'
src\tbb\governor.cpp(139): note: There is no context in which this conversion is possible

To fix this error, I replaced return pteb->StackBase; with return reinterpret_cast<std::uintptr_t>(pteb->StackBase);
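
The C2440 error above arises because C++ has no implicit conversion from a pointer type (PVOID is a void pointer) to an integer type. A minimal, platform-independent sketch of the same cast is shown here. The names fake_tib and stack_base_of are hypothetical and exist only for illustration; they stand in for the Windows thread information block and for the function in governor.cpp, whose surrounding code is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Windows thread information block.
// Only the field relevant to the error is modeled, so this builds anywhere.
struct fake_tib {
    void* StackBase;  // declared as PVOID in the Windows headers
};

// Return the stack base as an integer address.
// Writing `return tib.StackBase;` here would not compile, because a pointer
// does not convert implicitly to std::uintptr_t; an explicit cast is needed.
std::uintptr_t stack_base_of(const fake_tib& tib) {
    return reinterpret_cast<std::uintptr_t>(tib.StackBase);
}
```

The cast does not change the address; converting the result back to a pointer yields the original value.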

On an 8-core computer, the errors in the following tests are no longer reproducible:

  • test_task_group.exe
  • test_collaborative_call_once.exe
  • test_eh_flow_graph.exe
  • test_eh_algorithms.exe
  • test_flow_graph_priorities.exe

But why did this change solve the problem?
Is there an undocumented API bug on Windows 7 related to thread stack size?

@alexey-katranov
Contributor

But why did this change solve the problem?
Is there an undocumented API bug on Windows 7 related to thread stack size?

See #553. The root cause is that stack size is incorrectly calculated. In your environment it causes stack anchor overflow that leads to hangs.
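
The affected code is not quoted in this thread, so the following is only an illustrative sketch of how a miscalculated stack size can break a stack-depth guard. It is not oneTBB's implementation. The function names, the downward-growing stack, and the choice to reserve half of the stack as headroom are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the lowest address a thread may reach before it should
// stop taking on nested work. Assumes the stack grows toward lower addresses
// and that half of the stack is kept as headroom (an arbitrary choice).
std::uintptr_t stack_threshold(std::uintptr_t stack_base, std::size_t stack_size) {
    assert(stack_size <= stack_base);  // guard against unsigned wrap-around
    return stack_base - stack_size / 2;
}

// Hypothetical check: true while the current stack position is above the threshold.
bool has_stack_headroom(std::uintptr_t current, std::uintptr_t threshold) {
    return current > threshold;
}
```

Under these assumptions, a thread that is 64 KiB into a 1 MiB stack passes the check when the size is known correctly, but fails it if the size is wrongly reported as 64 KiB. A guard that always fails could keep a thread from ever picking up more work, which would look like a hang.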

@casparvl

casparvl commented Sep 27, 2021

I have a similar issue, but on a very different system: my test_eh_algorithms 'hangs' (well, it still shows high CPU usage, but doesn't seem to progress) and also prints "logged: Missed wakeup or machine is overloaded?".

I've built tbb 2021.2.0 with GCC 10.3.0 on a CentOS 8.4 machine with two AMD EPYC 7H12 64-core processors. The funny thing is, the same cluster has a few RHEL 8.2 nodes as well, with two AMD EPYC 7F32 8-Core processors. Since the build was done on a shared filesystem, I figured I'd also run the make test on one of the RHEL nodes (i.e. the configure and make were run on a CentOS machine, the make test on a RHEL machine). Surprisingly enough, this make test on the RHEL node just worked without problems.

Now, I see that #553 is very specific to Windows, but seeing as the symptoms I'm facing are very similar: do you think the root cause could somehow be similar? And do you have any suggestion for how this could be fixed?

@alexey-katranov
Contributor

@casparvl #553 is really specific to Windows, can you please try 2021.3 or master to check if the issue is still present (there was a big set of fixes)? If present, feel free to open a separate issue (because this one is really overloaded and I am not sure if we investigated all the issues in this thread :) )

@casparvl

casparvl commented Sep 27, 2021

Thanks @alexey-katranov for the quick response. I'll check 2021.3 and if it's still present, I'll submit a new (separate) issue :)

@phprus
Contributor Author

phprus commented Nov 19, 2021

Fixed by #553.
Thank you!

@phprus phprus closed this as completed Nov 19, 2021