Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple memory/threading errors under sanitizers. #651

Closed
melg8 opened this issue Sep 5, 2023 · 3 comments
Closed

Multiple memory/threading errors under sanitizers. #651

melg8 opened this issue Sep 5, 2023 · 3 comments

Comments

@melg8
Copy link

melg8 commented Sep 5, 2023

Problem

When you compile project (at least) on linux with flags:

  1. -fsanitize=address
  2. -fsanitize=thread
    multiple tests from unit test suit fail with error reports from sanitizers.

How to reproduce

  1. Add next line to main CMakeLists.txt
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address")
  1. Compile project with tests enabled.
  2. Run tests suite.
  3. Observe errors - example from CI run of my fork.

Notes

  1. For address sanitizer it seems errors occur only when certain tests are run together. For example i found pair:
    CoroTest.do_action + BehaviorTreeFactory.WrongTreeName execution will enable WrongTreeName test failure with stack overflow error.
    From that pair i tried to make minimal reproducible test. It looks like this so far:
void TestLocal(const int32_t root_blackboard[4],
                               std::string main_tree_ID)
{
 static_assert(sizeof(main_tree_ID) == 32);
 std::string output_tree;
 throw 42;
}


void TestWrapper(const std::string& tree_name,
               int32_t blackboard [4] = {}) {
 TestLocal(blackboard, tree_name);
}

TEST(CoroTest, do_action)
{
 BT::NodeConfig node_config_;
 node_config_.blackboard = BT::Blackboard::create();
 BT::assignDefaultRemapping<SimpleCoroAction>(node_config_);
 SimpleCoroAction node(milliseconds(1), false, "Action", node_config_);
 node.executeTick();

 try {
   TestWrapper("Wrong Name");
 } catch (...) {

 }
}

I tried to remove all functions/parameters/types which doesn't affect error.
I'm not very familiar with this library, but i suspect that "address" issues are connected to coro library usage with exceptions thrown after that interaction which enables stack buffer overflow error.

-fsanitize=thread showing different places with multiple issues. I didn't try to look to much into it.

Expected

You can checkout info about sanitizers here they are integrated into most of modern c++ compilers and have quite good signal/noise ratio, so it would be preferable for any significant library to have test suit cleanly passing without any errors under sanitizers. And if any of them have false positives - they also can be added to exclusion file so tests still pass cleanly. Also its great idea to add sanitizers into CI run to ensure clean runs in future.

@facontidavide
Copy link
Collaborator

Thanks, I will address this this week, if I can

@facontidavide
Copy link
Collaborator

facontidavide commented Sep 6, 2023

I investigated and the warning in the thread sanitizer are a well known false positive with condition variables.
google/sanitizers#1259

The warning with the address sanitizer seems also a false positive, due to the fact that gtest was not compiled with the sanitizer.

For instance, this fails:

TEST(BehaviorTreeFactory, Issue7)
{
  const std::string xml_text_issue = R"(
<root BTCPP_format="4">
    <BehaviorTree ID="ReceiveGuest">
    </BehaviorTree>
</root> )";

  BehaviorTreeFactory factory;
  XMLParser parser(factory);
  EXPECT_THROW(parser.loadFromText(xml_text_issue), RuntimeError);
}

But the same code outside gtest works fine

@melg8
Copy link
Author

melg8 commented Sep 7, 2023

@facontidavide thanks for fast response.

Although address sanitizer indeed maybe reporting false positives, even in that case. I don't think gtest is the reason. I implemented standalone example from 2 of your tests which reproduce error under sanitizers while not being linked to gtest at all.
Source code:

// Changes start.

#include <iostream>

#define EXPECT_EQ(expected, actual) \
  if (expected != actual) { \
    std::cout << "Not equal, returning 121" << std::endl; \
    exit(121); \
  }

#define EXPECT_FALSE(condition) \
  if (condition) { \
    std::cout << "Condition is not false, returning 122" << std::endl; \
    exit(122); \
  } \

#define TEST(name, subname) \
void name##_##subname()

#define EXPECT_THROW(expr, exception_type) \
  try \
  { \
    expr; \
    std::cout << "No exception was thrown, returning 123" << std::endl; \
    exit(123); \
  } catch (const exception_type &) { \
    std::cout << "Expected exception was thrown" << std::endl; \
  } catch (...) { \
    std::cout << "Unexpected exception was thrown, returning 124" << std::endl; \
    exit(124); \
  }

// Changes end.


// gtest_coroutines.cpp start.
#include "behaviortree_cpp/decorators/timeout_node.h"
#include "behaviortree_cpp/behavior_tree.h"
#include <chrono>
#include <future>



using namespace std::chrono;

using Millisecond = std::chrono::milliseconds;
using Timepoint = std::chrono::time_point<std::chrono::steady_clock>;

class SimpleCoroAction : public BT::CoroActionNode
{
public:
  SimpleCoroAction(milliseconds timeout, bool will_fail, const std::string& node_name,
                   const BT::NodeConfig& config) :
    BT::CoroActionNode(node_name, config),
    will_fail_(will_fail),
    timeout_(timeout),
    start_time_(Timepoint::min())
  {}

  virtual void halt() override
  {
    std::cout << "Action was halted. doing cleanup here" << std::endl;
    start_time_ = Timepoint::min();
    halted_ = true;
    BT::CoroActionNode::halt();
  }

  bool wasHalted()
  {
    return halted_;
  }

  void setRequiredTime(Millisecond ms)
  {
    timeout_ = ms;
  }

protected:
  virtual BT::NodeStatus tick() override
  {
    std::cout << "Starting action " << std::endl;
    halted_ = false;

    if (start_time_ == Timepoint::min())
    {
      start_time_ = std::chrono::steady_clock::now();
    }

    while (std::chrono::steady_clock::now() < (start_time_ + timeout_))
    {
      setStatusRunningAndYield();
    }

    halted_ = false;

    std::cout << "Done" << std::endl;
    start_time_ = Timepoint::min();
    return (will_fail_ ? BT::NodeStatus::FAILURE : BT::NodeStatus::SUCCESS);
  }

public:
  bool will_fail_;

private:
  std::chrono::milliseconds timeout_;
  Timepoint start_time_;
  bool halted_;
};

BT::NodeStatus executeWhileRunning(BT::TreeNode& node)
{
  auto status = node.executeTick();
  while (status == BT::NodeStatus::RUNNING)
  {
    status = node.executeTick();
    std::this_thread::sleep_for(Millisecond(1));
  }
  return status;
}

TEST(CoroTest, do_action)
{
  BT::NodeConfig node_config_;
  node_config_.blackboard = BT::Blackboard::create();
  BT::assignDefaultRemapping<SimpleCoroAction>(node_config_);
  SimpleCoroAction node(milliseconds(200), false, "Action", node_config_);

  EXPECT_EQ(BT::NodeStatus::SUCCESS, executeWhileRunning(node));
  EXPECT_FALSE(node.wasHalted());

  EXPECT_EQ(BT::NodeStatus::SUCCESS, executeWhileRunning(node));
  EXPECT_FALSE(node.wasHalted());

  node.will_fail_ = true;
  EXPECT_EQ(BT::NodeStatus::FAILURE, executeWhileRunning(node));
  EXPECT_FALSE(node.wasHalted());

  EXPECT_EQ(BT::NodeStatus::FAILURE, executeWhileRunning(node));
  EXPECT_FALSE(node.wasHalted());
}

// gtest_coroutines.cpp end.

// gtest_factory.cpp start.
#include "behaviortree_cpp/xml_parsing.h"
#include "behaviortree_cpp/exceptions.h"

using namespace BT;

TEST(BehaviorTreeFactory, Issue7)
{
  const std::string xml_text_issue = R"(
<root BTCPP_format="4">
    <BehaviorTree ID="ReceiveGuest">
    </BehaviorTree>
</root> )";

  BehaviorTreeFactory factory;
  XMLParser parser(factory);


  EXPECT_THROW(parser.loadFromText(xml_text_issue), RuntimeError);
}

// gtest_factory.cpp end.

int main() {
  CoroTest_do_action();
  BehaviorTreeFactory_Issue7();
}

Were added as address_sanitizer_standalone.cpp into examples folder. CMakeLists.txt modified to compile this as separate example: CompileExample("address_sanitizer_standalone")
As you can see it doesnt use any includes or symbols from gtest at all.

When running it i see this error output:

[nix-shell:~/work/BehaviorTree.CPP/build]$ ./examples/address_sanitizer_standalone 
Starting action 
Done
Starting action 
Done
Starting action 
Done
Starting action 
Done
==22666==WARNING: ASan is ignoring requested __asan_handle_no_return: stack type: default top: 0x63000003e900; bottom 0x7ffc222c3000; size: 0xffffe303ddd7b900 (-31869230401280)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
=================================================================
==22666==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc222c4410 at pc 0x7f0d14ed98b1 bp 0x7ffc222c43e0 sp 0x7ffc222c3b90
WRITE of size 24 at 0x7ffc222c4410 thread T0
    #0 0x7f0d14ed98b0 in __interceptor_sigaltstack.part.0 (/usr/lib/libasan.so.8+0x638b0)
    #1 0x7f0d14f36e93 in __asan::PlatformUnpoisonStacks() (/usr/lib/libasan.so.8+0xc0e93)
    #2 0x7f0d14f3d34c in __asan_handle_no_return (/usr/lib/libasan.so.8+0xc734c)
    #3 0x7f0d14a4ee92 in BT::VerifyXML(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, BT::NodeType, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, BT::NodeType> > > const&)::{lambda(tinyxml2::XMLElement const*)#3}::operator()(tinyxml2::XMLElement const*) const [clone .cold] (/home/user/work/BehaviorTree.CPP/build/libbehaviortree_cpp.so+0x96e92)
    #4 0x7f0d14ccb0db in BT::VerifyXML(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, BT::NodeType, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, BT::NodeType> > > const&) (/home/user/work/BehaviorTree.CPP/build/libbehaviortree_cpp.so+0x3130db)
    #5 0x7f0d14ccd51c in BT::XMLParser::PImpl::loadDocImpl(tinyxml2::XMLDocument*, bool) [clone .localalias] (/home/user/work/BehaviorTree.CPP/build/libbehaviortree_cpp.so+0x31551c)
    #6 0x7f0d14cd14c6 in BT::XMLParser::loadFromText(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) (/home/user/work/BehaviorTree.CPP/build/libbehaviortree_cpp.so+0x3194c6)
    #7 0x40b127 in BehaviorTreeFactory_Issue7() (/home/user/work/BehaviorTree.CPP/build/examples/address_sanitizer_standalone+0x40b127)
    #8 0x406a4d in main (/home/user/work/BehaviorTree.CPP/build/examples/address_sanitizer_standalone+0x406a4d)
    #9 0x7f0d144ccacd in __libc_start_call_main (/usr/lib/libc.so.6+0x23acd)
    #10 0x7f0d144ccb88 in __libc_start_main_alias_2 (/usr/lib/libc.so.6+0x23b88)
    #11 0x406ab4 in _start (/home/user/work/BehaviorTree.CPP/build/examples/address_sanitizer_standalone+0x406ab4)

Address 0x7ffc222c4410 is a wild pointer inside of access range of size 0x000000000018.
SUMMARY: AddressSanitizer: stack-buffer-overflow (/usr/lib/libasan.so.8+0x638b0) in __interceptor_sigaltstack.part.0
Shadow bytes around the buggy address:
  0x100004450830: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x100004450840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100004450850: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100004450860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100004450870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x100004450880: f3 f3[f3]f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
  0x100004450890: 00 00 00 00 00 00 f1 f1 f1 f1 f8 f2 01 f2 01 f2
  0x1000044508a0: 01 f2 01 f2 01 f2 01 f2 01 f2 01 f2 01 f2 01 f2
  0x1000044508b0: 01 f2 01 f2 f8 f2 01 f2 01 f2 00 f2 f2 f2 00 f2
  0x1000044508c0: f2 f2 00 f2 f2 f2 00 f2 f2 f2 00 00 00 00 f2 f2
  0x1000044508d0: f2 f2 00 00 00 00 f2 f2 f2 f2 00 00 00 00 f2 f2
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==22666==ABORTING

While ldd shows this dependencies:

ldd ./examples/address_sanitizer_standalone 
        linux-vdso.so.1 (0x00007ffeaa946000)
        libasan.so.8 => /usr/lib/libasan.so.8 (0x00007fc140a4f000)
        libbehaviortree_cpp.so => /home/user/work/BehaviorTree.CPP/build/libbehaviortree_cpp.so (0x00007fc140591000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fc14036b000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007fc14028b000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fc14026a000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fc140082000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fc14007d000)
        librt.so.1 => /usr/lib/librt.so.1 (0x00007fc140078000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fc140073000)
        libzmq.so.5 => /nix/store/5nbd10268965s5v3jwx6p2m0gh9wca57-zeromq-4.3.4/lib/libzmq.so.5 (0x00007fc13ff8a000)
        libsqlite3.so.0 => /nix/store/n12a68qch9s85k6ni4m4r4xxr8lwys1i-sqlite-3.41.2/lib/libsqlite3.so.0 (0x00007fc13fe39000)
        /nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib/ld-linux-x86-64.so.2 => /nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib64/ld-linux-x86-64.so.2 (0x00007fc1410f5000)
        libsodium.so.23 => /nix/store/b19c558b7c21a72hxr6cq61jhzz3h8hq-libsodium-1.0.18/lib/libsodium.so.23 (0x00007fc13fdde000)
        libz.so.1 => /usr/lib/libz.so.1 (0x00007fc13fdbf000)

I would suggest to further investigate this case and not dismiss it only as false positive from gtest influence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants