Possible memory leak? #53

Closed
wants to merge 4 commits into from

qinwf (Contributor) commented Jan 28, 2016

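The maintainers of the R package repository recently announced that package code must be compiled and tested with GCC 5 and Clang 3.8. Recent versions of both compilers ship with UBSAN, memory-leak detection, and similar checks. When I ran the R interface, I found memory leaks.

Here I modified cppjieba's travis and CMakeLists.txt files and ran the build on travis, which also reports memory leaks.

travis results: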

Clang-3.6 with UBSAN https://travis-ci.org/qinwf/cppjieba/jobs/105404339,

GCC-5 with UBSAN https://travis-ci.org/qinwf/cppjieba/jobs/105404340.

The details of the leaks may not be visible in the output of make test on travis. Running ./test/test.run and ./load_test directly prints them when the program exits.
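For readers who want to see this kind of exit-time report outside cppjieba, here is a minimal sketch that is not taken from the project. The function name `build_table` and the build command in the comment are illustrative assumptions; with GCC or Clang on Linux x86_64, `-fsanitize=address` normally enables LeakSanitizer as well.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical demo, not cppjieba code.
// Assumed build line (GCC 5+ or Clang on Linux x86_64):
//   g++ -std=c++11 -g -fsanitize=address,undefined -fno-omit-frame-pointer demo.cpp
// LeakSanitizer looks for unreachable heap blocks when the process exits,
// which is why a test runner that only prints pass/fail can hide the details.

// Allocates a table on the heap and frees it only when `release` is true.
// Returns the element count so the caller can see the allocation happened.
std::size_t build_table(std::size_t n, bool release) {
  std::vector<double>* table = new std::vector<double>(n, 0.0);
  std::size_t size = table->size();
  if (release) {
    delete table;
  }
  // When `release` is false, the only pointer to the table goes out of
  // scope here and the block becomes unreachable: a leak.
  return size;
}
```

Calling `build_table(1000, false)` from `main` in a sanitizer build should produce a "detected memory leaks" report at exit similar in shape to the logs in this thread, while `build_table(1000, true)` should not.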

Test project /cppjieba/build
    Start 1: ./test/test.run
1/3 Test #1: ./test/test.run ..................***Failed   29.35 sec
    Start 2: ./load_test
2/3 Test #2: ./load_test ......................***Failed   46.60 sec
    Start 3: ./demo
3/3 Test #3: ./demo ...........................   Passed    3.57 sec

qinwf (Contributor, Author) commented Jan 28, 2016

Here are some results from running the binaries in docker.

root@8bb3b62db7ef:/cppjieba/build/test# ./test.run
Running main() from gtest_main.cc
[==========] Running 25 tests from 13 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from KeywordExtractorTest
[ RUN      ] KeywordExtractorTest.Test1
/cppjieba/include/cppjieba/HMMSegment.hpp:21:9: runtime error: load of value 176, which is not a valid value for type 'bool'
[       OK ] KeywordExtractorTest.Test1 (1712 ms)
[ RUN      ] KeywordExtractorTest.Test2
2016-01-28 12:48:28 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8, lines: 7
[       OK ] KeywordExtractorTest.Test2 (1594 ms)
[----------] 2 tests from KeywordExtractorTest (3306 ms total)

[----------] 2 tests from TrieTest
[ RUN      ] TrieTest.Empty
[       OK ] TrieTest.Empty (1 ms)
[ RUN      ] TrieTest.Construct
[       OK ] TrieTest.Construct (1 ms)
[----------] 2 tests from TrieTest (2 ms total)

[----------] 5 tests from DictTrieTest
[ RUN      ] DictTrieTest.NewAndDelete
[       OK ] DictTrieTest.NewAndDelete (611 ms)
[ RUN      ] DictTrieTest.Test1
[       OK ] DictTrieTest.Test1 (650 ms)
[ RUN      ] DictTrieTest.UserDict
2016-01-28 12:48:31 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8, lines: 7
[       OK ] DictTrieTest.UserDict (640 ms)
[ RUN      ] DictTrieTest.UserDictWithMaxWeight
2016-01-28 12:48:31 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8, lines: 7
[       OK ] DictTrieTest.UserDictWithMaxWeight (622 ms)
[ RUN      ] DictTrieTest.Dag
2016-01-28 12:48:32 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8, lines: 7
[       OK ] DictTrieTest.Dag (586 ms)
[----------] 5 tests from DictTrieTest (3109 ms total)

[----------] 5 tests from MixSegmentTest
[ RUN      ] MixSegmentTest.Test1
[       OK ] MixSegmentTest.Test1 (2315 ms)
[ RUN      ] MixSegmentTest.NoUserDict
[       OK ] MixSegmentTest.NoUserDict (650 ms)
[ RUN      ] MixSegmentTest.UserDict
2016-01-28 12:48:35 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../dict/user.dict.utf8, lines: 3
[       OK ] MixSegmentTest.UserDict (681 ms)
[ RUN      ] MixSegmentTest.TestUserDict
2016-01-28 12:48:36 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8, lines: 7
[       OK ] MixSegmentTest.TestUserDict (797 ms)
[ RUN      ] MixSegmentTest.TestMultiUserDict
2016-01-28 12:48:37 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8;../test/testdata/userdict.2.utf8, lines: 8
[       OK ] MixSegmentTest.TestMultiUserDict (758 ms)
[----------] 5 tests from MixSegmentTest (5202 ms total)

[----------] 1 test from MPSegmentTest
[ RUN      ] MPSegmentTest.Test1
[       OK ] MPSegmentTest.Test1 (2166 ms)
[----------] 1 test from MPSegmentTest (2166 ms total)

[----------] 1 test from HMMSegmentTest
[ RUN      ] HMMSegmentTest.Test1
/cppjieba/include/cppjieba/HMMSegment.hpp:21:9: runtime error: load of value 128, which is not a valid value for type 'bool'
[       OK ] HMMSegmentTest.Test1 (80 ms)
[----------] 1 test from HMMSegmentTest (80 ms total)

[----------] 1 test from FullSegment
[ RUN      ] FullSegment.Test1
[       OK ] FullSegment.Test1 (586 ms)
[----------] 1 test from FullSegment (587 ms total)

[----------] 2 tests from QuerySegment
[ RUN      ] QuerySegment.Test1
[       OK ] QuerySegment.Test1 (659 ms)
[ RUN      ] QuerySegment.Test2
2016-01-28 12:48:41 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8|../test/testdata/userdict.english, lines: 9
[       OK ] QuerySegment.Test2 (880 ms)
[----------] 2 tests from QuerySegment (1539 ms total)

[----------] 1 test from LevelSegmentTest
[ RUN      ] LevelSegmentTest.Test0
[       OK ] LevelSegmentTest.Test0 (675 ms)
[----------] 1 test from LevelSegmentTest (675 ms total)

[----------] 1 test from PosTaggerTest
[ RUN      ] PosTaggerTest.Test
[       OK ] PosTaggerTest.Test (2356 ms)
[----------] 1 test from PosTaggerTest (2356 ms total)

[----------] 1 test from PosTagger
[ RUN      ] PosTagger.TestUserDict
2016-01-28 12:48:46 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../test/testdata/userdict.utf8, lines: 7
[       OK ] PosTagger.TestUserDict (2315 ms)
[----------] 1 test from PosTagger (2315 ms total)

[----------] 2 tests from JiebaTest
[ RUN      ] JiebaTest.Test1
2016-01-28 12:48:48 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../dict/user.dict.utf8, lines: 3
[       OK ] JiebaTest.Test1 (2304 ms)
[ RUN      ] JiebaTest.InsertUserWord
2016-01-28 12:48:51 /cppjieba/include/cppjieba/DictTrie.hpp:121 INFO load userdicts ../dict/user.dict.utf8, lines: 3
[       OK ] JiebaTest.InsertUserWord (2261 ms)
[----------] 2 tests from JiebaTest (4565 ms total)

[----------] 1 test from PreFilterTest
[ RUN      ] PreFilterTest.Test1
[       OK ] PreFilterTest.Test1 (1 ms)
[----------] 1 test from PreFilterTest (1 ms total)

[----------] Global test environment tear-down
[==========] 25 tests from 13 test cases ran. (25904 ms total)
[  PASSED  ] 25 tests.

=================================================================
==8==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 10144512 byte(s) in 422688 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x474a3b in __gnu_cxx::new_allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> >::allocate(unsigned long, void const*) /usr/include/c++/5/ext/new_allocator.h:104
    #2 0x474a3b in std::allocator_traits<std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> > >::allocate(std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> >&, unsigned long) /usr/include/c++/5/bits/alloc_traits.h:360
    #3 0x474a3b in std::__detail::_Hash_node<std::pair<unsigned short const, double>, false>* std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> > >::_M_allocate_node<std::piecewise_construct_t const&, std::tuple<unsigned short const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<unsigned short const&>&&, std::tuple<>&&) /usr/include/c++/5/bits/hashtable_policy.h:1949
    #4 0x474a3b in std::__detail::_Map_base<unsigned short, std::pair<unsigned short const, double>, std::allocator<std::pair<unsigned short const, double> >, std::__detail::_Select1st, std::equal_to<unsigned short>, std::hash<unsigned short>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>, true>::operator[](unsigned short const&) /usr/include/c++/5/bits/hashtable_policy.h:597
    #5 0x474a3b in std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >::operator[](unsigned short const&) /usr/include/c++/5/bits/unordered_map.h:668
    #6 0x474a3b in cppjieba::HMMModel::LoadEmitProb(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >&) /cppjieba/include/cppjieba/HMMModel.hpp:111
    #7 0x7fb31f3f7161 in std::basic_filebuf<char, std::char_traits<char> >::open(char const*, std::_Ios_Openmode) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xec161)

Indirect leak of 3621504 byte(s) in 48 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x47106c in __gnu_cxx::new_allocator<std::__detail::_Hash_node_base*>::allocate(unsigned long, void const*) /usr/include/c++/5/ext/new_allocator.h:104
    #2 0x47106c in std::allocator_traits<std::allocator<std::__detail::_Hash_node_base*> >::allocate(std::allocator<std::__detail::_Hash_node_base*>&, unsigned long) /usr/include/c++/5/bits/alloc_traits.h:360
    #3 0x47106c in std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> > >::_M_allocate_buckets(unsigned long) /usr/include/c++/5/bits/hashtable_policy.h:1996
    #4 0x47106c in std::_Hashtable<unsigned short, std::pair<unsigned short const, double>, std::allocator<std::pair<unsigned short const, double> >, std::__detail::_Select1st, std::equal_to<unsigned short>, std::hash<unsigned short>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_allocate_buckets(unsigned long) /usr/include/c++/5/bits/hashtable.h:347
    #5 0x47106c in std::_Hashtable<unsigned short, std::pair<unsigned short const, double>, std::allocator<std::pair<unsigned short const, double> >, std::__detail::_Select1st, std::equal_to<unsigned short>, std::hash<unsigned short>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_rehash_aux(unsigned long, std::integral_constant<bool, true>) /usr/include/c++/5/bits/hashtable.h:1974

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x4deadf in MixSegmentTest_TestMultiUserDict_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:111

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x4dde0f in MixSegmentTest_TestUserDict_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:89

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x496886 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x496886 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x496886 in cppjieba::KeywordExtractor::KeywordExtractor(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/KeywordExtractor.hpp:19
    #4 0x430862 in KeywordExtractorTest_Test2_Test::TestBody() /cppjieba/test/unittest/keyword_extractor_test.cpp:31

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x4dcd61 in MixSegmentTest_NoUserDict_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:53

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x4e0050 in cppjieba::QuerySegment::QuerySegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long) /cppjieba/include/cppjieba/QuerySegment.hpp:21
    #4 0x4e0050 in QuerySegment_Test1_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:214

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x496886 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x496886 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x496886 in cppjieba::KeywordExtractor::KeywordExtractor(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/KeywordExtractor.hpp:19
    #4 0x42ffe2 in KeywordExtractorTest_Test1_Test::TestBody() /cppjieba/test/unittest/keyword_extractor_test.cpp:7

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x4dbe00 in MixSegmentTest_Test1_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:13

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x4df0b8 in cppjieba::QuerySegment::QuerySegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long) /cppjieba/include/cppjieba/QuerySegment.hpp:21
    #4 0x4df0b8 in QuerySegment_Test2_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:228

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x50f478 in cppjieba::PosTagger::PosTagger(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/PosTagger.hpp:20
    #4 0x50f478 in PosTaggerTest_Test_Test::TestBody() /cppjieba/test/unittest/pos_tagger_test.cpp:16

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x504d8f in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x4e06c0 in HMMSegmentTest_Test1_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:180

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x50fd37 in cppjieba::PosTagger::PosTagger(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/PosTagger.hpp:20
    #4 0x50fd37 in PosTagger_TestUserDict_Test::TestBody() /cppjieba/test/unittest/pos_tagger_test.cpp:26

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x507b20 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x507b20 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x4dd2fd in MixSegmentTest_UserDict_Test::TestBody() /cppjieba/test/unittest/segments_test.cpp:62

Indirect leak of 384 byte(s) in 12 object(s) allocated from:
    #0 0x7fb31f937c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x4457c2 in __gnu_cxx::new_allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>::allocate(unsigned long, void const*) /usr/include/c++/5/ext/new_allocator.h:104
    #2 0x4457c2 in std::allocator_traits<std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::allocate(std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>&, unsigned long) /usr/include/c++/5/bits/alloc_traits.h:360
    #3 0x4457c2 in std::_Vector_base<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*, std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::_M_allocate(unsigned long) /usr/include/c++/5/bits/stl_vector.h:170
    #4 0x4457c2 in void std::vector<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*, std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::_M_emplace_back_aux<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>(std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*&&) /usr/include/c++/5/bits/vector.tcc:412
    #5 0x4457c2 in void std::vector<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*, std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::emplace_back<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>(std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*&&) /usr/include/c++/5/bits/vector.tcc:101

SUMMARY: AddressSanitizer: 13771392 byte(s) leaked in 422760 allocation(s).

qinwf (Contributor, Author) commented Jan 28, 2016

root@8bb3b62db7ef:/cppjieba/build# ./load_test
process [100 %]
Cut: [32.625 seconds]time consumed.
/cppjieba/include/cppjieba/HMMSegment.hpp:21:9: runtime error: load of value 46, which is not a valid value for type 'bool'
process [100 %]
Extract: [4.053 seconds]time consumed.

=================================================================
==11==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 845376 byte(s) in 35224 object(s) allocated from:
    #0 0x7f9168041c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x4532bb in __gnu_cxx::new_allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> >::allocate(unsigned long, void const*) /usr/include/c++/5/ext/new_allocator.h:104
    #2 0x4532bb in std::allocator_traits<std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> > >::allocate(std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> >&, unsigned long) /usr/include/c++/5/bits/alloc_traits.h:360
    #3 0x4532bb in std::__detail::_Hash_node<std::pair<unsigned short const, double>, false>* std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> > >::_M_allocate_node<std::piecewise_construct_t const&, std::tuple<unsigned short const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<unsigned short const&>&&, std::tuple<>&&) /usr/include/c++/5/bits/hashtable_policy.h:1949
    #4 0x4532bb in std::__detail::_Map_base<unsigned short, std::pair<unsigned short const, double>, std::allocator<std::pair<unsigned short const, double> >, std::__detail::_Select1st, std::equal_to<unsigned short>, std::hash<unsigned short>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>, true>::operator[](unsigned short const&) /usr/include/c++/5/bits/hashtable_policy.h:597
    #5 0x4532bb in std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >::operator[](unsigned short const&) /usr/include/c++/5/bits/unordered_map.h:668
    #6 0x4532bb in cppjieba::HMMModel::LoadEmitProb(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >&) /cppjieba/include/cppjieba/HMMModel.hpp:111
    #7 0x7f9167d1e161 in std::basic_filebuf<char, std::char_traits<char> >::open(char const*, std::_Ios_Openmode) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xec161)

Indirect leak of 301792 byte(s) in 4 object(s) allocated from:
    #0 0x7f9168041c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x44f8ec in __gnu_cxx::new_allocator<std::__detail::_Hash_node_base*>::allocate(unsigned long, void const*) /usr/include/c++/5/ext/new_allocator.h:104
    #2 0x44f8ec in std::allocator_traits<std::allocator<std::__detail::_Hash_node_base*> >::allocate(std::allocator<std::__detail::_Hash_node_base*>&, unsigned long) /usr/include/c++/5/bits/alloc_traits.h:360
    #3 0x44f8ec in std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<unsigned short const, double>, false> > >::_M_allocate_buckets(unsigned long) /usr/include/c++/5/bits/hashtable_policy.h:1996
    #4 0x44f8ec in std::_Hashtable<unsigned short, std::pair<unsigned short const, double>, std::allocator<std::pair<unsigned short const, double> >, std::__detail::_Select1st, std::equal_to<unsigned short>, std::hash<unsigned short>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_allocate_buckets(unsigned long) /usr/include/c++/5/bits/hashtable.h:347
    #5 0x44f8ec in std::_Hashtable<unsigned short, std::pair<unsigned short const, double>, std::allocator<std::pair<unsigned short const, double> >, std::__detail::_Select1st, std::equal_to<unsigned short>, std::hash<unsigned short>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_rehash_aux(unsigned long, std::integral_constant<bool, true>) /usr/include/c++/5/bits/hashtable.h:1974

Indirect leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0x7f9168041c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x472428 in cppjieba::HMMSegment::HMMSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/HMMSegment.hpp:15
    #2 0x472428 in cppjieba::MixSegment::MixSegment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /cppjieba/include/cppjieba/MixSegment.hpp:15
    #3 0x41129c in Cut(unsigned long) /cppjieba/test/load_test.cpp:13

Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7f9168041c0a in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x94c0a)
    #1 0x426252 in __gnu_cxx::new_allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>::allocate(unsigned long, void const*) /usr/include/c++/5/ext/new_allocator.h:104
    #2 0x426252 in std::allocator_traits<std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::allocate(std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>&, unsigned long) /usr/include/c++/5/bits/alloc_traits.h:360
    #3 0x426252 in std::_Vector_base<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*, std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::_M_allocate(unsigned long) /usr/include/c++/5/bits/stl_vector.h:170
    #4 0x426252 in void std::vector<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*, std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::_M_emplace_back_aux<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>(std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*&&) /usr/include/c++/5/bits/vector.tcc:412
    #5 0x426252 in void std::vector<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*, std::allocator<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*> >::emplace_back<std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*>(std::unordered_map<unsigned short, double, std::hash<unsigned short>, std::equal_to<unsigned short>, std::allocator<std::pair<unsigned short const, double> > >*&&) /usr/include/c++/5/bits/vector.tcc:101

SUMMARY: AddressSanitizer: 1147616 byte(s) leaked in 35230 allocation(s).
root@8bb3b62db7ef:/cppjieba/build#

qinwf (Contributor, Author) commented Jan 28, 2016

The docker image used above is https://hub.docker.com/r/qinwf/cppjieba-ubsan , which may be fairly large.

Compiling with GCC 5 yourself should give similar results.

yanyiwu (Owner) commented Jan 28, 2016

@qinwf

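  1. Thank you very much for the feedback. I read the error messages and carefully reviewed the corresponding code. The messages are rather odd: they seem to point at standard-library code, and I could not spot a leak by reading the code, so this needs a deeper look. Please give me a little more time.
  2. A memory leak is a scary problem, but cppjieba is also a library I run in production. My view is that either there is no leak, or the leak happens once at load time rather than on every segmentation call; otherwise problems would have shown up in production very quickly. I will investigate this thoroughly.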
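The two sanitizer summaries posted earlier in this thread can be checked against the load-time hypothesis. The sketch below copies the SUMMARY figures from the `./load_test` and `./test.run` reports and tests whether one is an exact integer multiple of the other. The type and function names are made up for illustration.

```cpp
#include <cstdint>

// Totals copied from the two LeakSanitizer SUMMARY lines in this thread.
struct LeakSummary {
  std::uint64_t bytes;
  std::uint64_t allocations;
};

const LeakSummary kLoadTest = {1147616, 35230};     // ./load_test
const LeakSummary kUnitTests = {13771392, 422760};  // ./test.run

// Returns the integer factor k if `big` is exactly k times `small` in both
// bytes and allocation count, or 0 if no such factor exists.
std::uint64_t exact_multiple(const LeakSummary& big, const LeakSummary& small) {
  if (small.bytes == 0 || small.allocations == 0) return 0;
  if (big.bytes % small.bytes != 0) return 0;
  std::uint64_t k = big.bytes / small.bytes;
  return (big.allocations == k * small.allocations) ? k : 0;
}
```

`exact_multiple(kUnitTests, kLoadTest)` evaluates to 12, which matches the twelve 416-byte `HMMSegment` allocations listed in the `./test.run` report, each from a different test body. The `./load_test` report lists only one such allocation even though its Cut phase ran for about 32 seconds. That fits a leak tied to each model load rather than to each segmentation call, although the logs alone do not prove it.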

qinwf (Contributor, Author) commented Jan 28, 2016

I ran test.run under valgrind; the results are linked below. It's getting late, so I'm off to bed. Bye~

https://bitbucket.org/snippets/qinwf/rKzre

yanyiwu (Owner) commented Jan 28, 2016

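@qinwf I was just testing with valgrind as well and found the same problem. What a coincidence that you were testing at the same time. Thank you very much for helping track down the bug so late at night.
I have fixed this bug in master. As suspected, it is a one-time leak that happens at load time:

Used valgrind to check for memory leaks and located a bug in HMM model initialization that caused a leak. This leak is not critical, however,
because it only occurs when the dictionary is loaded, and dictionary loading normally runs only once, so it does not cause serious problems.

So users of older cppjieba versions need not worry.

The latest code now passes valgrind's memory check. I don't know whether the GCC 5 check you mentioned still reports problems, but I expect it does not. I don't have that environment here, so I hope you can update to the latest code and try it. I look forward to your feedback; if everything is fine, I will package a new v4.4.1 release. Thank you very much! If you are ever in Beijing, I would like to treat you to a meal as thanks.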
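The thread does not show the fix itself, so the following is only a sketch of one pattern that would produce both symptoms in the logs above. It assumes, without confirmation from the source, that a class held a raw model pointer plus a "should I delete it" flag, and that one constructor left the flag uninitialized. In that situation the destructor reads an indeterminate `bool`, which UBSAN can report as "not a valid value for type 'bool'", and it may skip the `delete`. All names here are hypothetical, not cppjieba's. One C++11 way to rule out that class of bug is to let `std::unique_ptr` carry the ownership instead of a flag.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a model that is expensive to load.
struct Model {
  static int live;  // number of Model objects currently alive
  explicit Model(const std::string& path) : path_(path) { ++live; }
  ~Model() { --live; }
  Model(const Model&) = delete;
  Model& operator=(const Model&) = delete;
  std::string path_;
};
int Model::live = 0;

// Hypothetical segmenter that either owns its model or borrows one.
class Segmenter {
 public:
  // Owning: creates a model; unique_ptr frees it in the destructor.
  explicit Segmenter(const std::string& path)
      : owned_(new Model(path)), model_(owned_.get()) {}

  // Borrowing: the caller keeps ownership, so owned_ stays empty.
  explicit Segmenter(const Model* shared) : owned_(), model_(shared) {}

  const Model* model() const { return model_; }

 private:
  std::unique_ptr<Model> owned_;  // empty when the model is borrowed
  const Model* model_;            // points at the owned or borrowed model
};
```

With this layout there is no flag to forget: an owning `Segmenter` releases its `Model` when it goes out of scope, and a borrowing one leaves the caller's `Model` untouched. `Model::live` exists only so the behaviour can be checked in a test.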

qinwf (Contributor, Author) commented Jan 29, 2016

travis no longer reports errors, so it should be fixed. Thanks!

qinwf closed this Jan 29, 2016