Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APP CRASH in sql_unittests on win7-cr bot due to incorrect symcache entries #1427

Open
derekbruening opened this issue Nov 28, 2014 · 3 comments

Comments

@derekbruening
Copy link
Contributor

From [email protected] on January 28, 2014 21:29:05

On our win{7,8}-cr bots on the DrMemory waterfall we have some failures.

I looked into one on win7-cr in sql_unittests:

$ export RUNNING_ON_VALGRIND=1
$ export DRMEMORY_COMMAND=/cygdrive/e/b/build/slave/win7-cr/build/unpacked/bin/drmemory.exe
$ ../../win7-cr-builder/build/src/tools/valgrind/chrome_tests.sh -t sql --tool drmemory
...
[ RUN ] SQLConnectionTest.Execute
Backtrace:
(No symbol) [0x01A7E069]
SHPathPrepareForWriteW [0x75DF5C81+1902]
Ordinal201 [0x75DA9461+2849]
Ordinal201 [0x75DAAB99+8793]
IsUserAnAdmin [0x75DF4DE5+2408]
SHFileOperationW [0x75DF97EF+231]
SHFileOperationW [0x75DF9717+15]
base::DeleteFileW 0x007429B5+437
...
Dr.M WARNING: application exited with abnormal code 0xc000001d

$ unpacked/bin/drmemory.exe -version
Dr. Memory version 1.6.1686 -- build 42

Yet running it without going through the script works fine:

$ unpacked/bin/drmemory.exe -light ../../win7-cr-builder/build/src/out/Debug/sql_unittests.exe --gtest_filter=SQLConnectionTest.Execute --single-process-tests
Dr.M Dr. Memory version 1.6.1686
Dr.M Running "../../win7-cr-builder/build/src/out/Debug/sql_unittests.exe --gtest_filter=SQLConnectionTest.Execute --single-process-tests"
Valgrind detected, switching to single process mode.
Pass --test-launcher-debug-launcher to valgrind the launcher itself.
Note: Google Test filter = SQLConnectionTest.Execute
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SQLConnectionTest
[ RUN ] SQLConnectionTest.Execute
[ OK ] SQLConnectionTest.Execute (1029 ms)
[----------] 1 test from SQLConnectionTest (1029 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1061 ms total)
[ PASSED ] 1 test.
Dr.M
Dr.M NO ERRORS FOUND:
Dr.M 0 unique, 0 total unaddressable access(es)
Dr.M 0 unique, 0 total invalid heap argument(s)
Dr.M 0 unique, 0 total GDI usage error(s)
Dr.M 0 unique, 0 total warning(s)
Dr.M ERRORS IGNORED:
Dr.M Details: E:\b\build\slave\win7-cr\build\unpacked\drmemory\logs\DrMemory-sql_unittests.exe.3180.000\results.txt

I can repro with:

$ unpacked/bin/drmemory.exe -symcache_dir C:/Users//AppData/LocalLow/drmemory.symcache -batch -no_count_leaks -no_leak_scan -light -no_callstack_use_top_fp_selectively -- ../../win7-cr-builder/build/src/out/Debug/sql_unittests.exe --gtest_filter=SQLConnectionTest.Execute --single-process-tests
Dr.M Dr. Memory version 1.6.1686
Dr.M Running "../../win7-cr-builder/build/src/out/Debug/sql_unittests.exe --gtest_filter=SQLConnectionTest.Execute --single-process-tests"
Note: Google Test filter = SQLConnectionTest.Execute
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SQLConnectionTest
[ RUN ] SQLConnectionTest.Execute
Backtrace:
(No symbol) [0x01C2DF56]
SHPathPrepareForWriteW [0x759C5C81+1902]
Ordinal201 [0x75979461+2849]
Ordinal201 [0x7597AB99+8793]
IsUserAnAdmin [0x759C4DE5+2408]
SHFileOperationW [0x759C97EF+231]
SHFileOperationW [0x759C9717+15]
base::DeleteFileW 0x007B29B5+437
...

-no_use_symcache => works fine

So it's a problem with a symcache file.

I made a copy:
$ rsync -av drmemory.symcache drmemory.BUGCOPY.symcache

Binary search => it's SHELL32.dll.txt

$ cat SHELL32.dll.txt
Dr. Memory symbol cache version 11
408,12873728,2139568001799028737,2139568001799028737,12906023,1339216345,12886016
0
operator delete,0x6b0da
std::_DebugHeapDelete<>,0x0
Concurrency::details::Security::EncodePointer,0x0
operator delete[] nothrow,0x0
operator delete[],0x6b0e0
operator delete nothrow,0x0
operator new nothrow,0x0
operator new,0x6b296
operator new[] nothrow,0x0
operator new[],0x6b090

$ ls -l ../drmemory.BUGCOPY.symcache/drmemory.symcache/SHELL32.dll.txt SHELL32.dll.txt
-rwx------ 1 408 Aug 22 18:26 ../drmemory.BUGCOPY.symcache/drmemory.symcache/SHELL32.dll.txt
-rwxr-xr-x 1 392 Jan 28 18:09 SHELL32.dll.txt
$ diff ../drmemory.BUGCOPY.symcache/drmemory.symcache/SHELL32.dll.txt SHELL32.dll.txt
2c2
< 408,12873728,2139568001799028737,2139568001799028737,12906023,1339216345,12886016


   392,12873728,2139568001799028737,2139568001799028737,12906023,1339216345,12886016

4c4
< operator delete,0x6b0da


operator delete,0x0
8c8
< operator delete[],0x6b0e0


operator delete[],0x0
11c11
< operator new,0x6b296


operator new,0x0
13c13
< operator new[],0x6b090


operator new[],0x0

So the bad one has entries for operators while the newly generated one does
not.

module load event: "SHELL32.dll" 0x75970000-0x765ba000 modid: 40 C:\Windows\SysWOW64\shell32.dll
module SHELL32.dll imports from msvc* so not searching inside it
module SHELL32.dll imports from msvc* so not searching inside it
WARNING: unable to load symbols for C:\Windows\SysWOW64\shell32.dll
sym lookup of shell32!operator new* in C:\Windows\SysWOW64\shell32.dll => 0 0x00000000
marking operator new as processed since regex didn't match
marking operator new[] as processed since regex didn't match
marking operator new nothrow as processed since regex didn't match
marking operator new[] nothrow as processed since regex didn't match
sym lookup of shell32!operator delete* in C:\Windows\SysWOW64\shell32.dll => 0 0x00000000
marking operator delete as processed since regex didn't match
marking operator delete[] as processed since regex didn't match
marking operator delete nothrow as processed since regex didn't match
marking operator delete[] nothrow as processed since regex didn't match
sym lookup of shell32!std::_DebugHeapDelete<*> in C:\Windows\SysWOW64\shell32.dll => 0 0x00000000
sym lookup of shell32!std::_DebugHeapDelete<> in C:\Windows\SysWOW64\shell32.dll => 5 0x00000000
sym lookup of shell32!Concurrency::details::Security::EncodePointer in C:\Windows\SysWOW64\shell32.dll => 0 0x00000000
post-module-load @0:00:01.903 in thread 2092

With symbols (on another machine, w/ that shell32.dll):

0:000> lm
start end module name
73800000 7444a000 shell32 (pdb symbols) c:\src\symbols\shell32.pdb\1489BCD8E80C441E91308A093631DE5E2\shell32.pdb
0:000> x shell32!operator new*
738ac806 shell32!operator new[](no parameter info)
738aa97d shell32!operator new ()
0:000> x shell32!operator delete*
738ac788 shell32!operator delete[](no parameter info)
738ac374 shell32!operator delete ()
738adf73 shell32!operator delete ()

So those values are incorrect.

Did we ever have a pdb on that bot? Why would it have found those operator
symbols at all?

Original issue: http://code.google.com/p/drmemory/issues/detail?id=1427

@derekbruening
Copy link
Contributor Author

From [email protected] on February 01, 2014 08:44:46

xref issue #1313 , issue #1428 , issue #1430

@derekbruening
Copy link
Contributor Author

From [email protected] on July 07, 2014 13:23:53

We have more crashes on the win7-cr bot due to symcache corruption. I'm documenting it in this case even though it's not sql_.

The Buildbot has detected a new failure on builder win7-cr while building DrMemory.
Full details are available at: http://build.chromium.org/p/client.drmemory/builders/win7-cr/builds/8846 These crashes are happening every time on this bot:

base:
[ RUN ] SafeSPrintfTest.Padding
Valgrind detected, switching to single process mode.
Pass --test-launcher-debug-launcher to valgrind the launcher itself.
Dr.M WARNING: application exited with abnormal code 0xc0000005

net:
[ RUN ] NetworkChangeNotifierWinTest.NetChangeWinBasic
<Application E:\b\build\slave\win7-cr-builder\build\src\out\Debug\net_unittests.exe (2396). Client exception at PC 0x05d88288. Program aborted.
0xc0000005 0x00000000 0x05d88288 0x05d88288 0x00000000 0x00000003

unit:
[ RUN ] HistoryURLProviderTest.EmptyVisits
Dr.M WARNING: application exited with abnormal code 0xc0000005

We have symcache error reports from r1970 :

Dr.M SYMCACHE ERROR: module GDI32.dll (base @0x754a0000) with entry _malloc_dbg (offset @0x000077b2) is not accessible!
Dr.M SYMCACHE ERROR: module GDI32.dll (base @0x754a0000) with entry operator new (offset @0x00006c42) is not accessible!
Dr.M SYMCACHE ERROR: module GDI32.dll (base @0x754a0000) with entry operator delete (offset @0x00006c48) is not accessible!
Dr.M SYMCACHE ERROR: module KERNEL32.dll (base @0x75670000) with entry _malloc_dbg (offset @0x00269ac8) is not accessible!
Dr.M SYMCACHE ERROR: module IMM32.dll (base @0x75810000) with entry _malloc_dbg (offset @0x00032852) is not accessible!
Dr.M SYMCACHE ERROR: module IMM32.dll (base @0x75810000) with entry operator new (offset @0x00031ad0) is not accessible!
Dr.M SYMCACHE ERROR: module IMM32.dll (base @0x75810000) with entry operator new[](offset @0x000319b0) is not accessible!
Dr.M SYMCACHE ERROR: module IMM32.dll (base @0x75810000) with entry operator delete (offset @0x00031ad6) is not accessible!
Dr.M SYMCACHE ERROR: module IMM32.dll (base @0x75810000) with entry operator delete[](offset @0x00031e82) is not accessible!
Dr.M SYMCACHE ERROR: module IMM32.dll (base @0x75810000) with entry operator new nothrow (offset @0x00031e90) is not accessible!
Dr.M SYMCACHE ERROR: module USER32.dll (base @0x75e40000) with entry _malloc_dbg (offset @0x00006ff8) is not accessible!
Dr.M SYMCACHE ERROR: module USER32.dll (base @0x75e40000) with entry operator new (offset @0x00006316) is not accessible!
Dr.M SYMCACHE ERROR: module USER32.dll (base @0x75e40000) with entry operator delete (offset @0x0000631c) is not accessible!
Dr.M SYMCACHE ERROR: module PSAPI.DLL (base @0x76320000) with entry _malloc_dbg (offset @0x00006ff8) is not accessible!
Dr.M SYMCACHE ERROR: module NSI.dll (base @0x77ca0000) with entry _malloc_dbg (offset @0x000175da) is not accessible! r1970 should lead to bogus symcache entries being replaced as it will fall back to a drsyms lookup. We can still have crashes from readable but still wrong entries, but shouldn't the error entries go away from one run to the next?

@derekbruening
Copy link
Contributor Author

From [email protected] on July 17, 2014 12:04:49

Xref more diagnostic steps taken for symcache corruption in issue #1465 *** TODO now wait for win7-cr bot to hit one of the diagnostics

based on where the printout is (at exit?) try and deduce further: or go
and ask for ldmp on dr_abort and wait again

hmm, win7-cr is failing on unit_tests light: http://build.chromium.org/p/client.drmemory/builders/win7-cr/builds/9096/steps/Chromium%20%27unit%27%20tests/logs/stdio [ RUN ] DefaultAppsTest.Install
[2428:3732:0715/142633:8503411:FATAL:first_run_internal_win.cc(155)] Check failed: migrated. : The parameter is incorrect. (0x57)
Backtrace:

MFYI bot works though: http://build.chromium.org/p/chromium.memory.fyi/builders/Windows%20Unit%20%28DrMemory%29/builds/17773/steps/memory%20test%3A%20unit/logs/stdio [----------] 1 test from DefaultAppsTest
[ RUN ] DefaultAppsTest.Install
[ OK ] DefaultAppsTest.Install (94 ms)
[----------] 1 test from DefaultAppsTest (94 ms total)

locally that test works for me, isolated or in full unit_ light run
I blew the symcache away gain after build #9105

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant