DrMemory full mode uses too much memory, causing app alloc failures on Chrome unit_tests #792
From [email protected] on February 29, 2012 07:46:46 So, I sharded unit_tests 6 ways (very inconvenient to have 6 separate logs), and the problem is still there. My current belief is that dbghelp is using all of this memory. Only the first unit_test shard is failing, so I'm guessing that it starts with an empty symcache and populates it for the rest of the shards. It has to query drsyms about all of the system dlls loaded by the app, and the pdbs just sit around after that, eating lots of memory. We need some of them to do address to symbol lookup for symbolized stack traces, but most of them we can probably drop. Adding some cache management to drsyms would probably fix this issue. I'll measure the usage for just the "statically" linked (is there a better term for that?) system dlls by looking at the process in process explorer after reaching main and report back. Owner: [email protected] |
From [email protected] on February 29, 2012 07:48:14 On second though, I'm not sure this is the case: the bots don't have symbols installed, so they shouldn't be loading large pdbs other than unit_tests.pdb. |
From [email protected] on February 29, 2012 07:56:06 if it is drsyms, this issue covers pdb management: https://code.google.com/p/dynamorio/issues/detail?id=449 |
From [email protected] on February 29, 2012 15:06:25 I ran unit_tests.exe locally for a while. This was the heap usage from DrMemory's log: Heap usage: Altogether the overhead is less than 115 MB, and the delay max free size is ~20 MB, so drmemory's data structures don't seem to be the source of the problem. I used vmmap (from the same guys as Process Explorer) to get a visualization of memory usage by type, and got the attached screenshot some time through execution. Of the 1.7 GB working set, 1.3 is "private data" and only 36 MB is heap. All of the app's heap allocations should be detected as living in the heap, so the app is not using too much heap. DR's heap should be in the "private data" category, however we know that DrMemory's data structures only account for a small percentage of that. A large portion of the private data is RWX, which should only be DR's code cache. I can't get a breakdown by permission to try to estimate the code cache size, though. I am going to run again with DR -loglevel 1 to get some heap usage stats from there, and see if some privately loaded library is allocating behind our back or if DR itself is letting the code cache grow too large. Attachment: unit_tests_heap_usage.png |
From [email protected] on March 12, 2012 12:52:32 I ran drmemory on all of unit_tests.exe over the weekend with logging, but I forgot that I had disabled heap accounting temporarily. =/ I'll run again with that soon. I attached the final stats dump. The highlights are: That isn't enough to really hit the 2 GB limit, so that points to dbghelp's memory. Attachment: unit_tests_stats.txt |
From [email protected] on March 13, 2012 10:32:14 Yup, it's definitely dbghelp, but it's not a file mapping. Here's DR's heap breakdown: Updated-at-end Process (max is total of maxes) heap breakdown: 100 MB from the client is consistent with what DrMemory thinks it is using. The 860 MB of heap allocs from dbghelp is what's really clogging things up. I grepped for just the Lib Dup line in the logs, and you can see occasional 40 MB spikes in usage. Unfortunately, they don't seem to correspond with module load events, because there are none between the spikes in the logs. This suggests to me that dbghelp (or at least our version) has some overgrown cache data structure, or it is leaking memory when we ask it to symbolize. |
From [email protected] on March 13, 2012 15:05:56 I added an LRU cache to drsyms_windows.c for testing purposes. It seems that unloading symbols for modules is enough to prevent run away memory usage, but it still uses quite a bit. I attached a graph of "lib dup" heap size over "time" with a cache size of 5 modules. It's not really time since it's just whenever the heap stats got dumped, which I think is controlled by # of fragments translated. I spent a bit of time investigating and the major decreases in the memory usage correlate pretty closely to unloading syms for unit_tests.exe. This run was also done with a primed symcache, so pdbs are not needed when loading a module, they are only needed when symbolizing a report to be suppressed. The run hasn't finished yet, so I can't say how many stacks have been symbolized. I also don't know what impact this has on runtime, because this run has logging enabled. Before submitting I'd like to measure the impact on a relase build on a small shard of unit_tests.exe, since that binary tends to load many different dlls and generate many reports to suppress. Attachment: chart_1.png |
From [email protected] on March 14, 2012 11:44:00 I left unit_tests running over night, and I got the updated attached memory usage graph. My conclusion is that:
The simple LRU cache I wrote isn't really the right way to manage this, because memory associated with a module can grow over time. We were able to run with less usage over time because the LRU cache would unload each module every so often and prevent any one module from growing too huge. It's all coincidental, though, so I don't think it's a solid solution. I wanted to dig further into this by logging all the drsyms queries and replaying them with a separate exe that we can apply standard heap profiling tools to, but I realized that that's basically implementing issue #446 , callstack post-processing. I'm also just generally concerned about this excessive usage, and implementing issue #446 will allow us to run larger tests where we can't use dbghelp at all. Blockedon: 446 Attachment: chart_1 (1).png |
From [email protected] on March 20, 2012 11:18:58 one idea that is a targeted fix to unit_tests is to build unit_tests.exe /largeaddressaware |
From [email protected] on April 07, 2012 08:26:15 Between implementing issue #849 (the bit-level heuristic to mark the whole byte as defined) and issue #839 (whole module suppression), the situation is much better. I don't think we need to depend on postprocessing in order to bring down our memory usage. However, we're still hitting this "failed to create thread" error fairly frequently. 3/4 of the last 4 builds hit this error. For now, I'm assuming it's lack of address space causing the failure, but I need to repro locally and look at memory usage in order to confirm this. http://build.chromium.org/p/chromium.fyi/builders/Windows%20Tests%20%28DrMemory%20full%29/builds/1277 http://build.chromium.org/p/chromium.fyi/builders/Windows%20Tests%20%28DrMemory%20full%29/builds/1276 http://build.chromium.org/p/chromium.fyi/builders/Windows%20Tests%20%28DrMemory%20full%29/builds/1275 http://build.chromium.org/p/chromium.fyi/builders/Windows%20Tests%20%28DrMemory%20full%29/builds/1274 Blockedon: -446 |
From [email protected] on February 23, 2012 10:51:40
Example build: http://build.chromium.org/p/chromium.fyi/builders/Windows%20Tests%20%28DrMemory%20full%29/builds/936 See unit_tests shard 1 of 3 in particular: http://build.chromium.org/p/chromium.fyi/builders/Windows%20Tests%20%28DrMemory%20full%29/builds/936/steps/memory%20test%3A%20unit_1/logs/stdio We run for awhile and eventually we get some warnings about failed heap allocations:
[ RUN ] ExtensionServiceTest.InstallTheme
Dr.M Error #133: WARNING: heap allocation failed
Dr.M # 0 (0x22b80048)
Dr.M # 1 Pickle::Resize [base\pickle.cc:405]
Dr.M # 2 Pickle::BeginWrite [base\pickle.cc:383]
Dr.M # 3 Pickle::WriteBytes [base\pickle.cc:330]
Dr.M # 4 Pickle::WriteData [base\pickle.cc:324]
Dr.M # 5 IPC::ParamTraits::Write [content\public\common\common_param_traits.cc:538]
Dr.M # 6 IPC::WriteParam [ipc\ipc_message_utils.h:169]
Dr.M # 7 IPC::ParamTraits<Tuple2<SkBitmap,FilePath> >::Write [ipc\ipc_message_utils.h:897]
Dr.M # 8 IPC::WriteParam<Tuple2<SkBitmap,FilePath> > [ipc\ipc_message_utils.h:169]
Dr.M # 9 IPC::ParamTraits<std::vector<Tuple2<SkBitmap,FilePath>,std::allocator<Tuple2<SkBitmap,FilePath> > > >::Write [ipc\ipc_message_utils.h:541]
Dr.M #10 IPC::WriteParam<std::vector<Tuple2<SkBitmap,FilePath>,std::allocator<Tuple2<SkBitmap,FilePath> > > > [ipc\ipc_message_utils.h:169]
Dr.M #11 ExtensionUnpacker::DumpImagesToFile [chrome\common\extensions\extension_unpacker.cc:218]
Dr.M #12 SandboxedExtensionUnpacker::Start [chrome\browser\extensions\sandboxed_extension_unpacker.cc:275]
Dr.M #13 base::internal::RunnableAdapter<void (__thiscall SandboxedExtensionUnpacker::*)(void)>::Run [base\bind_internal.h:132]
Dr.M Note: @0:36:37.300 in thread 3652

The app crashes and does a self-backtrace afterwards.
We should repro this locally and gather statistics about what is taking the most memory. My best guesses are that we're generating and suppressing lots of reports, so we may be holding too many packed callstacks. Alternatively, maybe the delayed freelist is too long. Or dbghelp is using lots of memory.
For the moment I'm going to try splitting unit_tests into 6 shards to see if that lets us get further.
Original issue: http://code.google.com/p/drmemory/issues/detail?id=792