cache last callstack for further callstack walking performance improvement #1187
Comments
From [email protected] on April 17, 2013 08:21:24

**** TODO initial implementation: stats look good, but no perf win on cfrac

https://codereview.appspot.com/8789046/

cfrac full mode:
w/o issue #1186 find_next_fp cache or issue #1187 cstack cache:
w/ issue #1186 find_next_fp cache:
w/ issue #1187 cstack cache:
w/ both:

**** TODO slight speedup on perlbench

this is the copy from goobuntu, diffmail run:

/usr/bin/time /work/drmemory/git/build_x86_rel/bin/drmemory.pl -replace_malloc -dr_ops "-msgbox_mask 12 -prof_pcs" -pause_at_assert -dr /work/dr/git/exports -- /extsw/spec2006/v1.2-perlbench/perlbench_base.gcc43-32bit -I./lib diffmail.pl 4 800 10 17 19 300 > diffmail.4.800.10.17.19.300.out
1719.92user 5.65system 28:51.26elapsed 99%CPU (0avgtext+0avgdata 429740maxresident)k

this is vs the final issue #1186 run, pasting from below:

debug run:

again, fpra_cache_check shows up: can we speed that up?

**** TODO correctness: seems to have some problems? => bailing

The performance numbers show that while it has benefits in the absence of fpra_cache_check() shows up in profiling. For every frame on a new

Running drheapstat on cfrac w/ and w/o this feature and comparing perf summary:

perlbench diffmail:

I implemented it quickly, so it's possible it's worthwhile if done right:

Bailing for now. Leaving this issue open for either trying something like this cache again, or some other idea to improve callstack walking.
From [email protected] on April 16, 2013 19:53:10
Callstack walking for optimized code is complex and I could cite quite a few issue #s here on tweaks for both functionality and performance. This issue covers going a step beyond the work in issue #1186 and caching the full last callstack, with fp's, to try to avoid find_next_fp() even more. find_next_fp() still shows up on perlbench diffmail:
1833.59user 5.72system 30:45.28elapsed 99%CPU (0avgtext+0avgdata 429768maxresident)k
ITIMER distribution (182407):
0.0% of time in APPLICATION (1)
3.2% of time in INTERPRETER (5785)
0.3% of time in DISPATCH (628)
0.1% of time in SYSCALL HANDLER (103)
3.0% of time in INDIRECT BRANCH LOOKUP (5421)
43.8% of time in FRAGMENT CACHE (79891)
49.7% of time in UNKNOWN (90578)
RES-pcsamples.0.7276.html
897 get_shadow_table
1011 add_to_delay_list
1109 bitmapx2_set
1262 find_free_list_entry
1307 packed_callstack_hash
1438 safe_read
1662 module_lookup
2334 shadow_set_range
2458 address_to_frame
2999 rb_in_node
3925 print_callstack
5895 find_next_fp
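
To make the idea in the description concrete, here is a minimal sketch in C of caching the full last callstack, with fp's, so that a new walk can stop as soon as it reaches a frame it has already seen. This is not the Dr. Memory implementation: the stack is simulated with plain structs, and every name here (sim_frame_t, callstack_t, slow_next_fp, cache_lookup, walk_callstack, demo_second_walk_cost) is hypothetical. slow_next_fp() is only a stand-in for the expensive find_next_fp() step and simply counts how often it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FRAMES 16

/* Simulated stack frame: saved caller fp plus return address (assumed layout). */
typedef struct sim_frame {
    struct sim_frame *caller_fp;
    uintptr_t retaddr;
} sim_frame_t;

/* One recorded callstack entry: where the frame was and what it returned to. */
typedef struct {
    const sim_frame_t *fp;
    uintptr_t retaddr;
} cs_entry_t;

typedef struct {
    cs_entry_t frames[MAX_FRAMES];
    size_t count;
} callstack_t;

static unsigned slow_steps; /* number of expensive steps taken so far */

/* Hypothetical stand-in for the expensive scan for the next frame. */
static const sim_frame_t *slow_next_fp(const sim_frame_t *fp) {
    slow_steps++;
    return fp->caller_fp;
}

/* Return the index of fp in the cached callstack if the return address
 * stored at fp still matches, else -1. A real tool would need a safe
 * read of fp here; the simulation reads it directly. */
static int cache_lookup(const callstack_t *cache, const sim_frame_t *fp) {
    for (size_t i = 0; i < cache->count; i++) {
        if (cache->frames[i].fp == fp && cache->frames[i].retaddr == fp->retaddr)
            return (int)i;
    }
    return -1;
}

/* Walk from fp, reusing the tail of the cached callstack on a hit,
 * then store the result as the new cache. */
static void walk_callstack(const sim_frame_t *fp, callstack_t *cache, callstack_t *out) {
    assert(out != cache);
    out->count = 0;
    while (fp != NULL && out->count < MAX_FRAMES) {
        int hit = cache_lookup(cache, fp);
        if (hit >= 0) {
            /* Copy the cached outer frames instead of walking them again. */
            size_t n = cache->count - (size_t)hit;
            if (n > MAX_FRAMES - out->count)
                n = MAX_FRAMES - out->count;
            memcpy(&out->frames[out->count], &cache->frames[hit], n * sizeof(cs_entry_t));
            out->count += n;
            break;
        }
        out->frames[out->count].fp = fp;
        out->frames[out->count].retaddr = fp->retaddr;
        out->count++;
        fp = slow_next_fp(fp);
    }
    *cache = *out;
}

/* Walk main <- f <- g, then main <- f <- h, and return how many slow
 * steps the second walk needed. */
unsigned demo_second_walk_cost(void) {
    sim_frame_t f_main = { NULL, 0x1000 };
    sim_frame_t f_f = { &f_main, 0x2000 };
    sim_frame_t f_g = { &f_f, 0x3000 };
    sim_frame_t f_h = { &f_f, 0x2100 };
    callstack_t cache = {0}, out = {0};
    walk_callstack(&f_g, &cache, &out); /* cold walk fills the cache */
    slow_steps = 0;
    walk_callstack(&f_h, &cache, &out); /* shares f and main with the last walk */
    assert(out.count == 3 && out.frames[1].fp == &f_f);
    return slow_steps;
}
```

In this simulation the second walk needs one slow step (for the new frame h) instead of three. Note that the sketch only validates the frame where it rejoins the cached callstack, so a stale outer frame would go unnoticed. It also pays for a cache lookup on every frame visited, which is overhead of its own.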
Original issue: http://code.google.com/p/drmemory/issues/detail?id=1187