-
Notifications
You must be signed in to change notification settings - Fork 729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relocatable Debug Counters #1589
Conversation
@mstoodle Could you take a look at this and eclipse-omr/omr#2408 @fjeremic @ymanton I'm gonna need your help to get this going on P and Z. Eventually I'll try to figure out reason for the java code not functioning as expected, but I figure it's good to get the initial code reviewed. |
130b32f
to
f97d394
Compare
@mstoodle Could you review the design |
The problem of the JIT code getting messed up when running with all debug counters on X is due to a limitation of the JIT; with AOT, the JIT changes
As such, until this underlying problem is fixed, the cg.prologue debug counters won't work. |
This entire problem is fixed by implementing eclipse-omr/omr#397 on x86 and Power. We had the exact same problem on Z and hence why we redesigned the codegen part of the debug counter infrastructure. @andrewcraik / @0dvictor FYI. |
void mapKeyToCounterName(const char *key, const char *counterName); | ||
const char *getCounterName(const char *key); | ||
uint32_t calculateSizeOfCounterNames(uint32_t &numDebugCounterNames); | ||
DebugCounterNameMap &getDebugCounterNameMap() { return _debugCounterNameMap; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These new APIs should have Doxygen documentation, unless this is an extensible class and they've been already documented in OMR. Similarly for the new field added as well.
ep2 = cursor+8; // bcIndex | ||
ep3 = cursor+16; // keyOfNameString | ||
ep4 = cursor+24; // delta | ||
ep5 = cursor+40; // staticDelta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we not use offsetof(TR_RelocationRecordDebugCounterBinaryTemplate, _staticDelta)
here and in nearby places? This seems not very robust to changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems not very robust to changes.
Yeah, the entire file follows this paradigm. At some point when I get time I want to clean up the AOT store-to-scc code (the load-from-scc code is already properly refactored). For now though, I'll do what you suggest with this bit of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh I'm not easily able to change the code to use offsetof(TR_RelocationRecordDebugCounterBinaryTemplate, _staticDelta)
; I'd have to do something like cursor+offsetof(TR_RelocationRecordDebugCounterBinaryTemplate, _staticDelta)-sizeof(TR_RelocationRecordBinaryTemplate)
which is also ugly and lacks robustness in a diff way (since changing the inheritance graph would cause problems again). I'll leave it as is for now.
Reminder to self: update commit when #1682 gets merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks mostly fine, and I think I'm ok to store the counter names alongside the meta data. I just don't really like the hoops to use the counter names.
TR::Compilation::DebugCounterNameMap::iterator entry = self()->getDebugCounterNameMap().find(key); | ||
if (entry != self()->getDebugCounterNameMap().end()) | ||
{ | ||
return entry->second; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will comment once on indentation...
|
||
if (!_comp->getCounterName(key)) | ||
{ | ||
// Need to allocate persistent memory since the metadata can be reclaimed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it reclaimed or just disclaimed? the entire shared classes cache should be mapped into the address space, so I would have guessed you could just use pointers to the names stored in the meta data.
In that case, it would be better to store the references as offsets from the beginning of the shared classes cache, so then you can easily convert the offset in the relocation data into a pointer (to a spot in the meta data holding the name) by just adding the cache address (we do that in a bunch of other places e.g. class chains).
That would avoid this extra persistent memory allocation and the construction of the map (on both sides).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it reclaimed or just disclaimed?
The metadata that gets allocated in the data cache gets reclaimed, ie, freed back into the data cache for reuse.
the entire shared classes cache should be mapped into the address space, so I would have guessed you could just use pointers to the names stored in the meta data.
In that case, it would be better to store the references as offsets from the beginning of the shared classes cache, so then you can easily convert the offset in the relocation data into a pointer (to a spot in the meta data holding the name) by just adding the cache address (we do that in a bunch of other places e.g. class chains).
At AOT compile, when we store all the relocation records to the buffer, the metadata hasn't be created yet (I believe), so the relocation record can't know the offset at which the string will be stored. That's why I used the string ptr as both the key and the value during the AOT compile step.
However, what you're suggesting does work at AOT load time. since I can calculate the offset into the memory mapped shared cache from the metadata that gets memcpy'd into data cache allocated memory. That way I don't need to allocate the strings into persistent memory.
All that said, I just realized that there's a problem with the current implementation. TR::DebugCounterGroup::findCounter
finds the counter from a hashtable that's hashed via a const char *
. For counters that are created during a compile where the name is dynamically generated, my current implementation is fine. However, for counters who's names are known at compile time (ie stored in the BSO section of the dll), my implementation breaks down; this is because if the counter is enabled for multiple methods, the same string will get stored in multiple metadatas, and each of those will result in a new counter created at relo time (because they will have a different const char *
ptr as the key in the counter hash). I don't think it's a massive functional problem, the accumulation will just not be truly accumulative and instead will result in the developer having to add up stuff manually which is annoying.
I'm gonna have to think about how to get around this issue without dealing directly with the SCC; I'm reluctant to deal with the SCC directly because, as far as I could tell, the only way to store data to it was to hash it with a key, and generating a unique key involved (at least looking at how classes in the class chain are stored) taking the pointer and converting it to an ASCII string. If there's an easier way to deal with the SCC I'm more than happy to look into it.
@mstoodle Made the requested changes. I also changed the code to store the debug counter name strings directly to the SCC instead of the per method metadata. Unfortunately, this only partially solves the problem I mentioned in #1589 (comment) . Now if there's a counter name that sits in the BSO section of the jit dll, instead of seeing that counter for each AOT'd method + all JIT'd methods, there will be one counter for all AOT loaded methods and one counter for all JIT'd methods. Actually, if in the AOT load run we do an AOT compile, then on the subsequent run, there will be three counters instead of 1. There really isn't a good way to solve this issue, except creating a mapping between the string in the BSO and a copy that can be persisted... |
@@ -455,7 +455,8 @@ TR_J9SharedCache::rememberClass(J9Class *clazz, bool create) | |||
return NULL; | |||
} | |||
|
|||
char key[17]; // longest possible key length is way less than 16 digits | |||
const uint32_t bufferLength = 17; // longest possible key length is way less than 16 digits | |||
char key[bufferLength]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm should I change this back? I had modified this line when I thought I had to change the number of arguments to createClassKey
but as it turns out, I didn't need to. I guess I forgot to revert this.
After talking with @jdmpapin , it looks like the Debug Counter infra finds out if a counter already exists by checking in a hash table, which is special cased for |
@mstoodle Made all the requested changes. |
@dsouzai there's still a copyright that needs to be updated in J9SharedCache.cpp :) . |
@@ -833,6 +864,9 @@ uint32_t J9::Power::AheadOfTimeCompile::_relocationTargetTypeToHeaderSizeMap[TR_ | |||
32, // TR_VirtualRamMethodConst = 53 | |||
40, // TR_InlinedInterfaceMethod = 54 | |||
40, // TR_InlinedVirtualMethod = 55 | |||
0, // TR_NativeMethodAbsolute = 56 | |||
0, // TR_NativeMethodRelative = 57 | |||
56, // TR_DebugCounter = 55 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be = 58
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow good catch
Changes made :) |
@mstoodle I updated the "rememberDebugCounterName" function to use I've verified the new code on x86, I need to verify it on P and Z, but I don't anticipate any issues. |
Verified my changes on P and Z. |
Signed-off-by Check failed because of |
I'll rebase and re-test this once the OMR changes get to openj9-omr |
@andrewcraik I had been paying more attention to the dependent omr changes, but since I've now merged those, will now focus more on this one...Thanks for the ping. |
Signed-off-by: Irwin D'Souza <[email protected]>
Signed-off-by: Irwin D'Souza <[email protected]>
@mstoodle rebased changes look good (from a Travis CI pov). Do you mind now taking a look and seeing if everything from the previous reviews looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only 2 issues stood out to me:
- the first UDATA references in the codegen directory
- doesn't seem like it would be hard to at least add the presence of and the ability to create the DebugCounter relocation record in the ARM code (even though nothing may generate such relocations yet on ARM, at least the underlying infra would be there)
|
||
TR::DebugCounterReloData *counterReloData = counter->getReloData(); | ||
|
||
UDATA offset = fej9->sharedCache()->rememberDebugCounterName(counter->getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer this to be a JIT data type rather than UDATA, just for consistency. I know it doesn't architecturally matter any more now that we're in the same repo, but I think the consistency is still helpful until we get around to unifying all our types.
Signed-off-by: Irwin D'Souza <[email protected]>
Signed-off-by: Irwin D'Souza <[email protected]>
Signed-off-by: Irwin D'Souza <[email protected]>
Signed-off-by: Irwin D'Souza <[email protected]>
@mstoodle made the requested changes |
My personal builds/tests all passed. |
@mstoodle Does everything look ok to you? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good enough to go in, to me. Duplicating the code 3 times is in character for these files, sadly, though taking advantage of a function implemented in the base class might have been better.
@@ -69,6 +69,9 @@ class TR_J9SharedCache : public TR_SharedCache | |||
|
|||
UDATA *rememberClass(J9Class *clazz, bool create=true); | |||
|
|||
UDATA rememberDebugCounterName(const char *name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would have been more correct to use uintptrj_t here rather than UDATA and do the cast inside these functions, but I'll let it go as something easily cleaned up later if it the typing inconsistency becomes any kind of issue.
jenkins test sanity |
One of my goals is to try and clean up the AOT store side infrastructure; hopefully I manage to get to that this year :) |
NPT library missing errors; looks like infra issues (affecting other PRs too: See #2129 ). jenkins test sanity win |
Windows build still failing with |
@mstoodle Is it worth trying a sanity build for windows again? |
ok, i'll try one more time :) . It would be nice to have a clean run, but if this doesn't fail I am pretty confident we can merge this PR. You're convinced by "pretty confident" right? ;) jenkins test sanity win |
the "intermittent" problem only affects the JDK8 runs, I believe, but it's hard to believe your changes are dependent on JDK10 vs. JDK8... |
Heh, yeah, considering that changes to the actual compiler were all done in OMR that's been fine. As an aside, only 2 of the 3 win jobs kicked off; not sure why, but it looks like it won't be a clean PR either way :P |
be ware of #2129 NPT Error failures on windows - they happen pretty much every time on every PR build at the moment. |
jenkins test sanity win32 |
I don't believe the JDK8 win or win32 failures are due to this PR. Merging. |
Depends on eclipse-omr/omr#2408
I tested this with a simple java program:
This just does an add 100 times and prints out the total (which is 100). I added a dynamic debug counter in simplifier called "CounterTest".
JIT
AOT Compile run
AOT Load run
There's still a bug somewhere though, since if I enable all debug counters, the number printed out by the java code is wrong (though the counters seem ok :P)
JIT
AOT Compile run
AOT Load run
Credit to @lmaisons for the idea of using the metadata to store the debug counter names.