-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure that unoptimised functions calls ONLY unoptimised functions. #216
Make sure that unoptimised functions calls ONLY unoptimised functions. #216
Conversation
* - Function `f` calls function `g`. | ||
* - Function `g` is cloned as `__yk_clone_g`. | ||
* - Function `f` is not cloned because its address is taken. | ||
* - As a result, function `f` calls `__yk_clone_g` instead of `g`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC the idea here is "we need to be able to trace f
, but its address has taken, so we will always call f
with tracing calls inserted"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More like:
We need to be able to trace f
, but its address has taken.
We need to make sure that any functions called from f
are also non-cloned functions (functions that we can trace).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated tests 👉 0f3e89c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a result, function
f
calls__yk_clone_g
instead ofg
.
Is that what's already happening or what this function does? I feel like there's another bullet point missing.
Also, please remind me which functions have the yk_record_block
calls, the original or the cloned ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that what's already happening or what this function does? I feel like there's another bullet point missing.
The comment example scenario is about what's currently happening (before this function in this PR)
I can update it to make it clearer though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a result, function
f
calls__yk_clone_g
instead ofg
.
If f
isn't cloned, why is it calling __yk_clone_g
. We must have altered f
somewhere. So it appears, we first alter f
to call _yk_clone_g
instead of g
and then in this function we revert this change. Is that correct? If so, why don't we only update function calls to their cloned version if the caller function is also cloned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what's happening here, is that we rename all functions in the cloned module before linking.
But then some of them are not renamed cause their address is taken, while sill referencing functions that were renamed.
let's say we have:
func a { }
// function which address is taken
func b {
a();
}
After calling renameFunctions
on a cloned module we have:
func __yk_clone_a { }
// let b a function which address is taken
func b {
__yk_clone_a();
}
function a
was renamed but function b
wasn't (because address is taken).
And then when we link both modules,b
from cloned module overrides the original b
definition.
Example of actual module IR generated (before this PR):
; Function Attrs: noinline nounwind optnone uwtable
define internal i32 @luaB_assert(ptr noundef %0) #0 {
%2 = getelementptr inbounds %struct.lua_State, ptr %0, i64 0, i32 8
%3 = load ptr, ptr %2, align 8, !tbaa !27
%4 = load ptr, ptr %3, align 8, !tbaa !22
%5 = getelementptr inbounds %union.StackValue, ptr %4, i64 1
%6 = getelementptr inbounds %struct.lua_State, ptr %0, i64 0, i32 6
%7 = load ptr, ptr %6, align 8, !tbaa !22
%8 = icmp ult ptr %5, %7
br i1 %8, label %13, label %9
9: ; preds = %1
%10 = getelementptr inbounds %struct.lua_State, ptr %0, i64 0, i32 7
%11 = load ptr, ptr %10, align 8, !tbaa !10
%12 = getelementptr inbounds %struct.global_State, ptr %11, i64 0, i32 8
br label %13
13: ; preds = %9, %1
%14 = phi ptr [ %12, %9 ], [ %5, %1 ]
%15 = getelementptr inbounds %struct.TValue, ptr %14, i64 0, i32 1
%16 = load i8, ptr %15, align 8, !tbaa !23
%17 = icmp eq i8 %16, 1
%18 = and i8 %16, 15
%19 = icmp eq i8 %18, 0
%20 = or i1 %17, %19
br i1 %20, label %21, label %24, !prof !5
21: ; preds = %13
tail call void @__yk_clone_luaL_checkany(ptr noundef %0, i32 noundef 1) #77
tail call void @__yk_clone_lua_rotate(ptr noundef %0, i32 noundef 1, i32 noundef -1) #77
tail call void @__yk_clone_lua_settop(ptr noundef %0, i32 noundef -2) #77
%22 = tail call ptr @__yk_clone_lua_pushstring(ptr noundef %0, ptr noundef nonnull @.str.26.676) #77
tail call void @__yk_clone_lua_settop(ptr noundef %0, i32 noundef 1) #77
%23 = tail call i32 @luaB_error(ptr noundef %0)
unreachable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe the question is how these call sites in the function that is not cloned got renamed in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like F.setName
is actually updating the callsites as well as the function name.
I can see:
Before F.setName
is called for luaL_checkany
function:
21: ; preds = %13
tail call void @luaL_checkany(ptr noundef %0, i32 noundef 1) #77
...
unreachable
And just after F.setName
is called for luaL_checkany
function:
21: ; preds = %13
tail call void @__yk_clone_luaL_checkany(ptr noundef %0, i32 noundef 1) #77
....
unreachable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that what's already happening or what this function does? I feel like there's another bullet point missing.
Updated comment for clarity 👉 3462947
Added test in 👉 ykjit/yk#1477 |
* - As a result, function `f` calls `__yk_clone_g` instead of `g`. | ||
* | ||
* **Reasoning:** | ||
* In `YkIRWriter` we only serialise non-cloned functions. We want |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICS this isn't really related to the serialiser. The real reason we want unoptimised functions to only call other unoptimised functions is because otherwise block recording won't work. In other words, if f
calls g_opt
, then f
will record the blocks, but g_opt
won't, we get incorrect traces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
youre right.
Okay, I had to remind myself how this all works, but I think I get it now. Please let me know if I said anything wrong. The way we make the optimised module is:
Question:
This seems a bit more straight forward then what we are doing now, where things get updated that we don't want to and then we have to get back and fix it, due to the way Separate suggestion (for a separate PR): I keep forgetting what the cloned and uncloned version of a function is. We should use unoptimised and optimised instead. I.e. the unoptimised function is the original (containing the |
Addressed here 👉 12fc9d7 |
for (const Argument &OrigArg : F.args()) { | ||
DestArgIt->setName(OrigArg.getName()); | ||
VMap[&OrigArg] = &*DestArgIt++; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed? From what I can see llvm::CloneFunction
already copies the arguments. And the VMap
isn't actually read anywhere here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok will remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed 👉 4563f62f6b46eca4d91cd71192c4a32356690560
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the commit.
Aha, this looks good. I actually wasn't sure if there was a reason we did things the other way (and was half expecting you to tell me why my approach won't work). But I'm glad this works as it's a bit easier to think about and we don't need to go back and fix things up as with the earlier approach. Do we need to update the PR title now since our approach has changed, e.g. something like "Avoid updating unoptimised functions with optimised callees by processing optimised functions in isolation."? |
I think we went with cloning approach just because that's what we initially drawn on the whiteboard :) |
ready for review :) |
Please squash. |
SWT. Switch from module linking to function cloning for simplicity and changed the function cloning prefix from `__yk_clone_` to `__yk_opt_` to better reflect the optimization intent in ModuleClone pass.
94a1f50
to
453f961
Compare
Squashed 👉 453f961 |
Ensure that calls in original functions are referencing only original functions.
I encountered this issue when I was building
yklua
with latestyk
YKB_TRACER=swt
.The main issue here is with serialization in
YkIRWriter
, where we don't serialize cloned functions but we still might end up with original functions calling cloned functions.Lua build error:
source