Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CODEGEN][LLVM] Cache packed func ptr, lift alloca #2070

Merged
merged 1 commit into from
Nov 6, 2018

Conversation

tqchen
Copy link
Member

@tqchen tqchen commented Nov 6, 2018

This PR resolves to problems revealed by #2067, thanks to @denis0x0D.

  • Previously the code did not store the loaded packedFunc handle when it get the correct one,
  • Alloca are now always lifted to the beginning of the function

@yzhliu
Copy link
Member

yzhliu commented Nov 6, 2018

trying to understand how lifting could prevent allocating memory repeatedly?

@denis0x0D
Copy link
Contributor

denis0x0D commented Nov 6, 2018

@tqchen The patch solves my problem, thanks a lot!
@yzhliu As far as I understood the point was to cache the packed function which was found, and don't call for search every time.
Also, I am not sure how, but if we cache the pointer to packed function the problem #2067 disappears
Line 325 - store the ptr to packed function to "cache ptr"

315 .LBB1_4:
316   movq  __tvm_module_ctx@GOTPCREL(%rip), %rax
317   movq  (%rax), %rdi
318   leaq  .L.str.27(%rip), %rsi
319   leaq  64(%rsp), %rdx
320   movq  __TVMBackendGetFuncFromEnv@GOTPCREL(%rip), %rax
321   callq *(%rax)
322   testl %eax, %eax
323   jne .LBB1_10
324   movq  64(%rsp), %rdi
325   movq  %rdi, .tvm_func.my_debug(%rip)
326   movq  128(%rsp), %rdx
327   jmp .LBB1_6
328   .p2align  4, 0x90
329 .LBB1_3:

Check cache and then call it. Line 345 and so on.

345   movq  .tvm_func.my_debug(%rip), %rdi
346   testq %rdi, %rdi
347   je  .LBB1_4
348 .LBB1_6:
349   movl  $1, %ecx
350   movq  %r12, %rbx
351   movq  %r12, %rsi
352   movq  %rdx, %r12
353   movq  %r14, %r8
354   movq  40(%rsp), %r9
355   movq  __TVMFuncCall@GOTPCREL(%rip), %rax
356   callq *(%rax)
357   testl %eax, %eax

Before this patch, we just call for search the function in the hashtable repeatedly

332   movq  __tvm_module_ctx@GOTPCREL(%rip), %rax
333   movq  (%rax), %rdi
334   leaq  .L.str.27(%rip), %rsi
335   movq  %r14, %rdx
**336   movq  __TVMBackendGetFuncFromEnv@GOTPCREL(%rip), %rax**
337   callq *(%rax)
338   testl %eax, %eax
339   jne .LBB1_8
340   movq  (%r14), %rdi
341   movl  $1, %ecx
342   movq  %r13, %rsi
343   movq  %r12, %rdx
344   movq  %rbx, %r8
345   movq  -88(%rbp), %r9
346   movq  __TVMFuncCall@GOTPCREL(%rip), %rax
347   callq *(%rax)
348   testl %eax, %eax

But I am not sure why the error does disappear.

@tqchen
Copy link
Member Author

tqchen commented Nov 6, 2018

This is the compound of two problems:

Here is the pseudo code of old code:

void* packed_handle = nullptr;
int myfunc() {
  for (int i = 0; i < n; ++i) {
     alloca out;
     GetFuncFromEnv(ctx, "name_of_func", &out);
     CallFunc(out, ...);
   }
}

After the fix, it becomes

void* packed_handle = nullptr;
int myfunc() {
   // lift aloca
  alloca out;
  for (int i = 0; i < n; ++i) {
     if (packed_handle == nullptr) {
        GetFuncFromEnv(ctx, "name_of_func", &out);
        // store the pointer
        packed_handle = out;
     }
     CallFunc(packed_handle, ...);
  }
}
  • The root problem is that alloca simply increases stack pointer but did not release the memory in the stack until function exits, so it is bad to call alloca inside a loop. LLVM recommends us to only do alloca in the head of the function.
  • In our case, we alloca the out temporary space to store the result of the GetFuncFromEnv. We did it in a loop before. Because we didn't cache the result of the packed_function handle, we call alloca each time we call GetFuncFromEnv and it results in a stack overflow.
  • Either the problem(of handle cache, or alloca location) would have resolved the problem.
    • If we cache the handle, alloca is only called once(because next time handle is not null)
    • If we lift alloca, the temporary space is allocated at beginning of the function and used repeatively even if we call GetFuncFromEnv multiple times.

But since both are actually problems, so this PR fixes both.

@tqchen tqchen merged commit fdf035e into apache:master Nov 6, 2018
FrozenGene pushed a commit to FrozenGene/tvm that referenced this pull request Dec 27, 2018
@ZihengJiang ZihengJiang mentioned this pull request Feb 1, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants