Building with debugging causes test failures on i686-unknown-linux-gnu #1343

Closed
brson opened this issue Dec 19, 2011 · 19 comments
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.)

Comments

@brson
Contributor

brson commented Dec 19, 2011

This command:

make check-stage1-T-i686-unknown-linux-gnu-H-x86_64-unknown-linux-gnu-rfail DEBUG=1

Results in tests where the instruction pointer ends up in bizarre places or the stack is misaligned. I don't know if it's related to the recent addition of debug info or not.

@jdm
Contributor

jdm commented Jan 5, 2012

What sort of output do you see?

@brson
Contributor Author

brson commented Jan 5, 2012

Here's what I'm seeing now:

valgrind: m_debuginfo/storage.c:389 (vgModuleLocal_addLineInfo): Assertion 'lineno >= 0' failed.
==19583==    at 0x38028210: report_and_quit (m_libcassert.c:193)
==19583==    by 0x38028477: vgPlain_assert_fail (m_libcassert.c:267)
==19583==    by 0x3805E09A: vgModuleLocal_addLineInfo (storage.c:389)
==19583==    by 0x380AB666: vgModuleLocal_read_debuginfo_dwarf3 (readdwarf.c:770)
==19583==    by 0x3805A836: vgModuleLocal_read_elf_debug_info (readelf.c:2206)
==19583==    by 0x38051C99: vgPlain_di_notify_mmap (debuginfo.c:822)
==19583==    by 0x3806D451: vgModuleLocal_generic_PRE_sys_mmap (syswrap-generic.c:2065)
==19583==    by 0x3809E9C1: vgSysWrap_x86_linux_sys_mmap2_before (syswrap-x86-linux.c:1381)
==19583==    by 0x38069F35: vgPlain_client_syscall (syswrap-main.c:1443)
==19583==    by 0x3806638D: handle_syscall (scheduler.c:895)
==19583==    by 0x380682BD: vgPlain_scheduler (scheduler.c:1091)
==19583==    by 0x38079F60: run_a_thread_NORETURN (syswrap-linux.c:94)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==19583==    at 0x4416BA3: mmap (mmap.S:65)
==19583==    by 0x4406582: _dl_map_object_from_fd (dl-load.c:1240)
==19583==    by 0x4407EE2: _dl_map_object (dl-load.c:2250)
==19583==    by 0x440CD5F: openaux (dl-deps.c:65)   
==19583==    by 0x440E7E5: _dl_catch_error (dl-error.c:178)
==19583==    by 0x440CF09: _dl_map_object_deps (dl-deps.c:247)
==19583==    by 0x4402BB6: dl_main (rtld.c:1809)
==19583==    by 0x441427D: _dl_sysdep_start (dl-sysdep.c:244)
==19583==    by 0x4404A5F: _dl_start (rtld.c:336)   
==19583==    by 0x4400856: ??? (in /lib32/ld-2.13.so)

@brson
Contributor Author

brson commented Jan 5, 2012

This doesn't sound like what I was seeing in the original report.

@brson
Contributor Author

brson commented Jan 5, 2012

Running the alt-bot-fail test results in a failure here:

#0  0xf7ef976e in check_stack_alignment () from /home/banderson/Dev/rust/build/x86_64-unknown-linux-gnu/test/run-fail/../../stage1/lib/rustc/i686-unknown-linux-gnu/lib/librustrt.so
#1  0xf7d92936 in memmove () at ../sysdeps/i386/i686/multiarch/memmove.S:42
#2  0xf7fdd900 in ?? ()
#3  0xf7ed9eb5 in upcall_fail (expr=0x8049090 "explicit failure", file=0x80490b0 "../src/test/run-fail/alt-bot-fail.rs", line=7) at ../src/rt/rust_upcall.cpp:91
#4  0x08048bd0 in main ()
#5  0x08048c21 in _rust_main ()
#6  0xf7ed49ce in task_start_wrapper (a=0x8052d3c) at ../src/rt/rust_task.cpp:354
#7  0x00000000 in ?? ()

@brson
Contributor Author

brson commented Jan 5, 2012

This could just be something wrong with check_stack_alignment, which I wrote recently and do not have complete confidence in.

@brson
Contributor Author

brson commented Jan 5, 2012

This is with --disable-optimize-cxx - otherwise check_stack_alignment gets inlined

@brson
Contributor Author

brson commented Jan 5, 2012

Er, check_stack_alignment is an assembly function. It shouldn't be inlined.

@brson
Contributor Author

brson commented Jan 5, 2012

Disabling check_stack_alignment and turning off valgrind results in segfaults with the following backtrace:

#0  0x0805014a in ?? ()
#1  0xf7ffd918 in _r_debug ()

gdb doesn't think the program counter is inside a function in either of these frames

@jdm
Contributor

jdm commented Jan 5, 2012

Having configured with --disable-optimize on OS X 10.6.8, I don't see any problems with alt-bot-fail (or any other run-fail tests, for that matter).

@brson
Contributor Author

brson commented Jan 5, 2012

You're building 32-bit tests?

@brson
Contributor Author

brson commented Jan 5, 2012

I guess it could be linux-specific

@jdm
Contributor

jdm commented Jan 5, 2012

I'm building x86_64.

@brson
Contributor Author

brson commented Jan 5, 2012

0e98e64 made the valgrind complaints about debug info go away, and now I see again the errors that this issue was originally about. These problems are only present for 32-bit targets.

Here's a sampling:

command: /usr/bin/valgrind --leak-check=full --error-exitcode=100 --quiet --suppressions=../src/etc/x86.supp x86_64-unknown-linux-gnu/test/run-fail/morestack4.stage1-i686-unknown-linux-gnu
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
vex x86->IR: unhandled instruction bytes: 0x6F 0x29 0xAC 0x4
==15754== Thread 4:
==15754== Invalid read of size 1
==15754==    at 0x804AFF4: ??? (in /home/brian/Dev/rust/build/x86_64-unknown-linux-gnu/test/run-fail/morestack4.stage1-i686-unknown-linux-gnu)
==15754==    by 0x728DAD7: ???
==15754==  Address 0x728e75c is 2,644 bytes inside a block of size 2,836 alloc'd
==15754==    at 0x48DDBD3: malloc (vg_replace_malloc.c:236)
==15754==    by 0x4AC6DD0: rust_srv::malloc(unsigned int) (rust_srv.cpp:18)
==15754==    by 0x4AE1208: memory_region::malloc(unsigned int, char const*, bool) (memory_region.cpp:104)
==15754==    by 0x4ABDACC: rust_task::malloc(unsigned int, char const*, type_desc*) (rust_task.cpp:530)
==15754==    by 0x4ABC349: new_stk(rust_scheduler*, rust_task*, unsigned int) (rust_task.cpp:183)
==15754==    by 0x4ABCAEE: rust_task::rust_task(rust_scheduler*, rust_task_list*, rust_task*, char const*) (rust_task.cpp:261)
==15754==    by 0x4ABB2FB: rust_scheduler::create_task(rust_task*, char const*) (rust_scheduler.cpp:340)
command: /usr/bin/valgrind --leak-check=full --error-exitcode=100 --quiet --suppressions=../src/etc/x86.supp x86_64-unknown-linux-gnu/test/run-fail/unwind-misc-1.stage1-i686-unknown-linux-gnu
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
==16269== Thread 4:
==16269== Invalid read of size 1
==16269==    at 0x4A62FFC: _GLOBAL_OFFSET_TABLE_ (in /home/brian/Dev/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustc/i686-unknown-linux-gnu/lib/libstd-79ca5fac56b63fde-0.1.so)
==16269==  Address 0xdc is not stack'd, malloc'd or (recently) free'd
==16269==
==16269==
==16269== Process terminating with default action of signal 11 (SIGSEGV)
==16269==  Access not within mapped region at address 0xDC
==16269==    at 0x4A62FFC: _GLOBAL_OFFSET_TABLE_ (in /home/brian/Dev/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustc/i686-unknown-linux-gnu/lib/libstd-79ca5fac56b63fde-0.1.so)
==16269==  If you believe this happened as a result of a stack
==16269==  overflow in your program's main thread (unlikely but
==16269==  possible), you can try to increase the size of the
==16269==  main thread stack using the --main-stacksize= flag.
==16269==  The main thread stack size used in this run was 16777216.
==16269== Thread 1:
==16269== 64 bytes in 3 blocks are possibly lost in loss record 14 of 35
==16269==    at 0x48DDBD3: malloc (vg_replace_malloc.c:236)
==16269==    by 0x4B15DD0: rust_srv::malloc(unsigned int) (rust_srv.cpp:18)

@ghost ghost assigned brson Jan 10, 2012
@brson
Contributor Author

brson commented Jan 10, 2012

I'm going to disable debug info on 32-bit targets.

@graydon
Contributor

graydon commented Jan 10, 2012

Consensus is to, short term, add a table in the driver that lists which targets debuginfo is actually working for, and to disable it for i686 for now. Friendlier than random corruption.
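A minimal sketch of what such a table could look like, assuming a simple allowlist keyed by target triple. The constant name, the function name, and the single entry are hypothetical and are not taken from the actual driver:

```rust
// Hypothetical allowlist of target triples on which debug info is believed
// to work. The entry below is illustrative only.
const DEBUGINFO_WORKING_TARGETS: &[&str] = &["x86_64-unknown-linux-gnu"];

/// Returns true only when debug info was requested AND the target is on the
/// allowlist; any other target silently falls back to no debug info.
fn effective_debuginfo(target: &str, requested: bool) -> bool {
    requested && DEBUGINFO_WORKING_TARGETS.contains(&target)
}

fn main() {
    for target in ["x86_64-unknown-linux-gnu", "i686-unknown-linux-gnu"] {
        let enabled = effective_debuginfo(target, true);
        let state = if enabled { "enabled" } else { "disabled" };
        println!("{}: debuginfo {}", target, state);
    }
}
```

An allowlist rather than a denylist means a newly added target gets no debug info until someone has checked it, which fits the "friendlier than random corruption" goal.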

@brson
Contributor Author

brson commented Jan 10, 2012

Compiling run-pass/return-nil for i686-linux, the call to f() is followed by a bogus 'subl $4, %esp'

Without -g:

.LBB1_5:
        #APP
        ; let x = f(); (../src/test/run-pass/return-nil.rs:5:12: 5:23)
        #NO_APP
.Ltmp8:
        movl    -8(%ebp), %ebx
        calll   _ZN1f17_cf204e937fd488eeE@PLT
.Ltmp9:
        jmp     .LBB1_7
.LBB1_7:
        jmp     .LBB1_9

With -g:

.LBB1_5:
        .loc    2 5 10
        #APP
        ; let x = f(); (../src/test/run-pass/return-nil.rs:5:12: 5:23)
        #NO_APP
        .loc    2 5 20
.Ltmp10:
        movl    -8(%ebp), %ebx
        calll   _ZN1f17_9f8ee824e170db46E@PLT
        subl    $4, %esp
.Ltmp11:
        jmp     .LBB1_7
.LBB1_7:
        jmp     .LBB1_9

@jckarter

LLVM does that if it thinks the function being called has an sret argument on i386, because struct return pointers are callee-cleanup. Could that be what's happening?

@brson
Contributor Author

brson commented Jan 10, 2012

It is definitely related to the sret attribute. Right now -g does not apply sret to the function definition, but does apply sret to the call site. Removing sret makes the problem go away. I will probably just remove sret for now and let @jdm ponder it later.

It does seem to me that sret isn't a debugging attribute - when we figure out how it should be used, we should just always use it.
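To make the mismatch concrete, here is a rough model of how %esp moves across a single call on i386 when the definition and the call site disagree about sret. It is a simplified sketch and not real codegen. The function and parameter names are hypothetical, and it assumes only what is described above: an sret callee pops the hidden pointer on return, and a caller that expects this re-adjusts with `subl $4, %esp`.

```rust
/// Simplified model of %esp across one call on i386.
/// `callee_pops_sret`: the definition is sret, so it returns with `ret $4`
/// and pops the hidden struct-return pointer.
/// `caller_assumes_sret`: the call site is sret, so the caller emits
/// `subl $4, %esp` after the call to undo the callee's extra pop.
fn esp_after_call(esp_before: u32, callee_pops_sret: bool, caller_assumes_sret: bool) -> u32 {
    let mut esp = esp_before;
    esp -= 4; // `calll` pushes the return address
    esp += 4; // `ret` pops the return address
    if callee_pops_sret {
        esp += 4; // extra pop from `ret $4`
    }
    if caller_assumes_sret {
        esp -= 4; // caller's `subl $4, %esp`
    }
    esp
}

fn main() {
    let esp = 0xffff_d000u32; // arbitrary 16-byte-aligned starting value

    // Both sides agree (with or without sret): the stack stays balanced.
    assert_eq!(esp_after_call(esp, true, true), esp);
    assert_eq!(esp_after_call(esp, false, false), esp);

    // The -g case described above: sret at the call site only.
    let skewed = esp_after_call(esp, false, true);
    assert_eq!(skewed, esp - 4);
    println!("esp drifted by {} bytes", esp - skewed);
}
```

In this model every mismatched call leaves the stack pointer 4 bytes lower than the caller expects, which would be consistent with the misaligned stacks and wild instruction pointers reported earlier in the thread.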

@brson brson closed this as completed in af086aa Jan 10, 2012
@jdm
Contributor

jdm commented Jan 10, 2012

Ah, sorry about leaving that in. The sret attribute was part of my attempt to get function return values to show up correctly in gdb due to our bizarro way of returning values.

celinval pushed a commit to celinval/rust-dev that referenced this issue Jun 4, 2024
Kobzol pushed a commit to Kobzol/rust that referenced this issue Dec 30, 2024
bors pushed a commit to rust-lang-ci/rust that referenced this issue Jan 2, 2025