Bug optimization #47410

aristarh2704 · 2018-01-13T09:48:10Z

I write this code:

pub fn clean(x:isize){
unsafe{
let mut buffer=0xb8000 as *mut u8;
let mut y=0;
while y<(80 * 25 * 2){
let mut p=buffer.offset(y);
*buffer.offset(y)=0x00;
y=y+1;
*buffer.offset(y)=0x20;
y=y+1;
}
}
}

and compile: "cargo build --release"
Disassembling views this:

000000e0 <__ZN6memory5clean17h43e2ddeb8148e2edE>:
push %ebp
mov %esp,%ebp
movaps 0x0,%xmm0
xor %eax,%eax
nopl 0x0(%eax)
movups %xmm0,0xb8010(%eax,%eax,1)
movups %xmm0,0xb8000(%eax,%eax,1)
add $0x10,%eax
cmp $0x7d0,%eax
jne f0 <__ZN6memory5clean17h43e2ddeb8148e2edE+0x10>
pop %ebp
ret

It's not correct code.
But when I write

y=y+x; //x is argument equal 1

instead of

y=y+1;

it produce this asm:

000000e0 <__ZN6memory5clean17h43e2ddeb8148e2edE>:
push %ebp
mov %esp,%ebp
push %esi
mov 0x8(%ebp),%eax
mov $0xb8000,%ecx
xor %esi,%esi
lea 0xb8000(%eax),%edx
data16 data16 nopw %cs:0x0(%eax,%eax,1)
movb $0x0,(%ecx,%esi,1)
movb $0x20,(%edx,%esi,1)
add %eax,%esi
add %eax,%esi
cmp $0xfa0,%esi
jl 100 <__ZN6memory5clean17h43e2ddeb8148e2edE+0x20>
pop %esi
pop %ebp
ret

which is really correct.
Rustc version:

$ rustc -vV

rustc 1.22.1 (05e2e1c 2017-11-22)
binary: rustc
commit-hash: 05e2e1c
commit-date: 2017-11-22
host: i686-pc-windows-gnu
release: 1.22.1
LLVM version: 4.0

The text was updated successfully, but these errors were encountered:

gamozolabs · 2018-01-13T10:59:49Z

This is likely because you are either doing an objdump on a .o file (which has not been linked) or you have incorrectly linked the binary.

When running objdump on the .o file, relocations have not been performed so the movaps instruction which is moving zero in objdump is actually supposed to be a load from memory. However since relocations have not been performed yet you observe that it is not a memory access but a zero. This is actually pretty misleading from objdump. It seems that objdump or AT&T syntax does not explicitly display that this is a memory access as the movaps has no immediate form, thus it just displays 0, which is the pre-relocation value for the memory address that it is loading from.

Here you can see a side-by-side of the disassembly of the object file in IDA (a more advanced disassembler) and objdump. You can see while they're both the same instruction "0f 28 05 xx 00 00 00", IDA correctly recognizes this as a memory load (and also performs the relocation hence the 0x6c instead of 0x00)

If you run objdump -r on your file you will see there is a relocation entry for the offset where the immediate value for the movaps instructions is.

This movaps actually is loading a memory array filled with the correct values to compile to exactly what you specified in your Rust code.

Further the reason the code changes when you do y+x instead of y+1 is simply due to the optimizer not able to know what that value is and thus it emits scalar code which is more portable. For example if y was a large number then it could no longer emit a vectorized/SSE version of the code.

-B

aristarh2704 · 2018-01-13T11:52:02Z

But when I test this code in qemu (I try write my own OS), if I use "y+=1", execution fails and VM repeatedly reboots and then freezes. If I use "y+=x", execution ends successfully. This indicates a significant difference in the generated code.

gamozolabs · 2018-01-13T12:02:52Z

It's probably very likely that you did not initialize SSE in your OS and thus you're getting an undefined instruction exception on the SSE instruction. To enable SSE you should check bit 25 in CPUID.01h:EDX and if it is set you can proceed to clear the CR0.EM bit, set the CR0.MP bit, set the CR4.OSFXSR bit and set the CR4.OSXMMEXCPT bit.

That being said, most OSes do not use SSE and floats in the kernel (to save costs on context switches) and thus they build their kernels with flags such as -C target-feature=+soft-float to enable soft float (floating point emulation, disables use of SSE and FPU instructions).

aristarh2704 · 2018-01-13T12:04:23Z

Thank you for your help)

aristarh2704 · 2018-01-13T12:11:28Z

May I disable using SSE/MMX instruction in rustc?

pnkfelix · 2018-01-15T12:46:34Z

@aristarh2704 I don't have much experience with disabling such instructions myself, but from the bug description at #26449, it sounds like you might be able to use the -C target-feature=... flag to disable a whole host of code generation features that are AFAIK solely controlled by LLVM.

aristarh2704 · 2018-01-15T16:28:11Z

How I can give this argument to cargo, not to rustc?

pnkfelix · 2018-01-15T16:36:59Z

@aristarh2704 I'm not sure what current best practice is for configuring cargo that way (I do more manual invocation of rustc than setting up cargo projects), but it seems like one option is to use the rustflags configuration switch in your toml file:

I think this was added here: rust-lang/cargo#2241

and is documented here (scroll to where rustflags is mentioned): https://doc.rust-lang.org/cargo/reference/config.html#configuration-keys

But for current best practices, you may want to ask in the #cargo IRC channel.

IsaacWoods · 2018-01-15T17:07:42Z

The correct solution here is to provide a custom target specification and then to use Xargo to recompile core with the correct options. In particular, you want to disable the sse and mmx features, to stop LLVM emitting problematic instructions.

A sample target file that does this might look like:

{
  "llvm-target": "x86_64-unknown-none",
  "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
  "linker-flavor": "gcc",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32",
  "arch": "x86_64",
  "os": "none",
  "disable-redzone": true,
  "features": "-mmx,-sse,+soft-float"
}

Save this as my_target.json or whatever, then tell xargo to use it with: xargo --target=my_target

aristarh2704 closed this as completed Jan 13, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug optimization #47410

Bug optimization #47410

aristarh2704 commented Jan 13, 2018 •

edited

Loading

gamozolabs commented Jan 13, 2018 •

edited

Loading

aristarh2704 commented Jan 13, 2018 •

edited

Loading

gamozolabs commented Jan 13, 2018

aristarh2704 commented Jan 13, 2018

aristarh2704 commented Jan 13, 2018

pnkfelix commented Jan 15, 2018

aristarh2704 commented Jan 15, 2018

pnkfelix commented Jan 15, 2018 •

edited

Loading

IsaacWoods commented Jan 15, 2018

Bug optimization #47410

Bug optimization #47410

Comments

aristarh2704 commented Jan 13, 2018 • edited Loading

gamozolabs commented Jan 13, 2018 • edited Loading

aristarh2704 commented Jan 13, 2018 • edited Loading

gamozolabs commented Jan 13, 2018

aristarh2704 commented Jan 13, 2018

aristarh2704 commented Jan 13, 2018

pnkfelix commented Jan 15, 2018

aristarh2704 commented Jan 15, 2018

pnkfelix commented Jan 15, 2018 • edited Loading

IsaacWoods commented Jan 15, 2018

aristarh2704 commented Jan 13, 2018 •

edited

Loading

gamozolabs commented Jan 13, 2018 •

edited

Loading

aristarh2704 commented Jan 13, 2018 •

edited

Loading

pnkfelix commented Jan 15, 2018 •

edited

Loading