Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcore: fix compilation on 16bit target (MSP430). #40832

Merged
merged 2 commits into from
Mar 30, 2017

Conversation

pftbest
Copy link
Contributor

@pftbest pftbest commented Mar 25, 2017

Since PR #40601 has been merged, libcore no longer compiles on MSP430.
The reason is this code in break_patterns:

 let mut random = len;
 random ^= random << 13;
 random ^= random >> 17;
 random ^= random << 5;
 random &= modulus - 1;

It assumes that len is at least a 32 bit integer.
As a workaround replace break_patterns with an empty function for 16bit targets.

cc @stjepang
cc @alexcrichton

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting the problem! I assume the issue is with shifting a usize by 17 when usize consists of 16 bits only.

Instead of fixing this by writing a custom fn break_patterns under a #[cfg(...)], could you please just copy this implementation? This one shouldn't suffer from the problem as it shifts u64s rather than usizes.

@pftbest
Copy link
Contributor Author

pftbest commented Mar 26, 2017

@stjepang, but this change converts len to u64, and computes next_power_of_two on each iteration, are you sure it won't be slower on 32bit targets?
I don't want to affect the performance of other targets because of MSP430.

@ghost
Copy link

ghost commented Mar 26, 2017

@pftbest Indeed - next_power_of_two should be outside the loop. I've updated the implementation so that it uses u32 on 32-bit and 16-bit platforms. Performance on my 64-bit machine is just as good as it was before the change.

What do you think, does it look good now?

@pftbest
Copy link
Contributor Author

pftbest commented Mar 26, 2017

I did some testing and on my machine the new version is 3 times slower than original:

test breakv1 ... bench:           4 ns/iter (+/- 0)
test breakv2 ... bench:          11 ns/iter (+/- 0)

I didn't test the u64 variant, maybe it would have been better.

Also this new version gives different results compared to original, and I don't know enough about the algorithm to confirm that it is OK and won't introduce any regressions.

@ghost
Copy link

ghost commented Mar 26, 2017

Interesting. Can you share the tests? I'd like to investigate why performance suffers and why the results are different.

Btw, you can find me on IRC if you wish.

@pftbest
Copy link
Contributor Author

pftbest commented Mar 26, 2017

@stjepang here is my crude test https://gist.github.com/pftbest/b15f39b866e70bd9f6ea0ed84aef9fb0

It is also possible to see that something is wrong just by looking at assembly code:
https://godbolt.org/g/0mg59B
The loop is unrolled by LLVM but random value is still calculated on each iteration.

@ghost
Copy link

ghost commented Mar 26, 2017

That's totally okay, don't worry. So, the purpose of break_patterns is to shuffle some elements around in order to randomize the next pivot selection round.

I slightly changed the algorithm so that it's more random. Now it picks three independent random indices instead of just one. Also, the range of indices is [0, len-1], while before it was [len/4, len/4*3-1].

Yes, the function is slightly slower now, but we should be better off overall because produced indices are more random, which should speed up the sort by creating more balanced partitions. When benchmarking the whole sort algorithm, I don't see any difference in performance.

This function is rarely called anyways, and all this fiddling with randomness is not too important... All that matters is that we swap the 3 indices in the middle with some other reasonably random indices. :)

Select 3 random points instead of just 1.
Also the code now compiles on 16bit architectures.
@pftbest
Copy link
Contributor Author

pftbest commented Mar 26, 2017

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@japaric
Copy link
Member

japaric commented Mar 27, 2017

Thanks @pftbest and @stjepang for working on this!

@bors r=stjepang

@bors
Copy link
Contributor

bors commented Mar 27, 2017

📌 Commit fda8e15 has been approved by stjepang

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Mar 27, 2017
libcore: fix compilation on 16bit target (MSP430).

Since PR rust-lang#40601 has been merged, libcore no longer compiles on MSP430.
The reason is this code in `break_patterns`:
```rust
 let mut random = len;
 random ^= random << 13;
 random ^= random >> 17;
 random ^= random << 5;
 random &= modulus - 1;
```
It assumes that `len` is at least a 32 bit integer.
As a workaround replace `break_patterns` with an empty function for 16bit targets.

cc @stjepang
cc @alexcrichton
bors added a commit that referenced this pull request Mar 27, 2017
// we first take it modulo a power of two, and then decrease by `len` until it fits
// into the range `[0, len - 1]`.
let mut other = gen_usize() & (modulus - 1);
while other >= len {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stjepang Can't this just be a conditional instead of a loop? As it stands now, you're guaranteed other < 2*len upon entering the loop, so the body runs either 0 or 1 times and no more.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is true. :) It can be a single conditional.

`other` is guaranteed to be less than `2 * len`.
bors added a commit that referenced this pull request Mar 28, 2017
@pftbest
Copy link
Contributor Author

pftbest commented Mar 29, 2017

I've replaced the loop with conditional, please review.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve! :)

@japaric
Copy link
Member

japaric commented Mar 29, 2017

@bors r=stjepang

@bors
Copy link
Contributor

bors commented Mar 29, 2017

📌 Commit b909364 has been approved by stjepang

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Mar 29, 2017
libcore: fix compilation on 16bit target (MSP430).

Since PR rust-lang#40601 has been merged, libcore no longer compiles on MSP430.
The reason is this code in `break_patterns`:
```rust
 let mut random = len;
 random ^= random << 13;
 random ^= random >> 17;
 random ^= random << 5;
 random &= modulus - 1;
```
It assumes that `len` is at least a 32 bit integer.
As a workaround replace `break_patterns` with an empty function for 16bit targets.

cc @stjepang
cc @alexcrichton
bors added a commit that referenced this pull request Mar 29, 2017
Rollup of 6 pull requests

- Successful merges: #40780, #40814, #40816, #40832, #40901, #40907
- Failed merges:
@bors bors merged commit b909364 into rust-lang:master Mar 30, 2017
@pftbest pftbest deleted the fix_msp430 branch July 23, 2019 21:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants