[WIP] Bump allocator for rustc #38725
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information.
const CHUNK_ALIGN: usize = 4096;

static mut HEAP: *mut u8 = ptr::null_mut();
static mut HEAP_LEFT: usize = 0;
These variables are global and there's no implicit lock around allocation calls, so the implementation of `__rust_allocate` below is full of data races. Maybe these should be atomics or thread locals?
Oh, yep. I had implemented this using atomics the first time I wanted to do this (but got stuck getting the allocator to inject). I'll go dig that code up.
This seems like it's mostly related to the compiler, so r? @eddyb
}

let heap = HEAP.load(Ordering::SeqCst);
let heap_left = HEAP_LEFT.load(Ordering::SeqCst);
Well, this eliminates the instant UB due to data races, but it's still a logical race. Suppose a context switch happens after these two loads have been executed but before the following lines are executed, and another thread allocates. When the first thread continues, it will return the same address as the other thread did.
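For concreteness, here is a minimal sketch, not the PR's code, of how the remaining-space check and the bump could be folded into a single compare-exchange loop so that no thread can slip in between the read and the write. The names `CURSOR` and `CHUNK_SIZE` are illustrative, not identifiers from this diff:

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

// A single cursor: the next free offset within the current chunk. Folding the
// capacity check and the bump into one compare-exchange closes the window
// between the two separate loads described above.
static CURSOR: AtomicUsize = AtomicUsize::new(0);
const CHUNK_SIZE: usize = 4 * 1024 * 1024; // illustrative chunk size

/// Try to bump-allocate `size` bytes with a power-of-two `align`.
/// Returns the offset of the allocation within the chunk, or `None` when the
/// chunk is exhausted and the caller must grab a fresh one.
fn bump(size: usize, align: usize) -> Option<usize> {
    let mut old = CURSOR.load(Ordering::SeqCst);
    loop {
        // Round the current offset up to the requested alignment.
        let start = (old + align - 1) & !(align - 1);
        let new = start.checked_add(size)?;
        if new > CHUNK_SIZE {
            return None;
        }
        // Publish the new offset only if nobody bumped the cursor meanwhile.
        match CURSOR.compare_exchange_weak(old, new, Ordering::SeqCst, Ordering::SeqCst) {
            Ok(_) => return Some(start),
            Err(actual) => old = actual, // lost the race; retry with the fresh value
        }
    }
}
```

With this shape, two racing threads can never be handed the same offset; the loser of the compare-exchange simply retries.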
Makes sense. I have little experience with atomics, as you may have guessed. I would rather use a `Mutex` or `thread_local!`, but neither of those is available in core. I'll make it use a spinlock, which should hopefully be correct.
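For reference, a sketch of the kind of `core`-only spinlock being discussed, illustrative rather than the PR's code (it uses today's `core::hint::spin_loop`; at the time of this PR a plain busy loop would have served):

```rust
use core::hint::spin_loop;
use core::sync::atomic::{AtomicBool, Ordering};

// One global flag guards the bump state (HEAP and HEAP_LEFT): acquire it
// before touching the heap pointer, release it afterwards.
static LOCKED: AtomicBool = AtomicBool::new(false);

fn with_heap_lock<R>(f: impl FnOnce() -> R) -> R {
    // Spin until we flip the flag from false to true.
    while LOCKED
        .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        spin_loop();
    }
    let result = f();
    LOCKED.store(false, Ordering::Release); // unlock (not panic-safe; fine for a sketch)
    result
}
```

With the lock held for the whole allocation path, `HEAP` and `HEAP_LEFT` no longer need to be read and updated atomically on their own.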
@@ -137,6 +137,12 @@
# Whether or not jemalloc is built with its debug option set
#debug-jemalloc = false

# Whether or not the frame allocator is built and enabled in std
#use-alloc-frame = false
I don't see where this option is used. Is the doc out of date?
This is still TODO
@@ -13,6 +13,7 @@ crate-type = ["dylib", "rlib"]
alloc = { path = "../liballoc" }
alloc_jemalloc = { path = "../liballoc_jemalloc", optional = true }
alloc_system = { path = "../liballoc_system" }
alloc_frame = { path = "../liballoc_frame" }
This makes all platforms require an `alloc_frame` implementation, doesn't it? I don't have much opinion about that; it doesn't look too hard to implement. A better setup would be for this feature to be optional and off by default, so porters don't have to think about it. I don't know whether that's necessary for this PR, but it's something to consider.
Actually, hm. Platforms without libc would have a hard time implementing frame_alloc.
cc @jackpot51
Oh yes, this is meant to be optional.
@brson: Redox is using the `libc` allocator right now, so this would not be an issue.
This is a neat exercise. Thanks for submitting it.
One more data point for now; I'll have to do better benchmarks in a few days:
cc @rust-lang/compiler
pub unsafe fn allocate(size: usize, align: usize) -> *mut u8 {
    if align <= MIN_ALIGN {
        libc::malloc(size as libc::size_t) as *mut u8
Is this supposed to interoperate with the regular system allocator? If that's not the case, then why not use a plain `sbrk`?
At the very least, LLVM will be using the system allocator in the same process, won't it? Calling `sbrk` may interact badly with that.
Yes, this is probably a good idea. I'll look into using `sbrk` and `VirtualAlloc` or whatever. The current implementation is as simple as possible, just to test out the idea.
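As a hedged sketch of that direction, chunks could be requested straight from the OS with `mmap` on Unix (the Windows analogue would be `VirtualAlloc`); this assumes the `libc` crate, and `grab_chunk`/`CHUNK_SIZE` are made-up names rather than anything in this PR:

```rust
use std::ptr;

const CHUNK_SIZE: usize = 4 * 1024 * 1024; // illustrative 4 MiB chunks

/// Ask the OS for a fresh chunk; returns null on failure.
/// mmap hands back page-aligned memory, so a CHUNK_ALIGN of 4096 comes for free.
unsafe fn grab_chunk() -> *mut u8 {
    let chunk = libc::mmap(
        ptr::null_mut(),                    // let the kernel pick the address
        CHUNK_SIZE,
        libc::PROT_READ | libc::PROT_WRITE,
        libc::MAP_PRIVATE | libc::MAP_ANON, // anonymous, not backed by a file
        -1,
        0,
    );
    if chunk == libc::MAP_FAILED {
        ptr::null_mut()
    } else {
        chunk as *mut u8
    }
}
```

Unlike `sbrk`, `mmap` doesn't fight over the program break with whatever `malloc` the rest of the process (LLVM included) is using.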
@mattico Regarding thread safety in the implementation, I think a spin lock is okay. Most parts of the compilation process aren't parallelized, and those that are mostly happen within LLVM, which won't be using this allocator IIUC. Besides, I really wouldn't be comfortable with a complicated atomics-based implementation even if nobody could poke holes in it: these things are notoriously tricky, and the failure mode (non-deterministic allocator bugs) is really terrible. Thread locals would have been ideal: since this allocator never frees memory, there are none of the usual complications, and they would neatly solve any concurrency issues. But whatever.
@rkruppe the current spinlock implementation does seem to have significant overhead. I'll post benchmarks when I get back to a computer. I agree about a more complicated atomic implementation; if one were even desired, I definitely shouldn't be the one to write it. I wish we had thread locals in core :/ Perhaps someday core and std will be modularized enough to use them without needing an allocator. I'll look into what hackery would be required to use them here, but I don't expect it'll be worthwhile.
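To illustrate why thread locals would be the tidy answer here, a sketch of a per-thread bump cursor as it might look with std's `thread_local!` (which liballoc_frame can't use today; that is exactly the limitation being lamented, and the names below are made up):

```rust
use std::cell::Cell;
use std::ptr;

const CHUNK_SIZE: usize = 4 * 1024 * 1024; // illustrative

thread_local! {
    // Each thread owns its own chunk and cursor, so the hot path needs no
    // locks and no atomics at all.
    static CHUNK: Cell<*mut u8> = Cell::new(ptr::null_mut());
    static OFFSET: Cell<usize> = Cell::new(CHUNK_SIZE); // "full" until a chunk exists
}

/// Bump-allocate `size` bytes with a power-of-two `align` from this thread's
/// chunk, returning null when a fresh chunk is needed.
fn bump(size: usize, align: usize) -> *mut u8 {
    CHUNK.with(|chunk| {
        OFFSET.with(|offset| {
            let start = (offset.get() + align - 1) & !(align - 1);
            if chunk.get().is_null() || start + size > CHUNK_SIZE {
                // A real implementation would grab a new chunk from the OS here.
                return ptr::null_mut();
            }
            offset.set(start + size);
            unsafe { chunk.get().add(start) }
        })
    })
}
```

Because the allocator never frees, there is no cross-thread deallocation to worry about, which is what usually makes per-thread arenas awkward.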
☔ The latest upstream changes (presumably #38482) made this pull request unmergeable. Please resolve the merge conflicts.
Interesting thought experiment. I'd definitely want to run a wider variety of compiles to compare the performance characteristics (also -- can we measure peak memory usage?)
Profiling is going to have to wait until next week, since performance counters don't work in WSL/VirtualBox, and I can't get MSVC to build. My current computer doesn't have an actual Linux install.
I don't have time to work on this at the moment, so I'm closing this for now just to keep the PR queue clean. I'll revisit this sometime soon.
In case anybody is wondering where this went: I tested this on a 4-core machine and the performance degraded to the point that it was a bit slower than jemalloc. I imagine that 4 threads stuck behind a single lock erased any gains we may have had. This is worth revisiting if we ever get thread locals in core.
Inspired by Dlang's allocator, I stuck a bump allocator into rustc to see what would happen. No good benchmarks yet because I need to think about how to do them. Lots of work to be done, but posting now because there should probably be discussion about whether or not this is a Good Idea. For now this will tease:
ripgrep compile:
BEFORE: 53.52s
AFTER: 44.38s
TODO: