Implement "small substs optimization" for substs of length 1 #58310

arielb1 · 2019-02-08T16:24:51Z

One of the core types in the compiler is ty::Subst, which represents a list of "substitutions" - arguments to generic parameters. For example, Option<u32> has a Substs of [Uint(u32)] and T: Sized has a Substs of [Param(T)].

A ty::Substs is an interned &'tcx List<Kind<'tcx>>. It is represented as a thin pointer to a struct of this form:

pub struct List<T> {
    len: usize,
    data: [T; 0],
    opaque: OpaqueListContents,
}

Where a List of length 0 is represented by this bit of evil code:

impl<T> List<T> {
    #[inline(always)]
    pub fn empty<'a>() -> &'a List<T> {
        #[repr(align(64), C)]
        struct EmptySlice([u8; 64]);
        static EMPTY_SLICE: EmptySlice = EmptySlice([0; 64]);
        assert!(mem::align_of::<T>() <= 64);
        unsafe {
            &*(&EMPTY_SLICE as *const _ as *const List<T>)
        }
    }
}

And Lists of non-zero length are represented by a List that references an interner. This is inefficient, because I'm quite sure that many Substs are of length 1 (e.g., consider all the trait bounds of traits with no type parameter).

When we look into the interner, there are 2 problems:

This requires a memory lookup, which increases CPU cache usage and can cause CPU cache misses.
When we are creating a new type, we need to intern the substs, which causes hashmap lookups and even more CPU cache misses.

Potential for measurements

It might be nice to count the amount of Substs of each length while compiling popular crates. This should be rather easy if you modify the interner.

Implementation Strategy

Stage 1 - getting a newtype

There are a few representations for Substs that are possible, but in any case I think the best way to go forward would be to first hide the representation.

Currently a Substs is this typedef:

pub type Substs<'tcx> = List<Kind<'tcx>>;

And it is normally used as &'tcx Substs<'tcx>.

We would need to replace it with something that has a hidden representation, i.e. initially a newtype of the form:

pub struct SubstsRef<'tcx> {
    inner: &'tcx Substs<'tcx>
}

I think a reasonable way of going around this would be to first have a

pub type SubstsRef<'tcx> = &'tcx Substs<'tcx>;

Then retire the use of the old Substs typedef in most places, then switch SubstsRef to be a newtype.

When using a newtype, it's probably a good idea to have a Deref impl of this form:

impl<'tcx> Deref for SubstsRef<'tcx> {
    type Target = [Kind<'tcx>];

    #[inline]
    fn deref(&self) -> &Self::Target { self.inner }
}

This will avoid needing to implement specific methods in most cases.

Also remember to implement the "derive" trait impls (Hash, PartialEq, etc.) as needed.

Stage 2 - improving the representation

My preferred representation is as follows:

Use the first 2 bits as the tag, in the style of [TAG_MASK], e.g.

const TYPE_TAG: usize = 0b00;
const REGION_TAG: usize = 0b01;
const LIST_TAG: usize = 0b10;

Then represent things as follows:

A substs of length 0: List::empty() | LIST_TAG
A substs of a single type: ty | TYPE_TAG
A substs of a single region: ty | REGION_TAG
A substs of length >1: substs | LIST_TAG

You'll want to define Substs as follows:

#[repr(C)]
pub struct SubstsRef<'tcx> {
    ptr: NonZeroUsize,
    marker: PhantomData<Kind<'tcx>>
}

Then you can implement Deref in this style:

impl<'tcx> Deref for SubstsRef<'tcx> {
    type Target = [Kind<'tcx>];

    #[inline]
    fn deref(&self) -> &Self::Target {
        let ptr = self.ptr.get();
        match ptr & TAG_MASK {
            REGION_TAG | TYPE_TAG => {
                // We still match layout with `Kind`.
                let this: &[Kind; 1] = mem::transmute(self);
                this
            }
            LIST_TAG => {
                let inner: &List<Kind> = &*((ptr & !TAG_MASK) as *const _);
                inner
            }
            _ => intrinsics::unreachable()
        }
    }
}

Then you'll basically only need to change the interning functions not to go through the interner for substs of length 1. As long as you always consistently map substs of length 1 to not use an interner, you can do PartialEq, Hash etc. "by value".

Then you can "cut the ribbon" and do a @bors try to see how much perf you gain.

Stage 2a - Cleanup

Remove occurrences of InternalSubsts in comments.

Items for later investigation

It might be worth investigating whether it's worth extending the small substs optimization to substs of length 2 to get rid of even more interner calls - that would increase the size of TyS and things, so it might be a space-time tradeoff.

The text was updated successfully, but these errors were encountered:

oli-obk · 2019-02-18T11:37:22Z

Bikeshed: I think we should get rid of the old Substs type alias and just name the SubstsRef Substs, similar to what we're doing with Ty<'tcx> (which is an alias for &'tcx TyS<'tcx>).

arielb1 · 2019-02-24T23:08:06Z

@oli-obk

I think we should do this refactor in several steps - in the first step, we should get rid of all uses of the old Substs, and then afterwards we could rename SubstsRef to Substs. I would prefer doing it in 2 separate PRs because it would make merge conflicts nicer.

oli-obk · 2019-02-25T08:12:42Z

Ok! I just wanted to bring it up since I thought doing it in one step would make the diff simpler (basically just loads of &'tcx disappearing)

arielb1 · 2019-02-25T21:56:08Z

@oli-obk

I think that changing it in one step will just make rebase conflicts quite more confusing, with more type errors instead of merge/resolve errors.

@arielb1

[Step 1] Implement "small substs optimization" for substs of length 1 addresses part of #58310 r?@arielb1

CohenArthur · 2020-08-21T21:51:37Z

Since the first PR has been closed, and a second one has been merged regarding that issue (#64817), is there still work to do here ? Or did the second PR fix the issue ?

csmoe self-assigned this Feb 9, 2019

csmoe mentioned this issue Feb 9, 2019

[Step 1] Implement "small substs optimization" for substs of length 1 #58321

Merged

bors added a commit that referenced this issue Feb 27, 2019

Auto merge of #58321 - csmoe:substs, r=oli-obk

f0be457

[Step 1] Implement "small substs optimization" for substs of length 1 addresses part of #58310 r?@arielb1

csmoe mentioned this issue Mar 20, 2019

[Step 2] Implement "small substs optimization" for substs of length 1 #59312

Closed

jonas-schievink added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Dec 16, 2019

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement "small substs optimization" for substs of length 1 #58310

Implement "small substs optimization" for substs of length 1 #58310

arielb1 commented Feb 8, 2019 •

edited

Loading

oli-obk commented Feb 18, 2019 •

edited

Loading

arielb1 commented Feb 24, 2019

oli-obk commented Feb 25, 2019

arielb1 commented Feb 25, 2019 •

edited

Loading

CohenArthur commented Aug 21, 2020 •

edited

Loading

Implement "small substs optimization" for substs of length 1 #58310

Implement "small substs optimization" for substs of length 1 #58310

Comments

arielb1 commented Feb 8, 2019 • edited Loading

Potential for measurements

Implementation Strategy

Stage 1 - getting a newtype

Stage 2 - improving the representation

Stage 2a - Cleanup

Items for later investigation

oli-obk commented Feb 18, 2019 • edited Loading

arielb1 commented Feb 24, 2019

oli-obk commented Feb 25, 2019

arielb1 commented Feb 25, 2019 • edited Loading

CohenArthur commented Aug 21, 2020 • edited Loading

arielb1 commented Feb 8, 2019 •

edited

Loading

oli-obk commented Feb 18, 2019 •

edited

Loading

arielb1 commented Feb 25, 2019 •

edited

Loading

CohenArthur commented Aug 21, 2020 •

edited

Loading