Optimize `Semantic`'s structs-of-arrays and make API simpler #11

overlookmotel · 2024-05-17T10:03:54Z

Problem

Semantic uses a struct of arrays pattern for ScopeTree and other structures.

If we do oxc-project/oxc#4269, then ScopeTree would be purely a collection of IndexVecs. Then we can optimize it.

At present we treat all 6 x IndexVecs as independent. When creating a scope, we push to each vec independently.

This is inefficient in 3 ways:

len and capacity are stored 6 times. They only need to be stored once, as they're the same for all 6 vecs.
Each time we add a scope, IndexVec::push looks up len for that vec, checks if it's at capacity, and grows if required. These lookups/checks happen 6 times, when they only need to happen once.
Looking up multiple pieces of information about a scope will involve multiple bounds checks, when only 1 is required.

Potential solution

Find a crate which optimizes SOA structures to avoid these penalties. Hopefully one exists, but if not, create one.

The text was updated successfully, but these errors were encountered:

overlookmotel · 2024-05-17T10:07:04Z

NB: Before attempting any perf improvements to semantic, we should tackle oxc-project/oxc#4270 first, so we can validate that these "improvements" do actually improve things.

overlookmotel · 2024-05-17T10:15:11Z

https://crates.io/crates/soa_derive looks like a good start, though not sure if it implements all the optimizations listed above.

overlookmotel · 2024-07-15T08:19:05Z

oxc-project/oxc#4208 converted the 2 FxHashMaps to IndexVecs, and oxc-project/oxc#3776 stabilized the semantic benchmarks. So there's no blocker on proceeding with this now.

Dunqing · 2024-07-15T08:30:15Z

Wow, just realized you thought of this optimization a long time ago! That's great.

overlookmotel · 2024-07-20T12:35:22Z

@DonIsaac suggested changes to API to hide away the struct-of-arrays approach as an implementation detail in oxc-project/oxc#4372.

@rzvxa said in oxc-project/oxc#4372 (comment):

Not directly related to this PR, But maybe we should put some effort toward researching any existing solution like MultiArrayList in zig or implement one ourselves. It is a really handy data structure that makes any struct into a SoA without drastic changes to the API for the users.

https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig

rzvxa · 2024-07-21T11:04:11Z

https://crates.io/crates/soa_derive looks like a good start, though not sure if it implements all the optimizations listed above.

I found this which doesn't use multiple lengths on each vector. It does most of the optimizations we want.

https://github.com/tim-harding/soa-rs

There is also soa-vec which doesn't use proc macros, This crate uses nightly for its allocations but we can create such a thing with our own APIs. It has an interesting approach using generics similar to how tuple types are defined in the Rust standard library.

I'm also doing a small experiment on how it would look if we use declarative macros to create Soa types. It would be more verbose than any approach using procedural macros, because of concat_ident being nightly but with something like this we should be good to go.

        soa! {
            @vec(Things)
            @ref(ThingRef)
            struct Thing {
                pub a: u32,
                b: Vec<u32>,
                c: Box<u32>,
                d: bool,
            }
        }
        // usage

        let thing1 = Thing { a: 42, b: vec![42], c: Box::from(42), d: true };
        let thing2 = Thing { a: 420, b: vec![420], c: Box::from(420), d: true };

        let mut things = Things::new();
        things.push(thing1);
        things.push(thing2);

        // only this method is `pub` similar to the field definition.
        // `VecLike` can be a slice here.
        let a_vec: &VecLike<u32> = things.a();
        let b_vec: &VecLike<Vec<u32>> = things.b();
        let c_vec: &VecLike<Box<u32>> = things.c();
        let d_vec: &VecLike<bool> = things.d();

        for thing in things {
            let ThingRef { a, b, c, d } = thing;
        }

As you can see, the definition isn't as neat as options using a derive macro, but it should provide almost all of the functionalities. It might not be worth doing; however, it is possible if we wish for.

overlookmotel · 2024-07-21T23:21:01Z

@aapoalas suggested https://crates.io/crates/parallel_vec on Discord.

overlookmotel · 2024-07-24T12:28:48Z

#67 is also relevant.

This was referenced Jun 4, 2024

Optimize Semantic #32

Open

Priority performance issues #42

Open

overlookmotel mentioned this issue Jul 15, 2024

perf(semantic): reduce index_vec allocation oxc-project/oxc#4266

Closed

overlookmotel changed the title ~~Optimize Semantic's structs-of-arrays~~ Optimize Semantic's structs-of-arrays and make API simpler Jul 20, 2024

overlookmotel mentioned this issue Jul 20, 2024

feat(semantic): add Symbol oxc-project/oxc#4372

Closed

overlookmotel mentioned this issue Aug 22, 2024

Add Scoping structure combining ScopeTree and SymbolTable #92

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize `Semantic`'s structs-of-arrays and make API simpler #11

Optimize `Semantic`'s structs-of-arrays and make API simpler #11

overlookmotel commented May 17, 2024 •

edited

Loading

overlookmotel commented May 17, 2024

overlookmotel commented May 17, 2024 •

edited

Loading

overlookmotel commented Jul 15, 2024 •

edited

Loading

Dunqing commented Jul 15, 2024

overlookmotel commented Jul 20, 2024 •

edited

Loading

rzvxa commented Jul 21, 2024 •

edited

Loading

overlookmotel commented Jul 21, 2024

overlookmotel commented Jul 24, 2024

Optimize Semantic's structs-of-arrays and make API simpler #11

Optimize Semantic's structs-of-arrays and make API simpler #11

Comments

overlookmotel commented May 17, 2024 • edited Loading

Problem

Potential solution

overlookmotel commented May 17, 2024

overlookmotel commented May 17, 2024 • edited Loading

overlookmotel commented Jul 15, 2024 • edited Loading

Dunqing commented Jul 15, 2024

overlookmotel commented Jul 20, 2024 • edited Loading

rzvxa commented Jul 21, 2024 • edited Loading

overlookmotel commented Jul 21, 2024

overlookmotel commented Jul 24, 2024

Optimize `Semantic`'s structs-of-arrays and make API simpler #11

Optimize `Semantic`'s structs-of-arrays and make API simpler #11

overlookmotel commented May 17, 2024 •

edited

Loading

overlookmotel commented May 17, 2024 •

edited

Loading

overlookmotel commented Jul 15, 2024 •

edited

Loading

overlookmotel commented Jul 20, 2024 •

edited

Loading

rzvxa commented Jul 21, 2024 •

edited

Loading