Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Semantic's structs-of-arrays and make API simpler #11

Open
overlookmotel opened this issue May 17, 2024 · 8 comments
Open

Optimize Semantic's structs-of-arrays and make API simpler #11

overlookmotel opened this issue May 17, 2024 · 8 comments

Comments

@overlookmotel
Copy link

overlookmotel commented May 17, 2024

Problem

Semantic uses a struct of arrays pattern for ScopeTree and other structures.

If we do oxc-project/oxc#4269, then ScopeTree would be purely a collection of IndexVecs. Then we can optimize it.

At present we treat all 6 x IndexVecs as independent. When creating a scope, we push to each vec independently.

This is inefficient in 3 ways:

  1. len and capacity are stored 6 times. They only need to be stored once, as they're the same for all 6 vecs.
  2. Each time we add a scope, IndexVec::push looks up len for that vec, checks if it's at capacity, and grows if required. These lookups/checks happen 6 times, when they only need to happen once.
  3. Looking up multiple pieces of information about a scope will involve multiple bounds checks, when only 1 is required.

Potential solution

Find a crate which optimizes SOA structures to avoid these penalties. Hopefully one exists, but if not, create one.

@overlookmotel
Copy link
Author

NB: Before attempting any perf improvements to semantic, we should tackle oxc-project/oxc#4270 first, so we can validate that these "improvements" do actually improve things.

@overlookmotel
Copy link
Author

overlookmotel commented May 17, 2024

https://crates.io/crates/soa_derive looks like a good start, though not sure if it implements all the optimizations listed above.

This was referenced Jun 4, 2024
@overlookmotel
Copy link
Author

overlookmotel commented Jul 15, 2024

oxc-project/oxc#4208 converted the 2 FxHashMaps to IndexVecs, and oxc-project/oxc#3776 stabilized the semantic benchmarks. So there's no blocker on proceeding with this now.

@Dunqing
Copy link
Member

Dunqing commented Jul 15, 2024

Wow, just realized you thought of this optimization a long time ago! That's great.

@overlookmotel
Copy link
Author

overlookmotel commented Jul 20, 2024

@DonIsaac suggested changes to API to hide away the struct-of-arrays approach as an implementation detail in oxc-project/oxc#4372.

@rzvxa said in oxc-project/oxc#4372 (comment):

Not directly related to this PR, But maybe we should put some effort toward researching any existing solution like MultiArrayList in zig or implement one ourselves. It is a really handy data structure that makes any struct into a SoA without drastic changes to the API for the users.

https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig

@overlookmotel overlookmotel changed the title Optimize Semantic's structs-of-arrays Optimize Semantic's structs-of-arrays and make API simpler Jul 20, 2024
@rzvxa
Copy link

rzvxa commented Jul 21, 2024

https://crates.io/crates/soa_derive looks like a good start, though not sure if it implements all the optimizations listed above.

I found this which doesn't use multiple lengths on each vector. It does most of the optimizations we want.

https://github.com/tim-harding/soa-rs

There is also soa-vec which doesn't use proc macros, This crate uses nightly for its allocations but we can create such a thing with our own APIs. It has an interesting approach using generics similar to how tuple types are defined in the Rust standard library.

I'm also doing a small experiment on how it would look if we use declarative macros to create Soa types. It would be more verbose than any approach using procedural macros, because of concat_ident being nightly but with something like this we should be good to go.

        soa! {
            @vec(Things)
            @ref(ThingRef)
            struct Thing {
                pub a: u32,
                b: Vec<u32>,
                c: Box<u32>,
                d: bool,
            }
        }
        // usage

        let thing1 = Thing { a: 42, b: vec![42], c: Box::from(42), d: true };
        let thing2 = Thing { a: 420, b: vec![420], c: Box::from(420), d: true };

        let mut things = Things::new();
        things.push(thing1);
        things.push(thing2);

        // only this method is `pub` similar to the field definition.
        // `VecLike` can be a slice here.
        let a_vec: &VecLike<u32> = things.a();
        let b_vec: &VecLike<Vec<u32>> = things.b();
        let c_vec: &VecLike<Box<u32>> = things.c();
        let d_vec: &VecLike<bool> = things.d();

        for thing in things {
            let ThingRef { a, b, c, d } = thing;
        }

As you can see, the definition isn't as neat as options using a derive macro, but it should provide almost all of the functionalities. It might not be worth doing; however, it is possible if we wish for.

@overlookmotel
Copy link
Author

@aapoalas suggested https://crates.io/crates/parallel_vec on Discord.

@overlookmotel
Copy link
Author

#67 is also relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants