refactor(js_semantic): identify bindings with indexes #582
!bench_analyzer |
Parser conformance results on js/262, jsx/babel, symbols/microsoft, ts/babel, ts/microsoft
|
Analyzer Benchmark Results
|
Some preliminary feedback to make your PR easier to review:
Overall I think it's important to explain the technical decisions: reasons, tradeoffs, etc.
It seems you have a vision, but it's important to not be vague and explain it to us. |
Description updated. |
Would you happen to have any proof to claim that?
This might be a translation problem, but when I read this phrase, I interpret it in a negative way, something like "it doesn't allow progression", which goes against the previous claim, because it seems that you are in favour of using indexes.
Overall I understand what you have in mind, and I follow what you explained, although this refactor seems bigger than expected, and we are still determining what it will provide. There's a claim around "overhead", but it isn't explained what this overhead is. I am not very comfortable landing this PR to
If you think this suggestion is too much, then I suggest making the whole refactor in one single PR and explaining it to us. |
Any updates on this PR? |
427b4e1 to 9757e18
@Conaclos what's the status of this PR? |
I updated the PR. However, I am still unsure whether it is the best move. The semantic model currently uses two ways of identifying a binding (declaration): range starts and indexes.
The semantic event emitter uses range starts, and the semantic model builder then converts them to indexes. Indexes are used for efficient retrieval of semantic data: they are used when we convert a declaration node into a semantic binding and when we iterate over the semantic references to a binding. Thus, it is hard to know which way of identification is better or more performant. |
Thank you for the explanation! I don't have a great in-depth knowledge of the semantic model, however I trust your judgement. Another way to judge the feature is DX, and usages outside of a linter. For example, we might need the semantic model to retrieve the bindings exported from a module. |
I just ran the benchmark action on this branch to see the differences: https://github.com/biomejs/biome/actions/runs/9993569703/job/27621225066 |
CodSpeed Performance Report
Merging #582 will degrade performances by 37.22%
|
9757e18 to c09c99e
Summary
The name resolver identifies a declaration (binding identifier) by its range, while the semantic model identifies a declaration by its index. Using two different ways of identifying a declaration leads to wasted lookups: every time the name resolver emits an event that binds a reference to its binding, the semantic model has to retrieve the binding index from the declaration range.
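The wasted lookup described above can be sketched as follows. This is a minimal illustration, not Biome's actual code: `ReferenceEvent`, `RangeKeyedModel`, and their fields are hypothetical names, and the use of a `HashMap` as the translation table is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical event: the resolver identifies the declaration by the
/// start offset of its range, not by an index.
pub struct ReferenceEvent {
    pub reference_start: u32,
    pub declaration_start: u32,
}

/// Hypothetical model that stores bindings in vectors and therefore
/// identifies them by index.
pub struct RangeKeyedModel {
    /// Range start of each declaration, in declaration order.
    pub declaration_starts: Vec<u32>,
    /// Translation table that exists only to map range starts to indexes.
    pub index_by_start: HashMap<u32, usize>,
    /// `references[i]` holds the offsets of the references bound to binding `i`.
    pub references: Vec<Vec<u32>>,
}

impl RangeKeyedModel {
    pub fn new() -> Self {
        Self {
            declaration_starts: Vec::new(),
            index_by_start: HashMap::new(),
            references: Vec::new(),
        }
    }

    /// Registers a declaration and returns its index.
    pub fn declare(&mut self, start: u32) -> usize {
        let index = self.declaration_starts.len();
        self.declaration_starts.push(start);
        self.references.push(Vec::new());
        self.index_by_start.insert(start, index);
        index
    }

    /// Binds a reference to its declaration. The hash lookup on every
    /// event is the extra work; it returns `None` for an unknown start.
    pub fn bind(&mut self, event: &ReferenceEvent) -> Option<usize> {
        let index = *self.index_by_start.get(&event.declaration_start)?;
        self.references[index].push(event.reference_start);
        Some(index)
    }
}
```

Every call to `bind` pays for one lookup in `index_by_start`, and the table itself has to be kept in sync with the declarations.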
For consistency, we should use a single way of identifying a declaration. We have two options: using an index or relying on the range (or preferably the start of the range). The current architecture of the semantic model heavily relies on declaration indexes. In such an architecture, the use of ranges could lead to a performance overhead. With enough refactoring we could reduce this overhead. However, this looks like a large refactor. Moreover, scopes are currently identified by indexes (in the semantic model and in the name resolver). Using indexes seems like a conservative approach.
Thus, I propose in this PR to refactor the name resolver so that it uses indexes as declaration identifiers. This change also simplifies a potential future refactoring that would use range starts as identifiers for declarations.
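As a rough sketch of the proposed direction, the resolver can hand out the index itself, so that the model consumes events without any translation step. All names here (`BindingId`, `Event`, `Resolver`, `IndexedModel`) are hypothetical and do not mirror the actual Biome types; wrapping the index in a `u32` newtype is also an assumption made for the example.

```rust
/// Hypothetical typed identifier for a declaration: its index in
/// declaration order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BindingId(pub u32);

/// Hypothetical events emitted by the resolver.
pub enum Event {
    Declaration { id: BindingId, range_start: u32 },
    Read { declaration: BindingId, range_start: u32 },
}

/// Hypothetical resolver that assigns indexes while it emits events.
#[derive(Default)]
pub struct Resolver {
    pub events: Vec<Event>,
    next_id: u32,
}

impl Resolver {
    /// Emits a declaration event and returns the identifier assigned to it.
    pub fn declare(&mut self, range_start: u32) -> BindingId {
        let id = BindingId(self.next_id);
        self.next_id += 1;
        self.events.push(Event::Declaration { id, range_start });
        id
    }

    /// Emits a read event that already carries the declaration index.
    pub fn read(&mut self, declaration: BindingId, range_start: u32) {
        self.events.push(Event::Read { declaration, range_start });
    }
}

/// Hypothetical model built from events, with no range-to-index table.
#[derive(Default)]
pub struct IndexedModel {
    pub declaration_starts: Vec<u32>,
    pub references: Vec<Vec<u32>>,
}

impl IndexedModel {
    pub fn from_events(events: Vec<Event>) -> Self {
        let mut model = Self::default();
        for event in events {
            match event {
                Event::Declaration { range_start, .. } => {
                    // Declarations arrive in index order, so pushing keeps
                    // vector positions aligned with the identifiers.
                    model.declaration_starts.push(range_start);
                    model.references.push(Vec::new());
                }
                Event::Read { declaration, range_start } => {
                    // Direct indexing replaces the lookup by range.
                    model.references[declaration.0 as usize].push(range_start);
                }
            }
        }
        model
    }
}
```

In this sketch the range start is kept only as data attached to a declaration; it is no longer used as a key.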
Using indexes instead of ranges could also allow separating declarations (binding identifiers) from (semantic) bindings. In other words, we could associate several declarations (binding identifiers) with the same semantic binding. It is one of the approaches I proposed to solve #565. Although I am no longer convinced by this approach (see the EDIT: comment), using indexes provides more flexibility.
For now an index is a usize. It should be changed to u32. I prefer delaying this task to another PR because the change is broad: it affects the semantic model. We could even use the typed-index-collections crate.
Test Plan
CI should pass.