Use separate value stores for identifiers and string literals #4106
This undoes a previous change to unify them, a change that I think was made on my advice. =[ Sorry about that, I think I was just wrong.
Specifically, I think I had suggested that it would be more efficient to have a single shared hashtable of strings. The more I look at profiles of the toolchain, the less likely that seems. For identifiers and string literals in particular, it seems especially problematic.
Using a single, joint hashtable is likely a good idea when all of the different querying code paths are equally likely, the strings follow the same distribution of sizes, and either there is no clustering of access to different sets of strings or none of the sets are meaningfully small enough to fit into a lower level of resident cache.
I think essentially none of these predicates actually hold for identifiers vs. string literals.
Sorry for the misleading advice on that one.
While splitting them, I've worked to simplify the code a bit by building a way to have the `StringRef`-holding canonical value stores not require specializations, and so we get a pretty large code cleanup in the process here.
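To illustrate the shape of the split, here is a minimal sketch, not the code in this PR. It assumes one generic canonicalizing store, parameterized only on its ID type and instantiated once for identifiers and once for string literals. The names `CanonicalStringStore`, `IdentifierId`, and `StringLiteralId` are hypothetical, and it uses `std::string_view` and standard containers in place of the toolchain's own `StringRef` and hashtable types.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical ID types. Keeping them distinct means an identifier ID
// cannot be passed to the string literal store by accident.
struct IdentifierId {
  int32_t index;
};
struct StringLiteralId {
  int32_t index;
};

// A canonicalizing string store: each distinct string gets exactly one ID.
// The template parameter is only the ID type, so no per-kind
// specialization is needed.
template <typename IdT>
class CanonicalStringStore {
 public:
  // Returns the existing ID for `value`, or stores it and returns a new ID.
  auto Add(std::string_view value) -> IdT {
    auto it = lookup_.find(value);
    if (it != lookup_.end()) {
      return it->second;
    }
    IdT id{static_cast<int32_t>(values_.size())};
    // std::deque never moves existing elements on emplace_back, so the
    // string_view key below stays valid for the lifetime of the store.
    values_.emplace_back(value);
    lookup_.emplace(std::string_view(values_.back()), id);
    return id;
  }

  // Returns the string stored for `id`.
  auto Get(IdT id) const -> std::string_view { return values_[id.index]; }

  // Returns the number of distinct strings stored.
  auto size() const -> std::size_t { return values_.size(); }

 private:
  std::deque<std::string> values_;
  std::unordered_map<std::string_view, IdT> lookup_;
};

// Two independent stores, each with its own hashtable and its own storage.
using IdentifierStore = CanonicalStringStore<IdentifierId>;
using StringLiteralStore = CanonicalStringStore<StringLiteralId>;
```

With this layout, adding `"foo"` as an identifier and as a string literal produces two unrelated entries, and lookups for one kind never touch the other kind's table.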