-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unique: new package with unique.Handle #62483
Comments
Oooh, I like it. I note that I've wanted both weak-keys and weak-value maps, to solve different problems. A weakly-keyed You could sort of envision the symbol mapping working this way, except that it solves a slightly different problem. In particular, a key point of the interning is that if no one still has a given symbol, it Could Have Been Anything, so it's fine if the next time someone asks, they get a new one. With a hypothetical weakly-keyed-map approach, the existence of the value-the-symbol-denotes would potentially prevent this. Not sure whether it makes more sense to try to view these as fundamentally different questions, or as inherently-related questions, because I think weak-reference maps very close to entirely solve the interning problem. |
@seebs We went deep down the weak-keyed maps (specifically, ephemerons, which preserve the GC abstraction) rabbit hole as a solution for interning. I think we concluded that it's insufficient for interning, though they are useful for other purposes (as you note, RE: Weak-value maps, I believe I came to the conclusion that those are basically weak references, and thus suffer from the same issues. |
String interning with the proposed API would be equivalent to this right? var b []byte = ...
s := intern.New(string(b)).Value() where the |
IIUC, this doesn't solve the global type cache problem because you can't attach a value to the key. Quite a few packages have a global map that caches a complex data structure for a Go type: go/src/encoding/json/encode.go Line 336 in e844d72
Today, there's a memory leak in "json" where dynamically constructed types with |
That's basically right, but in this design you're encouraged to keep the |
That seems a bit unfortunate. The "json" package when unmarshaling would be storing a |
I think the reason weak-keyed maps weren't a great solution because if you allow anything comparable except a pointer, you end up being able to break the GC abstraction. Heap pointers are special because they are an unforgeable identity. So Notably, for the pointers-only case, it's also surprisingly easy to implement, because the Go runtime already has a way to associate arbitrary data with a pointer. But, you probably also want that weak map to be cyclic. That is
It doesn't. A
I believe the only thing that would fully satisfy this particular use-case is the "opaque string interning" alternative. It may be worth considering in the future as it's somewhat orthogonal to this. I'm pessimistic that there's a single API that will solve all these problems, but I think this package is a decent starting point. |
This comment was marked as resolved.
This comment was marked as resolved.
FWIW there is also @josharian's string interning library, which works somewhat as you've described by keeping a It seems like it'd be possible to create a (I do agree that it's unfortunate that the |
I suppose we could make an exception for strings and have the FWIW, I think strings are a reasonable exception given that they're the only type that is defined as a reference yet behaves like a value type. |
This proposal has been added to the active column of the proposals project |
It occurs to me that if we're going to force a copy and a new allocation for an insertion, then strings are entirely uniquely identifiable via their data pointer (regardless of length). The |
It took me three-four reads to understand, but it sounds like a useful addition to the standard library. What I'm wondering is if there's an automatic way in the profiling tooling to recommend "hey, you should intern this value" as part of the output. Otherwise I can't really see the usage of |
Hi all, I'm still learning about Golang (I'm the group manager for Java and now Golang @ MSFT), so take all of this with that in mind :-). String Dedup - I wanted to add a quick note that in Java, String Deduplication is accessible to all of the GCs but it is still a runtime flag to enable. https://malloc.se/blog/zgc-jdk18 for the extra detail there. As you can imagine, since it's not on by default it's usage is pretty low, not sure 'we' got the user experience right there. String intern perf - Although I know you have a different design in mind, I (or more likely one of our real engineers ;-)) will take a look at Java's intern performance and see if the concerns raised above still hold true or if that was patched or a new design put in place. All good things to learn either way I guess. I'll post back here if there's anything interesting. public + internal? - Is there a use case where keeping an internal version of this for the runtime that can evolve more rapidly for internal only needs as well as an external public facing package make sense? I know if Java they started doing this a bunch with certain com.sun packages within the new module system. weak references - Your note about using weak references resonates! Speaking from Java-land, its memory management challenges are usually confined to the realms of issues with strongly referenced memory leaks or whatever, but on the odd occasion when WeakReferences go wrong then its an order of magnitude harder to debug with tooling etc. That might just be more a reflection on their design or gaps in o11y tooling but for me its always a case of this is where some Dragons live. |
'intern' may not be the best name for this package. It requires knowledge of other systems (particularly Lisp) that not everyone has. Also intern here is not the same meaning as intern string in Java. The package is about creating unique identities for values, so what about package unique? Then the bikeshed becomes what the name of the type is. 'Symbol' has the right meaning for people with a Lisp or compiler background, but not to others. Symbol has many other meanings in many other parts of CS. unique.Handle seems like it strikes the right notes: the thing is a Handle, and it aligns with cgo.Handle too (although this one uses generics but cgo predated them).
|
Change https://go.dev/cl/574355 mentions this issue: |
Change https://go.dev/cl/576297 mentions this issue: |
Change https://go.dev/cl/573955 mentions this issue: |
This change defines two commonly-defined functions and a commonly-defined type in internal/abi to try and deduplicate some definitions. This is motivated by a follow-up CL which will want access to TypeOf in yet another package. There still exist duplicate definitions of all three of these things in the runtime, and this CL doesn't try to handle that yet. There are far too many uses in the runtime to handle manually in a way that feels comfortable; automated refactoring will help. For #62483. Change-Id: I02fc64a28f11af618f6071f94d27f45c135fa8ac Reviewed-on: https://go-review.googlesource.com/c/go/+/573955 Auto-Submit: Michael Knyszek <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: David Chase <[email protected]>
This change adds a concurrent hash-trie map implementation to the standard library in the new internal/concurrent package, intended to hold concurrent data structures. (The name comes from how Java names their concurrent data structure library in the standard library.) This data structure is created specially for the upcoming unique package. It is built specifically around frequent successful lookups and comparatively rare insertions and deletions. A valid question is whether this is worth it over a simple locked map. Some microbenchmarks in this new package show that yes, this extra complexity appears to be worth it. Single-threaded performance for LoadOrStore is comparable to a locked map for a map with 128k small string elements. The map scales perfectly up to 24 cores for Loads, which is the maximum available parallelism on my machine. LoadOrStore operations scale less well. Small maps will have a high degree of contention, but for the unique library, small maps are very unlikely to stay small if there are a lot of inserts, since they have a full GC cycle to grow. For #62483. Change-Id: I38e5ac958d19ebdd0c8c02e36720bb3338fe2e35 Reviewed-on: https://go-review.googlesource.com/c/go/+/573956 Reviewed-by: David Chase <[email protected]> Auto-Submit: Michael Knyszek <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
This change adds the internal/weak package, which exposes GC-supported weak pointers to the standard library. This is for the upcoming weak package, but may be useful for other future constructs. For #62483. Change-Id: I4aa8fa9400110ad5ea022a43c094051699ccab9d Reviewed-on: https://go-review.googlesource.com/c/go/+/576297 Auto-Submit: Michael Knyszek <[email protected]> Reviewed-by: David Chase <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
This PR adds a hashtriemap implementation to the project. This is mostly a direct copy from unique package [^1] internals [^2] with some adjustments to make it work with our project. Once `maphash.Comparable` lands we can remove dependency on Go internals [^3]. Michael Knyszek says he plans to use this implementation as a general idea for the future generic sync/v2.Map, but nothing is in stone yet. [^1]: golang/go#62483 [^2]: https://go-review.googlesource.com/c/go/+/573956 [^3]: golang/go#54670 Signed-off-by: Dmitriy Matrenichev <[email protected]>
I know this has been closed and already accepted, and I'm not opposed to having small packages or anything, but I'm a bit surprised that this didn't end up in the |
Apologies if my question does not make sense, but I am still a bit confused as to where benefits can be expected (or not) with this feature. Would there be any runtime performance benefits in using a handle on a hardcoded string? For example: var data = map[unique.Handle[string]]int{
unique.Make("the_first_key"): 1,
unique.Make("a_second_key"): 2,
unique.Make("some_third_key"): 3,
}
//go:noinline
func getUsingUserInput(key string) int {
return data[unique.Make(key)]
}
//go:noinline
func getSecond() int {
return data[unique.Make("a_second_key")] // Hardcoded string
}
func main() {
userInput := "some_third_key" // Assume it's not hardcoded
fmt.Println(getUsingUserInput(userInput))
fmt.Println(getSecond())
} In the first case ( In the second ( I also realize that a hashmap is not the best example to illustrate this question, but as the goal is to understand the internals, I would have the same questions for any other kind of sorted map (red black tree for example...) |
@Seb-C In both of your cases, there would be an added cost with little benefit, yes. But the reason has little to do with whether or not the strings are constant and everything to do with the fact that you don't keep them around and only create the handle once and only do so for lookup. The benefit comes if you don't pass The prototypical example is if you have a parser and return tokens from that. If you use |
+1 to @Merovius' comment. For another example, see the |
Proposal:
unique
packageUpdated: 10 April 2024
Oct 4th EDIT: Changed package name from "intern" to "unique" and renamed "Symbol" to "Handle" based on feedback.
Apr 10th EDIT: Updated the proposal to describe the actual implementation. Notably, map entries can be dropped after just a single GC cycle instead of two, and there's no more object resurrection, which has several well-known problems.
CC @golang/runtime
Motivation
The lack of runtime support for interning in Go represents a gap with other languages. Demand in the Go community for a weak map and/or string interning has shown up in a few GitHub issues over the years. Although the section of the community requesting these features is quite small, we see the consequences of not having built-in support for this functionality in the form of the go4.org/intern package.
go4.org/intern is a package that implements a global intern pool. Its goals are two-fold: intern data to deduplicate copies, and provide fast comparisons for such interned data. Functionally, it returns a unique global identity for every value placed into it. That value is weakly held by the pool's internals, to avoid letting the pool accumulate in an unbounded manner. This implementation follows in the footsteps of Lisp: interning a value produces a "symbol" that may be used for a fast direct comparison and provides a method through which the interned value may be recovered.
Although it has only 4 direct importers, it is transitively used quite widely (estimated 0.1% of modules) thanks to use by inet.af/netaddr (reverse module deps), which uses it to deduplicate strings and reduce the cost of their comparisons. This has caused friction within the ecosystem because go4.org/intern makes assumptions about the implementation of Go. In particular, it imports the package go4.org/unsafe/assume-no-moving-gc to validate those assumptions. That package, by design, needs to be updated with every release of Go.
inet.af/netaddr's functionality has also recently been merged into the standard library as the
net/netip
package, bringing along with it a copy of go4.org/intern as theinternal/intern
package. Furthermore, go4.org/unsafe/assume-no-moving-gc now references an internal Go runtime variable to avoid having to update every release.Although the immediate issue has been mitigated, the gap in functionality remains, and we now have a copy of go4.org/intern in the standard library anyway.
Goals
The goal of this proposal is to tidy up
internal/intern's
interface with generics, integrate it more tightly into the runtime (for efficiency and future-proofing), and expose it as a standard library package.This proposal will also motivate why we should move forward with an API that's similar go4.org/intern and not some other direct or indirect ways of enabling efficient value interning.
Design
We more-or-less propose to move the go4.org/intern package API into the standard library, but with a few tweaks. The biggest difference is that go4.org/intern uses an interface to represent the "key" part of the mapping, with a special case for strings (
GetByString
) to avoid an additional allocation for the string header. Using generics, we can avoid these additional allocations while also improving type-safety and ergonomics.Here's a sketch of the revised API, which will live in a new package called "
unique
":The unique package must maintain an internal mapping of values to globally unique symbols, which risks growing without bound. However, once no copies of symbols produced by
Make
are reachable, the program loses access to that globally unique identity. The intern package is then free to produce a new one without the program noticing. If it can produce a new one, it can just as well delete the old one during that time period in which no copy of a symbol exists for a given value.In practice, integration with the garbage collector is necessary to achieve this, introducing the risk of leaking details about the garbage collector's implementation.
A key property to consider with functionality that relies on specific behavior from the garbage collector is whether it preserves the illusion that the executing program has infinite memory (with a GC there's a
malloc
but nofree
). The utility of such an illusion is that it makes programs significantly simpler to reason about. When programs are able to observe the fact that the garbage collector reclaimed memory, it is possible for programs to rely on non-deterministic properties of that reclamation. Finalizers are a good example of a feature that breaks the illusion and are difficult to use correctly.Luckily, it's not possible for programs to notice when
New
returns a differentHandle[T]
precisely because it can produce new symbols without the program noticing. Therefore, it's not possible for a program to observe when memory is reclaimed. (Someone could use the unsafe package to write down theHandle[T]'s
internal value as auintptr
, but that specifically requires the use of unsafe. One also can't do much with thatuintptr
; casting that back to a pointer violates theunsafe.Pointer
rules.)Implementation
The core data structure is approximately a
map[any]*T
, where the*T
is a weak reference. The runtime fully controls the*T
created here, so it would attach aspecial
that represents a handle to the value that can go nil once the object is reclaimed. Once the*T
is no longer referenced, the GC will clear the handle. Later, a background goroutine will clean up map entries with nil handles.Note that user code here is racing with the garbage collector. It's possible that in between the garbage collector noticing that there are no more references to the
*T
and the value being collected, the program may read the value out and produce a new strong pointer to the*T
. To alleviate this, the reader must synchronize with the garbage collector. To avoid issues with object resurrection and to allow for immediate reclamation of memory, the reader can simply ensure the span containing theT
is always swept before accessing the weak pointer handle. The only costs here are thus the highly-optimized span lookup and, once per GC, a single goroutine may need to sweep the span, which is generally quite fast. Note that if a value is being lookup frequently, then only the first lookup in each GC cycle will need to sweep; otherwise it'll return quickly.This is quite different to the implementation of go4.org/intern, and is only possible by accessing runtime internals. It allows us to reclaim memory in just a single GC cycle instead of the 3 that are currently required by the intern package.
Next is the map implementation itself. Because the map will be global, the underlying data structure will need to be thread-safe. Unfortunately there are far too many choices for a concurrent map data structure, so we need to cut them down by setting some goals and requirements. Here's a list:
A traditional bucketed hash map is not terribly difficult to make concurrently under these circumstances, but has the downside that incremental growth and shrinking of such a map is quite complicated. Trying to extend the existing
map
implementation with concurrency features is also likely to be complicated.Another option is to use something like an adaptive radix tree over 8-byte hash values. The growth and shrinking of such a tree comes naturally, and making them concurrent within the bounds of our requirements isn't too complicated. The downside is poor locality because of the tree structure. I suspect that at least to begin with, something like this will provide good enough performance.
For the actual implementation, we pick a simplified form of the adaptive radix tree specialized for
uintptr
values (hashes), essentially forming a hash-trie. It's straightforward to make reads out of this data structure perfectly scalable. Insertions and deletions will be performed using fine-grained locking techniques, reducing contention.As an additional note on performance, calls to
Make
should always avoid allocating in the case where the provided value is already in the map. Meanwhile they should explicitly clone the value when adding it to the map. (This means ifT
is a struct with pointers in it, the values those pointers point to are almost always going to be forced to escape to the heap, like with regular Go maps.)Risks
API design impact
The fact that interned values are represented by a type
Handle[T]
means that details around interning may encourage two versions of APIs, one supporting interned values and the other not. Contrast this with an "opaque" alternative (see "Opaque string interning" in the alternatives section) that makes no distinction between interned and non-interned values in the type system.I believe this is a legitimate concern, but suspect that it will mostly be mitigated in practice. Interning is a somewhat niche technique that helps performance dramatically in certain cases, but only in those certain cases that clearly benefit from data deduplication and/or fast comparisons of data. Elsewhere it's more clearly just cumbersome and simple microbenchmarks should reveal the slowdown.
Thus, I think "polluting" APIs in the cases where it's useful is likely worth the tradeoff, and it's sufficiently cumbersome to use that where it's not necessary it will simply be ignored. I believe the fact that go4.org/intern has relatively few direct importers supports this hypothesis.
Poor performance
One situation we absolutely want to avoid is performance issues like with Java's
String.intern
. My best understanding of the situation (as of when the linked blog post was written) is that:I believe we avoid all three of these issues in the implementation described above. The third one is less obvious if you're not familiar with the existing Go GC implementation. Briefly, the global intern map would only be a GC root insofar as any global variable is a GC root (i.e. the entire map would not be part of the root set, just the reference to it). And even if it was fully part of the root set, the Go GC already shards and scans the root set concurrently with the mutator executing. Hence, no pause-time impact.
Disclaimer: I don't know if this is still an issue in the Java world; I didn't look into this too deeply. If it's not, then that's great to hear. Nevertheless, it's still worthwhile to learn from past attempts at interning.
Alternatives considered
Opaque string interning
One alternative is interning just for strings, which is a common case. For instance:
Although such an API is temptingly simple, it's not really useful for any other kind of data structure. Only strings are properly immutable in the language.
Furthermore, because the interning is opaque, we don't get the full benefit of cheap comparisons out-of-the-box. The string comparison fast path would be taken more frequently, but when there's no pointer equality the string data would still need to be compared. To get the full benefit, there would need to be some additional runtime and/or compiler support to identify when two interned strings are being compared, which may end up slowing down non-interned string operations as well.
This is still worth considering for the future, but doesn't properly address the use-cases this proposal intends to address.
Background string deduplication
Another alternative is to just have the runtime deduplicate strings in the background. For instance, the JVM G1 garbage collector has a flag for such a feature (off by default). The advantage of this approach is the programmer has to set one flag and they get to save on memory costs.
However, like the previous alternative, this only really applies to strings again, because only strings are properly immutable at the language level. The other problem is that this feature would need to be opt-in, requiring a new top-level runtime knob. (Presumably this feature isn't on by default because it's not always worth the CPU cost in the garbage collector.) It's also substantially more complex to implement this in the Go garbage collector, because it doesn't currently know the type of anything in the heap.
Weak references
An even more general alternative to the proposed API is to just add support for weak references to the standard library and/or language. After all, the proposed implementation conceptually just uses a weak reference in its implementation anyway.
The main issue with weak references is that they're very hard to use in a system with tracing garbage collection, since they can turn out to be nil at very surprising times (or possibly never). Fundamentally, they break the aforementioned infinite memory illusion, because they reveal when memory is reclaimed.
The bar is extremely high for adding anything this difficult to use, and I believe we should prefer easier-to-use abstractions as much as possible.
The text was updated successfully, but these errors were encountered: