This repository has been archived by the owner on Jul 8, 2024. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 15
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Minimal core vs. per-method override
- Loading branch information
Showing
1 changed file
with
189 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,189 @@ | ||
# Minimal core vs. per method override | ||
|
||
This document is created to discuss [issue raised on TC39 meeting](https://docs.google.com/presentation/d/1BeqsmXGPm_GEAApnpIfVZd3KTOykS4tQVQ7FvYTDtHg/edit#slide=id.g4e7e69452c_0_386). | ||
|
||
### Minimal core | ||
You should be able to override a core set of methods, then all new methods added in this proposal will automatically work. | ||
|
||
|
||
### Per method override | ||
You should be able to override all methods that you want to work properly, and then they will work properly. | ||
|
||
## Worth reading | ||
|
||
[Avoid "delete" and "add"](https://github.com/tc39/proposal-set-methods/issues/50) | ||
|
||
[TC39 notes](https://github.com/rwaldron/tc39-notes/blob/master/es9/2019-01/jan-29.md#update-on-set-methods) | ||
|
||
[rekey proposal](https://github.com/tc39/proposal-richer-keys/tree/master/collection-rekey) | ||
|
||
## Main points: | ||
|
||
These are quotes from notes and issues in this repository. If I have misquoted any fragment, please let me know. | ||
|
||
### Arguments in favor of minimal core | ||
|
||
> the Set constructor [already creates a contract around having `.add`](https://tc39.github.io/ecma262/#sec-set-iterable), though it's hard to observe. | ||
~ @bakkot | ||
|
||
> I don't think "customize the algorithm's behavior" is the right way to think of this. The point of `Symbol.iterator` is to allow consumers to iterate over the contents a container. Algorithms which want to iterate over the contents of a container should therefore use `Symbol.iterator`. Internal slots are precisely that - internal. Subclasses shouldn't have to worry about them as long as they maintain a consistent public API. (I very much do not agree that a well-behaved subclass must always call its super methods.) | ||
~ @bakkot | ||
|
||
> it is not the case that "per-method override" gives you everything you want. If I pass an instance of `class LoggingSet extends Set { has(x) { console.log(x); return super.has(x); } }` - approximately the simplest nontrivial subclass imaginable - to `(new Set([0, 1])).isSubsetOf(instance)` as you would like it to be defined, it will not trigger my instrumentation and I cannot make it trigger my instrumentation. Even though any other consumer of my instance obviously would. | ||
~ @bakkot | ||
|
||
> Requiring subclasses to overwrite all methods ever added does not work; experience from Ruby demonstrates this. | ||
~ @wycats | ||
|
||
> Agree with both Kevin and Yehuda, and rekey must be orthogonal to the discussion because there are already examples of conceivable Set subclasses that cannot be implemented with Set plus rekey. | ||
~ @gibson042 | ||
|
||
> My hope is that with a Set subclass, I can get the new functionality for free. | ||
~ @michaelficarra | ||
|
||
|
||
### Arguments in favor of per-method override | ||
|
||
|
||
> I'd love to do this --remove all lookups, operate on an internal list and create a Set finally from this. Three main reasons I didn't do this is: | ||
> (a) Consistency: we'd be moving away from the pattern used by the Set constructor (which looks up the `add` method). I wasn't sure if the committee would be happy with this. | ||
> (b) Subclassing becomes less useful if internal operations are going to bypass overridden methods | ||
> (c) The runtime cost that @bakkot described (... although this isn't going to happen for almost all common cases) | ||
~ @gsathya | ||
|
||
> I continue to be struck by the over-complexity that we are injecting into the spec, in the name of genericism, subclassability, and the desire to touch the public API as many times as possible in order to give other classes (or other thisArgs) a chance to insert their own behaviors. As someone who did the same thing with promises in ES2015, and lived to regret it greatly, it's heartbreaking to see those same patterns propagated further. I'd hoped we'd do things differently going forward. | ||
> (...) | ||
> The guiding principles, which I think are fairly generally applicable, are: | ||
> * Convert from user-hookable objects (`Set` or iterable) to spec-internal data structures ASAP, "at the boundary". | ||
> * Operate entirely on the spec-internal data structures for the actual algorithm you're performing. | ||
> * Convert back to user-hookable objects right before returning to the caller, no sooner | ||
> * Extract and emplace data out of/into `Set`s by using their internal slots as the source of truth | ||
> * Hard-code your object creation to not be hookable | ||
> * I'm unsure on this particular principle, as it just seems kind of nicer to use `new this.constructor()`. But hard-coding is way nicer than the gross species dance, and subclassing in general has not panned out well. The dream that "all the inherited methods will automatically do the right thing" that drives using `new (this.constructor && this.constructor[Symbol.species] || Set)()` has failed pretty badly, so I'd rather abandon it than keep it up. | ||
~ @domenic | ||
|
||
> I think "cripples subclassing" is overstating it a bit. It's important to separate subclassing and the desire to touch public APIs. | ||
> Subclassing still works fine in general. The only thing that might be slightly surprising is that you get back `Set` instances instead of `Subclass` instances. But we could back off a bit and re-include `new this.constructor()` (or, if you must, `new (this.constructor && this.constructor[Symbol.species] || Set)()`). | ||
> It's the pattern of going through as many public APIs as possible, in order to allow arguments or thisArgs to customize the algorithm's behavior, that the approach specifically avoids. Instead it treats the [[SetData]] as the source of truth---but any well-behaved subclass (i.e. one that calls super constructor and super methods) will work fine with that constraint. | ||
~ @domenic | ||
|
||
> When operating on a Set, `[[SetData]]` is a better source of truth in my opinion than the monkey-patchable public API. Subclasses which don't keep an accurate `[[SetData]]` are not, in my opinion, "perfectly reasonable". | ||
~ @domenic | ||
|
||
> Yes, that is the minimal-core principle. As I stated, I think that has led to a terrifying increase in complexity in the ES6-era classes, with excessive customizability and monkey-patch interception points permeating the operation of every method and constructor. I hope we do not propagate that cost forward, and instead create a simpler subclassing contract for `Set` than we did for `Promise`, `RegExp`, etc. | ||
~ @domenic | ||
|
||
> I did part of the work on the implementation of ES6 RegExp and Array subclassing in V8, and I deeply regret it. To avoid performance regressions while adding these observable points, we ended up adding "performance cliffs": The algorithm was something like, check whether anyone is observing it, if not take the fast path, otherwise take a new, slow path. Since my work there (which I'm not proud of at all), my colleagues in the V8 team have gone back and sped up many of those slow paths, but the performance difference remains, and an incredible amount of engineering effort went into this project. | ||
> Meanwhile, years later, it's still not clear what kind of software engineering benefit those changes brought to programmers. I don't get the impression that lots of people are deliberately subclassing `RegExp` and overriding `exec` for memoization. And if they do, they will find their RegExp operations run more slowly. | ||
~ @littledan | ||
|
||
|
||
## Enforcing protocols on receiver (`this`) | ||
|
||
Currently this proposal requires `this` to have `.has`, `.delete` and `.add` methods. | ||
|
||
Using `.delete` is algorithm choice and probably will be removed soon, because of [issue #56](https://github.com/tc39/proposal-set-methods/issues/56) (Set.prototype.symmetricDifference spec has unexpected behavior when given iterable containing duplicate elements). | ||
|
||
|
||
## Prior art | ||
|
||
### Java | ||
|
||
#### `Set` | ||
|
||
`Set` interface is very limited, so it can't be used as our inspiration. | ||
|
||
#### Stream API | ||
|
||
[Java 8 Stream API](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html). | ||
|
||
[Interface Stream<T>](https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html) | ||
|
||
Even though `Stream` implements most of methods in this proposal, it's more similar to [iterator helpers proposal](https://github.com/tc39/proposal-iterator-helpers), as it's generic and not tied to `Set`. | ||
|
||
### C# HashSet | ||
|
||
[`HashSet`](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.hashset-1) inherits set methods from [`IEnumerable<T>`](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1). Therefore, it uses approach more similar to "minimal core", because no `HashSet` internals are touched directly. | ||
|
||
### F# Set | ||
|
||
[Docs](https://msdn.microsoft.com/en-au/vstudio/ee340244(v=vs.89)) | ||
|
||
|
||
[Source code](https://github.com/fsharp/fsharp/blob/cb6cb5c410f537c81cf26825657ef3bb29a7e952/src/fsharp/FSharp.Core/set.fs) | ||
|
||
F# are sealed (similar to Java's `final`). Instance methods delegates to static `Set` or internal `SetTree` module. | ||
|
||
Internal `SetTree` methods are implemented like: | ||
|
||
```fsharp | ||
let subset comparer a b = forall (fun x -> mem comparer x b) a | ||
let psubset comparer a b = forall (fun x -> mem comparer x b) a && exists (fun x -> not (mem comparer x a)) b | ||
let rec filterAux comparer f s acc = | ||
match s with | ||
| SetNode(k,l,r,_) -> | ||
let acc = if f k then add comparer k acc else acc | ||
filterAux comparer f l (filterAux comparer f r acc) | ||
| SetOne(k) -> if f k then add comparer k acc else acc | ||
| SetEmpty -> acc | ||
let filter comparer f s = filterAux comparer f s SetEmpty | ||
``` | ||
|
||
|
||
|
||
### Ruby Set | ||
|
||
[Docs](https://ruby-doc.org/stdlib-2.6.1/libdoc/set/rdoc/Set.html) (include source code for each method). | ||
|
||
Methods in Ruby 2.6 are implemented using "minimal core" approach. | ||
|
||
Example: | ||
|
||
```ruby | ||
# File set.rb, line 259 | ||
def superset?(set) | ||
case | ||
when set.instance_of?(self.class) && @hash.respond_to?(:>=) | ||
@hash >= set.instance_variable_get(:@hash) | ||
when set.is_a?(Set) | ||
size >= set.size && set.all? { |o| include?(o) } | ||
else | ||
raise ArgumentError, "value must be a set" | ||
end | ||
end | ||
``` | ||
|
||
Therefore `superset` method call `this.all` (`.every` equivalent) and `this.is_a`. | ||
|
||
### Python set | ||
|
||
* [Python - set/frozenset](https://docs.python.org/3.6/library/stdtypes.html#set) | ||
|
||
[Example `set_symetric_difference`](https://github.com/python/cpython/blob/bb86bf4c4eaa30b1f5192dab9f389ce0bb61114d/Objects/setobject.c#L1710) | ||
|
||
Python `set` methods use "per-method override" approach. | ||
|
||
### Hack's HH\Set | ||
|
||
[Source code](https://github.com/facebook/hhvm/blob/ef523aef97e38c3a3d4341f4f8c3dcb6468d978d/hphp/runtime/ext/collections/ext_collections-set.php#L47) | ||
|
||
`HH\Set` implements only `difference` method. It delegates to `removeAll` method, but `HH\Set` class is *final*. | ||
|
84ca64e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bakkot @wycats @gibson042 @michaelficarra @gsathya @domenic @littledan
I have quoted your opinions. Let me know if I have misunderstood or misquoted them.
84ca64e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ginden, you haven't misquoted me, but the third quote is a little misleading in this context. My point was that there is an additional issue distinct from the "minimal core" issue, which is: should it be possible to build a
Set
subclass which overrideshas
and whose instances still behave consistently when passed as an argument to a method fromSet.prototype
?For this reason I think you should be more precise with the phrasing of "minimal core" issue. Specifically, "work" is ambiguous between "work with a subclass instance as their receiver" and "work with a subclass instance as their argument". The minimal core thing is concerned only with the first of these, and has no particular relationship to the second except in that such a minimal core, if it existed, would be a natural way for methods to interact with their arguments.
Also, you may wish to add Allen's opinion, as seen here and here.
As a small side note, Java's implementation actually is rich enough to draw inspiration from, if you look in the right place: any class which extends
AbstractSet
and providesadd
,remove
,size
, anditerator
methods (and whose iterators provideremove
) will get the right behavior for a host of other methods, includingcontainsAll
(effectivelyisSupersetOf
),retainAll
(effectively a mutatingintersection
), etc. The same is true for the concrete implementations: if you subclassHashSet
and overridecontains
, for example, thecontainsAll
method which your subclass will inherit will behave correctly.That is, Java is firmly in the "minimal core" camp.