-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port “The Architecture of Scala Collections” to the new architecture #1078
Port “The Architecture of Scala Collections” to the new architecture #1078
Conversation
Instead of replacing the existing documentation, it creates a new page for 2.13’s collections.
- Start by listing the challenges addressed by the collections framework, - Fix the markdown syntax used for tables so that they can be rendered by GitHub, - Mention that custom collections can be mutables or immutables, - Add an appendix listing the transformation operations that are candidates to be overloaded to support the “same result type” principle, - Add an appendix showing how to cross-build custom collection implementations with sbt.
I picked the contents of #960 and moved it to a new documentation page (without removing the existing documentation page): I updated the contents so that it works with the last changes we applied to the collections framework. There is still an area where we could improve the framework (and where we would have to update this document): scala/collection-strawman#325 should give us reusable builder-based operation implementations (in the current document we re-implement I addressed most of the feedback given in #960 by @lrytz and @SethTisue. |
What do you think should be done about the collections user guide (https://docs.scala-lang.org/overviews/collections/introduction.html)? Most content would hopefully be the same, so there it's maybe better to have a single version with a version-specific sections where necessary. |
This implementation now behaves correctly, but we can still improve | ||
a few things. Since our collection is strict, we can take advantage | ||
of the better performance offered by | ||
of strict implementations for transformation operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"by of" – I think you meant "offered by strict implementations of transformation operations"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, thanks!
operations to take advantage of strict builders, | ||
- uses a strict mode for overloads of transformation operations that return | ||
an `RNA`, | ||
- has a companion object that extends `SpecificFactory[Base, RNA]`, which makes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SpecificIterableFactory[Base, RNA]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
with StrictOptimizedIterableOps[(String, A), mutable.Iterable, PrefixMap[A]] { | ||
|
||
var suffixes: immutable.Map[Char, PrefixMap[A]] = immutable.Map.empty | ||
var value: Option[A] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Access modifiers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
|
||
## Appendix: Methods to overload to support the “same result type” principle ## | ||
|
||
You want to add overloads to specialize a transformation operations such that they return a more specific result. Examples are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"specialize transformation operations", minus "a". Also "return a more specific result type" seems more precise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
- `map`, on `SortedSet`, when an implicit `Ordering` is available for the resulting element type, should return a | ||
`SortedSet` (instead of a `Set`). | ||
|
||
The following table lists transformation operations that might return a too wide type. You might want to overload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subtle, but "return a too wide type" is slightly awkward, "return too wide a type" more natural. Or "return an undesirably wide type" is perhaps an alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestions!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Forgot to enter a general comment with the inline ones on my random unsolicited review…).
Having read it straight through, to echo something @SethTisue mentioned on #960, I still found coverage of Views to be conspicuously thin on the concept. Builders are covered now and their place in the interfaces of IterableOps
and IterableFactory
plus the examples makes their role evident enough. But we have View
referred to as a concrete collection type itself and also seen in snippets of View.Filter
etc. to produce an Iterable
… it didn't connect the dots for me on the design of the abstraction and its mechanics in the architecture. I'd need another reference source or time diving into the library code before I could attempt a lazy collection implementation myself—perhaps it doesn't help exposition on Views that all three case study implementations are strict.
If Views will be covered in the Collections overview as they are today then it might be a moot point, but as that's more a user guide than library developer one, it still seems like their role in the architecture is pertinent here.
@julienrf is right that this doc has grown lengthy. If it were to grow further, it could be split into a few pages. These could be one covering concepts and mechanics, one with strict implementation examples, and one lazy. It may be hard to detach the first from any running example, though.
On the strict side, this was a great introduction to how the new collections work and at how implementing my own can feel approachable now.
Agree that we should explain more about how views are implemented, what role iterators play. There should also be some overview on the differences between views, iterators and lazy lists, but that should go in the collections user guide, not here. |
Good point, I’ve updated #995 to mention this idea. |
OK, I’ll add a short section about views, but I think they should be covered more deeply in #995. |
} | ||
~~~ | ||
|
||
And then leaf collection types appropriately instantiate the type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sentence is a bit hard to understand because it starts with "And then"; maybe "Leaf collection types then instantiate the type parameters appropriately."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
specific than `IterableOps`), which returns a `Map`, but if it doesn’t type | ||
check (in case the return type of the function passed to the `map` call is not | ||
a pair), it fallbacks to the definition from `IterableOps`, which returns an | ||
`Iterable`. This is how we follow the “same-result-type” principle: wherever |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't exactly describe what happens in overloading resolution. Slightly more correct:
"If the function passed as argument to map
returns a pair, both functions are applicable. In this case, the version from MapOps
is used because it is more specific by the rules of overloading resultion, so the resultion collection is a Map
. If the argument function does not return a pair, only the version defined in IterableOps
is applicable. In this case, the resulting collection is an Iterable
."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
![]({{ site.baseurl }}/resources/images/collections-architecture.svg) | ||
|
||
Template traits are in grey whereas collection types are in white. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the grey could be a bit darker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
`Seq`, `Set`, `Map` or just `Iterable`? In our case, it is tempting | ||
to choose `Seq` because our collection can contain duplicates and | ||
iteration order is determined by insertion order. However, some | ||
properties of [`Seq`](/overviews/collections/seqs.html) are not satisfied: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cover more text under the link ([properties of `Seq`](/overviews/collections/seqs.html)
), otherwise it's hard to see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
behave as expected except for the last one: after calling `take` we | ||
get back a `List` instead of the expected `Capped1` collection. This | ||
is because all that was done in class | ||
[`Capped1`](#first-version-of-capped-class) was making `Capped1` extend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cover more text under the link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
### Final version of RNA strands class ### | ||
|
||
~~~ scala |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some missing imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
strictly necessary to define this companion object, as class `PrefixMap` | ||
can stand well on its own. The main purpose of object `PrefixMap` is to | ||
define some convenience factory methods. It also defines an implicit | ||
`Factory` for a better interoperability with other collections. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be explained in more detail. Why do we need it / are those in the std lib not enough? What does it do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
We'll now turn to the companion object `PrefixMap`. In fact it is not | ||
strictly necessary to define this companion object, as class `PrefixMap` | ||
can stand well on its own. The main purpose of object `PrefixMap` is to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should talk about when it's necessary to have a companion, what the XFactory
traits are for etc.
`SortedSet` (instead of a `Set`). | ||
|
||
The following table lists transformation operations that might return an undesirably wide type. You might want to overload | ||
these operations to return a more specific type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe give some explanation when this is necessary - basically when the collection fixes some type parameter, like LongMap, BitSet, etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
And then you can define a Scala 2.13 compatible implementation of your collection | ||
in the `src/main/scala-2.13+` source directory, and an implementation for the | ||
previous Scala versions in the `src/main/scala-2.13-` source directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could include links to projects, like the scalacheck PR (typelevel/scalacheck#411)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Where would we document how to write extension methods for collections? Here or in the user guide? |
@lrytz That’s a good question. On one hand extension methods are a different topic than custom collections, so it would justify a separate documentation page, on the other hand, being familiar with the architecture definitely helps to understand how to write extension methods, so a lot of contents from this page is relevant. I think that makes another argument to split this page in two: one part about the architecture of the collections, and another part about the implementation of custom collections (as suggested by @ches). Then we could have a third page documenting how to write extension methods (and the first page would be a prerequisite for both implementing custom collections and extension methods). What do you think? |
I think splitting it up sounds very good, it's a really long page already. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think this is good to go (can always be refined if necessary).
Instead of replacing the existing documentation, it creates a new
page for 2.13’s collections.