Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port “The Architecture of Scala Collections” to the new architecture #1078

Merged
merged 6 commits into from
Jun 8, 2018

Conversation

julienrf
Copy link
Contributor

Instead of replacing the existing documentation, it creates a new
page for 2.13’s collections.

Instead of replacing the existing documentation, it creates a new
page for 2.13’s collections.
- Start by listing the challenges addressed by the collections
  framework,
- Fix the markdown syntax used for tables so that they can be rendered
  by GitHub,
- Mention that custom collections can be mutables or immutables,
- Add an appendix listing the transformation operations that are
  candidates to be overloaded to support the “same result type”
  principle,
- Add an appendix showing how to cross-build custom collection
  implementations with sbt.
@julienrf
Copy link
Contributor Author

I picked the contents of #960 and moved it to a new documentation page (without removing the existing documentation page):

index

I updated the contents so that it works with the last changes we applied to the collections framework.

There is still an area where we could improve the framework (and where we would have to update this document): scala/collection-strawman#325 should give us reusable builder-based operation implementations (in the current document we re-implement map, concat and a few others).

I addressed most of the feedback given in #960 by @lrytz and @SethTisue.

@julienrf julienrf requested a review from SethTisue May 31, 2018 08:32
@lrytz
Copy link
Member

lrytz commented Jun 2, 2018

What do you think should be done about the collections user guide (https://docs.scala-lang.org/overviews/collections/introduction.html)? Most content would hopefully be the same, so there it's maybe better to have a single version with a version-specific sections where necessary.

This implementation now behaves correctly, but we can still improve
a few things. Since our collection is strict, we can take advantage
of the better performance offered by
of strict implementations for transformation operations.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"by of" – I think you meant "offered by strict implementations of transformation operations"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thanks!

operations to take advantage of strict builders,
- uses a strict mode for overloads of transformation operations that return
an `RNA`,
- has a companion object that extends `SpecificFactory[Base, RNA]`, which makes
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SpecificIterableFactory[Base, RNA]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

with StrictOptimizedIterableOps[(String, A), mutable.Iterable, PrefixMap[A]] {

var suffixes: immutable.Map[Char, PrefixMap[A]] = immutable.Map.empty
var value: Option[A] = None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Access modifiers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.


## Appendix: Methods to overload to support the “same result type” principle ##

You want to add overloads to specialize a transformation operations such that they return a more specific result. Examples are:
Copy link

@ches ches Jun 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"specialize transformation operations", minus "a". Also "return a more specific result type" seems more precise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

- `map`, on `SortedSet`, when an implicit `Ordering` is available for the resulting element type, should return a
`SortedSet` (instead of a `Set`).

The following table lists transformation operations that might return a too wide type. You might want to overload
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subtle, but "return a too wide type" is slightly awkward, "return too wide a type" more natural. Or "return an undesirably wide type" is perhaps an alternative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestions!

Copy link

@ches ches left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Forgot to enter a general comment with the inline ones on my random unsolicited review…).

Having read it straight through, to echo something @SethTisue mentioned on #960, I still found coverage of Views to be conspicuously thin on the concept. Builders are covered now and their place in the interfaces of IterableOps and IterableFactory plus the examples makes their role evident enough. But we have View referred to as a concrete collection type itself and also seen in snippets of View.Filter etc. to produce an Iterable… it didn't connect the dots for me on the design of the abstraction and its mechanics in the architecture. I'd need another reference source or time diving into the library code before I could attempt a lazy collection implementation myself—perhaps it doesn't help exposition on Views that all three case study implementations are strict.

If Views will be covered in the Collections overview as they are today then it might be a moot point, but as that's more a user guide than library developer one, it still seems like their role in the architecture is pertinent here.

@julienrf is right that this doc has grown lengthy. If it were to grow further, it could be split into a few pages. These could be one covering concepts and mechanics, one with strict implementation examples, and one lazy. It may be hard to detach the first from any running example, though.

On the strict side, this was a great introduction to how the new collections work and at how implementing my own can feel approachable now.

@lrytz
Copy link
Member

lrytz commented Jun 4, 2018

Agree that we should explain more about how views are implemented, what role iterators play.

There should also be some overview on the differences between views, iterators and lazy lists, but that should go in the collections user guide, not here.

@julienrf
Copy link
Contributor Author

julienrf commented Jun 4, 2018

What do you think should be done about the collections user guide (https://docs.scala-lang.org/overviews/collections/introduction.html)? Most content would hopefully be the same, so there it's maybe better to have a single version with a version-specific sections where necessary.

Good point, I’ve updated #995 to mention this idea.

@julienrf
Copy link
Contributor Author

julienrf commented Jun 4, 2018

Agree that we should explain more about how views are implemented, what role iterators play.

OK, I’ll add a short section about views, but I think they should be covered more deeply in #995.

}
~~~

And then leaf collection types appropriately instantiate the type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentence is a bit hard to understand because it starts with "And then"; maybe "Leaf collection types then instantiate the type parameters appropriately."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

specific than `IterableOps`), which returns a `Map`, but if it doesn’t type
check (in case the return type of the function passed to the `map` call is not
a pair), it fallbacks to the definition from `IterableOps`, which returns an
`Iterable`. This is how we follow the “same-result-type” principle: wherever
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't exactly describe what happens in overloading resolution. Slightly more correct:

"If the function passed as argument to map returns a pair, both functions are applicable. In this case, the version from MapOps is used because it is more specific by the rules of overloading resultion, so the resultion collection is a Map. If the argument function does not return a pair, only the version defined in IterableOps is applicable. In this case, the resulting collection is an Iterable."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


![]({{ site.baseurl }}/resources/images/collections-architecture.svg)

Template traits are in grey whereas collection types are in white.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the grey could be a bit darker

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

`Seq`, `Set`, `Map` or just `Iterable`? In our case, it is tempting
to choose `Seq` because our collection can contain duplicates and
iteration order is determined by insertion order. However, some
properties of [`Seq`](/overviews/collections/seqs.html) are not satisfied:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cover more text under the link ([properties of `Seq`](/overviews/collections/seqs.html)), otherwise it's hard to see it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

behave as expected except for the last one: after calling `take` we
get back a `List` instead of the expected `Capped1` collection. This
is because all that was done in class
[`Capped1`](#first-version-of-capped-class) was making `Capped1` extend
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cover more text under the link

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


### Final version of RNA strands class ###

~~~ scala
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some missing imports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

strictly necessary to define this companion object, as class `PrefixMap`
can stand well on its own. The main purpose of object `PrefixMap` is to
define some convenience factory methods. It also defines an implicit
`Factory` for a better interoperability with other collections.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be explained in more detail. Why do we need it / are those in the std lib not enough? What does it do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


We'll now turn to the companion object `PrefixMap`. In fact it is not
strictly necessary to define this companion object, as class `PrefixMap`
can stand well on its own. The main purpose of object `PrefixMap` is to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should talk about when it's necessary to have a companion, what the XFactory traits are for etc.

`SortedSet` (instead of a `Set`).

The following table lists transformation operations that might return an undesirably wide type. You might want to overload
these operations to return a more specific type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe give some explanation when this is necessary - basically when the collection fixes some type parameter, like LongMap, BitSet, etc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


And then you can define a Scala 2.13 compatible implementation of your collection
in the `src/main/scala-2.13+` source directory, and an implementation for the
previous Scala versions in the `src/main/scala-2.13-` source directory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could include links to projects, like the scalacheck PR (typelevel/scalacheck#411)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@lrytz
Copy link
Member

lrytz commented Jun 4, 2018

Where would we document how to write extension methods for collections? Here or in the user guide?

@julienrf
Copy link
Contributor Author

julienrf commented Jun 4, 2018

@lrytz That’s a good question. On one hand extension methods are a different topic than custom collections, so it would justify a separate documentation page, on the other hand, being familiar with the architecture definitely helps to understand how to write extension methods, so a lot of contents from this page is relevant.

I think that makes another argument to split this page in two: one part about the architecture of the collections, and another part about the implementation of custom collections (as suggested by @ches). Then we could have a third page documenting how to write extension methods (and the first page would be a prerequisite for both implementing custom collections and extension methods). What do you think?

@lrytz
Copy link
Member

lrytz commented Jun 4, 2018

I think splitting it up sounds very good, it's a really long page already.

@julienrf
Copy link
Contributor Author

julienrf commented Jun 5, 2018

I’ve done the splitting in the last commit I’ve pushed.

I’ve kept the Git history to make the review process easier (hopefully…).

Here is how the index looks like now:

overviews

Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think this is good to go (can always be refined if necessary).

@SethTisue SethTisue merged commit 124c465 into scala:master Jun 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants