-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ranged-based for
for user-defined types
#1885
ranged-based for
for user-defined types
#1885
Conversation
3bc148b
to
c7b8ccf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is looking promising!
I think we should have at least a sketch of a plan for mutability in this proposal. Given that our main way of providing mutable handles to objects is via pointers, I think we'd want some way to convert a container into a range of pointers to container elements. I think that can be done without any changes to this proposal, for example:
fn F() {
var v: Array(i32, 5) = ...;
for (p: i32* in v.MutableElements()) {
// ...
}
}
where MutableElements
produces an Iterable where .ElementType == .Self.ElementType*
. (Maybe we'd want a different interface for converting the container to a sequence of element pointers, but we don't need to decide on that here.)
proposals/p1885.md
Outdated
The solution highlighted embeds the end condition in `Next()`, returns a null | ||
`Optional` at the last elements. This guarantees that `Iterator`s cannot be used | ||
improperly (code safety goal), and exposes a simple interface, but does have a | ||
performance cost (cost of the `Optional`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like it could also have an interoperability and migration cost, because Carbon ranged-for loops won't work with C++ containers, and vice-versa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option for addressing interoperability without binding ourselves to the C++ API would be to have the interop layer automatically inject adopters for language-conforming iterators. e.g., a Carbon Iterable
could get adapted to look like a C++ iterator for C++ code, whereas a C++ iterator could have an external impl defined for Carbon code (and likely should need that regardless), letting each language keep its own idioms.
(it may make sense to mark interop as an open question since it's not designed out?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be challenging to bridge the gap between these two APIs with adapters, because there's a substantial "impedance mismatch" between them that's hard to hide.
- With this design, a Carbon iterator basically plays the same role as a begin/end iterator pair in C++, so there isn't a one-to-one relationship between the objects on either side of the adaptation. That will be a particular problem for automatic adaptation, but may also cause problems for manual adaptation.
- A C++ forward iterator (or better) must support calling
operator*
multiple times, returning the same result so long as the iterator is not advanced. To satisfy that requirement, an adaptor would have to cache the value returned fromNext
. - C++ iterators support mutation via
operator*
. That compounds the previous problem, because mutation has to apply to the original element, not a cached copy. - We could avoid those last two problems by saying that Carbon iterators are exposed in C++ as input iterators, but that excludes mutating use cases, and could dramatically degrade performance even for read-only cases, because many algorithms can't be implemented efficiently (or at all) in terms of the input iterator API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last points definitely highlight the need for a Get
function, in place of a simpler Next
-only approach. I propose to move in that direction.
Regarding the iterator pairs, it is an option to expose an way to get an end iterator for comparison purposes, and interoperability. From a user standpoint, I would probably prefer to not be forced to rely on this end operator, which can be akward at times (needing the container in scope and such), but as a additional feature I believe it makes sense, and likely to have other uses (reverse and such).
Edit: Proposal amended with Get
for Iterator
.
proposals/p1885.md
Outdated
The solution highlighted embeds the end condition in `Next()`, returns a null | ||
`Optional` at the last elements. This guarantees that `Iterator`s cannot be used | ||
improperly (code safety goal), and exposes a simple interface, but does have a | ||
performance cost (cost of the `Optional`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option for addressing interoperability without binding ourselves to the C++ API would be to have the interop layer automatically inject adopters for language-conforming iterators. e.g., a Carbon Iterable
could get adapted to look like a C++ iterator for C++ code, whereas a C++ iterator could have an external impl defined for Carbon code (and likely should need that regardless), letting each language keep its own idioms.
(it may make sense to mark interop as an open question since it's not designed out?)
proposals/p1885.md
Outdated
The solution highlighted embeds the end condition in `Next()`, returns a null | ||
`Optional` at the last elements. This guarantees that `Iterator`s cannot be used | ||
improperly (code safety goal), and exposes a simple interface, but does have a | ||
performance cost (cost of the `Optional`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be challenging to bridge the gap between these two APIs with adapters, because there's a substantial "impedance mismatch" between them that's hard to hide.
- With this design, a Carbon iterator basically plays the same role as a begin/end iterator pair in C++, so there isn't a one-to-one relationship between the objects on either side of the adaptation. That will be a particular problem for automatic adaptation, but may also cause problems for manual adaptation.
- A C++ forward iterator (or better) must support calling
operator*
multiple times, returning the same result so long as the iterator is not advanced. To satisfy that requirement, an adaptor would have to cache the value returned fromNext
. - C++ iterators support mutation via
operator*
. That compounds the previous problem, because mutation has to apply to the original element, not a cached copy. - We could avoid those last two problems by saying that Carbon iterators are exposed in C++ as input iterators, but that excludes mutating use cases, and could dramatically degrade performance even for read-only cases, because many algorithms can't be implemented efficiently (or at all) in terms of the input iterator API.
Co-authored-by: josh11b <[email protected]> Co-authored-by: Jon Ross-Perkins <[email protected]>
Depends on other discussion, and too large for this proposal
Necessary for C++ interoperability with iterators.
ad010ef
to
c9bbad7
Compare
Consumable interface can be handled in a different proposal, and is not key to moving forward for the range-based for support.
proposals/p1885.md
Outdated
@@ -81,6 +85,12 @@ The original `for` proposal does not define a way for user-defined types to | |||
opt-in to support range-based for loops. An approach is proposed in this | |||
document. | |||
|
|||
This can be achieved using either a cursor or an iterator. A cursor benefits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably worth spelling out the difference between "cursor" and "iterator"; I'm not sure everyone is familiar with that terminology.
proposals/p1885.md
Outdated
// `container` is now empty | ||
``` | ||
|
||
### Limitations and future language improvements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably also mention language support for const
, and language support for user-defined l-value expressions, because those will determine how iterators should support mutability.
Co-authored-by: Geoff Romer <[email protected]>
proposals/p1885.md
Outdated
- `begin()` and `end()` methods or free functions | ||
- supporting pre-increment operator `++` and indirection operator `*` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
- Something to connect these two requirements to make it clear that the type returned by
begin()
andend()
is what has to support these operators. - I believe the iterator type must also support inequality comparison (at least). According to https://en.cppreference.com/w/cpp/language/range-for a C++ range-based
for
loop gets rewritten to includefor ( ; __begin != __end; ++__begin)
.
We may need to be careful about closely matching C++ so we can be as compatible as we can manage, as long as it doesn't impact the usability for ordinary users. In C++, my understanding is as long as the rewrite type checks, the ranged-based for loop is legal, like a template. This means we may need to support begin()
and end()
returning different types, as long as !=
is defined on those two types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely yes, clarified.
proposals/p1885.md
Outdated
auto begin() -> IterateIterator<Iterable<C, T>> { | ||
return IterateIterator<Iterable<C, T>>(*this, NewCursor()); | ||
} | ||
auto end() -> IterateIterator<Iterable<C, T>> { | ||
return IterateIterator<Iterable<C, T>>::Invalid(*this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is Iterable<...>
here? It looks like IterateIterator
expects to be parameterized by a container type. Is Iterable
some sort of adapter to define the CursorType
and ElementType
members? I don't see it defined anywhere else in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a typo, and should be Iterate<C, T>
, which is the C++ equivalent to Carbon's type. Fixed.
need to copy and cache the last value, to allow for `*it` to be called. This | ||
puts a requirement on `ElementType` to be copyable for this interoperability to | ||
be usable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will still work if ElementType
is move-only (but see the comment below), it just means that IterateIterator
will be move-only in those cases. That means it won't be a valid iterator according to the standard library, but it should still work with range-based for
loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just looked back at the code:
if (const auto v = container_->Next(*cursor_); v) {
value_ = *std::move(v);
}
Won't Next()
require ElementType
to be copyable nonetheless, as we are effectively returning a copy of an element, and not moving anything from the container? Even if we avoid an optional-like wrapper, or just move from it, the ElementType
still has to be copied at some point (likely withing Next()
).
Maybe I forgot something between now and then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In recent versions of C++, if Next
returns an ElementType
rather than a reference, the code you quote won't actually copy or move the return value; it will use v
itself as the storage for the returned temporary. Now, you're right that in this context, the body of Next
will very likely need to make a copy, but Next
is written in pure Carbon, so that problem isn't caused by caching in the interoperability layer, and in fact it's not caused by the interop layer at all -- the same problem exists in pure Carbon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that this is by design, and not caused by the interop. To expand, I think we could say "move-only" in the C++ domain, most that would be ignoring a very likely "copiable" requirement on Carbon side / Next
. So all-in-all, "copyable" seem more correct, and the upper bound of reqs, vs the lower bound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that this doc doesn't explicitly talk about copyability on the Carbon side, so by introducing it here as a new topic that's specifically tied to caching in the interop layer, it really makes it sound like native Carbon will not require copyability, but C++ interop will.
I'd suggest revising the "Large elements and Optional" section to explicitly talk about the fact that this design may force Next
to make a copy, and hence require ElementType
to be copyable. Then down here you can say that the caching requires ElementType
to be movable, and note that this won't matter in practice if Carbon requires it to be copyable anyway.
Co-authored-by: josh11b <[email protected]>
Co-authored-by: Geoff Romer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is about ready for leads review
for (v: my_dict.(Iterate.ElementType) in my_dict.reversed) {} | ||
``` | ||
|
||
### Large elements and Optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chandlerc I feel like we had a conversation about this and r-value representations. Could you recall where we ended up with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is about ready for leads review
From Barry Revzin's "take(5)" presentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this. It seems likely to me that we'll want to revisit some aspects of this eventually, such as ergonomic improvements for mutable iteration, but this seems like a good foundation to use as the basis for further exploration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also approving on behalf of leads. Thank you for working on this!
The goal of this proposal is to provide a way for user-defined types to support range-based iteration with
for
.The current proposed solution exposes 3 interfaces that can be implemented by user types to enable support for
ranged-for loops.
Co-authored-by: josh11b [email protected]