ranged-based `for` for user-defined types #1885

Pixep · 2022-08-03T03:36:50Z

The goal of this proposal is to provide a way for user-defined types to support range-based iteration with for.
The current proposed solution exposes 3 interfaces that can be implemented by user types to enable support for
ranged-for loops.

Co-authored-by: josh11b [email protected]

zygoloid

Thanks, this is looking promising!

I think we should have at least a sketch of a plan for mutability in this proposal. Given that our main way of providing mutable handles to objects is via pointers, I think we'd want some way to convert a container into a range of pointers to container elements. I think that can be done without any changes to this proposal, for example:

fn F() {
  var v: Array(i32, 5) = ...;
  for (p: i32* in v.MutableElements()) {
    // ...
  }
}

where MutableElements produces an Iterable where .ElementType == .Self.ElementType*. (Maybe we'd want a different interface for converting the container to a sequence of element pointers, but we don't need to decide on that here.)

proposals/p1885.md

docs/design/generics/overview.md

proposals/p1885.md

geoffromer · 2022-08-11T18:08:29Z

proposals/p1885.md

+The solution highlighted embeds the end condition in `Next()`, returns a null
+`Optional` at the last elements. This guarantees that `Iterator`s cannot be used
+improperly (code safety goal), and exposes a simple interface, but does have a
+performance cost (cost of the `Optional`).


It seems like it could also have an interoperability and migration cost, because Carbon ranged-for loops won't work with C++ containers, and vice-versa.

One option for addressing interoperability without binding ourselves to the C++ API would be to have the interop layer automatically inject adopters for language-conforming iterators. e.g., a Carbon Iterable could get adapted to look like a C++ iterator for C++ code, whereas a C++ iterator could have an external impl defined for Carbon code (and likely should need that regardless), letting each language keep its own idioms.

(it may make sense to mark interop as an open question since it's not designed out?)

I think it will be challenging to bridge the gap between these two APIs with adapters, because there's a substantial "impedance mismatch" between them that's hard to hide.

With this design, a Carbon iterator basically plays the same role as a begin/end iterator pair in C++, so there isn't a one-to-one relationship between the objects on either side of the adaptation. That will be a particular problem for automatic adaptation, but may also cause problems for manual adaptation.

A C++ forward iterator (or better) must support calling operator* multiple times, returning the same result so long as the iterator is not advanced. To satisfy that requirement, an adaptor would have to cache the value returned from Next.

C++ iterators support mutation via operator*. That compounds the previous problem, because mutation has to apply to the original element, not a cached copy.

We could avoid those last two problems by saying that Carbon iterators are exposed in C++ as input iterators, but that excludes mutating use cases, and could dramatically degrade performance even for read-only cases, because many algorithms can't be implemented efficiently (or at all) in terms of the input iterator API.

The last points definitely highlight the need for a Get function, in place of a simpler Next-only approach. I propose to move in that direction.

Regarding the iterator pairs, it is an option to expose an way to get an end iterator for comparison purposes, and interoperability. From a user standpoint, I would probably prefer to not be forced to rely on this end operator, which can be akward at times (needing the container in scope and such), but as a additional feature I believe it makes sense, and likely to have other uses (reverse and such).

Edit: Proposal amended with Get for Iterator.

jonmeow · 2022-08-15T19:55:24Z

proposals/p1885.md

+The solution highlighted embeds the end condition in `Next()`, returns a null
+`Optional` at the last elements. This guarantees that `Iterator`s cannot be used
+improperly (code safety goal), and exposes a simple interface, but does have a
+performance cost (cost of the `Optional`).


One option for addressing interoperability without binding ourselves to the C++ API would be to have the interop layer automatically inject adopters for language-conforming iterators. e.g., a Carbon Iterable could get adapted to look like a C++ iterator for C++ code, whereas a C++ iterator could have an external impl defined for Carbon code (and likely should need that regardless), letting each language keep its own idioms.

(it may make sense to mark interop as an open question since it's not designed out?)

proposals/p1885.md

geoffromer · 2022-08-15T21:10:13Z

proposals/p1885.md

+The solution highlighted embeds the end condition in `Next()`, returns a null
+`Optional` at the last elements. This guarantees that `Iterator`s cannot be used
+improperly (code safety goal), and exposes a simple interface, but does have a
+performance cost (cost of the `Optional`).


I think it will be challenging to bridge the gap between these two APIs with adapters, because there's a substantial "impedance mismatch" between them that's hard to hide.

With this design, a Carbon iterator basically plays the same role as a begin/end iterator pair in C++, so there isn't a one-to-one relationship between the objects on either side of the adaptation. That will be a particular problem for automatic adaptation, but may also cause problems for manual adaptation.

A C++ forward iterator (or better) must support calling operator* multiple times, returning the same result so long as the iterator is not advanced. To satisfy that requirement, an adaptor would have to cache the value returned from Next.

C++ iterators support mutation via operator*. That compounds the previous problem, because mutation has to apply to the original element, not a cached copy.

We could avoid those last two problems by saying that Carbon iterators are exposed in C++ as input iterators, but that excludes mutating use cases, and could dramatically degrade performance even for read-only cases, because many algorithms can't be implemented efficiently (or at all) in terms of the input iterator API.

proposals/p1885.md

Co-authored-by: josh11b <[email protected]> Co-authored-by: Jon Ross-Perkins <[email protected]>

Moved to carbon-language#2025

Depends on other discussion, and too large for this proposal

Necessary for C++ interoperability with iterators.

Consumable interface can be handled in a different proposal, and is not key to moving forward for the range-based for support.

geoffromer · 2022-10-06T18:25:43Z

proposals/p1885.md

@@ -81,6 +85,12 @@ The original `for` proposal does not define a way for user-defined types to
 opt-in to support range-based for loops. An approach is proposed in this
 document.

+This can be achieved using either a cursor or an iterator. A cursor benefits


It's probably worth spelling out the difference between "cursor" and "iterator"; I'm not sure everyone is familiar with that terminology.

proposals/p1885.md

geoffromer · 2022-10-06T18:50:08Z

proposals/p1885.md

+// `container` is now empty
+```
+
+### Limitations and future language improvements


We should probably also mention language support for const, and language support for user-defined l-value expressions, because those will determine how iterators should support mutability.

proposals/p1885.md

Co-authored-by: Geoff Romer <[email protected]>

proposals/p1885.md

josh11b · 2023-04-07T19:56:11Z

proposals/p1885.md

+-   `begin()` and `end()` methods or free functions
+-   supporting pre-increment operator `++` and indirection operator `*`


Two things:

Something to connect these two requirements to make it clear that the type returned by begin() and end() is what has to support these operators.

I believe the iterator type must also support inequality comparison (at least). According to https://en.cppreference.com/w/cpp/language/range-for a C++ range-based for loop gets rewritten to include for ( ; __begin != __end; ++__begin).

We may need to be careful about closely matching C++ so we can be as compatible as we can manage, as long as it doesn't impact the usability for ordinary users. In C++, my understanding is as long as the rewrite type checks, the ranged-based for loop is legal, like a template. This means we may need to support begin() and end() returning different types, as long as != is defined on those two types.

Absolutely yes, clarified.

proposals/p1885.md

josh11b · 2023-04-07T20:17:01Z

proposals/p1885.md

+  auto begin() -> IterateIterator<Iterable<C, T>> {
+    return IterateIterator<Iterable<C, T>>(*this, NewCursor());
+  }
+  auto end() -> IterateIterator<Iterable<C, T>> {
+    return IterateIterator<Iterable<C, T>>::Invalid(*this);


What is Iterable<...> here? It looks like IterateIterator expects to be parameterized by a container type. Is Iterable some sort of adapter to define the CursorType and ElementType members? I don't see it defined anywhere else in this file.

That's a typo, and should be Iterate<C, T>, which is the C++ equivalent to Carbon's type. Fixed.

geoffromer · 2023-04-10T18:30:56Z

proposals/p1885.md

+need to copy and cache the last value, to allow for `*it` to be called. This
+puts a requirement on `ElementType` to be copyable for this interoperability to
+be usable.


I think this will still work if ElementType is move-only (but see the comment below), it just means that IterateIterator will be move-only in those cases. That means it won't be a valid iterator according to the standard library, but it should still work with range-based for loops.

I just looked back at the code:

if (const auto v = container_->Next(*cursor_); v) { value_ = *std::move(v); }

Won't Next() require ElementType to be copyable nonetheless, as we are effectively returning a copy of an element, and not moving anything from the container? Even if we avoid an optional-like wrapper, or just move from it, the ElementType still has to be copied at some point (likely withing Next()).

Maybe I forgot something between now and then.

In recent versions of C++, if Next returns an ElementType rather than a reference, the code you quote won't actually copy or move the return value; it will use v itself as the storage for the returned temporary. Now, you're right that in this context, the body of Next will very likely need to make a copy, but Next is written in pure Carbon, so that problem isn't caused by caching in the interoperability layer, and in fact it's not caused by the interop layer at all -- the same problem exists in pure Carbon.

Agree that this is by design, and not caused by the interop. To expand, I think we could say "move-only" in the C++ domain, most that would be ignoring a very likely "copiable" requirement on Carbon side / Next. So all-in-all, "copyable" seem more correct, and the upper bound of reqs, vs the lower bound.

The problem is that this doc doesn't explicitly talk about copyability on the Carbon side, so by introducing it here as a new topic that's specifically tied to caching in the interop layer, it really makes it sound like native Carbon will not require copyability, but C++ interop will.

I'd suggest revising the "Large elements and Optional" section to explicitly talk about the fact that this design may force Next to make a copy, and hence require ElementType to be copyable. Then down here you can say that the caching requires ElementType to be movable, and note that this won't matter in practice if Carbon requires it to be copyable anyway.

proposals/p1885.md

Co-authored-by: josh11b <[email protected]>

Co-authored-by: Geoff Romer <[email protected]>

proposals/p1885.md

josh11b

I think this is about ready for leads review

josh11b · 2023-05-18T18:24:29Z

proposals/p1885.md

+for (v: my_dict.(Iterate.ElementType) in my_dict.reversed) {}
+```
+
+### Large elements and Optional


@chandlerc I feel like we had a conversation about this and r-value representations. Could you recall where we ended up with this?

proposals/p1885.md

josh11b

I think this is about ready for leads review

From Barry Revzin's "take(5)" presentation

zygoloid

I'm happy with this. It seems likely to me that we'll want to revisit some aspects of this eventually, such as ergonomic improvements for mutable iteration, but this seems like a good foundation to use as the basis for further exploration.

zygoloid

Also approving on behalf of leads. Thank you for working on this!

Incorporates proposals: #1885, #2138, #2188, #2200, #2360, #2760, #2964, and #3162. --------- Co-authored-by: Richard Smith <[email protected]>

Filling out template with PR 1885

c7b8ccf

Pixep force-pushed the proposal-for-statement branch from 3bc148b to c7b8ccf Compare August 3, 2022 03:37

Pixep changed the title ~~for statement~~ ranged-based for for user-defined types Aug 3, 2022

Add context, other languages, and basic interfaces proposal

a355466

Pixep added the proposal A proposal label Aug 6, 2022

Pixep requested a review from a team August 6, 2022 04:28

Cleanup proposal structure, add context and proposal details

ef76d36

Pixep added the proposal rfc Proposal with request-for-comment sent out label Aug 6, 2022

Pixep self-assigned this Aug 6, 2022

Pixep marked this pull request as ready for review August 6, 2022 23:31

zygoloid reviewed Aug 9, 2022

View reviewed changes

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

geoffromer reviewed Aug 11, 2022

View reviewed changes

jonmeow reviewed Aug 15, 2022

View reviewed changes

geoffromer mentioned this pull request Aug 15, 2022

Slice generator #1927

Closed

geoffromer reviewed Aug 15, 2022

View reviewed changes

zygoloid mentioned this pull request Aug 25, 2022

Implement for loops #1592

Closed

josh11b reviewed Sep 1, 2022

View reviewed changes

zygoloid reviewed Sep 14, 2022

View reviewed changes

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

Pixep and others added 8 commits September 24, 2022 11:52

Fix various typos

83b208c

Co-authored-by: josh11b <[email protected]> Co-authored-by: Jon Ross-Perkins <[email protected]>

docs: revert typo fix to generics, separate PR

217ee4e

Moved to carbon-language#2025

Drop 'Indexable' interface proposal

02ee7f8

Depends on other discussion, and too large for this proposal

Add abstract section

0471f8d

Fix invalid code flags, () for interfaces

cf71002

Add a basic cursor vs iterator paragraph

91d9709

Add Get method to Iterator

c90d7ba

Necessary for C++ interoperability with iterators.

Fix typos

c9bbad7

Pixep force-pushed the proposal-for-statement branch from ad010ef to c9bbad7 Compare October 2, 2022 00:33

Pixep added 2 commits October 2, 2022 17:01

Drop 'TODO' section

0152f41

Park Consumable interface

00ef294

Consumable interface can be handled in a different proposal, and is not key to moving forward for the range-based for support.

geoffromer reviewed Oct 6, 2022

View reviewed changes

Pixep added 5 commits March 10, 2023 16:39

Clarify sections for C++ interop

f90fbfd

Merge branch 'carbon-language:trunk' into proposal-for-statement

31f9361

Add mitigation for container with large values

f5eea57

Add Iterate to C++ interop sample implementation

59637d2

add copy constructor for cursor iterator

63f927c

geoffromer reviewed Apr 3, 2023

View reviewed changes

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

proposals/p1885.md Outdated Show resolved Hide resolved

Pixep and others added 3 commits April 3, 2023 21:06

Update proposals/p1885.md

cc86c60

Co-authored-by: Geoff Romer <[email protected]>

Improve C++ iterator sample

80a74cd

Precise the impact of caching last value

4fad5ef

Pixep requested a review from geoffromer April 6, 2023 23:53

josh11b requested changes Apr 7, 2023

View reviewed changes

geoffromer reviewed Apr 10, 2023

View reviewed changes

Pixep and others added 2 commits April 23, 2023 15:18

Add suggestions from @josh11b

ff102e4

Co-authored-by: josh11b <[email protected]>

Add suggestion from @geoffromer

caa8d86

Co-authored-by: Geoff Romer <[email protected]>

geoffromer reviewed May 17, 2023

View reviewed changes

proposals/p1885.md Show resolved Hide resolved

Pixep added 2 commits May 17, 2023 21:17

Fix typo on Iterate class name in c++ snippet

fe33c52

Clarified C++ requirements, and implementation example

7b81a95

josh11b approved these changes May 18, 2023

View reviewed changes

Pixep added 2 commits May 22, 2023 14:03

Link range-based for statement from C++ standard.

0e545d8

Mention ranges consuming their input

8e1af85

From Barry Revzin's "take(5)" presentation

zygoloid approved these changes Jul 1, 2023

View reviewed changes

zygoloid enabled auto-merge July 1, 2023 22:19

Force checks to re-run.

4277d37

zygoloid added this pull request to the merge queue Aug 25, 2023

Merged via the queue into carbon-language:trunk with commit 1d4e9ee Aug 25, 2023
6 checks passed

josh11b mentioned this pull request Aug 31, 2023

Update design overview to reflect submitted proposals #3180

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ranged-based `for` for user-defined types #1885

ranged-based `for` for user-defined types #1885

Pixep commented Aug 3, 2022 •

edited

Loading

zygoloid left a comment

geoffromer Aug 11, 2022

jonmeow Aug 15, 2022

geoffromer Aug 15, 2022

Pixep Oct 1, 2022 •

edited

Loading

jonmeow Aug 15, 2022

geoffromer Aug 15, 2022

geoffromer Oct 6, 2022

geoffromer Oct 6, 2022

josh11b Apr 7, 2023

Pixep May 18, 2023

josh11b Apr 7, 2023

Pixep May 18, 2023

geoffromer Apr 10, 2023

Pixep May 18, 2023

geoffromer May 22, 2023

Pixep May 22, 2023

geoffromer May 23, 2023

josh11b left a comment

josh11b May 18, 2023

josh11b left a comment

zygoloid left a comment

zygoloid left a comment

		- `begin()` and `end()` methods or free functions
		- supporting pre-increment operator `++` and indirection operator `*`

ranged-based for for user-defined types #1885

ranged-based for for user-defined types #1885

Conversation

Pixep commented Aug 3, 2022 • edited Loading

zygoloid left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Pixep Oct 1, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

josh11b left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

josh11b left a comment

Choose a reason for hiding this comment

zygoloid left a comment

Choose a reason for hiding this comment

zygoloid left a comment

Choose a reason for hiding this comment

ranged-based `for` for user-defined types #1885

ranged-based `for` for user-defined types #1885

Pixep commented Aug 3, 2022 •

edited

Loading

Pixep Oct 1, 2022 •

edited

Loading