Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Deprecate Iterator::size_hint and ExactSizeIterator for better named alternatives. #1034

Closed
wants to merge 4 commits into from

Conversation

CloudiDust
Copy link
Contributor

This change is originally brought up in this discuss thread.

Rendered View

@CloudiDust
Copy link
Contributor Author

There was some issue with #1032 that new commits were not showing up there. Thus here is a new PR. Sorry for the confusion.

This PR promotes the "rename size_hint during the 1.0 beta cycle" alternative to be the main candidate.

cc @gankro.

@nikomatsakis
Copy link
Contributor

👍 Seems more consistent to me.

@Gankra
Copy link
Contributor

Gankra commented Apr 6, 2015

Still 👍

@aturon
Copy link
Member

aturon commented Apr 6, 2015

I'm on board with this; would've been nice to land before beta, but it slipped through the cracks.

We can carry out a deprecation + removal process before 1.0, to make this not-too-painful.


# Detailed design

Rename the `size_hint` method of `std::iter::Iterator` to `len_hint` during the Rust 1.0 beta cycle.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate a little more on the deprecation plan and migration plans here as well? I would personally vote for the addition of a default len_hint method which delegates to size_hint while adding a #[deprecated] tag on the size_hint method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton, one way to do the deprecation and migration is to publish a new beta release (beta2) that is not compatible with beta, but compatible with final.

Note that there were some changes that were "technically breaking" but didn't make it into the beta. Also there is #1030. And we may find other problems in beta that requires breaking change. So a new beta may be a good idea anyway.

I think "three weeks later" is a good time to release a new beta. Rust is still pre-1.0-final, so it may be fine to do some special casing.

(Currently I cannot update the RFC, I'll do so when I can.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton I'm hoping you guys don't release a 1.0 that has any deprecations.

@CloudiDust I think it's okay to break stuff in this Beta cycle without releasing Beta2. These breaking changes are very minor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tshepang, without beta 2, lib authors affected by late breaking changes would have to maintain at least two branchs, one for beta (because many people would be using beta) and one for the yet-to-be-released final. But if we release a new beta right after applying breaking changes, the lib authors would not have to branch their libs as people will quickly switch to beta 2.

Or we can apply the changes right before the final release, but that may be a bit surprising and feel rushed.

I think whether beta 2 is necessary depends on how many "quite visible" breaking changes we want to apply. (I expect most late breaking changes to be of the "technically breaking" kind, but size_hint -> len_hint is "quite visible", though minor and easy to deal with.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CloudiDust you can have people update their code either 3 or 6 weeks after Beta. How is 3 weeks better? It's like you are assuming people should have 1.0-compatible libraries as soon as 1.0 is out. And even then, the breaking changes until 1.0 should still be minor, so the updates should be minimal.

Note that even if 3 weeks was chosen, there could still be breaking changes after Beta 2, and that's still preferable to getting stuck with something that's not ideal for all of 1.x.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tshepang, "three weeks" is better because more people are likely to do feedback about the breaking changes if there is another beta. Though, if the only “quite visible" breaking change is this len_hint one, then a new beta may not be necessary. (But some libs would have to be branched if they want to support both beta and nighty, once such a breaking change lands.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CloudiDust I think there is already some precedent. e.g., PhantomFn has been deprecated. So you'll get warnings on nightly, but things still work. I guess it will be removed For Real once 1.0 hits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BurntSushi, PhantomFn was what I had in mind when I said:

Note that there were some changes that were "technically breaking" but didn't make it into the beta.

And I believe yes, it will be removed in 1.0 final.

Renaming size_hint is a bit more involving than removing PhantomFn, though. As the latter only involves a removal, while the former can be seen as a removal plus an addition. Also, I expect size_hint to be more frequently used than PhantomFn.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I still prefer the renaming (after all it is still a simple change). Failing that, I'd like to go with "deprecate but retain size_hint".

@ben0x539
Copy link

ben0x539 commented Apr 6, 2015

rfc: rename len() to size()

:(

@CloudiDust
Copy link
Contributor Author

@ben0x539, I could not find the relevent discussions, but vaguely remember that one of the reasons for preferring len is that size can mean the size of a value, not the number of elements in a container. In a "low level" language like Rust, sizes of values are important, and the semantics of size_of and len is not the same.

C++ reuses the name size for different things, and Rust can do some improvements here. Of course whether len is an improvement is a bit subjective, considering that non-linear collections don't technically have lengths. But then again, there is Python. ;)

@Gankra
Copy link
Contributor

Gankra commented Apr 7, 2015

Yeah size vs len was debated plenty several months ago. It's not an important bikeshed. All that matters is that we're consistent.

votes for num

@carllerche
Copy link
Member

It seems that I am in the minority here but...

The beta and stable releases, on the other hand, will only include features and APIs deemed stable, which represents a commitment to avoid breaking code that uses those features or APIs.

Beta is out, please don't make any more changes unless there have been critical issues discovered. I believe that it was stated that the bar to make a breaking change once beta lands was high. A naming bike shed is not that.

@steveklabnik
Copy link
Member

I agree with @carllerche . We're going to have to start being okay with Rust having little warts like this.

@arthurprs
Copy link

👍
Rust will eventually have to live with some regret... But this is simple enough, let's not let it be one of them.

@wycats
Copy link
Contributor

wycats commented Apr 8, 2015

I agree with @carllerche as well. We should stick to our guns on stability unless we have an extremely good reason.

@CloudiDust
Copy link
Contributor Author

@carllerche @steveklabnik @wycats, would deprecating size_hint (the first alternative) be acceptable? (Note: per #1030, deprecations may happen "soon", anyway.)

@tshepang
Copy link
Member

tshepang commented Apr 8, 2015

As @aturon mentioned, this was an oversight. It should be fixed, together with whatever else gets noticed between now and 1.0. It's after 1.0 that we dare not break anything "unless we have an extremely good reason" (as @wycats says). Let''s live with "warts like this" (as @steveklabnik says) if only they are noticed after 1.0.

@bjadamson
Copy link

I think the extra API bloat to keep around size_hint() would be confusing from my previous experience using libraries such as QT and PEG (Portable Embedded Graphics). In both of these libraries objects representing rectangles on the screen all have a size_hint() method which I always end up needing to look up to see what it does. In both of these libraries the size_hint() method refers to how many pixels the rectangles on the screen should be as a suggestion.

The semantics are clearly different than here, I think using size_hint() for a collection is confusing given my experience with these C++ libraries. +1 to renaming size_hint() to len_hint().

@pcwalton
Copy link
Contributor

pcwalton commented Apr 8, 2015

I agree with @carllerche — this is really minor. I'm OK with living with this small issue forever.

@CloudiDust
Copy link
Contributor Author

(This is a slightly modified version of my reply to @pcwalton in /r/rust.)

"This is really minor" can go both ways:

  1. This is really minor, so no need to fix this.
  2. This is really minor, so why not fix this?

So we may agree to disagree. :)

Also, according to semver, 1.0.0-beta -> 1.0.0 (using the correct semver format here) is a special case where it is okay to have a small amount of breaking changes.

From the semver specification:

5. Version 1.0.0 defines the public API. ...
9. ... Pre-release versions have a lower precedence than the associated normal version. A pre-release version indicates that the version is unstable and might not satisfy the intended compatibility requirements as denoted by its associated normal version. ...

@arthurprs
Copy link

I'm in for the change. Sacrifices will have to be made either with breaking changes or regret...
I vote to break things, at least until 1.0 is out.

I posted the same for the auto deriving traits.

@nstoddard
Copy link

I support this. Sure, it's minor, and there's eventually going to be a few warts like this one, but I don't see any reason not to fix this one while we still can. Let's try not to ship a version 1.0 which already has warts.

@carllerche
Copy link
Member

The last release before 1.0 stable is a dry run for stability. It should be used to let all the changes (including last minute changes) that happened in the previous cycle to bake so that any emergency issues that are discovered can be fixed before 1.0.

If aesthetic / bike shedding level changes wish to be made, then there should be another pre 1.0 release that happens after the beta, perhaps a release candidate, to allow changes a chance to sit for a while and for users to actually use them.

@wycats
Copy link
Contributor

wycats commented Apr 8, 2015

@carllerche indeed, and not just 1.0 beta but every beta after that too. Beta releases only include features that have gone through a fairly rigorous process and have been enabled because they're ready to go. The beta release is one last chance to find bugs before committing to semver, and changes have a high bar to clear.

In general, if a problem is found in a feature that justifies a change, it means it wasn't actually stable enough to justify unflagging it, and it should be moved back to nightlies so the change can be made through the full process. If people don't think it's worth doing that, the reason isn't big enough to justify a late-breaking change.

@nikomatsakis
Copy link
Contributor

Reading the arguments from others is pushing me towards the "meh, just leave it" camp. As @alexcrichton put it last night, we have a certain budget for "optional" breaking changes. I don't really think I would want to spend it here!

@jpernst
Copy link

jpernst commented Apr 8, 2015

If anything it should probably be count_hint since an iterator doesn't really have a len per se. But I tend to agree that this ship has sailed, and speaking as a user I'd just like to move forward and start building things in rust.

To take a different approach, how about a retroactive justification for leaving it as is? The purpose of this method is to estimate the amount of data that can be retrieved from it, presumably from some underlying collection. This collection could be anything though, and might not necessarily have the linear notion of a len. It could be tree nodes, a file, a procedurally generated sequence, or anything really. I think '`size' is an appropriately general term to reflect the abstract nature of the data source. "size" also conjures notions of reserving space, which is likely the primary use for this method.

For my part, this handwavy justification is enough to satisfy me that it's fine left as is.

@CloudiDust
Copy link
Contributor Author

@wycats,

In general, if a problem is found in a feature that justifies a change, it means it wasn't actually stable enough to justify unflagging it ...

In general, yes. However, this particular change is basically because an oversight. I believe the content of the change is not controversial, only the timing is.

@carllerche, I won't mind another beta actually. (In general I believe pre-1.0.0 is where a bit special casing can apply.)

@nikomatsakis, if there is a "budget", then a small change like this won't occupy much of it. (Also, we may wait and see, if not "many" other optional breaking changes happen during the cycle, then this change can utilize the budget.)

@jpernst, len is uniformly used across Rust's standard collections, even the non-linear ones, to mean "numbers of elements". While size means "sizes of values". And size_hint is not consistent with the convention. The doc even directly says that size_hint returns a length. If a function returning a length has no len in its name, then I think it would be hard to handwave len on BTreeMaps etc.

@CloudiDust CloudiDust closed this Apr 9, 2015
@CloudiDust CloudiDust reopened this Apr 9, 2015
@CloudiDust
Copy link
Contributor Author

Oops it was easy to accidentally hit the close button on mobile.

The flip side of the "budget" argument is, as this change itself is small, some may believe it is not worth landing if this is the only breaking change, but I'd argue that in that case, this change can totally "piggyback" on other breaking changes. (We are having breaking changes anyway, why not add this small one?)

@aturon
Copy link
Member

aturon commented Apr 24, 2015

Sorry for the delay! I will make sure this gets attention for a decision on the core team ASAP.

@CloudiDust
Copy link
Contributor Author

@aturon, excuse me but will this be accepted?

@arthurprs
Copy link

I'm hoping it will.

@alexcrichton
Copy link
Member

@CloudiDust ah thanks for the ping, it appears this has fallen by the wayside! @aturon and I discussed this awhile back and our thoughts on this basically boiled down to how the standard library deals with deprecated code. We would like to take a path of aggressively deprecating existing APIs if necessary (for cases like this), but taking this route has a number of downsides which haven't been fully thought through just yet. In short, I believe everyone's in favor of this RFC, but we do not have precedent yet for dealing with large-scale deprecations such as this. As a result there's been some hesitation while the ramifications of such a decision are explored.

With the recent change to Rust's governance, as well, this RFC will soon come under the purview of the libraries subteam, and we're still in the process of bootstrapping the subteams and getting then under way. This should happen within the coming weeks, however, and this will surely be one of the first RFCs under discussion by the subteam!

Sorry again for the delay, but thanks for your patience.

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label May 18, 2015
@Gankra
Copy link
Contributor

Gankra commented May 27, 2015

Note: this PR is in its Final Comment Period (as of yesterday).

@alexcrichton alexcrichton added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label May 27, 2015
@liigo
Copy link
Contributor

liigo commented May 29, 2015

Since we've released 1.0, I do not think it's sound to change this any more. size_hint is not the end of world. I prefer size_hint to len_hint, because the former is human-readable, and easy to pronounce.

@tshepang
Copy link
Member

I'm still in favor... consistency is great, and nothing should be broken by the change.

@alexcrichton
Copy link
Member

This PR hasn't gotten a lot of comments in the past week, and I personally still had a number of concerns about the policy on deprecating APIs in the standard library. I would like to propose that we renew the FCP for this RFC (for one more week) to ensure that we're all comfortable with the precedent that this RFC is setting (deprecating APIs in the standard library).

I've written up a thread on internals about this topic, and I'd encourage discussion about the policy to deprecating APIs in the standard library to happen there instead of here (as this is focused on size_hint).

cc @rust-lang/libs

@bluss
Copy link
Member

bluss commented Jun 1, 2015

How does this look on the implementor side?

How do we enforce that len_hint and size_hint are in sync?

I fear some iterators implement size_hint and delegate to size_hint and some only len_hint and delegate to len_hint. It will be a mix. I understand size_hints aren't very well “appreciated” right now and mostly don't get much attention when you implement iterators — they are optional after all, but they are kind of important for allocations, that the reason we have them.

@alexcrichton
Copy link
Member

Oh that's a good point, I believe the implementation would be for size_hint to delegate to the defaulted len_hint function, but this means that iterators only implementing size_hint will have the wrong return value when they have len_hint called on them. This will not be the case, however, for any standard iterator (which is where I believe most perf-sensitive iteration happens).

On the other hand it means that all crates which want to remain compatible with 1.0 (and hence can't define len_hint) will have sub-par performance going forward (or have to give up 1.0 compat). This does seems like a legitimate drawback to mention (cc @CloudiDust)

@carllerche
Copy link
Member

Given the point raised by @bluss (and elaborated on by @alexcrichton), the fact that 1.0 is out, and that this is a minor "esthetic" change, it just doesn't seem worth it...

@emk
Copy link

emk commented Jun 2, 2015

Oh that's a good point, I believe the implementation would be for size_hint to delegate to the defaulted len_hint function, but this means that iterators only implementing size_hint will have the wrong return value when they have len_hint called on them.

Hmm. So if I write an iterator under Rust 1.0, and somebody calls that iterator under Rust 1.2 (using non-deprecated methods), then len_hint will return an incorrect value? Personally, that seems like a much more unfortunate consequence than an inconsistent name. The damage is slightly ameliorated by the fact that an iterator with a broken len_hint is merely potentially very slow, but at least it still returns correct output. Still, it feels like we'd be trading a small wart for a medium wart here.

@Gankra
Copy link
Contributor

Gankra commented Jun 2, 2015

I am currently working on an RFC to make iterators more optimizable in unsafe code that would introduce "trusted_len" and "TrustedLen". I could call them trusted_size, but I'd rather not let this mistake bleed out.

@arthurprs
Copy link

Indeed that's a valid point, but it'll still give correct output.

That's 1.0 (the first stable release ever...) only though, it don't see why it should be a big factor on this decision.

@daboross
Copy link

daboross commented Jun 2, 2015

@alexcrichton Just a question, if this was added is there a reason to not make len_hint delegate to size_hint by default? That way any crate wanting to stay backwards compatible can define just size_hint and keep performance on all versions.

Also, any crate which actually provides len_hint will be locked in to 1.2+ rust versions, so users of the function have no reason to use size_hint as the dependency iterator will already be incompatible with earlier rust versions.

To be clear, I would probably prefer that this isn't accepted, I'm just pointing out a possible solution to that problem.

@bluss
Copy link
Member

bluss commented Jun 2, 2015

I think you'd need a rustc method renaming feature to do this properly in fact. We need the method names len_hint and size_hint to refer to the same method.

@alexcrichton alexcrichton mentioned this pull request Jun 2, 2015
@alexcrichton
Copy link
Member

@daboross

Just a question, if this was added is there a reason to not make len_hint delegate to size_hint by default?

I think the problem unfortunately exists both ways. I outlined a problem where size_hint delegates to len_hint where defining an iterator is now difficult to do at 1.0 in the "fastest way possible" as new code using len_hint will not be using your implemented size_hint function, and you can't implement len_hint if you're locked to 1.0.

Conversely, if len_hint delegated to size_hint it would be difficult to consume an iterator. At 1.0 I could only call the size_hint method on a given iterator, but all new iterators would define len_hint instead of size_hint, so I wouldn't be getting the latest-and-greatest size hints from iterators I take from an API.

Deciding the direction of delegation affects what part of an API suffers, but I think regardless of what we choose it's impossible to have everyone on all versions of Rust using the same size hints.

@daboross
Copy link

daboross commented Jun 2, 2015

@alexcrichton One point on that - if you are using any iterators which define len_hint instead of size_hint, aren't you already constrained to Rust 1.2+? I mean it seems like you would have no reason to use size_hint if you've already broken compatibility with 1.0/1.1.

@alexcrichton
Copy link
Member

If I'm a Rust 1.0 crate that wants to consume an iterator, I can only call size_hint, not the len_hint function. If len_hint delegates to size_hint, then new Rust 1.1+ iterators which only define len_hint (and hence size_hint returns the normal default) will be "slower" when used with my crate as my crate wouldn't be picking up the most accurate size hints.

Sure, though, if a crate actually calls len_hint then this is moot.

@huonw
Copy link
Member

huonw commented Jun 3, 2015

I think @bluss's suggestion of adding method renaming/aliasing ( #1034 (comment)) seems like a good way to solve this (and future cases). It's something we support with everything(?) except trait methods, since we can use pub use/have wrappers to rename other things. However, I'm not sure adding a new language feature is worth solving just this case, and it's not clearly to me how many others there will be.

@Gankra
Copy link
Contributor

Gankra commented Jun 3, 2015

Actually yeah having "function re-exports" would be a pretty sweet thing to have for a few problems. I imagine we'll stumble into more "weird name" problems and having this as just a "ok. problem solved" would be pretty sweet. No forward/back-compat struggles, and everyone gets the best name.

(someone should make an RFC for that...)

@BurntSushi
Copy link
Member

I tend to really dislike naming inconsistencies, so I was leaning towards being in favor of this change (despite the downsides of deprecation). However, the very subtle issues surrounding the conflict between size_hint and len_hint when both are present push me towards the other direction: the naming inconsistency is unfortunate, but the effects of deprecation are worse. I'd rather live with an unfortunate naming inconsistency than subtle effects on performance.

If there were some hypothetical mechanism to do renaming without incurring that cost, then I think I'd be in favor of this change.

@Gankra
Copy link
Contributor

Gankra commented Jun 3, 2015

Yes, at this point I think this should be closed/post-poned in favour of a more general renaming scheme like #[deprecated(renamed_to="new_name")] or fn foo = bar.

@bluss
Copy link
Member

bluss commented Jun 3, 2015

@gankro Yes I agree, at that point it would be relatively simple to do.

I'm envisioning a renaming feature that makes sure users can't define both the old and the new name at the same time.

To keep it clean & nice, let the old name only be visible in the attribute?

#[renamed(from="size_hint", since="1.2.0")]
fn len_hint(&self) -> (usize, Option<usize>) {
    (0, None)
}

Using #[renamed] would imply the deprecation of the old name.

@alexcrichton
Copy link
Member

The consensus of the libs subteam is now that this RFC should not be merged at this time. @bluss's observation that this will lead to a performance degredation for crates tied to 1.0 is enough motivation to not push on this RFC at this time. Due to no mechanism being able to deprecate these methods in a truly zero-cost manner, I'm going to close this RFC at this time.

Thanks regardless for the RFC @CloudiDust! We definitely think it's possible to have a nice way to rename functions in the future (e.g. #1150) which should allow changes like this to be made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.