Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider removing the _iter suffixes on specialized Iterator constructors #9440

Closed
Kimundi opened this issue Sep 23, 2013 · 10 comments
Closed

Comments

@Kimundi
Copy link
Member

Kimundi commented Sep 23, 2013

The names tend to become increasingly long, and if there is no alternative way it seems silly to add to the name. The _iter suffix can still be used for cases where an operation is also implemented in a non-iterator way, or where no better name can be found.

For example, the iterator constructors for StrSlice would improve greatly from this:

byte_iter()             -> bytes()
byte_rev_iter()         -> bytes_rev()
char_offset_iter()      -> chars_offset()
char_offset_rev_iter()  -> chars_offset_rev()
split_iter()            -> split()
rsplit_iter()           -> rsplit()
splitn_iter()           -> split_n()
rsplitn_iter()          -> rsplit_n()
split_terminator_iter() -> split_terminator()
split_str_iter()        -> split_str()
matches_index_iter()    -> match_indices()
line_iter()             -> lines()
any_line_iter()         -> lines_any()
word_iter()             -> words()
// Can't think of something shorter for these two:
nfd_iter()              -> nfd_iter()
nfkd_iter()             -> nfkd_iter()
// These would be Iterable impls, and thus keep the name:
iter()                  -> iter()
rev_iter()              -> rev_iter()
@alexcrichton
Copy link
Member

I am totally in favor of this.

@alexcrichton
Copy link
Member

I suppose I should elaborate a bit more. I personally don't like the _iter suffix everywhere when using iterators (especially in struct names), but I do agree that there are locations in which it is appropriate. I think that the four methods you've listed as retaining iter in them are appropriate, but all of the others I believe are just as natural without _iter than with.

It could be difficult in that words doesn't necessarily indicate that you're getting an iterator instead of an array, but I think that this isn't too bad.

If we do push forward with this, I think that other modules in libstd/extra should be checked to ensure that they adhere to this same convention, which is basically: "Does it make sense without _iter? Then you can probably drop it."

@Kimundi
Copy link
Member Author

Kimundi commented Sep 23, 2013

@alexcrichton Right, the idiomatic way should be to offer an lazy evaluating Iterator for all operations where you're computing an stream of values, so once we standarized on that it would become unidiomatic for something like words() to not return something like a WordIterator.

@huonw
Copy link
Member

huonw commented Sep 24, 2013

👍 I'd almost say that str.iter should be str.chars since we don't have Iterable yet.

@kud1ing
Copy link

kud1ing commented Sep 24, 2013

Could we always return an iterator where we return a list today?
Would words().collect() be more expensive than words() which returns a list? Or would that be optimized away?

@thestinger
Copy link
Contributor

It will rarely be less efficient, because the iterator's next function will be inlined and the struct removed by scalar replacement of aggregates (SROA).

The only case I can think of where there's an efficiency issue is when you're replacing recursive traversals with an iterator managing a stack. The reason simply being that the iterator is usually going to have to allocate the stack, but if it can be done up-front then it will be cheaper than recursion.

@kud1ing
Copy link

kud1ing commented Sep 24, 2013

@ thestinger: Thanks. Then i suggest porting everything to iterators. A _iter suffix would then be superfluous, because we would not need to distinguish between functions returning iterators or lists.

On second thought, str::chars().collect() will certainly be less efficient than just charing the internal data structure.

@Kimundi
Copy link
Member Author

Kimundi commented Sep 24, 2013

@huonw: That is my opinion too - iterating over a string should not prefer any of the three possibilities of bytes, codepoints, and grapheme clusters. I elaborated a bit on that here: #7043

EDIT: Which I just realized yourself opened, so you already know :P

@bluss
Copy link
Member

bluss commented Sep 25, 2013

I think char_offset_iter() could be → char_indices() (it's really chars and their byte indices)

Regarding chars().collect() it has an issue with the size_hint which I'm unsure how to solve. The problem is that a char may require 1-4 bytes of allocation in the str. CharIterator uses the lower bound for its size hint and guesses that every remaining byte is part of a 4-byte encoded char. FromIterator for str however, I think it just allocates the lower bound of bytes, so it assumes 1 byte per char! This is unfortunately a pessimisation for the ascii case. (Edit: The worst part is that I've written both of the methods).

bors added a commit that referenced this issue Nov 26, 2013
This PR removes almost all `_iter` suffixes in various APIs of the codebase that return Iterators, as discussed in #9440.

As a summarize for the intend behind this PR:

- Iterators are the recommended way to provide a potentially lazy list of values, no need to name them painfully verbose. If anything, functions that return a specific container type should have more verbose names.
- We have a static type system, so no need to encode the return value of a constructor function into its name.

Following is a possibly incomplete list of all renamings I performed in the codebase. For a few of them I'm a bit unsure whether the new name still properly expresses their functionality, so feedback would be welcome:

~~~
&str : word_iter()             -> words()
       line_iter()             -> lines()
       any_line_iter()         -> lines_any()
       iter()                  -> chars()
       char_offset_iter()      -> char_indices()
       byte_iter()             -> bytes()
       split_iter()            -> split()
       splitn_iter()           -> splitn()
       split_str_iter()        -> split_str()
       split_terminator_iter() -> split_terminator()
       matches_index_iter()    -> match_indices()
       nfd_iter()              -> nfd_chars()
       nfkd_iter()             -> nfkd_chars()
      
&[T] : split_iter()        -> split()
       splitn_iter()       -> splitn()
       window_iter()       -> windows()
       chunk_iter()        -> chunks()
       permutations_iter() -> permutations()
      
extra:bitv::Bitv :  rev_liter()    -> rev_iter()
                    common_iter()  -> commons()
                    outlier_iter() -> outliers()

extra::treemap::{...} : lower_bound_iter() -> lower_bound()
                        upper_bound_iter() -> upper_bound()
                       
std::trie::{...} : bound_iter()       -> bound()
                   lower_bound_iter() -> lower_bound()
                   upper_bound_iter() -> upper_bound()

rustpkg::package_id::{...} : prefixes_iter() -> prefixes()

std::hashmap::{...} : difference_iter()           -> difference()
                      symmetric_difference_iter() -> symmetric_difference()
                      intersection_iter()         -> intersection()
                      union_iter()                -> union()
                     
std::path::{posix, windows} : component_iter()     -> components()
                              str_component_iter() -> str_components()

... not showing all identical renamings for reverse versions
~~~

---

I'm also planning a few more changes, like removing all unnecessary `_rev` constructors (#9391), or reducing the `split` variants on `&str` to a more versatile and concise system.
@Kimundi
Copy link
Member Author

Kimundi commented Nov 30, 2013

This got solved by #10622

@Kimundi Kimundi closed this as completed Nov 30, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants