Syntax for slices #13

Closed
nikomatsakis opened this issue Mar 17, 2014 · 19 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@nikomatsakis
Contributor

There is a long discussion on rust-lang/rust#4160 about the possibility of adding explicit slice operators to the language. Clearly if this is to be done, it requires an RFC and explicit discussion and approval.

I think the consensus on that thread was for the proposal found in this comment (full disclosure, by me):

  1. Add a Slice trait:

    trait Slice<T> {
        fn as_slice<'a>(&'a self) -> &'a [T];
        fn slice_from<'a>(&'a self, from: uint) -> &'a [T];
        fn slice_to<'a>(&'a self, to: uint) -> &'a [T];
        fn slice<'a>(&'a self, from: uint, to: uint) -> &'a [T];
    }
    
    trait MutSlice<T> : Slice<T> { /* as above but with mut qualifiers */ }
    
  2. Add a slice operator expr[a..b] where a and b are optional.

    • The operator autodereferences, like indexing.
    • For fixed-length vectors, its effect is built-in.
    • For other types, it is translated to a call to the appropriate trait method:
      • expr[..] => expr.as_slice()
      • expr[a..] => expr.slice_from(a)
      • expr[..b] => expr.slice_to(b)
      • expr[a..b] => expr.slice(a, b)
  3. Do we have something for mutable slices? Perhaps expr[mut a..b]? These come up less frequently but nonetheless it seems useful, particularly for fixed-length vectors since otherwise there remains no explicit syntax for slicing one that does not rely on coercion.
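
The translation in item 2 can be sketched in present-day Rust syntax. This is only an illustration under stated assumptions: `usize` stands in for `uint`, the `Buffer` type and the `sample` helper are hypothetical, and the proposed `expr[a..b]` sugar is shown in comments next to the method call it was meant to expand to.

```rust
// Sketch of the proposed trait, with `usize` where the 2014 text uses `uint`.
trait Slice<T> {
    fn as_slice(&self) -> &[T];
    fn slice_from(&self, from: usize) -> &[T];
    fn slice_to(&self, to: usize) -> &[T];
    fn slice(&self, from: usize, to: usize) -> &[T];
}

// Hypothetical container, used only for illustration.
struct Buffer {
    data: Vec<i32>,
}

impl Slice<i32> for Buffer {
    // expr[..]   => expr.as_slice()
    fn as_slice(&self) -> &[i32] {
        &self.data[..]
    }
    // expr[a..]  => expr.slice_from(a)
    fn slice_from(&self, from: usize) -> &[i32] {
        &self.data[from..]
    }
    // expr[..b]  => expr.slice_to(b)
    fn slice_to(&self, to: usize) -> &[i32] {
        &self.data[..to]
    }
    // expr[a..b] => expr.slice(a, b)
    fn slice(&self, from: usize, to: usize) -> &[i32] {
        &self.data[from..to]
    }
}

// Hypothetical helper that builds a small buffer to experiment with.
fn sample() -> Buffer {
    Buffer { data: vec![10, 20, 30, 40, 50] }
}
```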

I'm still roughly in favor of this, though I think I would change the syntax expr[..] to expr[], just because it's shorter. Note though that with DST one can simply do &*vec to get the as_slice() notation.

@pnkfelix
Member

though I think I would change the syntax expr[..] to expr[], just because it's shorter.

But more easily confused with the normal index operator by a novice, no? Especially when you consider the kinds of error messages that the compiler will emit when someone accidentally omits the index argument to the index operator?

expr[..] has the virtue that every expression using the slice operator has a .. in its indexing expression.

(Anyway, I'm not actually sure I'm in favor of this, but I'll have to think on it further. Then again, there may be some interesting ternary operators one could abuse this syntax to express; whether that's a pro or a con is subjective, of course...)

@huonw
Member

huonw commented Mar 17, 2014

Note though that with DST one can simply do &*vec to get the as_slice() notation.

(This will even happen automatically with auto-deref if I understand it, right?)
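
For reference, a small sketch of both forms as they work in present-day Rust, where `Vec<T>` dereferences to `[T]`. The function names are hypothetical and only serve the illustration.

```rust
// Takes a borrowed slice, as most slice-consuming functions do.
fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Explicit form: `*v` has the unsized type `[i32]`, so `&*v` is a `&[i32]`.
fn explicit_reborrow() -> i32 {
    let v = vec![1, 2, 3];
    let s: &[i32] = &*v;
    total(s)
}

// Implicit form: `&Vec<i32>` is coerced to `&[i32]` at the call site.
fn implicit_coercion() -> i32 {
    let v = vec![1, 2, 3];
    total(&v)
}
```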

@chris-morgan
Member

I am in favour of writing expr[..] rather than expr[]. It's more obvious, and there's also precedent in Python's expr[:], whereas as an expression expr[] is more commonly used in other languages for appending (e.g. in PHP $foo[] = $bar is equivalent to our foo.push(bar)).

Having slice assignment would be nice, also: expr[a..b] = […];—but I can perceive that it might be troublesome. Python has it.

Python's approach to slicing as a whole is quite interesting also in that expr[start:end:step] is syntax sugar for expr[slice(start, end, step)] (I'm not suggesting that we add support for steps; that's not the sort of thing that fits in in Rust). This then translates to a simple __getitem__ call (or __setitem__), just with a slice object as the index argument. Would it be feasible to get slicing to use the same indexing traits as normal indexing?

Python also allows for tuples and Ellipsis in there, e.g. expr[a, b:c, ...]. Getting slice objects, numbers and ellipsis to all play together nicely would require something like rust-lang/rust#8277, but it'd be nice if we could have something like impl Index<[Slice | int | Ellipsis, ..N], MatrixSlice> for MatrixSlice. numpy is an example of something that uses this form of complex slicing/indexing to very good purpose, for N-dimensional matrices.

(My comments here evidently encompass more than just a simple slice syntax. They're probably unreasonably broad in scope for now, but hopefully food for thought and an alternative model worth considering.)
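
The question of routing slicing through the same indexing trait can be illustrated with present-day Rust, where `std::ops::Index` is generic over the index type and a `std::ops::Range` value can serve as that index. This is a sketch, not the design under discussion in this thread; the `Samples` type and `sample` helper are hypothetical.

```rust
use std::ops::{Index, Range};

// Hypothetical wrapper type, for illustration only.
struct Samples {
    data: Vec<f64>,
}

// Single-element indexing: `s[i]` yields one `f64`.
impl Index<usize> for Samples {
    type Output = f64;
    fn index(&self, i: usize) -> &f64 {
        &self.data[i]
    }
}

// Slicing goes through the same trait, with a range object as the index,
// much like Python passing a `slice` object to `__getitem__`.
impl Index<Range<usize>> for Samples {
    type Output = [f64];
    fn index(&self, r: Range<usize>) -> &[f64] {
        &self.data[r]
    }
}

// Hypothetical helper that builds a small value to experiment with.
fn sample() -> Samples {
    Samples { data: vec![0.5, 1.5, 2.5, 3.5] }
}
```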

@jmgrosen

I would personally like to have an option to change the index type, assuming something more like the proposal of @nikomatsakis goes through (as opposed to that of @chris-morgan). That is,

trait Slice<T, Idx=uint> {
    fn as_slice<'a>(&'a self) -> &'a [T];
    fn slice_from<'a>(&'a self, from: &Idx) -> &'a [T];
    fn slice_to<'a>(&'a self, to: &Idx) -> &'a [T];
    fn slice<'a>(&'a self, from: &Idx, to: &Idx) -> &'a [T];
}

That said, if/when this gets approved in some manner, I'd like to take a crack at adding it as my first non-trivial contribution; I added it around two months ago without too much work, but didn't end up sending a PR due to being indecisive over the return type.
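
A minimal sketch of the index-parameterized trait above in present-day Rust syntax, assuming `usize` in place of `uint`. The `Compact` type is hypothetical; it picks `u32` as its index type to show how an implementor could avoid casts at call sites.

```rust
// Sketch of the index-parameterized trait; `usize` stands in for `uint`.
trait Slice<T, Idx = usize> {
    fn as_slice(&self) -> &[T];
    fn slice_from(&self, from: &Idx) -> &[T];
    fn slice_to(&self, to: &Idx) -> &[T];
    fn slice(&self, from: &Idx, to: &Idx) -> &[T];
}

// Hypothetical container addressed by 32-bit positions.
struct Compact {
    data: Vec<u8>,
}

impl Slice<u8, u32> for Compact {
    fn as_slice(&self) -> &[u8] {
        &self.data
    }
    fn slice_from(&self, from: &u32) -> &[u8] {
        // The cast to `usize` happens once, inside the implementation.
        &self.data[*from as usize..]
    }
    fn slice_to(&self, to: &u32) -> &[u8] {
        &self.data[..*to as usize]
    }
    fn slice(&self, from: &u32, to: &u32) -> &[u8] {
        &self.data[*from as usize..*to as usize]
    }
}
```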

@emberian
Member

-1 I see this as superfluous.

@zkamsler

Does this have any bearing on the eventual fate of autoslicing, since the current incarnations of ~[T] are on their way out? The ongoing ~[T] -> Vec<T> conversions have introduced quite a few as_slice calls and some occasional &Vec<T> parameters. I had gathered that there was some sentiment against autoslicing, since it can obscure parameter ownership at the call site, but it is convenient.

@sfackler
Member

Autoslicing should be possible (I think? I'm not totally sure if it'll autoderef to match a function argument) with Deref and DerefMut after DST lands:

impl<T> Deref<[T]> for Vec<T> {
    fn deref<'a>(&'a self) -> &'a [T] {
        self.as_slice()
    }
}

I'm not sure if that's a good idea, both because of the ownership issues you raised and the fact that methods on &[T] will magically become callable on a Vec without showing up in the docs.
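
Both effects described above can be seen in a small sketch written against present-day Rust, where `Deref` names its target through an associated type rather than a type parameter. The `Stack` type and the `largest` and `demo` functions are hypothetical and exist only for illustration.

```rust
use std::ops::Deref;

// Hypothetical owned container standing in for `Vec<T>`.
struct Stack<T> {
    items: Vec<T>,
}

impl<T> Deref for Stack<T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        &self.items
    }
}

// A function that only accepts a borrowed slice.
fn largest(xs: &[i32]) -> Option<i32> {
    xs.iter().copied().max()
}

fn demo() -> (usize, Option<i32>) {
    let s = Stack { items: vec![3, 9, 4] };
    // `len` is defined on `[T]`, not on `Stack`; it is reachable via autoderef.
    let n = s.len();
    // `&Stack<i32>` is coerced to `&[i32]` to match the parameter type.
    (n, largest(&s))
}
```

Note that `len` appears nowhere in the definition of `Stack`, which is the documentation concern raised above.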

@lilyball
Contributor

lilyball commented May 4, 2014

I'm very slightly in favor of slicing syntax, but I think it would have to be compatible with strings too, not just vectors.

@Florob

Florob commented May 4, 2014

@kballard: Isn't the intent that this would work with any type implementing the Slice trait?

I personally like this idea, it seems like useful syntax sugar for a common operation. Like others I too prefer expr[..] over expr[] for consistency though.

@lilyball
Contributor

lilyball commented May 4, 2014

@Florob The proposed return value of the slice operation is &'a [T]. That's not compatible with strings.

@dobkeratops

+1 for parameterising the index type - I wish Vec had this too, Vec<T,I=uint> - (for code where we want to predominantly use 32bit indices, it would really help reduce the casting done, also one could express more information through types r.e. which indices apply to which collections; and allow signed indices (should just bounds check), for code where negative indices have special meaning like tristrip breaks or whatever)

interesting sugar, but with the move to Vec, Box it almost seems strange to me that rust is keeping a slice concept built into the language. Perhaps i'm perceiving that wrong, perhaps it is a lower level concept, closer to raw arrays.

@lilyball
Contributor

lilyball commented May 4, 2014

Vec, StrBuf are for owned containers. But most functions take slices or &str. Having convenient syntax for that is quite nice.

@huonw
Member

huonw commented May 4, 2014

Perhaps i'm perceiving that wrong, perhaps it is a lower level concept, closer to raw arrays.

The [T] is low-level: it's just a raw C array (in the sense of T*) where the length is automatically carried around with the pointer (so that it is safer).

(However, discussion of the appropriateness of having the [T] type built-in is mostly off-topic for this RFC, which is just about indexing syntax.)
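
The "pointer plus length" description of a slice reference can be checked with a short sketch in present-day Rust. The size comparison holds on current compilers, though treating it as a hard guarantee is an assumption here; the function names are hypothetical.

```rust
use std::mem::size_of;

// A `&[u8]` occupies the space of a thin pointer plus one `usize` length.
fn slice_ref_is_pointer_plus_length() -> bool {
    size_of::<&[u8]>() == size_of::<*const u8>() + size_of::<usize>()
}

// Splits a slice reference into the two pieces it carries around.
fn parts(xs: &[u8]) -> (*const u8, usize) {
    (xs.as_ptr(), xs.len())
}
```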

@ftxqxd
Contributor

ftxqxd commented May 5, 2014

@kballard
The proposed return value of the slice operation is &'a [T]. That’s not compatible with strings.

Given that strings already return u8s when indexing, it seems consistent for slices to behave the same way and return &'a [u8]. (But IMO strings shouldn’t be indexable at all because of the byte ↔ code point confusion — from looking at a simple indexing operation, how can one tell if one is indexing bytes by byte position, code points by byte position, or code points by code point position (or maybe even graphemes)?)
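
The byte versus code point ambiguity is easy to reproduce with a single non-ASCII character. A minimal sketch in present-day Rust, with hypothetical helper names:

```rust
// Length in bytes of the UTF-8 encoding.
fn byte_len(s: &str) -> usize {
    s.len()
}

// Number of Unicode code points.
fn code_points(s: &str) -> usize {
    s.chars().count()
}

// The first byte of the encoding, which need not be a whole character.
fn first_byte(s: &str) -> u8 {
    s.as_bytes()[0]
}
```

For the one-character string `"\u{e9}"` (é), the byte length is 2 while the code point count is 1, so "index 0" means something different under each interpretation.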

@huonw
Member

huonw commented May 5, 2014

@P1start yes, string [] indexing (in its current form) is slated for removal: rust-lang/rust#12710.

@lilyball
Contributor

lilyball commented May 5, 2014

@P1start Returning a &'a [u8] from string slicing is extremely unhelpful. Even if string indexing is removed, we'll always have the ability to slice strings with .slice() and friends, using byte indexes, returning a &str. It's too painful not to have that ability (going through &[u8] requires doing a utf-8 check to get back to &str, which is extremely wasteful).

Given that, if we support [a..b] slicing for &[u8], it only makes sense to support it for strings as well (and returning a &str).
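
The cost argument above can be sketched in present-day Rust, where byte-range slicing of a `&str` yields a `&str` directly, while a round trip through `&[u8]` needs `std::str::from_utf8` to re-validate the bytes. The function names are hypothetical.

```rust
use std::str;

// Slices by byte offsets and stays a `&str`.
// Panics if `a` or `b` does not fall on a character boundary.
fn direct(s: &str, a: usize, b: usize) -> &str {
    &s[a..b]
}

// Goes through `&[u8]`, which forces a UTF-8 check to get back to `&str`.
fn via_bytes(s: &str, a: usize, b: usize) -> Result<&str, str::Utf8Error> {
    let bytes: &[u8] = &s.as_bytes()[a..b];
    str::from_utf8(bytes)
}
```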

@peterhj

peterhj commented Aug 3, 2014

@chris-morgan +1, numerical/matrix stuff is a pretty big use case for indexing and slicing. In particular, assuming that we have your "anonymous union types" in some future, writing in a natural indexing syntax:

impl Index<[uint | SliceOp<uint>, ..2], Matrix<F>> for Matrix<F> { ... }

would be substantially more pleasant than using tuple indices:

impl Index<(uint | SliceOp<uint>, uint | SliceOp<uint>), Matrix<F>> for Matrix<F> { ... }

I like the Index idea very much. A general pitfall though is that indexing a single element is qualitatively different from indexing an arbitrary slice, perhaps more so for simple structures like 1-D vectors than for, e.g., N-dim. arrays/matrices. One may reasonably expect some_byte_slice[5] to return a u8 rather than &'a [u8]. On a similar note, slicing in only one dimension of an M x N Matrix, for example like [0..K,0] may be expected to produce a Vector of length K rather than a K x 1 Matrix.

A possible solution:

  1. Reserve [] indexing for obtaining the "underlying element" type, while also allowing multi-dim. indices.
  2. Use the Slice trait in combination with SliceOp (and, hopefully, union types).

I think using Slice along with SliceOp like the following would be quite usable, if a bit ambitious (if it's unclear, SliceOp is an enum type generated from .. operator):

trait Index<[E, ..N], R> {
  fn index<'a>(&'a self, idx: &[E, ..N]) -> &'a R;
}

trait Slice<[E | SliceOp<E>, ..N], T> {
  fn as_slice<'a>(&'a self) -> &'a T;
  fn slice<'a>(&'a self, ops: &[E | SliceOp<E>, ..N]) -> &'a T;
}

(edited 8/5)
Not sure if I'm correctly expressing that index() and slice() should look and feel "variadic", i.e., we shouldn't need to pass them tuples. Moving E | SliceOp<E> out of the trait definition and into the implementation might be simpler, although I think it's more expressive as part of the trait.

A Matrix type may then choose to implement its own slice_col() and slice_row() methods to return column and row Vectors, respectively, whereas slice() will return a Matrix, and index() will return a float.
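
The split between element indexing and row slicing can be sketched in present-day Rust. Anonymous union types and variadic indices are not available, so this sketch falls back on a tuple index, which is the form described above as less pleasant. The row-major `Matrix` layout and the `sample` helper are assumptions made for illustration, and `slice_row` returns a plain `&[f64]` rather than a dedicated Vector type.

```rust
use std::ops::Index;

// Hypothetical row-major matrix, for illustration only.
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

// Element access with a tuple index: `m[(r, c)]` yields a single `f64`.
impl Index<(usize, usize)> for Matrix {
    type Output = f64;
    fn index(&self, (r, c): (usize, usize)) -> &f64 {
        assert!(r < self.rows && c < self.cols, "index out of bounds");
        &self.data[r * self.cols + c]
    }
}

impl Matrix {
    // A row is contiguous in row-major storage, so it can be a plain slice.
    fn slice_row(&self, r: usize) -> &[f64] {
        assert!(r < self.rows, "row out of bounds");
        &self.data[r * self.cols..(r + 1) * self.cols]
    }
}

// Hypothetical helper that builds a 2 x 3 matrix to experiment with.
fn sample() -> Matrix {
    Matrix { rows: 2, cols: 3, data: vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0] }
}
```

A column is not contiguous in this layout, so a `slice_col` counterpart would need a strided view or a copy instead of a borrowed slice.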

@brendanzab
Member

+1 from me, with expr[..] and a type parameter for the index type. This is good for ergonomics.

@nikomatsakis
Contributor Author

Closed in favor of #198

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Feb 23, 2018