-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Naming suggestions #2
Comments
Thanks. It is good to get comments, and naming is not easy and a compromise. For Span, I agree Slice could work and is commonly used. The reason for using Span is mainly to align with C++, but it is not that important. One slight drawback with Slice is risk of confusion with normal slices, but it is the case also for Array. Grid is maybe the most controversial, but it is being used for multidimensional arrays for example in some crates. One alternative could be Tensor, which is well known and used e.g. in C++ Eigen/xtensor and in Python. Would that work? For Expression/Expr/... the idea is as you have seen to mirror the iterator types. But they can be seen as more than iterators, for example there are operators defined and they can be mixed with arrays. So they are not that different from expression templates in C++ libraries. I also wanted to have short names for expr/expr_mut/into_expr since they could be quite common. |
preliminary remark: Interestingly, Rust’s However, due to the prominence of C++, one can argue that by now the term “vector” has acquired the meaning “dynamic array” as well, so I think it’s fine for Rust to use it. Still, I think it would have been better to use Returning to mdarray, in my opinion the current names do work as pointers into documentation, but not as more. They have the advantage of being compact, but otherwise they convey no meaning even to someone who has experience with numerical computation. They do not even work as aid to memory. This could be fine. After all it’s OK to introduce some jargon, and if the four terms array, grid, expr, and span are introduced clearly and prominently in the documentation, it may be not too much to ask from users to simply learn their meanings as used in this library. However, I think that the names could be improved. This is my personal impression, and I write this down in the hope of being of some help, so feel free to completely ignore... An For Same with To me, the biggest problem is how to cleanly name the two concepts that are currently called Perhaps this confusion is just the symptom of a deeper problem? Couldn’t one get rid of
But I suspect that I am missing something here... |
One thing that causes confusion is that Expr/ExprMut are used for both array views and expressions. It is probably better to call them View/ViewMut instead. It makes it clear they are array views, and that they happen to implement Expression like some other types. To make it consistent, IntoExpr should no longer be an array type that that can be dereferenced to Span. The existing Expr/ExprMut types could of course be kept for expressions, and that View/ViewMut implement IntoExpression instead of Expression. However I think it seems better to remove them to simplify and reduce the number of types. Renaming Span to Slice is fine I think, as the latter is more well known. It also makes sense since it is unsized and must be used behind a reference like normal slices. Regarding owned arrays, the Array type is just a wrapper around a normal array. It contains no pointer, so it can't be replaced with an array view. Given that there are two owned array types, it seems better to use Array for the fixed array since it is a wrapper and to make it consistent with normal arrays. I don't think it is that bad to use Tensor for a dynamic array, since it is commonly used for a multidimensional array in other libraries. So with this my current thinking is to do these renamings:
|
Both in Rust and Numpy/Python a slice is a view of a part of a container, just like a slice of bread is a part of the whole loaf. Crucially, except in trivial cases, the same container can be sliced in many different ways. But the type that is currently called In addition, in Rust references to slices are wide pointers, which would not be the case here and could lead to further confusion, since "slice" is such a well-known concept in Rust. Thanks for explaining the difference between "Tensor" is definitely much clearer then "grid". What I don’t like in the resulting naming scheme is that the other types (i.e. I think that the following naming scheme (for example) would be more symmetric: Or perhaps you don’t like The type that is currently called
This is much clearer, isn’t it? I think that I would like the following a lot:
Now we have one basis concept ("tensor"), and three realizations! |
As I'm thinking, Span (or Slice) is very similar to the Rust slice type, and it can represent a part of the container but also the full container. For example with vec.iter() you don't get an iterator over a vector but an iterator over a slice, since vec will first dereference to a slice. It does not matter how Span is represented internally, and what is important is the interface to the user. Now &Span is a thin pointer, but if/when custom DST is possible I would like to represent it with a wide pointer to the elements and to the mapping type. This is better for the compiler since direct references can have noalias annotations which should help optimizations. Thanks for the feedback on Tensor vs Grid. I see what you mean about using Tensor for one type when other types are also tensors, but I think it is difficult to avoid. There is the same issue with normal arrays and Vec, that you should use slices in interfaces even though they are arrays/vectors. To me it is all just multidimensional arrays, and I want to find names that are reasonably correct and fit together. Regarding array references, yes I think it would be better to use ArraySlice instead of ArrayRef. The reason is that &ArrayRef looks like a double reference, and for mutable references it is unclear if they should be called &mut ArrayRef or &mut ArrayMut. But otherwise the proposal is good. |
Sure, but currently in mdarray all spans that reference some array (say a
Say we have an n-dimensional array and we want to iterate over some m-dimensional slices, where m <= n. In the general case, the mapping object must store the shape and the strides of the slices, so this can be quite a large object which won't fit itself into the wide pointer. However, all slices that one iterates over share the same shape and strides, so they can use the same mapping object. So, if I understand correctly, you would store the mapping as part of the iterator object, and then the wide pointer to a slice would point to the data (this pointer will wander among the data), and to the mapping inside the iterator. The lifetime of each slice will be bound to the lifetime of the iterator. Is this what you'd like to do one Rust supports custom DSTs? If yes, then I fully agree that this future thing would be great, very useful, and should be called something like "slice". If we can be sure that the API for these great future slices will match the API of the current restricted pseudo-slices, then I think there's a point in renaming current But if we can't be sure, I think it's better to use a different name, and reserve |
Yes and no. If you create a Span reference from a Grid, it will point to the RawSpan that is embedded in the grid. So if you copy the reference around or dereference the grid several times they will be identical. But you can also create a view to the same array, so that you get a View or ViewMut instance that contains a new RawSpan internally and can then be dereferenced. So the difference compared to normal slices is that Span needs an array instance to point to, and not just the array elements. This can only be avoided with custom DSTs that are larger than two words. But otherwise the important function of Span is that it is the type that all arrays can dereference to, and can be used for example in function arguments. For iteration, the shape and strides are stored in the mapping type in each array instance. If the iteration is over several arrays, they will each have a mapping type (since strides might be different per array) and the arrays are stored recursively in Zip structures. When iterating over Span references, they must first be converted to array views that contain a RawSpan and lifetime. This is similar to when iterating over normal slices, since they must first be converted to Iter or IterMut. |
OK, so if
I think that's really quite good, certainly much clearer than the current names! In a way |
Coming from C++, Python/Numpy and Rust, my impression is that some of the most central concepts of mdarray are named in a confusing way. Perhaps this is due to my rudimentary understanding, but perhaps the author will find an outsider's perspective interesting. Here are some of my thoughts.
Span
In C++ there is
std::span
, but in Rust the equivalent isslice
. The same name is used in Python/Numpy, so many potential users have been exposed to it. Wouldn't it be better to use the nameSlice
instead ofSpan
in mdarray? (E.g. there would beSlice
andRawSlice
, etc.)But if I understand correctly, mdarray's
Span
is not really the equivalent of C++'smdspan
. (That would be ratherExpr
.) Mdarray'sSpan
seems rather a simple reference to an existing array. So how about renaming it toArrReference
orArrRef
?Grid
I'm not aware of the name "grid" being used in numerical computing, except in the sense of a physical grid (e.g. a hexagonal grid), or in "grid computing". So while short,
Grid
does not seem intuitive for "owned array". How about renamingGrid
to simplyArray
? Alternatively, if ownership is to be emphasized, how aboutOwnedArr
?Array
Static arrays in mdarray are called
Array
which nicely mirrors the usage in Rust, but arguably conflicts with the expectation that the name "Array" in an array library denotes a heap-allocated array. How about renaming it toStaticArr
?Expression
In computing, the term "expression" means "something that can be evaluated to determine its value". If my understanding is correct, mdarray's trait
Expression
is an array iterator. Why not call itArrIterator
then? Then there could beIntoArrIterator
etc. This naming should be much clearer to people having some experience with Rust.Expr
If I understand correctly,
Expr
is an "owned view". WhileSpan
is just a "reference to an array", andExpr
has its own shape and stried. Only the actual array values are that of another array. So how about calling itArrView
?shorter names
Perhaps the different names proposed here are deemed a bit long. One possibility would be to get rid of the "Arr" and simply have
Iterator
,Ref
,View
,Static
,Owned
. If mdarray is used asmd
these would becomemd::View
ormd::Iterator
which I'd say are fine.My gut feeling is that the longer names are preferable, though. It feels a bit presumptuous to introduce another
Iterator
trait.The text was updated successfully, but these errors were encountered: