Fix Tensor SetItem for Growing Tensor #92

ntjohnson1 · 2023-04-18T16:25:03Z

If a tensor of a fixed size is created when using set item it should grow to accommodate provided size. When implementing tendiag I found a few edge cases where this doesn't work properly. Will add examples later.

For a single index:

>>> import pyttb as ttb
>>> import numpy as np
>>> X = ttb.tensor.from_data(np.ones((1,1)))
>>> X[1,1,1] = 1
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<path_to_clone>\pyttb\pyttb\tensor.py", line 1322, in __setitem__
    if (newsiz != self.shape).any():
AttributeError: 'bool' object has no attribute 'any'

For a series of indices:

>>> X = ttb.tensor.from_data(np.ones((2,1)))
>>> X[np.array([[0, 1],[0, 1]])] = 1
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<path_to_clone>\pyttb\pyttb\tensor.py", line 1368, in __setitem__
    newData[idx] = self.data
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
>>> X.data[np.array([[0, 1],[0, 1]])] # works

The text was updated successfully, but these errors were encountered:

ntjohnson1 · 2023-04-30T12:31:17Z

As I started to explore this there was definitely scope creep. The tensor indexing code was some of the more complicated logic when doing the translation. I might use this ticket as an opportunity to better describe the different options and look over our tests. Depending on how big an effort that looks like may break into another ticket/open a discussion about indexing clarity.

In case this pops up as a bug that is blocking something and needs a more short term fix here is the branch that resolves the two issues above and a few more https://github.com/ntjohnson1/pyttb/tree/tensor_indexing (each commit should have a test that shows an example of something that the change now supports)

dmdunla · 2023-05-01T15:51:16Z

This is quite a refactor! Since pyttb.tensor data is a np.ndarray, should we consider just directly leveraging np.array getitem/setitem methods? The current method is a refactor of the Tensor Toolbox for MATLAB tensor class. Would this simplify these methods?

ntjohnson1 · 2023-05-01T19:30:27Z

My changes above aren't too large at the moment. The break into methods commits make no change but break the giant single function into a few helpers. That might simplify things, but we'd want sptensor to be consistent. Additionally, I don't think numpy supports dynamic growth. I put the note above to provide the update that my plan was to write up some more thoughts on this before exploring further.

dmdunla · 2023-05-01T20:05:21Z

OK, looking at numpy.ndarray, I see there are helper methods, like expand_dims, append, insert, and concatenate that allow for extending arrays in different ways. But the Tensor Toolbox for MATLAB allows for updating the size, shape, and values of a tensor using slice notation. It's very convenient for working with tensors that need to be adjusted:

% Tensor Toolbox for MATLAB
>> X = tensor(ones([1 1]))
X is a tensor of size 1 x 1
	X(:,:) = 
	     1
>> X(1,1,1) = 1
X is a tensor of size 1 x 1 x 1
	X(:,:,1) = 
	     1
>> X(1,1,2) = 2
X is a tensor of size 1 x 1 x 2
	X(:,:,1) = 
	     1
	X(:,:,2) = 
	     2
>> X(1,2,2) = 2
X is a tensor of size 1 x 2 x 2
	X(:,:,1) = 
	     1     0
	X(:,:,2) = 
	     2     2
>> X([1,2],[1,2]) = 2
Error using tensor/subsasgn (line 126)
Invalid use of tensor/subsasgn
 
>> X([1:2],[1:2]) = 2
Error using tensor/subsasgn (line 126)
Invalid use of tensor/subsasgn
 
>> X([1:2],[1:2],[1:2]) = 2
X is a tensor of size 2 x 2 x 2
	X(:,:,1) = 
	     2     2
	     2     2
	X(:,:,2) = 
	     2     2
	     2     2
>> X([1,2],[1,2],[1,2]) = 3
X is a tensor of size 2 x 2 x 2
	X(:,:,1) = 
	     3     3
	     3     3
	X(:,:,2) = 
	     3     3
	     3     3
>> X = tensor(ones([1 1]))
X is a tensor of size 1 x 1
	X(:,:) = 
	     1
>> X([1:2],[1:2],[1:2]) = 2
X is a tensor of size 2 x 2 x 2
	X(:,:,1) = 
	     2     2
	     2     2
	X(:,:,2) = 
	     2     2
	     2     2

ntjohnson1 · 2023-05-01T20:23:13Z

It's very convenient for working with tensors that need to be adjusted

Agreed it looks like this dynamic nature is useful or at least consistent with MATLAB tensortoolbox today. We mostly cover that behavior, many of my fixes in the linked branch are edge cases with changes less than ~10 lines. I think a full refactor probably isn't necessary but thinking through the interpretation of the setitem/getitem and jotting down some notes is useful (as I mentioned above we can probably take my PR above then break out more follow on).

Ultimately, I just realized the expected behavior was a little unclear to me, which means its probably additionally unclear to a new user to pyttb. Since all our algorithms implemented to date appear to be numerically correct the indexing is used consistently for common paths, but want to try and clean it up a bit for further extension.

Here is one example where I would consider the lack of clarity effectively a bug (I didn't check the MATLAB side yet):

empty_sptensor = ttb.sptensor()
empty_sptensor[np.array([[0, 1], [3, 3]])] = 5
assert np.all(empty_sptensor[0, 3] != 5)

empty_tensor = ttb.tensor()
empty_tensor[np.array([[0, 1], [3, 3]])] = 5
assert np.all(empty_tensor[0, 3] == 5)

Basically for sptensor we are treating the numpy array as a series of subs where for tensor we are treating it as [row_1, row_2], [column_1, column_2]

dmdunla · 2023-05-02T12:45:15Z

Here is the MATLAB version:

>> empty_tensor = sptensor()
empty_tensor is an all-zero sparse tensor of size [empty tensor]
>> empty_tensor([1,2],[4,4]) = 5
empty_tensor is a sparse tensor of size 2 x 4 with 4 nonzeros
	(1,4)     5
	(1,4)     5
	(2,4)     5
	(2,4)     5
>> assert(empty_tensor([1,4]) ~= 5)
Assertion failed.
 
>> empty_tensor = tensor()
empty_tensor is a tensor of size [empty tensor]
	empty_tensor = []
>> empty_tensor([1,2],[4,4]) = 5
empty_tensor is a tensor of size 2 x 4
	empty_tensor(:,:) = 
	     0     0     0     5
	     0     0     0     5
>> assert(empty_tensor([1,4]) == 5)

ntjohnson1 · 2023-05-06T16:03:40Z

Here are the different kinds of MATLAB behavior for Tensor as far as I can tell (not including expanding dimensions etc):

Tensor (MATLAB):
    Linear:
    >> b = tenones(3,3);
    >> b(1);
    >> b(5);
    Subscripts:
    >> b = tenones(3,3);
    >> b([1,1;2,2]) = 3;
    >> assert(b(1,1) == 3);
    >> assert(b(2,2) == 3);
    Subtensor:
    >> b = tenones(3,3);
    >> b([1,1], [2,2]) = 3;
    >> assert(b(1,1) ~= 3);
    >> assert(b(2,2) ~= 3);
    >> assert(b(1,2) == 3);
    ```

ntjohnson1 · 2023-05-06T16:35:01Z

I fixed Linear for python in the branch (before it required a numpy array with a single value). Working my way through the different flavors to make sure they align with MATLAB:

>>> import pyttb as ttb
>>> b = ttb.tenones((3,3))
>>> b[0] = 5
>>> assert b[0] == 5
>>> b[0:1] = 6
>>> assert b[0:1] == 6

ntjohnson1 · 2023-05-08T13:26:32Z

This discussion has grown long. I am going to break this into two issues. The first for the bug identified initially that dimensions weren't growing properly using set item. Then the newer issue I saw while investigating that related to making our indexing more intuitive for python. I'll probably open the PR for the first later this week and pause digging into the second.

dmdunla · 2023-05-09T20:27:05Z

Basically for sptensor we are treating the numpy array as a series of subs where for tensor we are treating it as [row_1, row_2], [column_1, column_2]

From the MATLAB code, tensor.__setitem__ and sptensor.__setitem__ should behave the same (per the corresponding MATLAB tensor/subsasgn and sptensor/subsasgn.

For updating p individual elements of an n-dimensional tensor using n-tuple subscripts, you should pass a p x n array of indices into __setitem__ and the right-hand side should be a p x 1 array.

For updating p individual elements of an n-dimensional tensor using linear indices, you should pass a p x 1 array of indices into __setitem__ and the right-hand side should be a p x 1 array.

For updating a subtensor for an n-dimensional tensor, you should pass n arrays of size either 1 x 2 (for a range in that dimension) or of size 1 x 1 (for a single element in that dimension) and the right-hand side should be the appropriate size rectangular tensor. Note that "appropriate" is very MATLAB specific in this, as single dimensions in the first or last dimension of a multidimensional array are squeezed out, but for tensor classes in the Tensor Toolbox, they are not squeezed out unless the squeeze method is called explicitly.

ntjohnson1 · 2023-06-01T15:52:48Z

The original scope of this is closed with #109

ntjohnson1 added the bug Something isn't working label Apr 18, 2023

ntjohnson1 added this to the v2.0 milestone Apr 18, 2023

ntjohnson1 self-assigned this Apr 18, 2023

ntjohnson1 mentioned this issue Apr 18, 2023

Tensor generator helpers #93

Merged

ntjohnson1 added the doing Actively being worked on label Apr 27, 2023

This was referenced May 22, 2023

Clean Up Tensor/Sptensor setitem (and getitem to match) #108

Closed

Tensor growth #109

Merged

ntjohnson1 closed this as completed Jun 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Tensor SetItem for Growing Tensor #92

Fix Tensor SetItem for Growing Tensor #92

ntjohnson1 commented Apr 18, 2023 •

edited

Loading

ntjohnson1 commented Apr 30, 2023

dmdunla commented May 1, 2023

ntjohnson1 commented May 1, 2023

dmdunla commented May 1, 2023

ntjohnson1 commented May 1, 2023

dmdunla commented May 2, 2023

ntjohnson1 commented May 6, 2023

ntjohnson1 commented May 6, 2023

ntjohnson1 commented May 8, 2023

dmdunla commented May 9, 2023 •

edited

Loading

ntjohnson1 commented Jun 1, 2023

Fix Tensor SetItem for Growing Tensor #92

Fix Tensor SetItem for Growing Tensor #92

Comments

ntjohnson1 commented Apr 18, 2023 • edited Loading

ntjohnson1 commented Apr 30, 2023

dmdunla commented May 1, 2023

ntjohnson1 commented May 1, 2023

dmdunla commented May 1, 2023

ntjohnson1 commented May 1, 2023

dmdunla commented May 2, 2023

ntjohnson1 commented May 6, 2023

ntjohnson1 commented May 6, 2023

ntjohnson1 commented May 8, 2023

dmdunla commented May 9, 2023 • edited Loading

ntjohnson1 commented Jun 1, 2023

ntjohnson1 commented Apr 18, 2023 •

edited

Loading

dmdunla commented May 9, 2023 •

edited

Loading