Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Combined indexing for __getitem__ and __setitem__ #55211

Merged
merged 35 commits into from
Aug 4, 2023

Conversation

zoooo0820
Copy link
Contributor

@zoooo0820 zoooo0820 commented Jul 6, 2023

PR types

New features

PR changes

APIs

Description

Pcard-66985

Background

At present, the indexing and assignment (i.e. __getitem__ / __setitem__) of Paddle are incomplete in combined indexing (basic-indexing and advanced-indexing both appeared). This brings great trouble to build some complex models.

  • Problem 1: For __getitem__ , both dynamic and static mode are not supported combined indexing.
>>> x = paddle.randn((2,3,4))
>>> x[:,[0,1]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/paddle/fluid/dygraph/tensor_patch_methods.py", line 747, in __getitem__
    return self._getitem_index_not_tensor(item)
ValueError: (InvalidArgument) When index contains a list, its length is excepted to 1, but received 2
  [Hint: Expected size == 1, but received size:2 != 1:1.] (at /paddle/Paddle/paddle/fluid/pybind/slice_utils.h:249)
  • Problem 2: For __setitem__, combined indexing forward but not backward is supported in dynamic mode; both forward and backward are not supported in static mode.
>>> x = paddle.randn((2,2))
>>> x.stop_gradient=False
>>> y = x + 1
>>> y[:,[[0]]] = 10
>>> paddle.grad(y, x)
[Tensor(shape=[2, 2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
       [[1., 1.],
        [1., 1.]])]

# the correct result should be 
[Tensor(shape=[2, 2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
       [[0., 1.],
        [0., 1.]])]
>>> paddle.enable_static()
>>> x = paddle.randn((2,2))
>>> x[:, [[0]]] = 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/framework.py", line 2387, in __setitem__
    return _setitem_impl_(self, item, value)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/variable_index.py", line 684, in _setitem_impl_
    .format(item))
IndexError: Valid index accept int or slice or ellipsis or list, but received [slice(None, None, None), [[0]]].

What this PR did

  1. [New Feature] Support combined indexing for both __getitem__ and __setitem__, with autograd;
  2. [Bug fixes] Fix wrong output shape when slices containing a stride in dynamic mode __getitem__; e.g. If x.shape=(2,3,4), x[0, ::2, 1].shape will be [2], while currently outputs [1,2,1], incorrectly
  3. [Optimize] Change the strategy for advanced-indexing in __setitem__, using index_put instead of earlier gather_nd -> elementwise_sub -> scatter_nd_add or reshape-> elementwise_mul / reduce_sum -> stack -> scatter -> reshape.
  4. Modify some unit tests since these case are supported and will not raise error in the future.

@paddle-bot
Copy link

paddle-bot bot commented Jul 6, 2023

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@zoooo0820 zoooo0820 force-pushed the combined_indexing branch from 76304a5 to d6f9a2c Compare July 11, 2023 06:12
@paddle-ci-bot
Copy link

paddle-ci-bot bot commented Jul 19, 2023

Sorry to inform you that d6f9a2c's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

jeff41404
jeff41404 previously approved these changes Jul 28, 2023
Copy link
Contributor

@jeff41404 jeff41404 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jeff41404 jeff41404 merged commit 697c712 into PaddlePaddle:develop Aug 4, 2023
@zoooo0820 zoooo0820 deleted the combined_indexing branch August 4, 2023 06:49
cxxly pushed a commit to cxxly/Paddle that referenced this pull request Aug 7, 2023
…le#55211)

* WIP: start writing combined indexing get

* list/tuple/Variable

* getitem 80%

* add setitem

* add some unittest for setitem

* lazy import

* fix some setitem error

* fix advance indexing with decreasing axes; fix strided_slice input name

* combine int-tensor getitem is ok (without boolean support & broadcast); add getitem unittest for static

* add broadcast & parse bool tensor for __getitem

* [change getitem] _getitem_impl_ to _getitem_static, not deleting the former one

* refine new getitem; fix ut in variable/var_base

* add __getitem__ ut in dygraph

* re-dispatch getitem for Py/CPP; fix strided_slice decrease axes error in dygraph

* fix ut; support tensor in slice

* [change setitem] _setitem_impl_ to _setitem_static, not deleting the former one

* remove some UT (for some, temporarily)

* add IndexError to solve timeout problem in static-mode

* 1.temply forbideen all-False bool-indexput; 2.setitem_static will return new variable

* xpu uses old stratege

* rename dy2st setitem ut to avoid same-name problem

* dy2st for new combined index

* ut case for combine-index with dy2st

* open ut with all-false-bool setitem

* remove useless doc and _getitem_impl_

* change static res

* fix static xpu
cxxly pushed a commit to cxxly/Paddle that referenced this pull request Aug 7, 2023
…le#55211)

* WIP: start writing combined indexing get

* list/tuple/Variable

* getitem 80%

* add setitem

* add some unittest for setitem

* lazy import

* fix some setitem error

* fix advance indexing with decreasing axes; fix strided_slice input name

* combine int-tensor getitem is ok (without boolean support & broadcast); add getitem unittest for static

* add broadcast & parse bool tensor for __getitem

* [change getitem] _getitem_impl_ to _getitem_static, not deleting the former one

* refine new getitem; fix ut in variable/var_base

* add __getitem__ ut in dygraph

* re-dispatch getitem for Py/CPP; fix strided_slice decrease axes error in dygraph

* fix ut; support tensor in slice

* [change setitem] _setitem_impl_ to _setitem_static, not deleting the former one

* remove some UT (for some, temporarily)

* add IndexError to solve timeout problem in static-mode

* 1.temply forbideen all-False bool-indexput; 2.setitem_static will return new variable

* xpu uses old stratege

* rename dy2st setitem ut to avoid same-name problem

* dy2st for new combined index

* ut case for combine-index with dy2st

* open ut with all-false-bool setitem

* remove useless doc and _getitem_impl_

* change static res

* fix static xpu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants