【Hackathon No.11】add RFC for index_fill #122

Merged
merged 6 commits into from
Oct 21, 2022
318 changes: 318 additions & 0 deletions rfcs/APIs/20220429_api_design_for_index_fill.md
@@ -0,0 +1,318 @@
# paddle.Tensor.index_fill Design Document



|API name | paddle.Tensor.index_fill |
|---|------------------------------------------|
|Author | thunder95 |
|Submission date | 2022-04-29 |
|Version | V1.0 |
|Required PaddlePaddle version | develop |
|File name | 20220429_api_design_for_index_fill.md |



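# I. Overview

## 1. Background

To broaden PaddlePaddle's API coverage and support APIs used in scientific computing, Paddle needs to implement the `paddle.index_fill` API.

## 2. Goals

Add the APIs `paddle.index_fill`, `paddle.index_fill_`, `Tensor.index_fill`, and `Tensor.index_fill_`, which fill the input tensor with a fixed value along a specified axis, at the positions listed in `index` (taken in the order given).

## 3. Significance

Supporting the index_fill operator in PaddlePaddle further meets user needs.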



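# II. Current State of PaddlePaddle

Paddle currently has no implementation of this functionality, and composing it from other APIs is difficult: `axis` can be any axis and `index` need not be contiguous, so neither `paddle.index_select` nor `paddle.slice` achieves the goal directly.

In a simple case, the computation looks like this: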





```Python
import paddle
import numpy as np

np.random.seed(102)

np_data = np.random.rand(4, 3)
pd_tensor = paddle.to_tensor(np_data)
np_res = np_data.copy()

fill_val = 9.0
index = [0, 2]

# NumPy reference: fill rows 0 and 2 (axis 0) with fill_val.
np_res[index, :] = fill_val

# Paddle: without index_fill, loop over axis 0 and assign row by row.
for i in range(pd_tensor.shape[0]):
    if i in index:
        pd_tensor[i, ...] = fill_val

print(np.allclose(np_res, pd_tensor.numpy()))
```



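# III. Survey of Existing Solutions

## NumPy

### Implementation

NumPy has no dedicated API for this functionality, but its comprehensive slicing and broadcasting make it easy to implement. For example: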



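```Python
import numpy as np

axis = 0
index = [0, 2]
data = np.random.rand(4, 3)

# Fill rows 0 and 2 (axis 0) with a scalar; NumPy broadcasts it over the selection.
data[index, :] = 9.0
print(data[index, :])
```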
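The slicing approach is not tied to axis 0. As a minimal sketch, the helper below (named `index_fill_ref` here purely for illustration; it is not a NumPy or Paddle API) builds the indexer programmatically so that the same logic works for any axis, and returns a filled copy rather than modifying its input:

```python
import numpy as np

def index_fill_ref(x, axis, index, fill_value):
    """Hypothetical NumPy reference: return a copy of x with fill_value
    written at the positions `index` along `axis`."""
    out = x.copy()
    # Select everything on every axis except `axis`, where only `index` is selected.
    indexer = [slice(None)] * out.ndim
    indexer[axis] = index
    out[tuple(indexer)] = fill_value  # the scalar is broadcast over the selection
    return out

data = np.arange(12, dtype=np.float64).reshape(4, 3)
print(index_fill_ref(data, axis=1, index=[0, 2], fill_value=-1.0))
```

A function of this shape could also serve as the NumPy baseline when checking a Paddle implementation for numerical consistency.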



## TensorFlow

TensorFlow currently has no dedicated API for this functionality either, but NumPy-style functions can be called in TensorFlow through `tf.experimental.numpy`.



## PyTorch

PyTorch provides the APIs `Tensor.index_fill_(dim, index, value)` and `Tensor.index_fill(dim, index, value)`, where `dim` is the axis along which `index` selects and `value` is the value to fill. `index_fill_` modifies the input tensor in place, and `index_fill` is its out-of-place counterpart.



The [PyTorch documentation](https://pytorch.org/docs/stable/generated/torch.Tensor.index_fill_.html#torch.Tensor.index_fill_) describes it as:

```

Fills the elements of the self tensor with value value by selecting the indices in the order given in index.

```

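Under the hood, it is implemented with separate C++ and CUDA kernel functions.

### Implementation

The main entry point of the CUDA kernel implementation: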

```c++
// Declaration only; the function body is omitted.
template <typename scalar_t>
void index_fill_kernel_impl(
    TensorIterator& iter,
    int64_t dim,
    int64_t self_dim_size,
    int64_t self_dim_stride,
    scalar_t fill_val);
```

The CPU kernel is mainly implemented with loops:

```c++
// Declaration only; the function body is omitted.
void index_fill_kernel(
    TensorIterator& iter,
    int64_t dim,
    int64_t self_dim_size,
    int64_t self_dim_stride,
    const Scalar& source);
```



# IV. Comparison

- NumPy, being based on slicing and broadcasting, is functionally more flexible.

- PyTorch supports only a single axis, and supports both CPU and GPU.



# V. Design

## Naming and Parameter Design

The new APIs are designed as:

`paddle.index_fill(x, axis, index, fill_value)`

`paddle.index_fill_(x, axis, index, fill_value)`

`Tensor.index_fill(axis, index, fill_value)`

`Tensor.index_fill_(axis, index, fill_value)`



`index_fill_` modifies the input tensor in place.

`axis` is the axis along which `index` selects; only the `int` type is supported.

`index` is a 1-D Tensor containing the index positions along the specified axis.

> Reviewer comment: Consider also supporting a Python list of int. Accepting this parameter as either a Tensor or a non-Tensor makes the API more flexible to use.
>
> Contributor Author: Done.

`fill_value` is the value to fill; supported types are `int`, `float`, and a Tensor containing exactly one element.
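To make the parameter rules above concrete, here is a minimal pure-Python sketch of how the arguments might be validated before reaching the operator. The helper name `normalize_index_fill_args` is hypothetical, and wrapping negative `axis` and `index` values is an assumption borrowed from NumPy conventions; the RFC itself only calls for type and boundary checks. The one-element Tensor form of `fill_value` is left out to keep the sketch free of dependencies.

```python
def normalize_index_fill_args(shape, axis, index, fill_value):
    """Hypothetical helper: validate the arguments and return
    (axis, index, fill_value) with negative values wrapped (an assumption)."""
    ndim = len(shape)
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(axis, int) or isinstance(axis, bool):
        raise TypeError("axis must be an int")
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} is out of range for a {ndim}-D tensor")
    axis %= ndim

    dim_size = shape[axis]
    normalized = []
    for i in index:
        if not -dim_size <= i < dim_size:
            raise IndexError(f"index {i} is out of range for axis size {dim_size}")
        normalized.append(i % dim_size)

    if not isinstance(fill_value, (int, float)):
        raise TypeError("fill_value must be an int or float in this sketch")
    return axis, normalized, fill_value

print(normalize_index_fill_args((4, 3), -1, [0, -1], 9.0))  # (1, [0, 2], 9.0)
```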



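## Underlying OP Design

Following existing PaddlePaddle operators, implement separate CPU and CUDA kernels for the operator.

## API Implementation Plan

Add the `index_fill` and `index_fill_` functions to python/paddle/tensor/manipulation.py; each calls the underlying operator through `_C_ops`.

After computing the correct stride, adapt the logic of the `index_select` operator.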
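As an illustration of the stride computation, the sketch below fills a flat, row-major buffer the way a simple CPU loop might. It is not Paddle's kernel: the function name `index_fill_flat`, the contiguous layout, and the non-negative `axis` and `index` values are all assumptions made for the example.

```python
import math

def index_fill_flat(flat, shape, axis, index, fill_value):
    """Fill a row-major (C-contiguous) buffer in place, one element at a time."""
    dim_size = shape[axis]
    stride = math.prod(shape[axis + 1:])  # elements between neighbours along `axis`
    outer = math.prod(shape[:axis])       # number of blocks in front of `axis`
    for o in range(outer):
        base = o * dim_size * stride      # start of this block in the flat buffer
        for idx in index:
            start = base + idx * stride
            for j in range(stride):
                flat[start + j] = fill_value
    return flat

# A 2x3x2 tensor stored as a flat list; fill positions 0 and 2 along axis 1.
buf = list(range(12))
print(index_fill_flat(buf, (2, 3, 2), 1, [0, 2], -1))
# [-1, -1, 2, 3, -1, -1, -1, -1, 8, 9, -1, -1]
```

The three nested loops correspond to the dimensions before `axis`, the selected positions on `axis`, and the dimensions after `axis`.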

In the backward pass, elements at the specified indices along the specified axis have a gradient of 0.0, and all other (unselected) elements have a gradient of 1.0.
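The gradient rule can be sketched in NumPy as follows. The function `index_fill_grad_ref` is a hypothetical reference, not the planned grad kernel: it copies the upstream gradient and zeroes the slices that the forward pass overwrote, since those outputs no longer depend on `x`.

```python
import numpy as np

def index_fill_grad_ref(grad_out, axis, index):
    """Hypothetical reference for the gradient with respect to x."""
    grad_x = grad_out.copy()
    # Same indexer as the forward pass: only `index` on `axis`, everything elsewhere.
    indexer = [slice(None)] * grad_x.ndim
    indexer[axis] = index
    grad_x[tuple(indexer)] = 0.0
    return grad_x

# With an upstream gradient of ones, this reproduces the 0.0 / 1.0 pattern.
print(index_fill_grad_ref(np.ones((4, 3)), axis=0, index=[0, 2]))
```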


## Implementation File Paths

CPU forward and backward computation: paddle/phi/kernels/cpu/index_fill_kernel.cc paddle/phi/kernels/cpu/index_fill_grad_kernel.cc

GPU forward and backward computation: paddle/phi/kernels/gpu/index_fill_kernel.cu paddle/phi/kernels/gpu/index_fill_grad_kernel.cu



```c++
template <typename T, typename Context>
void IndexFillKernel(const Context& dev_ctx,
                     const DenseTensor& x,
                     DenseTensor* out);
```



Operator registration path:

paddle/fluid/operators/index_fill_op.cc

Python API implementation path: python/paddle/tensor/manipulation.py

Unit test path: python/paddle/fluid/tests/unittests/test_index_fill_op.py



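# VI. Testing and Acceptance Considerations

The test cases to consider are:

- Numerical consistency with NumPy: whether `paddle.index_fill` matches the result of the equivalent NumPy slicing operation;

- Validation of `axis`: the type must be int, the value must be a legal axis, and boundary checks are performed;

- Validation of `index`: index boundary checks and correctness of the output;

- Validation of `fill_value`: whether it is one of the supported data types;

- Correctness of the results of the backward gradient computation;

- Error checking: whether an error is raised correctly when the input `x` is not a Tensor.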
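Besides comparing against a NumPy baseline, the output check can be phrased as a property: selected slices equal the fill value, and every other slice is unchanged. The sketch below assumes non-negative indices and uses a NumPy stand-in (`numpy_stand_in`, a hypothetical name) in place of the proposed API so that it runs without Paddle; in a real unit test the function under test would be passed in as `fn` instead.

```python
import numpy as np

def check_index_fill(fn, x, axis, index, fill_value):
    """Property check: selected slices equal fill_value, all others are unchanged."""
    out = fn(x.copy(), axis, index, fill_value)
    selected = np.isin(np.arange(x.shape[axis]), index)
    # Move `axis` to the front so a 1-D boolean mask can index it directly.
    out_front = np.moveaxis(out, axis, 0)
    x_front = np.moveaxis(x, axis, 0)
    assert np.all(out_front[selected] == fill_value)
    assert np.array_equal(out_front[~selected], x_front[~selected])

def numpy_stand_in(x, axis, index, fill_value):
    # Stand-in for the proposed API so the sketch runs without Paddle.
    view = np.moveaxis(x, axis, 0)  # a view, so writes reach the array passed in
    view[index] = fill_value
    return np.moveaxis(view, 0, axis)

rng = np.random.default_rng(102)
x = rng.random((4, 3, 5))
for axis in range(x.ndim):
    check_index_fill(numpy_stand_in, x, axis, [0, 2], 9.0)
print("all axes passed")
```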



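# VII. Feasibility Analysis and Schedule

The implementation difficulty is manageable, and the work can be completed within the current release cycle.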



# VIII. Impact

This is a standalone new API and does not affect other modules.



# Glossary


# Attachments and References