[Auto Parallel] Improve the interface and the underlying mechanisms #36617

Merged: 71 commits, Oct 29, 2021

@aoyulong aoyulong commented Oct 21, 2021

PR types

Others

PR changes

Others

Describe

  • Improve the interface of auto parallel
    • Remove unnecessary APIs, keeping only shard_tensor and shard_op
    • Use dist_attr in the shard_tensor and shard_op to encapsulate related distributed attributes
    • Hide ProcessMesh from users and remove the ROOT_MESH argument of its constructor
    • Modify unit tests by replacing old interfaces with new interfaces
  • Explicitly construct DistributedTensor and DistributedOperator
    • Add DistributedTensor and DistributedOperator classes for better understanding
    • Make the TensorDistributedAttribute and OperatorDistributedAttribute classes more coherent
  • Add some distributed attributes
    • Add shard_sizes for uneven split
    • Add device_placement for physical mapping
    • Add impl_type for operator's default parallelism
  • Update the existing code to use the new mechanisms and address previous feedback
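
To make the dist_attr idea in the list above concrete, here is a minimal, framework-free sketch of how a dictionary of distributed attributes could determine the shard each process holds. It is an illustration only, not code from this PR: the keys `process_mesh` and `dims_mapping` are assumed here as typical contents of such a dictionary, and the helpers `mesh_shape` and `local_shard_shape` are hypothetical names that do not exist in Paddle. Only even splits are covered; the shard_sizes attribute for uneven splits is not modeled.

```python
def mesh_shape(mesh):
    """Return the shape of a nested-list process mesh, e.g. [[0, 1], [2, 3]] -> [2, 2]."""
    shape = []
    while isinstance(mesh, list):
        shape.append(len(mesh))
        mesh = mesh[0]
    return shape


def local_shard_shape(global_shape, dist_attr):
    """Compute the per-process shard shape for an even split.

    Assumed convention (hypothetical, for illustration):
    dims_mapping[i] == -1 means tensor dimension i is replicated;
    dims_mapping[i] == k means tensor dimension i is split along mesh dimension k.
    """
    topology = mesh_shape(dist_attr["process_mesh"])
    dims_mapping = dist_attr["dims_mapping"]
    if len(dims_mapping) != len(global_shape):
        raise ValueError("dims_mapping must have one entry per tensor dimension")

    local = []
    for size, mesh_dim in zip(global_shape, dims_mapping):
        if mesh_dim == -1:
            local.append(size)  # replicated: every process keeps the full extent
            continue
        parts = topology[mesh_dim]
        if size % parts != 0:
            raise ValueError("uneven split is not handled in this sketch")
        local.append(size // parts)
    return local


# A 2x2 mesh of four processes; split dimension 0 across mesh dimension 0.
dist_attr = {"process_mesh": [[0, 1], [2, 3]], "dims_mapping": [0, -1]}
shard = local_shard_shape([4, 6], dist_attr)
print(shard)  # [2, 6]
```

Bundling the mesh and the mapping in one dictionary keeps the call site to a single argument, which matches the stated goal of encapsulating related distributed attributes in dist_attr.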

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

self.serial_tensor.desc.name(), self.serial_tensor.desc.id())

# str += ", {}".format(self.dist_attr)
# return str

remove this

@XiaoguangHu01 XiaoguangHu01 left a comment

LG API

@JZ-LIANG JZ-LIANG merged commit a02532b into PaddlePaddle:develop Oct 29, 2021
ghost pushed a commit to piotrekobi/Paddle that referenced this pull request Nov 3, 2021
…addlePaddle#36617)

* default dist op

* add dist_attr for dist op

* add unitest

* update inputname

* update function name

* add unitest

* update CMakeLists.txt for CI

* fix dis_matmul

* fix compile error

* update matmul to matmul_v2

* unify api

* unify api

* todo

* update distop forward func

* update distop forward func

* auto parallel backward

* update dist op

* autoparallel backward

* add backward for embedding

* temp1

* temp2

* temp3

* temp4

* backward done1

* backward done2

* backward done3

* dist embedding remove mp mode

* dist matmul remove mp mode

* update dist embedding

* dist op init1

* dist op init 2

* update unitest

* context remove parallel mode

* partitioner remove parallel mode

* update unitest

* a more general method to support varying mesh in pipeline parallel

* support varying mesh in pipeline parallel

* embedding support varying mesh in pipeline parallel

* matmul support varying mesh in pipeline parallel

* default dist op support varying mesh in pipeline parallel

* dist attribute for startup program

* default dist op support varying mesh in pipeline parallel 2

* partitoner support varying mesh in pipeline parallel

* revise logic for auto compeletion

* revise framework.py

* revise reshard unitest

* revise unitest for parallelize

* chmod

* fixed bug for dist embedding name mapping

* Improve the interface and the underlying mechanisms of auto parallel

* revise completion for backward

* revise completion for update

* revise completion for update

* update unitest

* chmod

* bugfix for grad_op output var's mesh

* Modify codes for pr 36744

* Remove unnecessary comments in framework.py

* Remove unnecessary comments in completion.py

Co-authored-by: JZ-LIANG <[email protected]>
Co-authored-by: zhaoyingli <[email protected]>
Co-authored-by: JZ-LIANG <[email protected]>