Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dist error with lr decay layer #9489

Merged
merged 9 commits into from
Mar 30, 2018

Conversation

Yancey1989
Copy link
Contributor

Fixed #9429

@@ -24,6 +24,7 @@
from regularizer import append_regularization_ops
from clip import append_gradient_clip_ops, error_clip_callback
from contextlib import contextmanager
from distribute_transpiler import UnionFind
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimizer.py should not depend on distribute_transpiler, either put _get_lr_decay_ops in the transpiler or put UnionFind in a single file. I'd prefer the first method, because we don't have to change current demo files then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimizer.py should not depend on distribute_transpiler

Thanks @typhoonzero ,I think it's a good point, and maybe we can pass Optimizer instance to transpile interface so that we can support regularization for future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can consider that when moving regularizer and clipping to the server side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, moved to transpiler.

@Yancey1989 Yancey1989 changed the title [WIP]Fix dist error with lr decay layer Fix dist error with lr decay layer Mar 29, 2018
typhoonzero
typhoonzero previously approved these changes Mar 29, 2018
Copy link
Contributor

@typhoonzero typhoonzero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM++

for _, opt_op in enumerate(opt_op_on_pserver):
for _, op in enumerate(self.optimize_ops):
# optimizer is connected to itself
if ufind.is_connected(op, opt_op) and \
op not in global_ops:
__append_optimize_op__(op, per_opt_block)
per_opt_block = pserver_program.create_block(0)
per_opt_block = pserver_program.create_block(append_block.idx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's smart.

find_ops = []
# find ops which output is lr var
# make a union-find struct by all ops
block = default_main_program().global_block()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use self.program instead of default_main_program because we may need to transpile a program that is specified by user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

typhoonzero
typhoonzero previously approved these changes Mar 29, 2018
Copy link
Contributor

@typhoonzero typhoonzero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM++

@Yancey1989 Yancey1989 force-pushed the fix_dist_with_lr_decay branch from d9d11a3 to b7ffd5d Compare March 30, 2018 02:52
Copy link
Contributor

@gongweibao gongweibao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gongweibao gongweibao merged commit 374f1ca into PaddlePaddle:develop Mar 30, 2018
@Yancey1989 Yancey1989 deleted the fix_dist_with_lr_decay branch March 30, 2018 04:01
mikeseven added a commit to mikeseven/Paddle that referenced this pull request Mar 30, 2018
* commit '33b8b3d22034423455a493712955e419aac7b19b': (251 commits)
  Remove redundant commands in build.sh and build_doc.sh
  Add dependencies
  Move v2/api/fluid to fluid/api and Adjust doc build commands
  Plain LRN op throws an exception when is_test is set in backward pass
  fix compiler error of profiler_test in ONLY_CPU mode
  fix server shutdown
  Translation for Model Configuration (PaddlePaddle#9513)
  Fix data transform when inplace (PaddlePaddle#9450)
  refine parallel
  add FAQ (PaddlePaddle#9494)
  Fix dist error with lr decay layer (PaddlePaddle#9489)
  add prefetch_op (PaddlePaddle#9495)
  Fix some errors (PaddlePaddle#9403)
  hookup WITH_FLUID_ONLY in TeamCity build.sh (PaddlePaddle#9509)
  Fix the order of reads and write from buffered channel  (PaddlePaddle#9423)
  change WITH_FLUID to WITH_FLUID_ONLY (PaddlePaddle#9427)
  fix block num
  Revert "make append activation in place by default (PaddlePaddle#9417)"
  Speed/sequence op1 (PaddlePaddle#9217)
  fix a compile error
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants