Should we split Executor::Run into Executor::Prepare and Executor::exe #6285
Comments
To my understanding, there are two problems:
If we can solve these problems in a better design without losing the flexibility, maybe it is good. |
Looks like the cost of creating an operator instance is not significant.
Although we expect the serialization and sending are slow, can we first measure how slow it is? Also, @reyoung how does an executor determine |
I think different executors should never communicate with each other; the module that sends the ExecutionPlan to the executors communicates with all executors. |
The cost of creating operators in an RNN is very significant, since operators are created at every time step. It could also be significant when running remotely.
@chengduoZH @qingqing01 In my view of running a plain network and an RNN, the time cost of Executor.Run might be around 8%/20%.
The time complexity of determining whether current_program == previous_program is exactly the same as that of creating the C++ operators, because we need to check that every operator and variable in the two programs is the same. In this issue, I suggest letting end users or a higher-level API manage the cache handle, not the Or, another straightforward way is making |
Curious: what is the benefit of letting end users or a higher-level API manage the cache handle? I can think of benefits of not exposing cache handling: a simpler executor API, and no chance for the user to mess up the cache handling. |
The time consumption of creating and destroying operators in a Dynamic RNN is pretty large. It takes about |
Related issue #6885 |
It is done in #9000. |
Problem
We create new C++ operators every time Executor::Run is invoked, since we assume the topology may change on every call. However, the program usually does not change. Creating operators locally, or sending the protobuf to a remote node again and again, is very time-consuming.
Solution
To reduce the cost of creating operators in local mode and of network communication in cluster mode, we can extract a method named Executor::Prepare. Prepare returns a HANDLE.
In local mode, the handle could be an array index into an internal data structure of Executor. That data structure holds the C++ operators that the program contains.
In cluster mode, Prepare could just send the protobuf of the program to a remote node. The handle could be an RPC return value. We can then send just the HANDLE to the remote node to execute the associated program, instead of serializing and sending the protobuf again and again.
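The local-mode half of the proposal (Prepare creates the operators once and returns an array index, Run reuses them) might be sketched roughly as follows. This is only an illustration under stated assumptions: OpDesc, ProgramDesc, Operator, the trace argument, and ops_created() are hypothetical stand-ins invented for this sketch, not the project's actual types or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real framework types.
struct OpDesc { std::string type; };      // description of one operator
using ProgramDesc = std::vector<OpDesc>;  // a program as a list of descriptions
// A "created" operator: here it just records its type into a trace.
using Operator = std::function<void(std::vector<std::string>&)>;

class Executor {
 public:
  // In local mode the handle is an index into prepared_.
  using Handle = std::size_t;

  // Creates the operators for `program` once and returns a handle to them.
  Handle Prepare(const ProgramDesc& program) {
    std::vector<Operator> ops;
    ops.reserve(program.size());
    for (const OpDesc& desc : program) {
      ++ops_created_;  // counts the (assumed expensive) creation step
      std::string type = desc.type;
      ops.push_back([type](std::vector<std::string>& trace) {
        trace.push_back(type);
      });
    }
    prepared_.push_back(std::move(ops));
    return prepared_.size() - 1;
  }

  // Runs the already-created operators; no operator is constructed here.
  void Run(Handle handle, std::vector<std::string>& trace) const {
    assert(handle < prepared_.size() && "unknown handle");
    for (const Operator& op : prepared_[handle]) op(trace);
  }

  // Number of operators constructed so far (for illustration).
  std::size_t ops_created() const { return ops_created_; }

 private:
  std::vector<std::vector<Operator>> prepared_;  // operators per prepared program
  std::size_t ops_created_ = 0;
};
```

With this shape, a caller would invoke Prepare once per program and then call Run with the returned handle on every iteration, so the creation cost is paid once rather than per call. In cluster mode the same Handle type could instead carry whatever identifier the remote node returns from its RPC; that part is not sketched here.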