
Data parallelism across multiple GPUs #121

Open · wants to merge 7 commits into master

Conversation

dennybritz
Contributor

Allow the user to replicate the model on multiple GPUs. Still WIP and untested.
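Roughly, the plan is the tower pattern: build one copy of the model per GPU, share variables across towers, and average the per-tower losses. A minimal sketch (assuming a hypothetical `build_tower(batch)` that returns a scalar loss; the real seq2seq model builder looks different):

```python
import tensorflow as tf

def build_parallel_loss(input_batches, build_tower, num_gpus):
    """Replicate the model on `num_gpus` GPUs and average the tower losses."""
    tower_losses = []
    with tf.variable_scope(tf.get_variable_scope()):
        for i in range(num_gpus):
            with tf.device("/gpu:%d" % i), tf.name_scope("tower_%d" % i):
                # Each tower computes its own loss on its slice of the batch.
                tower_losses.append(build_tower(input_batches[i]))
                # All towers after the first reuse the same variables.
                tf.get_variable_scope().reuse_variables()
    return tf.reduce_mean(tf.stack(tower_losses))
```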

@dennybritz
Contributor Author

This code works, but it is currently very slow. I need to verify the op placement on the different GPUs to figure out why it is slow.
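One quick way to see where ops actually end up (just a debugging suggestion, not part of this branch) is to enable device placement logging when the session is created:

```python
import tensorflow as tf

# Logs the device assigned to every op at session creation time, which makes
# it easy to spot ops that silently fell back to the CPU or a single GPU.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
```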

@dennybritz
Contributor Author

Ref #44

@dennybritz
Contributor Author

A few things:

  • Variables need to be placed on CPU
  • Optimizer ops and preprocessing need to be on CPU

It's probably cleaner to put this into the Estimator class. For example, subclass Estimator and add support for model replicas.
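For the first two points, a common pattern (sketched here with standalone toy ops, not the code in this branch) is to pin variables and the optimizer's update step to the CPU while the forward/backward compute stays on the GPUs:

```python
import tensorflow as tf

def variable_on_cpu(name, shape, initializer=None):
    """Create a variable that always lives in host memory."""
    with tf.device("/cpu:0"):
        return tf.get_variable(name, shape, initializer=initializer)

# Per-tower compute runs on a GPU, but the shared weights sit on the CPU so
# every tower can read them without cross-GPU copies.
with tf.device("/gpu:0"):
    w = variable_on_cpu("w", [128, 256])
    loss = tf.reduce_mean(tf.matmul(tf.ones([32, 128]), w))

# Keep the optimizer's variable updates on the CPU next to the variables;
# colocate_gradients_with_ops keeps the gradient math near the forward ops.
with tf.device("/cpu:0"):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(
        loss, colocate_gradients_with_ops=True)
```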

@bhack

bhack commented Apr 11, 2017

See also tensorflow/tensorflow#2126

@SvensBigData

I get the following error on this branch:

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'save/ShardedFilename_1': Could not satisfy explicit device specification '/device:GPU:1' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and devices:
Identity: CPU
ShardedFilename: CPU
[[Node: save/ShardedFilename_1 = ShardedFilename[_device="/device:GPU:1"](save/StringJoin, save/ShardedFilename_1/shard, save/num_shards)]]
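The Saver's ShardedFilename op has no GPU kernel, so the explicit /device:GPU:1 assignment fails. The usual workaround (a suggestion on my side; I have not checked whether this branch already sets it) is to allow soft placement so such ops fall back to the CPU:

```python
import tensorflow as tf

# Ops without a GPU kernel (e.g. ShardedFilename used by the Saver) are
# placed on the CPU instead of raising InvalidArgumentError.
config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(config=config)
```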

@hrishikeshvganu

@dennybritz: I wanted to know whether the branch is usable now. If there are specific TODOs, I can help with the implementation.
