Create CPU only Version #3

Closed
sguada opened this issue Nov 25, 2013 · 16 comments

@sguada
Contributor

sguada commented Nov 25, 2013

No description provided.

@ghost ghost assigned sguada Nov 25, 2013
@Yangqing
Member

IMHO I wouldn't put this at a high priority. It would also be pretty nontrivial to implement, requiring a lot of #ifdef macros as well as a substantial rewrite of the syncedmem class. We would also need to consider how we would like caffe::Caffe::mode to be... All of this might make the codebase rather messy, so I am not sure it is the right move. At the end of the day, it probably does not hurt too much to just ship with the CUDA libraries: unlike MKL, they are freely redistributable.

@sguada
Contributor Author

sguada commented Nov 26, 2013

I agree that it is not a high priority, but I think that separating the CPU and GPU parts of the code would help maintenance and would eventually make it possible to build a CPU-only version that does not need CUDA to compile or run.

@Yangqing
Member

The current code actually allows one to run without a physical GPU (as long
as the CUDA runtime is distributed, which again is freely allowed), and that
is what I am planning to deploy on the ICSI cluster.

For developers working on caffe I assume they will at least have cuda
compilers.

Yangqing


@tdomhan
Contributor

tdomhan commented Jan 23, 2014

On Mac OS X 10.9 the CUDA libraries are linked to libstdc++, while everything else on the system is linked to libc++ by default because of the switch from gcc to clang. As a result, compiling caffe on OS X 10.9 is a huge mess right now: you need to make sure that every library you link to is itself linked to libstdc++, which mostly means compiling all the libraries manually and setting the correct flags. The worst part is that you won't notice the problem during compilation, only when you try to run the program afterwards.
If there were a CPU-only version, you could at least compile only the CPU code without hassle until nvcc works with libc++ instead of libstdc++.

@junwang4

Hi tdomhan,
Have you succeeded in installing it on your Mac OS X 10.9? I got an error:
clang: error: unsupported option '-dumpspecs'
Then compilation stops.

@shelhamer
Member

The OS X 10.9 situation hasn't changed since @tdomhan's nice summary of the issue. OS X 10.9 is not currently a feasible compilation target for GPU mode. The -dumpspecs error you mention is due to the CUDA/clang incompatibility, but it is not currently possible to compile with gcc either.

For CUDA compatibility on OS X it seems to mostly be a matter of waiting for a new version of CUDA to ship that links to libc++.

@sguada
Contributor Author

sguada commented Feb 1, 2014

The problem with compiling with gcc seems to be that, when linking, it detects
duplicate functions between the cpp and the cuda code. If you figure out
how to fix that, we will be a bit closer to compiling with gcc on OS X
10.9.

duplicate symbol caffe::LRNLayer::LRNLayer(caffe::LayerParameter const&) in:
    src/caffe/layers/lrn_layer.o
    src/caffe/layers/lrn_layer.cuo

Sergio


@tdomhan
Contributor

tdomhan commented Feb 1, 2014

I managed to build and run caffe under Mac OS X. I'll post the details later. It's a little hacky of course.

@tdomhan
Contributor

tdomhan commented Feb 5, 2014

Alright, I finally had time to add instructions for OS X 10.9, see: f0f594c
Hope this helps.

@junwang4

junwang4 commented Feb 7, 2014

Thanks, Tobias! Following your instructions, I succeeded in installing Caffe on OS X 10.9.

I ran a test on the MNIST demo, but found that the GPU setting (running time: 275s) has no real advantage over the CPU setting (running time: 284s) on a recent iMac. The only thing I changed was the setting "solver_mode: 1/0" in data/lenet_solver.prototxt. Do you have any comparison between the GPU and the CPU setting?

@Yangqing
Member

Yangqing commented Feb 7, 2014

The MNIST demo is not likely to show the advantage of GPU over CPU, since
the model is very small and the overhead of e.g. data transfer and CPU-side
control code is big enough that GPU and CPU take approximately the same
time. For larger models like ImageNet the GPU advantage will become clearer.

Yangqing


@junwang4

junwang4 commented Feb 7, 2014

Yangqing,
Thanks for your clarification! Is there any data comparing CPU and GPU running times on ImageNet or CIFAR?

@Yangqing
Member

Yangqing commented Feb 7, 2014

I do not have a detailed analysis yet. On ImageNet with GPUs, a full
forward+backward pass using Alex Krizhevsky's network takes about 7ms, and
forward only takes about 2.5ms on my desktop with a K20 (when computations
are carried out in a batch fashion). CPUs are about 10 times slower than
that, but given the many specific CPU types it is hard to say exactly
what the speed is.

Yangqing


@junwang4

junwang4 commented Feb 7, 2014

@kloudkl
Contributor

kloudkl commented Feb 10, 2014

A rough benchmark of CPU and GPU mode was done in #85. Even though the setup is not fair to the GPU, the result still shows a great speed advantage.

@shelhamer
Member

Done in #561!

mitmul pushed a commit to mitmul/caffe that referenced this issue Sep 30, 2014
Split source files between CUDA and CPU code. Pave the way for BVLC#3 and BVLC#122.