Add Mac OS X port #138
* Also use CMake's find module to locate zlib. * Circular linking on OS X: link all libs in reverse order instead. But OS X may simply not care about circular links.
revise CMake for MAC OS
Mac port
# Conflicts: # paddle/math/tests/test_perturbation.cpp
@@ -239,7 +239,7 @@ void Matrix::toNumpyMatInplace(float** view_data, int* dim1,
 }
 void Matrix::copyToNumpyMat(float** view_m_data, int* dim1,
                             int* dim2) throw(UnsupportError) {
-  static_assert(sizeof(paddle::real) == sizeof(float),
+  static_assert(sizeof(float) == sizeof(float),
sizeof(paddle::real)
fixed
@@ -251,12 +251,12 @@ void Matrix::copyToNumpyMat(float** view_m_data, int* dim1,
   if (auto cpuMat = dynamic_cast<paddle::CpuMatrix*>(m->mat.get())) {
     auto src = cpuMat->getData();
     auto dest = *view_m_data;
-    std::memcpy(dest, src, sizeof(paddle::real) * (*dim1) * (*dim2));
+    std::memcpy(dest, src, sizeof(float) * (*dim1) * (*dim2));
sizeof(paddle::real)
fixed
   } else if (auto gpuMat = dynamic_cast<paddle::GpuMatrix*>(m->mat.get())) {
     auto src = gpuMat->getData();
     auto dest = *view_m_data;
     hl_memcpy_device2host(dest, src,
-                          sizeof(paddle::real) * (*dim1) * (*dim2));
+                          sizeof(float) * (*dim1) * (*dim2));
sizeof(paddle::real)
fixed
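To illustrate the reviewer's point in the three threads above: once both sides of the assertion are spelled `sizeof(float)`, the check is always true and no longer guards the copy size. The sketch below is only an illustration in Python, using `ctypes` to read C type sizes. The helper `copy_nbytes` and the alias `real` are hypothetical names, and it assumes a build in which `paddle::real` is `double` (for example with WITH_DOUBLE enabled).

```python
import ctypes


def copy_nbytes(elem_type, dim1, dim2):
    """Bytes needed to copy a dim1 x dim2 matrix whose elements are elem_type."""
    return ctypes.sizeof(elem_type) * dim1 * dim2


# Hypothetical stand-in for paddle::real in a double-precision build.
real = ctypes.c_double

# Hard-coding float under-counts the buffer as soon as `real` is wider than float.
float_bytes = copy_nbytes(ctypes.c_float, 2, 3)
real_bytes = copy_nbytes(real, 2, 3)
print(float_bytes, real_bytes)
```

Keeping `sizeof(paddle::real)` in both the assertion and the copy means a mismatch between `real` and `float` is caught at compile time instead of silently truncating the copy.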
@@ -56,9 +69,9 @@ def parent_dir_str(self):

     def libs_str(self):
         libs = [
-            "-Wl,--whole-archive",
+            whole_start,
For Apple, do we need to use "-Wl,-force_load" instead, as you do in CMake?
This is already done via "-Wl,-all_load" in line 48 of /Users/liaogang/baidu/Paddle/paddle/setup.py.in.
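For context, GNU ld brackets archives with `--whole-archive` / `--no-whole-archive`, while Apple's ld has no such pair and instead offers `-all_load` (all archives) or `-force_load <archive>` (one archive). A minimal sketch of choosing between them in a setup script might look like the following. The function `whole_archive_flags` and the archive names are hypothetical, not taken from setup.py.in, and `-force_load` expects archive file paths rather than `-l` names.

```python
import platform


def whole_archive_flags(archives, system=None):
    """Return linker flags that force every object in `archives` to be linked."""
    system = system or platform.system()
    if system == "Darwin":
        # Apple's ld: one -force_load per archive path.
        return ["-Wl,-force_load," + path for path in archives]
    # GNU ld: bracket the archives with --whole-archive / --no-whole-archive.
    return ["-Wl,--whole-archive"] + list(archives) + ["-Wl,--no-whole-archive"]


print(whole_archive_flags(["libpaddle_math.a"], system="Darwin"))
print(whole_archive_flags(["libpaddle_math.a"], system="Linux"))
```

Passing `system` explicitly keeps the helper testable on any host; leaving it out falls back to `platform.system()`.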
NeuralNetwork* NeuralNetwork::newNeuralNetwork(
    const std::string& name,
    NeuralNetwork* rootNetwork) {
  if (newCustomNeuralNetwork) {
newCustomNeuralNetwork needs to be kept. If not possible on MacOS, we can use #ifdef
use cmake to fix this
if("${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang")
list(APPEND LIBS "-undefined dynamic_lookup")
endif()
   CHECK_PY(pyModule) << "Import module " << moduleName << " failed.";
   PyObjectPtr pyDict(PyModule_GetDict(pyModule.get()));
   CHECK_PY(pyDict) << "Get Dict failed.";
   PyObjectPtr pyClass(PyDict_GetItemString(pyDict.get(), className.c_str()));
-  LOG(INFO) << "createPythonClass className.c_str():" << className.c_str();
+  // LOG(INFO) << "createPythonClass className.c_str():" << className.c_str();
Either uncomment it or delete it.
fixed
// #ifndef _XOPEN_SOURCE
// #warning "no _XOPEN_SOURCE defined in Python.h"
// #endif
uncomment or delete
fixed
#endif
  CHECK_NE(tid, -1);
  return tid;
}
Lines 21-32 can share code with getTID() in Stat.cpp.
fixed
#define __NR_gettid 224
#endif
  pid_t tid = syscall(__NR_gettid);
#endif
getTID()
fixed
@@ -146,12 +146,12 @@ TEST(compareSparse, remote_cpu) {
 TEST(compareSparse, cpu10_local_vs_remote) {
   FLAGS_local = 1;  // disable remote sparse update in parameter config
   std::vector<ParameterPtr> localParameters =
-      trainerOnePassTest(configFile1, true, 10);
+      trainerOnePassTest(configFile1, true, 2);
Why can't "10" work?
@@ -44,8 +44,8 @@ set(ATLAS_LIB_SEARCH_PATHS
    /usr/lib
    /usr/lib/blas/atlas
    /usr/lib/atlas
    /usr/lib/atlas-base # special for ubuntu 14.04.
)
This line could be reverted.
done
fixed
you will already have Python 2.7.10 and Numpy 1.8 installed.

The best option is to use the package manager homebrew to handle installations and upgrades for you.
To install homebrew, first open a terminal window (you can find Terminal in the Utilities folder in Applications), and issue the command:
Please give a link to the Homebrew site.
fixed
fixed
# Build gtest
mkdir build && cmake ..
make
# Install gtest library
Is make install no good for gtest?
Running make install prints: 'make install' is dangerous and not supported. Instead, see README for how to integrate Google Test into your build system.
**only cpu**

```bash
cmake -DWITH_GPU=OFF -DWITH_DOC=OFF
Invoking cmake in the source directory leaves a lot of CMake-generated files in it.
It may be better to create a build directory and then run cmake .. -DWITH_GPU=OFF -DWITH_DOC=OFF from there.
Also, -DWITH_DOC is OFF by default.
fixed
mkdir build
cd build
# you can add build option here, such as:
cmake -DWITH_GPU=ON -DWITH_DOC=OFF -DCMAKE_INSTALL_PREFIX=<path to install> ..
Change the order to cmake .. -DWITH_GPU=ON -DWITH_DOC=OFF -DCMAKE_INSTALL_PREFIX=<path to install> in case the user overlooks the .. at the end.
fixed
fixed
```
**Note**

And if you set WITH_SWIG_PY=ON, you have to install related python predict api at the same time:
This applies not only when WITH_SWIG_PY=ON: the python module also needs to be installed.
Paddle automatically installs its Python dependencies the first time a user runs a paddle command, such as paddle version or paddle train. This may require sudo privileges.
fixed
@@ -95,7 +95,7 @@ float Matrix::get(size_t x, size_t y) const throw(RangeError) {
 }

 void Matrix::set(size_t x, size_t y, float val) throw(RangeError,
-UnsupportError) {
+UnsupportError) {
Should not change this line.
fixed
fixed
  return __longlong_as_double(old);
}
namespace paddle {
This code is not related to the OS X port.
Could you extract it into another PR?
This bug was found on Mac OS X when WITH_DOUBLE is enabled.
@@ -84,4 +84,10 @@ int main(int argc, char** argv) {
  return ret;
}

#else
These lines can probably be removed on Mac OS X.
They were added because whole-archive was not handled correctly before.
I will send you a PR so you can check whether it works.
fixed
@@ -90,7 +94,7 @@ TEST(checkGradient, multi) {
 TEST(checkGradient, hsigmoid) { checkGradientTest(configFile2, false, false); }

 TEST(checkGradient, chunk) {
-  EXPECT_EQ(0, system("python2 trainer/tests/gen_proto_data.py"));
+  EXPECT_EQ(0, system("python trainer/tests/gen_proto_data.py"));
Is python2 not good?
On some Linux distributions, python3 is the default python.
Mac OS X does not have python2. I will add a conditional branch to avoid the potential hazard.
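One possible shape for such a branch, sketched in Python rather than in the C++ test itself: prefer `python2` where it exists and fall back to `python` otherwise. The function `pick_python` and its parameters are hypothetical; `shutil.which` is the standard-library lookup for an executable on `PATH`.

```python
import shutil


def pick_python(candidates=("python2", "python"), which=shutil.which):
    """Return the first interpreter name from `candidates` found on PATH."""
    for name in candidates:
        if which(name):
            return name
    raise RuntimeError("no interpreter found among: " + ", ".join(candidates))


# Simulate a Mac OS X-like host where only `python` is installed.
only_python = lambda name: "/usr/bin/python" if name == "python" else None
print(pick_python(which=only_python))
```

Injecting `which` makes the fallback order checkable without depending on what is installed on the host running the test.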
Remove main function in some unittest.
* follow comments to fix bugs
change tag to 1.0.1
update go api doc