No acceleration effect when using multi-GPU #877
Comments
Your batch size is 8 times bigger, so it is faster.
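(Editor's note: in data-parallel training each GPU processes its own batch, so the effective batch per optimizer step scales with the GPU count. A minimal arithmetic sketch, assuming the per-GPU batch semantics this comment implies and the batch_size of 280 from the thread:)

```python
# Effective batch per optimizer step in data-parallel training.
# batch_size is per GPU, as this thread implies (280 sentences).
batch_size = 280
for n_gpus in (1, 8):
    print(f"{n_gpus} GPU(s): {batch_size * n_gpus} sentences per step")
# 1 GPU(s): 280 sentences per step
# 8 GPU(s): 2240 sentences per step
```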
Umm... both of these batch_size values are 280; I only changed the number of GPUs (1 vs 8). What I can see is that tok/s got bigger, but the training time for every 50 steps got longer.
What is the meaning of n_src_words and n_words at every step? And why do n_src_words and n_words show the same trend as the number of GPUs increases?
I supposed that with more GPUs, more sentences and words would be loaded at each step, and the time per step would shrink by a factor of eight, but conversely the time per step is increasing.
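(Editor's note: on the n_src_words / n_words question above: the `X/Y tok/s` field in the logs below reports source/target tokens processed per second, so when more GPUs consume more sentences per step, both counters rise together. A minimal sketch of such a counter, as a hypothetical class modeled on this kind of logging, not OpenNMT-py's exact code:)

```python
import time

class ThroughputStats:
    """Hypothetical sketch of a tok/s counter, modeled on the kind of
    statistics object behind the log lines below (not OpenNMT-py's code)."""
    def __init__(self):
        self.n_src_words = 0   # source tokens seen so far
        self.n_words = 0       # target tokens seen so far
        self.start = time.time()

    def update(self, n_src, n_tgt):
        # Called once per step; with 8 GPUs, roughly 8x the tokens arrive
        # per step, so both counters rise together -- hence the same trend.
        self.n_src_words += n_src
        self.n_words += n_tgt

    def report(self):
        elapsed = time.time() - self.start
        return f"{self.n_src_words / elapsed:.0f}/{self.n_words / elapsed:.0f} tok/s"
```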
You are mistaken.
So I can reduce train_steps to 1/8 and still achieve comparable results? BTW, if I want to measure the acceleration effect, should I set the batch_size to 280/8 when training on multiple GPUs, to compare against training on a single GPU?
Read more here, for instance: tensorflow/tensor2tensor#444
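(Editor's note: a quick arithmetic sketch of the two comparison strategies raised in the question above, using the batch_size of 280 and train_steps of 55000 visible in this thread:)

```python
# Two ways to compare single-GPU vs 8-GPU runs fairly
# (numbers from this thread: batch_size=280, train_steps=55000).
batch_size, train_steps, n_gpus = 280, 55000, 8

# (a) Same effective batch: shrink the per-GPU batch.
per_gpu_batch = batch_size // n_gpus     # 35 sentences per GPU
# (b) Same per-GPU batch: shrink the step budget, since each step
#     now consumes n_gpus times the data.
reduced_steps = train_steps // n_gpus    # 6875 steps
print(per_gpu_batch, reduced_steps)
```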
While using multi-GPU training, I found that there is no speed-up effect.
(Both experiments used the same hyper-parameters and framework, as well as the same preprocessed data.)
1 GPU:
[2018-08-03 18:07:31,781 INFO] Step 50/55000; acc: 1.33; ppl: 24568.27; xent: 10.11; lr: 1.00000; 23006/21669 tok/s; 18 sec
[2018-08-03 18:07:46,144 INFO] Step 100/55000; acc: 3.70; ppl: 68305.64; xent: 11.13; lr: 1.00000; 12894/13920 tok/s; 32 sec
[2018-08-03 18:07:59,929 INFO] Step 150/55000; acc: 7.27; ppl: 6525.28; xent: 8.78; lr: 1.00000; 10488/9970 tok/s; 46 sec
8 GPUs:
[2018-08-03 18:11:14,990 INFO] Step 50/55000; acc: 5.32; ppl: 11557.90; xent: 9.36; lr: 1.00000; 53166/56374 tok/s; 44 sec
[2018-08-03 18:11:48,445 INFO] Step 100/55000; acc: 6.32; ppl: 5841.12; xent: 8.67; lr: 1.00000; 68049/73648 tok/s; 77 sec
[2018-08-03 18:12:22,650 INFO] Step 150/55000; acc: 6.93; ppl: 4129.08; xent: 8.33; lr: 1.00000; 59233/68822 tok/s; 111 sec
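(Editor's note: reading the two logs together, the second run takes longer per 50 steps (~33 s vs ~14 s) but each step consumes 8x the data, so throughput still improves. A back-of-the-envelope check from the reported target tok/s values, assuming, as the thread implies, that the first block is the single-GPU run and the second the 8-GPU run:)

```python
# Rough speed-up check from the target tok/s values in the two logs
# (assuming the first block is 1 GPU and the second is 8 GPUs).
single_gpu = (21669 + 13920 + 9970) / 3    # ~15186 tok/s
multi_gpu = (56374 + 73648 + 68822) / 3    # ~66281 tok/s
print(f"throughput speed-up: {multi_gpu / single_gpu:.1f}x")  # ~4.4x
```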