Floating Point Exception (core dumped) #3

Open
ghost opened this issue Jan 21, 2016 · 9 comments

Comments

@ghost

ghost commented Jan 21, 2016

Hi,

I was able to build the Caffe version. I created two directories, train and test, in a folder named seaquest, containing the subfolders (0, 1, 2, 3, 4) and (0) respectively.

On running ../train_lstm.sh 18 0, I get this error (see screenshot below):

[Screenshot: error]

@tknandu

tknandu commented Jan 21, 2016

The same error occurred for me as well. Could you please look into it at the earliest opportunity?

@junhyukoh
Owner

Hi,
Have you checked if the folder names are in the following format?
./[game name]/train/[%04d]/[%05d].png
An example would be

/seaquest/train/0000/00000.png
/seaquest/train/0000/00001.png
/seaquest/train/0000/...
/seaquest/train/0001/00000.png
/seaquest/train/0001/00001.png
/seaquest/train/0001/...
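
The layout above can be produced from unpadded names (for example train/0/0.png) with a small script. This is only a minimal sketch, not part of the repository: the helper names `rename_padded` and `fix_layout` are hypothetical, and it assumes every episode folder and frame file has a purely numeric name.

```python
from pathlib import Path


def rename_padded(path: Path, width: int) -> Path:
    """Zero-pad a purely numeric stem; leave anything else untouched."""
    if not path.stem.isdecimal():
        return path
    target = path.with_name(f"{int(path.stem):0{width}d}{path.suffix}")
    # Skip names that are already padded, and never overwrite an existing entry.
    if target != path and not target.exists():
        path.rename(target)
        return target
    return path


def fix_layout(split_dir: Path) -> None:
    """Rename episode folders to %04d and their frames to %05d.png, in place."""
    # Both listings are materialized by sorted() before anything is renamed.
    for episode in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        for frame in sorted(episode.glob("*.png")):
            rename_padded(frame, 5)
        rename_padded(episode, 4)
```

Calling `fix_layout(Path("seaquest/train"))` and then `fix_layout(Path("seaquest/test"))` would turn seaquest/train/0/0.png into seaquest/train/0000/00000.png. Since the renames happen in place, it is safer to try this on a copy of the data first.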

@tknandu

tknandu commented Jan 22, 2016

Hi,

Zero-padding the folder and file names ensured that all the episodes and images are loaded. However, after loading, a Floating Point Exception is still raised.

Is it possible for you to share the sampled trajectories data you have generated (for Seaquest)?

@ghost
Author

ghost commented Jan 25, 2016

[Screenshot: ramderror]

@tknandu

tknandu commented Jan 28, 2016

As shown in the screenshot, a Floating Point Exception is raised even after the episodes and images are successfully loaded. Any idea how to resolve this?

@junhyukoh
Owner

I'm sorry for my late response.
I will look into this issue after the ICML deadline (2/5).

@junhyukoh
Owner

[Screenshot]

This actually works fine with my data.
I just uploaded some example data (2 episodes for train/test).
Can you check if the example works for you?

@ghost
Author

ghost commented Feb 11, 2016

Hi. Yes! It works. Our problem was the 2 GB GPU memory limit of the GTX 680 we were using. The network has 9 million parameters and needs 8+ GB of GPU memory. We ran it on a Titan X and the network trains!
Thanks for the help.
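
For anyone hitting the same wall, the available GPU memory can be checked before starting a long run. The snippet below is a rough sketch and not part of the repository: it assumes `nvidia-smi` is on the PATH, the function names are hypothetical, and the 8 GB threshold is only the figure quoted above.

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory.total' rows (MiB, no header, no units)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma in the GPU name cannot break parsing.
        name, total_mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(total_mib)))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for the name and total memory of every visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_gpu_memory(out)


def gpus_below(gpus: list[tuple[str, int]], required_mib: int) -> list[str]:
    """Return the names of GPUs whose total memory is under the requirement."""
    return [name for name, total in gpus if total < required_mib]
```

With this, `gpus_below(query_gpus(), 8 * 1024)` lists the cards that are likely too small for this network. An empty list only means the total memory looks sufficient; other processes may still be using part of it.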

@tknandu

tknandu commented Feb 15, 2016

Hi,

I am currently training the LSTM network with 1-step prediction on a Titan X GPU with 12 GB of GPU RAM for around 1.5 million iterations on Seaquest. Could you give an estimate of the training time?

Also, could you provide a trained model for 1-step prediction on Seaquest to compare results?
