FileNotFound error, as well as a few other errors. #74

Open
Subash-Chandra opened this issue Jun 24, 2020 · 9 comments

@Subash-Chandra

image

I'm getting these errors no matter which picture combination I use.

I tried lowering the resolutions of the pictures in the script because I thought it was failing to compute, and therefore failing to save which caused the next function to error, but lowering the resolution didn't fix it.

I'm running on CUDA with cuDNN, on an i7-7700k + RTX 2080 Super. I've run higher-resolution non-script style transfers that haven't failed, though, so I'm not sure what the problem may be.

I thought it may be because of edits I made to the starry_stanford.sh script, but I redownloaded and ran with default parameters, and it still failed with the exact same errors.

@ProGamerGov
Owner

The FileNotFound errors look like they are caused by another error that occurs earlier. The first step in the starry_stanford.sh script takes your input images and produces an output image. That output image is then used as an input image for the next step, and the output of that step is used as an input for the step after that. If an earlier step fails, no output image is produced, so every later step fails to find its input file.

What was the first error?
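
One way to surface that first error is to stop the chain as soon as a step fails instead of letting later steps run against a missing file. The sketch below is only an illustration and is not taken from starry_stanford.sh: `run_chain` and the placeholder commands are hypothetical, and it assumes each step is a separate process that reports failure through a non-zero exit code.

```python
import subprocess
import sys


def run_chain(commands):
    """Run commands in order and stop at the first one that fails.

    Returns (index, stderr) for the first failing command,
    or None if every command exits with status 0.
    """
    for index, command in enumerate(commands):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            # Later steps depend on this step's output file, so running
            # them would only add misleading follow-on errors.
            return index, result.stderr
    return None


if __name__ == "__main__":
    # Hypothetical stand-ins for the real style transfer steps.
    steps = [
        [sys.executable, "-c", "print('step 1 ok')"],
        [sys.executable, "-c", "raise RuntimeError('step 2 failed')"],
        [sys.executable, "-c", "print('step 3 is never started')"],
    ]
    failure = run_chain(steps)
    if failure is not None:
        index, stderr = failure
        print(f"Step {index + 1} failed first:")
        print(stderr)
```

In a shell script, a comparable effect might be had by exiting on the first failing command, so that the root error is the last thing printed.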

@Subash-Chandra
Author

The error is the first line. It's as follows:

RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED (createCuDNNHandle at ..\aten\src\ATen\cudnn\Handle.cpp:9)

Here is the Traceback just before that error.

Traceback (most recent call last):
  File "neural_style.py", line 468, in <module>
    main()
  File "neural_style.py", line 262, in main
    optimizer.step(feval)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\optim\lbfgs.py", line 311, in step
    orig_loss = closure()
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "neural_style.py", line 253, in feval
    loss.backward()
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\autograd\__init__.py", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED (createCuDNNHandle at ..\aten\src\ATen\cudnn\Handle.cpp:9)
(no backtrace available)

@Subash-Chandra
Author

Subash-Chandra commented Jun 24, 2020

I thought it was because I had the wrong version of cuDNN, so I reinstalled it, and it was still erroring.

For reference, I am using CUDA v11.

@ProGamerGov
Owner

ProGamerGov commented Jun 25, 2020

@Subash-Chandra The PyTorch site shows that the PyTorch Conda install only supports CUDA 9.2, CUDA 10.1, and CUDA 10.2: https://pytorch.org/get-started/locally/

Also, unless you are installing from source, I think cuDNN is prepackaged (comes with the pip and Conda packages).
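
A quick way to see which CUDA and cuDNN versions the installed PyTorch package was built against (as opposed to whatever is installed system-wide) is to ask PyTorch itself. This is a minimal sketch; `describe_torch_build` is a hypothetical helper name, and the values it reports depend entirely on the local install.

```python
def describe_torch_build():
    """Report the CUDA/cuDNN versions bundled with the installed PyTorch."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None}
    return {
        "torch": torch.__version__,
        # CUDA version PyTorch was compiled with; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # cuDNN version PyTorch loads; None if cuDNN is unavailable.
        "cudnn": torch.backends.cudnn.version(),
        # Whether a usable GPU and driver were actually found.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If `cuda_build` differs from the system-wide CUDA toolkit version, that alone is usually not a problem for pip or Conda installs, since the package ships its own runtime libraries; the GPU driver still has to be recent enough for that build.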

@Subash-Chandra
Author

Subash-Chandra commented Jun 25, 2020

I reinstalled CUDA 10.2, cuDNN 7.6.5 for CUDA 10.2, and the correct PyTorch version as well. Still getting the same error.

RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

I'm not sure why it's able to compute the first 15 or so images, and only fail after that.

@ProGamerGov
Owner

> I'm not sure why it's able to compute the first 15 or so images, and only fail after that.

Can you elaborate on that?

@Subash-Chandra
Author

Subash-Chandra commented Jun 26, 2020

image

It makes 15 images no matter the input image combination.

I have 8 GB of VRAM on my 2080 Super and cuDNN is installed, so it should be able to handle resolutions well above the 2350 from Step 5.

It finishes doing the 1000 iterations of Step 1, and 500 iterations of Step 2, and then just stops creating any more files.

edit - Also, there is no difference in the script itself between Step 2 and Step 3 except for the resolution, so there shouldn't be any reason that the script itself fails at that spot every time.

@ProGamerGov
Owner

@Subash-Chandra I can't seem to figure out whether the error is caused by a lack of memory or by something else. I originally created starry_stanford.sh on a GPU with 12 GB of VRAM, and that's all I tested it with.
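
To help tell a memory shortage apart from other causes, it may be useful to compare the GPU's total memory with what PyTorch has allocated in the current process. The following is a rough sketch under assumptions: `vram_report` is a hypothetical helper, it only sees memory used by the process that calls it, and where to call it (for example between steps) is left open.

```python
def vram_report(device_index=0):
    """Summarize GPU memory for one device, or return None without a GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    mib = 1024 ** 2
    return {
        "name": props.name,
        # Physical memory on the card.
        "total_mib": props.total_memory // mib,
        # Memory currently occupied by tensors in this process.
        "allocated_mib": torch.cuda.memory_allocated(device_index) // mib,
        # Memory held by PyTorch's caching allocator in this process.
        "reserved_mib": torch.cuda.memory_reserved(device_index) // mib,
    }


if __name__ == "__main__":
    print(vram_report())
```

Memory used by other programs on the same GPU does not show up in the allocated or reserved figures, so a tool such as `nvidia-smi` is still needed for the full picture.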

@DataCrusade1999

> I reinstalled CUDA 10.2, cuDNN 7.6.5 for CUDA 10.2, and the correct pyTorch version as well. Still getting the same error.
>
> RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
>
> I'm not sure why it's able to compute the first 15 or so images, and only fail after that.

I was able to resolve this issue by not including the -cudnn_autotune flag. Maybe this can help you out.
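
As a small illustration of that workaround, the sketch below drops the flag from a command's argument list before it is run. The helper `without_flag` and the example command are hypothetical and are not copied from starry_stanford.sh; the sketch assumes the flag takes no value, which is how it appears in this thread.

```python
def without_flag(args, flag="-cudnn_autotune"):
    """Return a copy of args with every occurrence of a valueless flag removed."""
    return [arg for arg in args if arg != flag]


if __name__ == "__main__":
    # Illustrative command only; the real script passes more options.
    command = [
        "python", "neural_style.py",
        "-backend", "cudnn",
        "-cudnn_autotune",
        "-image_size", "512",
    ]
    print(" ".join(without_flag(command)))
```

In PyTorch, cuDNN autotuning is controlled by `torch.backends.cudnn.benchmark`, which can use extra GPU memory while it tries out algorithms; whether that is what triggers the allocation failure here is an assumption, not something confirmed in this thread.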
