
Updated everything for Python3.6+ and TensorFlow1.13+ #59

Open: wants to merge 14 commits into base: master

akhilvasvani

Updated all the code for Python3.6+ and TensorFlow1.13+ usage

@kramarov-evg

Thank you for your work, but when I tried to run your code I hit a problem. Restoring a model from the checkpoint provided in the original repo gives this error:

Tensor name "g_net/g_OT/batch_norm/beta" not found in checkpoint files ./models/birds_skip_thought_model_164000.ckpt
         [[node save/RestoreV2 (defined at demo/birds_skip_thought_demo.py:67) ]]

Is it possible to use this saved checkpoint with your updated implementation at all, or did I make a mistake somewhere?

@akhilvasvani
Author

akhilvasvani commented Jul 17, 2019

Because I upgraded the code and factored out certain parts, you cannot use the previously pretrained model by hanzhanggit. So there are two options at the moment:

  1. Train your own model, which, depending on your hardware, could take up to 3 days
  2. If you wait a couple of days, I will post my pretrained model; at the moment, my model is still training

Sorry to be a downer

@kramarov-evg

That was my first thought, but I was unsure, so I decided to ask.
Anyway, thank you for your work. Looking forward to seeing your pretrained model asap, as training on my hardware would take far too long (I don't even have a CUDA-compatible GPU).

@kramarov-evg

Could you please tell me how much video memory training the model requires?

@akhilvasvani
Author

akhilvasvani commented Jul 19, 2019

Good news! Uploaded the pretrained model on my StackGAN forked repo as well as the updated birds_skipthoughts_demo.py file for python3 usage.

It took me 3 days to train the model on an NVIDIA 1080 Ti Founder's Edition GPU (with 11 GB of memory). Also, the file with all the bird pictures was ~1 GB.

@kramarov-evg

kramarov-evg commented Jul 19, 2019

I can't comment on some lines directly in the PR, as they were written by the original authors and left unedited by you, so I'll leave my comments here.

In misc/skipthoughts.py, lines 80-81 should read:

utable = numpy.load(path_to_tables + 'utable.npy', encoding='latin1', allow_pickle=True)
btable = numpy.load(path_to_tables + 'btable.npy', encoding='latin1', allow_pickle=True)

These files come from the original skipthoughts source, which wasn't updated to Python 3; that's why encoding='latin1' is needed in Python 3. allow_pickle=True is needed because it defaults to False since NumPy 1.16.3. In earlier versions it defaulted to True, so it didn't matter.
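
A minimal sketch of why encoding='latin1' matters, using only the standard pickle module (numpy.load forwards its encoding argument to the unpickler when it has to unpickle object arrays). The byte string below is a hand-built stand-in for Python 2 data, not something taken from utable.npy, and the helper names are hypothetical.

```python
import pickle

# Hypothetical stand-in for Python 2 data: a protocol-2 pickle of the
# 8-bit str '\xe9\xe8', written out by hand for illustration.
PY2_PICKLE = b"\x80\x02U\x02\xe9\xe8q\x00."


def load_py2_pickle(data, encoding="latin1"):
    """Unpickle Python 2 bytes under Python 3 with an explicit encoding."""
    return pickle.loads(data, encoding=encoding)


def fails_with_default_encoding(data):
    """Return True if the default ASCII decoding rejects the payload."""
    try:
        pickle.loads(data)
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    print(fails_with_default_encoding(PY2_PICKLE))
    print(repr(load_py2_pickle(PY2_PICKLE)))
```

latin1 maps every byte value to a code point, so it never fails on 8-bit strings; whether the decoded text is meaningful depends on what the original data was.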

What's more, I still get an error when loading your checkpoint. It says:

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [4800,256] rhs shape= [1024,256]
         [[node save/Assign_55 (defined at demo/birds_skip_thought_demo.py:67) ]]

I tried to figure out what's wrong, but failed. What could have caused it? I used your code as is, without editing the model or anything else apart from the changes listed above.

Thank you for your code and support
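
Both restore errors in this thread (a tensor name missing from the checkpoint, and a shape mismatch on assignment) come down to the graph and the checkpoint disagreeing about variables. A rough sketch of that comparison in plain Python follows. The two dictionaries are hypothetical inputs that one might build from the graph's variables and from something like tf.train.list_variables; diff_variables is a made-up helper, and apart from the batch_norm name and the 4800-vs-1024 pair, the names and shapes are invented.

```python
def diff_variables(graph_shapes, ckpt_shapes):
    """Compare variable shapes expected by the graph with those in a checkpoint.

    Both arguments map variable name -> shape (a sequence of ints).
    Returns (missing, mismatched): names absent from the checkpoint, and
    {name: (graph_shape, ckpt_shape)} for names whose shapes differ.
    """
    missing = sorted(name for name in graph_shapes if name not in ckpt_shapes)
    mismatched = {
        name: (tuple(shape), tuple(ckpt_shapes[name]))
        for name, shape in graph_shapes.items()
        if name in ckpt_shapes and tuple(shape) != tuple(ckpt_shapes[name])
    }
    return missing, mismatched


if __name__ == "__main__":
    # Hypothetical values; only the 4800-vs-1024 pair and the batch_norm
    # name are taken from the error messages in this thread.
    graph = {
        "g_net/g_OT/batch_norm/beta": [128],
        "g_net/example/weights": [4800, 256],
    }
    ckpt = {"g_net/example/weights": [1024, 256]}
    missing, mismatched = diff_variables(graph, ckpt)
    print("missing from checkpoint:", missing)
    print("shape mismatches:", mismatched)
```

A listing like this shows at a glance whether a checkpoint was written by a differently structured graph or trained against a different text-embedding size.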

@YiXun31

YiXun31 commented Jul 19, 2019

Hello, how can I test it?
For example: I enter text and the generator produces an image.
If you have test code, please share it with me.
Thanks for your help!

@akhilvasvani
Author

akhilvasvani commented Jul 19, 2019

@YiXun31: If by test you mean running on your own text captions: inside the birds_demo.sh script,
change CAPTION_PATH=Data/birds/example_captions to point to wherever your own captions text file is.

@akhilvasvani
Author

akhilvasvani commented Jul 19, 2019

@kramarov-evg: I'll be completely honest: I did not train the model using skip-thoughts because I did not want to download Theano and/or upgrade that code for training. However, after looking into the problem, I found a way around Theano. There is a TensorFlow implementation by Chris Shallue.

At the moment, I am rewriting my skipthoughts.py file to incorporate these changes and keep everything in sync.

@YiXun31

YiXun31 commented Jul 20, 2019

hello
My system is Windows 10,
so I can't run sh demo/birds_demo.sh.
How can I run it?
Thanks for your help!

@akhilvasvani
Author

@YiXun31: Look at this link for running bash commands on Windows 10

@kramarov-evg

@akhilvasvani, I'm not sure that will work, as WSL doesn't support CUDA integration yet.

@YiXun31, please, let me know if it works

@YiXun31

YiXun31 commented Jul 20, 2019

Output after running it:
demo/birds_demo.sh: line 11: th: command not found
demo/birds_demo.sh: line 19: python3: command not found

@kramarov-evg

kramarov-evg commented Jul 20, 2019

Search for instructions on installing Torch; they can be found via the Torch link in the README. That covers th. For python3, just run sudo apt install python3. If you use a package manager other than apt, look for the instructions for that package manager.

@YiXun31

YiXun31 commented Jul 20, 2019

Do I have to execute the .sh file?
Can I run it with Python on Windows?
Sorry to keep bothering you.

@akhilvasvani
Author

@YiXun31: You can execute the birds_demo.py Python file directly. You will just have to provide the arguments yourself. If you look at the .sh file, you can see exactly which arguments you have to supply.
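
For anyone who cannot run the .sh file, here is a small sketch for listing the NAME=value assignments a shell script makes, so the same values can be passed to the Python script by hand. It only handles simple top-level assignments (no trailing comments, no variable expansion). The sample text is illustrative: the CAPTION_PATH line is quoted from this thread, but the rest of birds_demo.sh is not reproduced here, and shell_assignments is a hypothetical helper.

```python
import re

# Matches simple assignments such as NAME=value or NAME="value".
_ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def shell_assignments(script_text):
    """Return {name: value} for simple variable assignments in a shell script."""
    values = {}
    for line in script_text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match is None:
            continue
        name, value = match.groups()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[name] = value
    return values


if __name__ == "__main__":
    sample = "# illustrative excerpt\nCAPTION_PATH=Data/birds/example_captions\n"
    print(shell_assignments(sample))
```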

@kramarov-evg: what is wsl?

@kramarov-evg

kramarov-evg commented Jul 20, 2019

@akhilvasvani WSL stands for Windows Subsystem for Linux. If I'm not mistaken, it was in the link you provided as a solution.

Moreover, won't he need to generate embeddings for the sentences in example_samples.txt? If so, they're generated using Torch, which has no Windows support other than compiling from source, which happens to be a really complicated task. Correct me if I'm wrong.

That was the only reason I wanted to use skip-thoughts: it has full Windows compatibility.

@YiXun31

YiXun31 commented Jul 20, 2019

@akhilvasvani Hello, I can't find birds_demo.py.

@kramarov-evg

@YiXun31 You should wait until the author fully implements skip-thought embeddings, just like I'm doing. Maybe there's a different way that I'm not aware of, IDK.

@YiXun31

YiXun31 commented Jul 22, 2019

@kramarov-evg Thank you for your answer.

@akhilvasvani
Author

@kramarov-evg:
Good News: I got the skip_thoughts_demo.py file working, so the script runs
Bad News: You cannot use the pretrained model because its encoder_dimensions does not match what is required. So now I have to train a new model, which will take several days.

@kramarov-evg

@akhilvasvani Wow, congratulations. Amazing news. Waiting for your next pretrained model. No words can describe my gratitude for this support and contribution.

@kramarov-evg

@akhilvasvani How's it going?

@kramarov-evg

@akhilvasvani
Hello, I've tried many other ways to launch the cnn-rnn model while waiting for you to update the repo to include skip-thought. While figuring out a way to do so, I ran into several issues. One of them, probably the last one, is that the text encoder provided by reedscot was trained on a GPU, so it can't be used on a CPU-only machine like mine.
There's a nice script by one developer that converts a GPU-pretrained model to be CPU-compatible. Unfortunately, running this script also requires a CUDA GPU. Could you please convert the text encoder to be CPU-compatible? Then there would be no need for the skip-thought model, as anyone would be able to run the model regardless of OS or GPU availability.

@akhilvasvani
Author

akhilvasvani commented Aug 2, 2019

I'm so sorry @kramarov-evg. I had to stop training the skip-thoughts model for a short while as I worked on another project. Sure, I can do that. Let me get on that.

Also, nice find with that script.

For future developers, though, I will keep training the skip-thoughts model.

@kramarov-evg

@akhilvasvani no problem with that at all. After all, that's not your obligation :-)

Hope you get to finish it at some point.

@akhilvasvani
Author

@kramarov-evg: I finally got the skip-thoughts model to work! I got the pickle file from the pretrained skip-thoughts model (for training and testing) and now I am training the StackGAN model with it. Stay tuned for an update

@vigneshdurairaj

vigneshdurairaj commented Mar 13, 2020

@akhilvasvani I tried executing skip-thoughts.py. Now I'm stuck at the checkpoint error mentioned by @kramarov-evg earlier. I don't know where to go from here without a pretrained model.

Assign requires shapes of both tensors to match. lhs shape= [4800,256] rhs shape= [1024,256]
[[node save/Assign_55 (defined at demo/birds_skip_thought_demo.py:67) ]]


I tried to figure out, what's wrong, but failed. What could have caused it? I used your code as is without editing the model or anything, but changes listed above

Thank you for your code and support

Thanks in advance

@ChelyYi

ChelyYi commented May 4, 2020

@akhilvasvani I tried executing skip-thoughts.py. Now I'm stuck at the checkpoint error mentioned by @kramarov-evg earlier. I don't know where to go from here without a pretrained model.

Assign requires shapes of both tensors to match. lhs shape= [4800,256] rhs shape= [1024,256]
[[node save/Assign_55 (defined at demo/birds_skip_thought_demo.py:67) ]]


I tried to figure out, what's wrong, but failed. What could have caused it? I used your code as is without editing the model or anything, but changes listed above

Thank you for your code and support

Thanks in advance
@akhilvasvani
Yeah, I have the same problem.
The embedding dimension used in the pretrained model is 1024, but the skip-thought embeddings in the skipthought.py file are 4800-dimensional, so we cannot run the demo with your pretrained model.
Did you evaluate your model with a metric such as inception score?
And did you change the model structure of the original StackGAN?

Thx in advance.
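
The mismatch described above can be caught before a restore is attempted by comparing the last axis of the embedding array with the text dimension the model was configured for. A sketch, assuming the embeddings are shaped (examples, captions, dimension), as the "embeddings:  (8855, 10, 1024)" log line in this thread suggests; check_text_dimension is a hypothetical helper, not part of the repository.

```python
def check_text_dimension(embedding_shape, expected_dim):
    """Raise ValueError if the embeddings' last axis is not expected_dim."""
    actual_dim = embedding_shape[-1]
    if actual_dim != expected_dim:
        raise ValueError(
            "text embeddings have dimension %d but the model expects %d; "
            "use embeddings and a checkpoint built for the same encoder"
            % (actual_dim, expected_dim)
        )
    return actual_dim


if __name__ == "__main__":
    check_text_dimension((8855, 10, 1024), 1024)  # passes silently
    try:
        # 4800-dimensional skip-thought vectors against a 1024-dim model
        check_text_dimension((8855, 10, 4800), 1024)
    except ValueError as err:
        print(err)
```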

@akhilvasvani
Author

akhilvasvani commented May 6, 2020

Hi @ChelyYi + @vigneshdurairaj, the reason the script does not run is that I still have to post the generated pickle file for the skip-thoughts text embeddings. At the moment, I am retraining the StackGAN model so you can use it with the skip-thoughts embeddings.

@ChelyYi

ChelyYi commented May 8, 2020

@akhilvasvani Thx, waiting for your new model.
And just asking: did you test your model and compare it with the old version's results? Can it achieve equally good results?

@akhilvasvani
Author

Hey @ChelyYi, apologies for the wait. I have just put the skip-thoughts embedding pickle file online, so that you can train the StackGAN network using skip-thoughts embeddings.

Here is the downloadable link for birds

@gowthamvbhat

@akhilvasvani Thanks a lot for the updated code for python 3.6+ and Tensorflow 1.13+. I was struggling to run the old code : )
Waiting for your skip-thoughts model.

@shikhar-scs

shikhar-scs commented Oct 5, 2020

hey @akhilvasvani, I followed the setup in the fork available on your GH account. However, I get the following error. Let me know if you've seen it or anything similar.

Using config:
{'CONFIG_NAME': '3stages',
 'CUDA': True,
 'DATASET_NAME': 'birds',
 'DATA_DIR': 'data/birds',
 'EMBEDDING_TYPE': 'cnn-rnn',
 'GAN': {'B_CONDITION': True,
         'DF_DIM': 64,
         'EMBEDDING_DIM': 128,
         'GF_DIM': 64,
         'NETWORK_TYPE': 'default',
         'R_NUM': 2,
         'Z_DIM': 100},
 'GPU_ID': '0',
 'TEST': {'B_EXAMPLE': True, 'SAMPLE_NUM': 30000},
 'TEXT': {'DIMENSION': 1024},
 'TRAIN': {'BATCH_SIZE': 9,
           'COEFF': {'COLOR_LOSS': 0.0, 'KL': 2.0, 'UNCOND_LOSS': 1.0},
           'DISCRIMINATOR_LR': 0.0002,
           'FLAG': True,
           'GENERATOR_LR': 0.0002,
           'MAX_EPOCH': 600,
           'NET_D': '',
           'NET_G': '',
           'SNAPSHOT_INTERVAL': 1000,
           'VIS_COUNT': 64},
 'TREE': {'BASE_SIZE': 64, 'BRANCH_NUM': 3},
 'WORKERS': 4}
Total filenames:  11788 001.Black_footed_Albatross/Black_Footed_Albatross_0046_18.jpg
Load filenames from: data/birds/train/filenames.pickle (8855)
embeddings:  (8855, 10, 1024)
Segmentation fault (core dumped)

Tried debugging; things are fine up to the following line: https://github.com/akhilvasvani/StackGAN-v2/blob/94df11d10dba2ddc62aa4c339617d846195d6a17/code/main.py#L141
After that, however, I get the Segmentation fault (core dumped) error.

Tried reducing the batch size etc., but to no avail.

@shikhar-scs

@akhilvasvani

@akhilvasvani
Author

akhilvasvani commented Oct 7, 2020

Hi @shikhar-scs, apologies for the wait. I did not get this error. What GPU are you using? And what is your memory size?

@akhilvasvani
Author

@gowthamvbhat I can provide an updated model by later this weekend

@shikhar-scs

shikhar-scs commented Oct 7, 2020

What GPU are you using?

I tried a TITAN V and also a GeForce GTX 1080. Same error on both. It works on my local machine with gpu=-1, so maybe there is some issue with the data folder.
For pre-processing, we follow StackGAN v1, right?

I can provide an updated model by later this weekend

Wanted to train it actually for different scenarios.

@akhilvasvani
Author

akhilvasvani commented Oct 7, 2020

Oh, that's weird. You should have enough memory on those GPUs. For memory: is it 11 GB or more? Wait, what type of local GPU do you have? Are you using these GPUs in the cloud? I trained the model locally.

@gowthamvbhat

@akhilvasvani That would be great! Thank you.

@shikhar-scs

shikhar-scs commented Oct 7, 2020

For memory: it's 11 GB or more?

Slightly more than that on each. I have a VM; locally I was trying on my MacBook Pro, where the code at least ran. I scp'd the exact same zip file of the directory to my VM, and the same code stopped working there.

Anyway, I guess something is off on my side. Let me figure it out. Thanks for your response.

@sharad1999

Good news! Uploaded the pretrained model on my StackGAN forked repo as well as the updated birds_skipthoughts_demo.py file for python3 usage.

It took me 3 days to train the model on a NVIDIA GPU 1080 Ti Founder's edition (with 11 Gbs). Also the file size was ~1Gbs large for the all the pictures of the birds

Hi Sir,
I ran your code and I am getting an error in Stage 1 about the checkpoints:
tensorflow.python.framework.errors_impl.NotFoundError: ./ckt_logs/birds/stageI_2019_07_10_09_33_08; No such file or directory

Can you provide a solution?
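
One plausible reading of this error is that the configuration points at a Stage-I checkpoint directory that exists only on the machine where the model was trained. A small sketch for failing early with a clearer message; require_checkpoint_dir is a hypothetical helper, and the path is the one from the error above.

```python
import os


def require_checkpoint_dir(path):
    """Return path if it is an existing directory, else raise FileNotFoundError."""
    if not os.path.isdir(path):
        raise FileNotFoundError(
            "checkpoint directory %r does not exist; train Stage I first or "
            "point the configuration at a directory that holds a checkpoint" % path
        )
    return path


if __name__ == "__main__":
    try:
        require_checkpoint_dir("./ckt_logs/birds/stageI_2019_07_10_09_33_08")
    except FileNotFoundError as err:
        print(err)
```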

@akhilvasvani
