Train pix2pix with my own data #309
How did you stack A and B as a pair? Did you stack along the channel dimension? For example, if A has dimensions (H1, W1, C1) and B is (H1, W1, C2), did you stack along the C dimension to get (H1, W1, C1+C2)? Does your segmentation label contain only the values 0 and 1, or values between 0 and 1? How do you normalize the data after stacking? (I guess the ranges of your RGB image and segmentation result are different.)
Yes, I stack A and B along the channel dimension as you said. My segmentation label is also an RGB image, so I think their ranges are similar. Do you have any advice? Thanks.
I am working on a similar problem, but my A and B have different ranges, and I am running into the same issue. I am looking for a solution as well.
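When A and B have different value ranges, one common workaround is to normalize each side to [-1, 1] independently before stacking along the channel dimension. This is only a sketch, not part of the pix2pix codebase; the helper `to_unit_range` is hypothetical, and the repo itself normalizes with fixed mean/std transforms instead:

```python
import numpy as np

def to_unit_range(img):
    """Hypothetical helper: linearly rescale an image to [-1, 1]
    using its own min/max. Adapt this to your own pipeline."""
    img = img.astype(np.float32)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return 2.0 * (img - lo) / (hi - lo) - 1.0

# A is an RGB photo in [0, 255]; B is a segmentation map in [0, 1].
A = np.random.randint(0, 256, (200, 200, 3)).astype(np.float32)
B = np.random.rand(200, 200, 3).astype(np.float32)

A_n, B_n = to_unit_range(A), to_unit_range(B)
# Channel-wise stack after normalization: (200, 200, 6), both halves in [-1, 1].
pair = np.concatenate([A_n, B_n], axis=-1)
```

Normalizing each modality separately keeps one side from dominating the other when their raw scales differ by orders of magnitude.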
Ok, good luck! If you have any useful ideas, please tell me!
You should stack the two images as (H1, W1+W1, C1), where we assume C1=C2. For other types of data, you may consider writing your own data loader that inherits from the base_dataset module.
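The (H1, W1+W1, C1) layout described above is a single concatenation along the width axis; the repo ships a script for this, but the core idea can be sketched in a few lines of numpy:

```python
import numpy as np

# Two aligned images of identical shape (H, W, C), e.g. a photo and its label map.
A = np.zeros((256, 256, 3), dtype=np.uint8)
B = np.ones((256, 256, 3), dtype=np.uint8)

# Side-by-side stack along the width axis -> (256, 512, 3).
AB = np.concatenate([A, B], axis=1)

# The aligned data loader later splits AB back into its two halves.
w = AB.shape[1] // 2
A2, B2 = AB[:, :w], AB[:, w:]
```

Stacking along width (rather than channels) is what lets the default aligned loader recover A and B by simply cutting the combined image in half.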
Why do we need to stack along the width dimension? |
@phamnam95 it is the current design of the default data loader for pix2pix. You can run this script to concatenate input and output images. It works fine if C1=C2=3 or C1=C2=1. It might not be the best fit for your dataset; feel free to write your own data loader.
Is it possible to have input and output images with different dimensions? If we stack along the width dimension, the input image becomes (H, W+W, C) while the output image is (H, W, C)?
It's not supported by the default data loader.
So if I stack the dataset like you suggested, the dimension of input and output will be different. For example, I have two sets of images with dimension (200,200,1) and (200,200,1). I want to create the output of dimension (200,200,1). How can I stack inputs to feed in training? If I stack along width dimension, it will be (200,400,1) for input and (200,200,1) for output? |
If you stack your inputs, the image will be (200, 400, 1). |
I am a little confused, because my input consists of two images, A and B, while my output is a single image C. Thanks.
I see. In this special case, you may want to write your own data loader. It should only take 1 hour. |
Should I stack along the channel dimension for A and B? |
If you write your own data loader, you can load each image separately by its file name.
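A custom loader for the two-inputs-one-output case could look roughly like this. Everything here is hypothetical (the class name `TwoInputDataset`, the directory arguments, and the placeholder `_load`); in the actual repo you would inherit from the `BaseDataset` class in `data/base_dataset.py` and read real images with PIL:

```python
import os
import numpy as np

class TwoInputDataset:
    """Hypothetical loader: pairs images A and B as a 2-channel input
    and returns C as the target. File names are assumed to match
    across the three directories."""

    def __init__(self, dir_A, dir_B, dir_C):
        self.dir_A, self.dir_B, self.dir_C = dir_A, dir_B, dir_C
        self.names = sorted(os.listdir(dir_A))

    def __len__(self):
        return len(self.names)

    def _load(self, directory, name):
        # Placeholder: in practice use PIL.Image.open(os.path.join(directory, name)).
        return np.zeros((200, 200, 1), dtype=np.float32)

    def __getitem__(self, idx):
        name = self.names[idx]
        A = self._load(self.dir_A, name)
        B = self._load(self.dir_B, name)
        C = self._load(self.dir_C, name)
        # Stack the two inputs along the channel axis: (200, 200, 2).
        AB = np.concatenate([A, B], axis=-1)
        return {"A": AB, "B": C}  # pix2pix convention: "A" = input, "B" = target

# Quick demonstration with a throwaway temp directory.
import tempfile
root = tempfile.mkdtemp()
for name in ("0.png", "1.png"):
    open(os.path.join(root, name), "w").close()

ds = TwoInputDataset(root, root, root)
sample = ds[0]
```

Matching file names across the three directories keeps the pairing logic trivial; any other pairing scheme would need an explicit index file.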
I mean that during training I cannot feed two separate input images A and B into the input tensor; I need to stack them into one image for the input.
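Right — the two inputs can be merged into one array before they reach the network, and the model then sees it as a single multi-channel image. A minimal numpy sketch (if you do this, the network's `input_nc` option would need to match the combined channel count, here 2):

```python
import numpy as np

# Two single-channel inputs of the same spatial size.
A = np.random.rand(200, 200, 1).astype(np.float32)
B = np.random.rand(200, 200, 1).astype(np.float32)

# One input "image" with two channels; set input_nc=2 in the model options.
AB = np.concatenate([A, B], axis=-1)
```

From the network's point of view there is no difference between a 2-channel stacked pair and a grayscale image with an extra channel; only the first convolution's channel count changes.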
I use my own data to train the pix2pix model.
I stack A and B (a pair) as the input to my model, and I want to generate C.
B is the segmentation label for C, and A provides the RGB information for generating C.
However, during training the result is not what I want: the fake C looks very similar to A rather than to the real C.
The loss plot is as follows:
I want to know how I can decrease the influence of A and make the fake C look like the real C rather than A. How should I change the pix2pix model (currently used with default settings)? Can someone give me some advice? Thanks!