
the result looks not that good #5

Closed · Orange-Ctrl opened this issue Dec 7, 2023 · 21 comments

@Orange-Ctrl commented Dec 7, 2023

Hi,
thank you for your great work!
I ran the code on an RTX 3090 and the training process works well, but the result I got looks very strange. Yesterday you told me to fix the tiny-cuda-nn warning tinycudann was built for lower compute capability ({cc}) than the system's ({system_compute_capability}). Performance may be suboptimal., but I haven't been able to fix it yet. Then again, maybe the result shouldn't be this bad just because of that warning?
Could you give me some advice on fixing this? Thank you in advance!
[image: rendered result]
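For reference, this is how I check the card's compute capability (a minimal sketch, assuming PyTorch is installed; an RTX 3090 should report 8.6). I believe tiny-cuda-nn can then be rebuilt for the matching architecture via its TCNN_CUDA_ARCHITECTURES build variable (e.g. 86):

    # Print the GPU's compute capability so it can be compared against
    # the capability tinycudann was compiled for.
    import torch

    major, minor = torch.cuda.get_device_capability(0)
    print(f"compute capability: {major}.{minor}")  # expect 8.6 on an RTX 3090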

@sbharadwajj (Owner) commented Dec 8, 2023

Hi,

This looks wrong. I think something else is not working at all. It is not because of tinycudann, because we use that only for the 2nd stage of training. However, it is possible that the first stage produces a proper mesh but, due to a wrong installation of tinycudann, the mesh diverges completely in the 2nd stage.

  1. Can you visualize mesh_latest from the first training stage?
  2. Can you show me the grid images saved during the first training stage?
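If it helps, here is a minimal way to inspect the exported mesh (a sketch, assuming the stage-1 mesh is saved as an .obj and trimesh is available; the path is hypothetical):

    # Quick visual sanity check of the stage-1 output mesh.
    import trimesh

    mesh = trimesh.load("out/yufeng/mesh_latest.obj")  # hypothetical path
    print(mesh.vertices.shape, mesh.faces.shape)       # basic sanity numbers
    mesh.show()                                        # opens an interactive viewer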

@Orange-Ctrl (Author)

> 1. Can you visualize mesh_latest from the first training stage?
> 2. Can you show me the grid images saved during the first training stage?

Hi,
mesh_latest and the grid_100 image from the first training stage look like this:
[image: mesh_latest]
[image: grid_100]
Also, I got this: ninja: no work to do.
[image]

@sbharadwajj (Owner) commented Dec 9, 2023

These two are from the first stage of training, correct?

Which dataset are you using? And can you show me grid_0, to check whether the data is correct?

Maybe there is a problem with the dataset? As a sanity check, can you run on the dataset provided by IMavatar?

@Orange-Ctrl (Author)

> Which dataset are you using? And can you show me grid_0? As a sanity check, can you run on the dataset provided by IMavatar?

Yes, they're all from stage 1.
I use the 'yufeng' dataset downloaded from IMavatar. The stage-1 grid_1:
[image: grid_1]

@sbharadwajj (Owner)

I see the problem: the first column is supposed to visualise the ground truth, but it's blank. So it's training with a blank image.

Can you verify whether you have changed something? It's not loading the data at all.

@adrianJW421

I came across a similar problem when simply running python train.py --config configs/001.txt from the README. I didn't change any other code except adjusting dataset_util.py in flare/dataset: I replaced as_gray=True with mode="L", which I believe is not the cause of the error. So I'm looking forward to further developments in this discussion.

@sbharadwajj (Owner)

@adrianJW421
Can you change it back to how it was before and share a visualisation of grid_0?

Can you elaborate on what you mean by a similar problem? I think the problem for @Orange-Ctrl is that the data is not loading correctly at all.

@Orange-Ctrl (Author)

> Can you verify whether you have changed something? It's not loading the data at all.

Hello!
I'm pretty sure I downloaded the correct dataset and didn't change any code. What could be the problem with loading the data?

@Orange-Ctrl (Author)

I think it's because the mask is always zero. When I delete img = img * mask in dataset_real.py, the ground truth shows up.
[image]
This is the stage-1 grid_1.
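For anyone debugging the same way, a hypothetical one-line check dropped next to the img = img * mask line in dataset_real.py (variable names assume that surrounding code):

    # If this prints 0.0 for every frame, the loaded masks are all zeros
    # and the multiplication wipes out the ground-truth image.
    print(mask.sum().item(), mask.max().item())
    img = img * mask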

@zydmu123

Hello @Orange-Ctrl, could you share your env settings? My code also runs on an RTX 3090, but it doesn't work; I just get "Re-initializing main() because the training of light MLP diverged and all the values are zero" all the time...

@Orange-Ctrl (Author)

> Could you share your env settings? My code also runs on an RTX 3090, but it doesn't work; I just get "Re-initializing main() because the training of light MLP diverged and all the values are zero" all the time...

I met this problem before, see #4: requirements.txt lacks the module robust_loss_pytorch. pip install git+https://github.com/jonbarron/robust_loss_pytorch and it works.
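A quick way to confirm the install worked:

    # A ModuleNotFoundError here means the env still lacks the package.
    import robust_loss_pytorch
    print(robust_loss_pytorch.__file__)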

@zydmu123

I tried installing this lib, but it didn't change anything... My PyTorch version is 1.13.1, installed with conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia; cudatoolkit is 11.3.

@Orange-Ctrl (Author) commented Jan 17, 2024

> I tried installing this lib, but it didn't change anything...

Same setup as me. Maybe you can try changing the code in train.py: don't use the while(True) loop, so you can see the actual error information.
[image]
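Roughly the idea (a sketch; the actual wrapper in train.py may look different):

    # train.py re-runs main() in a loop whenever the light MLP diverges,
    # which hides the original traceback. Calling main() once lets the
    # real error surface.
    if __name__ == "__main__":
        main()
        # instead of something like:
        # while True:
        #     main()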

@zydmu123

OK, I'll try, thanks so much! @Orange-Ctrl

@Orange-Ctrl (Author) commented Jan 17, 2024

Hi @sbharadwajj!
I finally got the correct mesh. I changed the code in dataset_util.py to

def _load_mask(fn):
    alpha = imageio.imread(fn)
    mask = torch.Tensor(np.array(alpha) > 127.5)[:, :, 1:2].bool().int().float()
    return mask

instead of the original

def _load_mask(fn):
    alpha = imageio.imread(fn, mode='L')
    alpha = skimage.img_as_float32(alpha)
    mask = torch.tensor(alpha / 255., dtype=torch.float32).unsqueeze(-1)
    mask[mask < 0.5] = 0.0
    return mask

@sbharadwajj (Owner)

@Orange-Ctrl can you tell me the quantitative results on the yufeng dataset? I will verify whether I get the same.

I will look into why you had to change the mask code soon.

@Orange-Ctrl (Author)

> can you tell me the quantitative results on the yufeng dataset?

OK, this is stored in final_eval.txt:

w/o cloth result:

MAE | LPIPS | SSIM | PSNR
0.24321708372194473 | 0.4164765131310241 | 0.5220661378886602 | 8.06310037168738

w/o cloth result:

MAE | LPIPS | SSIM | PSNR
0.026779092522720767 | 0.1019469020097223 | 0.8518311991103708 | 23.98372743893976

w/o cloth result:

MAE | LPIPS | SSIM | PSNR
0.02751253441690582 | 0.09591557103068861 | 0.8505647247458158 | 23.915589923597363

@sbharadwajj (Owner)

The numbers look correct for the yufeng dataset. I assume the first row of results is from when the training diverged, correct?

I will get back to you about the mask.

@Orange-Ctrl (Author)

> The numbers look correct for the yufeng dataset. I will get back to you about the mask.

ok, thank you so much~

@adrianJW421

> Hi @sbharadwajj! I finally got the correct mesh. I changed the code in dataset_util.py to [...]

I also get correct results by following @Orange-Ctrl's answer and making sure that all necessary packages, like robust_loss_pytorch, are correctly installed. Besides, I made another change in dataset_util.py:

def _load_semantic(fn):
    # deleted the outdated 'as_gray=True' param for my env settings
    img = imageio.imread(fn)

@sbharadwajj (Owner)

hi @adrianJW421, @Orange-Ctrl
My apologies for getting back so late.

While changing the mask code makes the code run, it is not exactly correct: the mask values become binary instead of continuous.

Could you please share a single mask image with me? When I tested just now by downloading IMavatar's data, this code works for me:

def _load_mask(fn):
    alpha = imageio.imread(fn, mode='L') 
    alpha = skimage.img_as_float32(alpha)
    mask = torch.tensor(alpha / 255., dtype=torch.float32).unsqueeze(-1)
    mask[mask < 0.5] = 0.0
    return mask
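One guess at the discrepancy (an assumption about library versions, not something I have verified in your environments): the old as_gray=True argument made imageio return a float array in the range [0, 255], so img_as_float32 left the range unchanged and the /255. was correct. Newer imageio with mode='L' returns uint8, which img_as_float32 already rescales to [0, 1]; dividing by 255 again pushes every value below the 0.5 threshold and zeroes the whole mask. A quick check:

    # Check which behaviour your imageio/skimage combination gives;
    # "mask.png" is a hypothetical stand-in for any dataset mask file.
    import imageio
    import skimage

    alpha = imageio.imread("mask.png", mode="L")
    print(alpha.dtype)   # uint8 on newer imageio
    alpha = skimage.img_as_float32(alpha)
    print(alpha.max())   # ~1.0 here means the extra /255. zeroes the mask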
