
error about training code #2

Closed
yjh576 opened this issue Aug 19, 2022 · 6 comments

yjh576 commented Aug 19, 2022

I encountered an error when calling

trainer.fit(model=model, train_dataloaders=train_loader, val_dataloaders=val_loader, ckpt_path=checkpoint_path)

even though I have confirmed that the data loads successfully:

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'trimesh.caching.TrackedArray'>

taconite (owner) commented Aug 19, 2022

It seems one of the dataloader's outputs is of type trimesh.caching.TrackedArray instead of the expected numpy.ndarray. This is most likely caused by a wrong version of trimesh being installed. From the information given, I can't tell which specific item from the dataloader is wrong.

Can you check which version of trimesh you installed in your environment?

That said, the trimesh.caching.TrackedArray most likely comes from the following lines:

points_skinning, _ = smpl_mesh.sample(1024, return_index=True)

inside_points, face_idx = smpl_mesh.sample(4096, return_index=True)

Can you add points_skinning = np.array(points_skinning) and inside_points = np.array(inside_points) right after these two lines, respectively, and see if this resolves your issue?
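For reference, the conversion looks like this (a minimal sketch using a stand-in subclass, since the exact dataset code isn't shown here):

```python
import numpy as np

class TrackedArray(np.ndarray):
    """Stand-in for trimesh.caching.TrackedArray: an ndarray subclass whose
    module is not 'numpy', which is why default_collate rejects it."""
    pass

# Pretend these came from smpl_mesh.sample(..., return_index=True)
points_skinning = np.zeros((1024, 3)).view(TrackedArray)
inside_points = np.zeros((4096, 3)).view(TrackedArray)

# Cast back to plain ndarrays right after the sampling calls;
# np.asarray drops the subclass and returns a base-class ndarray
points_skinning = np.asarray(points_skinning)
inside_points = np.asarray(inside_points)

print(type(points_skinning) is np.ndarray)  # True
```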

yjh576 (author) commented Aug 19, 2022

One more question: does training with a single GPU versus four GPUs have any effect on the performance of the model?

taconite (owner) replied:
To reproduce the numbers reported in the paper you need 4 GPUs or, more specifically, a batch size of 4. The current implementation allows only one batch per GPU, so using a batch size of 4 is equivalent to using 4 GPUs.

Training with a single GPU (a batch size of 1) could result in degraded accuracy and may lead to unstable gradients during training.
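For illustration, the two configurations would look roughly like this with a PyTorch Lightning Trainer (a sketch only; the exact Trainer arguments in the repo may differ):

```python
import pytorch_lightning as pl

# Four GPUs, one sample each -> effective batch size of 4 (paper setting)
trainer = pl.Trainer(accelerator="gpu", devices=4, strategy="ddp")

# A single GPU gives an effective batch size of 1, which may degrade
# accuracy and destabilize gradients during training
trainer_single = pl.Trainer(accelerator="gpu", devices=1)
```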

@yjh576 yjh576 closed this as completed Aug 25, 2022
@yjh576 yjh576 reopened this Aug 25, 2022
@yjh576 yjh576 closed this as completed Sep 29, 2022
yjh576 (author) commented Sep 29, 2022

Hi! I would like to ask about the calculation of the geometric evaluation metrics, i.e. CD and NC. The paper states that pseudo-ground-truth geometry is used when computing these metrics. How are these pseudo-ground-truth geometries obtained?

@yjh576 yjh576 reopened this Sep 29, 2022
taconite (owner) commented Sep 30, 2022

We used NeuS with all cameras to construct the pseudo-ground-truth. For reproducibility, please use the commit 2708e43ed71bcd18dc26b2a1a9a92ac15884111c - a minor bug in the background NeRF was fixed in a later commit; this shouldn't affect the final results in any significant way, but I did not test their new commit.

The official NeuS code should work out of the box for ZJU-MoCap data. You only need to preprocess ZJU's camera parameters. I've attached my preprocessing script below; the interface should be self-explanatory. Note that it works on the raw ZJU-MoCap dataset, not the dataset preprocessed by our script.

import numpy as np
import argparse
import os
import cv2

def parse_scan(subject_ind, output_dir, dataset_path, frame_ind):
    """Convert one frame of a raw ZJU-MoCap subject into NeuS's input layout."""
    subject_dir = os.path.join(dataset_path, 'CoreView_{}'.format(subject_ind))
    annots_file = os.path.join(subject_dir, 'annots.npy')
    annots = np.load(annots_file, allow_pickle=True).item()
    num_cams = len(annots['cams']['K'])

    cameras_new = {}
    for i in range(num_cams):
        Ki = np.array(annots['cams']['K'][i], dtype=np.float32)  # intrinsics
        Ri = np.array(annots['cams']['R'][i], dtype=np.float32)  # world-to-camera rotation
        ti = np.array(annots['cams']['T'][i], dtype=np.float32).reshape([3, 1]) / 1000.0  # translation, mm -> m
        Mi = np.concatenate([Ri, ti], axis=-1)
        curp = np.eye(4).astype(np.float32)
        curp[:3, :] = Ki @ Mi  # 3x4 projection matrix, padded to 4x4
        cameras_new['world_mat_{}'.format(i)] = curp.copy()
        # cameras_new['dist_coeff_{}'.format(i)] = np.array(annots['cams']['D'][i], dtype=np.float32).ravel()

    # annots['ims'] is 0-indexed while frame_ind is 1-based
    image_paths = [os.path.join(subject_dir, im) for im in annots['ims'][frame_ind-1]['ims']]
    mask_paths = [os.path.join(subject_dir, 'mask_cihp', im[:-4] + '.png') for im in annots['ims'][frame_ind-1]['ims']]

    image_dir = os.path.join(output_dir, '{:06d}'.format(frame_ind), 'image')
    if not os.path.exists(image_dir):
        os.makedirs(image_dir)

    mask_dir = os.path.join(output_dir, '{:06d}'.format(frame_ind), 'mask')
    if not os.path.exists(mask_dir):
        os.makedirs(mask_dir)

    # Save undistorted images and binarized masks
    for idx, (image_path, mask_path) in enumerate(zip(image_paths, mask_paths)):
        img = cv2.imread(image_path)  # BGR, which is also what imwrite expects
        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
        mask[mask != 0] = 255  # binarize: any non-zero label becomes foreground
        K = np.array(annots['cams']['K'][idx], dtype=np.float32)
        dist = np.array(annots['cams']['D'][idx], dtype=np.float32).ravel()
        img = cv2.undistort(img, K[:3, :3], dist, None)
        mask = cv2.undistort(mask, K[:3, :3], dist, None)
        cv2.imwrite(os.path.join(image_dir, '{:03d}.png'.format(idx)), img)
        cv2.imwrite(os.path.join(mask_dir, '{:03d}.png'.format(idx)), mask)

    # Compute the normalization matrix from the SMPL vertices.
    # Subjects 313 and 315 use 1-based vertex file names; all others are 0-based.
    smpl_filename = os.path.join(subject_dir, 'new_vertices', '{}.npy'.format(frame_ind if subject_ind in ['313', '315'] else frame_ind - 1))
    verts = np.load(smpl_filename)
    center = np.mean(verts, axis=0)
    radius = np.linalg.norm(verts - center, axis=-1).max() + 0.1  # small padding around the subject

    # scale_mat maps NeuS's normalized unit sphere to the subject's bounding sphere
    normalization = np.eye(4).astype(np.float32)
    normalization[:3, 3] = center
    normalization[0, 0] = normalization[1, 1] = normalization[2, 2] = radius

    for i in range(num_cams):
        cameras_new['scale_mat_{}'.format(i)] = normalization.copy()

    np.savez(
        os.path.join(output_dir, '{:06d}'.format(frame_ind), 'cameras_sphere.npz'),
        **cameras_new)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Parsing ZJU-MoCap')
    parser.add_argument('--dataset_path', type=str, default="/home/sfwang/Datasets_nfs/ZJU-MoCap",
                        help='dataset path')
    parser.add_argument('--output_dir', type=str, default="/home/sfwang/Datasets/NeuS/ZJUMoCap_313",
                        help='output directory')
    parser.add_argument('--subject_ind', type=int, default=313,
                        help='subject id')
    parser.add_argument('--frame_ind', type=int, default=1,
                        help='frame id (1-based)')

    args = parser.parse_args()
    parse_scan(str(args.subject_ind), args.output_dir, args.dataset_path, args.frame_ind)

yjh576 (author) commented Oct 1, 2022

Thanks!
