
error about training code #2

Closed
yjh576 opened this issue Aug 19, 2022 · 6 comments

yjh576 commented Aug 19, 2022

I encountered an error when calling

trainer.fit(model=model, train_dataloaders=train_loader, val_dataloaders=val_loader, ckpt_path=checkpoint_path)

even though I have confirmed that the data loads successfully:

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'trimesh.caching.TrackedArray'>

taconite (owner) commented Aug 19, 2022

It seems one of the dataloader's outputs is of type trimesh.caching.TrackedArray instead of the expected numpy.ndarray. This is most likely caused by a wrong version of trimesh being installed. From the information given, I can't tell which specific item from the dataloader is wrong.

Can you check which version of trimesh you installed in your environment?

That said, the trimesh.caching.TrackedArray most likely comes from the following lines:

points_skinning, _ = smpl_mesh.sample(1024, return_index=True)

inside_points, face_idx = smpl_mesh.sample(4096, return_index=True)

Can you add points_skinning = np.array(points_skinning) and inside_points = np.array(inside_points) right after these two lines, respectively, and see if this resolves your issue?
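For reference, the conversion looks like this (a minimal sketch using a stand-in subclass, since the exact dataset code isn't shown here):

```python
import numpy as np

class TrackedArray(np.ndarray):
    """Stand-in for trimesh.caching.TrackedArray: an ndarray subclass whose
    module is not 'numpy', which is why default_collate rejects it."""
    pass

# Pretend these came from smpl_mesh.sample(..., return_index=True)
points_skinning = np.zeros((1024, 3)).view(TrackedArray)
inside_points = np.zeros((4096, 3)).view(TrackedArray)

# Cast back to plain ndarrays right after the sampling calls;
# np.asarray drops the subclass and returns a base-class ndarray
points_skinning = np.asarray(points_skinning)
inside_points = np.asarray(inside_points)

print(type(points_skinning) is np.ndarray)  # True
```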

yjh576 (author) commented Aug 19, 2022

One more question: does training with a single GPU versus four GPUs have any effect on the performance of the model?

taconite (owner) replied:
To reproduce the numbers reported in the paper you need 4 GPUs or, more specifically, a batch size of 4. The current implementation allows only one batch per GPU, so using a batch size of 4 is equivalent to using 4 GPUs.

Training with a single GPU (a batch size of 1) could result in degraded accuracy and may lead to unstable gradients during training.
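For illustration, the two configurations would look roughly like this with a PyTorch Lightning Trainer (a sketch only; the exact Trainer arguments in the repo may differ):

```python
import pytorch_lightning as pl

# Four GPUs, one sample each -> effective batch size of 4 (paper setting)
trainer = pl.Trainer(accelerator="gpu", devices=4, strategy="ddp")

# A single GPU gives an effective batch size of 1, which may degrade
# accuracy and destabilize gradients during training
trainer_single = pl.Trainer(accelerator="gpu", devices=1)
```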

@yjh576 yjh576 closed this as completed Aug 25, 2022
@yjh576 yjh576 reopened this Aug 25, 2022
@yjh576 yjh576 closed this as completed Sep 29, 2022
yjh576 (author) commented Sep 29, 2022

Hi! I would like to ask about the calculation of the geometric evaluation metrics, i.e. CD and NC. The paper states that pseudo-ground-truth geometry is used when computing these metrics. How are these pseudo-ground-truth geometries obtained?

@yjh576 yjh576 reopened this Sep 29, 2022
taconite (owner) commented Sep 30, 2022

We used NeuS with all cameras to construct the pseudo-ground-truth. For reproducibility, please use the commit 2708e43ed71bcd18dc26b2a1a9a92ac15884111c - a minor bug in the background NeRF was fixed in a later commit; this shouldn't affect the final results in any significant way, but I did not test their new commit.

The official NeuS code should work out of the box for ZJU-MoCap data. You only need to preprocess ZJU's camera parameters. I've attached my preprocessing script below; the interface should be self-explanatory. Note that it works on the raw ZJU-MoCap dataset, not the dataset preprocessed by our script.

import numpy as np
import argparse
import os
import cv2

def parse_scan(subject_ind, output_dir, dataset_path, frame_ind):
    """Convert one frame of a raw ZJU-MoCap subject into NeuS's input layout."""
    subject_dir = os.path.join(dataset_path, 'CoreView_{}'.format(subject_ind))
    annots_file = os.path.join(subject_dir, 'annots.npy')
    annots = np.load(annots_file, allow_pickle=True).item()
    num_cams = len(annots['cams']['K'])

    cameras_new = {}
    for i in range(num_cams):
        Ki = np.array(annots['cams']['K'][i], dtype=np.float32)  # intrinsics
        Ri = np.array(annots['cams']['R'][i], dtype=np.float32)  # world-to-camera rotation
        ti = np.array(annots['cams']['T'][i], dtype=np.float32).reshape([3, 1]) / 1000.0  # translation, mm -> m
        Mi = np.concatenate([Ri, ti], axis=-1)
        curp = np.eye(4).astype(np.float32)
        curp[:3, :] = Ki @ Mi  # 3x4 projection matrix, padded to 4x4
        cameras_new['world_mat_{}'.format(i)] = curp.copy()
        # cameras_new['dist_coeff_{}'.format(i)] = np.array(annots['cams']['D'][i], dtype=np.float32).ravel()

    # annots['ims'] is 0-indexed while frame_ind is 1-based
    image_paths = [os.path.join(subject_dir, im) for im in annots['ims'][frame_ind-1]['ims']]
    mask_paths = [os.path.join(subject_dir, 'mask_cihp', im[:-4] + '.png') for im in annots['ims'][frame_ind-1]['ims']]

    image_dir = os.path.join(output_dir, '{:06d}'.format(frame_ind), 'image')
    if not os.path.exists(image_dir):
        os.makedirs(image_dir)

    mask_dir = os.path.join(output_dir, '{:06d}'.format(frame_ind), 'mask')
    if not os.path.exists(mask_dir):
        os.makedirs(mask_dir)

    # Save undistorted images and binarized masks
    for idx, (image_path, mask_path) in enumerate(zip(image_paths, mask_paths)):
        img = cv2.imread(image_path)  # BGR, which is also what imwrite expects
        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
        mask[mask != 0] = 255  # binarize: any non-zero label becomes foreground
        K = np.array(annots['cams']['K'][idx], dtype=np.float32)
        dist = np.array(annots['cams']['D'][idx], dtype=np.float32).ravel()
        img = cv2.undistort(img, K[:3, :3], dist, None)
        mask = cv2.undistort(mask, K[:3, :3], dist, None)
        cv2.imwrite(os.path.join(image_dir, '{:03d}.png'.format(idx)), img)
        cv2.imwrite(os.path.join(mask_dir, '{:03d}.png'.format(idx)), mask)

    # Compute the normalization matrix from the SMPL vertices.
    # Subjects 313 and 315 use 1-based vertex file names; all others are 0-based.
    smpl_filename = os.path.join(subject_dir, 'new_vertices', '{}.npy'.format(frame_ind if subject_ind in ['313', '315'] else frame_ind - 1))
    verts = np.load(smpl_filename)
    center = np.mean(verts, axis=0)
    radius = np.linalg.norm(verts - center, axis=-1).max() + 0.1  # small padding around the subject

    # scale_mat maps NeuS's normalized unit sphere to the subject's bounding sphere
    normalization = np.eye(4).astype(np.float32)
    normalization[:3, 3] = center
    normalization[0, 0] = normalization[1, 1] = normalization[2, 2] = radius

    for i in range(num_cams):
        cameras_new['scale_mat_{}'.format(i)] = normalization.copy()

    np.savez(
        os.path.join(output_dir, '{:06d}'.format(frame_ind), 'cameras_sphere.npz'),
        **cameras_new)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Parsing ZJU-MoCap')
    parser.add_argument('--dataset_path', type=str, default="/home/sfwang/Datasets_nfs/ZJU-MoCap",
                        help='dataset path')
    parser.add_argument('--output_dir', type=str, default="/home/sfwang/Datasets/NeuS/ZJUMoCap_313",
                        help='output directory')
    parser.add_argument('--subject_ind', type=int, default=313,
                        help='subject id')
    parser.add_argument('--frame_ind', type=int, default=1,
                        help='frame id (1-based)')

    args = parser.parse_args()
    parse_scan(str(args.subject_ind), args.output_dir, args.dataset_path, args.frame_ind)

yjh576 (author) commented Oct 1, 2022

Thanks!
