bounds and downsampling factor for load_llff_data_multi_view #11
Thank you for your kind words!

1. Yes, `min_bound` and `max_bound` are the same across all cameras. They are in world units. For multi-view, there is no heuristic implemented to determine good values; instead, you'd have to come up with your own heuristic (e.g. for a spherical camera setup, the maximum distance between any two cameras might be a good first guess).
2. You can first try to just set `factor=4`, for example, and the code at https://github.com/facebookresearch/nonrigid_nerf/blob/main/train.py#L1354 will take care of adjusting the calibration (namely the focal length and center of the intrinsics). Extrinsics don't need to be adjusted. If that doesn't work, store the correct (downsampled) values for focal and center in `calibration.json` and use `factor=1`.

Hope that helps!
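For intuition, the downsampling adjustment boils down to dividing the focal length and the principal point by the factor. A minimal sketch, with illustrative names rather than the repository's actual API (see train.py#L1354 for the real logic):

```python
# Minimal sketch of the intrinsics adjustment for a downsampling factor.
# Function and variable names are illustrative; see train.py#L1354.

def scale_intrinsics(focal, center_x, center_y, factor):
    """Adjust pinhole intrinsics for images downsampled by `factor`.

    Pixel coordinates shrink linearly with resolution, so the focal
    length and principal point are divided by the factor. Extrinsics
    describe the camera pose in world space and are unaffected.
    """
    return focal / factor, center_x / factor, center_y / factor

# Example: a calibration for 1920x1080 images, downsampled by factor=4.
focal, cx, cy = scale_intrinsics(1600.0, 960.0, 540.0, factor=4)
print(focal, cx, cy)  # 400.0 240.0 135.0
```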
1. Yes, that's how `min_bound` and `max_bound` are also obtained in the monocular setting; it seems like a very reasonable heuristic to me if you can get colmap to run on all your images. If you cannot use all images, make sure that the images from the single time step still cover the full depth of the scene (I'd think that is usually the case). A sketch of this heuristic follows the quoted questions below.
2. Yes, they are the near and far plane distances for volume rendering.
3. Setting the near plane to 0 might lead to artifacts, because the NeRF can place artifacts right in front of the camera that are practically not visible from other cameras. It's an okay-ish heuristic if there's no better alternative. The far plane distance should not be very large either, because the 64 coarse samples are evenly spaced along the ray, which leads to very large gaps between samples if the far plane is unreasonable (see the spacing sketch below).

On Nov 27, 2021, Chonghyuk Song wrote:
> Thank you for the swift response!
> I have just a few more follow-up questions:
>
> 1. Do you think using `min_bounds` and `max_bounds` in the `poses_bounds.npy` file generated by running colmap as follows (https://colmap.github.io/faq.html#reconstruct-sparse-dense-model-from-known-camera-poses) constitutes a good heuristic for multi-view? I ran colmap on multi-view images from a single timestep to estimate the 3D points, and used the 1% and 99% percentile depth values to define the `min_bounds` and `max_bounds` for each camera; the shared `min_bound` and `max_bound` would then become the minimum and maximum, respectively, across all cameras.
> 2. Where are `min_bound` and `max_bound` used? Are they used as the integration bounds for volume rendering?
> 3. If so, what is the harm of heuristically setting `min_bound` to 0 and `max_bound` to a very large number?
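To make point 3 of the answer concrete, a small numpy sketch of how the coarse sample spacing grows with the far plane (the near/far values here are made up for illustration):

```python
import numpy as np

# Why a huge far plane hurts: the 64 coarse samples are spread evenly
# between the near and far planes, so the gap between neighboring
# samples grows with (far - near). Values below are made up.
for near, far in [(0.5, 10.0), (0.0, 1000.0)]:
    z_vals = np.linspace(near, far, 64)
    print(f"near={near}, far={far}: sample spacing = {z_vals[1] - z_vals[0]:.2f}")
# near=0.5, far=10.0: sample spacing = 0.15
# near=0.0, far=1000.0: sample spacing = 15.87 (thin geometry falls between samples)
```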
I have never tried running the multi-view code with rendering. The spiral rendering might be too sensitive; you could try the static or input-reconstruction rendering instead. Changing to `load_llff_data_multi_view` sounds reasonable, but again, I have not tried that part.
First of all, thank you for releasing your impactful work!

I'm trying to train NRNeRF on multi-view data from 8 synchronized cameras with known intrinsics and extrinsics, and I ran into a couple of questions regarding the bounds and the downsampling factor.

1. Are the parameters `min_bound` and `max_bound` defined as the minimum and maximum across all cameras? I noticed that in the README.md there is a single `min_bound` and `max_bound` shared between all cameras when specifying `calibration.json`, as opposed to there being one for each camera.
2. When using `load_llff_data_multi_view`, if our training images are downsampled from their original resolution by a certain factor, are there any parts of `calibration.json` (i.e. the camera intrinsics/extrinsics) we have to adjust accordingly to account for the downsampling factor? I'm asking because downsampling images by a `factor` is not implemented in `load_llff_data_multi_view`, but `load_llff_data` appears to use `factor` in a couple of places (https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/load_llff.py#L76, https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/load_llff.py#L103).

Thank you in advance for reading this long question. I look forward to reading your response.