API
This section provides in-depth documentation of all the implemented functionality.

⚒️⚒️ This section is still under construction as I am trying to fix some issues with Sphinx and GH pages. ⚒️⚒️
This implementation is an extension of the approach presented in SeaThru-NeRF. This documentation describes the ideas, underlying equations, and network architecture of the approach in detail in the Introduction. Furthermore, it describes how to install this repository in the Installation section. Additionally, it provides some examples of how to use the code in the Usage section and presents some results that can be achieved with it in the Results section. Finally, it provides in-depth API documentation of the code in the API section.
Installation

This repository is built upon the nerfstudio library. This library has a few requirements, the most important being access to a CUDA-compatible GPU. Furthermore, for full functionality, it needs colmap, ffmpeg and tinycuda-nn. These requirements can be a bit tricky to install, and the installation is system dependent. The easiest way to install them is to follow the instructions here up to the point of actually installing nerfstudio.
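For reference, the commands below sketch one possible way to set up these dependencies before installing the fork. This follows the general nerfstudio setup flow, but the environment name, Python version, and package channels are assumptions; always match the PyTorch build to your CUDA version and prefer the official instructions linked above.

```bash
# Hypothetical environment setup; adapt versions to your system.
conda create -n seathru_nerf python=3.10 -y
conda activate seathru_nerf

# PyTorch first (pick the wheel matching your CUDA toolkit), then the tiny-cuda-nn bindings.
pip install torch torchvision
pip install ninja git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

# colmap and ffmpeg for data processing.
conda install -c conda-forge colmap ffmpeg -y
```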
For this approach, I extended the nerfstudio library. Therefore, please install my fork of nerfstudio, available here. This can be done via:
```bash
git clone https://github.com/acse-pms122/nerfstudio_dev.git
cd nerfstudio_dev
pip install -e .
```
Then, clone this repository and install it via:
```bash
cd ..
git clone https://github.com/ese-msc-2022/irp-pms122.git
cd irp-pms122
pip install -e .
```
Then, install the command-line completion via:
```bash
ns-install-cli
```
To check the installation, type:
```bash
ns-train seathru-nerf --help
```
If you see the help message, you are good to go! 🚀🚀🚀
This implementation requires a GPU with a CUDA-compatible driver. There are two model configurations, as summarised in the following table:
| Method | Description | Memory | Quality |
|---|---|---|---|
| seathru-nerf | Larger model, used to produce the results in the report | ~23 GB | Best |
| seathru-nerf-lite | Smaller model | ~7 GB | Good |
I recommend using the seathru-nerf method, as it was used to experiment and to produce the results presented in the paper. The seathru-nerf-lite method still produces good results, but has not been tested on all scenes. If you happen to run into a CUDA_OUT_MEMORY_ERROR, it is a sign that the available VRAM on your GPU is not enough. You can either use the smaller model, decrease the batch size, do both, or upgrade to a better GPU.
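For reference, here is a sketch of the first two mitigation options. The first command is grounded in the table above; the batch-size flag in the comment is an assumption about nerfstudio's standard datamanager options and may differ between versions, so check the help output for the exact name.

```bash
# Use the smaller model (~7 GB of VRAM according to the table above).
ns-train seathru-nerf-lite --vis wandb --data <path_to_Seathru_NeRF_dataset>/IUI3-RedSea

# Assumed flag for reducing the ray batch size (verify with `ns-train seathru-nerf --help`):
# ns-train seathru-nerf --pipeline.datamanager.train-num-rays-per-batch 2048 --data <path_to_Seathru_NeRF_dataset>/IUI3-RedSea
```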
Introduction

With Neural Radiance Fields (NeRFs), we can store a 3D scene as a continuous function. This idea was first introduced in the original NeRF publication [5]. Since then, the field has seen many advancements, some of them in the subsea domain. However, these advancements still have limitations. This implementation addresses some of those limitations and provides a modular and documented implementation of a subsea-specific NeRF that allows for easy modifications and experiments.
The fundamental principle underlying NeRFs is to represent a scene as a continuous function that maps a position, \(\mathbf{x} \in \mathbb{R}^{3}\), and a viewing direction, \(\boldsymbol{\theta} \in \mathbb{R}^{2}\), to a colour \(\mathbf{c} \in \mathbb{R}^{3}\) and volume density \(\sigma\). We can approximate this continuous scene representation with a simple Multi-Layer Perceptron (MLP), \(F_{\mathrm{\Theta}} : (\mathbf{x}, \boldsymbol{\theta}) \to (\mathbf{c},\sigma)\).
It is common to also use positional and directional encodings to improve the performance of NeRF approaches. Furthermore, there are various approaches for sampling points in the regions of a scene that are most relevant to the final image. A detailed explanation of the implemented architecture is given in the Network architecture section.
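To make the mapping \(F_{\mathrm{\Theta}}\) concrete, here is a minimal PyTorch sketch of such an MLP. It is purely illustrative: the network actually implemented here uses hash-grid and spherical-harmonics encodings and separate object/medium branches, as described in the Network architecture section.

```python
import torch
import torch.nn as nn

class ToyNeRFMLP(nn.Module):
    """Toy F_theta: maps a position and a (3D unit) view direction to (colour, density)."""

    def __init__(self, hidden_dim: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden_dim, 1)      # sigma
        self.colour_head = nn.Sequential(                 # c, conditioned on the view direction
            nn.Linear(hidden_dim + 3, hidden_dim // 2), nn.ReLU(),
            nn.Linear(hidden_dim // 2, 3), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor, d: torch.Tensor):
        h = self.trunk(x)                                  # (N, hidden_dim)
        sigma = torch.relu(self.density_head(h))           # (N, 1), non-negative density
        c = self.colour_head(torch.cat([h, d], dim=-1))    # (N, 3), colour in [0, 1]
        return c, sigma
```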
The authors of [3] combine the fundamentals of NeRFs with the following underwater image formation model proposed in [1]:

\[I = J \cdot e^{-\beta^D(\mathbf{v}_D) \cdot z} + B^\infty \cdot \left(1 - e^{-\beta^B(\mathbf{v}_B) \cdot z}\right)\]

where:
\(I\): image
\(J\): clear image (without any water effects like attenuation or backscatter)
\(B^\infty\): backscatter water colour at infinite depth
\(\beta^D(\mathbf{v}_D)\): attenuation coefficient [1]
\(\beta^B(\mathbf{v}_B)\): backscatter coefficient [1]
\(z\): camera range
This image formation model allows the model to separate the clean scene from the water effects, which is very useful since it allows water effects to be filtered out of a scene. Some results where this was achieved are shown in the Results section.
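To make the formation model concrete, the sketch below applies the equation above to a clear image, given per-channel water coefficients and a per-pixel range map. The function name and interface are hypothetical and not part of this repository's API.

```python
import numpy as np

def apply_water_effects(J: np.ndarray, z: np.ndarray, beta_D: np.ndarray,
                        beta_B: np.ndarray, B_inf: np.ndarray) -> np.ndarray:
    """I = J * exp(-beta_D * z) + B_inf * (1 - exp(-beta_B * z)).

    J:      clear image, shape (H, W, 3), values in [0, 1]
    z:      camera range per pixel, shape (H, W)
    beta_D: per-channel attenuation coefficients, shape (3,)
    beta_B: per-channel backscatter coefficients, shape (3,)
    B_inf:  backscatter colour at infinite depth, shape (3,)
    """
    z = z[..., None]                                   # (H, W, 1) for broadcasting over channels
    direct = J * np.exp(-beta_D * z)                   # attenuated signal from the scene
    backscatter = B_inf * (1.0 - np.exp(-beta_B * z))  # veiling light added by the water
    return direct + backscatter
```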
As NeRFs require a discrete and differentiable volumetric rendering equation, the authors of [3] propose the following formulation:

\[\hat{\boldsymbol{C}}(\mathbf{r}) = \sum_{i=1}^{N} \hat{\boldsymbol{C}}^{\text{obj}}_i(\mathbf{r}) + \hat{\boldsymbol{C}}^{\text{med}}_i(\mathbf{r})\]
This equation features an object and a medium part contributing towards the final rendered pixel colour \(\hat{\boldsymbol{C}}(\mathbf{r})\). Those two components are given by:

\[\hat{\boldsymbol{C}}^{\text{obj}}_i(\mathbf{r}) = T^{\text{obj}}_i \cdot \exp\!\left(-\boldsymbol{\sigma}^{\text{attn}} t_i\right) \cdot \left(1 - \exp\!\left(-\sigma^{\text{obj}}_i \delta_i\right)\right) \cdot \mathbf{c}^{\text{obj}}_i\]
\[\hat{\boldsymbol{C}}^{\text{med}}_i(\mathbf{r}) = T^{\text{obj}}_i \cdot \exp\!\left(-\boldsymbol{\sigma}^{\text{bs}} t_i\right) \cdot \left(1 - \exp\!\left(-\boldsymbol{\sigma}^{\text{bs}} \delta_i\right)\right) \cdot \mathbf{c}^{\text{med}}\]
with the accumulated object transmittance

\[T^{\text{obj}}_i = \exp\!\left(-\sum_{j=0}^{i-1} \sigma^{\text{obj}}_j \delta_j\right),\]

where \(t_i\) is the distance of sample \(i\) along the ray and \(\delta_i\) is the distance between adjacent samples.
The above equations contain five parameters that are used to describe the underlying scene: object density \(\sigma^{\text{obj}}_i \in \mathbb{R}^{1}\), object colour \(\mathbf{c}^{\text{obj}}_i \in \mathbb{R}^{3}\), backscatter density \(\boldsymbol{\sigma}^{\text{bs}} \in \mathbb{R}^{3}\), attenuation density \(\boldsymbol{\sigma}^{\text{attn}} \in \mathbb{R}^{3}\), and medium colour \(\mathbf{c}^{\text{med}} \in \mathbb{R}^{3}\).
I use the network discussed below to compute those five parameters that parametrize the underlying scene.
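As a sketch of how this discrete rendering could be evaluated for a single ray, given the five parameters above at N samples, consider the following (illustrative only; the repository's actual renderer is part of the nerfstudio pipeline):

```python
import torch

def render_ray(sigma_obj, c_obj, sigma_bs, sigma_attn, c_med, deltas, t):
    """Composite object and medium contributions along one ray.

    sigma_obj:  (N,)   object densities at the N samples
    c_obj:      (N, 3) object colours
    sigma_bs:   (3,)   backscatter densities (constant along the ray)
    sigma_attn: (3,)   attenuation densities
    c_med:      (3,)   medium colour
    deltas:     (N,)   distances between consecutive samples
    t:          (N,)   distances of the samples from the camera
    """
    # Object transmittance T_i = exp(-sum_{j<i} sigma_obj_j * delta_j)
    accum = torch.cumsum(sigma_obj * deltas, dim=0)
    T = torch.exp(-torch.cat([torch.zeros(1), accum[:-1]]))

    alpha_obj = 1.0 - torch.exp(-sigma_obj * deltas)                                  # (N,)
    C_obj = (T * alpha_obj)[:, None] * torch.exp(-sigma_attn * t[:, None]) * c_obj    # (N, 3)

    alpha_med = 1.0 - torch.exp(-sigma_bs * deltas[:, None])                          # (N, 3)
    C_med = T[:, None] * torch.exp(-sigma_bs * t[:, None]) * alpha_med * c_med        # (N, 3)

    return (C_obj + C_med).sum(dim=0)  # (3,) final pixel colour
```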
Network architecture

The network implemented for this approach has the following architecture:

(Figure: network architecture diagram)

The object network computes \(\sigma^{\text{obj}}_i\) and \(\mathbf{c}^{\text{obj}}_i\), while the medium network computes \(\boldsymbol{\sigma}^{\text{bs}}\), \(\boldsymbol{\sigma}^{\text{attn}}\) and \(\mathbf{c}^{\text{med}}\).
The proposal network is used to sample points in the regions of the scene that contribute most to the final image. This approach actually uses two proposal networks that are connected sequentially. More details on the concept of proposal samplers and how they are optimised during training can be found in [2].
For positional encoding, I use Hash Grid Encodings as proposed in [4], and for directional encoding I use Spherical Harmonics Encoding (SHE) as introduced in [6].
The MLPs in the object and medium networks are implemented using tinycuda-nn for performance reasons.
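For illustration, the encodings and a fused MLP can be instantiated roughly as follows with tiny-cuda-nn's PyTorch bindings. The configuration values are placeholders, not the settings used in this repository.

```python
import tinycudann as tcnn

# Hash grid encoding for positions (illustrative hyperparameters).
position_encoding = tcnn.Encoding(
    n_input_dims=3,
    encoding_config={
        "otype": "HashGrid",
        "n_levels": 16,
        "n_features_per_level": 2,
        "log2_hashmap_size": 19,
        "base_resolution": 16,
        "per_level_scale": 1.5,
    },
)

# Spherical harmonics encoding for view directions.
direction_encoding = tcnn.Encoding(
    n_input_dims=3,
    encoding_config={"otype": "SphericalHarmonics", "degree": 4},
)

# Fully fused MLP taking the encoded position as input.
mlp = tcnn.Network(
    n_input_dims=position_encoding.n_output_dims,
    n_output_dims=16,
    network_config={
        "otype": "FullyFusedMLP",
        "activation": "ReLU",
        "output_activation": "None",
        "n_neurons": 64,
        "n_hidden_layers": 2,
    },
)
```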
References
[1] Derya Akkaynak and Tali Treibitz. A revised underwater image formation model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[2] Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Mip-NeRF 360: unbounded anti-aliased neural radiance fields. CVPR, 2022.
[3] Deborah Levy, Amit Peleg, Naama Pearl, Dan Rosenbaum, Derya Akkaynak, Simon Korman, and Tali Treibitz. SeaThru-NeRF: neural radiance fields in scattering media. 2023. arXiv:2304.07743.
[4] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph., 41(4):102:1–102:15, July 2022. doi:10.1145/3528223.3530127.
[5] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.
[6] Dor Verbin, Peter Hedman, Ben Mildenhall, Todd Zickler, Jonathan T. Barron, and Pratul P. Srinivasan. Ref-NeRF: structured view-dependent appearance for neural radiance fields. CVPR, 2022.
Results

This section shows off some results that can be achieved with this repository. To get a sense of the capabilities and limitations of the implemented approach, have a look at some of them! 🔍
These are some results that show the reconstructed scene on the left and the clean scene on the right.
The implementation can also be used to render depthmaps and object weight accumulation maps of the scene. The following videos show the depthmaps on the left and the object weight accumulation maps on the right. From the object accumulation maps, we can nicely see that the model is able to separate between objects (red hue, indicating high accumulation) and the water (blue hue, indicating low accumulation).
The model also allows rendering only the backscatter of the scene, as well as the clear scene with attenuation effects still present.
Usage

As this approach is built upon nerfstudio, you can use the full functionality of the library. Therefore, please get familiar with its basic commands. You can find the documentation here.
Below, you can find some examples of how to use this library. I will run you through basic training and rendering commands. Additionally, I will include an example of how to use the feature of adding synthetic water to a scene.
The Seathru-NeRF dataset was used for all examples and results in this documentation. It is a good starting point to experiment with, and it can be downloaded here.
If you want to use your own dataset, please refer to the guide here.
🌟🌟🌟 It’s time to train your first subsea-NeRF! 🌟🌟🌟
To get an overview of the training options, you can use the help command:
```bash
ns-train seathru-nerf --help
```
The output should look something like this:
(Screenshot of the ns-train seathru-nerf --help output)

Note that the image above is cut off. If you run this command on your machine, you can see all the parameters you can specify when training the implemented subsea-NeRF. The default options should do fine on most scenes. One thing I strongly recommend is to use the --vis wandb option, as this will allow you to log training on W&B (there is also an option for tensorboard). If you specify this option, do not forget to provide your API key as well.
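Providing the key works through W&B's standard mechanisms (nothing specific to this repository):

```bash
# Interactive one-time login...
wandb login
# ...or set the key explicitly, e.g. for non-interactive runs on a cluster.
export WANDB_API_KEY=<your_api_key>
```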
If you want to train the implemented subsea-NeRF model on the IUI3-RedSea scene of the Seathru-NeRF dataset, which can be downloaded following the instructions above, you can use the following command:
```bash
ns-train seathru-nerf --vis wandb --data <path_to_Seathru_NeRF_dataset>/IUI3-RedSea
```
On your wandb page, you can then see something that looks like the following:

(Screenshot of the W&B dashboard during training)

All the panels can be used to inspect the training process. They are very informative and can give you a good sense of the progress of model training. Make sure to check them out! ✅
When specifying --vis viewer+wandb, you can additionally see a live view of the scene during the training process in the interactive viewer built into nerfstudio. See the documentation here and the instructional video provided by the nerfstudio team on how to use the viewer.
After having trained the subsea-NeRF, you can use it to render videos from arbitrary camera trajectories of the scene. Make sure to first locate the config.yml of the trained model, as you need to pass its path to the rendering script. This file can be found in the output folder created when training the NeRF. Due to the underlying image formation model, which allows us to separate between the objects and the water within a scene, you need to choose the kind of video you want to render. The following options exist:
rgb: To render the reconstructed scene.
J: To render the clear scene (water effects removed).
direct: To render the attenuated clear scene.
bs: To render the backscatter of the water within the scene.
depth: To render the depthmaps of the scene.
accumulation: To render the object weight accumulation maps of the scene.
For a detailed explanation of the arguments that can be specified when rendering, you can use the help command:
```bash
ns-render --help
```
If you want to render an RGB video of a scene where the camera trajectory is interpolated between the evaluation images of the dataset, the command looks similar to the following:
```bash
ns-render interpolate --load-config <path_to_config.yml> --rendered-output-names rgb --output-path <desired_path_for_output>
```
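Following the same pattern, rendering the water-free scene instead only requires swapping the output name (see the list of options above):

```bash
ns-render interpolate --load-config <path_to_config.yml> --rendered-output-names J --output-path <desired_path_for_output>
```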
Some results of example renderings are provided in the Results section.