.. _api-label:

API
===

This section provides in-depth documentation of all the implemented functionality.

⚒️⚒️ This section is still under construction while I fix some issues with Sphinx and GitHub Pages. ⚒️⚒️

.. automodule:: seathru
   :members:
Welcome to the Subsea-NeRF documentation!
==========================================

This implementation is an extension of the approach presented in `SeaThru-NeRF <https://sea-thru-nerf.github.io/>`_.
This documentation describes the ideas, the underlying equations, and the network architecture of the approach in
detail in :ref:`intro-label`. It also explains how to install this repository in :ref:`installation-label`, gives
examples of how to use the code in :ref:`usage-label`, presents some results that can be achieved with it in
:ref:`results-label`, and provides in-depth API documentation in :ref:`api-label`.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   intro
   installation
   usage
   results
   api

Indices and tables
==================

* :ref:`genindex`
* :ref:`search`
.. _installation-label:

Installation
============

This repository is built upon the `nerfstudio <https://github.com/nerfstudio-project/nerfstudio/>`_
library. That library has a few requirements, the most important being access to a CUDA-compatible GPU.
For full functionality, it also needs `colmap <https://colmap.github.io/>`_,
`ffmpeg <https://ffmpeg.org/>`_ and `tinycuda-nn <https://github.com/NVlabs/tiny-cuda-nn>`_.
These requirements can be tricky to install, and the installation is system-dependent.
The easiest way to install them is to follow the instructions
`here <https://docs.nerf.studio/en/latest/quickstart/installation.html>`__
up to the point of actually installing nerfstudio.
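
For orientation, a typical setup following those instructions looks roughly like the sketch below. Versions and CUDA specifics vary between systems, so treat the linked instructions as authoritative:

.. code-block:: bash

   # Sketch of a typical environment setup, following the nerfstudio guide.
   # Exact Python/CUDA/PyTorch versions depend on your system -- check the
   # linked instructions before running these commands.
   conda create --name nerfstudio -y python=3.8
   conda activate nerfstudio
   pip install --upgrade pip
   pip install torch torchvision
   pip install ninja git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch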

For this approach, I extended the nerfstudio library. Therefore, please install my fork of
nerfstudio, available `here <https://github.com/acse-pms122/nerfstudio_dev>`__. This can be
done via:

.. code-block:: bash

   git clone https://github.com/acse-pms122/nerfstudio_dev.git
   cd nerfstudio_dev
   pip install -e .

Then, clone this repository and install it via:

.. code-block:: bash

   cd ..
   git clone https://github.com/ese-msc-2022/irp-pms122.git
   cd irp-pms122
   pip install -e .

Then, install the command-line completion via:

.. code-block:: bash

   ns-install-cli

To check the installation, type:

.. code-block:: bash

   ns-train seathru-nerf --help

If you see the help message, you are good to go! 🚀🚀🚀

Requirements
************

This implementation requires a GPU with a CUDA-compatible driver. There are two model
configurations, summarised in the following table:

.. list-table::
   :header-rows: 1
   :widths: 20 40 10 10

   * - Method
     - Description
     - Memory
     - Quality
   * - ``seathru-nerf``
     - Larger model, used to produce the results in the report
     - ~23 GB
     - Best
   * - ``seathru-nerf-lite``
     - Smaller model
     - ~7 GB
     - Good

I recommend using the ``seathru-nerf`` method, as it was the one used to experiment and to produce the results presented
in the paper. The ``seathru-nerf-lite`` model still produces good results, but has not been tested on all scenes. If you
run into a CUDA out-of-memory error, the available VRAM on your GPU is not sufficient. You can use the smaller model,
decrease the batch size, do both, or upgrade to a better GPU.
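
For example, the batch size can typically be reduced from the command line. The flag below follows nerfstudio's configuration conventions, but the exact name may differ between versions, so verify it with ``--help`` first:

.. code-block:: bash

   # Assumed flag name following nerfstudio conventions -- verify with
   # ``ns-train seathru-nerf-lite --help`` before use.
   ns-train seathru-nerf-lite --pipeline.datamanager.train-num-rays-per-batch 4096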
.. _intro-label:

Introduction
============

With Neural Radiance Fields (NeRFs), we can store a 3D scene as a continuous function.
This idea was first introduced in the original NeRF publication :cite:`nerf`.
Since then, the field has seen many advancements, some of them in the subsea domain.
However, these advancements still have some limitations. This implementation addresses some of those limitations
and provides a modular, documented implementation of a subsea-specific NeRF that allows for easy modifications and experiments.

Approach
********
The fundamental principle underlying NeRFs is to represent a scene as a continuous function that maps a position,
:math:`\mathbf{x} \in \mathbb{R}^{3}`, and a viewing direction, :math:`\boldsymbol{\theta} \in \mathbb{R}^{2}`,
to a colour :math:`\mathbf{c} \in \mathbb{R}^{3}` and a volume density :math:`\sigma`. We can approximate this
continuous scene representation with a simple Multi-Layer Perceptron (MLP)
:math:`F_{\mathrm{\Theta}} : (\mathbf{x}, \boldsymbol{\theta}) \to (\mathbf{c},\sigma)`.

It is common to also use positional and directional encodings to improve the performance of NeRF approaches. Furthermore,
there are various approaches for sampling points in the regions of a scene that are most relevant to the final image. A
detailed explanation of the implemented architecture is given in the :ref:`architecture-label` section.
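
To make this mapping concrete, the following is a minimal, illustrative PyTorch sketch of such an MLP. It is not the network used in this repository (see :ref:`architecture-label`); the layer sizes are arbitrary, and the viewing direction is passed as a 3D unit vector rather than the two angles used in the notation above.

.. code-block:: python

   import torch
   from torch import nn

   class TinyNeRF(nn.Module):
       """Illustrative F_theta: (x, theta) -> (c, sigma)."""

       def __init__(self, hidden: int = 64):
           super().__init__()
           # Raw 3D position + 3D viewing direction; real NeRFs encode both first.
           self.mlp = nn.Sequential(
               nn.Linear(3 + 3, hidden), nn.ReLU(),
               nn.Linear(hidden, hidden), nn.ReLU(),
               nn.Linear(hidden, 3 + 1),  # 3 colour channels + 1 density
           )

       def forward(self, x: torch.Tensor, d: torch.Tensor):
           out = self.mlp(torch.cat([x, d], dim=-1))
           colour = torch.sigmoid(out[..., :3])  # c in [0, 1]^3
           sigma = torch.relu(out[..., 3])       # sigma >= 0
           return colour, sigma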

Image formation model
---------------------
The authors of :cite:`seathru_nerf` combine the fundamentals of NeRFs with the following underwater image formation model
proposed in :cite:`seathru`:

.. math::
   I = \overbrace{\underbrace{J}_{\text{colour}} \cdot \underbrace{(e^{-\beta^D(\mathbf{v}_D)\cdot z})}_{\text{attenuation}}}^{\text{direct}} + \overbrace{\underbrace{B^\infty}_{\text{colour}} \cdot \underbrace{(1 - e^{-\beta^B(\mathbf{v}_B)\cdot z})}_{\text{attenuation}}}^{\text{backscatter}}

where:

* :math:`I` is the observed image,
* :math:`J` is the clear image (without any water effects like attenuation or backscatter),
* :math:`B^\infty` is the backscatter water colour at infinite depth,
* :math:`\beta^D(\mathbf{v}_D)` is the attenuation coefficient [#f1]_,
* :math:`\beta^B(\mathbf{v}_B)` is the backscatter coefficient [#f1]_, and
* :math:`z` is the camera range.

This image formation model allows the model to separate the clean scene from the water effects. This is very useful,
since it allows water effects to be filtered out of a scene. Some results where this was achieved are shown in the
:ref:`results-label` section.
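
Translating this model into code is direct. The following sketch evaluates it per pixel; all coefficient values below are invented purely for demonstration:

.. code-block:: python

   # Illustrative per-pixel evaluation of the underwater image formation model.
   # The coefficient values are made up for demonstration purposes.
   import numpy as np

   def underwater_image(J, z, beta_D, beta_B, B_inf):
       """I = J * exp(-beta_D * z) + B_inf * (1 - exp(-beta_B * z))."""
       direct = J * np.exp(-beta_D * z)                    # attenuated scene colour
       backscatter = B_inf * (1.0 - np.exp(-beta_B * z))   # added water colour
       return direct + backscatter

   J = np.array([0.8, 0.4, 0.3])          # clear (water-free) pixel colour, RGB
   B_inf = np.array([0.07, 0.2, 0.39])    # backscatter colour at infinite depth
   beta_D = np.array([0.9, 0.35, 0.25])   # per-channel attenuation coefficients
   beta_B = np.array([0.95, 0.5, 0.4])    # per-channel backscatter coefficients
   print(underwater_image(J, z=5.0, beta_D=beta_D, beta_B=beta_B, B_inf=B_inf))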

Rendering equations
-------------------
As NeRFs require a discrete and differentiable volumetric rendering equation, the authors of :cite:`seathru_nerf` propose
the following formulation:

.. math::
   \hat{\boldsymbol{C}}(\mathbf{r}) = \sum_{i=1}^N \hat{\boldsymbol{C}}^{\text{obj}}_i(\mathbf{r}) + \sum_{i=1}^N \hat{\boldsymbol{C}}^{\text{med}}_i(\mathbf{r})

This equation features an object part and a medium part, both contributing towards the final rendered pixel
colour :math:`\hat{\boldsymbol{C}}(\mathbf{r})`. Those two components are given by:

.. math::
   \hat{\boldsymbol{C}}^{\text{obj}}_i(\mathbf{r}) = T^{\text{obj}}_i \cdot \exp (-\boldsymbol{\sigma}^{\text{attn}} t_i) \cdot \left(1 - \exp({-\sigma^{\text{obj}}_i \delta_i})\right) \cdot \mathbf{c}^{\text{obj}}_i

.. math::
   \hat{\boldsymbol{C}}^{\text{med}}_i(\mathbf{r}) = T^{\text{obj}}_i \cdot \exp (-\boldsymbol{\sigma}^{\text{bs}} t_i) \cdot \left(1 - \exp({-\boldsymbol{\sigma}^{\text{bs}} \delta_i})\right) \cdot \mathbf{c}^{\text{med}}

with

.. math::
   T^{\text{obj}}_i = \exp\left(-\sum_{j=0}^{i-1}\sigma^{\text{obj}}_j\delta_j\right)

The above equations contain five parameters that describe the underlying scene:
the object density :math:`\sigma^{\text{obj}}_i \in \mathbb{R}^{1}`, the object colour
:math:`\mathbf{c}^{\text{obj}}_i \in \mathbb{R}^{3}`, the backscatter density
:math:`\boldsymbol{\sigma}^{\text{bs}} \in \mathbb{R}^{3}`, the attenuation density
:math:`\boldsymbol{\sigma}^{\text{attn}} \in \mathbb{R}^{3}`, and the medium colour
:math:`\mathbf{c}^{\text{med}} \in \mathbb{R}^{3}`.

I use the network discussed below to compute these five parameters that parametrize the underlying scene.
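
As a sketch of how these sums translate into code, here is a simplified, single-ray evaluation in plain PyTorch (the actual implementation is vectorized over rays and batches):

.. code-block:: python

   # Simplified, single-ray evaluation of the object and medium rendering sums.
   # For clarity only -- not the optimized implementation used in this repository.
   import torch

   def render_ray(sigma_obj, c_obj, sigma_bs, sigma_attn, c_med, t, delta):
       """sigma_obj, t, delta: (N,); c_obj: (N, 3); sigma_bs, sigma_attn, c_med: (3,)."""
       # T_i^obj = exp(-sum_{j<i} sigma_j^obj * delta_j), with T_0^obj = 1.
       accum = torch.cumsum(sigma_obj * delta, dim=0)
       T_obj = torch.exp(-torch.cat([torch.zeros(1), accum[:-1]]))

       alpha_obj = 1.0 - torch.exp(-sigma_obj * delta)                 # (N,)
       C_obj = (T_obj[:, None] * torch.exp(-sigma_attn[None] * t[:, None])
                * alpha_obj[:, None] * c_obj)                          # (N, 3)

       alpha_med = 1.0 - torch.exp(-sigma_bs[None] * delta[:, None])   # (N, 3)
       C_med = (T_obj[:, None] * torch.exp(-sigma_bs[None] * t[:, None])
                * alpha_med * c_med[None])                             # (N, 3)

       return C_obj.sum(dim=0) + C_med.sum(dim=0)  # final pixel colour, (3,)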

.. _architecture-label:

Network architecture
--------------------

The network implemented for this approach has the following architecture:

.. image:: media/my_architecture.png
   :align: center
   :alt: Network architecture

.. raw:: html

   <br>

The object network computes :math:`\sigma^{\text{obj}}_i` and :math:`\mathbf{c}^{\text{obj}}_i`, while the
medium network computes :math:`\boldsymbol{\sigma}^{\text{bs}}`, :math:`\boldsymbol{\sigma}^{\text{attn}}` and
:math:`\mathbf{c}^{\text{med}}`.

The proposal network is used to sample points in the regions of the scene that contribute most to the final image. This
approach actually uses two proposal networks that are connected sequentially. More details on the concept of proposal
samplers and how they are optimized during training can be found in :cite:`mipnerf360`.

For positional encoding, I use Hash Grid Encodings as proposed in :cite:`instant-ngp`, and for directional
encoding I use the Spherical Harmonics Encoding (SHE) introduced in :cite:`refnerf`.

The MLPs in the object and medium networks are implemented using `tinycuda-nn <https://github.com/NVlabs/tiny-cuda-nn>`_ for
performance reasons.
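
For illustration, constructing the two encodings with tinycuda-nn's Python bindings looks roughly like this; the hyperparameters shown are typical Instant-NGP defaults, not necessarily the values used in this repository:

.. code-block:: python

   # Sketch of the two encodings via tinycuda-nn's Python bindings (requires a
   # CUDA GPU). Hyperparameters are typical Instant-NGP defaults, not
   # necessarily the values used here.
   import tinycudann as tcnn

   position_encoding = tcnn.Encoding(
       n_input_dims=3,
       encoding_config={
           "otype": "HashGrid",
           "n_levels": 16,
           "n_features_per_level": 2,
           "log2_hashmap_size": 19,
           "base_resolution": 16,
           "per_level_scale": 1.4472692012786865,
       },
   )
   direction_encoding = tcnn.Encoding(
       n_input_dims=3,
       # Directions should be mapped from [-1, 1] to [0, 1] before encoding.
       encoding_config={"otype": "SphericalHarmonics", "degree": 4},
   )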

.. rubric:: Footnotes

.. [#f1] Those depend on range, object reflectance, the spectrum of ambient light, the camera's spectral response, and the physical scattering and beam attenuation coefficients of the water, all of which are wavelength-dependent.

.. rubric:: References

.. bibliography:: references.bib
   :style: plain
.. _results-label:

Results
=======

This section shows some results that can be achieved with this repository. To get a sense of the capabilities and
limitations of the implemented approach, have a look at some of them! 🔍

Renderings
**********

These results show the reconstructed scene on the left and the clean scene on the right.

.. raw:: html

   <details>
   <summary>Click to see renderings of IUI3-RedSea</summary>
   <iframe width="560" height="315" src="https://www.youtube.com/embed/EjNc-o3LptY?si=wXmoviLHrwdfTHDc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
   </details>

.. raw:: html

   <details>
   <summary>Click to see renderings of Curasao</summary>
   <iframe width="560" height="315" src="https://www.youtube.com/embed/PO6vF60tdDQ?si=C0c-VQ_r6nf53EBn" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
   </details>

.. raw:: html

   <details>
   <summary>Click to see renderings of JapaneseGardens-RedSea</summary>
   <iframe width="560" height="315" src="https://www.youtube.com/embed/EMFzwOsSNKQ?si=SvmtJUpADdgQFSGR" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
   </details>

.. raw:: html

   <br>

The implementation can also be used to render depth maps and object weight accumulation maps of the scene. The following
videos show the depth maps on the left and the object weight accumulation maps on the right. From the object accumulation
maps, we can clearly see that the model is able to separate objects (red hue, indicating high accumulation) from the water
(blue hue, indicating low accumulation).

.. raw:: html

   <details>
   <summary>Click to see renderings of IUI3-RedSea</summary>
   <iframe width="560" height="315" src="https://www.youtube.com/embed/UKqaSy9rJnA?si=3v9WdqxpmvvpEBPO" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
   </details>

.. raw:: html

   <details>
   <summary>Click to see renderings of Curasao</summary>
   <iframe width="560" height="315" src="https://www.youtube.com/embed/qEBe0dL4kMc?si=3cnfkPgyKBBgF13Y" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
   </details>

.. raw:: html

   <details>
   <summary>Click to see renderings of JapaneseGardens-RedSea</summary>
   <iframe width="560" height="315" src="https://www.youtube.com/embed/YW33VPZYGg0?si=xKS55gbN4W8rByQ0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
   </details>

.. raw:: html

   <br>

The model also allows rendering only the backscatter of the scene, as well as the clear scene with only attenuation effects applied.