In this work, we adapt 3RScan, a recently introduced RGB-D dataset of changing indoor environments, to create RIO10, a new long-term camera re-localization benchmark focused on indoor scenes. RIO10 consists of 74 sequences. We provide splits into training, validation (one sequence per scene) and testing sets, leaving us with 10 train, 10 validation and 54 test sequences overall; see the corresponding metadata.json.
If you would like to download the RIO10 data, please fill out this form.
Sequences in RIO10 follow the naming convention seq<scene_id>_<scan_id>. The scenes (seq<scene_id>.zip), models (models<scene_id>.zip) and semantics (semantics<scene_id>.zip) are stored per scene and consist of multiple sequences, e.g.:
seq02.zip
├── seq02_01
│ ├── frame-000000.color.jpg
│ ├── frame-000000.pose.txt
│ ├── frame-000000.rendered.depth.png
│ ├── frame-000001.color.jpg
│ ├── frame-000001.pose.txt
│ ├── frame-000001.rendered.depth.png
│ └── ...
├── seq02_02
│ ├── frame-000000.color.jpg
│ ├── frame-000000.pose.txt
│ ├── frame-000000.rendered.depth.png
│ └── ...
├── seq02_03
│ ├── frame-000000.color.jpg
│ ├── frame-000000.rendered.depth.png
│ ├── frame-000001.color.jpg
│ ├── frame-000001.rendered.depth.png
│ └── ...
├── seq02_04
└── ...
semantics02.zip
├── seq02_01
│ ├── instances.txt
│ ├── frame-000000.instances.png
│ ├── frame-000001.instances.png
│ └── ...
└── seq02_02
models02.zip
├── seq02_01
│ ├── labels.ply
│ ├── mesh.obj
│ ├── mesh.refined_0.png
│ └── mesh.refined.mtl
├── seq02_02
├── seq02_03
├── seq02_04
└── ...
Sequence seq<scene_id>_01 is always the training sequence (with *.color.jpg, *.pose.txt and *.rendered.depth.png), and seqXX_02 is the validation sequence (with *.color.jpg, *.pose.txt and *.rendered.depth.png). Please note that we do not provide the ground truth (including semantics) for our hidden test set: all seqXX_03+ are hidden sequences (only *.color.jpg and *.rendered.depth.png).
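As a minimal sketch of the naming convention and split rule above, the following Python helper parses a sequence name and maps its scan id to a split. The function names are illustrative and not part of the RIO10 code; metadata.json remains the authoritative source for the splits.

```python
import re

# Naming convention from the text: seq<scene_id>_<scan_id>, e.g. "seq02_03".
SEQ_PATTERN = re.compile(r"^seq(\d+)_(\d+)$")


def parse_sequence_name(name):
    """Return (scene_id, scan_id) as integers for a name like 'seq02_03'."""
    match = SEQ_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a RIO10 sequence name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def split_of(name):
    """Map a sequence name to its split: scan 01 is train, 02 is validation, later scans are test."""
    _, scan_id = parse_sequence_name(name)
    if scan_id == 1:
        return "train"
    if scan_id == 2:
        return "validation"
    return "test"


print(split_of("seq02_01"), split_of("seq02_02"), split_of("seq02_03"))
```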
The poses (*.pose.txt) of the test (rescan) sequences are aligned with the train (reference) and validation scans. Each pose transforms from the camera coordinate system to the world coordinate system. Please use our evaluation service to compute your final performance score on the hidden test set.
The corresponding camera calibration parameters (fx, fy, cx, cy) can be found here: intrinsics.txt.
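The sketch below illustrates how a pose and the intrinsics could be combined to lift a pixel with known depth into world coordinates. It assumes that a *.pose.txt file stores a 4x4 camera-to-world matrix as 16 whitespace-separated numbers in row-major order, a standard pinhole model, and depth already converted to metric units; none of these details is stated above, so please check them against the data. The intrinsic values in the usage example are placeholders, not the real calibration.

```python
def parse_pose(text):
    """Parse 16 whitespace-separated numbers into a 4x4 matrix (assumed file layout)."""
    values = [float(v) for v in text.split()]
    if len(values) != 16:
        raise ValueError(f"expected 16 values, got {len(values)}")
    return [values[r * 4:(r + 1) * 4] for r in range(4)]


def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with depth along the optical axis to a camera-space point."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def camera_to_world(pose, point_cam):
    """Apply the camera-to-world transform to a 3D point given in camera coordinates."""
    x, y, z = point_cam
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in pose[:3])


# Usage with placeholder values: identity rotation, translation (1, 2, 3).
pose = parse_pose("1 0 0 1  0 1 0 2  0 0 1 3  0 0 0 1")
point_cam = backproject(u=820, v=240, depth=1.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
point_world = camera_to_world(pose, point_cam)
print(point_cam, point_world)
```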
We also provide our dataset in kapture format. Examples of visual localization pipelines with kapture can be found here.
Finally, scene stats are available here: stats.txt and are structured as follows:
frame NSSD NCCOEFF SemanticChange GeometricChange VoL Context PoseNovelity
seq09_02/frame-000000 0.571 0.531 0.312 296.868 72.369 1.58 316.253
seq09_02/frame-000001 0.571 0.531 0.312 296.882 72.847 1.581 316.264
seq09_02/frame-000002 0.571 0.531 0.312 296.995 71.599 1.583 316.413
seq09_02/frame-000003 0.571 0.531 0.312 297.063 73.363 1.582 316.497
seq09_02/frame-000004 0.571 0.531 0.313 297.125 72.429 1.581 316.543
...
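A minimal sketch for reading this file into Python, assuming it is whitespace-separated with a single header row as in the excerpt above. The function name is hypothetical; the column names are taken verbatim from the header.

```python
def parse_stats(lines):
    """Return {frame_key: {column_name: value}} from the lines of stats.txt."""
    header = None
    rows = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if header is None:
            header = parts[1:]  # drop the leading "frame" column name
            continue
        if len(parts) != len(header) + 1:
            continue  # skip malformed or elided lines such as "..."
        rows[parts[0]] = dict(zip(header, (float(p) for p in parts[1:])))
    return rows


sample = [
    "frame NSSD NCCOEFF SemanticChange GeometricChange VoL Context PoseNovelity",
    "seq09_02/frame-000000 0.571 0.531 0.312 296.868 72.369 1.58 316.253",
    "seq09_02/frame-000001 0.571 0.531 0.312 296.882 72.847 1.581 316.264",
    "...",
]
stats = parse_stats(sample)
print(stats["seq09_02/frame-000000"]["VoL"])
```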
To run our Python code, we recommend using a virtual environment. Our ./setup.sh script installs dependencies and creates the necessary data structure by downloading files (depth images for validation and example predictions).
python3 -m venv env
source env/bin/activate
./setup.sh
In this work, we propose a new metric DCRE for evaluating camera re-localization. We also examine in detail how different types of scene change affect the performance of different methods, based on different ways of detecting such changes in a given RGB-D frame (see stats).
To evaluate your method on the validation set, simply save your prediction results in a .txt file of the following format:
# scene-id/frame-id qw qx qy qz tx ty tz
seq02_03/frame-000000 0.468 -0.841 -0.238 0.123 -2.371 -0.590 0.044
seq02_03/frame-000001 0.461 -0.839 -0.252 0.134 -2.388 -0.586 0.043
seq02_03/frame-000002 0.454 -0.836 -0.267 0.146 -2.408 -0.582 0.042
seq02_03/frame-000003 0.445 -0.833 -0.286 0.157 -2.432 -0.577 0.040
seq02_03/frame-000004 0.438 -0.830 -0.302 0.163 -2.452 -0.573 0.038
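The following sketch shows one way to write and read lines in this format. The helper names and the `decimals` parameter are illustrative assumptions; the evaluation code may accept other precisions.

```python
def format_prediction(frame_id, quaternion, translation, decimals=6):
    """Build one line: '<scene-id>/<frame-id> qw qx qy qz tx ty tz'."""
    qw, qx, qy, qz = quaternion
    tx, ty, tz = translation
    values = (qw, qx, qy, qz, tx, ty, tz)
    return " ".join([frame_id] + [f"{v:.{decimals}f}" for v in values])


def parse_prediction(line):
    """Inverse of format_prediction: return (frame_id, quaternion, translation)."""
    parts = line.split()
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    values = [float(p) for p in parts[1:]]
    return parts[0], tuple(values[:4]), tuple(values[4:])


line = format_prediction(
    "seq02_03/frame-000000",
    (0.468, -0.841, -0.238, 0.123),
    (-2.371, -0.590, 0.044),
    decimals=3,
)
print(line)
```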
We provide a simple merge script to combine your predictions if you have multiple prediction files:
python merge_poses.py --input_folder=your_predictions --output_file=merged_prediction.txt
To compute the pose errors run the code as follows:
DATA=../../../data
METHOD=active_search
cd src/eval/build
cmake -DCMAKE_BUILD_TYPE=Release .. && make -j8
./eval $DATA $DATA/predictions/$METHOD.txt errors/$METHOD.txt
This generates an output error file in the following format:
# scene-id/frame-id translation error rotation error DCRE
seq01_02/frame-000007 2.0415 174.381 0.605055
seq01_02/frame-000017 1.67518 36.0372 0.339005
seq01_02/frame-000022 0.553614 8.98095 0.104529
seq01_02/frame-000026 0.4914 9.85202 0.123886
seq01_02/frame-000027 0.126553 2.55466 0.0314654
...
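As a minimal sketch of how such an error file could be post-processed, the snippet below parses the lines and reports the fraction of frames whose DCRE falls below a threshold. The function names and the threshold value of 0.15 are illustrative assumptions, not values prescribed by the benchmark.

```python
def parse_errors(lines):
    """Return {frame_key: (translation_error, rotation_error, dcre)}."""
    errors = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed or elided lines such as "..."
        errors[parts[0]] = tuple(float(p) for p in parts[1:])
    return errors


def fraction_below(errors, dcre_threshold):
    """Fraction of frames with DCRE at or below the given threshold."""
    if not errors:
        return 0.0
    hits = sum(1 for (_, _, dcre) in errors.values() if dcre <= dcre_threshold)
    return hits / len(errors)


sample = [
    "# scene-id/frame-id translation error rotation error DCRE",
    "seq01_02/frame-000007 2.0415 174.381 0.605055",
    "seq01_02/frame-000017 1.67518 36.0372 0.339005",
    "seq01_02/frame-000022 0.553614 8.98095 0.104529",
    "seq01_02/frame-000026 0.4914 9.85202 0.123886",
    "seq01_02/frame-000027 0.126553 2.55466 0.0314654",
]
errors = parse_errors(sample)
print(fraction_below(errors, dcre_threshold=0.15))
```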
To run the evaluation for all methods in the data/predictions folder, simply run ./evaluate.sh. This computes the errors for each provided pose (in the validation set) and saves them in data/errors.
Based on these error files, you can then generate evaluation plots (cumulative DCRE) or see the error with respect to increasing change by running python plot.py.
The plot.py script expects the data folder to be in a predefined structure. The methods to plot are listed in config.json, see config['change_corr']['methods'] or config['overview']['methods']. If nothing is set there, all methods in the data/predictions folder are plotted.
data
├── metadata.json
├── config.json
├── stats.txt
└── predictions
  ├── active_search.txt
  ├── d2_net.txt
  ├── grove_v1.txt
  └── ...
...
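The method-selection rule described above can be sketched as follows. This is an assumption about the behaviour, not the code of plot.py; the function name is hypothetical, and it assumes method names correspond to the .txt file names in the predictions folder.

```python
from pathlib import Path


def methods_to_plot(config, section, predictions_dir):
    """Return the methods listed in config[section]['methods'], or all prediction files if none are set."""
    methods = config.get(section, {}).get("methods") or []
    if methods:
        return list(methods)
    # Fallback: every <method>.txt file in the predictions folder.
    return sorted(p.stem for p in Path(predictions_dir).glob("*.txt"))
```

For example, `methods_to_plot(config, "overview", "data/predictions")` would return the configured list if one exists and otherwise fall back to the files on disk.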
We also provide the code to produce the sequence-based relocalization results described in the supplementary of the paper.
For methods that process and relocalize each frame independently, you can run the ./src/compute_batched_relocalizations.py script to batch the individual frame results into chunks to improve the relocalizations.
Run the script as follows:
./src/compute_batched_relocalizations.py [--chunk-size N] <camera_poses.txt> <reloc_poses.txt> <out_folder>
Params:
--chunk-size: The number of consecutive frames to use to robustify the relocalization.
camera_poses.txt: File containing a list of camera poses (they have to be accurate frame-to-frame, the origin doesn't matter for this specific script) in the same format as the relocalization poses.
reloc_poses.txt: File containing a list of relocalization poses, format as described here.
out_folder: Folder that will contain the output relocalizations. Same format as above.
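The algorithm used by the script is not described here. Purely as an illustration of how consecutive frames could robustify per-frame results, the sketch below uses a simple consensus scheme: within each chunk, every relocalized frame proposes an alignment between the tracking frame and the world frame, and the proposal that agrees with the most relocalizations is applied to the whole chunk. This scheme, the function names, the use of 4x4 matrices instead of the quaternion text format, and the `inlier_dist` threshold are all assumptions and may differ from what compute_batched_relocalizations.py actually does.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def rigid_inverse(t):
    """Invert a rigid transform [R|t] as [R^T | -R^T t]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rt[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rt[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def translation_distance(a, b):
    """Euclidean distance between the translation parts of two poses."""
    return sum((a[i][3] - b[i][3]) ** 2 for i in range(3)) ** 0.5


def robustify_chunk(camera_poses, reloc_poses, inlier_dist):
    """Pick the per-frame alignment with the most agreeing frames and apply it to the chunk."""
    best_alignment, best_inliers = None, -1
    for cam, reloc in zip(camera_poses, reloc_poses):
        alignment = matmul(reloc, rigid_inverse(cam))  # world-from-tracking candidate
        inliers = sum(
            translation_distance(matmul(alignment, c), r) <= inlier_dist
            for c, r in zip(camera_poses, reloc_poses)
        )
        if inliers > best_inliers:
            best_alignment, best_inliers = alignment, inliers
    return [matmul(best_alignment, c) for c in camera_poses]


def robustify(camera_poses, reloc_poses, chunk_size, inlier_dist=0.1):
    """Process the sequence in consecutive chunks of chunk_size frames."""
    out = []
    for start in range(0, len(camera_poses), chunk_size):
        end = start + chunk_size
        out.extend(robustify_chunk(camera_poses[start:end], reloc_poses[start:end], inlier_dist))
    return out


def translation_pose(x, y, z):
    """Helper for the usage example: a pose with identity rotation."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y], [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


# Usage: tracking moves along x; relocalizations are offset by 10 m except one outlier.
cams = [translation_pose(float(i), 0.0, 0.0) for i in range(4)]
relocs = [translation_pose(10.0 + i, 0.0, 0.0) for i in range(4)]
relocs[2] = translation_pose(100.0, 100.0, 100.0)  # a failed relocalization
fixed = robustify(cams, relocs, chunk_size=4)
print([pose[0][3] for pose in fixed])
```

In this toy example the outlier at frame 2 is replaced by a pose consistent with its neighbours, which is the kind of effect chunking is meant to achieve.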