diff --git a/README.md b/README.md index 1871e58..5671ed7 100644 --- a/README.md +++ b/README.md @@ -57,7 +57,15 @@ If you use this toolbox or benchmark in your research, please cite this project. year={2022} } ``` - +If you use the Robust Visual-Inertial Odometry in your research, please cite: +```bibtex +@article{li2023rd, + title={RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments}, + author={Li, Jinyu and Pan, Xiaokun and Huang, Gan and Zhang, Ziyang and Wang, Nan and Bao, Hujun and Zhang, Guofeng}, + journal={arXiv preprint arXiv:2310.15072}, + year={2023} +} +``` ## Contributing We appreciate all contributions to improve XRSLAM. diff --git a/configs/euroc_slam.yaml b/configs/euroc_slam.yaml index a67389c..775db95 100644 --- a/configs/euroc_slam.yaml +++ b/configs/euroc_slam.yaml @@ -5,14 +5,18 @@ output: sliding_window: size: 10 + subframe_size: 3 force_keyframe_landmarks: 35 feature_tracker: min_keypoint_distance: 10.0 # [px] - max_keypoint_detection: 1000 + max_keypoint_detection: 200 max_init_frames: 60 max_frames: 20 predict_keypoints: true + clahe_clip_limit: 6.0 + clahe_width: 8 + clahe_height: 8 initializer: keyframe_num: 8 @@ -26,3 +30,14 @@ initializer: solver: iteration_limit: 30 time_limit: 1.0e6 # [s] + +rotation: + misalignment_threshold: 0.02 + ransac_threshold: 10 # degree + +parsac: + use_flag: false + dynamic_probability: 0.15 + threshold: 1.0 + norm_scale: 1.0 + keyframe_check_size: 1 diff --git a/docs/en/benchmark.md b/docs/en/benchmark.md index bff18d3..35fedba 100644 --- a/docs/en/benchmark.md +++ b/docs/en/benchmark.md @@ -1,73 +1,53 @@ # Benchmark -We run our algorithm on EuRoC dataset on Ubuntu18.04 and macOS 10.14. And make comparisons with VINS-Mono, which is one of the state-of-the-art VIO systems. We analyze the accuracy of the algorithm by comparing the root mean squared error (RMSE) of the absolute trajectory error (ATE). ATE is given by the simple difference between the estimated trajectory and ground truth after it has been aligned so that it has a minimal error, and RMSE is the standard deviation of the residuals (prediction errors). The lower the RMSE, the better a given trajectory is able to fit its ground truth. We use "evo" tool to evaluate and compare the trajectory output of odometry and SLAM algorithms, and for more information please refer to [euroc evaluation](./tutorials/euroc_evaluation.md). As shown in the following table, XRSLAM has higher accuracy than VINS-Mono (without loop). The average results for visual-inertial algorithms are bolded, and the following figures show the trajectory of XRSLAM on Vicon Room 1 01 sequence. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ---- - - +We run our algorithm on EuRoC dataset on Ubuntu18.04 and macOS 10.14. And make comparisons with the state-of-the-art VIO systems. We analyze the accuracy of the algorithm by comparing the root mean squared error (RMSE) of the absolute trajectory error (ATE). ATE is given by the simple difference between the estimated trajectory and ground truth after it has been aligned so that it has a minimal error, and RMSE is the standard deviation of the residuals (prediction errors). The lower the RMSE, the better a given trajectory is able to fit its ground truth. We use "evo" tool to evaluate and compare the trajectory output of odometry and SLAM algorithms, and for more information please refer to [euroc evaluation](./tutorials/euroc_evaluation.md). We have achieved obvious better results on EuRoc and ADVIO datasets, which proves the effectiveness of our system. + +**SF-VIO** completely disables the dynamic object removal strategy in XRSLAM, which can be enabled in the configuration *parsac_flag*. When the flag activated, it is referred to as **RD-VIO**. As shown in the following tables, the best results for visual-inertial algorithms are bolded. Comparing with other systems, SF-VIO showed significant improvements on many sequences on EuRoC dataset. Thanks to the additional stabilization effect, the significant drifts are canceled when using the subframe strategy in our system. + +As a challenging dataset in real-world settings, ADVIO offers 23 diverse scenarios, encompassing indoor and outdoor environments, varying lighting conditions, and dynamic elements such as pedestrians and vehicles. Compared to SF-VIO without dynamic object removal strategies, RD-VIO showed significantly better RMSEs on ADVIO dataset. + +**Tracking Accuracy (RMSE in meters) on the EuRoC Dataset.** +| Algorithm | MH-01 | MH-02 | MH-03 | MH-04 | MH-05 | V1-01 | V1-02 | V1-03 | V2-01 | V2-02 | V2-03 | AVG | +|-------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------| +| **SF-VIO** | 0.109 | 0.147 | **0.131** | 0.189 | 0.240 | **0.056** | 0.101 | 0.134 | 0.066 | 0.089 | **0.122** | **0.125** | +| **RD-VIO** | **0.109** | 0.115 | 0.141 | 0.247 | 0.267 | 0.060 | 0.091 | 0.168 | 0.058 | 0.100 | 0.147 | 0.136 | +| **LARVIO** | 0.132 | 0.137 | 0.168 | 0.237 | 0.314 | 0.083 | **0.064** | 0.086 | 0.148 | 0.077 | 0.168 | 0.147 | +| **Open-VINS**| 0.111| 0.287 | 0.181 | **0.182** | 0.365 | 0.059 | 0.084 | **0.075** | 0.086 | **0.074** | 0.145 | 0.150 | +| **VI-DSO** | 0.125 | **0.072** | 0.285 | 0.343 | **0.202** | 0.197 | 0.135 | 4.073 | 0.242 | 0.202 | 0.212 | 0.553 | +| **OKVIS*** | 0.342 | 0.361 | 0.319 | 0.318 | 0.448 | 0.139 | 0.232 | 0.262 | 0.163 | 0.211 | 0.291 | 0.281 | +| **MSCKF** | 0.734 | 0.909 | 0.376 | 1.676 | 0.995 | 0.520 | 0.567 | - | 0.236 | - | - | 0.752 | +| **PVIO** | 0.129 | 0.210 | 0.162 | 0.286 | 0.341 | 0.079 | 0.093 | 0.155 | **0.054** | 0.202 | 0.290 | 0.182 | +| **DynaVINS**| 0.308 | 0.152 | 1.789 | 2.264 | - | - | 0.365 | - | - | - | - | 0.976 | +| **VINS-Fusion(VIO)** | 0.149 | 0.110 | 0.168 | 0.221 | 0.310 | 0.071 | 0.282 | 0.170 | 0.166 | 0.386 | 0.190 | 0.202 | +| **ORB-SLAM3(VIO)**| 0.543 | 0.700 | 1.874 | 0.999 | 0.964 | 0.709 | 0.545 | 2.649 | 0.514 | 0.451 | 1.655 | 1.055 | +
+ +**Accuracy on the ADVIO Dataset** +| Sequence | **SF-VIO** | **RD-VIO** | **VINS-Fusion(VIO)** | **LARVIO** | +| :------: | :--------: | :--------: | :------------------: | :--------: | +| 01 | 2.177 | **1.788** | 2.339 | 5.049 | +| 02 | **1.679** | 1.695 | 1.914 | 4.242 | +| 03 | 2.913 | 2.690 | **2.290** | 4.295 | +| 04 | - | **2.860** | 3.350 | - | +| 05 | 1.385 | 1.263 | **0.938** | 2.034 | +| 06 | 2.837 | **2.497** | 11.005 | 8.201 | +| 07 | 0.559 | **0.548** | 0.912 | 2.369 | +| 08 | 2.075 | 2.151 | **1.136** | 2.078 | +| 09 | **0.332** | 2.281 | 1.063 | 3.168 | +| 10 | 1.997 | 2.128 | **1.847** | 4.742 | +| 11 | 4.103 | **3.986** | 18.760 | 5.298 | +| 12 | 2.084 | 1.951 | - | **1.191** | +| 13 | 3.227 | 2.899 | - | **1.324** | +| 14 | **1.524** | 1.532 | - | - | +| 15 | **0.779** | 0.780 | 0.944 | 0.851 | +| 16 | **0.986** | 0.991 | 1.289 | 2.346 | +| 17 | 1.657 | **1.235** | 1.569 | 1.734 | +| 18 | 1.164 | **1.057** | 3.436 | 1.171 | +| 19 | 3.154 | 2.740 | **2.010** | 3.256 | +| 20 | 7.013 | **6.960** | 10.433 | - | +| 21 | 8.534 | **8.432** | 11.004 | 8.962 | +| 22 | 4.548 | **4.498** | - | 4.686 | +| 23 | 6.486 | 5.085 | **4.668** | 9.389 | +| AVG | 2.873 | **2.671** | 3.272 | 3.699 | +
- ---- diff --git a/docs/en/changelog.md b/docs/en/changelog.md index 6a3daf5..798c64a 100644 --- a/docs/en/changelog.md +++ b/docs/en/changelog.md @@ -1,10 +1,16 @@ # Changelog +## XRSLAM Release v0.6.0 + +**Highlights:** +* [ADD] XRSLAM example for ROS +* [ADD] RD-VIO version of SLAM with IMU-PARSAC algorithm + ## XRSLAM Release v0.5.0 **Highlights:** * [CHANGE] Redefine the function call interface of XRSLAM -* [CHANGE] Use the newly defined interface to call XRSLAM in the example of ios and PC. +* [CHANGE] Use the newly defined interface to call XRSLAM in the example of iOS and PC. * [CHANGE] Refactoring the PC example. * [CHANGE] Unified the configuration file format * [ADD] Add API interface documentation diff --git a/docs/en/config_parse.md b/docs/en/config_parse.md index 698d504..fc65342 100644 --- a/docs/en/config_parse.md +++ b/docs/en/config_parse.md @@ -11,25 +11,27 @@ The following is the parameter description of the PC configuration, include [sla ### camera -* `noise`: the convariance matrix for keypoint measurement noise, (pixel^2) -* `distortion`: the distortion coefficient of camera, (k1 k2 p1 p2) * `intrinsic`: the intrinsics of camera (fx fy cx cy) +* `camera_distortion_flag`: use distortion model or not +* `distortion`: the distortion coefficient of camera, (k1 k2 p1 p2) +* `time_offset`: camera time delay wrt. IMU [s] * `q_bc`: the rotation part of extrinsics from camera to body * `p_bc`: the translation part of extrinsics from camera to body -* **note**: if you need to modify the `distortion` and `intrinsic`, remember the content at `xrslam-pc/player/src/euroc_dataset_reader.cpp` L55,56 to be modified at the same time. +* `noise`: the convariance matrix for keypoint measurement noise, (pixel^2) ### imu +* `q_bi`: the rotation part of extrinsics from IMU to body +* `p_bi`: the translation part of extrinsics from IMU to body * `cov_g`: the convariance matrix for gyroscope white noise, (rad/s/sqrt(hz))^2) * `cov_a`: the convariance matrix for accelerometer white noise, ((m/s^2/sqrt(hz))^2) * `cov_bg`: the convariance matrix for gyroscope bias diffusion, ((rad/s^2/sqrt(hz))^2) * `cov_ba`: the convariance matrix for accelerometer bias diffusion, ((m/s^3/sqrt(hz))^2) -* `q_bi`: the rotation part of extrinsics from IMU to body -* `p_bi`: the translation part of extrinsics from IMU to body ### sliding window * `size`: the size of keyframes in the sliding window +* `subframe_size`: the size of subframes in the subframe window * `force_keyframe_landmark`: decide keyframe if the number of landmarks observed at this frame is below a certain threshold ### feature_tracker @@ -55,6 +57,13 @@ The following is the parameter description of the PC configuration, include [sla * `iteration_limit`: the maxmium number of iteration for each optimization * `time_limit`: the maxmium time which optimizer cost for each optimization +### parsac +* `parsac_flag`: use imu-parsac or not +* `dynamic_probability`: dynamic coefficient to measure the dynamic degree of the scene +* `threshold`: the threshold of the stage one +* `norm_scale`: the scale normalized to the image plane +* `keyframe_check_size`: the size of keyframes involved in cross checking in stage two + ## iOS iOS configuration also includes [slam parameters](../../xrslam-ios/visualizer/configs/slam_params.yaml) and [iPhone Parameters](../../xrslam-ios/visualizer/configs/iPhone X.yaml). @@ -71,11 +80,10 @@ iOS configuration also includes [slam parameters](../../xrslam-ios/visualizer/co * `intrinsic`: the intrinsics of camera (fx fy cx cy) * `q_bc`: the rotation part of extrinsics from camera to body * `p_bc`: the translation part of extrinsics from camera to body -* **note**: if you need to modify the `distortion` and `intrinsic`, remember the content at `xrslam-pc/player/src/euroc_dataset_reader.cpp` L55,56 to be modified at the same time. ### imu -* `cov_g`: the convariance matrix for gyroscope white noise, (rad/s/sqrt(hz))^2) +* `cov_g`: the convariance matrix for gyroscope white noise, ((rad/s/sqrt(hz))^2) * `cov_a`: the convariance matrix for accelerometer white noise, ((m/s^2/sqrt(hz))^2) * `cov_bg`: the convariance matrix for gyroscope bias diffusion, ((rad/s^2/sqrt(hz))^2) * `cov_ba`: the convariance matrix for accelerometer bias diffusion, ((m/s^3/sqrt(hz))^2) @@ -102,5 +110,4 @@ iOS configuration also includes [slam parameters](../../xrslam-ios/visualizer/co * `enable`: enable sfm-based localization or not * `ip`: server IP - * `port`: server port diff --git a/docs/en/tutorials/euroc_evaluation.md b/docs/en/tutorials/euroc_evaluation.md index 1e8266d..f8e6cf1 100644 --- a/docs/en/tutorials/euroc_evaluation.md +++ b/docs/en/tutorials/euroc_evaluation.md @@ -4,16 +4,10 @@ Download EuRoC dataset to YOUR_DATASET_FOLDER. Take [MH_01_easy ](http://robotics.ethz.ch/~asl-datasets/ijrr_euroc_mav_dataset/machine_hall/MH_01_easy/MH_01_easy.zip)for example. -Start the XRSLAM pc player using the following command line, and it will not show visualization. +Start the XRSLAM pc player using the following command line ```bash -./build/xrslam-pc/player/xrslam-pc-player -H -c configs/euroc.yaml --tum trajectory.tum euroc:///data/EuRoC/MH_01_easy/mav0 -``` - -We also provide a GUI version and could start by using the following command line - -```bash -./build/xrslam-pc/player/xrslam-pc-player -c configs/euroc.yaml --tum trajectory.tum euroc:///data/EuRoC/MH_01_easy/mav0 +./build/xrslam-pc/player/xrslam-pc-player -sc configs/euroc_slam.yaml -dc configs/euroc_sensor.yaml --tum trajectory.tum euroc:///data/EuRoC/MH_01_easy/mav0 ``` + Click the first button "Stopped" of the player to automatically execute the program on the whole sequence. (recommend) @@ -38,11 +32,10 @@ cd data/EuRoC/MH_01_easy/mav0/state_groundtruth_estimate0 evo_traj euroc data.csv --save_as_tum ``` -After converting the ground truth trajectory to the "tum" format, you can evaluate the accuracy ( APE and ARE ) by +After converting the ground truth trajectory to the "tum" format, you can evaluate the accuracy by ```bash evo_ape tum data.tum $PROJECT/trajectory.tum -a -evo_ape tum data.tum $PROJECT/trajectory.tum -r angle_deg ``` -You will get the RMSE of XRSLAM on MH_01 sequence, in which APE is 0.147 and ARE is 2.56. +You will get the RMSE of XRSLAM on MH_01 sequence, in which APE is 0.109. diff --git a/xrslam-extra/include/xrslam/extra/opencv_image.h b/xrslam-extra/include/xrslam/extra/opencv_image.h index 568072e..d6038bf 100644 --- a/xrslam-extra/include/xrslam/extra/opencv_image.h +++ b/xrslam-extra/include/xrslam/extra/opencv_image.h @@ -31,7 +31,7 @@ class OpenCvImage : public Image { std::vector> &next_keypoints, std::vector &result_status) const override; - void preprocess() override; + void preprocess(double clipLimit, int width, int height) override; void correct_distortion(const matrix<3> &intrinsics, const vector<4> &coeffs); void release_image_buffer() override; @@ -50,7 +50,7 @@ class OpenCvImage : public Image { typedef ceres::BiCubicInterpolator Interpolator; std::vector interpolator_levels; - static cv::CLAHE *clahe(); + static cv::CLAHE *clahe(double clipLimit, int width, int height); static cv::GFTTDetector *gftt(size_t max_points); static cv::FastFeatureDetector *fast(); static cv::ORB *orb(); diff --git a/xrslam-extra/include/xrslam/extra/yaml_config.h b/xrslam-extra/include/xrslam/extra/yaml_config.h index 773d25a..06a29d1 100644 --- a/xrslam-extra/include/xrslam/extra/yaml_config.h +++ b/xrslam-extra/include/xrslam/extra/yaml_config.h @@ -33,6 +33,9 @@ class YamlConfig : public Config { vector<2> camera_resolution() const override; matrix<3> camera_intrinsic() const override; + vector<4> camera_distortion() const override; + size_t camera_distortion_flag() const override; + double camera_time_offset() const override; quaternion camera_to_body_rotation() const override; vector<3> camera_to_body_translation() const override; quaternion imu_to_body_rotation() const override; @@ -48,6 +51,7 @@ class YamlConfig : public Config { vector<3> output_to_body_translation() const override; size_t sliding_window_size() const override; + size_t sliding_window_subframe_size() const override; size_t sliding_window_tracker_frequent() const override; size_t sliding_window_force_keyframe_landmarks() const override; @@ -55,6 +59,9 @@ class YamlConfig : public Config { size_t feature_tracker_max_keypoint_detection() const override; size_t feature_tracker_max_init_frames() const override; size_t feature_tracker_max_frames() const override; + double feature_tracker_clahe_clip_limit() const override; + size_t feature_tracker_clahe_width() const override; + size_t feature_tracker_clahe_height() const override; bool feature_tracker_predict_keypoints() const override; size_t initializer_keyframe_num() const override; @@ -72,9 +79,21 @@ class YamlConfig : public Config { size_t solver_iteration_limit() const override; double solver_time_limit() const override; + bool parsac_flag() const override; + double parsac_dynamic_probability() const override; + double parsac_threshold() const override; + double parsac_norm_scale() const override; + size_t parsac_keyframe_check_size() const override; + + double rotation_misalignment_threshold() const override; + double rotation_ransac_threshold() const override; + private: vector<2> m_camera_resolution; matrix<3> m_camera_intrinsic; + vector<4> m_camera_distortion; + size_t m_camera_distortion_flag; + double m_camera_time_offset; quaternion m_camera_to_body_rotation; vector<3> m_camera_to_body_translation; quaternion m_imu_to_body_rotation; @@ -89,6 +108,7 @@ class YamlConfig : public Config { vector<3> m_output_to_body_translation; size_t m_sliding_window_size; + size_t m_sliding_window_subframe_size; size_t m_sliding_window_tracker_frequent; size_t m_sliding_window_force_keyframe_landmarks; @@ -96,6 +116,9 @@ class YamlConfig : public Config { size_t m_feature_tracker_max_keypoint_detection; size_t m_feature_tracker_max_init_frames; size_t m_feature_tracker_max_frames; + double m_feature_tracker_clahe_clip_limit; + size_t m_feature_tracker_clahe_width; + size_t m_feature_tracker_clahe_height; bool m_feature_tracker_predict_keypoints; size_t m_initializer_keyframe_num; @@ -112,6 +135,15 @@ class YamlConfig : public Config { size_t m_solver_iteration_limit; double m_solver_time_limit; + + bool m_parsac_flag; + double m_parsac_dynamic_probability; + double m_parsac_threshold; + double m_parsac_norm_scale; + size_t m_parsac_keyframe_check_size; + + double m_rotation_misalignment_threshold; + double m_rotation_ransac_threshold; }; } // namespace xrslam::extra diff --git a/xrslam-extra/src/xrslam/extra/opencv_image.cpp b/xrslam-extra/src/xrslam/extra/opencv_image.cpp index ae45dcd..03a3236 100644 --- a/xrslam-extra/src/xrslam/extra/opencv_image.cpp +++ b/xrslam-extra/src/xrslam/extra/opencv_image.cpp @@ -112,6 +112,27 @@ void OpenCvImage::track_keypoints(const Image *next_image, } } } + + { + std::vector reverse_status; + std::vector reverse_err; + std::vector reverse_pts = curr_cvpoints; + calcOpticalFlowPyrLK( + next_cvimage->image_pyramid, image_pyramid, next_cvpoints, + reverse_pts, reverse_status, reverse_err, Size(21, 21), + (int)level_num(), + TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, 0.01), + OPTFLOW_USE_INITIAL_FLOW); + + for (size_t i = 0; i < reverse_status.size(); ++i) { + if (result_status[i]) { + if (!reverse_status[i] || + cv::norm(curr_cvpoints[i] - reverse_pts[i]) > 0.5) { + result_status[i] = 0; + } + } + } + } } std::vector l; @@ -132,10 +153,9 @@ void OpenCvImage::track_keypoints(const Image *next_image, } } -void OpenCvImage::preprocess() { - clahe()->apply(image, image); +void OpenCvImage::preprocess(double clipLimit, int width, int height) { + clahe(clipLimit, width, height)->apply(image, image); image_pyramid.clear(); - buildOpticalFlowPyramid(image, image_pyramid, Size(21, 21), (int)level_num(), true); } @@ -156,8 +176,8 @@ void OpenCvImage::correct_distortion(const matrix<3> &intrinsics, image = new_image; } -CLAHE *OpenCvImage::clahe() { - static Ptr s_clahe = createCLAHE(6.0, cv::Size(8, 8)); +CLAHE *OpenCvImage::clahe(double clipLimit, int width, int height) { + static Ptr s_clahe = createCLAHE(clipLimit, cv::Size(width, height)); return s_clahe.get(); } diff --git a/xrslam-extra/src/xrslam/extra/yaml_config.cpp b/xrslam-extra/src/xrslam/extra/yaml_config.cpp index d2dec84..bed565c 100644 --- a/xrslam-extra/src/xrslam/extra/yaml_config.cpp +++ b/xrslam-extra/src/xrslam/extra/yaml_config.cpp @@ -84,6 +84,7 @@ YamlConfig::YamlConfig(const std::string &slam_config_filename, m_output_to_body_rotation = Config::output_to_body_rotation(); m_output_to_body_translation = Config::output_to_body_translation(); m_sliding_window_size = Config::sliding_window_size(); + m_sliding_window_subframe_size = Config::sliding_window_subframe_size(); m_sliding_window_tracker_frequent = Config::sliding_window_tracker_frequent(); m_sliding_window_force_keyframe_landmarks = @@ -95,6 +96,10 @@ YamlConfig::YamlConfig(const std::string &slam_config_filename, m_feature_tracker_max_init_frames = Config::feature_tracker_max_init_frames(); m_feature_tracker_max_frames = Config::feature_tracker_max_frames(); + m_feature_tracker_clahe_clip_limit = + Config::feature_tracker_clahe_clip_limit(); + m_feature_tracker_clahe_width = Config::feature_tracker_clahe_width(); + m_feature_tracker_clahe_height = Config::feature_tracker_clahe_height(); m_feature_tracker_predict_keypoints = Config::feature_tracker_predict_keypoints(); m_initializer_keyframe_num = Config::initializer_keyframe_num(); @@ -106,6 +111,15 @@ YamlConfig::YamlConfig(const std::string &slam_config_filename, m_initializer_refine_imu = Config::initializer_refine_imu(); m_solver_iteration_limit = Config::solver_iteration_limit(); m_solver_time_limit = Config::solver_time_limit(); + + m_parsac_flag = Config::parsac_flag(); + m_parsac_dynamic_probability = Config::parsac_dynamic_probability(); + m_parsac_threshold = Config::parsac_threshold(); + m_parsac_norm_scale = Config::parsac_norm_scale(); + + m_rotation_misalignment_threshold = + Config::rotation_misalignment_threshold(); + m_rotation_ransac_threshold = Config::rotation_ransac_threshold(); m_visual_localization_enable = Config::visual_localization_enable(); m_visual_localization_ip = Config::visual_localization_config_ip(); m_visual_localization_port = Config::visual_localization_config_port(); @@ -144,6 +158,19 @@ YamlConfig::YamlConfig(const std::string &slam_config_filename, m_camera_intrinsic(1, 2) = intrinsic[3].as(); } + if (auto node = find_node(device_config, "cam0.distortion", true)) { + assign_vector(m_camera_distortion, node); + } + + if (auto node = + find_node(device_config, "cam0.camera_distortion_flag", true)) { + assign(m_camera_distortion_flag, node); + } + + if (auto node = find_node(device_config, "cam0.time_offset", true)) { + assign(m_camera_time_offset, node); + } + if (auto node = find_node(device_config, "cam0.resolution", true)) { assign_vector(m_camera_resolution, node); } @@ -196,6 +223,11 @@ YamlConfig::YamlConfig(const std::string &slam_config_filename, assign(m_sliding_window_size, node); } + if (auto node = + find_node(slam_config, "sliding_window.subframe_size", false)) { + assign(m_sliding_window_subframe_size, node); + } + if (auto node = find_node(slam_config, "sliding_window.tracker_frequent", false)) { assign(m_sliding_window_tracker_frequent, node); @@ -226,6 +258,21 @@ YamlConfig::YamlConfig(const std::string &slam_config_filename, assign(m_feature_tracker_max_frames, node); } + if (auto node = + find_node(slam_config, "feature_tracker.clahe_clip_limit", false)) { + assign(m_feature_tracker_clahe_clip_limit, node); + } + + if (auto node = + find_node(slam_config, "feature_tracker.clahe_width", false)) { + assign(m_feature_tracker_clahe_width, node); + } + + if (auto node = + find_node(slam_config, "feature_tracker.clahe_height", false)) { + assign(m_feature_tracker_clahe_height, node); + } + if (auto node = find_node(slam_config, "feature_tracker.predict_keypoints", false)) { assign(m_feature_tracker_predict_keypoints, node); @@ -281,12 +328,52 @@ YamlConfig::YamlConfig(const std::string &slam_config_filename, if (auto node = find_node(slam_config, "solver.time_limit", false)) { assign(m_solver_time_limit, node); } + + if (auto node = find_node(slam_config, "parsac.parsac_flag", false)) { + assign(m_parsac_flag, node); + } + + if (auto node = + find_node(slam_config, "parsac.dynamic_probability", false)) { + assign(m_parsac_dynamic_probability, node); + } + + if (auto node = find_node(slam_config, "parsac.threshold", false)) { + assign(m_parsac_threshold, node); + } + + if (auto node = find_node(slam_config, "parsac.norm_scale", false)) { + assign(m_parsac_norm_scale, node); + } + + if (auto node = + find_node(slam_config, "parsac.keyframe_check_size", false)) { + assign(m_parsac_keyframe_check_size, node); + } + + if (auto node = + find_node(slam_config, "rotation.misalignment_threshold", false)) { + assign(m_rotation_misalignment_threshold, node); + } + + if (auto node = + find_node(slam_config, "rotation.ransac_threshold", false)) { + assign(m_rotation_ransac_threshold, node); + } } YamlConfig::~YamlConfig() = default; matrix<3> YamlConfig::camera_intrinsic() const { return m_camera_intrinsic; } +vector<4> YamlConfig::camera_distortion() const { return m_camera_distortion; } + +size_t YamlConfig::camera_distortion_flag() const { + return m_camera_distortion_flag; +} + +double YamlConfig::camera_time_offset() const { return m_camera_time_offset; } + vector<2> YamlConfig::camera_resolution() const { return m_camera_resolution; } quaternion YamlConfig::camera_to_body_rotation() const { @@ -335,6 +422,10 @@ vector<3> YamlConfig::output_to_body_translation() const { size_t YamlConfig::sliding_window_size() const { return m_sliding_window_size; } +size_t YamlConfig::sliding_window_subframe_size() const { + return m_sliding_window_subframe_size; +} + size_t YamlConfig::sliding_window_tracker_frequent() const { return m_sliding_window_tracker_frequent; } @@ -359,6 +450,18 @@ size_t YamlConfig::feature_tracker_max_frames() const { return m_feature_tracker_max_frames; } +double YamlConfig::feature_tracker_clahe_clip_limit() const { + return m_feature_tracker_clahe_clip_limit; +} + +size_t YamlConfig::feature_tracker_clahe_width() const { + return m_feature_tracker_clahe_width; +} + +size_t YamlConfig::feature_tracker_clahe_height() const { + return m_feature_tracker_clahe_height; +} + bool YamlConfig::feature_tracker_predict_keypoints() const { return m_feature_tracker_predict_keypoints; } @@ -409,4 +512,26 @@ size_t YamlConfig::solver_iteration_limit() const { double YamlConfig::solver_time_limit() const { return m_solver_time_limit; } +bool YamlConfig::parsac_flag() const { return m_parsac_flag; } + +double YamlConfig::parsac_dynamic_probability() const { + return m_parsac_dynamic_probability; +} + +double YamlConfig::parsac_threshold() const { return m_parsac_threshold; } + +double YamlConfig::parsac_norm_scale() const { return m_parsac_norm_scale; } + +size_t YamlConfig::parsac_keyframe_check_size() const { + return m_parsac_keyframe_check_size; +} + +double YamlConfig::rotation_misalignment_threshold() const { + return m_rotation_misalignment_threshold; +} + +double YamlConfig::rotation_ransac_threshold() const { + return m_rotation_ransac_threshold; +} + } // namespace xrslam::extra diff --git a/xrslam-interface/include/XRSLAM.h b/xrslam-interface/include/XRSLAM.h index 9c2ec3e..ba1e1df 100644 --- a/xrslam-interface/include/XRSLAM.h +++ b/xrslam-interface/include/XRSLAM.h @@ -182,10 +182,12 @@ typedef struct XRSLAMStringOutput { * intrinsics, extrinsics and so on. * @param[in] license_path license path * @param[in] product_name product name which user can define + * @param[out] config configuration of slam and device * @return 1 success, otherwise 0 */ int XRSLAMCreate(const char *slam_config_path, const char *device_config_path, - const char *license_path, const char *product_name); + const char *license_path, const char *product_name, + void **config); /** * @brief push sensor data to SLAM system diff --git a/xrslam-interface/src/XRSLAMInternal.cpp b/xrslam-interface/src/XRSLAMInternal.cpp index b8f3287..1e4872e 100644 --- a/xrslam-interface/src/XRSLAMInternal.cpp +++ b/xrslam-interface/src/XRSLAMInternal.cpp @@ -4,7 +4,7 @@ int XRSLAMCreate( const char *slam_config_path, // slam configuration file path const char *device_config_path, // device configuration file path - const char *license_path, const char *product_name) { + const char *license_path, const char *product_name, void **config) { if (xrslam::XRSLAMManager::Instance().CheckLicense(license_path, product_name) == 0) return 0; @@ -13,6 +13,7 @@ int XRSLAMCreate( std::make_shared(slam_config_path, device_config_path); xrslam::XRSLAMManager::Instance().Init(yaml_config); + *config = static_cast(yaml_config.get()); return 1; } diff --git a/xrslam-ios/visualizer/configs/slam_params.yaml b/xrslam-ios/visualizer/configs/slam_params.yaml index 0da0fa4..7d98284 100644 --- a/xrslam-ios/visualizer/configs/slam_params.yaml +++ b/xrslam-ios/visualizer/configs/slam_params.yaml @@ -20,3 +20,10 @@ visual_localization: enable: true ip: "10.156.16.69" port: 12345 + +parsac: + parsac_flag: false + dynamic_probability: 0.15 + threshold: 1.0 + norm_scale: 1.0 + keyframe_check_size: 1 diff --git a/xrslam-ios/visualizer/src/XRSLAM_iOS.mm b/xrslam-ios/visualizer/src/XRSLAM_iOS.mm index 3475eff..e2e0d51 100644 --- a/xrslam-ios/visualizer/src/XRSLAM_iOS.mm +++ b/xrslam-ios/visualizer/src/XRSLAM_iOS.mm @@ -107,9 +107,11 @@ - (id)init:(NSString *)model { @"configs/SENSESLAMSDK_165BA1A4-F959-4790-A891-C85DBF5D26EA" ofType:@"lic"]; + void *yaml_config = nil; int num = XRSLAMCreate([slam_config_content UTF8String], [device_config_content UTF8String], - [license_file UTF8String], "SenseSLAMSDK"); + [license_file UTF8String], "SenseSLAMSDK", + &yaml_config); XRGlobalLocalizerCreate([slam_config_content UTF8String], [device_config_content UTF8String]); std::cout << "init xr slam: " << num << std::endl; diff --git a/xrslam-pc/player/src/IO/dataset_reader.cpp b/xrslam-pc/player/src/IO/dataset_reader.cpp index b07897b..b6eeaa8 100644 --- a/xrslam-pc/player/src/IO/dataset_reader.cpp +++ b/xrslam-pc/player/src/IO/dataset_reader.cpp @@ -15,12 +15,14 @@ std::optional path_from_scheme(const std::string &string, } std::unique_ptr -DatasetReader::create_reader(const std::string &filename, bool async) { +DatasetReader::create_reader(const std::string &filename, void *yaml_config, + bool async) { std::unique_ptr reader; if (auto path = path_from_scheme(filename, "euroc://")) { - reader = std::make_unique(path.value()); + reader = + std::make_unique(path.value(), yaml_config); } else if (auto path = path_from_scheme(filename, "tum://")) { - reader = std::make_unique(path.value()); + reader = std::make_unique(path.value(), yaml_config); } else { return nullptr; } diff --git a/xrslam-pc/player/src/IO/dataset_reader.h b/xrslam-pc/player/src/IO/dataset_reader.h index 81bbe8f..97439d1 100644 --- a/xrslam-pc/player/src/IO/dataset_reader.h +++ b/xrslam-pc/player/src/IO/dataset_reader.h @@ -9,6 +9,7 @@ #include #include "XRSLAM.h" +#include "xrslam/extra/yaml_config.h" class DatasetReader { public: @@ -21,7 +22,8 @@ class DatasetReader { virtual std::pair read_accelerometer() = 0; static std::unique_ptr - create_reader(const std::string &filename, bool async = true); + create_reader(const std::string &filename, void *yaml_config, + bool async = true); }; #endif // XRSLAM_PC_DATASET_READER_H diff --git a/xrslam-pc/player/src/IO/euroc_dataset_reader.cpp b/xrslam-pc/player/src/IO/euroc_dataset_reader.cpp index dcb3101..9b9117e 100644 --- a/xrslam-pc/player/src/IO/euroc_dataset_reader.cpp +++ b/xrslam-pc/player/src/IO/euroc_dataset_reader.cpp @@ -1,6 +1,11 @@ #include -EurocDatasetReader::EurocDatasetReader(const std::string &euroc_path) { +EurocDatasetReader::EurocDatasetReader(const std::string &euroc_path, + void *yaml_config) { + + config = std::shared_ptr( + reinterpret_cast(yaml_config)); + CameraCsv camera_csv; ImuCsv imu_csv; @@ -8,9 +13,10 @@ EurocDatasetReader::EurocDatasetReader(const std::string &euroc_path) { imu_csv.load(euroc_path + "/imu0/data.csv"); for (auto &item : camera_csv.items) { - image_data.emplace_back(item.t, + image_data.emplace_back(item.t + config->camera_time_offset(), euroc_path + "/cam0/data/" + item.filename); - all_data.emplace_back(item.t, NextDataType::CAMERA); + all_data.emplace_back(item.t + config->camera_time_offset(), + NextDataType::CAMERA); } for (auto &item : imu_csv.items) { @@ -22,14 +28,14 @@ EurocDatasetReader::EurocDatasetReader(const std::string &euroc_path) { all_data.emplace_back(item.t, NextDataType::ACCELEROMETER); } - std::sort(all_data.begin(), all_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); - std::sort(image_data.begin(), image_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); - std::sort(gyroscope_data.begin(), gyroscope_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); - std::sort(accelerometer_data.begin(), accelerometer_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(all_data.begin(), all_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(image_data.begin(), image_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(gyroscope_data.begin(), gyroscope_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(accelerometer_data.begin(), accelerometer_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); } DatasetReader::NextDataType EurocDatasetReader::next() { @@ -50,13 +56,24 @@ std::pair EurocDatasetReader::read_image() { return {}; } auto [t, filename] = image_data.front(); - cv::Mat img_distorted = cv::imread(filename, cv::IMREAD_GRAYSCALE); + cv::Mat img_distorted = cv::imread(filename, cv::IMREAD_UNCHANGED); + cv::Mat img; - cv::Mat dist_coeffs = (cv::Mat_(4, 1) << -0.28340811, 0.07395907, - 0.00019359, 1.76187114e-05); - cv::Mat K = (cv::Mat_(3, 3) << 458.654, 0, 367.215, 0, 457.296, - 248.375, 0, 0, 1); - cv::undistort(img_distorted, img, K, dist_coeffs); + if (config->camera_distortion_flag()) { + xrslam::vector<4> D = config->camera_distortion(); + xrslam::matrix<3> K = config->camera_intrinsic(); + cv::Mat distortion = (cv::Mat_(4, 1) << D[0], D[1], D[2], D[3]); + cv::Mat intrinsic = + (cv::Mat_(3, 3) << K(0, 0), K(0, 1), K(0, 2), K(1, 0), + K(1, 1), K(1, 2), K(2, 0), K(2, 1), K(2, 2)); + cv::undistort(img_distorted, img, intrinsic, distortion); + } else { + img = img_distorted.clone(); + } + + if (img.channels() != 1) + cv::cvtColor(img, img, cv::COLOR_BGR2GRAY); + image_height = img.rows; image_width = img.cols; diff --git a/xrslam-pc/player/src/IO/euroc_dataset_reader.h b/xrslam-pc/player/src/IO/euroc_dataset_reader.h index 9bdabb5..e77b214 100644 --- a/xrslam-pc/player/src/IO/euroc_dataset_reader.h +++ b/xrslam-pc/player/src/IO/euroc_dataset_reader.h @@ -7,7 +7,7 @@ class EurocDatasetReader : public DatasetReader { public: - EurocDatasetReader(const std::string &filename); + EurocDatasetReader(const std::string &filename, void *yaml_config); NextDataType next() override; void get_image_resolution(int &width, int &height) override; @@ -16,6 +16,7 @@ class EurocDatasetReader : public DatasetReader { std::pair read_accelerometer() override; private: + std::shared_ptr config; std::deque> all_data; std::deque> gyroscope_data; std::deque> accelerometer_data; diff --git a/xrslam-pc/player/src/IO/tum_dataset_reader.cpp b/xrslam-pc/player/src/IO/tum_dataset_reader.cpp index 202c352..3914763 100644 --- a/xrslam-pc/player/src/IO/tum_dataset_reader.cpp +++ b/xrslam-pc/player/src/IO/tum_dataset_reader.cpp @@ -1,6 +1,10 @@ #include -TUMDatasetReader::TUMDatasetReader(const std::string &tum_path) { +TUMDatasetReader::TUMDatasetReader(const std::string &tum_path, + void *yaml_config) { + config = std::shared_ptr( + reinterpret_cast(yaml_config)); + TUMCameraCsv camera_csv; TUMImuCsv imu_csv; @@ -9,9 +13,10 @@ TUMDatasetReader::TUMDatasetReader(const std::string &tum_path) { for (auto &item : camera_csv.items) { // printf("read image %lf\n", item.t); - image_data.emplace_back(item.t, + image_data.emplace_back(item.t + config->camera_time_offset(), tum_path + "/cam0/data/" + item.filename); - all_data.emplace_back(item.t, NextDataType::CAMERA); + all_data.emplace_back(item.t + config->camera_time_offset(), + NextDataType::CAMERA); } num_images = image_data.size(); @@ -24,14 +29,14 @@ TUMDatasetReader::TUMDatasetReader(const std::string &tum_path) { all_data.emplace_back(item.t, NextDataType::ACCELEROMETER); } - std::sort(all_data.begin(), all_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); - std::sort(image_data.begin(), image_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); - std::sort(gyroscope_data.begin(), gyroscope_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); - std::sort(accelerometer_data.begin(), accelerometer_data.end(), - [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(all_data.begin(), all_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(image_data.begin(), image_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(gyroscope_data.begin(), gyroscope_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); + std::stable_sort(accelerometer_data.begin(), accelerometer_data.end(), + [](auto &a, auto &b) { return a.first < b.first; }); } DatasetReader::NextDataType TUMDatasetReader::next() { @@ -56,17 +61,19 @@ std::pair TUMDatasetReader::read_image() { cv::Mat img; if (!image_undistorter) { - xrslam::matrix<3> K; - K << 190.97847715128717, 0, 254.93170605935475, 0, 190.9733070521226, - 256.8974428996504, 0, 0, 1; - std::vector dist_coeffs = { - 0.0034003170790442797, 0.001766278153469831, -0.00266312569781606, - 0.0003299517423931039}; + xrslam::matrix<3> intrinsic = config->camera_intrinsic(); + xrslam::vector<4> D = config->camera_distortion(); + std::vector distortion = {D[0], D[1], D[2], D[3]}; image_undistorter = std::make_unique( - img_distorted.cols, img_distorted.rows, K, dist_coeffs, + img_distorted.cols, img_distorted.rows, intrinsic, distortion, "equidistant"); } - image_undistorter->undistort_image(img_distorted, img); + if (config->camera_distortion_flag()) { + image_undistorter->undistort_image(img_distorted, img); + } else { + img = img_distorted.clone(); + } + image_height = img.rows; image_width = img.cols; diff --git a/xrslam-pc/player/src/IO/tum_dataset_reader.h b/xrslam-pc/player/src/IO/tum_dataset_reader.h index efb19a2..e9df13b 100644 --- a/xrslam-pc/player/src/IO/tum_dataset_reader.h +++ b/xrslam-pc/player/src/IO/tum_dataset_reader.h @@ -8,7 +8,7 @@ class TUMDatasetReader : public DatasetReader { public: - TUMDatasetReader(const std::string &filename); + TUMDatasetReader(const std::string &filename, void *yaml_config); NextDataType next() override; void get_image_resolution(int &width, int &height) override; std::pair read_image() override; @@ -17,6 +17,7 @@ class TUMDatasetReader : public DatasetReader { size_t num_images; private: + std::shared_ptr config; std::deque> all_data; std::deque> gyroscope_data; std::deque> accelerometer_data; diff --git a/xrslam-pc/player/src/main.cpp b/xrslam-pc/player/src/main.cpp index 48808e7..9fada6b 100644 --- a/xrslam-pc/player/src/main.cpp +++ b/xrslam-pc/player/src/main.cpp @@ -87,9 +87,10 @@ int main(int argc, char *argv[]) { bool is_play = program.get("-p"); // create slam with configuration files + void *yaml_config = nullptr; int create_succ = XRSLAMCreate(slam_config_path.c_str(), device_config_path.c_str(), - license_path.c_str(), "XRSLAM PC"); + license_path.c_str(), "XRSLAM PC", &yaml_config); std::cout << "create SLAM success: " << create_succ << std::endl; #ifndef XRSLAM_PC_HEADLESS_ONLY Visualizer visualizer(is_play, device_config_path); @@ -107,7 +108,7 @@ int main(int argc, char *argv[]) { } std::unique_ptr reader = - DatasetReader::create_reader(data_path); + DatasetReader::create_reader(data_path, yaml_config); if (!reader) { fprintf(stderr, "Cannot open \"%s\"\n", data_path.c_str()); return EXIT_FAILURE; diff --git a/xrslam-pc/player/src/opencv_painter.h b/xrslam-pc/player/src/opencv_painter.h index 173d464..9fc8437 100644 --- a/xrslam-pc/player/src/opencv_painter.h +++ b/xrslam-pc/player/src/opencv_painter.h @@ -14,7 +14,10 @@ class OpenCvPainter : public xrslam::InspectPainter { void set_image(const xrslam::Image *image) override { const xrslam::extra::OpenCvImage *cvimage = dynamic_cast(image); - cv::cvtColor(cvimage->image, canvas, cv::COLOR_GRAY2BGR); + if (cvimage->raw.channels() == 3) + canvas = cvimage->raw.clone(); + else + cv::cvtColor(cvimage->raw, canvas, cv::COLOR_GRAY2BGR); } void point(const xrslam::point2i &p, const xrslam::color3b &c, int size = 1, diff --git a/xrslam/CMakeLists.txt b/xrslam/CMakeLists.txt index 6d8c308..03180f5 100644 --- a/xrslam/CMakeLists.txt +++ b/xrslam/CMakeLists.txt @@ -44,6 +44,7 @@ set(PRIVATE_HEADERS ${XRSLAM_SOURCE_DIR}/xrslam/geometry/lie_algebra.h ${XRSLAM_SOURCE_DIR}/xrslam/geometry/stereo.h ${XRSLAM_SOURCE_DIR}/xrslam/geometry/wahba.h + ${XRSLAM_SOURCE_DIR}/xrslam/geometry/pnp.h ${XRSLAM_SOURCE_DIR}/xrslam/map/frame.h ${XRSLAM_SOURCE_DIR}/xrslam/map/map.h ${XRSLAM_SOURCE_DIR}/xrslam/map/track.h @@ -56,6 +57,8 @@ set(PRIVATE_HEADERS ${XRSLAM_SOURCE_DIR}/xrslam/utility/unique_timer.h ${XRSLAM_SOURCE_DIR}/xrslam/utility/worker.h ${XRSLAM_SOURCE_DIR}/xrslam/utility/logger.h + ${XRSLAM_SOURCE_DIR}/xrslam/utility/parsac.h + ${XRSLAM_SOURCE_DIR}/xrslam/utility/imu_parsac.h ${XRSLAM_SOURCE_DIR}/xrslam/ar/virtual_object_manager.h ${XRSLAM_SOURCE_DIR}/xrslam/localizer/base64.h ${XRSLAM_SOURCE_DIR}/xrslam/localizer/base64_convert.h diff --git a/xrslam/include/xrslam/xrslam.h b/xrslam/include/xrslam/xrslam.h index 89c2941..405eed4 100644 --- a/xrslam/include/xrslam/xrslam.h +++ b/xrslam/include/xrslam/xrslam.h @@ -59,8 +59,11 @@ class Config { virtual vector<2> camera_resolution() const = 0; virtual matrix<3> camera_intrinsic() const = 0; + virtual vector<4> camera_distortion() const = 0; virtual quaternion camera_to_body_rotation() const = 0; virtual vector<3> camera_to_body_translation() const = 0; + virtual size_t camera_distortion_flag() const = 0; + virtual double camera_time_offset() const = 0; virtual quaternion imu_to_body_rotation() const = 0; virtual vector<3> imu_to_body_translation() const = 0; @@ -74,6 +77,7 @@ class Config { virtual vector<3> output_to_body_translation() const; virtual size_t sliding_window_size() const; + virtual size_t sliding_window_subframe_size() const; virtual size_t sliding_window_force_keyframe_landmarks() const; virtual size_t sliding_window_tracker_frequent() const; @@ -81,6 +85,9 @@ class Config { virtual size_t feature_tracker_max_keypoint_detection() const; virtual size_t feature_tracker_max_init_frames() const; virtual size_t feature_tracker_max_frames() const; + virtual double feature_tracker_clahe_clip_limit() const; + virtual size_t feature_tracker_clahe_width() const; + virtual size_t feature_tracker_clahe_height() const; virtual bool feature_tracker_predict_keypoints() const; virtual size_t initializer_keyframe_num() const; @@ -98,8 +105,17 @@ class Config { virtual size_t solver_iteration_limit() const; virtual double solver_time_limit() const; + virtual double rotation_misalignment_threshold() const; + virtual double rotation_ransac_threshold() const; + virtual int random() const; + virtual bool parsac_flag() const; + virtual double parsac_dynamic_probability() const; + virtual double parsac_threshold() const; + virtual double parsac_norm_scale() const; + virtual size_t parsac_keyframe_check_size() const; + void log_config() const; }; @@ -118,7 +134,7 @@ class Image { int level = 0) const = 0; virtual ~Image() = default; - virtual void preprocess() {} + virtual void preprocess(double clipLimit, int width, int height) {} virtual void release_image_buffer() = 0; virtual void detect_keypoints(std::vector> &keypoints, size_t max_points = 0, diff --git a/xrslam/src/xrslam/config.cpp b/xrslam/src/xrslam/config.cpp index f457df6..8057697 100644 --- a/xrslam/src/xrslam/config.cpp +++ b/xrslam/src/xrslam/config.cpp @@ -4,6 +4,8 @@ namespace xrslam { Config::~Config() = default; +vector<4> Config::camera_distortion() const { return vector<4>::Zero(); } + quaternion Config::output_to_body_rotation() const { return quaternion::Identity(); } @@ -13,6 +15,8 @@ vector<3> Config::output_to_body_translation() const { size_t Config::sliding_window_size() const { return 10; } +size_t Config::sliding_window_subframe_size() const { return 3; } + size_t Config::sliding_window_force_keyframe_landmarks() const { return 35; } double Config::feature_tracker_min_keypoint_distance() const { return 20.0; } @@ -23,6 +27,12 @@ size_t Config::feature_tracker_max_init_frames() const { return 60; } size_t Config::feature_tracker_max_frames() const { return 200; } +double Config::feature_tracker_clahe_clip_limit() const { return 6.0; } + +size_t Config::feature_tracker_clahe_width() const { return 8; } + +size_t Config::feature_tracker_clahe_height() const { return 8; } + bool Config::feature_tracker_predict_keypoints() const { return true; } size_t Config::initializer_keyframe_num() const { return 8; } @@ -33,9 +43,7 @@ size_t Config::initializer_min_matches() const { return 50; } double Config::initializer_min_parallax() const { return 10; } -size_t Config::initializer_min_triangulation() const { - return 50; // 50 -} +size_t Config::initializer_min_triangulation() const { return 50; } size_t Config::initializer_min_landmarks() const { return 30; } @@ -51,8 +59,22 @@ size_t Config::solver_iteration_limit() const { return 10; } double Config::solver_time_limit() const { return 1.0e6; } +double Config::rotation_misalignment_threshold() const { return 0.1; } + +double Config::rotation_ransac_threshold() const { return 10; } + int Config::random() const { return 648; } +bool Config::parsac_flag() const { return false; } + +double Config::parsac_dynamic_probability() const { return 0.0; } + +double Config::parsac_threshold() const { return 3.0; } + +double Config::parsac_norm_scale() const { return 1.0; } + +size_t Config::parsac_keyframe_check_size() const { return 3; } + size_t Config::sliding_window_tracker_frequent() const { return 1; } void Config::log_config() const { @@ -63,6 +85,18 @@ void Config::log_config() const { << camera_intrinsic() << "\n" << std::endl; + ss << "Config::camera_distortion_flag:\n" + << camera_distortion_flag() << "\n" + << std::endl; + + ss << "Config::camera_distortion:\n" + << camera_distortion().transpose() << "\n" + << std::endl; + + ss << "Config::camera_time_offset:\n" + << camera_time_offset() << "\n" + << std::endl; + ss << "Config::camera_to_body_rotation:\n" << camera_to_body_rotation().coeffs().transpose() << "\n" << std::endl; @@ -101,6 +135,9 @@ void Config::log_config() const { ss << "Config::sliding_window_size: " << sliding_window_size() << std::endl; + ss << "Config::sliding_window_subframe_size: " + << sliding_window_subframe_size() << std::endl; + ss << "Config::sliding_window_force_keyframe_landmarks: " << sliding_window_force_keyframe_landmarks() << std::endl; @@ -122,6 +159,15 @@ void Config::log_config() const { ss << "Config::feature_tracker_predict_keypoints: " << feature_tracker_predict_keypoints() << std::endl; + ss << "Config::feature_tracker_clahe_clip_limit: " + << feature_tracker_clahe_clip_limit() << std::endl; + + ss << "Config::feature_tracker_clahe_width: " + << feature_tracker_clahe_width() << std::endl; + + ss << "Config::feature_tracker_clahe_height: " + << feature_tracker_clahe_height() << std::endl; + ss << "Config::initializer_keyframe_gap: " << initializer_keyframe_gap() << std::endl; @@ -154,6 +200,21 @@ void Config::log_config() const { ss << "Config::solver_time_limit: " << solver_time_limit() << std::endl; + ss << "Config::parsac_flag: " << parsac_flag() << std::endl; + + ss << "Config::parsac_dynamic_probability: " << parsac_dynamic_probability() + << std::endl; + + ss << "Config::parsac_threshold: " << parsac_threshold() << std::endl; + + ss << "Config::parsac_norm_scale: " << parsac_norm_scale() << std::endl; + + ss << "Config::rotation_misalignment_threshold: " + << rotation_misalignment_threshold() << std::endl; + + ss << "Config::rotation_ransac_threshold: " << rotation_ransac_threshold() + << std::endl; + #if defined(XRSLAM_ENABLE_THREADING) ss << std::endl; ss << "THREADING ENABLE" << std::endl; diff --git a/xrslam/src/xrslam/core/feature_tracker.cpp b/xrslam/src/xrslam/core/feature_tracker.cpp index fae839a..741afb2 100644 --- a/xrslam/src/xrslam/core/feature_tracker.cpp +++ b/xrslam/src/xrslam/core/feature_tracker.cpp @@ -36,7 +36,9 @@ void FeatureTracker::work(std::unique_lock &l) { frames.pop_front(); l.unlock(); - frame->image->preprocess(); + frame->image->preprocess(config->feature_tracker_clahe_clip_limit(), + config->feature_tracker_clahe_width(), + config->feature_tracker_clahe_height()); auto [latest_optimized_time, latest_optimized_frame_id, latest_optimized_pose, latest_optimized_motion] = diff --git a/xrslam/src/xrslam/core/feature_tracker.h b/xrslam/src/xrslam/core/feature_tracker.h index 1ecf5bb..f23959e 100644 --- a/xrslam/src/xrslam/core/feature_tracker.h +++ b/xrslam/src/xrslam/core/feature_tracker.h @@ -29,7 +29,7 @@ class FeatureTracker : public Worker { std::optional> get_latest_state() const; - std::unique_ptr map; + std::shared_ptr map; std::unique_ptr keymap; private: diff --git a/xrslam/src/xrslam/core/frontend_worker.cpp b/xrslam/src/xrslam/core/frontend_worker.cpp index c2d988a..f54bc46 100644 --- a/xrslam/src/xrslam/core/frontend_worker.cpp +++ b/xrslam/src/xrslam/core/frontend_worker.cpp @@ -49,6 +49,8 @@ void FrontendWorker::work(std::unique_lock &l) { } else { sliding_window_tracker->map->create_virtual_object_manager(); } + sliding_window_tracker->feature_tracking_map = + detail->feature_tracker->map; std::unique_lock lk(latest_state_mutex); auto [t, pose, motion] = sliding_window_tracker->get_latest_state(); latest_state = {t, pending_frame_id, pose, motion}; diff --git a/xrslam/src/xrslam/core/sliding_window_tracker.cpp b/xrslam/src/xrslam/core/sliding_window_tracker.cpp index 92ea16f..72306bc 100644 --- a/xrslam/src/xrslam/core/sliding_window_tracker.cpp +++ b/xrslam/src/xrslam/core/sliding_window_tracker.cpp @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -60,12 +61,18 @@ void SlidingWindowTracker::mirror_frame(Map *feature_tracking_map, if (Track *track = old_frame_i->get_track(ki)) { if (size_t kj = track->get_keypoint_index(old_frame_j); kj != nil()) { - new_frame_i->get_track(ki, map.get()) - ->add_keypoint(new_frame_j, kj); + Track *new_track = new_frame_i->get_track(ki, map.get()); + new_track->add_keypoint(new_frame_j, kj); + track->tag(TT_TRASH) = + new_track->tag(TT_TRASH) && !new_track->tag(TT_STATIC); } } } + map->prune_tracks([](const Track *track) { + return track->tag(TT_TRASH) && !track->tag(TT_STATIC); + }); + new_frame_j->preintegration.integrate(new_frame_j->image->t, new_frame_i->motion.bg, new_frame_i->motion.ba, true, true); @@ -73,6 +80,13 @@ void SlidingWindowTracker::mirror_frame(Map *feature_tracking_map, } bool SlidingWindowTracker::track() { + + if (config->parsac_flag()) { + if (judge_track_status()) { + update_track_status(); + } + } + localize_newframe(); if (manage_keyframe()) { @@ -126,8 +140,7 @@ void SlidingWindowTracker::localize_newframe() { for (size_t k = 0; k < frame_j->keypoint_num(); ++k) { if (Track *track = frame_j->get_track(k)) { - if (/* track->has_keypoint(frame_i) && */ track->all_tagged( - TT_VALID, TT_TRIANGULATED)) { + if (track->all_tagged(TT_VALID, TT_TRIANGULATED, TT_STATIC)) { solver->put_factor( Solver::create_reprojection_prior_factor(frame_j, track)); } @@ -178,7 +191,8 @@ bool SlidingWindowTracker::manage_keyframe() { map->attach_frame(std::move(frame_lifted)); return true; } else { - if (keyframe_i->subframes.size() >= 3) { + if (keyframe_i->subframes.size() >= + config->sliding_window_subframe_size()) { // [T]...........<-[T] // +-[T]-[T]-[T] // ==> @@ -193,7 +207,7 @@ bool SlidingWindowTracker::manage_keyframe() { size_t mapped_landmark_count = 0; for (size_t k = 0; k < newframe_j->keypoint_num(); ++k) { if (Track *track = newframe_j->get_track(k)) { - if (track->all_tagged(TT_VALID, TT_TRIANGULATED)) { + if (track->all_tagged(TT_VALID, TT_TRIANGULATED, TT_STATIC)) { mapped_landmark_count++; } } @@ -226,6 +240,7 @@ void SlidingWindowTracker::track_landmark() { track->set_landmark_point(p.value()); track->tag(TT_TRIANGULATED) = true; track->tag(TT_VALID) = true; + track->tag(TT_STATIC) = true; } else { // outlier track->landmark.inv_depth = -1.0; @@ -262,6 +277,8 @@ void SlidingWindowTracker::refine_window() { visited_tracks.insert(track); if (!track->tag(TT_VALID)) continue; + if (!track->tag(TT_STATIC)) + continue; if (!track->first_frame()->tag(FT_KEYFRAME)) continue; solver->add_track_states(track); @@ -276,7 +293,7 @@ void SlidingWindowTracker::refine_window() { Track *track = frame->get_track(j); if (!track) continue; - if (!track->all_tagged(TT_VALID, TT_TRIANGULATED)) + if (!track->all_tagged(TT_VALID, TT_TRIANGULATED, TT_STATIC)) continue; if (!track->first_frame()->tag(FT_KEYFRAME)) continue; @@ -341,7 +358,11 @@ void SlidingWindowTracker::refine_window() { } } - map->prune_tracks([](const Track *track) { return !track->tag(TT_VALID); }); + for (size_t k = 0; k < map->track_num(); ++k) { + Track *track = map->get_track(k); + if (!track->tag(TT_VALID)) + track->tag(TT_TRASH) = true; + } } void SlidingWindowTracker::slide_window() { @@ -399,9 +420,10 @@ void SlidingWindowTracker::refine_subwindow() { if (Track *track = last_subframe->get_track(k)) { if (track->tag(TT_VALID)) { if (track->tag(TT_TRIANGULATED)) { - solver->put_factor( - Solver::create_reprojection_prior_factor( - last_subframe, track)); + if (track->tag(TT_STATIC)) + solver->put_factor( + Solver::create_reprojection_prior_factor( + last_subframe, track)); } else { solver->put_factor(Solver::create_rotation_prior_factor( last_subframe, track)); @@ -430,7 +452,8 @@ void SlidingWindowTracker::refine_subwindow() { prev_frame, subframe, subframe->preintegration)); for (size_t k = 0; k < subframe->keypoint_num(); ++k) { if (Track *track = subframe->get_track(k)) { - if (track->all_tagged(TT_VALID, TT_TRIANGULATED)) { + if (track->all_tagged(TT_VALID, TT_TRIANGULATED, + TT_STATIC)) { if (track->first_frame()->tag(FT_KEYFRAME)) { solver->put_factor( Solver::create_reprojection_prior_factor( @@ -458,4 +481,320 @@ SlidingWindowTracker::get_latest_state() const { return {frame->image->t, frame->pose, frame->motion}; } +matrix<3> compute_essential_matrix(matrix<3> &R, vector<3> &t) { + matrix<3> t_ = matrix<3>::Zero(); + + t_(0, 1) = -t(2); + t_(0, 2) = t(1); + t_(1, 0) = t(2); + t_(1, 2) = -t(0); + t_(2, 0) = -t(1); + t_(2, 1) = t(0); + + matrix<3> E = t_ * R; + return E; +} + +double compute_epipolar_dist(matrix<3> F, vector<2> &pt1, vector<2> &pt2) { + vector<3> l = F * pt1.homogeneous(); + double dist = + std::abs(pt2.homogeneous().transpose() * l) / l.segment<2>(0).norm(); + return dist; +} + +bool SlidingWindowTracker::check_frames_rpe(Track *track, const vector<3> &x) { + std::vector> Ps; + std::vector> ps; + + bool is_valid = true; + double rpe = 0.0; + double rpe_count = 0.0; + for (const auto &[frame, keypoint_index] : track->keypoint_map()) { + if (!frame->tag(FT_KEYFRAME)) + continue; + PoseState pose = frame->get_pose(frame->camera); + vector<3> y = pose.q.conjugate() * (x - pose.p); + if (y.z() <= 1.0e-3 || y.z() > 50) { // todo + is_valid = false; + break; + } + rpe += (apply_k(y, frame->K) - + apply_k(frame->get_keypoint(keypoint_index), frame->K)) + .norm(); + rpe_count += 1.0; + } + is_valid = is_valid && (rpe / std::max(rpe_count, 1.0) < 3.0); + + return is_valid; +} + +bool SlidingWindowTracker::filter_parsac_2d2d( + Frame *frame_i, Frame *frame_j, std::vector &mask, + std::vector &pts_to_index) { + + std::vector> pts1, pts2; + + for (size_t ki = 0; ki < frame_i->keypoint_num(); ++ki) { + if (Track *track = frame_i->get_track(ki)) { + if (size_t kj = track->get_keypoint_index(frame_j)) { + if (kj != nil()) { + pts1.push_back(frame_i->get_keypoint(ki).hnormalized()); + pts2.push_back(frame_j->get_keypoint(kj).hnormalized()); + pts_to_index.push_back(kj); + } + } + } + } + + if (pts1.size() < 10) + return false; + + matrix<3> E = + find_essential_matrix_parsac(pts1, pts2, mask, m_th / frame_i->K(0, 0)); + + return true; +} + +void SlidingWindowTracker::predict_RT(Frame *frame_i, Frame *frame_j, + matrix<3> &R, vector<3> &t) { + + auto camera = frame_i->camera; + auto imu = frame_i->imu; + + matrix<4> Pwc = matrix<4>::Identity(); + matrix<4> PwI = matrix<4>::Identity(); + matrix<4> Pwi = matrix<4>::Identity(); + matrix<4> Pwj = matrix<4>::Identity(); + + Pwc.block<3, 3>(0, 0) = camera.q_cs.toRotationMatrix(); + Pwc.block<3, 1>(0, 3) = camera.p_cs; + PwI.block<3, 3>(0, 0) = imu.q_cs.toRotationMatrix(); + PwI.block<3, 1>(0, 3) = imu.p_cs; + Pwi.block<3, 3>(0, 0) = frame_i->pose.q.toRotationMatrix(); + Pwi.block<3, 1>(0, 3) = frame_i->pose.p; + Pwj.block<3, 3>(0, 0) = frame_j->pose.q.toRotationMatrix(); + Pwj.block<3, 1>(0, 3) = frame_j->pose.p; + + matrix<4> Pji = Pwj.inverse() * Pwi; + + matrix<4> P = (Pwc.inverse() * PwI * Pji * PwI.inverse() * Pwc); + + R = P.block<3, 3>(0, 0); + t = P.block(0, 3, 3, 1); +} + +bool SlidingWindowTracker::judge_track_status() { + + Frame *curr_frame = map->get_frame(map->frame_num() - 1); + Frame *keyframe = map->get_frame(map->frame_num() - 2); + Frame *last_frame = keyframe; + if (!keyframe->subframes.empty()) { + last_frame = keyframe->subframes.back().get(); + } + + curr_frame->preintegration.integrate(curr_frame->image->t, + last_frame->motion.bg, + last_frame->motion.ba, true, true); + curr_frame->preintegration.predict(last_frame, curr_frame); + + m_P2D.clear(); + m_P3D.clear(); + m_lens.clear(); + m_indices_map = std::vector(curr_frame->keypoint_num(), -1); + + for (size_t k = 0; k < curr_frame->keypoint_num(); ++k) { + if (Track *track = curr_frame->get_track(k)) { + if (track->all_tagged(TT_VALID, TT_TRIANGULATED)) { + const vector<3> &bearing = curr_frame->get_keypoint(k); + const vector<3> &landmark = track->get_landmark_point(); + m_P2D.push_back(bearing.hnormalized()); + m_P3D.push_back(landmark); + m_lens.push_back(std::max(track->m_life, size_t(0))); + m_indices_map[k] = m_P3D.size() - 1; + } + } + } + + if (m_P2D.size() < 20) + return false; + + const PoseState &pose = curr_frame->get_pose(curr_frame->camera); + + std::vector mask; + matrix<3> Rcw = pose.q.inverse().toRotationMatrix(); + vector<3> tcw = pose.q.inverse() * pose.p * (-1.0); + matrix<4> T_IMU = + find_pnp_matrix_parsac_imu(m_P3D, m_P2D, m_lens, Rcw, tcw, 0.20, 1.0, + mask, 1.0 / curr_frame->K(0, 0)); + + matrix<3> R; + vector<3> t; + predict_RT(keyframe, curr_frame, R, t); + + // check rpe + { + std::vector> P2D_inliers, P2D_outliers; + std::vector> P3D_inliers, P3D_outliers; + + for (int i = 0; i < m_P2D.size(); ++i) { + if (mask[i]) { + P2D_inliers.push_back(m_P2D[i]); + P3D_inliers.push_back(m_P3D[i]); + } else { + P2D_outliers.push_back(m_P2D[i]); + P3D_outliers.push_back(m_P3D[i]); + } + } + + std::vector inlier_errs, outlier_errs; + double inlier_errs_sum = 0, outlier_errs_sum = 0; + for (int i = 0; i < P2D_inliers.size(); i++) { + vector<3> p = pose.q.conjugate() * (P3D_inliers[i] - pose.p); + double proj_err = + (apply_k(p, curr_frame->K) - + apply_k(P2D_inliers[i].homogeneous(), curr_frame->K)) + .norm(); + inlier_errs.push_back(proj_err); + inlier_errs_sum += proj_err; + } + + for (int i = 0; i < P2D_outliers.size(); i++) { + vector<3> p = pose.q.conjugate() * (P3D_outliers[i] - pose.p); + double proj_err = + (apply_k(p, curr_frame->K) - + apply_k(P2D_outliers[i].homogeneous(), curr_frame->K)) + .norm(); + outlier_errs.push_back(proj_err); + outlier_errs_sum += proj_err; + } + } + + matrix<3> E = compute_essential_matrix(R, t); + matrix<3> F = + keyframe->K.transpose().inverse() * E * curr_frame->K.inverse(); + + std::vector> inlier_set1, inlier_set2; + std::vector> outlier_set1, outlier_set2; + for (size_t i = 0; i < curr_frame->keypoint_num(); ++i) { + if (m_indices_map[i] != -1) { + if (size_t j = + curr_frame->get_track(i)->get_keypoint_index(keyframe); + j != nil()) { + if (mask[m_indices_map[i]]) { + inlier_set1.push_back( + apply_k(keyframe->get_keypoint(j), keyframe->K)); + inlier_set2.push_back( + apply_k(curr_frame->get_keypoint(i), curr_frame->K)); + } else { + outlier_set1.push_back( + apply_k(keyframe->get_keypoint(j), keyframe->K)); + outlier_set2.push_back( + apply_k(curr_frame->get_keypoint(i), curr_frame->K)); + } + } + } + } + + std::vector inliers_dist, outliers_dist; + + for (int i = 0; i < inlier_set1.size(); i++) { + vector<2> &p1 = inlier_set1[i]; + vector<2> &p2 = inlier_set2[i]; + double err = compute_epipolar_dist(F, p1, p2) + + compute_epipolar_dist(F.transpose(), p2, p1); + inliers_dist.push_back(err); + } + + for (int i = 0; i < outlier_set1.size(); i++) { + vector<2> &p1 = outlier_set1[i]; + vector<2> &p2 = outlier_set2[i]; + double err = compute_epipolar_dist(F, p1, p2) + + compute_epipolar_dist(F.transpose(), p2, p1); + outliers_dist.push_back(err); + } + + size_t min_num = 20; + if (inliers_dist.size() < min_num || outliers_dist.size() < min_num) + return false; + + std::sort(inliers_dist.begin(), inliers_dist.end()); + std::sort(outliers_dist.begin(), outliers_dist.end()); + + double th1 = inliers_dist[size_t(inliers_dist.size() * 0.5)]; + double th2 = outliers_dist[size_t(outliers_dist.size() * 0.5)]; + + if (th2 < th1 * 2) // mean there is ambiguity + return false; + + m_th = (th1 + th2) / 2; + + for (size_t k = 0; k < curr_frame->keypoint_num(); ++k) { + if (Track *track = curr_frame->get_track(k)) { + // track->tag(TT_STATIC) = true; + if (m_indices_map[k] != -1) { + if (mask[m_indices_map[k]]) { + curr_frame->get_track(k)->tag(TT_OUTLIER) = false; + curr_frame->get_track(k)->tag(TT_STATIC) = true; + } else { + curr_frame->get_track(k)->tag(TT_OUTLIER) = true; + curr_frame->get_track(k)->tag(TT_STATIC) = false; + } + } + } + } + + return true; +} + +void SlidingWindowTracker::update_track_status() { + + Frame *curr_frame = map->get_frame(map->frame_num() - 1); + size_t frame_id = feature_tracking_map->frame_index_by_id(curr_frame->id()); + + if (frame_id == nil()) + return; + + Frame *old_frame = feature_tracking_map->get_frame(frame_id); + + std::vector outlier_cnts(curr_frame->keypoint_num(), 0); + std::vector matches_cnts(curr_frame->keypoint_num(), 0); + size_t start_idx = std::min( + map->frame_num() - 1, + std::max(map->frame_num() - 1 - config->parsac_keyframe_check_size(), + size_t(0))); + for (size_t i = start_idx; i < map->frame_num() - 1; i++) { + std::vector mask; + std::vector pts_to_index; + if (filter_parsac_2d2d(map->get_frame(i), curr_frame, mask, + pts_to_index)) { + for (size_t j = 0; j < mask.size(); j++) { + if (!mask[j]) { + outlier_cnts[pts_to_index[j]] += 1; + } + matches_cnts[pts_to_index[j]] += 1; + } + } + } + + for (size_t i = 0; i < curr_frame->keypoint_num(); i++) { + if (Track *curr_track = curr_frame->get_track(i)) { + if (size_t j = curr_track->get_keypoint_index(old_frame)) { + if (j != nil()) { + Track *old_track = old_frame->get_track(j); + size_t outlier_th = map->frame_num() / 2; + if (outlier_cnts[i] > outlier_th / 2 && + outlier_cnts[i] > 0.8 * matches_cnts[i]) { + curr_track->tag(TT_STATIC) = false; + } + if (!old_track->tag(TT_STATIC) || + !curr_track->tag(TT_STATIC)) { + curr_track->tag(TT_STATIC) = false; + old_track->tag(TT_STATIC) = false; + } + } + } + } + } +} + } // namespace xrslam diff --git a/xrslam/src/xrslam/core/sliding_window_tracker.h b/xrslam/src/xrslam/core/sliding_window_tracker.h index cfed878..d609be7 100644 --- a/xrslam/src/xrslam/core/sliding_window_tracker.h +++ b/xrslam/src/xrslam/core/sliding_window_tracker.h @@ -26,11 +26,27 @@ class SlidingWindowTracker { void refine_subwindow(); + bool judge_track_status(); + void fuse_imu_track(); + bool check_frames_rpe(Track *track, const vector<3> &p); + void predict_RT(Frame *frame_i, Frame *frame_j, matrix<3> &R, vector<3> &t); + bool filter_parsac_2d2d(Frame *frame_i, Frame *frame_j, + std::vector &mask, + std::vector &pts_to_index); + void update_track_status(); + + std::vector> m_P3D; + std::vector> m_P2D; + std::vector m_lens; + std::vector m_indices_map; + bool track(); std::tuple get_latest_state() const; + double m_th; std::unique_ptr map; + std::shared_ptr feature_tracking_map; private: std::shared_ptr config; diff --git a/xrslam/src/xrslam/geometry/pnp.h b/xrslam/src/xrslam/geometry/pnp.h new file mode 100644 index 0000000..7b93efc --- /dev/null +++ b/xrslam/src/xrslam/geometry/pnp.h @@ -0,0 +1,208 @@ +#ifndef XRSLAM_PNP_H +#define XRSLAM_PNP_H + +#include +#include +#include +#include +#include + +namespace xrslam { + +std::vector> solve_pnp_6pt(const std::array, 6> &Xs, + const std::array, 6> &xs) { + cv::Mat rvec, tvec; + + std::vector P3D; + std::vector P2D; + for (auto &X : Xs) + P3D.emplace_back(X[0], X[1], X[2]); + for (auto &x : xs) + P2D.emplace_back(x[0], x[1]); + + cv::Mat K = cv::Mat::eye(3, 3, CV_32FC1); + + cv::solvePnP(P3D, P2D, K, cv::noArray(), rvec, tvec, false, CV_EPNP); + + rvec.convertTo(rvec, CV_32F); + tvec.convertTo(tvec, CV_32F); + + cv::Mat R; + cv::Rodrigues(rvec, R); + cv::Mat T = cv::Mat::eye(4, 4, CV_32FC1); + R.copyTo(T(cv::Rect(0, 0, 3, 3))); + tvec.copyTo(T(cv::Rect(3, 0, 1, 3))); + + T.convertTo(T, CV_64FC1); + matrix<4> pose = + (matrix<4>() << T.at(0, 0), T.at(0, 1), + T.at(0, 2), T.at(0, 3), T.at(1, 0), + T.at(1, 1), T.at(1, 2), T.at(1, 3), + T.at(2, 0), T.at(2, 1), T.at(2, 2), + T.at(2, 3), T.at(3, 0), T.at(3, 1), + T.at(3, 2), T.at(3, 3)) + .finished(); + + std::vector> poses; + poses.push_back(pose); + return poses; +} + +std::vector> solve_pnp_4pt(const std::array, 4> &Xs, + const std::array, 4> &xs) { + cv::Mat rvec, tvec; + + std::vector P3D; + std::vector P2D; + for (auto &X : Xs) + P3D.emplace_back(X[0], X[1], X[2]); + for (auto &x : xs) + P2D.emplace_back(x[0], x[1]); + + cv::Mat K = cv::Mat::eye(3, 3, CV_32FC1); + + cv::solvePnP(P3D, P2D, K, cv::noArray(), rvec, tvec, false, CV_EPNP); + + rvec.convertTo(rvec, CV_32F); + tvec.convertTo(tvec, CV_32F); + + cv::Mat R; + cv::Rodrigues(rvec, R); + cv::Mat T = cv::Mat::eye(4, 4, CV_32FC1); + R.copyTo(T(cv::Rect(0, 0, 3, 3))); + tvec.copyTo(T(cv::Rect(3, 0, 1, 3))); + + T.convertTo(T, CV_64FC1); + matrix<4> pose = + (matrix<4>() << T.at(0, 0), T.at(0, 1), + T.at(0, 2), T.at(0, 3), T.at(1, 0), + T.at(1, 1), T.at(1, 2), T.at(1, 3), + T.at(2, 0), T.at(2, 1), T.at(2, 2), + T.at(2, 3), T.at(3, 0), T.at(3, 1), + T.at(3, 2), T.at(3, 3)) + .finished(); + + std::vector> poses; + poses.push_back(pose); + return poses; +} + +inline double pnp_reproject_error(const matrix<4> &T, const vector<3> &P1, + const vector<2> &p2) { + return (p2 - (T.block<3, 3>(0, 0) * P1 + T.block<3, 1>(0, 3)).hnormalized()) + .squaredNorm(); +} + +matrix<4> find_pnp_matrix(const std::vector> &Xs, + const std::vector> &xs, + std::vector &inlier_mask, + double threshold = 1.0, double confidence = 0.999, + size_t max_iteration = 1000, int seed = 0) { + struct PnpSolver { + std::vector> + operator()(const std::array, 6> &samples1, + const std::array, 6> &samples2) const { + return solve_pnp_6pt(samples1, samples2); + } + // std::vector> operator()(const std::array, 4> + // &samples1, const std::array, 4> &samples2) const { + // return solve_pnp_4pt(samples1, samples2); + // } + }; + struct PnpEvaluator { + const matrix<4> &T; + PnpEvaluator(const matrix<4> &T) : T(T) {} + double operator()(const vector<3> &p1, const vector<2> &p2) const { + return pnp_reproject_error(T, p1, p2); + } + }; + + static const double t2 = 5.99; + + Ransac<6, matrix<4>, PnpSolver, PnpEvaluator> ransac( + 2.0 * t2 * threshold * threshold, confidence, max_iteration, seed); + matrix<4> pose = ransac.solve(Xs, xs); + inlier_mask.swap(ransac.inlier_mask); + + return pose; +} + +matrix<4> find_pnp_matrix_parsac(const std::vector> &Xs, + const std::vector> &xs, + std::vector &inlier_mask, + double threshold = 1.0, + double confidence = 0.999, + size_t max_iteration = 1000, int seed = 0) { + struct PnpSolver { + std::vector> + operator()(const std::array, 6> &samples1, + const std::array, 6> &samples2) const { + return solve_pnp_6pt(samples1, samples2); + } + // std::vector> operator()(const std::array, 4> + // &samples1, const std::array, 4> &samples2) const { + // return solve_pnp_4pt(samples1, samples2); + // } + }; + struct PnpEvaluator { + const matrix<4> &T; + PnpEvaluator(const matrix<4> &T) : T(T) {} + double operator()(const vector<3> &p1, const vector<2> &p2) const { + return pnp_reproject_error(T, p1, p2); + } + }; + + static const double t2 = 5.99; + static std::vector binConfidences(400, 0.5); + Parsac<6, matrix<4>, PnpSolver, PnpEvaluator> parsac( + 2.0 * t2 * threshold * threshold, confidence, max_iteration, seed); + matrix<4> pose = parsac.solve(binConfidences, Xs, xs); + + inlier_mask.swap(parsac.inlier_mask); + + return pose; +} + +matrix<4> find_pnp_matrix_parsac_imu( + const std::vector> &Xs, const std::vector> &xs, + const std::vector &lens, const matrix<3> &R, const vector<3> &t, + const double &dynamic_prob, const double &scale, + std::vector &inlier_mask, double threshold = 1.0, + double confidence = 0.999, size_t max_iteration = 1000, int seed = 0) { + struct PnpSolver { + std::vector> + operator()(const std::array, 6> &samples1, + const std::array, 6> &samples2) const { + return solve_pnp_6pt(samples1, samples2); + } + // std::vector> operator()(const std::array, 4> + // &samples1, const std::array, 4> &samples2) const { + // return solve_pnp_4pt(samples1, samples2); + // } + }; + struct PnpEvaluator { + const matrix<4> &T; + PnpEvaluator(const matrix<4> &T) : T(T) {} + double operator()(const vector<3> &p1, const vector<2> &p2) const { + return pnp_reproject_error(T, p1, p2); + } + }; + + static const double t2 = 5.99; + static std::vector binConfidences(400, 0.5); + IMU_Parsac<6, matrix<4>, PnpSolver, PnpEvaluator> imu_parsac( + 2.0 * t2 * threshold * threshold, confidence, max_iteration, seed); + + imu_parsac.SetPriorPose(R, t); + imu_parsac.SetNormScale(scale); + imu_parsac.SetLens(lens); + imu_parsac.SetDynamicProbability(dynamic_prob); + matrix<4> pose = imu_parsac.solve(binConfidences, Xs, xs); + inlier_mask.swap(imu_parsac.inlier_mask); + + return pose; +} + +} // namespace xrslam + +#endif diff --git a/xrslam/src/xrslam/geometry/stereo.cpp b/xrslam/src/xrslam/geometry/stereo.cpp index 51bb788..95e038c 100644 --- a/xrslam/src/xrslam/geometry/stereo.cpp +++ b/xrslam/src/xrslam/geometry/stereo.cpp @@ -4,6 +4,7 @@ #include #include #include +#include namespace xrslam { @@ -117,4 +118,69 @@ matrix<3> find_homography_matrix(const std::vector> &points1, return H; } +// ----------------------------- +// parsac version +// ----------------------------- +matrix<3> find_essential_matrix_parsac(const std::vector> &points1, + const std::vector> &points2, + std::vector &inlier_mask, + double threshold, double confidence, + size_t max_iteration, int seed) { + struct EssentialSolver { + std::vector> + operator()(const std::array, 5> &samples1, + const std::array, 5> &samples2) const { + return solve_essential_5pt(samples1, samples2); + } + }; + struct EssentialEvaluator { + const matrix<3> &E; + const matrix<3> Et; + EssentialEvaluator(const matrix<3> &E) : E(E), Et(E.transpose()) {} + double operator()(const vector<2> &p1, const vector<2> &p2) const { + float error = essential_geometric_error(E, p1, p2) + + essential_geometric_error(Et, p2, p1); + // std::cout << error << std::endl; + return essential_geometric_error(E, p1, p2) + + essential_geometric_error(Et, p2, p1); + } + }; + static const double t1 = 3.84; + static std::vector binConfidences(400, 0.5); + Parsac<5, matrix<3>, EssentialSolver, EssentialEvaluator> parsac( + 2.0 * t1 * threshold * threshold, confidence, max_iteration, seed); + matrix<3> E = parsac.solve(binConfidences, points1, points2); + inlier_mask.swap(parsac.inlier_mask); + return E; +} + +matrix<3> find_homography_matrix_parsac(const std::vector> &points1, + const std::vector> &points2, + std::vector &inlier_mask, + double threshold, double confidence, + size_t max_iteration, int seed) { + struct HomographySolver { + matrix<3> operator()(const std::array, 4> &samples1, + const std::array, 4> &samples2) const { + return solve_homography_4pt(samples1, samples2); + } + }; + struct HomographyEvaluator { + const matrix<3> &H; + const matrix<3> Hinv; + HomographyEvaluator(const matrix<3> &H) : H(H), Hinv(H.inverse()) {} + double operator()(const vector<2> &p1, const vector<2> &p2) const { + return homography_geometric_error(H, p1, p2) + + homography_geometric_error(Hinv, p2, p1); + } + }; + static const double t2 = 5.99; + static std::vector binConfidences(400, 0.5); + Parsac<4, matrix<3>, HomographySolver, HomographyEvaluator> parsac( + 2.0 * t2 * threshold * threshold, confidence, max_iteration, seed); + matrix<3> H = parsac.solve(binConfidences, points1, points2); + inlier_mask.swap(parsac.inlier_mask); + return H; +} + } // namespace xrslam diff --git a/xrslam/src/xrslam/geometry/stereo.h b/xrslam/src/xrslam/geometry/stereo.h index 3bcf634..0d40975 100644 --- a/xrslam/src/xrslam/geometry/stereo.h +++ b/xrslam/src/xrslam/geometry/stereo.h @@ -54,6 +54,21 @@ matrix<3> find_homography_matrix(const std::vector> &points1, double confidence = 0.999, size_t max_iteration = 1000, int seed = 0); +matrix<3> find_essential_matrix_parsac(const std::vector> &points1, + const std::vector> &points2, + std::vector &inlier_mask, + double threshold = 1.0, + double confidence = 0.999, + size_t max_iteration = 1000, + int seed = 0); +matrix<3> find_homography_matrix_parsac(const std::vector> &points1, + const std::vector> &points2, + std::vector &inlier_mask, + double threshold = 1.0, + double confidence = 0.999, + size_t max_iteration = 1000, + int seed = 0); + inline vector<4> triangulate_point(const matrix<3, 4> &P1, const matrix<3, 4> &P2, const vector<3> &point1, diff --git a/xrslam/src/xrslam/map/frame.cpp b/xrslam/src/xrslam/map/frame.cpp index 158b562..515ef35 100644 --- a/xrslam/src/xrslam/map/frame.cpp +++ b/xrslam/src/xrslam/map/frame.cpp @@ -114,7 +114,8 @@ void Frame::track_keypoints(Frame *next_frame, Config *config) { } } matrix<3> R = find_rotation_matrix(bearings, next_bearings, mask, - M_PI / 18.0); // TODO: make configurable + (M_PI / 180.0) * + config->rotation_ransac_threshold()); std::vector angles; for (size_t i = 0; i < mask.size(); ++i) { @@ -129,7 +130,7 @@ void Frame::track_keypoints(Frame *next_frame, Config *config) { inspect_debug(feature_tracker_angle_misalignment, angle_misalignment) { angle_misalignment = misalignment; } - if (misalignment < 0.02) { // TODO: make configurable + if (misalignment < config->rotation_misalignment_threshold()) { next_frame->tag(FT_NO_TRANSLATION) = true; } @@ -153,7 +154,9 @@ void Frame::track_keypoints(Frame *next_frame, Config *config) { config->feature_tracker_min_keypoint_distance()); for (auto &[keypoint_index, track_length] : keypoint_index_track_length) { vector<2> pt = next_keypoints[keypoint_index]; - if (filter.permit_point(pt)) { + Track *track = this->get_track(keypoint_index); + if (filter.permit_point(pt) && + (!track || (track && !track->tag(TT_TRASH)))) { filter.preset_point(pt); } else { status[keypoint_index] = 0; diff --git a/xrslam/src/xrslam/map/track.cpp b/xrslam/src/xrslam/map/track.cpp index 0eb51f1..e8652b3 100644 --- a/xrslam/src/xrslam/map/track.cpp +++ b/xrslam/src/xrslam/map/track.cpp @@ -4,8 +4,7 @@ namespace xrslam { -Track::Track() = default; - +Track::Track() { tag(TT_STATIC) = true; } Track::~Track() = default; const vector<3> &Track::get_keypoint(Frame *frame) const { @@ -17,6 +16,10 @@ void Track::add_keypoint(Frame *frame, size_t keypoint_index) { frame->tracks[keypoint_index] = this; frame->reprojection_error_factors[keypoint_index] = Solver::create_reprojection_error_factor(frame, this); + if (this->tag(TT_TRIANGULATED)) + m_life++; + else + m_life = 1; } void Track::remove_keypoint(Frame *frame, bool suicide_if_empty) { @@ -40,7 +43,7 @@ void Track::remove_keypoint(Frame *frame, bool suicide_if_empty) { } } -std::optional> Track::triangulate() const { +std::optional> Track::triangulate() { std::vector> Ps; std::vector> ps; for (const auto &[frame, keypoint_index] : keypoint_map()) { @@ -65,6 +68,7 @@ std::optional> Track::triangulate() const { } } if (is_valid) { + m_life = 1; return hlandmark.hnormalized(); } else { return {}; diff --git a/xrslam/src/xrslam/map/track.h b/xrslam/src/xrslam/map/track.h index ad104ca..c1e49f4 100644 --- a/xrslam/src/xrslam/map/track.h +++ b/xrslam/src/xrslam/map/track.h @@ -10,7 +10,15 @@ namespace xrslam { -enum TrackTag { TT_VALID = 0, TT_TRIANGULATED, TT_FIX_INVD }; +enum TrackTag { + TT_VALID = 0, + TT_TRIANGULATED, + TT_FIX_INVD, + TT_TRASH, + TT_STATIC, + TT_OUTLIER, + TT_TEMP +}; class Track : public Tagged, public Identifiable { friend class Map; @@ -57,7 +65,7 @@ class Track : public Tagged, public Identifiable { void add_keypoint(Frame *frame, size_t keypoint_index); void remove_keypoint(Frame *frame, bool suicide_if_empty = true); - std::optional> triangulate() const; + std::optional> triangulate(); double triangulation_angle(const vector<3> &p) const; vector<3> get_landmark_point() const; @@ -66,6 +74,7 @@ class Track : public Tagged, public Identifiable { std::unique_lock lock() const { return map->lock(); } LandmarkState landmark; + size_t m_life = 0; private: std::map> keypoint_refs; diff --git a/xrslam/src/xrslam/utility/imu_parsac.h b/xrslam/src/xrslam/utility/imu_parsac.h new file mode 100644 index 0000000..0cab401 --- /dev/null +++ b/xrslam/src/xrslam/utility/imu_parsac.h @@ -0,0 +1,455 @@ +#ifndef XRSLAM_IMU_PARSAC_H +#define XRSLAM_IMU_PARSAC_H + +#include +#include +#include + +namespace xrslam { + +template +struct IMU_Parsac { + double threshold; + double confidence; + size_t max_iteration; + int seed; + + ModelType model; + size_t inlier_count; + std::vector inlier_mask; + + IMU_Parsac(double threshold, double confidence = 0.999, + size_t max_iteration = 1000, int seed = 0) + : threshold(threshold), confidence(confidence), + max_iteration(max_iteration), seed(seed), + m_parsacMinPriorBinConfidence(0.5) {} + + template + ModelType solve(std::vector &binConfidences, + const std::vector &...data) { + std::tuple &...> tdata = + std::make_tuple(std::cref(data)...); + size_t size = std::get<0>(tdata).size(); + + auto &pts1 = std::get<0>(tdata); + auto &pts2 = std::get<1>(tdata); + + LotBox lotbox(size); + lotbox.seed(seed); + + if (m_verbose) { + std::cout << "============== IMU PARSAC ================" + << std::endl; + std::cout << "-- Matches Size: " << pts1.size() << std::endl; + } + + double K = log(std::max(1 - confidence, 1.0e-5)); + + inlier_count = 0; + + if (size < ModelDoF) { + std::vector _(size, 0); + inlier_mask.swap(_); + return model; + } + + SetBins(20, 20); + if (m_verbose) + std::cout << "-- SetBins()" << std::endl; + + CreateBucket(); + if (m_verbose) + std::cout << "-- CreateBucket()" << std::endl; + + BucketData(pts2); + if (m_verbose) + std::cout << "-- BucketData()" << std::endl; + + ConvertConfidencesBinToValidBin(binConfidences, + m_validBinConfidencesPrior); + if (m_verbose) + std::cout << "-- ConvertConfidencesBinToValidBin()" << std::endl; + + ThresholdAndNormalizeConfidences(m_validBinConfidencesPrior); + if (m_verbose) + std::cout << "-- ThresholdAndNormalizeConfidences()" << std::endl; + + AccumulateConfidences(m_validBinConfidencesPrior, + m_validBinConfidencesAccumulatedPrior); + if (m_verbose) + std::cout << "-- AccumulateConfidences()" << std::endl; + + Sampler sampler(m_validBinConfidencesAccumulatedPrior); + if (m_verbose) + std::cout << "-- sampler()" << std::endl; + + if (!ComputePriorDistribution(pts1, pts2)) { + if (m_verbose) + std::cout << "-- ComputePriorDistribution()" << std::endl; + // std::cout << "IMU Process" << std::endl; + inlier_mask = std::vector(size, 1); + return matrix<4>::Identity(); + } + + std::vector> validBinInliersBest; + size_t iter_max = max_iteration; + float scoreMax = -FLT_MAX, score; + for (size_t iter = 0; iter < iter_max; ++iter) { + // if(m_verbose) std::cout << "iters: " << iter << " " << iter_max + // << std::endl; + std::tuple...> tsample; + + lotbox.refill_all(); + sampler.refill_all(); + + for (size_t si = 0; si < ModelDoF; ++si) { + size_t sample_index; + if (m_nValidBins > 20) + sample_index = sampler.draw_by_weight(); + else + sample_index = lotbox.draw_without_replacement(); + make_sample(tdata, tsample, sample_index, si); + } + + std::vector models{apply(ModelSolver(), tsample)}; + int cnt = 0; + for (const auto ¤t_model : models) { + size_t current_inlier_count = 0; + size_t current_outlier_count = 0; + std::vector current_inlier_mask(size, 0); + std::vector current_outlier_mask(size, 0); + ModelEvaluator eval(current_model); + for (size_t i = 0; i < size; ++i) { + double error = eval(data[i]...); + if (error <= threshold) { + current_inlier_count++; + current_inlier_mask[i] = 1; + } else { + current_outlier_mask[i] = 1; + current_outlier_count++; + } + } + + size_t overlap_inlier_count = 0; + std::vector overlap_inlier_mask(size, 0); + for (int i = 0; i < size; i++) { + if (m_prior_inliers_mask[i] && current_inlier_mask[i]) { + overlap_inlier_mask[i] = 1; + overlap_inlier_count++; + } + } + + if (overlap_inlier_count < ModelDoF) + continue; + + std::vector> validBinInliers; + std::vector> validBinOutliers; + ConvertInliersListToValidBin(current_inlier_mask, + validBinInliers); + // if(m_verbose) std::cout << " -- + // ConvertInliersListToValidBin()" << std::endl; + score = ComputeScore(validBinInliers, m_validBinConfidences); + + // if(m_verbose) std::cout << " -- ComputeScore()" << + // std::endl; + if (score > scoreMax || + score == scoreMax && + (overlap_inlier_count > inlier_count)) { + scoreMax = score; + model = current_model; + inlier_count = overlap_inlier_count; + validBinInliersBest = validBinInliers; + // inlier_mask.swap(overlap_inlier_mask); + inlier_mask.swap(current_inlier_mask); + double inlier_ratio = inlier_count / (double)size; + double N = K / log(1 - pow(inlier_ratio, 5)); + if (N < (double)iter_max) { + iter_max = (size_t)ceil(N); + } + } + // if(m_verbose) std::cout << " -- finish update" << std::endl; + } + // if(m_verbose) std::cout << " -- finish iterate" << std::endl; + } + + if (inlier_count < ModelDoF) { + inlier_mask = std::vector(size, 1); + return matrix<4>::Identity(); + } + + if (m_verbose) + std::cout << "-- Before ComputeScore()" << std::endl; + ComputeScore(validBinInliersBest, m_validBinConfidences); + if (m_verbose) + std::cout << "-- ComputeScore()" << std::endl; + ConvertConfidencesValidBinToBin(m_validBinConfidences, binConfidences); + if (m_verbose) + std::cout << "-- ConvertConfidencesValidBinToBin()" << std::endl; + + return model; + } + + matrix<4> prior_model; + std::vector m_lens; + std::vector m_prior_inliers_mask; + + void SetLens(const std::vector &lens) { m_lens = lens; } + + void SetPriorPose(matrix<3> R, vector<3> t) { + prior_model = matrix<4>::Identity(); + prior_model.block<3, 3>(0, 0) = R; + prior_model.block<3, 1>(0, 3) = t; + } + + void SetNormScale(double scale = 1.0) { m_norm_scale = scale; } + template + bool ComputePriorDistribution(const std::vector &...data) { + + std::tuple &...> tdata = + std::make_tuple(std::cref(data)...); + size_t size = std::get<0>(tdata).size(); + + auto &pts1 = std::get<0>(tdata); + auto &pts2 = std::get<1>(tdata); + + size_t inlier_count = 0; + m_prior_inliers_mask = std::vector(size, 0); + + ModelEvaluator eval(prior_model); + for (size_t i = 0; i < size; ++i) { + double error = eval(data[i]...); + if (error <= threshold * 2.0) { + inlier_count++; + m_prior_inliers_mask[i] = 1; + } + } + + // if(inlier_count < 20 ) //|| inlier_count * 1.0 / size < 0.2 + if ((double)inlier_count / size < 0.15 || + inlier_count < 20) //|| inlier_count * 1.0 / size < 0.2 + return false; + + return true; + } + + void SetDynamicProbability(const double &p) { m_dynamic_probability = p; } + + private: + size_t m_nBinsX, m_nBinsY, m_nBins; + std::vector> m_binLocations; + std::vector m_mapBinToValidBin; + std::vector m_mapValidBinToBin; + std::vector m_mapDataToValidBin; + std::vector> m_validBinData; + std::vector m_validBinDataSizes; + std::vector m_validBinLens; + + std::vector m_validBinConfidences; + std::vector m_validBinConfidencesPrior; + std::vector m_validBinConfidencesAccumulatedPrior; + float m_parsacMinPriorBinConfidence; + + size_t m_nValidBins; + float m_BinHeight; + float m_BinWidth; + bool m_verbose = false; + + double m_dynamic_probability = 0; + double m_norm_scale = 1.0; + + float ComputeScore(std::vector> &validBinInliers, + std::vector &validBinConfidences) { + + validBinConfidences.resize(m_nValidBins); + + float validBinConfidenceSum = 0, validBinConfidenceSqSum = 0; + vector<2> sum(0.0, 0.0); + + for (size_t iBinValid = 0; iBinValid < m_nValidBins; ++iBinValid) { + float t = 1 - std::pow(m_dynamic_probability, + 0.10 * m_validBinLens[iBinValid]); + const float validBinConfidence = + t * float(validBinInliers[iBinValid].size()) / + m_validBinDataSizes[iBinValid]; + validBinConfidences[iBinValid] = validBinConfidence; + vector<2> x = m_binLocations[m_mapValidBinToBin[iBinValid]]; + x *= validBinConfidence; + sum += x; + validBinConfidenceSum += validBinConfidence; + validBinConfidenceSqSum += validBinConfidence * validBinConfidence; + } + float norm = 1.f / validBinConfidenceSum; + vector<2> &mean = sum; + mean *= norm; + + float Cxx = 0, Cxy = 0, Cyy = 0; + for (size_t iBinValid = 0; iBinValid < m_nValidBins; ++iBinValid) { + const float validBinConfidence = validBinConfidences[iBinValid]; + vector<2> x = m_binLocations[m_mapValidBinToBin[iBinValid]]; + vector<2> dx(x[0] - mean[0], x[1] - mean[1]); + Cxx += (dx[0] * dx[0]) * validBinConfidence; + Cxy += (dx[0] * dx[1]) * validBinConfidence; + Cyy += (dx[1] * dx[1]) * validBinConfidence; + } + + norm = validBinConfidenceSum / + (validBinConfidenceSum * validBinConfidenceSum - + validBinConfidenceSqSum); + float imgRatio = + norm * sqrt(Cxx * Cyy - Cxy * Cxy); // * M_PI / (2.0 * 2.0); + float score = imgRatio * validBinConfidenceSum; + return score; + } + + void SetBins(const int nBinsX, const int nBinsY) { + // Some bug occurred! the m_nBinsX is set to be the nBinsX at unknown + // time! if(m_nBinsX == nBinsX && m_nBinsY == nBinsY) + // return; + + m_nBinsX = nBinsX; + m_nBinsY = nBinsY; + m_nBins = m_nBinsX * m_nBinsY; + m_BinHeight = 2 * m_norm_scale / m_nBinsY; + m_BinWidth = 2 * m_norm_scale / m_nBinsX; + } + + void CreateBucket() { + m_binLocations.reserve(m_nBins); + float y = m_BinHeight * 0.5; + for (size_t i = 0; i < m_nBinsY; ++i, y += m_BinHeight) { + float x = m_BinWidth * 0.5; + for (size_t j = 0; j < m_nBinsX; ++j, x += m_BinWidth) { + m_binLocations.emplace_back( + vector<2>(x - m_norm_scale, y - m_norm_scale)); + } + } + } + + template + void BucketData(const std::vector &pts) { + const size_t N = pts.size(); + m_mapDataToValidBin.resize(N); + m_mapBinToValidBin = std::vector(m_nBins, SIZE_MAX); + for (size_t i = 0; i < N; ++i) { + const vector<2> &p1 = pts[i]; + const size_t iBin = + size_t((p1[0] + m_norm_scale) / m_BinWidth) + + m_nBinsX * size_t((p1[1] + m_norm_scale) / m_BinHeight); + const size_t iBinValid = m_mapBinToValidBin[iBin]; + if (iBinValid == SIZE_MAX) { + m_mapBinToValidBin[iBin] = size_t(m_mapValidBinToBin.size()); + m_mapDataToValidBin[i] = size_t(m_mapValidBinToBin.size()); + ; + m_mapValidBinToBin.push_back(iBin); + m_validBinData.push_back(std::vector(1, i)); + m_validBinDataSizes.push_back(1); + m_validBinLens.push_back(m_lens[i]); + } else { + m_mapDataToValidBin[i] = iBinValid; + m_validBinData[iBinValid].push_back(i); + ++m_validBinDataSizes[iBinValid]; + m_validBinLens[iBinValid] += m_lens[i]; + } + } + + m_nValidBins = m_validBinDataSizes.size(); + + for (size_t i = 0; i < m_nValidBins; i++) { + m_validBinLens[i] /= m_validBinDataSizes[i]; + } + } + + void ConvertInliersListToValidBin( + std::vector &inliers_mask, + std::vector> &validBinInliers) { + validBinInliers = std::vector>( + m_nValidBins, std::vector()); + for (int iBinValid = 0; iBinValid < m_nValidBins; ++iBinValid) { + validBinInliers[iBinValid].resize(0); + } + + const size_t N = inliers_mask.size(); + for (size_t i = 0; i < N; ++i) { + if (inliers_mask[i] == 1) { + validBinInliers[m_mapDataToValidBin[i]].push_back(i); + } + } + } + + void + ConvertConfidencesBinToValidBin(const std::vector &binConfidences, + std::vector &validBinConfidences) { + + validBinConfidences.resize(m_nValidBins); + for (size_t iBin = 0; iBin < m_nValidBins; ++iBin) { + validBinConfidences[iBin] = + binConfidences[m_mapValidBinToBin[iBin]]; + } + } + + void ConvertConfidencesValidBinToBin( + const std::vector &validBinConfidences, + std::vector &binConfidences) { + + binConfidences.resize(m_nBins); + for (size_t iBin = 0; iBin < m_nBins; ++iBin) { + if (m_mapBinToValidBin[iBin] == SIZE_MAX) + binConfidences[iBin] = 0; + else + binConfidences[iBin] = + validBinConfidences[m_mapBinToValidBin[iBin]]; + } + } + + void ThresholdAndNormalizeConfidences(std::vector &confidences) { + float confidenceSum = 0; + const size_t N = confidences.size(); + for (size_t i = 0; i < N; ++i) { + confidences[i] = + std::max(m_parsacMinPriorBinConfidence, confidences[i]); + confidenceSum += confidences[i]; + } + const float norm = 1.0 / confidenceSum; + for (size_t i = 0; i < N; ++i) { + confidences[i] *= norm; + } + } + + void AccumulateConfidences(const std::vector &confidences, + std::vector &confidencesAccumulated) { + const size_t N = confidences.size(); + confidencesAccumulated.resize(N + 1); + confidencesAccumulated[0] = 0; + for (size_t i = 0; i < N; ++i) + confidencesAccumulated[i + 1] = + confidencesAccumulated[i] + confidences[i]; + const float norm = 1.f / confidencesAccumulated[N]; + for (size_t i = 0; i < N; ++i) + confidencesAccumulated[i] *= norm; + } + + private: + template + void make_sample(Data &&data, Sample &&sample, size_t idata, + size_t isample) { + size_t I = + std::tuple_size::type>::value; + std::get<0>(sample)[isample] = std::get<0>(data)[idata]; + std::get<1>(sample)[isample] = std::get<1>(data)[idata]; + } + + template + void make_sample_by_prior(Data &&data, Sample &&sample, size_t idata, + size_t isample) { + + size_t idx = + m_validBinData[idata][rand() % m_validBinData[idata].size()]; + std::get<0>(sample)[isample] = std::get<0>(data)[idx]; + std::get<1>(sample)[isample] = std::get<1>(data)[idx]; + } +}; + +} // namespace xrslam + +#endif // XRSLAM_IMU_PARSAC_H diff --git a/xrslam/src/xrslam/utility/parsac.h b/xrslam/src/xrslam/utility/parsac.h new file mode 100644 index 0000000..dea296f --- /dev/null +++ b/xrslam/src/xrslam/utility/parsac.h @@ -0,0 +1,412 @@ +#ifndef XRSLAM_PARSAC_H +#define XRSLAM_PARSAC_H + +#include +#include +#include + +namespace xrslam { + +struct Sampler { + Sampler(std::vector &confidencesAccumulated) + : m_confidencesAccumulated(confidencesAccumulated) { + srand(0); + }; + + bool is_sampled(size_t &idx) { + if (m_sampled_bin_index.size() == 0) + return false; + + for (int i = 0; i < m_sampled_bin_index.size(); ++i) { + if (idx == m_sampled_bin_index[i]) + return true; + } + return false; + } + + size_t draw_by_weight() { + size_t index; + int cnt = 0; + do { + cnt++; + const float r = rand() / (float)RAND_MAX; + index = + size_t(std::upper_bound(m_confidencesAccumulated.begin() + 1, + m_confidencesAccumulated.end(), r) - + m_confidencesAccumulated.begin() - 1); + } while (is_sampled(index)); + m_sampled_bin_index.push_back(index); + + return index; + } + + void refill_all() { m_sampled_bin_index.clear(); } + + void print() { + for (auto i : m_sampled_bin_index) + std::cout << i << " "; + std::cout << std::endl; + } + + private: + std::vector &m_confidencesAccumulated; + std::vector m_sampled_bin_index; +}; + +template +struct Parsac { + double threshold; + double confidence; + size_t max_iteration; + int seed; + + ModelType model; + size_t inlier_count; + std::vector inlier_mask; + + Parsac(double threshold, double confidence = 0.999, + size_t max_iteration = 1000, int seed = 0) + : threshold(threshold), confidence(confidence), + max_iteration(max_iteration), seed(seed), + m_parsacMinPriorBinConfidence(0.5) {} + + template + ModelType solve(std::vector &binConfidences, + const std::vector &...data) { + std::tuple &...> tdata = + std::make_tuple(std::cref(data)...); + size_t size = std::get<0>(tdata).size(); + + auto &pts1 = std::get<0>(tdata); + auto &pts2 = std::get<1>(tdata); + + LotBox lotbox(size); + lotbox.seed(seed); + + if (m_verbose) { + std::cout << "============== PARSAC ================" << std::endl; + std::cout << "-- Matches Size: " << pts1.size() << std::endl; + } + + double K = log(std::max(1 - confidence, 1.0e-5)); + + inlier_count = 0; + + if (size < ModelDoF) { + std::vector _(size, 0); + inlier_mask.swap(_); + return model; + } + + SetBins(20, 20); + if (m_verbose) + std::cout << "-- SetBins()" << std::endl; + + CreateBucket(); + if (m_verbose) + std::cout << "-- CreateBucket()" << std::endl; + + BucketData(pts2); + if (m_verbose) + std::cout << "-- BucketData()" << std::endl; + + ConvertConfidencesBinToValidBin(binConfidences, + m_validBinConfidencesPrior); + if (m_verbose) + std::cout << "-- ConvertConfidencesBinToValidBin()" << std::endl; + + ThresholdAndNormalizeConfidences(m_validBinConfidencesPrior); + if (m_verbose) + std::cout << "-- ThresholdAndNormalizeConfidences()" << std::endl; + + AccumulateConfidences(m_validBinConfidencesPrior, + m_validBinConfidencesAccumulatedPrior); + if (m_verbose) + std::cout << "-- AccumulateConfidences()" << std::endl; + + Sampler sampler(m_validBinConfidencesAccumulatedPrior); + if (m_verbose) + std::cout << "-- sampler()" << std::endl; + + std::vector> validBinInliersBest; + size_t iter_max = max_iteration; + float scoreMax = 0, score; + for (size_t iter = 0; iter < iter_max; ++iter) { + // if(m_verbose) std::cout << "iters: " << iter << std::endl; + std::tuple...> tsample; + + lotbox.refill_all(); + sampler.refill_all(); + + for (size_t si = 0; si < ModelDoF; ++si) { + size_t sample_index; + if (m_nValidBins > 20) + sample_index = sampler.draw_by_weight(); + else + sample_index = lotbox.draw_without_replacement(); + make_sample(tdata, tsample, sample_index, si); + } + + std::vector models{apply(ModelSolver(), tsample)}; + int cnt = 0; + for (const auto ¤t_model : models) { + size_t current_inlier_count = 0; + std::vector current_inlier_mask(size, 0); + + ModelEvaluator eval(current_model); + for (size_t i = 0; i < size; ++i) { + double error = eval(data[i]...); + if (error <= threshold) { + current_inlier_count++; + current_inlier_mask[i] = 1; + } + } + + std::vector> validBinInliers; + ConvertInliersListToValidBin(current_inlier_mask, + validBinInliers); + // if(m_verbose) std::cout << " -- + // ConvertInliersListToValidBin()" << std::endl; + score = ComputeScore(validBinInliers, m_validBinConfidences); + // if(m_verbose) std::cout << " -- ComputeScore()" << + // std::endl; + if (score > scoreMax || + score == scoreMax && + (current_inlier_count > inlier_count)) { + scoreMax = score; + model = current_model; + inlier_count = current_inlier_count; + validBinInliersBest = validBinInliers; + inlier_mask.swap(current_inlier_mask); + double inlier_ratio = inlier_count / (double)size; + double N = K / log(1 - pow(inlier_ratio, 5)); + if (N < (double)iter_max) { + iter_max = (size_t)ceil(N); + } + } + // if(m_verbose) std::cout << " -- finish update" << std::endl; + } + // if(m_verbose) std::cout << " -- finish iterate" << std::endl; + } + if (m_verbose) + std::cout << "-- Before ComputeScore()" << std::endl; + ComputeScore(validBinInliersBest, m_validBinConfidences); + if (m_verbose) + std::cout << "-- ComputeScore()" << std::endl; + + ConvertConfidencesValidBinToBin(m_validBinConfidences, binConfidences); + if (m_verbose) + std::cout << "-- ConvertConfidencesValidBinToBin()" << std::endl; + + return model; + } + + private: + size_t m_nBinsX, m_nBinsY, m_nBins; + std::vector> m_binLocations; + std::vector m_mapBinToValidBin; + std::vector m_mapValidBinToBin; + std::vector m_mapDataToValidBin; + std::vector> m_validBinData; + std::vector m_validBinDataSizes; + + std::vector m_validBinConfidences; + std::vector m_validBinConfidencesPrior; + std::vector m_validBinConfidencesAccumulatedPrior; + float m_parsacMinPriorBinConfidence; + + size_t m_nValidBins; + float m_BinHeight; + float m_BinWidth; + bool m_verbose = false; + + double m_norm_scale = 1.0; + + float ComputeScore(std::vector> &validBinInliers, + std::vector &validBinConfidences) { + + validBinConfidences.resize(m_nValidBins); + + float validBinConfidenceSum = 0, validBinConfidenceSqSum = 0; + vector<2> sum(0.0, 0.0); + + for (size_t iBinValid = 0; iBinValid < m_nValidBins; ++iBinValid) { + const float validBinConfidence = + float(validBinInliers[iBinValid].size()) / + m_validBinDataSizes[iBinValid]; + validBinConfidences[iBinValid] = validBinConfidence; + vector<2> x = m_binLocations[m_mapValidBinToBin[iBinValid]]; + x *= validBinConfidence; + sum += x; + validBinConfidenceSum += validBinConfidence; + validBinConfidenceSqSum += validBinConfidence * validBinConfidence; + } + float norm = 1.f / validBinConfidenceSum; + vector<2> &mean = sum; + mean *= norm; + + float Cxx = 0, Cxy = 0, Cyy = 0; + for (size_t iBinValid = 0; iBinValid < m_nValidBins; ++iBinValid) { + const float validBinConfidence = validBinConfidences[iBinValid]; + vector<2> x = m_binLocations[m_mapValidBinToBin[iBinValid]]; + vector<2> dx(x[0] - mean[0], x[1] - mean[1]); + Cxx += (dx[0] * dx[0]) * validBinConfidence; + Cxy += (dx[0] * dx[1]) * validBinConfidence; + Cyy += (dx[1] * dx[1]) * validBinConfidence; + } + + norm = validBinConfidenceSum / + (validBinConfidenceSum * validBinConfidenceSum - + validBinConfidenceSqSum); + float imgRatio = + norm * sqrt(Cxx * Cyy - Cxy * Cxy); // * M_PI / (2.0 * 2.0); + float score = imgRatio * validBinConfidenceSum; + return score; + } + + void SetBins(const int nBinsX, const int nBinsY) { + // Some bug occurred! the m_nBinsX is set to be the nBinsX at unknown + // time! if(m_nBinsX == nBinsX && m_nBinsY == nBinsY) + // return; + m_nBinsX = nBinsX; + m_nBinsY = nBinsY; + m_nBins = m_nBinsX * m_nBinsY; + m_BinHeight = 2 * m_norm_scale / m_nBinsY; + m_BinWidth = 2 * m_norm_scale / m_nBinsX; + } + + void CreateBucket() { + m_binLocations.reserve(m_nBins); + float y = m_BinHeight * 0.5; + for (size_t i = 0; i < m_nBinsY; ++i, y += m_BinHeight) { + float x = m_BinWidth * 0.5; + for (size_t j = 0; j < m_nBinsX; ++j, x += m_BinWidth) { + m_binLocations.emplace_back( + vector<2>(x - m_norm_scale, y - m_norm_scale)); + } + } + } + + template + void BucketData(const std::vector &pts) { + const size_t N = pts.size(); + m_mapDataToValidBin.resize(N); + m_mapBinToValidBin = std::vector(m_nBins, SIZE_MAX); + for (size_t i = 0; i < N; ++i) { + const vector<2> &p1 = pts[i]; + const size_t iBin = + size_t((p1[0] + m_norm_scale) / m_BinWidth) + + m_nBinsX * size_t((p1[1] + m_norm_scale) / m_BinHeight); + const size_t iBinValid = m_mapBinToValidBin[iBin]; + if (iBinValid == SIZE_MAX) { + m_mapBinToValidBin[iBin] = size_t(m_mapValidBinToBin.size()); + m_mapDataToValidBin[i] = size_t(m_mapValidBinToBin.size()); + ; + m_mapValidBinToBin.push_back(iBin); + m_validBinData.push_back(std::vector(1, i)); + m_validBinDataSizes.push_back(1); + } else { + m_mapDataToValidBin[i] = iBinValid; + m_validBinData[iBinValid].push_back(i); + ++m_validBinDataSizes[iBinValid]; + } + } + m_nValidBins = m_validBinDataSizes.size(); + } + + void ConvertInliersListToValidBin( + std::vector &inliers_mask, + std::vector> &validBinInliers) { + + validBinInliers = std::vector>( + m_nValidBins, std::vector()); + + for (int iBinValid = 0; iBinValid < m_nValidBins; ++iBinValid) { + validBinInliers[iBinValid].resize(0); + } + + const size_t N = inliers_mask.size(); + for (size_t i = 0; i < N; ++i) { + if (inliers_mask[i] == 1) { + validBinInliers[m_mapDataToValidBin[i]].push_back(i); + } + } + } + + void + ConvertConfidencesBinToValidBin(const std::vector &binConfidences, + std::vector &validBinConfidences) { + + validBinConfidences.resize(m_nValidBins); + for (size_t iBin = 0; iBin < m_nValidBins; ++iBin) { + validBinConfidences[iBin] = + binConfidences[m_mapValidBinToBin[iBin]]; + } + } + + void ConvertConfidencesValidBinToBin( + const std::vector &validBinConfidences, + std::vector &binConfidences) { + + binConfidences.resize(m_nBins); + for (size_t iBin = 0; iBin < m_nBins; ++iBin) { + if (m_mapBinToValidBin[iBin] == SIZE_MAX) + binConfidences[iBin] = 0; + else + binConfidences[iBin] = + validBinConfidences[m_mapBinToValidBin[iBin]]; + } + } + + void ThresholdAndNormalizeConfidences(std::vector &confidences) { + float confidenceSum = 0; + const size_t N = confidences.size(); + for (size_t i = 0; i < N; ++i) { + confidences[i] = + std::max(m_parsacMinPriorBinConfidence, confidences[i]); + confidenceSum += confidences[i]; + } + const float norm = 1.0 / confidenceSum; + for (size_t i = 0; i < N; ++i) { + confidences[i] *= norm; + } + } + + void AccumulateConfidences(const std::vector &confidences, + std::vector &confidencesAccumulated) { + const size_t N = confidences.size(); + confidencesAccumulated.resize(N + 1); + confidencesAccumulated[0] = 0; + for (size_t i = 0; i < N; ++i) + confidencesAccumulated[i + 1] = + confidencesAccumulated[i] + confidences[i]; + const float norm = 1.f / confidencesAccumulated[N]; + for (size_t i = 0; i < N; ++i) + confidencesAccumulated[i] *= norm; + } + + private: + template + void make_sample(Data &&data, Sample &&sample, size_t idata, + size_t isample) { + size_t I = + std::tuple_size::type>::value; + std::get<0>(sample)[isample] = std::get<0>(data)[idata]; + std::get<1>(sample)[isample] = std::get<1>(data)[idata]; + } + + template + void make_sample_by_prior(Data &&data, Sample &&sample, size_t idata, + size_t isample) { + + size_t idx = + m_validBinData[idata][rand() % m_validBinData[idata].size()]; + std::get<0>(sample)[isample] = std::get<0>(data)[idx]; + std::get<1>(sample)[isample] = std::get<1>(data)[idx]; + } +}; + +} // namespace xrslam + +#endif // XRSLAM_PARSAC_H diff --git a/xrslam/src/xrslam/utility/ransac.h b/xrslam/src/xrslam/utility/ransac.h index e617cc3..d95f2ac 100644 --- a/xrslam/src/xrslam/utility/ransac.h +++ b/xrslam/src/xrslam/utility/ransac.h @@ -86,8 +86,8 @@ struct Ransac { template static void make_sample_impl(Data &&data, Sample &&sample, size_t idata, size_t isample, std::index_sequence) { - [[maybe_unused]] auto ret = { - (std::get(sample)[isample] = std::get(data)[idata])...}; + [[maybe_unused]] auto ret = {((void *)&( + std::get(sample)[isample] = std::get(data)[idata]))...}; } template
SequenceAPE(m)ARE(deg)
XRSLAMVINS-MonoXRSLAMVINS-Mono
MH_010.147 0.154 2.516 1.516
MH_020.077 0.178 1.707 2.309
MH_030.154 0.195 1.554 1.646
MH_040.269 0.376 1.344 1.431
MH_050.252 0.300 0.740 0.782
V1_010.063 0.088 5.646 6.338
V1_020.097 0.111 1.877 3.278
V1_030.102 0.187 2.190 6.211
V2_010.066 0.082 1.301 2.137
V2_020.092 0.149 1.521 3.976
V2_030.193 0.287 1.592 3.331
Average0.137 0.192 1.998 2.995