LightGlue ONNX

Forked from: https://github.com/fabio-sim/LightGlue-ONNX

PINTO Custom

  1. The postprocessing typically narrows down the detected feature points with a score threshold, which causes onnxruntime to terminate abnormally when the number of detected points reaches zero.
  2. Abolish the score-based narrowing of feature points (which relies heavily on NonZero) and replace it with a fixed extraction of the top 20 scores.
    from typing import Tuple

    import torch

    def unravel_indices(
        indices: torch.LongTensor,
        shape: Tuple[int, ...],
    ) -> torch.LongTensor:
        r"""Converts flat indices into unraveled coordinates in a target shape.

        Args:
            indices: A tensor of (flat) indices, (*, N).
            shape: The targeted shape, (D,).

        Returns:
            The unraveled coordinates, (*, N, D).
        """
        coord = []
        for dim in reversed(shape):
            coord.append(indices % dim)
            indices = indices // dim
        coord = torch.stack(coord[::-1], dim=-1)
        return coord

    def unravel_index(
        indices: torch.LongTensor,
        shape: Tuple[int, ...],
    ) -> Tuple[torch.LongTensor, ...]:
        r"""Converts flat indices into unraveled coordinates in a target shape.

        This is a `torch` implementation of `numpy.unravel_index`.

        Args:
            indices: A tensor of (flat) indices, (N,).
            shape: The targeted shape, (D,).

        Returns:
            A tuple of unraveled coordinate tensors of shape (D,).
        """
        coord = unravel_indices(indices, shape)
        return tuple(coord)

    ############################################################################
    # Fixed Top-K extraction instead of score-threshold filtering (NonZero-free).
    # `scores` is the keypoint score map computed earlier in the model.
    flat_scores = scores.view(-1)
    # total_elements = torch.prod(torch.tensor(scores.shape))
    # total_elements * 0.00025 / 10
    # total_elements_floor = torch.floor(total_elements * 0.00025 / 10)
    # top_nums = (total_elements_floor * 10).to(torch.int32)
    top_nums = 20
    values, indices = flat_scores.topk(top_nums)
    keypoints = torch.stack(unravel_index(indices, scores.shape))
    keypoints_t = keypoints.T
    ############################################################################
  3. Although the feature points are no longer narrowed down by score inside the model, only the fixed 20 inference results need to be filtered by score afterwards.
  4. The calling program should apply the score threshold, for example with NumPy. The score threshold used here for feature point extraction is 0.0005.
    keep0 = mscores0 >= 0.0005
    kpts0 = kpts0[keep0]
    mscores0 = mscores0[keep0]
    
    keep1 = mscores1 >= 0.0005
    kpts1 = kpts1[keep1]
    mscores1 = mscores1[keep1]
  5. The step that removes feature points within 4 pixels of the image borders (top, bottom, left, and right) has also been removed from the ONNX model. This eliminates the NonZero processing that sacrifices both inference performance and the generality of the model.
    # Discard keypoints near the image borders
    # keypoints, scores = remove_borders(
    # keypoints, scores, self.config["remove_borders"], h * 8, w * 8
    # )
    #
  6. Inference performance is only slightly worse, because a fixed set of 20 points is processed even when fewer would be needed.
  7. Because writing the input-image preprocessing in the calling program is a hassle, grayscale conversion is included in the model.
  8. Since my implementation is only a temporary one and fabio-sim appears to be improving the functionality very frequently, it is probably more reasonable to wait for fabio-sim's improvements.
  9. All OPs can be converted to a TensorRT engine, resulting in a highly efficient model with no operations offloaded to the CPU.
  10. If you want to calculate the Top-K (ArgMax) extraction for each image in a batch, you can loop over the process described in item 2; see the sketch after this list.
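
A minimal sketch of looping the fixed Top-K extraction from item 2 over a batch dimension. The batched score map of shape (B, H, W) and the helper name topk_keypoints_per_batch are assumptions for illustration, not part of the exported model.

import torch


def topk_keypoints_per_batch(scores: torch.Tensor, top_nums: int = 20):
    """Apply the fixed Top-K extraction from item 2 to every image in a batch.

    scores: (B, H, W) keypoint score maps.
    Returns (B, top_nums, 2) integer (y, x) coordinates and (B, top_nums) scores.
    """
    batch_kpts, batch_vals = [], []
    width = scores.shape[2]
    for b in range(scores.shape[0]):              # loop over the batch dimension
        flat_scores = scores[b].reshape(-1)
        values, indices = flat_scores.topk(top_nums)
        ys = indices // width                     # row index
        xs = indices % width                      # column index
        batch_kpts.append(torch.stack([ys, xs], dim=-1))
        batch_vals.append(values)
    return torch.stack(batch_kpts), torch.stack(batch_vals)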

Open Neural Network Exchange (ONNX) compatible implementation of LightGlue: Local Feature Matching at Light Speed. The ONNX model format allows for interoperability across different platforms with support for multiple execution providers, and removes Python-specific dependencies such as PyTorch.

LightGlue figure

Updates

  • 1 July 2023: Add support for extractor max_num_keypoints.
  • 30 June 2023: Add support for DISK extractor.
  • 28 June 2023: Add end-to-end SuperPoint+LightGlue export & inference pipeline.

ONNX Export

Prior to exporting the ONNX models, please install the requirements of the original LightGlue repository. (Flash Attention does not need to be installed.)

To convert the DISK or SuperPoint and LightGlue models to ONNX, run export.py. We provide two types of ONNX exports: individual standalone models, and a combined end-to-end pipeline (recommended for convenience) with the --end2end flag.

python export.py \
  --img_size 512 \
  --extractor_type superpoint \
  --extractor_path weights/superpoint.onnx \
  --lightglue_path weights/superpoint_lightglue.onnx \
  --dynamic
  • Exporting individually can be useful when intermediate outputs can be cached or precomputed. On the other hand, the end-to-end pipeline can be more convenient.
  • Although dynamic axes have been specified, it is recommended to export your own ONNX model with the input image sizes appropriate for your use case.
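
As a rough sketch, an end-to-end export would presumably add the --end2end flag to a similar command; the exact flag combination is an assumption, so check python export.py --help for the authoritative options.

python export.py \
  --img_size 512 \
  --extractor_type superpoint \
  --end2end \
  --dynamic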

ONNX Inference

With the ONNX models in hand, one can perform inference in Python using ONNX Runtime (see requirements-onnx.txt).

The LightGlue inference pipeline has been encapsulated into a runner class:

from onnx_runner import LightGlueRunner, load_image, rgb_to_grayscale

image0, scales0 = load_image("assets/sacre_coeur1.jpg", resize=512)
image1, scales1 = load_image("assets/sacre_coeur2.jpg", resize=512)
image0 = rgb_to_grayscale(image0)  # only needed for SuperPoint
image1 = rgb_to_grayscale(image1)  # only needed for SuperPoint

# Create ONNXRuntime runner
runner = LightGlueRunner(
    extractor_path="weights/superpoint.onnx",
    lightglue_path="weights/superpoint_lightglue.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# Run inference
m_kpts0, m_kpts1 = runner.run(image0, image1, scales0, scales1)

Note that the output keypoints have already been rescaled back to the original image sizes.

Alternatively, you can also run infer.py.

python infer.py \
  --img_paths assets/DSC_0410.JPG assets/DSC_0411.JPG \
  --img_size 512 \
  --lightglue_path weights/superpoint_lightglue.onnx \
  --extractor_type superpoint \
  --extractor_path weights/superpoint.onnx \
  --viz

Caveats

As the ONNX Runtime has limited support for features like dynamic control flow, certain configurations of the models cannot be exported to ONNX easily. These caveats are outlined below.

Feature Extraction

  • Only batch size 1 is currently supported. This limitation stems from the fact that different images in the same batch can have varying numbers of keypoints, leading to non-uniform (a.k.a. ragged) tensors.

LightGlue Keypoint Matching

  • Since dynamic control flow has limited support in ONNX tracing, by extension, early stopping and adaptive point pruning (the depth_confidence and width_confidence parameters) are also difficult to export to ONNX.
  • Flash Attention is turned off.
  • Mixed precision is turned off.
  • Note that the end-to-end version, despite its name, still requires the postprocessing (filtering valid matches) function outside the ONNX model since the scales variables need to be passed.

Additionally, the outputs of the ONNX models differ slightly from those of the original PyTorch models (by a small error on the order of 1e-6 to 1e-5 for the scores/descriptors). Although the cause is still unclear, this could be due to differing operator implementations or modified dtypes.
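
A minimal sketch for quantifying that discrepancy, assuming the score outputs of both backends have already been dumped to disk (the file names below are placeholders):

import numpy as np

# Placeholder dumps of the same scores from the PyTorch and ONNX models
torch_scores = np.load("torch_scores.npy")
onnx_scores = np.load("onnx_scores.npy")

max_abs_err = np.max(np.abs(torch_scores - onnx_scores))
print(f"max abs error: {max_abs_err:.2e}")  # expected on the order of 1e-6 to 1e-5
assert np.allclose(torch_scores, onnx_scores, atol=1e-4)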

Possible Future Work

  • Support for TensorRT: Appears to be currently blocked by unsupported Einstein summation operations (torch.einsum()) in TensorRT - Thanks to Shidqiet's investigation.
  • Support for batch size > 1: Blocked by the fact that different images can have varying numbers of keypoints. Perhaps max-length padding? (See the sketch after this list.)
  • Support for dynamic control flow: Investigating script-mode ONNX export instead of trace-mode.
  • Mixed-precision Support
  • Quantization Support
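
For the batch-size item above, a minimal sketch of the max-length padding idea; this is purely illustrative (pad_keypoints is not part of this repository), and a matching validity mask would still have to be threaded through the matcher.

import torch


def pad_keypoints(kpts_list, pad_value=0.0):
    """Pad a list of (N_i, 2) keypoint tensors to a common length N_max.

    Returns a (B, N_max, 2) tensor and a (B, N_max) boolean mask that marks
    the real (non-padded) keypoints.
    """
    n_max = max(k.shape[0] for k in kpts_list)
    padded, mask = [], []
    for k in kpts_list:
        pad = n_max - k.shape[0]
        padded.append(torch.nn.functional.pad(k, (0, 0, 0, pad), value=pad_value))
        mask.append(torch.arange(n_max) < k.shape[0])
    return torch.stack(padded), torch.stack(mask)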

Credits

If you use any ideas from the papers or code in this repo, please consider citing the authors of LightGlue, SuperPoint, and DISK. Lastly, if the ONNX versions helped you in any way, please also consider starring this repository.

@inproceedings{lindenberger23lightglue,
  author    = {Philipp Lindenberger and
               Paul-Edouard Sarlin and
               Marc Pollefeys},
  title     = {{LightGlue}: Local Feature Matching at Light Speed},
  booktitle = {ArXiv PrePrint},
  year      = {2023}
}
@article{DBLP:journals/corr/abs-1712-07629,
  author       = {Daniel DeTone and
                  Tomasz Malisiewicz and
                  Andrew Rabinovich},
  title        = {SuperPoint: Self-Supervised Interest Point Detection and Description},
  journal      = {CoRR},
  volume       = {abs/1712.07629},
  year         = {2017},
  url          = {http://arxiv.org/abs/1712.07629},
  eprinttype    = {arXiv},
  eprint       = {1712.07629},
  timestamp    = {Mon, 13 Aug 2018 16:47:29 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-1712-07629.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
@article{DBLP:journals/corr/abs-2006-13566,
  author       = {Michal J. Tyszkiewicz and
                  Pascal Fua and
                  Eduard Trulls},
  title        = {{DISK:} Learning local features with policy gradient},
  journal      = {CoRR},
  volume       = {abs/2006.13566},
  year         = {2020},
  url          = {https://arxiv.org/abs/2006.13566},
  eprinttype    = {arXiv},
  eprint       = {2006.13566},
  timestamp    = {Wed, 01 Jul 2020 15:21:23 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2006-13566.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}