InsightFace-REST v0.7.0.0

@SthPhoenix SthPhoenix released this 06 Nov 13:50
· 191 commits to master since this release

2021-11-06 v0.7.0.0

Because many updates have accumulated since the last release, the version number jumps straight to v0.7.0.0.

Compared to the previous release (v0.6.2.0), this release brings improved performance for SCRFD-based detectors.

Below is a performance comparison on an Nvidia RTX 2080 Super GPU for the scrfd_10g_gnkps detector paired with
the glintr100 recognition model (all tests use src/api_trt/test_images/Stallone.jpg, 1 face per image):

Num workers   Client threads   FPS v0.6.2.0   FPS v0.7.0.0   Speed-up
1             1                56             103            83.9%
1             30               72             128            77.7%
6             30               145            179            23.4%
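
The speed-up column can be reproduced from the two FPS columns as a relative throughput gain. The snippet below is a minimal sketch of that arithmetic; the helper name `speedup_percent` is hypothetical and not part of the project. With standard rounding the second row comes out at 77.8%, so the 77.7% in the table was presumably truncated rather than rounded.

```python
def speedup_percent(old_fps, new_fps):
    """Relative throughput gain of new_fps over old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# (num workers, client threads, FPS v0.6.2.0, FPS v0.7.0.0) from the table above
results = [
    (1, 1, 56, 103),
    (1, 30, 72, 128),
    (6, 30, 145, 179),
]

for workers, threads, old_fps, new_fps in results:
    gain = speedup_percent(old_fps, new_fps)
    print(f"{workers} workers, {threads} client threads: {gain:.1f}%")
```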

Additions:

  • Added experimental support for the msgpack serializer: it reduces network traffic for embeddings by ~2x.
  • Output names are no longer required for detection models when building a TRT engine - the correct output order
    is now extracted from the ONNX models.
  • Detection models can now be exported to a TRT engine with batch size > 1 - the inference code doesn't support
    it yet, but such engines can be used in Triton Inference Server without issues.
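
To illustrate why a binary serializer shrinks embedding payloads, the sketch below compares a JSON-encoded embedding with the same values laid out as a MessagePack array of float32 numbers. It is a hand-rolled encoder written against the MessagePack format (array header followed by `0xca`-tagged big-endian floats), not the project's code; the service itself presumably relies on the msgpack library. The 512-value embedding and the helper name `pack_float32_array` are assumptions, and the actual saving depends on how embeddings are represented in the JSON response.

```python
import json
import random
import struct


def pack_float32_array(values):
    """Encode a list of floats as a MessagePack array of float32 values."""
    n = len(values)
    if n <= 15:
        header = bytes([0x90 | n])               # fixarray
    elif n <= 0xFFFF:
        header = b"\xdc" + struct.pack(">H", n)  # array 16
    else:
        header = b"\xdd" + struct.pack(">I", n)  # array 32
    # Each element: 0xca type byte + 4-byte big-endian IEEE 754 float.
    body = b"".join(b"\xca" + struct.pack(">f", v) for v in values)
    return header + body


random.seed(0)
# Assumed embedding size for illustration only.
embedding = [random.uniform(-1.0, 1.0) for _ in range(512)]

as_json = json.dumps(embedding).encode("utf-8")
as_msgpack = pack_float32_array(embedding)

print(f"JSON: {len(as_json)} bytes, MessagePack float32: {len(as_msgpack)} bytes")
```

Each float32 costs a fixed 5 bytes in this layout, whereas its JSON text form varies with the number of printed digits.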

Model Zoo:

  • Added support for WebFace600k based recognition models from InsightFace repo: w600k_r50 and w600k_mbf
  • Added md5 check for models to allow automatic re-download if models have changed.
  • All SCRFD-based models now support a batch dimension.
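
An md5 check of this kind can be sketched as follows. This is a minimal illustration using only the standard library, not the project's implementation; the function names `file_md5` and `needs_download` and the way the expected hash is supplied are assumptions.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path, expected_md5):
    """True if the model file is missing or its md5 differs from the expected one."""
    if not os.path.exists(path):
        return True
    return file_md5(path) != expected_md5.lower()


# Demo with a temporary file standing in for a model file.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.onnx")
    with open(model_path, "wb") as fh:
        fh.write(b"hello")
    good_md5 = file_md5(model_path)
    print(needs_download(model_path, good_md5))   # hash matches: keep the file
    print(needs_download(model_path, "0" * 32))   # hash differs: re-download
```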

Improvements:

  • 1.5x-2x faster SCRFD re-implementation with Numba: 4.5 ms vs. 10 ms for the lumia.jpg example with
    scrfd_10g_gnkps and threshold = 0.3 (432 faces detected).
  • Moved the image normalization step to the GPU with the help of CuPy (4x lower data transfer from CPU to GPU,
    about 6% inference speedup, and some computations offloaded from the CPU).
  • 4.5x faster face_align.norm_crop implementation with the help of Numba and removal of unused computations
    (cropping 432 faces from the lumia.jpg example takes 45 ms vs. 205 ms).
  • Face crops are now extracted only when needed - when face data or embeddings are requested - improving
    detection-only performance.
  • Added Numba njit cache to reduce startup time on subsequent launches.
  • Logging timings rounded to ms for better readability.
  • Minor refactoring
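
The 4x reduction in CPU-to-GPU transfer mentioned above is consistent with sending raw 8-bit pixels and normalizing on the device, instead of sending an already normalized float32 tensor. That reading is an assumption rather than something the notes state, and the 640x640 input size below is only an example. A minimal sketch of the arithmetic:

```python
def transfer_bytes(height, width, channels=3, bytes_per_value=1):
    """Size in bytes of one image tensor copied from host to device."""
    return height * width * channels * bytes_per_value


# Assumed input resolution for illustration only.
HEIGHT, WIDTH = 640, 640

# Raw uint8 pixels: 1 byte per value, normalization happens on the GPU.
raw_uint8 = transfer_bytes(HEIGHT, WIDTH, bytes_per_value=1)

# float32 tensor normalized on the CPU first: 4 bytes per value.
normalized_f32 = transfer_bytes(HEIGHT, WIDTH, bytes_per_value=4)

print(f"uint8: {raw_uint8} bytes, float32: {normalized_f32} bytes, "
      f"ratio: {normalized_f32 // raw_uint8}x")
```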

Fixes:

  • The gender/age estimation model is currently not supported, so it is now excluded from the model preparation step.