DPIR
DPIR, or Plug-and-Play Image Restoration with Deep Denoiser Prior, is a denoising and deblocking neural network. See also https://github.com/HolyWu/vs-dpir.
DPIR requires a strength parameter.
Link:
- (stable) https://github.com/AmusementClub/vs-mlrt/releases/download/model-20211209/dpir_v3.7z
Includes these models:
- Denoise models, default sigma is 5.0
  - drunet_gray: GRAY denoise
  - drunet_color: RGB denoise
- Deblocking models, default sigma is 50.0
  - drunet_deblocking_grayscale: GRAY deblocking
  - drunet_deblocking_color: RGB deblocking
- `block_w` and `block_h` (tile size) must be multiples of 8.
- All DPIR models require a strength parameter, or `sigma`, which you need to pass in the form of a GRAYS clip (with normalization factor `1.0/255`); see the examples below for details.
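The two constraints above boil down to simple arithmetic, sketched here in plain Python. The helper names `round_up_to_multiple` and `normalized_sigma` are hypothetical and are not part of vs-mlrt; the 2x2 tiling of a 1920x1080 frame is only an illustrative assumption.

```python
def round_up_to_multiple(value: int, multiple: int = 8) -> int:
    """Round value up to the nearest multiple (tile sizes must be multiples of 8)."""
    if value <= 0:
        raise ValueError("value must be positive")
    # Ceiling division written with integer arithmetic only.
    return -(-value // multiple) * multiple


def normalized_sigma(sigma: float) -> float:
    """Scale a strength given on the 0-255 scale by the 1.0/255 normalization factor."""
    return sigma / 255.0


# Hypothetical example: split a 1920x1080 frame into 2x2 tiles.
block_w = round_up_to_multiple(1920 // 2)  # 960 is already a multiple of 8
block_h = round_up_to_multiple(1080 // 2)  # 540 is rounded up to 544
print(block_w, block_h, normalized_sigma(5.0))
```

The value returned by `normalized_sigma` is what every pixel of the GRAYS strength clip would hold for a constant strength.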
To simplify usage, the Python wrapper module vsmlrt provides a more Pythonic interface:
import vapoursynth as vs
from vapoursynth import core
from vsmlrt import DPIR, DPIRModel, Backend
src = core.std.BlankClip(format=vs.RGBS) # or vs.GRAYS for gray only models
# backend could be:
# - CPU Backend.OV_CPU(): the recommended CPU backend; generally faster than ORT-CPU.
# - CPU Backend.ORT_CPU(num_streams=1, verbosity=2): vs-ort cpu backend.
# - GPU Backend.ORT_CUDA(device_id=0, cudnn_benchmark=True, num_streams=1, verbosity=2)
# - use device_id to select device
# - set cudnn_benchmark=False to reduce script reload latency when debugging, but with slight throughput performance penalty.
# - GPU Backend.TRT(fp16=True, device_id=0, num_streams=1): TensorRT runtime, the fastest NV GPU runtime.
# DPIR is a huge model and GPU backend is highly recommended (use TRT to provide the best performance)
# If the model runs out of GPU memory, increase the tiles parameter.
flt = DPIR(src, strength=5, model=DPIRModel.drunet_color, tiles=2, backend=Backend.ORT_CUDA())
If you want to use variable strength, you can also pass a GRAYS or GRAY8 clip as the strength parameter. It must have the same dimensions as the input clip, and each pixel stores the DPIR strength for that pixel.
# Calling the ov plugin directly: the second clip carries the strength, normalized by 1.0/255.
src = core.std.BlankClip(width=640, height=360, format=vs.GRAYS)
sigma = 2.0
flt = core.ov.Model([src, core.std.BlankClip(src, color=sigma/255.0)], "drunet_gray.onnx")
DPIR is a huge network and it is extremely slow when running on CPU (e.g. for 360p input, you might see 0.05fps/cpu).
Measurements: FPS / Device Memory (MB)
Device memory:
- GPU: device memory including context
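Each table cell pairs a frame rate with a memory figure, so comparing two backends is a matter of splitting the cell on the slash. The following is a minimal sketch with a hypothetical helper `parse_cell`; the two sample cells are copied from the first `gray` row of the tables ([1] ort-cuda and [1] trt).

```python
from typing import Tuple


def parse_cell(cell: str) -> Tuple[float, int]:
    """Split an 'FPS / Device Memory (MB)' cell into (fps, memory_mb)."""
    fps_text, mem_text = cell.split("/")
    return float(fps_text), int(mem_text)


ort_fps, ort_mem = parse_cell("2.46 / 5947")
trt_fps, trt_mem = parse_cell("2.95 / 4157")

speedup = trt_fps / ort_fps    # relative throughput of trt over ort-cuda
mem_saved = ort_mem - trt_mem  # device memory difference in MB
print(f"speedup: {speedup:.2f}x, memory saved: {mem_saved} MB")
```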
Software: VapourSynth R57, Windows 10 LTSC 2021, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
2. vs-dpir v1.7.1, PyTorch 1.10.1+cu113, TensorRT 8.2.2, torch2trt 2732b35
3. vs-mlrt v8 (driver 511.79)
Model | [1] ort-cuda | [1] trt | [2] cuda | [2] trt | [3] ort-cuda | [3] trt | [3] trt (no tf32) |
---|---|---|---|---|---|---|---|
gray | 2.46 / 5947 | 2.95 / 4157 | 2.34 / 12015 | 2.43 / 4300 | 2.92 / 5759 | 3.26 / 4243 | 3.07 / 4261 |
color | 2.30 / 5979 | 2.75 / 4187 | 2.13 / 12031 | 2.12 / 4384 | 2.86 / 5790 | 3.25 / 4330 | 3.02 / 4291 |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt | [3] ort-cuda | [3] trt | [3] trt (2 streams) |
---|---|---|---|---|---|---|---|---|
gray | 3.67 / 3777 | 9.60 / 3585 | 10.6 / 5430 | 3.47 / 11751 | 7.18 / 4015 | 4.65 / 5759 | 10.9 / 2397 | 11.6 / 3895 |
color | 3.26 / 3817 | 8.65 / 3619 | 10.5 / 5492 | 3.02 / 11765 | 5.67 / 4277 | 4.41 / 3628 | 9.85 / 2440 | 11.5 / 3975 |
Software: VapourSynth R57, Windows 10 LTSC 2021, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
2. vs-dpir v1.7.1, PyTorch 1.10.1+cu113, TensorRT 8.2.2, torch2trt 2732b35
3. vs-mlrt v8 (driver 511.79)
Model | [1] ort-cuda | [1] trt | [2] cuda | [2] trt | [3] ort-cuda | [3] trt |
---|---|---|---|---|---|---|
gray | 1.68 / 5277 | 1.84 / 4004 | 1.67 / 6916 | 1.87 / 4163 | 1.60 / 5190 | 1.91 / 3659 |
color | 1.53 / 5309 | 1.66 / 4034 | 1.56 / 6942 | 1.71 / 4183 | 1.57 / 5222 | 1.78 / 3691 |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt | [3] ort-cuda | [3] trt | [3] trt (2 streams) |
---|---|---|---|---|---|---|---|---|
gray | 3.04 / 3619 | 6.18 / 2780 | 6.77 / 4531 | 3.07 / 6730 | 5.98 / 3249 | 3.10 / 3276 | 7.22 / 2101 | 7.89 / 3529 |
color | 2.70 / 3659 | 5.64 / 2598 | 6.72 / 4274 | 2.65 / 6744 | 4.78 / 3261 | 2.93 / 3571 | 6.38 / 2323 | 7.64 / 3874 |
Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
2. vs-dpir v1.7.1, PyTorch 1.10.1+cu113, TensorRT 8.2.2, torch2trt 2732b35
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt |
---|---|---|---|---|---|
gray | 2.45 / 5188 | 2.59 / 3979 | 2.59 / 6829 | 2.27 / 11552 | 2.45 / 3959 |
color | 2.39 / 5220 | 2.51 / 4011 | 2.56 / 6893 | 2.12 / 11558 | 2.26 / 3979 |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt |
---|---|---|---|---|---|
gray | 5.20 / 3018 | 8.09 / 2831 | 8.50 / 4617 | 5.09 / 11289 | 6.93 / 3461 |
color | 4.95 / 3058 | 7.54 / 2863 | 8.47 / 4687 | 4.29 / 11302 | 5.60 / 3473 |
Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23, GPU clocks locked at max frequency.
Input size: 1920x1080
1. vs-mlrt v6
2. vs-dpir v1.7.1, PyTorch 1.10.1+cu113, TensorRT 8.2.2, torch2trt 2732b35
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt |
---|---|---|---|---|---|
gray | 2.34 / 5791 | 2.75 / 4015 | 2.78 / 6641 | 2.20 / 11837 | 2.67 / 4189 |
color | 2.29 / 5823 | 2.73 / 4075 | 2.78 / 6747 | 2.12 / 11853 | 2.54 / 4209 |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt |
---|---|---|---|---|---|
gray | 3.73 / 3621 | 6.67 / 3437 | 6.33 / 5285 | 3.72 / 11853 | 6.17 / 4079 |
color | 3.65 / 3661 | 6.26 / 3423 | 6.32 / 5277 | 3.45 / 11597 | 5.25 / 4103 |
Software: VapourSynth R58, Windows Server 2022, Graphics Driver 511.65, GPU clocks locked at max frequency.
Input size: 1920x1080
1. vs-mlrt v8
Model | [1] trt |
---|---|
gray | 2.75 / 4285 |
color | 2.70 / 4317 |
Model | [1] trt |
---|---|
gray | 7.00 / 2336 |
color | 6.80 / 2368 |
Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
2. vs-dpir v1.7.1, PyTorch 1.10.1+cu113, TensorRT 8.2.2, torch2trt 2732b35
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt |
---|---|---|---|---|---|
gray | 7.12 / 5853 | 9.68 / 4111 | 10.3 / 6737 | 6.43 / 11973 | 8.56 / 4261 |
color | 6.95 / 5885 | 9.31 / 4143 | 10.2 / 6801 | 5.62 / 11979 | 7.21 / 4281 |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] cuda | [2] trt |
---|---|---|---|---|---|
gray | 10.1 / 3683 | 18.9 / 3015 | 20.5 / 4603 | 9.67 / 11709 | 14.6 / 3679 |
color | 9.55 / 3723 | 17.7 / 3041 | 20.3 / 4657 | 7.65 / 11713 | 10.5 / 3691 |
Software: VapourSynth R57-A4, Windows Server 2022, Graphics Driver 516.94.
Input size: 1920x1080
1. vs-mlrt v9
Model | [1] trt | [1] trt (2 streams) |
---|---|---|
color | 20.5 / 2022 | 24.3 / 3325 |