Paper: Link
Seung Ho Park, Young Su Moon, and Nam Ik Cho
Recent studies have significantly enhanced the performance of single-image super-resolution (SR) using convolutional neural networks (CNNs). While there can be many high-resolution (HR) solutions for a given input, most existing CNN-based methods do not explore alternative solutions during the inference. A typical approach to obtaining alternative SR results is to train multiple SR models with different loss weightings and exploit the combination of these models. Instead of using multiple models, we present a more efficient method to train a single adjustable SR model on various combinations of losses by taking advantage of multi-task learning. Specifically, we optimize an SR model with a conditional objective during training, where the objective is a weighted sum of multiple perceptual losses at different feature levels. The weights vary according to given conditions, and the set of weights is defined as a style controller. Also, we present an architecture appropriate for this training scheme, which is the Residual-in-Residual Dense Block equipped with spatial feature transformation layers. At the inference phase, our trained model can generate locally different outputs conditioned on the style control map. Extensive experiments show that the proposed SR model produces various desirable reconstructions without artifacts and yields comparable quantitative performance to state-of-the-art SR methods.
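For clarity, here is a minimal sketch (not the authors' exact training code) of what such a conditional objective can look like: a style scalar t in [0, 1] is mapped to weights for a pixel-wise (MSE) term and perceptual terms at two VGG feature levels. The weighting function `weights_from_t` and the `feats` callable below are illustrative assumptions, not the released implementation.

```python
import torch.nn.functional as F

def weights_from_t(t: float):
    """Illustrative mapping from the style scalar t in [0, 1] to loss weights.
    t = 0 favors the pixel-wise (MSE) term; t = 1 favors deeper perceptual terms."""
    return {"mse": 1.0 - t, "vgg22": 0.5 * t, "vgg44": 0.5 * t}

def conditional_objective(sr, hr, feats, t: float):
    """Weighted sum of losses at different feature levels, controlled by t.

    sr, hr : super-resolved and ground-truth images, shape (N, 3, H, W)
    feats  : callable returning a dict of VGG-19 features, e.g. {"vgg22": ..., "vgg44": ...}
    """
    w = weights_from_t(t)
    loss = w["mse"] * F.mse_loss(sr, hr)
    f_sr, f_hr = feats(sr), feats(hr)
    loss = loss + w["vgg22"] * F.l1_loss(f_sr["vgg22"], f_hr["vgg22"])
    loss = loss + w["vgg44"] * F.l1_loss(f_sr["vgg44"], f_hr["vgg44"])
    return loss
```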
- PyTorch 1.10.0
- CUDA 11.3
- Python 3.8
You can choose any number in [0, 1] for t.
python test.py -opt options/test/test_FxSR_PD_4x.yml -t 0.8
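At test time the scalar passed via `-t` is expanded into a spatially uniform style control map. A rough sketch of that idea, assuming a hypothetical `model(lr, t_map)` interface (the released entry point is `test.py` with the YAML options above):

```python
import torch

t = 0.8                        # style scalar chosen in [0, 1]
lr = torch.rand(1, 3, 64, 64)  # a low-resolution input batch (dummy data here)

# Spatially uniform style control map with the same spatial size as the input.
t_map = torch.full((1, 1, lr.shape[2], lr.shape[3]), t)

# Hypothetical call; the released code wires this up inside test.py.
# sr = model(lr, t_map)
```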
- Download the pretrained FxSR-PD 4x model from OneDrive Link
- Download the pretrained FxSR-PD 8x model from OneDrive Link
- Download the pretrained FxSR-DS 4x model from OneDrive Link
- Download the pretrained FxSR-DS 8x model from OneDrive Link
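Once downloaded, a checkpoint can be loaded in the usual PyTorch way; the path and state-dict key below are placeholders, since the exact checkpoint format depends on the release.

```python
import torch

# Placeholder path; point this at the downloaded FxSR-PD 4x checkpoint.
ckpt = torch.load("pretrained/FxSR_PD_4x.pth", map_location="cpu")

# Depending on how the checkpoint was saved, the weights may be the object
# itself or stored under a key such as "params" (common in BasicSR-based repos).
state_dict = ckpt.get("params", ckpt) if isinstance(ckpt, dict) else ckpt
# model.load_state_dict(state_dict)  # model: an instance of the FxSR network
```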
The effect of choosing different layers for estimating perceptual losses on different regions, e.g., edge and texture regions, where the compared losses correspond to MSE, ReLU 2-2 (VGG22), and ReLU 4-4 (VGG44) of the VGG-19 network.
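For reference, the ReLU 2-2 and ReLU 4-4 activations can be tapped from torchvision's VGG-19 as sketched below (feature indices 8 and 26 in `vgg19().features`); this is a generic extractor, not the repository's own implementation, and inputs are assumed to be ImageNet-normalized.

```python
import torch.nn as nn
from torchvision.models import vgg19

class VGGFeatures(nn.Module):
    """Extracts ReLU 2-2 (VGG22) and ReLU 4-4 (VGG44) activations of VGG-19."""

    def __init__(self):
        super().__init__()
        # pretrained=True matches the torchvision version paired with PyTorch 1.10;
        # newer torchvision uses the weights= argument instead.
        features = vgg19(pretrained=True).features.eval()
        self.to_relu22 = nn.Sequential(*features[:9])    # up to ReLU 2-2 (index 8)
        self.to_relu44 = nn.Sequential(*features[9:27])  # ReLU 2-2 -> ReLU 4-4 (index 26)
        for p in self.parameters():
            p.requires_grad_(False)

    def forward(self, x):
        relu22 = self.to_relu22(x)
        relu44 = self.to_relu44(relu22)
        return {"vgg22": relu22, "vgg44": relu44}
```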
The proposed flexible SR model is optimized with a conditional objective, which is a weighted sum of several perceptual losses corresponding to different feature levels, where each weight changes depending on the style map.
The architecture of our proposed flexible SR network. We use the RRDB equipped with SFT as a basic block. The condition branch takes a style map for the reconstruction style as input. This map is used to control the recovery styles of edges and textures for each region through the SFT layers.
The proposed basic block (RRDB equipped with an SFT layer).
Changes in the result of FxSR-PD 4x SR according to t on the DIV2K validation set.
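A minimal sketch of a Spatial Feature Transform (SFT) layer as used in such a condition branch: features computed from the style map predict a per-pixel scale and shift that modulate the intermediate SR features inside each RRDB. The channel sizes and layer layout are illustrative assumptions, not taken from the released model.

```python
import torch
import torch.nn as nn

class SFTLayer(nn.Module):
    """Spatial Feature Transform: modulates features with a per-pixel affine
    transform (scale, shift) predicted from the style/condition features."""

    def __init__(self, feat_ch=64, cond_ch=32):
        super().__init__()
        self.scale = nn.Sequential(
            nn.Conv2d(cond_ch, feat_ch, 1), nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 1))
        self.shift = nn.Sequential(
            nn.Conv2d(cond_ch, feat_ch, 1), nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 1))

    def forward(self, feat, cond):
        # cond: features derived from the style control map by the condition branch
        return feat * (1 + self.scale(cond)) + self.shift(cond)

# Usage sketch: modulate 64-channel RRDB features with 32-channel condition features.
# sft = SFTLayer()
# out = sft(torch.rand(1, 64, 32, 32), torch.rand(1, 32, 32, 32))
```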
Visual comparison with state-of-the-art perception-driven SR methods on the DIV2K validation set.
Changes in the result of FxSR-DS 4x SR according to t on the DIV2K validation set.
Examples of local reconstruction style control.
The conventional method.
The FxSR-PD method. The T-map is a modified version of the depth map of an image from the Make3D dataset.
An example of applying a user-created depth map to enhance the sense of perspective, yielding a sharper, more richly textured foreground and a background with less camera noise than the ground truth.
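A rough sketch of turning a user-created or estimated depth map into a T-map: normalize the depth to [0, 1] and invert it so that nearer regions receive larger t (sharper, richer texture) and farther regions receive smaller t (smoother, less noise). The exact modification used for the figure is not specified here, so this function is only an illustrative assumption.

```python
import numpy as np

def depth_to_tmap(depth: np.ndarray, t_min: float = 0.0, t_max: float = 1.0) -> np.ndarray:
    """Map a depth image to a per-pixel style value t in [t_min, t_max].

    Near pixels (small depth) -> large t (sharper foreground),
    far pixels (large depth)  -> small t (smoother background).
    """
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # normalize to [0, 1]
    t = 1.0 - d                                     # invert: near -> 1, far -> 0
    return t_min + (t_max - t_min) * t
```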
Examples of naturally focused foreground objects without artifacts (experiments for FxSR-PD 4x on the DIV8K validation dataset).
(Red circles: over-enhanced and unnatural areas.)
Convergence of the diversity curve of the proposed FxSR-PD model as the number of training iterations increases.
NTIRE 2021 Learning the Super-Resolution Space Challenge Link
We participated in the NTIRE 2021 challenge under the team name SSS. FxSR-DS achieved the best LPIPS for both 4x and 8x, ranked 8th in diversity score, and placed 3rd in MOR (Mean Opinion Rank) Link.
@ARTICLE{9684919,
author={Park, Seung Ho and Moon, Young Su and Cho, Nam Ik},
journal={IEEE Access},
title={Flexible Style Image Super-Resolution Using Conditional Objective},
year={2022},
volume={10},
number={},
pages={9774-9792},
doi={10.1109/ACCESS.2022.3144406}}
Our work and implementation are inspired by and based on BasicSR [site].