Research Experience

Independent Research:

FPSNet: Focus-Perceptual-Semantic Full Flow Visual Redundancy Predicting for Camera Image

Research Brief: With the rapid proliferation of electronic devices, the massive amount of data generated by camera imaging poses a serious challenge to limited storage capacity and communication bandwidth. Achieving higher compression ratios without sacrificing visual quality remains a fundamental challenge for image compression. In this paper, we propose a novel full-flow bidirectional visual threshold estimation method for perceptual compression of camera images. Specifically, we study features along the full flow from camera imaging through visual perception to semantic understanding, and characterize them with focus identification, perceptual distribution, and semantic segmentation, respectively, with a feature extraction network carefully designed for each feature type. In addition, drawing inspiration from the bidirectional perceptual mechanism of the human visual system, we propose a feature extraction framework that combines top-down and bottom-up pathways, and we further enhance the model by regulating and fusing the bidirectional perceptual features through a gated decoding structure. Extensive experiments on benchmark datasets confirm that FPSNet significantly improves the accuracy of visual redundancy prediction.
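To make the gated decoding idea concrete, below is a minimal sketch of fusing top-down and bottom-up feature maps through a learned gate. The module names, tensor shapes, and single-gate design are illustrative assumptions, not the released FPSNet implementation.

```python
import torch
import torch.nn as nn

class GatedBidirectionalFusion(nn.Module):
    """Fuse top-down and bottom-up features with a learned per-pixel gate.

    Illustrative assumption only; not the released FPSNet code.
    """

    def __init__(self, channels: int):
        super().__init__()
        # The gate maps both streams to weights in [0, 1].
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # Decode the fused features into a single redundancy map.
        self.decode = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, top_down, bottom_up):
        g = self.gate(torch.cat([top_down, bottom_up], dim=1))
        fused = g * top_down + (1.0 - g) * bottom_up
        return self.decode(fused)

# Example: fuse two 64-channel feature maps into a redundancy prediction.
fusion = GatedBidirectionalFusion(channels=64)
top_down = torch.randn(1, 64, 32, 32)   # e.g., semantic features
bottom_up = torch.randn(1, 64, 32, 32)  # e.g., focus/perceptual features
print(fusion(top_down, bottom_up).shape)  # torch.Size([1, 1, 32, 32])
```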

Current Status: Accepted by PRCV 2024

PDF and code are available: https://github.com/xxwtiancai/ReasearchProject

Deep Perceptual Lossless Coding: A Case Study of Intra-frame Database and Framework

Research Brief: Perceptually lossless coding (PLC) is a critical technique for high-quality video services; it aims to achieve the maximum compression ratio while keeping distortions imperceptible to the human visual system (HVS). However, most existing methods consider the perceptual characteristics of the HVS only unilaterally and rigidly graft visual perception onto the existing coding system. To address these challenges, this paper proposes a new PLC method that jointly optimizes visual perception and the coding system. Specifically, a new block-level video database is built with a Multiple-QP scheme to select the best quantization parameter (QP) for each coding tree unit (CTU), and a fine-grained subjective quality evaluation experiment is used to generate accurate perceptually lossless labels. Furthermore, a deep neural network is designed to model the trade-off between rate and distortion and predict the perceptually lossless QP. Experimental results demonstrate that the proposed method achieves an average rate saving of 29.47% over the latest method at the same perceptual quality.
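As a rough illustration of how a perceptually lossless label could be derived per CTU, the sketch below picks the largest candidate QP that a sufficient fraction of viewers judged indistinguishable from the reference. The voting threshold and max-QP rule are assumptions for illustration; the paper's actual subjective protocol may differ.

```python
def perceptually_lossless_qp(votes, threshold=0.75):
    """Pick the largest QP still judged perceptually lossless for one CTU.

    `votes` maps a candidate QP to the fraction of viewers who saw no
    difference from the reference. Higher QP means stronger compression,
    so the largest passing QP gives the best rate at the same quality.
    The 0.75 threshold is an illustrative assumption.
    """
    passing = [qp for qp, frac in votes.items() if frac >= threshold]
    return max(passing) if passing else min(votes)

# Example: QP 37 still looks lossless to 80% of viewers, QP 42 does not.
votes = {27: 1.00, 32: 0.95, 37: 0.80, 42: 0.40}
print(perceptually_lossless_qp(votes))  # -> 37
```

Labels produced this way would then serve as the training targets for the QP-prediction network.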

Current Status: Submitted to IEEE Transactions on Industrial Informatics

PDF and code are available: https://github.com/xxwtiancai/ReasearchProject

Rethinking Perceptual Masking Integration in JND Modeling: An Efficient and Explainable Learning Approach

Research Brief: This study presents a novel, efficient, and explainable perceptual masking integration model to improve visual redundancy prediction. We first analyze existing methods and identify issues such as overly simple linear and nonlinear integration, a lack of contextual consideration, and limited generalization. To address these issues, we propose a neural architecture search (NAS)-based approach: we design feature mapping micro-units aligned with the human visual system, scalable masking order arrangements, and versatile fusion operator selections, and automatically search for the optimal fusion network structure. Evaluations on multiple benchmark datasets demonstrate that the proposed method outperforms existing techniques in visual redundancy prediction and exhibits strong generalization, improving the quality of image compression and watermark injection systems.
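To give a feel for the search, the toy sketch below exhaustively scores every combination of masking order and fusion operator against a stand-in target. The factor values, operators, and scoring rule are invented for illustration; the paper searches over its own HVS-aligned micro-units, not this toy space.

```python
import itertools

# Toy masking responses for one image block (illustrative values only).
MASKING = {"luminance": 0.8, "contrast": 1.3, "pattern": 1.1}

# Candidate fusion operators; "weighted" is order-sensitive.
FUSION_OPS = {
    "sum":      lambda a, b: a + b,
    "max":      max,
    "weighted": lambda a, b: 0.7 * a + b,
}

TARGET_JND = 1.9  # stand-in ground-truth visibility threshold

def fuse(order, op):
    """Fold the masking responses pairwise in the given order."""
    values = [MASKING[name] for name in order]
    result = values[0]
    for v in values[1:]:
        result = op(result, v)
    return result

def search():
    """Exhaustively score every (order, operator) candidate."""
    best = None
    for order in itertools.permutations(MASKING):
        for name, op in FUSION_OPS.items():
            error = abs(fuse(order, op) - TARGET_JND)
            if best is None or error < best[0]:
                best = (error, order, name)
    return best

print(search())  # (error, best masking order, best operator)
```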

Current Status: In the experimental stage

PDF and code are available: https://github.com/xxwtiancai/ReasearchProject
