CVPR2023 Top Open Papers

Best Papers

| Type | Title | Homepage | Code | Code Stars |
| --- | --- | --- | --- | --- |
| Best Paper | Planning-oriented Autonomous Driving | Link | Github | GitHub Repo stars |
| Best Paper | Visual Programming: Compositional visual reasoning without training | Link | Github | GitHub Repo stars |
| Best Paper Honorable Mention | DynIBaR: Neural Dynamic Image-Based Rendering | Link | Github | GitHub Repo stars |
| Best Student Paper | 3D Registration with Maximal Cliques | Link | Github | GitHub Repo stars |
| Best Student Paper Honorable Mention | DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | Link | Github | GitHub Repo stars |

Top CVPR2023 Papers with Code

The CVPR2023 paper information below was extracted from the following web pages and saved in the papers_info.json file.

https://openaccess.thecvf.com/CVPR2023?day=all
https://cvpr2023.thecvf.com/Conferences/2023/AcceptedPapers

If you find errors in the paper information or missing GitHub links, you are welcome to correct the corresponding entries in the papers_info_refined.json file and submit a pull request.
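
Before submitting a pull request, it may help to check your edited entries for obvious gaps. The sketch below is only an illustration: the keys `title`, `paper`, and `github` are assumed for the example and may not match the actual layout of papers_info_refined.json, and the sample data is made up.

```python
import json

# Hypothetical sample data; the real papers_info_refined.json may use
# different keys and structure.
SAMPLE = '''[
  {"title": "Paper A", "paper": "https://example.com/a", "github": "https://github.com/org/a"},
  {"title": "Paper B", "paper": "https://example.com/b", "github": ""},
  {"title": "", "paper": "https://example.com/c", "github": "https://example.com/not-github"}
]'''


def find_problems(entries):
    """Return (index, message) pairs for entries that look incomplete."""
    problems = []
    for i, entry in enumerate(entries):
        # Treat a missing key, None, or whitespace-only value as empty.
        title = (entry.get("title") or "").strip()
        github = (entry.get("github") or "").strip()
        if not title:
            problems.append((i, "missing title"))
        if not github:
            problems.append((i, "missing GitHub link"))
        elif not github.startswith("https://github.com/"):
            problems.append((i, "GitHub link does not point to github.com"))
    return problems


problems = find_problems(json.loads(SAMPLE))
for index, message in problems:
    print(f"entry {index}: {message}")
```

To run the same check on a real file, replace `json.loads(SAMPLE)` with `json.load(open(path))` and adjust the key names to whatever the file actually uses.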

Title Paper Code Github Stars
YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors Link Github GitHub Repo stars
From Images to Textual Prompts: Zero-Shot Visual Question Answering With Frozen Large Language Models Link Github GitHub Repo stars
Co-Training 2^L Submodels for Visual Recognition Link Github GitHub Repo stars
Token Turing Machines Link Github GitHub Repo stars
How Can Objects Help Action Recognition? Link Github GitHub Repo stars
GINA-3D: Learning To Generate Implicit Neural Assets in the Wild Link Github GitHub Repo stars
Images Speak in Images: A Generalist Painter for In-Context Visual Learning Link Github GitHub Repo stars
Planning-Oriented Autonomous Driving Link Github GitHub Repo stars
Beyond Appearance: A Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks Link Github GitHub Repo stars
InternImage: Exploring Large-Scale Vision Foundation Models With Deformable Convolutions Link Github GitHub Repo stars
DepGraph: Towards Any Structural Pruning Link Github GitHub Repo stars
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale Link Github GitHub Repo stars
Universal Instance Perception As Object Discovery and Retrieval Link Github GitHub Repo stars
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° Link Github GitHub Repo stars
EfficientViT: Memory Efficient Vision Transformer With Cascaded Group Attention Link Github GitHub Repo stars
Unifying Vision, Text, and Layout for Universal Document Processing Link Github GitHub Repo stars
ConvNeXt V2: Co-Designing and Scaling ConvNets With Masked Autoencoders Link Github GitHub Repo stars
FlexiViT: One Model for All Patch Sizes Link Github GitHub Repo stars
CLIPPO: Image-and-Language Understanding From Pixels Only Link Github GitHub Repo stars
Neighborhood Attention Transformer Link Github GitHub Repo stars
SeqTrack: Sequence to Sequence Learning for Visual Object Tracking Link Github GitHub Repo stars
Deep Learning of Partial Graph Matching via Differentiable Top-K Link Github GitHub Repo stars
Mask DINO: Towards a Unified Transformer-Based Framework for Object Detection and Segmentation Link Github GitHub Repo stars
Paint by Example: Exemplar-Based Image Editing With Diffusion Models Link Github GitHub Repo stars
Cut and Learn for Unsupervised Object Detection and Instance Segmentation Link Github GitHub Repo stars
Masked Image Modeling With Local Multi-Scale Reconstruction Link Github GitHub Repo stars
PAniC-3D: Stylized Single-View 3D Reconstruction From Portraits of Anime Characters Link Github GitHub Repo stars
Learning To Generate Image Embeddings With User-Level Differential Privacy Link Github GitHub Repo stars
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures Link Github GitHub Repo stars
InstMove: Instance Motion for Object-Centric Video Segmentation Link Github GitHub Repo stars
Activating More Pixels in Image Super-Resolution Transformer Link Github GitHub Repo stars
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking Link Github GitHub Repo stars
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking Link Github GitHub Repo stars
OpenGait: Revisiting Gait Recognition Towards Better Practicality Link Github GitHub Repo stars
Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks Link Github GitHub Repo stars
All Are Worth Words: A ViT Backbone for Diffusion Models Link Github GitHub Repo stars
Shape, Pose, and Appearance From a Single Image via Bootstrapped Radiance Field Inversion Link Github GitHub Repo stars
MAGE: MAsked Generative Encoder To Unify Representation Learning and Image Synthesis Link Github GitHub Repo stars
Mask-Free Video Instance Segmentation Link Github GitHub Repo stars
Compressing Volumetric Radiance Fields to 1 MB Link Github GitHub Repo stars
PIDNet: A Real-Time Semantic Segmentation Network Inspired by PID Controllers Link Github GitHub Repo stars
DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network Link Github GitHub Repo stars
FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction Link Github GitHub Repo stars
Detecting Everything in the Open World: Towards Universal Object Detection Link Github GitHub Repo stars
Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning Link Github GitHub Repo stars
Cross-Domain Image Captioning With Discriminative Finetuning Link Github GitHub Repo stars
NeuralLift-360: Lifting an In-the-Wild 2D Photo to a 3D Object With 360° Views Link Github GitHub Repo stars
Scaling Language-Image Pre-Training via Masking Link Github GitHub Repo stars
Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation Link Github GitHub Repo stars
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation Link Github GitHub Repo stars
MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors Link Github GitHub Repo stars
ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing Link Github GitHub Repo stars
BiFormer: Vision Transformer With Bi-Level Routing Attention Link Github GitHub Repo stars
All in One: Exploring Unified Video-Language Pre-Training Link Github GitHub Repo stars
Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation Link Github GitHub Repo stars
Wavelet Diffusion Models Are Fast and Scalable Image Generators Link Github GitHub Repo stars
Efficient and Explicit Modelling of Image Hierarchies for Image Restoration Link Github GitHub Repo stars
3D Registration With Maximal Cliques Link Github GitHub Repo stars
Prompting Large Language Models With Answer Heuristics for Knowledge-Based Visual Question Answering Link Github GitHub Repo stars
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks Link Github GitHub Repo stars
DSVT: Dynamic Sparse Voxel Transformer With Rotated Sets Link Github GitHub Repo stars
BEV-LaneDet: An Efficient 3D Lane Detection Based on Virtual Camera via Key-Points Link Github GitHub Repo stars
EDICT: Exact Diffusion Inversion via Coupled Transformations Link Github GitHub Repo stars
Disentangling Writer and Character Styles for Handwriting Generation Link Github GitHub Repo stars
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation Link Github GitHub Repo stars
Conditional Image-to-Video Generation With Latent Flow Diffusion Models Link Github GitHub Repo stars
Inversion-Based Style Transfer With Diffusion Models Link Github GitHub Repo stars
Recurrent Vision Transformers for Object Detection With Event Cameras Link Github GitHub Repo stars
Dense Distinct Query for End-to-End Object Detection Link Github GitHub Repo stars
Neural Video Compression With Diverse Contexts Link Github GitHub Repo stars
Spherical Transformer for LiDAR-Based 3D Recognition Link Github GitHub Repo stars
You Only Segment Once: Towards Real-Time Panoptic Segmentation Link Github GitHub Repo stars
Referring Image Matting Link Github GitHub Repo stars
VideoMAE V2: Scaling Video Masked Autoencoders With Dual Masking Link Github GitHub Repo stars
Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation Link Github GitHub Repo stars
NIKI: Neural Inverse Kinematics With Invertible Neural Networks for 3D Human Pose and Shape Estimation Link Github GitHub Repo stars
High-Fidelity 3D GAN Inversion by Pseudo-Multi-View Optimization Link Github GitHub Repo stars
GeoLayoutLM: Geometric Pre-Training for Visual Information Extraction Link Github GitHub Repo stars
OTAvatar: One-Shot Talking Face Avatar With Controllable Tri-Plane Rendering Link Github GitHub Repo stars
PET-NeuS: Positional Encoding Tri-Planes for Neural Surfaces Link Github GitHub Repo stars
MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation Link Github GitHub Repo stars
Robust Model-Based Face Reconstruction Through Weakly-Supervised Outlier Segmentation Link Github GitHub Repo stars
LargeKernel3D: Scaling Up Kernels in 3D Sparse CNNs Link Github GitHub Repo stars
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation Link Github GitHub Repo stars
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis Link Github GitHub Repo stars
Learning a Sparse Transformer Network for Effective Image Deraining Link Github GitHub Repo stars
Visual Prompt Multi-Modal Tracking Link Github GitHub Repo stars
DeepSolo: Let Transformer Decoder With Explicit Points Solo for Text Spotting Link Github GitHub Repo stars
HumanBench: Towards General Human-Centric Perception With Projector Assisted Pretraining Link Github GitHub Repo stars
Learning Visual Representations via Language-Guided Sampling Link Github GitHub Repo stars
GP-VTON: Towards General Purpose Virtual Try-On via Collaborative Local-Flow Global-Parsing Learning Link Github GitHub Repo stars
MSMDFusion: Fusing LiDAR and Camera at Multiple Scales With Multi-Depth Seeds for 3D Object Detection Link Github GitHub Repo stars
NeRF-RPN: A General Framework for Object Detection in NeRFs Link Github GitHub Repo stars
ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation Link Github GitHub Repo stars
Position-Guided Text Prompt for Vision-Language Pre-Training Link Github GitHub Repo stars
Query-Centric Trajectory Prediction Link Github GitHub Repo stars
Rethinking Out-of-Distribution (OOD) Detection: Masked Image Modeling Is All You Need Link Github GitHub Repo stars
LoGoNet: Towards Accurate 3D Object Detection With Local-to-Global Cross-Modal Fusion Link Github GitHub Repo stars
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training Link Github GitHub Repo stars
BEVHeight: A Robust Framework for Vision-Based Roadside 3D Object Detection Link Github GitHub Repo stars
SimpleNet: A Simple Network for Image Anomaly Detection and Localization Link Github GitHub Repo stars
Think Twice Before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving Link Github GitHub Repo stars
Slide-Transformer: Hierarchical Vision Transformer With Local Self-Attention Link Github GitHub Repo stars
CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion Link Github GitHub Repo stars
Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking Link Github GitHub Repo stars
Identity-Preserving Talking Face Generation With Landmark and Appearance Priors Link Github GitHub Repo stars
LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation Link Github GitHub Repo stars
Delving Into Shape-Aware Zero-Shot Semantic Segmentation Link Github GitHub Repo stars
Aligning Bag of Regions for Open-Vocabulary Object Detection Link Github GitHub Repo stars
ZegCLIP: Towards Adapting CLIP for Zero-Shot Semantic Segmentation Link Github GitHub Repo stars
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers Link Github GitHub Repo stars
Data-Driven Feature Tracking for Event Cameras Link Github GitHub Repo stars
FeatureBooster: Boosting Feature Descriptors With a Lightweight Neural Network Link Github GitHub Repo stars
Omni Aggregation Networks for Lightweight Image Super-Resolution Link Github GitHub Repo stars
Shifted Diffusion for Text-to-Image Generation Link Github GitHub Repo stars
A Generalized Framework for Video Instance Segmentation Link Github GitHub Repo stars
Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild Link Github GitHub Repo stars
LANA: A Language-Capable Navigator for Instruction Following and Generation Link Github GitHub Repo stars
Learning Generative Structure Prior for Blind Text Image Super-Resolution Link Github GitHub Repo stars
Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement Link Github GitHub Repo stars
TriDet: Temporal Action Detection With Relative Boundary Modeling Link Github GitHub Repo stars
GD-MAE: Generative Decoder for MAE Pre-Training on LiDAR Point Clouds Link Github GitHub Repo stars
Fix the Noise: Disentangling Source Feature for Controllable Domain Translation Link Github GitHub Repo stars
Multimodal Prompting With Missing Modalities for Visual Recognition Link Github GitHub Repo stars
Temporal Consistent 3D LiDAR Representation Learning for Semantic Perception in Autonomous Driving Link Github GitHub Repo stars
Enhanced Training of Query-Based Object Detection via Selective Query Recollection Link Github GitHub Repo stars
Data-Efficient Large Scale Place Recognition With Graded Similarity Supervision Link Github GitHub Repo stars
Super-Resolution Neural Operator Link Github GitHub Repo stars
Revisiting Rotation Averaging: Uncertainties and Robust Losses Link Github GitHub Repo stars
PlaneDepth: Self-Supervised Depth Estimation via Orthogonal Planes Link Github GitHub Repo stars
Human Guided Ground-Truth Generation for Realistic Image Super-Resolution Link Github GitHub Repo stars
DynamicDet: A Unified Dynamic Architecture for Object Detection Link Github GitHub Repo stars
FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation Link Github GitHub Repo stars
HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes With Iterative Intertwined Regularization Link Github GitHub Repo stars
Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information Link Github GitHub Repo stars
UniHCP: A Unified Model for Human-Centric Perceptions Link Github GitHub Repo stars
NeuFace: Realistic 3D Neural Face Rendering From Multi-View Images Link Github GitHub Repo stars
Adaptive Assignment for Geometry Aware Local Feature Matching Link Github GitHub Repo stars
Learning To Generate Text-Grounded Mask for Open-World Semantic Segmentation From Only Image-Text Pairs Link Github GitHub Repo stars
CLIP Is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation Link Github GitHub Repo stars
Anchor3DLane: Learning To Regress 3D Anchors for Monocular 3D Lane Detection Link Github GitHub Repo stars
Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision Link Github GitHub Repo stars
CLIP2Protect: Protecting Facial Privacy Using Text-Guided Makeup via Adversarial Latent Search Link Github GitHub Repo stars
DNF: Decouple and Feedback Network for Seeing in the Dark Link Github GitHub Repo stars
Curricular Contrastive Regularization for Physics-Aware Single Image Dehazing Link Github GitHub Repo stars
Scalable, Detailed and Mask-Free Universal Photometric Stereo Link Github GitHub Repo stars
Learning To Dub Movies via Hierarchical Prosody Models Link Github GitHub Repo stars
BoxTeacher: Exploring High-Quality Pseudo Labels for Weakly Supervised Instance Segmentation Link Github GitHub Repo stars
Generic-to-Specific Distillation of Masked Autoencoders Link Github GitHub Repo stars
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding Link Github GitHub Repo stars
Zero-Shot Generative Model Adaptation via Image-Specific Prompt Learning Link Github GitHub Repo stars
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? Link Github GitHub Repo stars
Unifying Short and Long-Term Tracking With Graph Hierarchies Link Github GitHub Repo stars
Hierarchical Fine-Grained Image Forgery Detection and Localization Link Github GitHub Repo stars
CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution Link Github GitHub Repo stars
Vita-CLIP: Video and Text Adaptive CLIP via Multimodal Prompting Link Github GitHub Repo stars
Masked Image Training for Generalizable Deep Image Denoising Link Github GitHub Repo stars
CLIP2Scene: Towards Label-Efficient 3D Scene Understanding by CLIP Link Github GitHub Repo stars
Efficient Frequency Domain-Based Transformers for High-Quality Image Deblurring Link Github GitHub Repo stars
Multimodal Industrial Anomaly Detection via Hybrid Fusion Link Github GitHub Repo stars
LinK: Linear Kernel for LiDAR-Based 3D Perception Link Github GitHub Repo stars
V2X-Seq: A Large-Scale Sequential Dataset for Vehicle-Infrastructure Cooperative Perception and Forecasting Link Github GitHub Repo stars
Meta Architecture for Point Cloud Analysis Link Github GitHub Repo stars
CF-Font: Content Fusion for Few-Shot Font Generation Link Github GitHub Repo stars
ViTs for SITS: Vision Transformers for Satellite Image Time Series Link Github GitHub Repo stars
ISBNet: A 3D Point Cloud Instance Segmentation Network With Instance-Aware Sampling and Box-Aware Dynamic Convolution Link Github GitHub Repo stars
A Light Weight Model for Active Speaker Detection Link Github GitHub Repo stars
Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark Link Github GitHub Repo stars
DeltaEdit: Exploring Text-Free Training for Text-Driven Image Manipulation Link Github GitHub Repo stars
Understanding Imbalanced Semantic Segmentation Through Neural Collapse Link Github GitHub Repo stars
MP-Former: Mask-Piloted Transformer for Image Segmentation Link Github GitHub Repo stars
Hierarchical Dense Correlation Distillation for Few-Shot Segmentation Link Github GitHub Repo stars
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection Link Github GitHub Repo stars
IFSeg: Image-Free Semantic Segmentation via Vision-Language Model Link Github GitHub Repo stars
AutoFocusFormer: Image Segmentation off the Grid Link Github GitHub Repo stars
EqMotion: Equivariant Multi-Agent Motion Prediction With Invariant Interaction Reasoning Link Github GitHub Repo stars
GrowSP: Unsupervised Semantic Segmentation of 3D Point Clouds Link Github GitHub Repo stars
Solving 3D Inverse Problems Using Pre-Trained 2D Diffusion Models Link Github GitHub Repo stars
Finetune Like You Pretrain: Improved Finetuning of Zero-Shot Vision Models Link Github GitHub Repo stars
Augmentation Matters: A Simple-Yet-Effective Approach to Semi-Supervised Semantic Segmentation Link Github GitHub Repo stars
Two-View Geometry Scoring Without Correspondences Link Github GitHub Repo stars
CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability Link Github GitHub Repo stars
Learning Semantic Relationship Among Instances for Image-Text Matching Link Github GitHub Repo stars
LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation Link Github GitHub Repo stars
Robust Mean Teacher for Continual and Gradual Test-Time Adaptation Link Github GitHub Repo stars
AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning With Masked Autoencoders Link Github GitHub Repo stars
Directional Connectivity-Based Segmentation of Medical Images Link Github GitHub Repo stars
Zero-Shot Referring Image Segmentation With Global-Local Context Features Link Github GitHub Repo stars
Contrastive Semi-Supervised Learning for Underwater Image Restoration via Reliable Bank Link Github GitHub Repo stars
Dynamic Focus-Aware Positional Queries for Semantic Segmentation Link Github GitHub Repo stars
Vision Transformer With Super Token Sampling Link Github GitHub Repo stars
Sampling Is Matter: Point-Guided 3D Human Mesh Reconstruction Link Github GitHub Repo stars
3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds Link Github GitHub Repo stars
PROB: Probabilistic Objectness for Open World Object Detection Link Github GitHub Repo stars
Benchmarking Robustness of 3D Object Detection to Common Corruptions Link Github GitHub Repo stars
Adaptive Sparse Convolutional Networks With Global Context Enhancement for Faster Object Detection on Drone Images Link Github GitHub Repo stars
MARLIN: Masked Autoencoder for Facial Video Representation LearnINg Link Github GitHub Repo stars
ConZIC: Controllable Zero-Shot Image Captioning by Sampling-Based Polishing Link Github GitHub Repo stars
Interactive and Explainable Region-Guided Radiology Report Generation Link Github GitHub Repo stars
SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection Link Github GitHub Repo stars
Real-Time 6K Image Rescaling With Rate-Distortion Optimization Link Github GitHub Repo stars
Revisiting Temporal Modeling for CLIP-Based Image-to-Video Knowledge Transferring Link Github GitHub Repo stars
Frequency-Modulated Point Cloud Rendering With Easy Editing Link Github GitHub Repo stars
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-Supervised Video Representation Learning Link Github GitHub Repo stars
BBDM: Image-to-Image Translation With Brownian Bridge Diffusion Models Link Github GitHub Repo stars
LAVENDER: Unifying Video-Language Understanding As Masked Language Modeling Link Github GitHub Repo stars
DynaFed: Tackling Client Data Heterogeneity With Global Dynamics Link Github GitHub Repo stars
Frame Flexible Network Link Github GitHub Repo stars
GeoMAE: Masked Geometric Target Prediction for Self-Supervised Point Cloud Pre-Training Link Github GitHub Repo stars
Collaboration Helps Camera Overtake LiDAR in 3D Detection Link Github GitHub Repo stars
CODA-Prompt: COntinual Decomposed Attention-Based Prompting for Rehearsal-Free Continual Learning Link Github GitHub Repo stars
RangeViT: Towards Vision Transformers for 3D Semantic Segmentation in Autonomous Driving Link Github GitHub Repo stars
Generalized Relation Modeling for Transformer Tracking Link Github GitHub Repo stars
WildLight: In-the-Wild Inverse Rendering With a Flashlight Link Github GitHub Repo stars
Equiangular Basis Vectors Link Github GitHub Repo stars
DualRefine: Self-Supervised Depth and Pose Estimation Through Iterative Epipolar Sampling and Refinement Toward Equilibrium Link Github GitHub Repo stars
Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-Identification Link Github GitHub Repo stars
Diversity-Aware Meta Visual Prompting Link Github GitHub Repo stars
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training Link Github GitHub Repo stars
Texts as Images in Prompt Tuning for Multi-Label Image Recognition Link Github GitHub Repo stars
PointConvFormer: Revenge of the Point-Based Convolution Link Github GitHub Repo stars
Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection Link Github GitHub Repo stars
RILS: Masked Visual Reconstruction in Language Semantic Space Link Github GitHub Repo stars
Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization Link Github GitHub Repo stars
StyleRes: Transforming the Residuals for Real Image Editing With StyleGAN Link Github GitHub Repo stars
SmallCap: Lightweight Image Captioning Prompted With Retrieval Augmentation Link Github GitHub Repo stars
Learning With Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning Link Github GitHub Repo stars
Handwritten Text Generation From Visual Archetypes Link Github GitHub Repo stars
Post-Training Quantization on Diffusion Models Link Github GitHub Repo stars
DPF: Learning Dense Prediction Fields With Weak Supervision Link Github GitHub Repo stars
OSRT: Omnidirectional Image Super-Resolution With Distortion-Aware Transformer Link Github GitHub Repo stars
SCPNet: Semantic Scene Completion on Point Cloud Link Github GitHub Repo stars
Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation Link Github GitHub Repo stars
Novel Class Discovery for 3D Point Cloud Semantic Segmentation Link Github GitHub Repo stars
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation With Cross-Scale Distortion Awareness Link Github GitHub Repo stars
M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis Link Github GitHub Repo stars
Masked and Adaptive Transformer for Exemplar Based Image Translation Link Github GitHub Repo stars
DCFace: Synthetic Face Generation With Dual Condition Diffusion Model Link Github GitHub Repo stars
T-SEA: Transfer-Based Self-Ensemble Attack on Object Detection Link Github GitHub Repo stars
SMPConv: Self-Moving Point Representations for Continuous Convolution Link Github GitHub Repo stars
N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution Link Github GitHub Repo stars
A Large-Scale Homography Benchmark Link Github GitHub Repo stars
GeoMVSNet: Learning Multi-View Stereo With Geometry Perception Link Github GitHub Repo stars
Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression Link Github GitHub Repo stars
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks Link Github GitHub Repo stars
Learning Transferable Spatiotemporal Representations From Natural Script Knowledge Link Github GitHub Repo stars
Rethinking Federated Learning With Domain Shift: A Prototype View Link Github GitHub Repo stars
Visual-Language Prompt Tuning With Knowledge-Guided Context Optimization Link Github GitHub Repo stars
Dynamic Coarse-To-Fine Learning for Oriented Tiny Object Detection Link Github GitHub Repo stars
Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning Link Github GitHub Repo stars
Joint Video Multi-Frame Interpolation and Deblurring Under Unknown Exposure Time Link Github GitHub Repo stars
Guiding Pseudo-Labels With Uncertainty Estimation for Source-Free Unsupervised Domain Adaptation Link Github GitHub Repo stars
Attribute-Preserving Face Dataset Anonymization via Latent Code Optimization Link Github GitHub Repo stars
Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process Link Github GitHub Repo stars
A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation From a Single RGB Image Link Github GitHub Repo stars
DexArt: Benchmarking Generalizable Dexterous Manipulation With Articulated Objects Link Github GitHub Repo stars
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition With Pre-Trained Vision-Language Models Link Github GitHub Repo stars
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation Link Github GitHub Repo stars
Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation Link Github GitHub Repo stars
Visibility Constrained Wide-Band Illumination Spectrum Design for Seeing-in-the-Dark Link Github GitHub Repo stars
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud Link Github GitHub Repo stars
Sharpness-Aware Gradient Matching for Domain Generalization Link Github GitHub Repo stars
Deep Graph-Based Spatial Consistency for Robust Non-Rigid Point Cloud Registration Link Github GitHub Repo stars
Decoupled Multimodal Distilling for Emotion Recognition Link Github GitHub Repo stars
Open-Vocabulary Point-Cloud Object Detection Without 3D Annotation Link Github GitHub Repo stars
An Image Quality Assessment Dataset for Portraits Link Github GitHub Repo stars
Leveraging Hidden Positives for Unsupervised Semantic Segmentation Link Github GitHub Repo stars
Semantic-Conditional Diffusion Networks for Image Captioning Link Github GitHub Repo stars
STMixer: A One-Stage Sparse Action Detector Link Github GitHub Repo stars
Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset Link Github GitHub Repo stars
Joint Visual Grounding and Tracking With Natural Language Specification Link Github GitHub Repo stars
Where Is My Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization Link Github GitHub Repo stars
Power Bundle Adjustment for Large-Scale 3D Reconstruction Link Github GitHub Repo stars
Rethinking Domain Generalization for Face Anti-Spoofing: Separability and Alignment Link Github GitHub Repo stars
A Unified Pyramid Recurrent Network for Video Frame Interpolation Link Github GitHub Repo stars
Revisiting Reverse Distillation for Anomaly Detection Link Github GitHub Repo stars
SOOD: Towards Semi-Supervised Oriented Object Detection Link Github GitHub Repo stars
POEM: Reconstructing Hand in a Point Embedded Multi-View Stereo Link Github GitHub Repo stars
Towards Efficient Use of Multi-Scale Features in Transformer-Based Object Detectors Link Github GitHub Repo stars
QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation Link Github GitHub Repo stars
MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID Link Github GitHub Repo stars
Towards Better Gradient Consistency for Neural Signed Distance Functions via Level Set Alignment Link Github GitHub Repo stars
Task Residual for Tuning Vision-Language Models Link Github GitHub Repo stars
Structured Sparsity Learning for Efficient Video Super-Resolution Link Github GitHub Repo stars
Uncertainty-Aware Unsupervised Image Deblurring With Deep Residual Prior Link Github GitHub Repo stars
Imitation Learning As State Matching via Differentiable Physics Link Github GitHub Repo stars
PEAL: Prior-Embedded Explicit Attention Learning for Low-Overlap Point Cloud Registration Link Github GitHub Repo stars
Twin Contrastive Learning With Noisy Labels Link Github GitHub Repo stars
TarViS: A Unified Approach for Target-Based Video Segmentation Link Github GitHub Repo stars
Clover: Towards a Unified Video-Language Alignment and Fusion Model Link Github GitHub Repo stars
Towards Realistic Long-Tailed Semi-Supervised Learning: Consistency Is All You Need Link Github GitHub Repo stars
Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers Link Github GitHub Repo stars
Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images Link Github GitHub Repo stars
Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos Link Github GitHub Repo stars
Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection With Single Point Supervision Link Github GitHub Repo stars
Interactive Segmentation As Gaussian Process Classification Link Github GitHub Repo stars
PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation Link Github GitHub Repo stars
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization Link Github GitHub Repo stars
Adaptive Patch Deformation for Textureless-Resilient Multi-View Stereo Link Github GitHub Repo stars
TrojDiff: Trojan Attacks on Diffusion Models With Diverse Targets Link Github GitHub Repo stars
Exploring Discontinuity for Video Frame Interpolation Link Github GitHub Repo stars
Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections Link Github GitHub Repo stars
Affordance Grounding From Demonstration Video To Target Image Link Github GitHub Repo stars
Texture-Guided Saliency Distilling for Unsupervised Salient Object Detection Link Github GitHub Repo stars
How to Backdoor Diffusion Models? Link Github GitHub Repo stars
LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising Link Github GitHub Repo stars
Neuron Structure Modeling for Generalizable Remote Physiological Measurement Link Github GitHub Repo stars
Boundary-Enhanced Co-Training for Weakly Supervised Semantic Segmentation Link Github GitHub Repo stars
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection Link Github GitHub Repo stars
RiDDLE: Reversible and Diversified De-Identification With Latent Encryptor Link Github GitHub Repo stars
Perception-Oriented Single Image Super-Resolution Using Optimal Objective Estimation Link Github GitHub Repo stars
Learning Federated Visual Prompt in Null Space for MRI Reconstruction Link Github GitHub Repo stars
Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution Link Github GitHub Repo stars
Learning Distortion Invariant Representation for Image Restoration From a Causality Perspective Link Github GitHub Repo stars
PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery Link Github GitHub Repo stars
MSF: Motion-Guided Sequential Fusion for Efficient 3D Object Detection From Point Cloud Sequences Link Github GitHub Repo stars
CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection Link Github GitHub Repo stars
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective Link Github GitHub Repo stars
Polynomial Implicit Neural Representations for Large Diverse Datasets Link Github GitHub Repo stars
3D-Aware Multi-Class Image-to-Image Translation With NeRFs Link Github GitHub Repo stars
Masked Motion Encoding for Self-Supervised Video Representation Learning Link Github GitHub Repo stars
Histopathology Whole Slide Image Analysis With Heterogeneous Graph Representation Learning Link Github GitHub Repo stars
Towards Scalable Neural Representation for Diverse Videos Link Github GitHub Repo stars
CLOTH4D: A Dataset for Clothed Human Reconstruction Link Github GitHub Repo stars
Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration Link Github GitHub Repo stars
Learning Procedure-Aware Video Representation From Instructional Videos and Their Narrations Link Github GitHub Repo stars
Robust Test-Time Adaptation in Dynamic Scenarios Link Github GitHub Repo stars
Task-Specific Fine-Tuning via Variational Information Bottleneck for Weakly-Supervised Pathology Whole Slide Image Classification Link Github GitHub Repo stars
FashionSAP: Symbols and Attributes Prompt for Fine-Grained Fashion Vision-Language Pre-Training Link Github GitHub Repo stars
MOSO: Decomposing MOtion, Scene and Object for Video Prediction Link Github GitHub Repo stars
ALOFT: A Lightweight MLP-Like Architecture With Dynamic Low-Frequency Transform for Domain Generalization Link Github GitHub Repo stars
A Whac-a-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others Link Github GitHub Repo stars
SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency Link Github GitHub Repo stars
Best of Both Worlds: Multimodal Contrastive Learning With Tabular and Imaging Data Link Github GitHub Repo stars
Viewpoint Equivariance for Multi-View 3D Object Detection Link Github GitHub Repo stars
DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection Link Github GitHub Repo stars
Regularizing Second-Order Influences for Continual Learning Link Github GitHub Repo stars
Backdoor Defense via Adaptively Splitting Poisoned Dataset Link Github GitHub Repo stars
Towards Artistic Image Aesthetics Assessment: A Large-Scale Dataset and a New Method Link Github GitHub Repo stars
JacobiNeRF: NeRF Shaping With Mutual Information Gradients Link Github GitHub Repo stars
Accelerating Vision-Language Pretraining With Free Language Modeling Link Github GitHub Repo stars
Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection Link Github GitHub Repo stars
PA&DA: Jointly Sampling Path and Data for Consistent NAS Link Github GitHub Repo stars
An Empirical Study of End-to-End Video-Language Transformers With Masked Visual Modeling Link Github GitHub Repo stars
QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity Link Github GitHub Repo stars
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection Link Github GitHub Repo stars
ZBS: Zero-Shot Background Subtraction via Instance-Level Background Modeling and Foreground Selection Link Github GitHub Repo stars
Learning the Distribution of Errors in Stereo Matching for Joint Disparity and Uncertainty Estimation Link Github GitHub Repo stars
AdaptiveMix: Improving GAN Training via Feature Space Shrinkage Link Github GitHub Repo stars
Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation Link Github GitHub Repo stars
Camouflaged Object Detection With Feature Decomposition and Edge Reconstruction Link Github GitHub Repo stars
A Strong Baseline for Generalized Few-Shot Semantic Segmentation Link Github GitHub Repo stars
FrustumFormer: Adaptive Instance-Aware Resampling for Multi-View 3D Detection Link Github GitHub Repo stars
Global-to-Local Modeling for Video-Based 3D Human Pose and Shape Estimation Link Github GitHub Repo stars
Siamese DETR Link Github GitHub Repo stars
Distribution Shift Inversion for Out-of-Distribution Prediction Link Github GitHub Repo stars
Towards Unified Scene Text Spotting Based on Sequence Generation Link Github GitHub Repo stars
CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer Link Github GitHub Repo stars
Supervised Masked Knowledge Distillation for Few-Shot Transformers Link Github GitHub Repo stars
MELTR: Meta Loss Transformer for Learning To Fine-Tune Video Foundation Models Link Github GitHub Repo stars
Unsupervised Inference of Signed Distance Functions From Single Sparse Point Clouds Without Learning Priors Link Github GitHub Repo stars
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation Link Github GitHub Repo stars
Adaptive Human Matting for Dynamic Videos Link Github GitHub Repo stars
Making Vision Transformers Efficient From a Token Sparsification View Link Github GitHub Repo stars
ViPLO: Vision Transformer Based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection Link Github GitHub Repo stars
Bi-Directional Distribution Alignment for Transductive Zero-Shot Learning Link Github GitHub Repo stars
ACL-SPC: Adaptive Closed-Loop System for Self-Supervised Point Cloud Completion Link Github GitHub Repo stars
Weakly Supervised Posture Mining for Fine-Grained Classification Link Github GitHub Repo stars
H2ONet: Hand-Occlusion-and-Orientation-Aware Network for Real-Time 3D Hand Mesh Reconstruction Link Github GitHub Repo stars
E2PN: Efficient SE(3)-Equivariant Point Network Link Github GitHub Repo stars
Audio-Visual Grouping Network for Sound Localization From Mixtures Link Github GitHub Repo stars
StyleIPSB: Identity-Preserving Semantic Basis of StyleGAN for High Fidelity Face Swapping Link Github GitHub Repo stars
MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset Link Github GitHub Repo stars
Minimizing the Accumulated Trajectory Error To Improve Dataset Distillation Link Github GitHub Repo stars
Dynamically Instance-Guided Adaptation: A Backward-Free Approach for Test-Time Domain Adaptive Semantic Segmentation Link Github GitHub Repo stars
Glocal Energy-Based Learning for Few-Shot Open-Set Recognition Link Github GitHub Repo stars
Indiscernible Object Counting in Underwater Scenes Link Github GitHub Repo stars
Curricular Object Manipulation in LiDAR-Based Object Detection Link Github GitHub Repo stars
TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning With Structure-Trajectory Prompted Reconstruction for Person Re-Identification Link Github GitHub Repo stars
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification Link Github GitHub Repo stars
HOICLIP: Efficient Knowledge Transfer for HOI Detection With Vision-Language Models Link Github GitHub Repo stars
Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers Link Github GitHub Repo stars
Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection Link Github GitHub Repo stars
DAA: A Delta Age AdaIN Operation for Age Estimation via Binary Code Transformer Link Github GitHub Repo stars
Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution Link Github GitHub Repo stars
The Best Defense Is a Good Offense: Adversarial Augmentation Against Adversarial Attacks Link Github GitHub Repo stars
Dynamic Conceptional Contrastive Learning for Generalized Category Discovery Link Github GitHub Repo stars
Class Adaptive Network Calibration Link Github GitHub Repo stars
Instance-Specific and Model-Adaptive Supervision for Semi-Supervised Semantic Segmentation Link Github GitHub Repo stars
FAC: 3D Representation Learning via Foreground Aware Feature Contrast Link Github GitHub Repo stars
NICO++: Towards Better Benchmarking for Domain Generalization Link Github GitHub Repo stars
Bridging Search Region Interaction With Template for RGB-T Tracking Link Github GitHub Repo stars
Rotation-Invariant Transformer for Point Cloud Matching Link Github GitHub Repo stars
Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm Link Github GitHub Repo stars
CXTrack: Improving 3D Point Cloud Tracking With Contextual Information Link Github GitHub Repo stars
CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition With Variational Alignment Link Github GitHub Repo stars
Revisiting Residual Networks for Adversarial Robustness Link Github GitHub Repo stars
Upcycling Models Under Domain and Category Shift Link Github GitHub Repo stars
Real-Time Multi-Person Eyeblink Detection in the Wild for Untrimmed Video Link Github GitHub Repo stars
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos Link Github GitHub Repo stars
NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation Link Github GitHub Repo stars
Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification Link Github GitHub Repo stars
Detecting Backdoors in Pre-Trained Encoders Link Github GitHub Repo stars
Equivalent Transformation and Dual Stream Network Construction for Mobile Image Super-Resolution Link Github GitHub Repo stars
TAPS3D: Text-Guided 3D Textured Shape Generation From Pseudo Supervision Link Github GitHub Repo stars
Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container Link Github GitHub Repo stars
VNE: An Effective Method for Improving Deep Representation by Manipulating Eigenvalue Distribution Link Github GitHub Repo stars
Re-Thinking Federated Active Learning Based on Inter-Class Diversity Link Github GitHub Repo stars
Joint Appearance and Motion Learning for Efficient Rolling Shutter Correction Link Github GitHub Repo stars
Federated Incremental Semantic Segmentation Link Github GitHub Repo stars
Evading Forensic Classifiers With Attribute-Conditioned Adversarial Faces Link Github GitHub Repo stars
Learning Common Rationale To Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems Link Github GitHub Repo stars
Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition Link Github GitHub Repo stars
Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data Link Github GitHub Repo stars
Optimization-Inspired Cross-Attention Transformer for Compressive Sensing Link Github GitHub Repo stars
Context-Based Trit-Plane Coding for Progressive Image Compression Link Github GitHub Repo stars
Boosting Accuracy and Robustness of Student Models via Adaptive Adversarial Distillation Link Github GitHub Repo stars
Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection Link Github GitHub Repo stars
GradICON: Approximate Diffeomorphisms via Gradient Inverse Consistency Link Github GitHub Repo stars
BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation Link Github GitHub Repo stars
On the Effects of Self-Supervision and Contrastive Alignment in Deep Multi-View Clustering Link Github GitHub Repo stars
Diverse 3D Hand Gesture Prediction From Body Dynamics by Bilateral Hand Disentanglement Link Github GitHub Repo stars
sRGB Real Noise Synthesizing With Neighboring Correlation-Aware Noise Model Link Github GitHub Repo stars
Reliability in Semantic Segmentation: Are We on the Right Track? Link Github GitHub Repo stars
Diversity-Measurable Anomaly Detection Link Github GitHub Repo stars
ABCD: Arbitrary Bitwise Coefficient for De-Quantization Link Github GitHub Repo stars
Block Selection Method for Using Feature Norm in Out-of-Distribution Detection Link Github GitHub Repo stars
Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution Link Github GitHub Repo stars
Two-Shot Video Object Segmentation Link Github GitHub Repo stars
MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Action Recognition Link Github GitHub Repo stars
Extracting Class Activation Maps From Non-Discriminative Features As Well Link Github GitHub Repo stars
Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception Link Github GitHub Repo stars
MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos Link Github GitHub Repo stars
Unsupervised Sampling Promoting for Stochastic Human Trajectory Prediction Link Github GitHub Repo stars
Visual Prompt Tuning for Generative Transfer Learning Link Github GitHub Repo stars
Improved Test-Time Adaptation for Domain Generalization Link Github GitHub Repo stars
Watch or Listen: Robust Audio-Visual Speech Recognition With Visual Corruption Modeling and Reliability Scoring Link Github GitHub Repo stars
Enlarging Instance-Specific and Class-Specific Information for Open-Set Action Recognition Link Github GitHub Repo stars
Inferring and Leveraging Parts From Object Shape for Improving Semantic Image Synthesis Link Github GitHub Repo stars
DiGA: Distil To Generalize and Then Adapt for Domain Adaptive Semantic Segmentation Link Github GitHub Repo stars
Learning a Practical SDR-to-HDRTV Up-Conversion Using New Dataset and Degradation Models Link Github GitHub Repo stars
SliceMatch: Geometry-Guided Aggregation for Cross-View Pose Estimation Link Github GitHub Repo stars
DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection Link Github GitHub Repo stars
On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks Link Github GitHub Repo stars
ScarceNet: Animal Pose Estimation With Scarce Annotations Link Github GitHub Repo stars
Deep Fair Clustering via Maximizing and Minimizing Mutual Information: Theory, Algorithm and Metric Link Github GitHub Repo stars
Stimulus Verification Is a Universal and Effective Sampler in Multi-Modal Human Trajectory Prediction Link Github GitHub Repo stars
Preserving Linear Separability in Continual Learning by Backward Feature Projection Link Github GitHub Repo stars
Generalizable Implicit Neural Representations via Instance Pattern Composers Link Github GitHub Repo stars
Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching Link Github GitHub Repo stars
Progressive Neighbor Consistency Mining for Correspondence Pruning Link Github GitHub Repo stars
Trainable Projected Gradient Method for Robust Fine-Tuning Link Github GitHub Repo stars
Independent Component Alignment for Multi-Task Learning Link Github GitHub Repo stars
Deep Arbitrary-Scale Image Super-Resolution via Scale-Equivariance Pursuit Link Github GitHub Repo stars
DualVector: Unsupervised Vector Font Synthesis With Dual-Part Representation Link Github GitHub Repo stars
Interventional Bag Multi-Instance Learning on Whole-Slide Pathological Images Link Github GitHub Repo stars
Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection Link Github GitHub Repo stars
Partial Network Cloning Link Github GitHub Repo stars
Ultra-High Resolution Segmentation With Ultra-Rich Context: A Novel Benchmark Link Github GitHub Repo stars
Object Detection With Self-Supervised Scene Adaptation Link Github GitHub Repo stars
Generative Bias for Robust Visual Question Answering Link Github GitHub Repo stars
MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation Link Github GitHub Repo stars
Coreset Sampling From Open-Set for Fine-Grained Self-Supervised Learning Link Github GitHub Repo stars
Sparsely Annotated Semantic Segmentation With Adaptive Gaussian Mixtures Link Github GitHub Repo stars
SE-ORNet: Self-Ensembling Orientation-Aware Network for Unsupervised Point Cloud Shape Correspondence Link Github GitHub Repo stars
B-Spline Texture Coefficients Estimator for Screen Content Image Super-Resolution Link Github GitHub Repo stars
High-Fidelity Facial Avatar Reconstruction From Monocular Video With Generative Priors Link Github GitHub Repo stars
DivClust: Controlling Diversity in Deep Clustering Link Github GitHub Repo stars
Large-Scale Training Data Search for Object Re-Identification Link Github GitHub Repo stars
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning Link Github GitHub Repo stars
CREPE: Can Vision-Language Foundation Models Reason Compositionally? Link Github GitHub Repo stars
Semi-Supervised Domain Adaptation With Source Label Adaptation Link Github GitHub Repo stars
StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning Link Github GitHub Repo stars
Unlearnable Clusters: Towards Label-Agnostic Unlearnable Examples Link Github GitHub Repo stars
ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images Link Github GitHub Repo stars
PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification Link Github GitHub Repo stars
DIP: Dual Incongruity Perceiving Network for Sarcasm Detection Link Github GitHub Repo stars
Weakly Supervised Video Representation Learning With Unaligned Text for Sequential Videos Link Github GitHub Repo stars
PVT-SSD: Single-Stage 3D Object Detector With Point-Voxel Transformer Link Github GitHub Repo stars
Continuous Intermediate Token Learning With Implicit Motion Manifold for Keyframe Based Motion Interpolation Link Github GitHub Repo stars
VQACL: A Novel Visual Question Answering Continual Learning Setting Link Github GitHub Repo stars
RONO: Robust Discriminative Learning With Noisy Labels for 2D-3D Cross-Modal Retrieval Link Github GitHub Repo stars
PCT-Net: Full Resolution Image Harmonization Using Pixel-Wise Color Transformations Link Github GitHub Repo stars
MixTeacher: Mining Promising Labels With Mixed Scale Teacher for Semi-Supervised Object Detection Link Github GitHub Repo stars
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training Link Github GitHub Repo stars
Computationally Budgeted Continual Learning: What Does Matter? Link Github GitHub Repo stars
PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers Link Github GitHub Repo stars
Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network Link Github GitHub Repo stars
R2Former: Unified Retrieval and Reranking Transformer for Place Recognition Link Github GitHub Repo stars
Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization Link Github GitHub Repo stars
Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement Link Github GitHub Repo stars
DistilPose: Tokenized Pose Regression With Heatmap Distillation Link Github GitHub Repo stars
Bitstream-Corrupted JPEG Images Are Restorable: Two-Stage Compensation and Alignment Framework for Image Restoration Link Github GitHub Repo stars
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks Link Github GitHub Repo stars
BiCro: Noisy Correspondence Rectification for Multi-Modality Data via Bi-Directional Cross-Modal Similarity Consistency Link Github GitHub Repo stars
Representation Learning for Visual Object Tracking by Masked Appearance Transfer Link Github GitHub Repo stars
AnchorFormer: Point Cloud Completion From Discriminative Nodes Link Github GitHub Repo stars
TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation Link Github GitHub Repo stars
Proximal Splitting Adversarial Attack for Semantic Segmentation Link Github GitHub Repo stars
NVTC: Nonlinear Vector Transform Coding Link Github GitHub Repo stars
CLAMP: Prompt-Based Contrastive Learning for Connecting Language and Animal Pose Link Github GitHub Repo stars
Enhancing the Self-Universality for Transferable Targeted Attacks Link Github GitHub Repo stars
Randomized Adversarial Training via Taylor Expansion Link Github GitHub Repo stars
Long Range Pooling for 3D Large-Scale Scene Understanding Link Github GitHub Repo stars
Context-Aware Alignment and Mutual Masking for 3D-Language Pre-Training Link Github GitHub Repo stars
Federated Domain Generalization With Generalization Adjustment Link Github GitHub Repo stars
CoMFormer: Continual Learning in Semantic and Panoptic Segmentation Link Github GitHub Repo stars
Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning Link Github GitHub Repo stars
MIST: Multi-Modal Iterative Spatial-Temporal Transformer for Long-Form Video Question Answering Link Github GitHub Repo stars
STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition Link Github GitHub Repo stars
An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions Link Github GitHub Repo stars
Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions Link Github GitHub Repo stars
Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning Link Github GitHub Repo stars
Long-Tailed Visual Recognition via Self-Heterogeneous Integration With Knowledge Excavation Link Github GitHub Repo stars
Bias Mimicking: A Simple Sampling Approach for Bias Mitigation Link Github GitHub Repo stars
OReX: Object Reconstruction From Planar Cross-Sections Using Neural Fields Link Github GitHub Repo stars
Multi-Level Logit Distillation Link Github GitHub Repo stars
Real-Time Evaluation in Online Continual Learning: A New Hope Link Github GitHub Repo stars
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction Link Github GitHub Repo stars
CABM: Content-Aware Bit Mapping for Single Image Super-Resolution Network With Large Input Link Github GitHub Repo stars
Boosting Video Object Segmentation via Space-Time Correspondence Learning Link Github GitHub Repo stars
Hunting Sparsity: Density-Guided Contrastive Learning for Semi-Supervised Semantic Segmentation Link Github GitHub Repo stars
TINC: Tree-Structured Implicit Neural Compression Link Github GitHub Repo stars
Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels Link Github GitHub Repo stars
DeGPR: Deep Guided Posterior Regularization for Multi-Class Cell Detection and Counting Link Github GitHub Repo stars
Large-Capacity and Flexible Video Steganography via Invertible Neural Network Link Github GitHub Repo stars
VDN-NeRF: Resolving Shape-Radiance Ambiguity via View-Dependence Normalization Link Github GitHub Repo stars
LINe: Out-of-Distribution Detection by Leveraging Important Neurons Link Github GitHub Repo stars
Neural Transformation Fields for Arbitrary-Styled Font Generation Link Github GitHub Repo stars
Super-CLEVR: A Virtual Benchmark To Diagnose Domain Robustness in Visual Reasoning Link Github GitHub Repo stars
Few-Shot Class-Incremental Learning via Class-Aware Bilateral Distillation Link Github GitHub Repo stars
Geometry and Uncertainty-Aware 3D Point Cloud Class-Incremental Semantic Segmentation Link Github GitHub Repo stars
FCC: Feature Clusters Compression for Long-Tailed Visual Recognition Link Github GitHub Repo stars
Neural Vector Fields: Implicit Representation by Explicit Learning Link Github GitHub Repo stars
Learning Action Changes by Measuring Verb-Adverb Textual Relationships Link Github GitHub Repo stars
Make Landscape Flatter in Differentially Private Federated Learning Link Github GitHub Repo stars
Confidence-Aware Personalized Federated Learning via Variational Expectation Maximization Link Github GitHub Repo stars
Unsupervised Visible-Infrared Person Re-Identification via Progressive Graph Matching and Alternate Learning Link Github GitHub Repo stars
Knowledge Combination To Learn Rotated Detection Without Rotated Annotation Link Github GitHub Repo stars
Uncurated Image-Text Datasets: Shedding Light on Demographic Bias Link Github GitHub Repo stars
Symmetric Shape-Preserving Autoencoder for Unsupervised Real Scene Point Cloud Completion Link Github GitHub Repo stars
PointCert: Point Cloud Classification With Deterministic Certified Robustness Guarantees Link Github GitHub Repo stars
Advancing Visual Grounding With Scene Knowledge: Benchmark and Method Link Github GitHub Repo stars
Boosting Low-Data Instance Segmentation by Unsupervised Pre-Training With Saliency Prompt Link Github GitHub Repo stars
3D Human Pose Estimation With Spatio-Temporal Criss-Cross Attention Link Github GitHub Repo stars
Self-Supervised 3D Scene Flow Estimation Guided by Superpoints Link Github GitHub Repo stars
End-to-End Video Matting With Trimap Propagation Link Github GitHub Repo stars
Transductive Few-Shot Learning With Prototype-Based Label Propagation by Iterative Graph Refinement Link Github GitHub Repo stars
Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection Link Github GitHub Repo stars
RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation With Natural Prompts Link Github GitHub Repo stars
Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising Link Github GitHub Repo stars
Fine-Grained Image-Text Matching by Cross-Modal Hard Aligning Network Link Github GitHub Repo stars
MAGVLT: Masked Generative Vision-and-Language Transformer Link Github GitHub Repo stars
Focused and Collaborative Feedback Integration for Interactive Image Segmentation Link Github GitHub Repo stars
OpenMix: Exploring Outlier Samples for Misclassification Detection Link Github GitHub Repo stars
Adaptive Data-Free Quantization Link Github GitHub Repo stars
VideoTrack: Learning To Track Objects via Video Transformer Link Github GitHub Repo stars
Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module Link Github GitHub Repo stars
Towards Better Stability and Adaptability: Improve Online Self-Training for Model Adaptation in Semantic Segmentation Link Github GitHub Repo stars
Contrastive Grouping With Transformer for Referring Image Segmentation Link Github GitHub Repo stars
Fuzzy Positive Learning for Semi-Supervised Semantic Segmentation Link Github GitHub Repo stars
3D-POP – An Automated Annotation Approach to Facilitate Markerless 2D-3D Tracking of Freely Moving Birds With Marker-Based Motion Capture Link Github GitHub Repo stars
PointClustering: Unsupervised Point Cloud Pre-Training Using Transformation Invariance in Clustering Link Github GitHub Repo stars
Towards Open-World Segmentation of Parts Link Github GitHub Repo stars
PCR: Proxy-Based Contrastive Replay for Online Class-Incremental Continual Learning Link Github GitHub Repo stars
Quantum Multi-Model Fitting Link Github GitHub Repo stars
Few-Shot Learning With Visual Distribution Calibration and Cross-Modal Distribution Alignment Link Github GitHub Repo stars
Practical Network Acceleration With Tiny Sets Link Github GitHub Repo stars
Feature Alignment and Uniformity for Test Time Adaptation Link Github GitHub Repo stars
Finding Geometric Models by Clustering in the Consensus Space Link Github GitHub Repo stars
VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation Link Github GitHub Repo stars
Meta-Learning With a Geometry-Adaptive Preconditioner Link Github GitHub Repo stars
Divide and Conquer: Answering Questions With Object Factorization and Compositional Reasoning Link Github GitHub Repo stars
Physical-World Optical Adversarial Attacks on 3D Face Recognition Link Github GitHub Repo stars
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-Based Active Learning Link Github GitHub Repo stars
On Calibrating Semantic Segmentation Models: Analyses and an Algorithm Link Github GitHub Repo stars
Binary Latent Diffusion Link Github GitHub Repo stars
Q: How To Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images! Link Github GitHub Repo stars
MetaFusion: Infrared and Visible Image Fusion via Meta-Feature Embedding From Object Detection Link Github GitHub Repo stars
Behavioral Analysis of Vision-and-Language Navigation Agents Link Github GitHub Repo stars
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding Link Github GitHub Repo stars
Progressive Spatio-Temporal Alignment for Efficient Event-Based Motion Estimation Link Github GitHub Repo stars
Iterative Next Boundary Detection for Instance Segmentation of Tree Rings in Microscopy Images of Shrub Cross Sections Link Github GitHub Repo stars
Normalizing Flow Based Feature Synthesis for Outlier-Aware Object Detection Link Github GitHub Repo stars
Non-Contrastive Unsupervised Learning of Physiological Signals From Video Link Github GitHub Repo stars
Task Difficulty Aware Parameter Allocation & Regularization for Lifelong Learning Link Github GitHub Repo stars
Markerless Camera-to-Robot Pose Estimation via Self-Supervised Sim-to-Real Transfer Link Github GitHub Repo stars
Event-Guided Person Re-Identification via Sparse-Dense Complementary Learning Link Github GitHub Repo stars
PeakConv: Learning Peak Receptive Field for Radar Semantic Segmentation Link Github GitHub Repo stars
Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation Link Github GitHub Repo stars
Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning Link Github GitHub Repo stars
Good Is Bad: Causality Inspired Cloth-Debiasing for Cloth-Changing Person Re-Identification Link Github GitHub Repo stars
Multiple Instance Learning via Iterative Self-Paced Supervised Contrastive Learning Link Github GitHub Repo stars
Abstract Visual Reasoning: An Algebraic Approach for Solving Raven’s Progressive Matrices Link Github GitHub Repo stars
Introducing Competition To Boost the Transferability of Targeted Adversarial Examples Through Clean Feature Mixup Link Github GitHub Repo stars
Boosting Verified Training for Robust Image Classifications via Abstraction Link Github GitHub Repo stars
DaFKD: Domain-Aware Federated Knowledge Distillation Link Github GitHub Repo stars
Resource-Efficient RGBD Aerial Tracking Link Github GitHub Repo stars
BiasBed – Rigorous Texture Bias Evaluation Link Github GitHub Repo stars
Progressive Open Space Expansion for Open-Set Model Attribution Link Github GitHub Repo stars
Harmonious Feature Learning for Interactive Hand-Object Pose Estimation Link Github GitHub Repo stars
Masked Images Are Counterfactual Samples for Robust Fine-Tuning Link Github GitHub Repo stars
MMANet: Margin-Aware Distillation and Modality-Aware Regularization for Incomplete Multimodal Learning Link Github GitHub Repo stars
CFA: Class-Wise Calibrated Fair Adversarial Training Link Github GitHub Repo stars
Regularization of Polynomial Networks for Image Recognition Link Github GitHub Repo stars
SlowLiDAR: Increasing the Latency of LiDAR-Based Detection Using Adversarial Examples Link Github GitHub Repo stars
Depth Estimation From Indoor Panoramas With Neural Scene Representation Link Github GitHub Repo stars
Improving Robustness of Vision Transformers by Reducing Sensitivity To Patch Corruptions Link Github GitHub Repo stars
EfficientSCI: Densely Connected Network With Space-Time Factorization for Large-Scale Video Snapshot Compressive Imaging Link Github GitHub Repo stars
GKEAL: Gaussian Kernel Embedded Analytic Learning for Few-Shot Class Incremental Task Link Github GitHub Repo stars
Boundary-Aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval Link Github GitHub Repo stars
Towards Practical Plug-and-Play Diffusion Models Link Github GitHub Repo stars
Where We Are and What We’re Looking At: Query Based Worldwide Image Geo-Localization Using Hierarchies and Scenes Link Github GitHub Repo stars
PEFAT: Boosting Semi-Supervised Medical Image Classification via Pseudo-Loss Estimation and Feature Adversarial Training Link Github GitHub Repo stars
From Node Interaction To Hop Interaction: New Effective and Scalable Graph Learning Paradigm Link Github GitHub Repo stars
Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-Shot Learning With Hyperspherical Embeddings Link Github GitHub Repo stars
Architecture, Dataset and Model-Scale Agnostic Data-Free Meta-Learning Link Github GitHub Repo stars
Layout-Based Causal Inference for Object Navigation Link Github GitHub Repo stars
Ensemble-Based Blackbox Attacks on Dense Prediction Link Github GitHub Repo stars
Adversarial Robustness via Random Projection Filters Link Github GitHub Repo stars
NLOST: Non-Line-of-Sight Imaging With Transformer Link Github GitHub Repo stars
Fast Contextual Scene Graph Generation With Unbiased Context Augmentation Link Github GitHub Repo stars
Event-Based Blurry Frame Interpolation Under Blind Exposure Link Github GitHub Repo stars
Defending Against Patch-Based Backdoor Attacks on Self-Supervised Learning Link Github GitHub Repo stars
GradMA: A Gradient-Memory-Based Accelerated Federated Learning With Alleviated Catastrophic Forgetting Link Github GitHub Repo stars
Balanced Product of Calibrated Experts for Long-Tailed Recognition Link Github GitHub Repo stars
Principles of Forgetting in Domain-Incremental Semantic Segmentation in Adverse Weather Conditions Link Github GitHub Repo stars
Annealing-Based Label-Transfer Learning for Open World Object Detection Link Github GitHub Repo stars
Make-a-Story: Visual Memory Conditioned Consistent Story Generation Link Github GitHub Repo stars
Revisiting Prototypical Network for Cross Domain Few-Shot Learning Link Github GitHub Repo stars
Perception and Semantic Aware Regularization for Sequential Confidence Calibration Link Github GitHub Repo stars
Semi-Weakly Supervised Object Kinematic Motion Prediction Link Github GitHub Repo stars
Image Quality-Aware Diagnosis via Meta-Knowledge Co-Embedding Link Github GitHub Repo stars
MaLP: Manipulation Localization Using a Proactive Scheme Link Github GitHub Repo stars
Adjustment and Alignment for Unbiased Open Set Domain Adaptation Link Github GitHub Repo stars
Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions Link Github GitHub Repo stars
Sliced Optimal Partial Transport Link Github GitHub Repo stars
HaLP: Hallucinating Latent Positives for Skeleton-Based Self-Supervised Learning of Actions Link Github GitHub Repo stars
Trap Attention: Monocular Depth Estimation With Manual Traps Link Github GitHub Repo stars
GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection Link Github GitHub Repo stars
Learning From Noisy Labels With Decoupled Meta Label Purifier Link Github GitHub Repo stars
Local Connectivity-Based Density Estimation for Face Clustering Link Github GitHub Repo stars
Physics-Guided ISO-Dependent Sensor Noise Modeling for Extreme Low-Light Photography Link Github GitHub Repo stars
Probing Neural Representations of Scene Perception in a Hippocampally Dependent Task Using Artificial Neural Networks Link Github GitHub Repo stars
A Probabilistic Framework for Lifelong Test-Time Adaptation Link Github GitHub Repo stars
PointCMP: Contrastive Mask Prediction for Self-Supervised Learning on Point Cloud Videos Link Github GitHub Repo stars
Deep Polarization Reconstruction With PDAVIS Events Link Github GitHub Repo stars
Optimal Transport Minimization: Crowd Localization on Density Maps for Semi-Supervised Counting Link Github GitHub Repo stars
Probabilistic Debiasing of Scene Graphs Link Github GitHub Repo stars
PMR: Prototypical Modal Rebalance for Multimodal Learning Link Github GitHub Repo stars
Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning Link Github GitHub Repo stars
HyperCUT: Video Sequence From a Single Blurry Image Using Unsupervised Ordering Link Github GitHub Repo stars
Document Image Shadow Removal Guided by Color-Aware Background Link Github GitHub Repo stars
DLBD: A Self-Supervised Direct-Learned Binary Descriptor Link Github GitHub Repo stars
Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning Link Github GitHub Repo stars
Learning Debiased Representations via Conditional Attribute Interpolation Link Github GitHub Repo stars
Bayesian Posterior Approximation With Stochastic Ensembles Link Github GitHub Repo stars
Decoupling Learning and Remembering: A Bilevel Memory Framework With Knowledge Projection for Task-Incremental Learning Link Github GitHub Repo stars
Visual Query Tuning: Towards Effective Usage of Intermediate Representations for Parameter and Memory Efficient Transfer Learning Link Github GitHub Repo stars
Noisy Correspondence Learning With Meta Similarity Correction Link Github GitHub Repo stars
RMLVQA: A Margin Loss Approach for Visual Question Answering With Language Biases Link Github GitHub Repo stars
Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval Link Github GitHub Repo stars
BUFFER: Balancing Accuracy, Efficiency, and Generalizability in Point Cloud Registration Link Github GitHub Repo stars
Are Data-Driven Explanations Robust Against Out-of-Distribution Data? Link Github GitHub Repo stars
Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection Link Github GitHub Repo stars
Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning Link Github GitHub Repo stars
High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition Link Github GitHub Repo stars
A Bag-of-Prototypes Representation for Dataset-Level Applications Link Github GitHub Repo stars
Neural Dependencies Emerging From Learning Massive Categories Link Github GitHub Repo stars
Learning With Noisy Labels via Self-Supervised Adversarial Noisy Masking Link Github GitHub Repo stars
CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset Link Github GitHub Repo stars
Balanced Energy Regularization Loss for Out-of-Distribution Detection Link Github GitHub Repo stars
Being Comes From Not-Being: Open-Vocabulary Text-to-Motion Generation With Wordless Training Link Github GitHub Repo stars
Masked Representation Learning for Domain Generalized Stereo Matching Link Github GitHub Repo stars
Where Is My Spot? Few-Shot Image Generation via Latent Subspace Optimization Link Github GitHub Repo stars
Genie: Show Me the Data for Quantization Link Github GitHub Repo stars
G-MSM: Unsupervised Multi-Shape Matching With Graph-Based Affinity Priors Link Github GitHub Repo stars
TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers Link Github GitHub Repo stars
Hierarchical Prompt Learning for Multi-Task Learning Link Github GitHub Repo stars
Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising Link Github GitHub Repo stars
Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration Link Github GitHub Repo stars
Paired-Point Lifting for Enhanced Privacy-Preserving Visual Localization Link Github GitHub Repo stars
Towards Effective Visual Representations for Partial-Label Learning Link Github GitHub Repo stars
Pose-Disentangled Contrastive Learning for Self-Supervised Facial Representation Link Github GitHub Repo stars
Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation Link Github GitHub Repo stars
Spatio-Temporal Pixel-Level Contrastive Learning-Based Source-Free Domain Adaptation for Video Semantic Segmentation Link Github GitHub Repo stars
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint Link Github GitHub Repo stars
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-Channel Video-Language Retrieval Link Github GitHub Repo stars
Discriminating Known From Unknown Objects via Structure-Enhanced Recurrent Variational AutoEncoder Link Github GitHub Repo stars
Towards Bridging the Performance Gaps of Joint Energy-Based Models Link Github GitHub Repo stars
Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection Link Github GitHub Repo stars
AsyFOD: An Asymmetric Adaptation Paradigm for Few-Shot Domain Adaptive Object Detection Link Github GitHub Repo stars
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning Link Github GitHub Repo stars
X-Pruner: eXplainable Pruning for Vision Transformers Link Github GitHub Repo stars
Efficient Mask Correction for Click-Based Interactive Image Segmentation Link Github GitHub Repo stars
Dynamic Aggregated Network for Gait Recognition Link Github GitHub Repo stars
Bootstrap Your Own Prior: Towards Distribution-Agnostic Novel Class Discovery Link Github GitHub Repo stars
Weakly Supervised Semantic Segmentation via Adversarial Learning of Classifier and Reconstructor Link Github GitHub Repo stars
Adaptive Plasticity Improvement for Continual Learning Link Github GitHub Repo stars
Jedi: Entropy-Based Localization and Removal of Adversarial Patches Link Github GitHub Repo stars
BAAM: Monocular 3D Pose and Shape Reconstruction With Bi-Contextual Attention Module and Attention-Guided Modeling Link Github GitHub Repo stars
Leverage Interactive Affinity for Affordance Learning Link Github GitHub Repo stars
Evolved Part Masking for Self-Supervised Learning Link Github GitHub Repo stars
CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning Link Github GitHub Repo stars
High-Fidelity Event-Radiance Recovery via Transient Event Frequency Link Github GitHub Repo stars
Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures Link Github GitHub Repo stars
Detection of Out-of-Distribution Samples Using Binary Neuron Activation Patterns Link Github GitHub Repo stars
Decoupled Semantic Prototypes Enable Learning From Diverse Annotation Types for Semi-Weakly Segmentation in Expert-Driven Domains Link Github GitHub Repo stars
A Soma Segmentation Benchmark in Full Adult Fly Brain Link Github GitHub Repo stars
KD-DLGAN: Data Limited Image Generation via Knowledge Distillation Link Github GitHub Repo stars
PIVOT: Prompting for Video Continual Learning Link Github GitHub Repo stars
Rate Gradient Approximation Attack Threats Deep Spiking Neural Networks Link Github GitHub Repo stars
L-CoIns: Language-Based Colorization With Instance Awareness Link Github GitHub Repo stars
Multi-Granularity Archaeological Dating of Chinese Bronze Dings Based on a Knowledge-Guided Relation Graph Link Github GitHub Repo stars
Towards Building Self-Aware Object Detectors via Reliable Uncertainty Quantification and Calibration Link Github GitHub Repo stars
Dense Network Expansion for Class Incremental Learning Link Github GitHub Repo stars
Unsupervised Intrinsic Image Decomposition With LiDAR Intensity Link Github GitHub Repo stars
Neuralizer: General Neuroimage Analysis Without Re-Training Link Github GitHub Repo stars
Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers Link Github GitHub Repo stars
Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling Link Github GitHub Repo stars
Modular Memorability: Tiered Representations for Video Memorability Prediction Link Github GitHub Repo stars
Federated Learning With Data-Agnostic Distribution Fusion Link Github GitHub Repo stars
Four-View Geometry With Unknown Radial Distortion Link Github GitHub Repo stars
Manipulating Transfer Learning for Property Inference Link Github GitHub Repo stars
BUOL: A Bottom-Up Framework With Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction From a Single Image Link Github GitHub Repo stars
3D Spatial Multimodal Knowledge Accumulation for Scene Graph Prediction in Point Cloud Link Github GitHub Repo stars
Efficient Loss Function by Minimizing the Detrimental Effect of Floating-Point Errors on Gradient-Based Attacks Link Github GitHub Repo stars
Towards Professional Level Crowd Annotation of Expert Domain Data Link Github GitHub Repo stars
Improving Robustness of Semantic Segmentation to Motion-Blur Using Class-Centric Augmentation Link Github GitHub Repo stars
Similarity Metric Learning for RGB-Infrared Group Re-Identification Link Github GitHub Repo stars
On the Difficulty of Unpaired Infrared-to-Visible Video Translation: Fine-Grained Content-Rich Patches Transfer Link Github GitHub Repo stars
Camouflaged Instance Segmentation via Explicit De-Camouflaging Link Github GitHub Repo stars
Global Vision Transformer Pruning With Hessian-Aware Saliency Link Github GitHub Repo stars
DoNet: Deep De-Overlapping Network for Cytology Instance Segmentation Link Github GitHub Repo stars
ERM-KTP: Knowledge-Level Machine Unlearning via Knowledge Transfer Link Github GitHub Repo stars
AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning Link Github GitHub Repo stars
Simulated Annealing in Early Layers Leads to Better Generalization Link Github GitHub Repo stars
Similarity Maps for Self-Training Weakly-Supervised Phrase Grounding Link Github GitHub Repo stars
Matching Is Not Enough: A Two-Stage Framework for Category-Agnostic Pose Estimation Link Github GitHub Repo stars
Compositor: Bottom-Up Clustering and Compositing for Robust Part and Object Segmentation Link Github GitHub Repo stars
MEDIC: Remove Model Backdoors via Importance Driven Cloning Link Github GitHub Repo stars
Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing With Non-Learnable Primitives Link Github GitHub Repo stars
Adaptive Graph Convolutional Subspace Clustering Link Github GitHub Repo stars
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language Link Github GitHub Repo stars
Correlational Image Modeling for Self-Supervised Visual Pre-Training Link Github GitHub Repo stars
Text With Knowledge Graph Augmented Transformer for Video Captioning Link Github GitHub Repo stars
Panoptic Video Scene Graph Generation Link Github GitHub Repo stars
DartBlur: Privacy Preservation With Detection Artifact Suppression Link Github GitHub Repo stars
IDGI: A Framework To Eliminate Explanation Noise From Integrated Gradients Link Github GitHub Repo stars
Ultrahigh Resolution Image/Video Matting With Spatio-Temporal Sparsity Link Github GitHub Repo stars
Vector Quantization With Self-Attention for Quality-Independent Representation Learning Link Github GitHub Repo stars
Privacy-Preserving Representations Are Not Enough: Recovering Scene Content From Camera Poses Link Github GitHub Repo stars
DETRs With Hybrid Matching Link Github GitHub Repo stars
GIVL: Improving Geographical Inclusivity of Vision-Language Models With Pre-Training Methods Link Github GitHub Repo stars
AltFreezing for More General Video Face Forgery Detection Link Github GitHub Repo stars
Heterogeneous Continual Learning Link Github GitHub Repo stars
EMT-NAS: Transferring Architectural Knowledge Between Tasks From Different Datasets Link Github GitHub Repo stars
Efficient Movie Scene Detection Using State-Space Transformers Link Github GitHub Repo stars
Private Image Generation With Dual-Purpose Auxiliary Classifier Link Github GitHub Repo stars
BASiS: Batch Aligned Spectral Embedding Space Link Github GitHub Repo stars
A Large-Scale Robustness Analysis of Video Action Recognition Models Link Github GitHub Repo stars
Neumann Network With Recursive Kernels for Single Image Defocus Deblurring Link Github GitHub Repo stars
Rebalancing Batch Normalization for Exemplar-Based Class-Incremental Learning Link Github GitHub Repo stars
ToThePoint: Efficient Contrastive Learning of 3D Point Clouds via Recycling Link Github GitHub Repo stars
Self-Supervised Blind Motion Deblurring With Deep Expectation Maximization Link Github GitHub Repo stars
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning Link Github GitHub Repo stars
DINN360: Deformable Invertible Neural Network for Latitude-Aware 360° Image Rescaling Link Github GitHub Repo stars
Patch-Craft Self-Supervised Training for Correlated Image Denoising Link Github GitHub Repo stars
Learning Decorrelated Representations Efficiently Using Fast Fourier Transform Link Github GitHub Repo stars
AstroNet: When Astrocyte Meets Artificial Neural Network Link Github GitHub Repo stars
PanoSwin: A Pano-Style Swin Transformer for Panorama Understanding Link Github GitHub Repo stars
Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge Link Github GitHub Repo stars
Polarized Color Image Denoising Link Github GitHub Repo stars