Perceptual Evaluation of Masked AutoEncoder Emergent Properties Through Eye-Tracking-Based Policy

Abstract

This project explores enhancing image restoration through self-supervised learning, specifically by integrating the Masked AutoEncoder (MAE) strategy with eye-tracking data. By combining fixation-based saliency with MAE masking, we aim to align computational image reconstruction more closely with human visual perception. Our research highlights the potential of eye-tracking insights to improve both the accuracy and the perceptual relevance of self-supervised learning models in computer vision, illustrating the convergence of computational image restoration and the human perceptual system.

CCS Concepts

  • Computing Methodologies → Tracking
  • Computing Methodologies → Interest Point and Salient Region Detections

Additional Keywords and Phrases

Self-supervised Learning, Image Reconstruction, Eye Movements, Visual Attention, Perceptual Quality Assessment

Late-Breaking Work

Methodology

  1. Data Preparation: We use the open-source MIT1003 eye movement dataset, in which 15 participants free-viewed 1003 natural images, to challenge the MAE's reconstruction capabilities.
  2. Patch Selection Policy (a selection sketch follows this list):
    • Salient Patch Identification: Using the eye-tracking fixation maps to identify high-interest patches.
    • Non-Salient Patch Selection: Selecting control patches outside the salient regions for comparative analysis.
  3. Model Configuration:
    • MAE Setup: Configuring a pre-trained masked autoencoder to evaluate the impact of patch selection on image quality.
    • Patch Processing: Masking the identified patches before passing the images to the MAE.
  4. Reconstruction and Evaluation (a metrics sketch follows this list):
    • Reconstruction: Generating outputs for the salient and the non-salient patch selections.
    • Perceptual Quality: Assessing reconstruction quality with SSIM, MSE, RMSE, and LPIPS.
  5. Comparative Analysis: Evaluating the impact of visual-attention-based patch selection on reconstruction quality and its alignment with human perception.
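
As an illustration of step 2, here is a minimal sketch of a fixation-based patch selection policy: rank 16×16 patches by mean fixation density, then take the top-k as salient and the bottom-k as non-salient controls. The patch size, the value of TOP_K, the 224×224 input resolution, and the file path are assumptions for illustration, not this repository's actual code.

import numpy as np
from PIL import Image

PATCH = 16   # assumed MAE/ViT patch size
TOP_K = 49   # assumed number of selected patches (25% of a 14x14 grid)

def patch_saliency(fix_map, patch=PATCH):
    # Mean fixation density per patch; fix_map is a 2-D float array in [0, 1].
    h, w = fix_map.shape
    h, w = h - h % patch, w - w % patch   # crop to a multiple of the patch size
    grid = fix_map[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return grid.mean(axis=(1, 3))         # one saliency value per patch

def select_patches(fix_map, k=TOP_K):
    # Returns (salient, non_salient) lists of (row, col) patch coordinates.
    sal = patch_saliency(fix_map)
    order = np.argsort(sal, axis=None)    # ascending fixation density
    coords = np.column_stack(np.unravel_index(order, sal.shape))
    return coords[-k:].tolist(), coords[:k].tolist()

# Usage on one MIT1003 fixation map (path is a placeholder),
# resized to the assumed 224x224 MAE input resolution:
fix_map = np.asarray(Image.open("ALLFIXATIONMAPS/<image>_fixMap.jpg")
                     .convert("L").resize((224, 224)), dtype=np.float32) / 255.0
salient, non_salient = select_patches(fix_map)

The (row, col) pairs can then be flattened (row * num_cols + col) into the sequential patch order a ViT-style MAE expects when the mask is built.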
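
And for step 4, a minimal sketch of the four reported metrics, assuming original and reconstruction are H×W×3 float RGB arrays in [0, 1]. It relies on scikit-image and the lpips package, which are standard choices for these measures but are assumptions here, not necessarily what the notebook uses.

import numpy as np
import torch
import lpips
from skimage.metrics import structural_similarity, mean_squared_error

lpips_fn = lpips.LPIPS(net="alex")   # AlexNet-based perceptual distance

def evaluate(original, reconstruction):
    # SSIM and MSE directly on the [0, 1] arrays; RMSE is derived from MSE.
    ssim = structural_similarity(original, reconstruction,
                                 channel_axis=-1, data_range=1.0)
    mse = mean_squared_error(original, reconstruction)
    rmse = float(np.sqrt(mse))
    # LPIPS expects NCHW tensors scaled to [-1, 1].
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1)[None].float() * 2 - 1
    with torch.no_grad():
        lp = lpips_fn(to_tensor(original), to_tensor(reconstruction)).item()
    return {"SSIM": ssim, "MSE": mse, "RMSE": rmse, "LPIPS": lp}

Running evaluate on the salient-masked and non-salient-masked reconstructions of the same image yields the paired scores the comparative analysis needs.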

Qualitative Comparative Analysis

[Figure: qualitative results]

Quantitative Comparative Analysis

[Figure: quantitative results]

Getting Started

Download the MIT1003 dataset

wget https://people.csail.mit.edu/tjudd/WherePeopleLook/ALLSTIMULI.zip

wget https://people.csail.mit.edu/tjudd/WherePeopleLook/DATA.zip

wget https://people.csail.mit.edu/tjudd/WherePeopleLook/ALLFIXATIONMAPS.zip

Unzip the dataset

unzip ALLSTIMULI.zip -d path/to/destination

unzip DATA.zip -d path/to/destination

unzip ALLFIXATIONMAPS.zip -d path/to/destination

Run the notebook, changing the paths to match your MIT1003 dataset location (see the sketch below).
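
For reference, the paths the notebook needs could be gathered in one place like this (the directory names come from the three archives above; the variable names are illustrative, not the notebook's actual ones):

from pathlib import Path

DATA_ROOT = Path("path/to/destination")       # where the archives were unzipped
STIMULI_DIR = DATA_ROOT / "ALLSTIMULI"        # the 1003 stimulus images
FIXDATA_DIR = DATA_ROOT / "DATA"              # per-participant eye-tracking recordings
FIXMAPS_DIR = DATA_ROOT / "ALLFIXATIONMAPS"   # precomputed fixation maps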
