This project explores enhancing image restoration through self-supervised learning, specifically by integrating the Masked Autoencoder (MAE) strategy with eye-tracking data. By merging fixation-based saliency with MAE masking, we aim to better align computational image reconstruction with human visual perception. Our research highlights the potential of eye-tracking insights to improve both the accuracy and the perceptual relevance of self-supervised learning models in computer vision, illustrating the convergence of computational image restoration and human perceptual systems.
- Computing Methodologies → Tracking
- Computing Methodologies → Interest Point and Salient Region Detections
Self-supervised Learning, Image Reconstruction, Eye Movements, Visual Attention, Perceptual Quality Assessment
- Data Preparation: We use the open-source MIT1003 eye movement dataset, featuring 15 participants viewing 1003 natural images, to challenge the MAE's reconstruction capabilities.
- Patch Selection Policy:
- Salient Patch Identification: Using eye-tracking fixation data to identify high-interest patches (see the first sketch after this list).
- Non-Salient Patch Selection: Selecting control patches outside the salient regions for comparative analysis.
- Model Configuration:
- MAE Setup: Configuring a pre-trained masked autoencoder to evaluate the impact of patch selection on image quality.
- Patch Processing: Preparing images by masking the identified patches for MAE input (see the second sketch after this list).
- Reconstruction and Evaluation:
- Reconstruction: Generating outputs for salient and non-salient patch selections.
- Perceptual Quality: Assessing reconstruction quality with SSIM, MSE, RMSE, and LPIPS (see the third sketch after this list).
- Comparative Analysis: Evaluating the impact of visual-attention-based patch selection on reconstruction quality and alignment with human perception.
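To make the patch selection policy concrete, here is a minimal sketch of the salient/non-salient split. It assumes 16x16 patches (the standard ViT/MAE patch size), a MIT1003-style grayscale fixation map in which brighter pixels mark more-fixated regions, and a 25% mask budget; the function name, file name, and budget are illustrative, not taken from the project code.

```python
import numpy as np
from PIL import Image

def rank_patches_by_saliency(fixation_map_path, patch_size=16):
    """Rank patches of the image grid by fixation density, most salient first."""
    # MIT1003 fixation maps are grayscale; brighter pixels = more fixations.
    fix_map = np.asarray(Image.open(fixation_map_path).convert("L"), dtype=np.float32)

    # Crop so the patch grid tiles the map exactly.
    h, w = fix_map.shape
    h, w = h - h % patch_size, w - w % patch_size
    fix_map = fix_map[:h, :w]

    # Sum fixation density inside each patch of the row-major grid.
    grid = fix_map.reshape(h // patch_size, patch_size, w // patch_size, patch_size)
    patch_saliency = grid.sum(axis=(1, 3)).ravel()
    return np.argsort(patch_saliency)[::-1]

# Illustrative usage: the top 25% most-fixated patches form the "salient"
# selection; an equal number of the least-fixated patches form the control.
order = rank_patches_by_saliency("ALLFIXATIONMAPS/some_image_fixMap.jpg")
n_masked = len(order) // 4
salient_idx, non_salient_idx = order[:n_masked], order[-n_masked:]
```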
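For the masking step, the reference MAE implementation (facebookresearch/mae) hides patches by argsorting a per-patch noise score and dropping the tail. Below is a sketch of a deterministic replacement that masks exactly the chosen patch indices, assuming that design; the `(N, L, D)` layout and the returned `mask`/`ids_restore` mirror the reference code, and nothing here is taken from this project's notebook.

```python
import torch

def saliency_masking(x, mask_idx):
    """Mask exactly the patches in mask_idx instead of a random subset.

    x:        (N, L, D) patch embeddings after the MAE patch-embed layer
    mask_idx: 1-D LongTensor of patch indices to hide from the encoder
    Returns (x_masked, mask, ids_restore), mirroring MAE's random_masking.
    """
    N, L, D = x.shape
    len_keep = L - mask_idx.numel()

    # Score patches 0 (keep) or 1 (mask); argsort puts kept patches first,
    # exactly as the noise-based shuffle does in the reference code.
    noise = torch.zeros(N, L, device=x.device)
    noise[:, mask_idx] = 1.0
    ids_shuffle = torch.argsort(noise, dim=1)
    ids_restore = torch.argsort(ids_shuffle, dim=1)

    # Keep only the visible patches for the encoder.
    ids_keep = ids_shuffle[:, :len_keep]
    x_masked = torch.gather(x, 1, ids_keep.unsqueeze(-1).repeat(1, 1, D))

    # Binary mask in original patch order: 1 = masked, 0 = visible.
    mask = torch.ones(N, L, device=x.device)
    mask[:, :len_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)
    return x_masked, mask, ids_restore
```

Swapping a function like this in for the random masking step lets the same pre-trained checkpoint reconstruct either the salient or the non-salient selection without retraining.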
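The four metrics are all available off the shelf. A minimal sketch using scikit-image for SSIM and the `lpips` package for LPIPS, assuming both images are H x W x 3 float arrays in [0, 1]; the AlexNet backbone is the `lpips` package default, not necessarily what the notebook uses.

```python
import numpy as np
import torch
import lpips  # pip install lpips
from skimage.metrics import structural_similarity as ssim

lpips_fn = lpips.LPIPS(net="alex")  # AlexNet backbone, the package default

def evaluate_reconstruction(original, reconstructed):
    """Compare two H x W x 3 float images in [0, 1] with MSE/RMSE/SSIM/LPIPS."""
    mse = float(np.mean((original - reconstructed) ** 2))
    rmse = float(np.sqrt(mse))

    # channel_axis requires scikit-image >= 0.19 (formerly multichannel=True).
    ssim_val = ssim(original, reconstructed, channel_axis=2, data_range=1.0)

    # LPIPS expects (N, 3, H, W) torch tensors scaled to [-1, 1].
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1).unsqueeze(0).float() * 2 - 1
    with torch.no_grad():
        lpips_val = float(lpips_fn(to_tensor(original), to_tensor(reconstructed)))

    return {"MSE": mse, "RMSE": rmse, "SSIM": ssim_val, "LPIPS": lpips_val}
```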
Qualitative Comparative Analysis
Quantitative Comparative Analysis
wget https://people.csail.mit.edu/tjudd/WherePeopleLook/ALLSTIMULI.zip
wget https://people.csail.mit.edu/tjudd/WherePeopleLook/DATA.zip
wget https://people.csail.mit.edu/tjudd/WherePeopleLook/ALLFIXATIONMAPS.zip
unzip ALLSTIMULI.zip -d path/to/destination
unzip DATA.zip -d path/to/destination
unzip ALLFIXATIONMAPS.zip -d path/to/destination
Run the notebook and update the paths to match your MIT1003 dataset location.
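The exact variable names vary by notebook revision; as an illustration (hypothetical names), the paths to update look like:

```python
# Hypothetical variable names -- point these at your unzipped MIT1003 copies.
STIMULI_DIR = "path/to/destination/ALLSTIMULI"
DATA_DIR = "path/to/destination/DATA"
FIXATION_MAPS_DIR = "path/to/destination/ALLFIXATIONMAPS"
```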