Skip to content

RianaAzad/machine-unlearning

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Machine Unlearning: Theory and Implementation

What is Machine Unlearning?

Machine unlearning addresses the challenge of selectively "forgetting" specific training data points from a trained model without complete retraining. This capability is crucial for:

  • Privacy compliance (e.g., GDPR's "right to be forgotten")
  • Model maintenance and updating
  • Removing corrupted or incorrect training samples
  • Security and privacy protection

Theoretical Foundation

Core Concepts

  1. Learning-Unlearning Duality

    • Learning: Process of incorporating data patterns into model parameters
    • Unlearning: Process of removing data influence while preserving other learned patterns
  2. Catastrophic Forgetting Prevention

    • Challenge: Removing specific data without affecting other learned patterns
    • Solution: Targeted parameter updates that isolate and remove specific influences
  3. Verification Metrics

    • Removal Effectiveness: Ensuring complete removal of target data influence
    • Performance Preservation: Maintaining model accuracy on remaining data
    • Efficiency: Computational cost compared to full retraining

Implementation Approaches

  1. SISA (Sharded, Isolated, Sliced, Aggregated) Training
    • Data is divided into shards during training
    • Each shard trains an independent model
    • Unlearning requires retraining only affected shards
    • Advantages: Efficient, scalable
    • Limitations: Potential performance impact from sharding

Releases

No releases published

Packages

No packages published

Languages