# Dataset-distillation-papers

This repository aims to provide a comprehensive list of works on dataset distillation (DD), also known as dataset condensation (DC): the task of synthesizing a small training set such that models trained on it perform comparably to models trained on the original, much larger dataset.
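Many of the entries below share a few core formulations; one of the most common (see the Type column in the tables) is gradient matching, introduced in "Dataset Condensation with Gradient Matching" (Zhao et al., ICLR 2021, listed under 2021). The snippet below is a minimal, illustrative PyTorch sketch of that idea only: the toy linear model, the per-tensor cosine distance, the re-initialization scheme, and all hyper-parameters (`ipc`, `outer_steps`, `lr_img`) are simplifying assumptions, not any paper's exact recipe.

```python
# Minimal, illustrative sketch of dataset condensation via gradient matching
# (in the spirit of Zhao et al., ICLR 2021). All choices below are simplifications.
import torch
import torch.nn as nn
import torch.nn.functional as F


def match_loss(g_syn, g_real):
    """Sum of (1 - cosine similarity) over corresponding gradient tensors."""
    total = 0.0
    for gs, gr in zip(g_syn, g_real):
        total = total + (1.0 - F.cosine_similarity(gs.flatten(), gr.flatten(), dim=0))
    return total


def condense(real_loader, num_classes=10, ipc=10, img_shape=(1, 28, 28),
             outer_steps=1000, lr_img=0.1, device="cpu"):
    # Learnable synthetic images with fixed, class-balanced labels (ipc = images per class).
    x_syn = torch.randn(num_classes * ipc, *img_shape, device=device, requires_grad=True)
    y_syn = torch.arange(num_classes, device=device).repeat_interleave(ipc)
    opt_img = torch.optim.SGD([x_syn], lr=lr_img, momentum=0.5)

    in_dim = 1
    for d in img_shape:
        in_dim *= d

    for _ in range(outer_steps):
        # Re-initialise a small network each outer step so the synthetic set does not
        # overfit to one set of weights (a toy stand-in for sampling fresh initializations).
        net = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, num_classes)).to(device)
        params = list(net.parameters())

        x_real, y_real = next(iter(real_loader))
        x_real, y_real = x_real.to(device), y_real.to(device)

        # Gradients of the training loss on real data (treated as fixed targets) ...
        g_real = torch.autograd.grad(F.cross_entropy(net(x_real), y_real), params)
        g_real = [g.detach() for g in g_real]
        # ... and on synthetic data (kept differentiable w.r.t. the synthetic images).
        g_syn = torch.autograd.grad(F.cross_entropy(net(x_syn), y_syn), params,
                                    create_graph=True)

        # Update the synthetic images so their induced gradients match the real ones.
        opt_img.zero_grad()
        match_loss(g_syn, g_real).backward()
        opt_img.step()

    return x_syn.detach(), y_syn


# Example usage (with a hypothetical MNIST DataLoader):
#   x_syn, y_syn = condense(train_loader, num_classes=10, ipc=10)
```

The papers below differ mainly in what is matched (gradients, features, distributions, or whole training trajectories) and in how the synthetic set is parameterized; the sketch is only meant to anchor the terminology used in the Type column.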

## Quick links

Papers sorted by year: | 2023 | 2022 | 2021 | 2020 | 2019 | 2018 |

## 2023

Papers in 2023 [Back-to-top]

| Author | Title | Type | Task | Dataset | Venue | Supp. Material |
| --- | --- | --- | --- | --- | --- | --- |
| Ruonan Yu et al | Dataset Distillation: A Comprehensive Review | Survey | Multiple tasks | | arXiv, Jan., 2023 | |
| Shiye Lei et al | A Comprehensive Survey to Dataset Distillation | Survey | Multiple tasks | | arXiv, Jan., 2023 | |
| Noveen Sachdeva et al | Data Distillation: A Survey | Survey | Multiple tasks | | arXiv, Jan., 2023 | |
| Yugeng Liu et al | Backdoor Attacks Against Dataset Distillation | | Security | FMNIST, CIFAR10, STL10, SVHN | NDSS, 2023 | Code |

## 2022

Papers in 2022 [Back-to-top]

| Author | Title | Type | Task | Dataset | Venue | Supp. Material |
| --- | --- | --- | --- | --- | --- | --- |
| Guang Li et al | Compressed Gastric Image Generation Based on Soft-Label Dataset Distillation for Medical Data Sharing | Soft-Label Distillation | Application: Medical Data Sharing | Gastric X-ray | Computer Methods and Programs in Biomedicine, 2022 | |
| Zijia Wang et al | Gift from nature: Potential Energy Minimization for explainable dataset distillation | Potential Energy Minimization | Image Classification | miniImageNet, CUB-200 | ACCV Workshop, 2022 | |
| Michael Arbel et al | Non-Convex Bilevel Games with Critical Point Selection Maps | General Optimization | Image Classification | CIFAR-10 | NeurIPS, 2022 | |
| Zhiwei Deng et al | Remember the Past: Distilling Datasets into Addressable Memories for Neural Networks | | Image Classification | MNIST, SVHN, CIFAR10/100, TinyImageNet | NeurIPS, 2022 | Code |
| Noveen Sachdeva et al | Infinite Recommendation Networks: A Data-Centric Approach | Neural Tangent Kernel | Application: Recommender System | Amazon Magazine, ML-1M, Douban, Netflix | NeurIPS, 2022 | Code |
| Dingfan Chen et al | Private Set Generation with Discriminative Information | | Application: Private Data Generation | MNIST, FashionMNIST | NeurIPS, 2022 | Code, Poster |
| Justin Cui et al | DC-BENCH: Dataset Condensation Benchmark | Benchmark | Image Classification | | NeurIPS, 2022 | Code, Poster |
| Yongchao Zhou et al | Dataset Distillation using Neural Feature Regression | | Image Classification | CIFAR100, TinyImageNet, ImageNette, ImageWoof | NeurIPS, 2022 | Code, Slide |
| Songhua Liu et al | Dataset Distillation via Factorization | | Image Classification | SVHN, CIFAR10/100 | NeurIPS, 2022 | Code, Poster |
| Noel Loo et al | Efficient Dataset Distillation using Random Feature Approximation | | Image Classification | MNIST, FashionMNIST, SVHN, CIFAR-10/100 | NeurIPS, 2022 | Code, Poster |
| Yihan Wu et al | Towards Robust Dataset Learning | Tri-level Optimization | Robust Image Classification | MNIST, CIFAR10, TinyImageNet | arXiv, Nov., 2022 | |
| Andrey Zhmoginov et al | Decentralized Learning with Multi-Headed Distillation | Local DD | Application: FL | CIFAR-10/100 | arXiv, Nov., 2022 | |
| Jiawei Du et al | Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation | Accumulated Trajectory Matching | Image Classification | | arXiv, Nov., 2022 | |
| Justin Cui et al | Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory | | Image Classification | CIFAR-10/100, ImageNet-1K | arXiv, Nov., 2022 | |
| Renjie Pi et al | DYNAFED: Tackling Client Data Heterogeneity with Global Dynamics | | Application: FL | FMNIST, CIFAR10, CINIC10 | arXiv, Nov., 2022 | |
| Zongwei Wang et al | Quick Graph Conversion for Robust Recommendation | Gradient Matching | Application: Recommender System | Beauty, Alibaba-iFashion, Yelp2018 | arXiv, Oct., 2022 | |
| Yulan Chen et al | Learning from Designers: Fashion Compatibility Analysis Via Dataset Distillation | | Application: Fashion Analysis | | ICIP, 2022 | |
| Yuna Jeong et al | Training data selection based on dataset distillation for rapid deployment in machine-learning workflows | | Application: Dataset Selection | | Multimedia Tools and Applications, 2022 | |
| Yanlin Zhou et al | Communication-Efficient and Attack-Resistant Federated Edge Learning with Dataset Distillation | | Application: FL | MNIST, Landmark, IMDB, etc. | IEEE TCC, 2022 | Code |
| Nicholas Carlini et al | No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy" | | Application: Privacy | CIFAR-10 | arXiv, Sept., 2022 | |
| Guang Li et al | Dataset Distillation for Medical Dataset Sharing | Trajectory Matching | Application: Medical Data Sharing | COVID-19 Chest X-ray | arXiv, Sept., 2022 | |
| Guang Li et al | Dataset Distillation using Parameter Pruning | Parameter Pruning | Image Classification | CIFAR-10/100 | arXiv, Sept., 2022 | |
| Ping Liu et al | Meta Knowledge Condensation for Federated Learning | | Application: FL | MNIST | arXiv, Sept., 2022 | |
| Dmitry Medvedev et al | Learning to Generate Synthetic Training Data Using Gradient Matching and Implicit Differentiation | Gradient Matching, Implicit Differentiation | Image Classification | MNIST | CCIS, 2022 | Code |
| Wei Jin et al | Condensing Graphs via One-Step Gradient Matching | Gradient Matching | Graph Classification | | KDD, 2022 | Code |
| Rui Song et al | Federated Learning via Decentralized Dataset Distillation in Resource-Constrained Edge Environments | Local DD | Application: FL | MNIST, CIFAR10 | arXiv, Aug., 2022 | |
| Hae Beom Lee et al | Dataset Condensation with Latent Space Knowledge Factorization and Sharing | Local DD | Image Classification | | arXiv, Aug., 2022 | |
| Thi-Thu-Huong Le et al | A Review of Dataset Distillation for Deep Learning | Survey | Image Classification | | ICPTS, 2022 | |
| Zixuan Jiang et al | Delving into Effective Gradient Matching for Dataset Condensation | Gradient Matching | Image Classification | MNIST, FashionMNIST, SVHN, CIFAR-10/100 | arXiv, Jul., 2022 | Code |
| Yuanhao Xiong et al | FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning | | Application: FL | MNIST, CIFAR10/100 | arXiv, Jul., 2022 | |
| Nikolaos Tsilivis et al | Can we achieve robustness from data alone? | KIP | Security | MNIST, CIFAR-10 | arXiv, Jul., 2022 | |
| Nadiya Shvai et al | DEvS: Data Distillation Algorithm Based on Evolution Strategy | Evolution Strategy | Image Classification | CIFAR-10 | GECCO, 2022 | |
| Mattia Sangermano et al | Sample Condensation in Online Continual Learning | Gradient Matching | Application: Continual Learning | SplitMNIST, SplitFashionMNIST, SplitCIFAR10 | IJCNN, 2022 | Code |
| Brian Moser et al | Less is More: Proxy Datasets in NAS approaches | | Application: NAS | | CVPRW, 2022 | |
| George Cazenavette et al | Wearable ImageNet: Synthesizing Tileable Textures via Dataset Distillation | | Image Classification | | CVPRW, 2022 | Code |
| George Cazenavette et al | Dataset Distillation by Matching Training Trajectories | Trajectory Matching | Image Classification | CIFAR-100, Tiny ImageNet, ImageNet subsets | CVPR, 2022 | Code |
| Kai Wang et al | CAFE: Learning to Condense Dataset by Aligning Features | Feature Alignment | Image Classification | MNIST, FashionMNIST, SVHN, CIFAR10/100 | CVPR, 2022 | Code |
| Mengyang Liu et al | Graph Condensation via Receptive Field Distribution Matching | Receptive Field Distribution Matching | Graph Classification | Cora, PubMed, Citeseer, Ogbn-arxiv, Flickr | arXiv, Jun., 2022 | |
| Saehyung Lee et al | Dataset Condensation with Contrastive Signals | Contrastive Learning | Image Classification | SVHN, CIFAR-10/100; Automobile, Terrier, Fish | ICML, 2022 | Code |
| Jang-Hyun Kim et al | Dataset Condensation via Efficient Synthetic-Data Parameterization | | Image Classification | CIFAR-10, ImageNet, Speech Commands | ICML, 2022 | Code |
| Tian Dong et al | Privacy for Free: How does Dataset Condensation Help Privacy? | Application: Privacy | Image Classification | | ICML, 2022 | |
| Paul Vicol et al | On Implicit Bias in Overparameterized Bilevel Optimization | General Optimization | Image Classification | MNIST | ICML, 2022 | |
| Wei Jin et al | Graph Condensation for Graph Neural Networks | Gradient Matching | Graph Classification | Cora, Citeseer, Ogbn-arxiv; Reddit, Flickr | ICLR, 2022 | Code |
| Bo Zhao et al | Synthesizing Informative Training Samples with GAN | GAN | Image Classification | CIFAR-10/100 | arXiv, Apr., 2022 | Code |
| Shengyuan Hu et al | FedSynth: Gradient Compression via Synthetic Data in Federated Learning | | Application: FL | MNIST, FEMNIST, Reddit | | |
| Aminu Musa et al | Learning from Small Datasets: An Efficient Deep Learning Model for Covid-19 Detection from Chest X-ray Using Dataset Distillation Technique | | Application: Medical Imaging | Chest X-ray | NIGERCON, 2022 | |
| Seong-Woong Kim et al | Stable Federated Learning with Dataset Condensation | | Application: FL | CIFAR-10 | JCSE, 2022 | |
| Robin T. Schirrmeister et al | When less is more: Simplifying inputs aids neural network understanding | | Application: Understanding NN | MNIST, Fashion-MNIST, CIFAR10/100 | arXiv, Jan., 2022 | |
| Isha Garg et al | TOFU: Towards Obfuscated Federated Updates by Encoding Weight Updates into Gradients from Proxy Data | | Application: FL | | arXiv, Jan., 2022 | |

## 2021

Papers in 2021 [Back-to-top]

| Author | Title | Type | Task | Dataset | Venue | Supp. Material |
| --- | --- | --- | --- | --- | --- | --- |
| Timothy Nguyen et al | Dataset Distillation with Infinitely Wide Convolutional Networks | Kernel Ridge Regression | Image Classification | MNIST, Fashion-MNIST, CIFAR-10/100, SVHN | NeurIPS, 2021 | Code |
| Bo Zhao et al | Dataset Condensation with Distribution Matching | Distribution Matching | Image Classification | MNIST, CIFAR10/100, TinyImageNet | arXiv, Oct., 2021 | Code |
| Ilia Sucholutsky et al | Soft-Label Dataset Distillation and Text Dataset Distillation | Label Distillation | Image/Text Classification | MNIST, IMDB | IJCNN, 2021 | Code |
| Felix Wiewel et al | Condensed Composite Memory Continual Learning | Gradient Matching | Application: Continual Learning | | IJCNN, 2021 | Code |
| Bo Zhao et al | Dataset Condensation with Differentiable Siamese Augmentation | Data Augmentation | Image Classification | MNIST, FashionMNIST, SVHN, CIFAR10/100 | ICML, 2021 | Code, Video |
| Timothy Nguyen et al | Dataset Meta-Learning from Kernel Ridge-Regression | Kernel Ridge Regression | Image Classification | MNIST, CIFAR-10 | ICLR, 2021 | Code |
| Bo Zhao et al | Dataset Condensation with Gradient Matching | Gradient Matching | Image Classification | CIFAR-10, Fashion-MNIST, MNIST, SVHN, USPS | ICLR, 2021 | Code |
| | New Properties of the Data Distillation Method When Working with Tabular Data | Simulation | Tabular Classification | | LNISA, 2021 | Code |
| Yongqi Li et al | Data Distillation for Text Classification | | Text Classification | | arXiv, Apr., 2021 | Code |
| Ilia Sucholutsky et al | ‘Less Than One’-Shot Learning: Learning N Classes From M<N Samples | Label Distillation | Application: Few-Shot Learning | Simulation | AAAI, 2021 | Code |

## 2020

Papers in 2020 [Back-to-top]

| Author | Title | Type | Task | Dataset | Venue | Supp. Material |
| --- | --- | --- | --- | --- | --- | --- |
| Yanlin Zhou et al | Distilled One-Shot Federated Learning | | Application: FL | | arXiv, Sept., 2020 | |
| Ondrej Bohdal et al | Flexible Dataset Distillation: Learn Labels Instead of Images | Label Distillation | Image Classification | MNIST, CIFAR-10/100, CUB | NeurIPS workshop, 2020 | Code |
| Chengeng Huang et al | Generative Dataset Distillation | Generative Adversarial Networks | Image Classification | MNIST | BigCom, 2021 | |
| Jack Goetz et al | Federated Learning via Synthetic Data | Bi-level Optimization | Application: FL | | arXiv, Aug., 2020 | |
| Guang Li et al | Soft-Label Anonymous Gastric X-Ray Image Distillation | Label Distillation | Application: Medical Data Sharing | X-ray Images | ICIP, 2020 | |

## 2019

Papers in 2019 [Back-to-top]

| Author | Title | Type | Task | Dataset | Venue | Supp. Material |
| --- | --- | --- | --- | --- | --- | --- |
| Sam Shleifer et al | Proxy Datasets for Training Convolutional Neural Networks | | Application: Proxy Dataset Generation | Imagenette, Imagewoof | arXiv, Jun., 2019 | |

## 2018

Papers in 2018 [Back-to-top]

| Author | Title | Type | Task | Dataset | Venue | Supp. Material |
| --- | --- | --- | --- | --- | --- | --- |
| Tongzhou Wang et al | Dataset Distillation | Bi-level Optimization | Image Classification | MNIST, CIFAR-10 | arXiv, Nov., 2018 | Code |
