Abstract:
Dataset distillation produces a lightweight synthetic dataset that enables fast network training with promising test accuracy. To imitate the performance of the original dataset, most approaches employ bi-level optimization, and their distillation space depends on the matching architecture. As a result, these approaches either incur significant computational costs on large-scale datasets or lose performance when the distilled data is used across architectures. We advocate for an economical dataset distillation framework that is independent of the matching architecture, ensuring the versatility of the distilled datasets. Based on empirical observations, we argue that enforcing consistency between the real and synthetic image spaces enhances cross-architecture generalization. Motivated by this, we introduce Dataset Distillation via Disentangled Diffusion Model (D$^4$M), an efficient framework for dataset distillation on large-scale datasets. Unlike architecture-dependent methods, D$^4$M employs a diffusion model with an autoencoder to guarantee consistency and incorporates label information into category prototypes, which not only reduces computational overhead but also endows the synthetic images with superior representation capability. The distilled datasets are versatile, eliminating the need to regenerate a distinct dataset for each architecture. We implement D$^4$M on ImageNet as well as several other benchmarks. In comprehensive experiments, D$^4$M demonstrates superior performance and robust generalization, surpassing state-of-the-art methods in most aspects. Code and distilled datasets will be made public.