Zero-Shot Knowledge Distillation Using Label-Free Adversarial Perturbation with Taylor Approximation

Kang Il Lee, Seunghyun Lee, Byung Cheol Song

Research output: Contribution to journal › Article › peer-review

Abstract

Knowledge distillation (KD) is one of the most effective neural network light-weighting techniques when training data is available. However, KD is rarely applicable in environments where access to the training data is difficult or impossible. To address this problem, complete zero-shot KD (C-ZSKD) based on adversarial learning was recently proposed, but the so-called biased sample generation problem limits its performance. To overcome this limitation, this paper proposes a novel C-ZSKD algorithm that utilizes a label-free adversarial perturbation. The proposed perturbation derives a constraint in the form of a squared gradient norm by using the convolution of probability distributions and a second-order Taylor series approximation. This constraint increases the variance of the adversarial sample distribution, which allows the student model to learn the decision boundary of the teacher model more accurately without labeled data. Through an analysis of the distribution of adversarial samples in the embedding space, this paper also provides insight into the characteristics of adversarial samples that are effective for adversarial learning-based C-ZSKD.
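To make the mechanism in the abstract concrete, the sketch below is one hypothetical PyTorch reading of the idea, not the authors' published algorithm: adversarial samples are synthesized without labels by maximizing teacher-student disagreement, and a squared input-gradient-norm term (the kind of constraint a second-order Taylor expansion of the perturbed loss motivates) is added to spread out the adversarial sample distribution. The function name, the weighting factor lam, the step count, and the optimizer choice are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def generate_adversarial_samples(teacher, student, x, steps=10, lr=0.01, lam=0.1):
    """Hypothetical label-free adversarial sample synthesis for C-ZSKD.

    Only the input x_adv is optimized; teacher/student parameters are
    assumed to be frozen (requires_grad=False) by the caller.
    """
    teacher.eval()
    student.eval()
    x_adv = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # Label-free disagreement: KL divergence between student and
        # teacher predictions, with no ground-truth labels involved.
        kl = F.kl_div(F.log_softmax(student(x_adv), dim=1),
                      F.softmax(teacher(x_adv), dim=1),
                      reduction='batchmean')
        # Squared norm of the input gradient, one reading of the
        # abstract's "squared gradient norm" constraint; create_graph
        # keeps this term differentiable w.r.t. x_adv.
        grad = torch.autograd.grad(kl, x_adv, create_graph=True)[0]
        grad_norm = grad.pow(2).flatten(1).sum(dim=1).mean()
        # Gradient ascent on both terms: larger disagreement and a
        # wider (higher-variance) adversarial sample distribution.
        loss = -(kl + lam * grad_norm)
        loss.backward()
        optimizer.step()
    return x_adv.detach()
```

In a full C-ZSKD pipeline, samples produced this way would then be fed to an ordinary distillation step that trains the student to match the teacher's outputs on them.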

Original language: English
Article number: 9380328
Pages (from-to): 45454-45461
Number of pages: 8
Journal: IEEE Access
Volume: 9
DOIs:
State: Published - 2021

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

Keywords

  • Zero-shot learning
  • adversarial learning
  • knowledge distillation
