FSDA: Frequency re-scaling in data augmentation for corruption-robust image classification

Ju Hyeon Nam, Sang Chul Lee

Research output: Contribution to journal › Article › peer-review

Abstract

Modern convolutional neural networks (CNNs) are used in various applications, including computer vision, speech recognition, and robotics. However, practical use requires large-scale datasets, and real-world data contain various corruptions that degrade a model's performance because the training and testing distributions are inconsistent. In this study, we propose Frequency re-Scaling Data Augmentation (FSDA) to improve the classification performance, robustness against corruption, and localizability of classifiers trained on various image classification datasets. Our method consists of two processes: a mask generation process (MGP) and a pattern re-scaling process (PSP). MGP clusters the frequency-domain spectra to produce groups of similar frequency patterns, and PSP then re-scales the frequencies by learning re-scaling parameters from these patterns. Because FSDA highlights structural features and the CNN classifies images by focusing on them, a CNN trained with the proposed method is more robust against corruption than one trained with other data augmentations (DAs). Our technique outperforms existing DAs on four public image classification datasets: CIFAR-10, CIFAR-100, STL-10, and ImageNet. In particular, our strategy increases the robustness of the classifier against different corruption errors by an average of 5.04% over the baseline.
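
The two-stage idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: fixed concentric radial bands stand in for the clustering performed by MGP, and per-band factors supplied by the caller (drawn at random here) stand in for the re-scaling parameters that PSP learns. The function names `radial_band_masks` and `rescale_frequencies`, the number of bands, and the scale range are all hypothetical choices made for illustration.

```python
import numpy as np


def radial_band_masks(height, width, num_bands):
    """Split the centered 2-D spectrum into concentric bands.

    Assumption: fixed radial bands replace the clustering step (MGP).
    """
    ys = np.arange(height) - height // 2
    xs = np.arange(width) - width // 2
    radius = np.sqrt(ys[:, None] ** 2 + xs[None, :] ** 2)
    edges = np.linspace(0.0, radius.max() + 1e-6, num_bands + 1)
    # Interior edges only, so labels fall in 0 .. num_bands - 1.
    labels = np.digitize(radius, edges[1:-1])
    return [labels == band for band in range(num_bands)]


def rescale_frequencies(image, scales):
    """Multiply each frequency band of a grayscale image by its scale.

    Assumption: the scales are given rather than learned (PSP learns them).
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    masks = radial_band_masks(*image.shape, num_bands=len(scales))
    for mask, scale in zip(masks, scales):
        spectrum[mask] *= scale
    # Real scales on symmetric bands keep the spectrum Hermitian, so the
    # imaginary part is only floating-point noise.
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))


# Usage: augment a random 32x32 "image" with four randomly scaled bands.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
scales = rng.uniform(0.5, 1.5, size=4)
augmented = rescale_frequencies(image, scales)
```

Setting every scale to 1 returns the input unchanged, which is a convenient sanity check before substituting learned or data-dependent masks and scales.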

Original language: English
Article number: 110332
Journal: Pattern Recognition
Volume: 150
State: Published - Jun 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier Ltd

Keywords

  • Convolutional neural networks
  • Data augmentation
  • Deep learning
  • Frequency domain
  • Image classification
