
Self-Supervised Transfer Learning from Natural Images for Sound Classification

Abstract
We propose transfer learning from natural images to audio-based images using self-supervised learning schemes. Through self-supervised learning, convolutional neural networks (CNNs) can learn general representations of natural images without labels. In this study, a CNN was pre-trained on natural images (ImageNet) via self-supervised learning and subsequently fine-tuned on the target audio samples. Pre-training with the self-supervised learning scheme significantly improved sound classification performance when validated on the following benchmarks: ESC-50, UrbanSound8k, and GTZAN. The network pre-trained via self-supervised learning achieved accuracy similar to that of networks pre-trained with a supervised method, which requires labels. We therefore demonstrate that transfer learning from natural images improves performance on audio-related tasks, and that self-supervised learning with natural images is an adequate pre-training scheme in terms of simplicity and effectiveness.
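As a rough illustration of what an "audio-based image" could look like, the sketch below converts a waveform into a three-channel log-power spectrogram that has the same layout as an RGB image, so that a CNN pre-trained on natural images could accept it. This is an assumption made for illustration only: the abstract does not state which time-frequency representation, window size, or normalization the authors used, and the function name `log_spectrogram_image` and its parameters are hypothetical.

```python
import numpy as np


def log_spectrogram_image(wave, n_fft=512, hop=128, eps=1e-10):
    """Turn a 1-D waveform into a 3-channel, image-like array scaled to [0, 1].

    Hypothetical helper: the parameters are illustrative defaults, not values
    taken from the paper.
    """
    wave = np.asarray(wave, dtype=np.float64)
    if wave.size < n_fft:
        # Zero-pad short clips so that at least one full frame exists.
        wave = np.pad(wave, (0, n_fft - wave.size))

    window = np.hanning(n_fft)
    n_frames = 1 + (wave.size - n_fft) // hop

    # Slice overlapping frames and apply the window to each one.
    frames = np.stack(
        [wave[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )

    # Power spectrum per frame: shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_power = 10.0 * np.log10(power + eps)

    # Min-max scale to [0, 1], guarding against a constant input.
    lo, hi = log_power.min(), log_power.max()
    if hi > lo:
        scaled = (log_power - lo) / (hi - lo)
    else:
        scaled = np.zeros_like(log_power)

    # Put frequency on the vertical axis and time on the horizontal axis,
    # then replicate the single channel three times to mimic an RGB image.
    image = scaled.T
    return np.repeat(image[np.newaxis, :, :], 3, axis=0)


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
image = log_spectrogram_image(tone)
```

The resulting array has shape (channels, frequency bins, time frames). In a full pipeline it would typically be resized and normalized to match whatever input the pre-trained network expects; those steps are omitted here because the abstract does not describe them.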
Author(s)
Shin, Sungho; Kim, Jongwon; Yu, Yeonguk; Lee, Seongju; Lee, Kyoobin
Issued Date
2021-04
Type
Article
DOI
10.3390/app11073043
URI
https://scholar.gist.ac.kr/handle/local/11565
Publisher
MDPI
Citation
APPLIED SCIENCES-BASEL, v.11, no.7
ISSN
2076-3417
Appears in Collections:
Department of AI Convergence > 1. Journal Articles
Access and License
  • Access type: Open

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.