Polyphonic Sound Event Detection Based on Residual Convolutional Recurrent Neural Network With Semi-Supervised Loss Function

Abstract
Polyphonic sound event detection (SED) is an emerging area with many applications for smart disaster safety, security, life logging, etc. This paper proposes a two-stage polyphonic SED model when strongly labeled data are limited but weakly labeled and unlabeled data are available. The first stage of the proposed SED model is constructed by a residual convolutional recurrent neural network (RCRNN)-based mean teacher model with convolutional block attention module (CBAM)-based attention. Then, the second stage fine-tunes the student model from the first stage by applying the proposed semi-supervised loss function to accommodate the noisy targets of weakly labeled and unlabeled data. The proposed SED model is applied to both Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Challenge Task 4 and DCASE 2020 Challenge Task 4, and its performance is compared with those of the baseline and top-ranked models from both challenges by measuring the F1-score and polyphonic sound detection score (PSDS). The experiments show that the RCRNN-based first-stage model with CBAM-based attention achieves a higher F1-score and PSDS than the baseline and top-ranked models for both challenges. Furthermore, the proposed two-stage SED model with the semi-supervised loss function improves the F1-score by 6.1% and 4.6% compared to the top-ranked models from DCASE 2019 and 2020, respectively.
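The abstract's first stage uses a mean teacher model. As a rough illustration of that general technique only (not the authors' implementation), the sketch below shows the two ingredients usually associated with it: the teacher's weights are kept as an exponential moving average (EMA) of the student's weights, and a consistency term penalizes disagreement between student and teacher predictions. The function names, the dictionary-of-floats weight format, and the `alpha` value are assumptions made for this example; the RCRNN architecture, CBAM attention, and the proposed semi-supervised loss are not modeled here.

```python
# Minimal mean-teacher sketch in plain Python (illustrative assumptions only).

def ema_update(teacher, student, alpha=0.999):
    """Return teacher weights updated as an EMA of the student weights.

    teacher, student: dicts mapping parameter names to floats (hypothetical format).
    alpha: EMA decay; the default is an assumed value, not taken from the paper.
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


def consistency_loss(student_probs, teacher_probs):
    """Mean squared error between student and teacher output probabilities."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Toy usage: one parameter, one EMA step, one consistency evaluation.
teacher = {"w": 0.0}
student = {"w": 1.0}
teacher = ema_update(teacher, student, alpha=0.9)
loss = consistency_loss([0.5, 1.0], [0.0, 1.0])
```

In a real training loop the EMA step would run after each student optimizer update, and the consistency term would be added to the supervised loss computed on the labeled clips.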
Author(s)
Kim, Nam Kyun; Kim, Hong Kook
Issued Date
2021-01
Type
Article
DOI
10.1109/ACCESS.2020.3048675
URI
https://scholar.gist.ac.kr/handle/local/11751
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation
IEEE ACCESS, v.9, pp.7564 - 7575
ISSN
2169-3536
Appears in Collections:
Department of Electrical Engineering and Computer Science > 1. Journal Articles
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.