OAK

Band Splitting-Based Online Music Source Separation with Low Latency

Author(s)
Joohye Son
Type
Thesis
Degree
Master
Department
Graduate School, School of Electrical Engineering and Computer Science
Advisor
Shin, Jong Won
Abstract
Music source separation has advanced rapidly with the development of deep learning and neural network architectures. Band-split RNN (BSRNN), one of the recent models, shows superior performance by employing band splitting and dual-path recurrent layers. For online separation, however, there is a trade-off between separation performance and latency, and model capacity and complexity are crucial factors for real-time applications. In this thesis, we extend BSRNN to an online separation model. We adopt a structure that combines single-input-multi-output (SIMO) for separation with single-input-single-output (SISO) for enhancement. Additionally, we propose a fine-tuning strategy that reuses the training songs themselves, which are expected to retain characteristics similar to those of the evaluation set. Experimental results show that our model outperforms a recent real-time model that uses additional data, and that the fine-tuning stage yields a 0.2 dB improvement in signal-to-distortion ratio (SDR).
URI
https://scholar.gist.ac.kr/handle/local/18963
Fulltext
http://gist.dcollection.net/common/orgView/200000878464
Access and License
  • Access type: Open

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.