OAK

Abstract: Music source separation has advanced rapidly with the development of deep learning and neural network architectures. Band-split RNN (BSRNN), one of the recent models, achieves superior performance by employing band splitting and dual-path recurrent layers. For online separation, however, there is a trade-off between separation performance and latency. Moreover, model capacity and complexity are crucial factors for real-time applications. In this paper, we extend BSRNN to an online separation model. We adopt a structure that combines single-input-multi-output (SIMO) processing for separation with single-input-single-output (SISO) processing for enhancement. Additionally, we propose a finetuning strategy that uses the training songs themselves, which are expected to retain characteristics similar to those of the evaluation set. Experimental results show that our model outperforms a recent real-time model that employs additional data, and that the finetuning stage yields a further improvement of 0.2 dB in SDR.
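
To give a sense of scale for the 0.2 dB SDR figure, the following is a minimal sketch of a signal-to-distortion ratio computation. It assumes the simple energy-ratio definition, SDR = 10 log10(||s||^2 / ||s - s_hat||^2); the abstract does not say which SDR variant was used for evaluation, so this may differ from the actual metric. The function names `sdr_db` and `error_energy_ratio` and the toy signals are hypothetical and chosen only for illustration.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Simple SDR in dB for two equal-length sample sequences.

    Assumed definition: 10 * log10(signal energy / error energy).
    eps guards against division by zero and log of zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


def error_energy_ratio(gain_db):
    """Factor by which error energy shrinks for a given SDR gain in dB,
    with the reference signal held fixed."""
    return 10.0 ** (-gain_db / 10.0)


# Toy signal standing in for a source stem (not real audio).
reference = [math.sin(0.01 * n) for n in range(1000)]

# An estimate that is the reference scaled by 0.9 has an error of
# 0.1 * reference, which gives an SDR of approximately 20 dB.
estimate = [0.9 * r for r in reference]
baseline_sdr = sdr_db(reference, estimate)

# Fraction of error energy remaining after a 0.2 dB SDR gain.
remaining = error_energy_ratio(0.2)
```

Under this definition, a 0.2 dB gain corresponds to roughly 4.5% less residual error energy relative to the same reference signal.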

Appears in Collections: Department of Electrical Engineering and Computer Science > 3. Theses(Master)

OAK GIST Scholar was built as part of the National Library of Korea's OAK Repository distribution project.