
Band Splitting-Based Online Music Source Separation with Low Latency

Abstract
Music source separation has advanced rapidly with the development of deep learning and neural network architectures. Band-split RNN (BSRNN), a recent model, achieves superior performance by employing band splitting and dual-path recurrent layers. For online separation, however, there is a trade-off between separation performance and latency. Moreover, model capacity and complexity are crucial factors for real-time applications. In this paper, we extend BSRNN to an online separation model. We adopt a structure that combines a single-input-multi-output (SIMO) stage for separation with a single-input-single-output (SISO) stage for enhancement. Additionally, we propose a finetuning strategy that reuses the training songs themselves, which are expected to retain characteristics similar to those of the evaluation set. Experimental results show that our model outperforms a recent real-time model that employs additional data, and that the finetuning stage improves the signal-to-distortion ratio (SDR) by 0.2 dB.
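To give a sense of scale for the 0.2 dB figure, the following is a minimal sketch of an energy-ratio SDR in plain Python. It assumes the simple definition 10*log10(||s||^2 / ||s - s_hat||^2); the abstract does not state which SDR variant or evaluation toolkit the thesis uses, so the function names `sdr_db` and `db_to_power_ratio`, the `eps` stabilizer, and the toy signals are illustrative assumptions, not the author's evaluation code.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Energy-ratio SDR in dB (assumed definition, not the thesis's exact metric)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the clean target source.
    signal_energy = sum(s * s for s in reference)
    # Energy of the residual between target and separated estimate.
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    # eps (hypothetical stabilizer) avoids division by zero and log of zero.
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


def db_to_power_ratio(delta_db):
    """Convert a dB difference into the equivalent ratio of energies."""
    return 10.0 ** (delta_db / 10.0)


# Toy signals: the estimate is the reference scaled by 0.9.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]

print(round(sdr_db(reference, estimate), 2))   # 20.0
print(round(db_to_power_ratio(0.2), 3))        # 1.047
```

Under this definition, a 0.2 dB gain corresponds to an energy ratio of about 1.047, that is, roughly 4.5% less residual error energy for the same target energy.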
Author(s)
Joohye Son
Issued Date
2024
Type
Thesis
URI
https://scholar.gist.ac.kr/handle/local/18963
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.