
Light-Weight Causal Speech Enhancement with Bone-Conduction

Abstract
Speech enhancement aims to improve the quality of speech degraded by various types of noise, particularly under challenging conditions such as extremely low signal-to-noise ratio (SNR). Traditional methods predominantly rely on speech data captured by air-conduction (AC), which are highly susceptible to noise; this makes speech enhancement at low SNRs challenging. In contrast, bone-conduction (BC) is more robust to noise but provides information constrained to a limited frequency bandwidth. In this paper, we propose a novel fusion module that effectively integrates information from both air-conduction and bone-conduction. Additionally, we introduce a light-weight, causal network designed for low computational complexity, making it suitable for deployment on resource-constrained devices. Experimental evaluations demonstrate that the proposed model significantly outperforms the baseline, achieving superior speech quality while reducing model size without an increase in computational complexity.
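
To make the two ideas named above concrete, the following is a minimal, hypothetical sketch (in PyTorch) of a time-causal convolution and a gated AC/BC feature fusion. The layer sizes, kernel shapes, and gating scheme are illustrative assumptions only; this page does not specify the thesis's actual fusion architecture.

```python
# Hypothetical sketch of (1) a convolution that is causal along the time
# axis (no lookahead) and (2) a gated fusion of AC and BC spectral
# features. All design choices here are assumptions for illustration.
import torch
import torch.nn as nn


class CausalConv2d(nn.Module):
    """Conv2d over (batch, ch, time, freq) that is causal in time:
    the input is left-padded along time so frame t never sees t+1..T."""

    def __init__(self, in_ch: int, out_ch: int, kernel: tuple = (2, 3)):
        super().__init__()
        k_t, k_f = kernel
        # ZeroPad2d order: (freq_left, freq_right, time_past, time_future).
        self.pad = nn.ZeroPad2d((k_f // 2, k_f // 2, k_t - 1, 0))
        self.conv = nn.Conv2d(in_ch, out_ch, kernel)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(self.pad(x))


class ACBCFusion(nn.Module):
    """Gated fusion of air-conduction (AC) and bone-conduction (BC)
    features: where the learned gate is low (e.g., heavy noise), the
    output leans on the noise-robust but band-limited BC stream."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.ac_proj = CausalConv2d(1, channels)
        self.bc_proj = CausalConv2d(1, channels)
        self.gate = nn.Sequential(CausalConv2d(2 * channels, channels),
                                  nn.Sigmoid())

    def forward(self, ac: torch.Tensor, bc: torch.Tensor) -> torch.Tensor:
        ac_e, bc_e = self.ac_proj(ac), self.bc_proj(bc)
        g = self.gate(torch.cat([ac_e, bc_e], dim=1))
        # Convex combination of the two embedded streams, per bin.
        return g * ac_e + (1.0 - g) * bc_e


if __name__ == "__main__":
    fusion = ACBCFusion()
    ac = torch.randn(2, 1, 100, 257)  # (batch, ch, frames, freq bins)
    bc = torch.randn(2, 1, 100, 257)
    print(fusion(ac, bc).shape)  # torch.Size([2, 16, 100, 257])
```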
Author(s)
이상윤
Issued Date
2025
Type
Thesis
URI
https://scholar.gist.ac.kr/handle/local/19455
Alternative Author(s)
이상윤 (Sangyun Lee)
Department
Graduate School, AI Graduate School (AI대학원)
Advisor
Shin, Jong Won
Table of Contents
Abstract (English)
List of Contents
List of Tables
List of Figures
1 Introduction
1.1 Introduction
1.2 Related works
2 Baseline Model
2.1 Problem formulation
2.2 Architecture
2.2.1 Encoder-Decoder
2.2.2 Group LSTM
3 Proposed Model
3.1 Fusion Module
3.2 Encoder-Decoder
3.3 Frequency Refinement Module
3.4 Complex Skip-Connection
3.5 Group Dual-path Gated Recurrent Units
4 Experiment and Result
4.1 Datasets
4.2 Training Objective Functions
4.3 Evaluation Metrics
4.4 Experimental Results
5 Conclusion
References
Degree
Master
Appears in Collections:
Department of AI Convergence > 3. Theses (Master)
Access and License
  • Access type: Open
